Commit e73a946

Add access_history and polish history tables (#2434)
* Add `system_history.access_history` table and polish config
* polish access-history
* Polish other history tables to keep style consistency
* polish
1 parent 7f2ea1f commit e73a946

7 files changed

+228
-105
lines changed

docs/en/guides/10-deploy/04-references/02-node-config/02-query-config.md

Lines changed: 2 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -109,9 +109,10 @@ The following is a list of the parameters available within the [log.history] sec
109109
| Parameter | Description |
110110
| ----------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
111111
| on | Enables or disables the history logging feature. Defaults to false. Set to true to enable history tables. |
112+
| log_only | Nodes with this option enabled delegate transformation tasks to other nodes, reducing their own workload. |
112113
| interval | Specifies the interval (in seconds) at which the history log is flushed. Defaults to 2. |
113114
| stage_name | Specifies the name of the staging area that temporarily holds log data before it is finally copied into the table. Defaults to a unique value to avoid conflicts.|
114-
| level | Sets the log level (DEBUG, TRACE, INFO, WARN, or ERROR) for history logging. Defaults to WARN. |
115+
| level | Sets the log level (DEBUG, TRACE, INFO, WARN, or ERROR) for history logging. Defaults to WARN. |
115116
| retention_interval| The interval (in hours) at which the retention process runs to check whether old data needs to be cleaned up. Defaults to 24. |
116117
| tables | Specifies which history tables to enable and their retention policies. This is an array of objects, each with table_name (the name of the history table) and retention (the retention period in hours for that table). |
117118

Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,84 @@
1+
---
2+
title: system_history.access_history
3+
---
4+
5+
This table provides detailed logging of objects accessed and modified by each query, including tables, columns, and stages, as part of the query metadata. It provides structured information about DDL and DML operations to enhance auditing.
6+
7+
8+
## Fields
9+
10+
| Field | Type | Description |
11+
|-------------------------|-----------|-----------------------------------------------------------------------------|
12+
| query_id | VARCHAR | The ID of the query. |
13+
| query_start | TIMESTAMP | The start time of the query. |
14+
| user_name | VARCHAR | The name of the user who executed the query. |
15+
| base_objects_accessed | VARIANT | The objects accessed by the query. |
16+
| direct_objects_accessed | VARIANT | Reserved for future use; currently not in use. |
17+
| objects_modified | VARIANT | The objects modified by the query. |
18+
| object_modified_by_ddl  | VARIANT   | The objects modified by DDL statements (e.g., `CREATE TABLE`, `ALTER TABLE`). |
19+
20+
The fields `base_objects_accessed`, `objects_modified`, and `object_modified_by_ddl` are all arrays of JSON objects. Each object may include the following fields:
21+
22+
- `object_domain`: The type of object, one of [`Database`, `Table`, `Stage`].
23+
- `object_name`: The name of the object. For stages, this is the stage name.
24+
- `columns`: Column information, present only when `object_domain` is `Table`.
25+
- `stage_type`: The type of stage, present only when `object_domain` is `Stage`.
26+
- `operation_type`: The DDL operation type, one of [`Create`, `Alter`, `Drop`, `Undrop`], present only in the `object_modified_by_ddl` field.
27+
- `properties`: Detailed information about the DDL operation, present only in the `object_modified_by_ddl` field.
28+
29+
## Examples
30+
31+
32+
```sql
33+
CREATE TABLE t (a INT, b string);
34+
```
35+
36+
This statement is recorded as:
37+
38+
```
39+
query_id: c2c1c7be-cee4-4868-a28e-8862b122c365
40+
query_start: 2025-06-12 03:31:19.042128
41+
user_name: root
42+
base_objects_accessed: []
43+
direct_objects_accessed: []
44+
objects_modified: []
45+
object_modified_by_ddl: [{"object_domain":"Table","object_name":"default.default.t","operation_type":"Create","properties":{"columns":[{"column_name":"a","sub_operation_type":"Add"},{"column_name":"b","sub_operation_type":"Add"}],"create_options":{"compression":"zstd","database_id":"1","storage_format":"parquet"}}}]
46+
```
47+
48+
`CREATE TABLE` is a DDL operation, so it will be recorded in the `object_modified_by_ddl` field.
49+
50+
51+
```sql
52+
INSERT INTO t VALUES (1, 'book');
53+
```
54+
55+
This statement is recorded as:
56+
57+
```
58+
query_id: e92ebc00-a07e-4138-92a9-ea17a06f0165
59+
query_start: 2025-06-12 03:31:29.849848
60+
user_name: root
61+
base_objects_accessed: []
62+
direct_objects_accessed: []
63+
objects_modified: [{"columns":[{"column_name":"a"},{"column_name":"b"}],"object_domain":"Table","object_name":"default.default.t"}]
64+
object_modified_by_ddl: []
65+
```
66+
67+
`INSERT INTO` is a DML operation, so it will be recorded in the `objects_modified` field.
68+
69+
70+
```sql
71+
COPY INTO @s FROM t;
72+
```
73+
74+
```
75+
query_id: 7fd74374-c04a-4989-a6f7-bfe8cc27e511
76+
query_start: 2025-06-12 03:32:25.682248
77+
user_name: root
78+
base_objects_accessed: [{"columns":[{"column_name":"a"},{"column_name":"b"}],"object_domain":"Table","object_name":"default.default.t"}]
79+
direct_objects_accessed: []
80+
objects_modified: [{"object_domain":"Stage","object_name":"s","stage_type":"Internal"}]
81+
object_modified_by_ddl: []
82+
```
83+
84+
The `COPY INTO` operation from table `t` to internal stage `s` involves both read and write actions. After executing this query, the source table will be recorded in the `base_objects_accessed` field, and the target stage will be recorded in the `objects_modified` field.
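
Because these fields are arrays of JSON objects, a client can summarize them after fetching a row. The following Python sketch is illustrative only: the helper name `summarize` is hypothetical, and the input strings are copied from the `CREATE TABLE` and `COPY INTO` examples above.

```python
import json

# Value of `object_modified_by_ddl` from the CREATE TABLE example above.
DDL_JSON = (
    '[{"object_domain":"Table","object_name":"default.default.t",'
    '"operation_type":"Create","properties":{"columns":['
    '{"column_name":"a","sub_operation_type":"Add"},'
    '{"column_name":"b","sub_operation_type":"Add"}],'
    '"create_options":{"compression":"zstd","database_id":"1",'
    '"storage_format":"parquet"}}}]'
)

# Value of `objects_modified` from the COPY INTO example above.
MODIFIED_JSON = '[{"object_domain":"Stage","object_name":"s","stage_type":"Internal"}]'


def summarize(objects_json: str) -> list:
    """Return one human-readable line per object in a VARIANT array field."""
    lines = []
    for obj in json.loads(objects_json):
        parts = [obj["object_domain"], obj["object_name"]]
        if "operation_type" in obj:  # present only in object_modified_by_ddl
            parts.append(obj["operation_type"])
        # Columns appear at the top level for DML and under `properties` for DDL.
        columns = obj.get("columns") or obj.get("properties", {}).get("columns", [])
        if columns:
            parts.append("columns=" + ",".join(c["column_name"] for c in columns))
        if "stage_type" in obj:  # present only for stages
            parts.append("stage_type=" + obj["stage_type"])
        lines.append(" ".join(parts))
    return lines


if __name__ == "__main__":
    print(summarize(DDL_JSON))
    print(summarize(MODIFIED_JSON))
```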

docs/en/sql-reference/00-sql-reference/32-system-history-tables/index.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@ System history tables store persistent data in the `system_history` schema for a
1818
| [system_history.query_history](query-history.md) | Stores structured details of query execution. |
1919
| [system_history.profile_history](profile-history.md)| Stores detailed query execution profiles and statistics. |
2020
| [system_history.login_history](login-history.md) | Records information about user login events. |
21+
| [system_history.access_history](access-history.md) | Stores the objects accessed and modified by each query. |
2122

2223
## Enabling System History Tables
2324

@@ -51,7 +52,7 @@ table_name = "login_history"
5152
retention = 168
5253
```
5354

54-
> **Note:** The `log_history` table is enabled by default when history logging is turned on.
55+
> **Note:** The `log_history` table is enabled by default when history logging is turned on. The `level` setting determines how many log entries are stored in the `log_history` table: a more verbose level produces more entries.
5556
5657

5758
For more details about configuration options, see [Query Configuration: [log.history] Section](/guides/deploy/references/node-config/query-config#loghistory-section).

docs/en/sql-reference/00-sql-reference/32-system-history-tables/log-history.md

Lines changed: 28 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -6,27 +6,36 @@ Stores raw log entries ingested from various nodes. This table acts as the prima
66

77
All other log tables are derived from this table; they apply additional transformations to make the data more structured.
88

9-
```sql
10-
DESCRIBE system_history.log_history;
11-
12-
╭──────────────────────────────────────────────────────╮
13-
│ Field │ Type │ Null │ Default │ Extra │
14-
│ String │ String │ String │ String │ String │
15-
├──────────────┼───────────┼────────┼─────────┼────────┤
16-
│ timestamp │ TIMESTAMP │ YES │ NULL │ │
17-
│ path │ VARCHAR │ YES │ NULL │ │
18-
│ target │ VARCHAR │ YES │ NULL │ │
19-
│ log_level │ VARCHAR │ YES │ NULL │ │
20-
│ cluster_id │ VARCHAR │ YES │ NULL │ │
21-
│ node_id │ VARCHAR │ YES │ NULL │ │
22-
│ warehouse_id │ VARCHAR │ YES │ NULL │ │
23-
│ query_id │ VARCHAR │ YES │ NULL │ │
24-
│ message │ VARCHAR │ YES │ NULL │ │
25-
│ fields │ VARIANT │ YES │ NULL │ │
26-
│ batch_number │ BIGINT │ YES │ NULL │ │
27-
╰──────────────────────────────────────────────────────╯
9+
## Fields
10+
11+
| Field | Type | Description |
12+
|--------------|-----------|--------------------------------------------------|
13+
| timestamp | TIMESTAMP | The timestamp when the log entry was recorded |
14+
| path | VARCHAR | Source file path and line number of the log |
15+
| target | VARCHAR | Target module or component of the log |
16+
| log_level | VARCHAR | Log level (e.g., `INFO`, `ERROR`) |
17+
| cluster_id | VARCHAR | Identifier of the cluster |
18+
| node_id | VARCHAR | Identifier of the node |
19+
| warehouse_id | VARCHAR | Identifier of the warehouse |
20+
| query_id | VARCHAR | Query ID associated with the log |
21+
| message | VARCHAR | Log message content |
22+
| fields | VARIANT | Additional fields (as a JSON object) |
23+
| batch_number | BIGINT | Internal use, no special meaning |
24+
25+
Note: The `message` field stores plain text logs, while the `fields` field stores logs in JSON format.
26+
27+
For example, the `fields` field of a log entry might look like:
28+
```
29+
fields: {"node_id":"8R5ZMF8q0HHE6x9H7U1gr4","query_id":"72d2319a-b6d6-4b1d-8694-670137a40d87","session_id":"189fd3e2-e6ac-48c3-97ef-73094c141312","sql":"select * from system_history.log_history"}
2830
```
2931

32+
The `message` field of another log entry might appear as follows:
33+
```
34+
message: [HTTP-QUERY] Preparing to plan SQL query
35+
```
36+
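The difference between the two fields matters when post-processing rows: `fields` can be parsed as JSON, while `message` is treated as opaque text. A minimal Python sketch, using the two sample values above; the helper name `extract_sql` is hypothetical, and the presence of an `sql` key is an assumption based on that single sample.

```python
import json
from typing import Optional

# Sample values copied from the two log entries shown above.
FIELDS = (
    '{"node_id":"8R5ZMF8q0HHE6x9H7U1gr4",'
    '"query_id":"72d2319a-b6d6-4b1d-8694-670137a40d87",'
    '"session_id":"189fd3e2-e6ac-48c3-97ef-73094c141312",'
    '"sql":"select * from system_history.log_history"}'
)
MESSAGE = "[HTTP-QUERY] Preparing to plan SQL query"


def extract_sql(fields_json: Optional[str]) -> Optional[str]:
    """Return the `sql` entry of a `fields` value, or None if it is absent."""
    if not fields_json:
        return None
    return json.loads(fields_json).get("sql")


if __name__ == "__main__":
    print(extract_sql(FIELDS))  # structured: look up a key
    print(MESSAGE)              # plain text: use as-is
```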
37+
## Examples
38+
3039
```sql
3140
SELECT * FROM system_history.log_history LIMIT 1;
3241

docs/en/sql-reference/00-sql-reference/32-system-history-tables/login-history.md

Lines changed: 20 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,25 @@ title: system_history.login_history
44

55
Records all login attempts in the system, including successful and failed login attempts. This table is useful for auditing user access and troubleshooting authentication issues.
66

7+
8+
## Fields
9+
10+
| Field | Type | Description |
11+
|----------------|-----------|------------------------------------------------------------ |
12+
| event_time | TIMESTAMP | The timestamp when the login event occurred |
13+
| handler | VARCHAR | The protocol or handler used for the login (e.g., `HTTP`) |
14+
| event_type | VARCHAR | The type of login event (e.g., `LoginSuccess`, `LoginFailed`) |
15+
| connection_uri | VARCHAR | The URI used for the connection |
16+
| auth_type | VARCHAR | The authentication method used (e.g., `Password`) |
17+
| user_name | VARCHAR | The name of the user attempting to log in |
18+
| client_ip | VARCHAR | The IP address of the client |
19+
| user_agent | VARCHAR | The user agent string of the client |
20+
| session_id | VARCHAR | The session ID associated with the login attempt |
21+
| node_id | VARCHAR | The node ID where the login was processed |
22+
| error_message | VARCHAR | The error message if the login failed |
23+
24+
## Examples
25+
726
Successful login example:
827
```sql
928
SELECT * FROM system_history.login_history LIMIT 1;
@@ -12,7 +31,7 @@ SELECT * FROM system_history.login_history LIMIT 1;
1231
event_time: 2025-06-03 06:04:57.353108
1332
handler: HTTP
1433
event_type: LoginSuccess
15-
connection_uri: /query
34+
connection_uri: /session/login?disable_session_token=true
1635
auth_type: Password
1736
user_name: root
1837
client_ip: 127.0.0.1

docs/en/sql-reference/00-sql-reference/32-system-history-tables/profile-history.md

Lines changed: 12 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -3,21 +3,19 @@ title: system_history.profile_history
33
---
44
Stores detailed execution profiles for SQL queries in Databend. Each entry provides performance metrics and execution statistics, allowing users to analyze and optimize query performance.
55

6-
The `profiles` field contains a JSON object with detailed information.
6+
## Fields
77

8-
```sql
9-
DESCRIBE system_history.profile_history;
10-
11-
╭─────────────────────────────────────────────────────────╮
12-
│ Field │ Type │ Null │ Default │ Extra │
13-
│ String │ String │ String │ String │ String │
14-
├─────────────────┼───────────┼────────┼─────────┼────────┤
15-
│ timestamp │ TIMESTAMP │ YES │ NULL │ │
16-
│ query_id │ VARCHAR │ YES │ NULL │ │
17-
│ profiles │ VARIANT │ YES │ NULL │ │
18-
│ statistics_desc │ VARIANT │ YES │ NULL │ │
19-
╰─────────────────────────────────────────────────────────╯
20-
```
8+
9+
| Field | Type | Description |
10+
|-----------------|-----------|-----------------------------------------------------------------------------|
11+
| timestamp | TIMESTAMP | The timestamp when the profile was recorded |
12+
| query_id | VARCHAR | The ID of the query associated with this profile |
13+
| profiles | VARIANT | A JSON object containing detailed execution profile information |
14+
| statistics_desc | VARIANT | A JSON object describing the format of the statistics |
15+
16+
17+
18+
## Examples
2119

2220
The `profiles` field can be used to extract specific information. For example, to get the `OutputRows` value for every physical plan, the following query can be used:
2321
```sql
