71 changes: 70 additions & 1 deletion CHANGELOG.md
Expand Up @@ -2,6 +2,75 @@

All notable changes to this project will be documented in this file. See [standard-version](https://github.com/conventional-changelog/standard-version) for commit guidelines.

### [4.1.1](https://github.com/DTStack/dt-sql-parser/compare/v4.1.0...v4.1.1) (2025-02-17)


### Bug Fixes

* **flink:** [#398](https://github.com/DTStack/dt-sql-parser/issues/398) fix flinksql built-in function's using ([#399](https://github.com/DTStack/dt-sql-parser/issues/399)) ([f8afbe2](https://github.com/DTStack/dt-sql-parser/commit/f8afbe29b3bbe47ace0e04476ceb50fb44994235))

## [4.1.0](https://github.com/DTStack/dt-sql-parser/compare/v4.0.1...v4.1.0) (2025-02-13)


### Features

* add alter table stmt ([#312](https://github.com/DTStack/dt-sql-parser/issues/312)) ([5aade9e](https://github.com/DTStack/dt-sql-parser/commit/5aade9e6daafc2c6e70c5202d7ef06572ec37f6e))
* add benchmark test suite ([#273](https://github.com/DTStack/dt-sql-parser/issues/273)) ([de1bd9d](https://github.com/DTStack/dt-sql-parser/commit/de1bd9de4cb7c3b42d51bedd79635eb91afba9ed))
* **basicSql:** remove judge splitListener/collectListener, all sqlParser implements it ([#316](https://github.com/DTStack/dt-sql-parser/issues/316)) ([eb2e920](https://github.com/DTStack/dt-sql-parser/commit/eb2e920e345aef98285ba261c2060db61d1d56b8))
* sync some useful syntax from antlr/grammars-v4 ([95a1087](https://github.com/DTStack/dt-sql-parser/commit/95a108744bb40e418056faaf86bd97b85dd191f8))
* upgrade trino to 450 ([#323](https://github.com/DTStack/dt-sql-parser/issues/323)) ([2b0de6a](https://github.com/DTStack/dt-sql-parser/commit/2b0de6a3da16561ec52b0c69d4e052226d54a553))
* use common sql to run benchmark ([#326](https://github.com/DTStack/dt-sql-parser/issues/326)) ([76d0900](https://github.com/DTStack/dt-sql-parser/commit/76d090040e7af26227727673a82f77cda08b3f9e))


### Bug Fixes

* [#351](https://github.com/DTStack/dt-sql-parser/issues/351) antlr4 command optimize ([74d6435](https://github.com/DTStack/dt-sql-parser/commit/74d643599eb5603279a180262c49eccb04779a30))
* [#381](https://github.com/DTStack/dt-sql-parser/issues/381) antlr4 flink grammar ([74be81c](https://github.com/DTStack/dt-sql-parser/commit/74be81cc695cb26f9b7e90c866e8183f34020a42))
* add hash partition table keywords MODULUS and REMAINDER ([#384](https://github.com/DTStack/dt-sql-parser/issues/384)) ([f2e6b60](https://github.com/DTStack/dt-sql-parser/commit/f2e6b605eca5f8221588d2ca9b85ac2b824aae8d))
* alert to alterView ([#346](https://github.com/DTStack/dt-sql-parser/issues/346)) ([9ba5100](https://github.com/DTStack/dt-sql-parser/commit/9ba51007e2f21ab8bc42623596ee281801904cfa))
* **benchmark:** add reports dir judge and remove plsql and include pgsql ([9c534c2](https://github.com/DTStack/dt-sql-parser/commit/9c534c25cacba3cfba6bd234c68e8f27bd90b2e2))
* build mysql ([5d6ff46](https://github.com/DTStack/dt-sql-parser/commit/5d6ff4662a11acf9f16b1f18c41c204922890df9))
* **ci:** add antlr4 all sql in ci ([2b30e78](https://github.com/DTStack/dt-sql-parser/commit/2b30e781a24f9d7685e46ebc90b1cc153f7e267e))
* **ci:** change ci and add hash judge ([276cc34](https://github.com/DTStack/dt-sql-parser/commit/276cc34c55bacd34cda4e8eeb7eef5f0955f9b82))
* **ci:** change crypto to devDependencies ([b788e1c](https://github.com/DTStack/dt-sql-parser/commit/b788e1ca788308cc56601bcbf7ae24f3156e3af9))
* createFunction and createFunctionLoadable ([e83449a](https://github.com/DTStack/dt-sql-parser/commit/e83449a0cc0a50be510c7b4a3337597b1890fc92))
* flinksql function params add more time functions ([#347](https://github.com/DTStack/dt-sql-parser/issues/347)) ([b835c4b](https://github.com/DTStack/dt-sql-parser/commit/b835c4b5b506c8e4bf0bd9c99fe66c15e53a179b))
* **hive:** add select into configPropertiesItem ([#365](https://github.com/DTStack/dt-sql-parser/issues/365)) ([bdb4b96](https://github.com/DTStack/dt-sql-parser/commit/bdb4b962f2e170c4e703359a9cd6a451f7b8fd60))
* **impala:** fix alter table change statement ([#332](https://github.com/DTStack/dt-sql-parser/issues/332)) ([4a9681e](https://github.com/DTStack/dt-sql-parser/commit/4a9681ed3bd188e41c30a6d7be39d6e77df7f61b))
* mysql case when ([#317](https://github.com/DTStack/dt-sql-parser/issues/317)) ([fea1ad1](https://github.com/DTStack/dt-sql-parser/commit/fea1ad1a357b70291a240eca6d2058bab9b49469))
* **postgresql:** change func_application to add column_name and paren ([#359](https://github.com/DTStack/dt-sql-parser/issues/359)) ([9a5eda8](https://github.com/DTStack/dt-sql-parser/commit/9a5eda8d80789e37f2904a1ceb3f8c646237a207))
* **postgresql:** combine plsql_unreserved_keyword to unreserved_keyword and remove unused rules ([7884cbe](https://github.com/DTStack/dt-sql-parser/commit/7884cbe37844c057fa41fde4d0716af43c4023af))
* **trino:** update timezone grammar to avoid ambiguity ([#394](https://github.com/DTStack/dt-sql-parser/issues/394)) ([05134bc](https://github.com/DTStack/dt-sql-parser/commit/05134bc569996d108f961a7228c2e34cea0fd98b))
* update isContainCaret judgment when caret position token is whit… ([#390](https://github.com/DTStack/dt-sql-parser/issues/390)) ([20f065d](https://github.com/DTStack/dt-sql-parser/commit/20f065d6f099ee6e021d9b0499e4c4aa7de92e6b))

## [4.1.0-beta.0](https://github.com/DTStack/dt-sql-parser/compare/v4.0.1...v4.1.0-beta.0) (2024-08-27)


### Features

* add alter table stmt ([#312](https://github.com/DTStack/dt-sql-parser/issues/312)) ([5aade9e](https://github.com/DTStack/dt-sql-parser/commit/5aade9e6daafc2c6e70c5202d7ef06572ec37f6e))
* add benchmark test suite ([#273](https://github.com/DTStack/dt-sql-parser/issues/273)) ([de1bd9d](https://github.com/DTStack/dt-sql-parser/commit/de1bd9de4cb7c3b42d51bedd79635eb91afba9ed))
* **basicSql:** remove judge splitListener/collectListener, all sqlParser implements it ([#316](https://github.com/DTStack/dt-sql-parser/issues/316)) ([eb2e920](https://github.com/DTStack/dt-sql-parser/commit/eb2e920e345aef98285ba261c2060db61d1d56b8))
* collect entity's attribute([#333](https://github.com/DTStack/dt-sql-parser/issues/333)) ([a3b6b7e](https://github.com/DTStack/dt-sql-parser/commit/a3b6b7eb8bad2444b16481985278461c35360570))
* **flinksql:** collect comment, type attribute for entity ([#319](https://github.com/DTStack/dt-sql-parser/issues/319)) ([ae52ebd](https://github.com/DTStack/dt-sql-parser/commit/ae52ebdd6b6d1511cf92eb09521b06bdec66ba0d)), closes [#305](https://github.com/DTStack/dt-sql-parser/issues/305)
* improve errorListener msg ([#281](https://github.com/DTStack/dt-sql-parser/issues/281)) ([deef123](https://github.com/DTStack/dt-sql-parser/commit/deef1238bb25d5bfee80ddaf1fea5ad48178d17b))
* sync some useful syntax from antlr/grammars-v4 ([95a1087](https://github.com/DTStack/dt-sql-parser/commit/95a108744bb40e418056faaf86bd97b85dd191f8))
* upgrade trino to 450 ([#323](https://github.com/DTStack/dt-sql-parser/issues/323)) ([2b0de6a](https://github.com/DTStack/dt-sql-parser/commit/2b0de6a3da16561ec52b0c69d4e052226d54a553))
* use common sql to run benchmark ([#326](https://github.com/DTStack/dt-sql-parser/issues/326)) ([76d0900](https://github.com/DTStack/dt-sql-parser/commit/76d090040e7af26227727673a82f77cda08b3f9e))


### Bug Fixes

* alert to alterView ([#346](https://github.com/DTStack/dt-sql-parser/issues/346)) ([9ba5100](https://github.com/DTStack/dt-sql-parser/commit/9ba51007e2f21ab8bc42623596ee281801904cfa))
* **benchmark:** add reports dir judge and remove plsql and include pgsql ([9c534c2](https://github.com/DTStack/dt-sql-parser/commit/9c534c25cacba3cfba6bd234c68e8f27bd90b2e2))
* build mysql ([5d6ff46](https://github.com/DTStack/dt-sql-parser/commit/5d6ff4662a11acf9f16b1f18c41c204922890df9))
* createFunction and createFunctionLoadable ([e83449a](https://github.com/DTStack/dt-sql-parser/commit/e83449a0cc0a50be510c7b4a3337597b1890fc92))
* flinksql function params add more time functions ([#347](https://github.com/DTStack/dt-sql-parser/issues/347)) ([b835c4b](https://github.com/DTStack/dt-sql-parser/commit/b835c4b5b506c8e4bf0bd9c99fe66c15e53a179b))
* **impala:** fix alter table change statement ([#332](https://github.com/DTStack/dt-sql-parser/issues/332)) ([4a9681e](https://github.com/DTStack/dt-sql-parser/commit/4a9681ed3bd188e41c30a6d7be39d6e77df7f61b))
* mysql case when ([#317](https://github.com/DTStack/dt-sql-parser/issues/317)) ([fea1ad1](https://github.com/DTStack/dt-sql-parser/commit/fea1ad1a357b70291a240eca6d2058bab9b49469))
* **postgresql:** combine plsql_unreserved_keyword to unreserved_keyword and remove unused rules ([7884cbe](https://github.com/DTStack/dt-sql-parser/commit/7884cbe37844c057fa41fde4d0716af43c4023af))
* spell check ([#337](https://github.com/DTStack/dt-sql-parser/issues/337)) ([694b0cd](https://github.com/DTStack/dt-sql-parser/commit/694b0cdf15943d02a9402a748155a1b06508af95))

### [4.0.2](https://github.com/DTStack/dt-sql-parser/compare/v4.0.1...v4.0.2) (2024-06-19)


Expand Down Expand Up @@ -61,7 +130,7 @@ All notable changes to this project will be documented in this file. See [standa

### Features

* add toMatchUnorderedArrary matcher and apply it ([#271](https://github.com/DTStack/dt-sql-parser/issues/271)) ([a05f099](https://github.com/DTStack/dt-sql-parser/commit/a05f099aa1ad555c408bc2018240fb4611ec09b8))
* add toMatchUnorderedArray matcher and apply it ([#271](https://github.com/DTStack/dt-sql-parser/issues/271)) ([a05f099](https://github.com/DTStack/dt-sql-parser/commit/a05f099aa1ad555c408bc2018240fb4611ec09b8))
* collect entity ([#265](https://github.com/DTStack/dt-sql-parser/issues/265)) ([a997211](https://github.com/DTStack/dt-sql-parser/commit/a99721162be0d463b513f53bb13ada6d10168548)), closes [#256](https://github.com/DTStack/dt-sql-parser/issues/256) [#263](https://github.com/DTStack/dt-sql-parser/issues/263) [#268](https://github.com/DTStack/dt-sql-parser/issues/268)
* migrate to antlr4ng ([#267](https://github.com/DTStack/dt-sql-parser/issues/267)) ([195878d](https://github.com/DTStack/dt-sql-parser/commit/195878da9bb1ff8011b5d60c02389fa66d2bc0b8))
* **spark:** support materialized view for spark sql ([#262](https://github.com/DTStack/dt-sql-parser/issues/262)) ([5ce89cb](https://github.com/DTStack/dt-sql-parser/commit/5ce89cb421de18330d56e23a4ab5b658b2130a0b))
49 changes: 49 additions & 0 deletions README-zh_CN.md
Expand Up @@ -356,6 +356,55 @@ console.log(sqlSlices)

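Position information is optional. If it is passed, then for each collected entity located in the statement at that position, the statement object the entity belongs to carries the `isContainCaret` flag. Combined with auto-completion, this helps you quickly filter out the entities you need.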


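### Get semantic context information
Call the `getSemanticContextAtCaretPosition` method on the SQL instance, passing in the SQL text and the line and column number of the specified position. For example: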
```typescript
import { HiveSQL } from 'dt-sql-parser';

const hive = new HiveSQL();
const sql = 'SELECT * FROM tb;';
const pos = { lineNumber: 1, column: 18 }; // after 'tb;'
const semanticContext = hive.getSemanticContextAtCaretPosition(sql, pos);

console.log(semanticContext);
```

*Output*

```typescript
/*
{
isStatementBeginning: true,
}
*/
```

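The semantic context information that can currently be collected is listed below. If you need more, feel free to open an [issue](https://github.com/DTStack/dt-sql-parser/issues).
- `isStatementBeginning`: whether the current input position is the beginning of a statement

By default, `isStatementBeginning` is collected with the `SqlSplitStrategy.STRICT` strategy.

Two strategies are available:
- `SqlSplitStrategy.STRICT`: strict strategy; only the statement delimiter `;` marks the end of the previous statement
- `SqlSplitStrategy.LOOSE`: loose strategy; the SQL is split based on the syntax parse tree

The difference between the two strategies:
For example, if the input SQL is: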
```sql
CREATE TABLE tb (id INT)

SELECT
```
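The CREATE statement is not followed by a semicolon, so when the semantic context after SELECT is requested:
under `SqlSplitStrategy.STRICT`, `isStatementBeginning` is `false`, because the CREATE statement does not end with a semicolon and is therefore considered unfinished;
under `SqlSplitStrategy.LOOSE`, `isStatementBeginning` is `true`, because the syntax parse tree splits this SQL into an independent CREATE statement and an independent SELECT statement.

The strategy can be set through the third `options` parameter: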
```typescript
hive.getSemanticContextAtCaretPosition(sql, pos, { splitSqlStrategy: SqlSplitStrategy.LOOSE });
```

### Other API

- `createLexer` Create an instance of Antlr4 Lexer and return it;
51 changes: 51 additions & 0 deletions README.md
Expand Up @@ -357,6 +357,57 @@ Call the `getAllEntities` method on the SQL instance, and pass in the sql text a

The position is optional. If it is passed, then for each collected entity located in the statement that contains the position, the statement object the entity belongs to is marked with `isContainCaret`. Combined with the code completion function, this helps you quickly filter out the entities you need.
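
As a minimal sketch of the filtering that the `isContainCaret` flag enables, the snippet below keeps only the entities whose statement contains the caret. The `CollectedEntity` and `StatementInfo` shapes, including the `text` and `belongStmt` field names, are simplified assumptions made for illustration; check the library's type definitions for the actual structure.

```typescript
// Hypothetical, simplified shapes (assumptions for illustration only).
interface StatementInfo {
    // Set on the statement that contains the caret position.
    isContainCaret?: boolean;
}

interface CollectedEntity {
    text: string;
    belongStmt: StatementInfo;
}

// Keep only entities whose owning statement is marked with isContainCaret.
function entitiesAtCaret(entities: CollectedEntity[]): CollectedEntity[] {
    return entities.filter((entity) => entity.belongStmt.isContainCaret === true);
}

// Hand-written sample data standing in for the result of getAllEntities.
const sample: CollectedEntity[] = [
    { text: 'tb1', belongStmt: {} },
    { text: 'tb2', belongStmt: { isContainCaret: true } },
];

const caretEntities = entitiesAtCaret(sample).map((entity) => entity.text);
console.log(caretEntities); // [ 'tb2' ]
```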

### Get semantic context information

Call the `getSemanticContextAtCaretPosition` method on the SQL instance, passing in the SQL text and the line and column number of the specified position. For example:

```typescript
import { HiveSQL } from 'dt-sql-parser';

const hive = new HiveSQL();
const sql = 'SELECT * FROM tb;';
const pos = { lineNumber: 1, column: 18 }; // after 'tb;'
const semanticContext = hive.getSemanticContextAtCaretPosition(sql, pos);

console.log(semanticContext);
```

*output*

```typescript
/*
{
isStatementBeginning: true,
}
*/
```

The semantic context information that can currently be collected is listed below. If you need more, please submit an [issue](https://github.com/DTStack/dt-sql-parser/issues).

- `isStatementBeginning`: whether the current input position is the beginning of a statement

The **default strategy** for `isStatementBeginning` is `SqlSplitStrategy.STRICT`.

Two strategies are available:
- `SqlSplitStrategy.STRICT`: strict strategy; only the statement delimiter `;` marks the end of the previous statement
- `SqlSplitStrategy.LOOSE`: loose strategy; the SQL is split based on the syntax parse tree

The difference between the two strategies:
For example, if the input SQL is:
```sql
CREATE TABLE tb (id INT)

SELECT
```
The CREATE statement is not followed by a semicolon, so when the semantic context after SELECT is requested:

With `SqlSplitStrategy.STRICT`, `isStatementBeginning` is `false`, because the CREATE statement is not terminated by a semicolon and is therefore considered unfinished.

With `SqlSplitStrategy.LOOSE`, `isStatementBeginning` is `true`, because the syntax parse tree splits the SQL into two independent statements: CREATE and SELECT.

You can set the strategy through the third `options` parameter:
```typescript
hive.getSemanticContextAtCaretPosition(sql, pos, { splitSqlStrategy: SqlSplitStrategy.LOOSE });
```
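
To make the strict rule concrete, here is a rough, library-free sketch of the idea behind `SqlSplitStrategy.STRICT`: the caret is at a statement beginning only if everything before the word being typed is empty or ends with `;`. This is not the library's implementation. The names `CaretPosition`, `textBeforeCaret`, and `isStatementBeginningStrict` are hypothetical, and the sketch ignores comments and string literals, which a real parser has to handle.

```typescript
// Hypothetical helper types and functions (not part of dt-sql-parser).
interface CaretPosition {
    lineNumber: number; // 1-based line
    column: number; // 1-based column
}

// Return the text from the start of the input up to the caret.
function textBeforeCaret(sql: string, pos: CaretPosition): string {
    const lines = sql.split('\n');
    const before = lines.slice(0, pos.lineNumber - 1);
    const current = (lines[pos.lineNumber - 1] || '').slice(0, pos.column - 1);
    return before.concat([current]).join('\n');
}

// Strict rule: only `;` (or the start of the input) ends the previous statement.
function isStatementBeginningStrict(sql: string, pos: CaretPosition): boolean {
    const prefix = textBeforeCaret(sql, pos)
        .replace(/\w*$/, '') // drop the word currently being typed at the caret
        .replace(/\s+$/, ''); // drop trailing whitespace and blank lines
    return prefix === '' || prefix.endsWith(';');
}

const terminated = isStatementBeginningStrict('SELECT * FROM tb;', { lineNumber: 1, column: 18 });
const unterminated = isStatementBeginningStrict('CREATE TABLE tb (id INT)\n\nSELECT', {
    lineNumber: 3,
    column: 7,
});
const withSemicolon = isStatementBeginningStrict('CREATE TABLE tb (id INT);\n\nSELECT', {
    lineNumber: 3,
    column: 7,
});

console.log(terminated); // true
console.log(unterminated); // false
console.log(withSemicolon); // true
```

The loose strategy cannot be reduced to text scanning like this, because it relies on the parse tree to decide where one statement ends and the next begins.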

### Other API

- `createLexer` Create an instance of Antlr4 Lexer and return it;
5 changes: 5 additions & 0 deletions benchmark/benchmark.config.ts
Expand Up @@ -80,6 +80,11 @@ const testFiles: TestFile[] = [
includes: ['flink'],
testTypes: ['getSuggestionAtCaretPosition'],
},
{
name: 'Collect Semantics',
sqlFileName: 'select.sql',
testTypes: ['getSemanticContextAtCaretPosition'],
},
];

export default {
3 changes: 3 additions & 0 deletions benchmark/data/params.json
Expand Up @@ -6,5 +6,8 @@
"suggestion_flink": {
"getAllEntities": ["$sql", { "lineNumber": 1020, "column": 38 }],
"getSuggestionAtCaretPosition": ["$sql", { "lineNumber": 1020, "column": 38 }]
},
"select": {
"getSemanticContextAtCaretPosition": ["$sql", { "lineNumber": 997, "column": 25 }]
}
}
29 changes: 15 additions & 14 deletions benchmark_reports/cold_start/flink.benchmark.md
Expand Up @@ -4,33 +4,34 @@
FlinkSQL

### Report Time
2024/9/9 19:55:03
2024/12/18 14:50:08

### Device
macOS 14.4.1
macOS 15.0.1
(8) arm64 Apple M1 Pro
16.00 GB

### Version
`nodejs`: v21.6.1
`dt-sql-parser`: v4.0.2
`dt-sql-parser`: v4.1.0-beta.0
`antlr4-c3`: v3.3.7
`antlr4ng`: v2.0.11

### Running Mode
Cold Start

### Report
| Benchmark Name | Method Name |SQL Rows|Average Time(ms)|
|----------------|----------------------------|--------|----------------|
|Query Collection| getAllTokens | 1015 | 227 |
|Query Collection| validate | 1015 | 221 |
| Insert Columns | getAllTokens | 1001 | 65 |
| Insert Columns | validate | 1001 | 65 |
| Create Table | getAllTokens | 1004 | 27 |
| Create Table | validate | 1004 | 26 |
| Split SQL | splitSQLByStatement | 999 | 52 |
|Collect Entities| getAllEntities | 1056 | 141 |
| Suggestion |getSuggestionAtCaretPosition| 1056 | 131 |
| Benchmark Name | Method Name |SQL Rows|Average Time(ms)|
|-----------------|---------------------------------|--------|----------------|
| Query Collection| getAllTokens | 1015 | 257 |
| Query Collection| validate | 1015 | 277 |
| Insert Columns | getAllTokens | 1001 | 66 |
| Insert Columns | validate | 1001 | 67 |
| Create Table | getAllTokens | 1004 | 27 |
| Create Table | validate | 1004 | 28 |
| Split SQL | splitSQLByStatement | 999 | 53 |
| Collect Entities| getAllEntities | 1056 | 191 |
| Suggestion | getSuggestionAtCaretPosition | 1056 | 185 |
|Collect Semantics|getSemanticContextAtCaretPosition| 1015 | 247 |


33 changes: 17 additions & 16 deletions benchmark_reports/cold_start/hive.benchmark.md
Expand Up @@ -4,35 +4,36 @@
HiveSQL

### Report Time
2024/9/9 19:55:03
2024/12/18 14:50:08

### Device
macOS 14.4.1
macOS 15.0.1
(8) arm64 Apple M1 Pro
16.00 GB

### Version
`nodejs`: v21.6.1
`dt-sql-parser`: v4.0.2
`dt-sql-parser`: v4.1.0-beta.0
`antlr4-c3`: v3.3.7
`antlr4ng`: v2.0.11

### Running Mode
Cold Start

### Report
| Benchmark Name | Method Name |SQL Rows|Average Time(ms)|
|----------------|----------------------------|--------|----------------|
|Query Collection| getAllTokens | 1015 | 185 |
|Query Collection| validate | 1015 | 179 |
| Update Table | getAllTokens | 1011 | 112 |
| Update Table | validate | 1011 | 109 |
| Insert Columns | getAllTokens | 1001 | 329 |
| Insert Columns | validate | 1001 | 329 |
| Create Table | getAllTokens | 1002 | 21 |
| Create Table | validate | 1002 | 20 |
| Split SQL | splitSQLByStatement | 1001 | 72 |
|Collect Entities| getAllEntities | 1066 | 106 |
| Suggestion |getSuggestionAtCaretPosition| 1066 | 100 |
| Benchmark Name | Method Name |SQL Rows|Average Time(ms)|
|-----------------|---------------------------------|--------|----------------|
| Query Collection| getAllTokens | 1015 | 194 |
| Query Collection| validate | 1015 | 194 |
| Update Table | getAllTokens | 1011 | 126 |
| Update Table | validate | 1011 | 119 |
| Insert Columns | getAllTokens | 1001 | 326 |
| Insert Columns | validate | 1001 | 323 |
| Create Table | getAllTokens | 1002 | 21 |
| Create Table | validate | 1002 | 20 |
| Split SQL | splitSQLByStatement | 1001 | 71 |
| Collect Entities| getAllEntities | 1066 | 338 |
| Suggestion | getSuggestionAtCaretPosition | 1066 | 148 |
|Collect Semantics|getSemanticContextAtCaretPosition| 1015 | 201 |

