-
Notifications
You must be signed in to change notification settings - Fork 16.3k
feat(ag-grid): add SQLGlot-based SQL escaping for where and having filter clauses #36136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've completed my review and didn't find any issues.
Files scanned
| File Path | Reviewed |
|---|---|
| superset/utils/sql_sanitizer.py | ✅ |
| superset-frontend/packages/superset-ui-core/src/query/types/Query.ts | ✅ |
| superset-frontend/plugins/plugin-chart-ag-grid-table/src/buildQuery.ts | ✅ |
| superset/common/query_object.py | ✅ |
Explore our documentation to understand the languages and file types we support and the files we ignore.
Check out our docs on how you can make Korbit work best for you and your team.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #36136 +/- ##
===========================================
+ Coverage 0 68.14% +68.14%
===========================================
Files 0 632 +632
Lines 0 46464 +46464
Branches 0 5037 +5037
===========================================
+ Hits 0 31664 +31664
- Misses 0 13537 +13537
- Partials 0 1263 +1263
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…sary tests & print statements from the test file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've completed my review and didn't find any issues.
Files scanned
| File Path | Reviewed |
|---|---|
| superset/utils/sql_sanitizer.py | ✅ |
| superset-frontend/packages/superset-ui-core/src/query/types/Query.ts | ✅ |
| superset-frontend/plugins/plugin-chart-ag-grid-table/src/buildQuery.ts | ✅ |
| superset/common/query_object.py | ✅ |
Explore our documentation to understand the languages and file types we support and the files we ignore.
Check out our docs on how you can make Korbit work best for you and your team.
superset/common/query_object.py
Outdated
|
|
||
| if use_sqlglot_escaping: | ||
| database_backend = database.db_engine_spec.engine | ||
| clause = sanitize_sql_with_sqlglot( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also handle the case where the database_backend is for some reason undefined or None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces an opt-in SQLGlot-based SQL sanitization mechanism for WHERE and HAVING filter clauses in AG Grid queries. The solution provides database-agnostic SQL escaping and security validation to prevent SQL injection attacks through the extras parameter.
- Adds a new
sanitize_sql_with_sqlglot()function that parses SQL into an AST, validates for dangerous constructs (subqueries, set operations, DDL/DML), and regenerates with database-specific escaping - Integrates the sanitizer into
query_object.pywith an opt-inuse_sqlglot_escapingflag - Enables SQLGlot escaping by default for AG Grid table queries with server-side filtering
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| superset/utils/sql_sanitizer.py | New module implementing SQLGlot-based SQL sanitization with dialect mapping and security validation |
| superset/common/query_object.py | Integration of SQLGlot sanitizer into filter sanitization pipeline with opt-in flag support |
| superset-frontend/plugins/plugin-chart-ag-grid-table/src/buildQuery.ts | Enables use_sqlglot_escaping flag for AG Grid download queries with SQL clauses |
| superset-frontend/packages/superset-ui-core/src/query/types/Query.ts | Adds TypeScript type definition for use_sqlglot_escaping boolean flag in query extras |
| tests/unit_tests/utils/sql_sanitizer_escaping_test.py | Comprehensive test suite covering database-specific escaping, dangerous construct blocking, and edge cases |
Comments suppressed due to low confidence (1)
superset/common/query_object.py:370
- The SQLGlot sanitization result is assigned to
clausebut then immediately passed to the originalsanitize_clause()function which performs essentially the same parsing and regeneration. This means the SQL is being parsed and regenerated twice, which is redundant and inefficient.
Since sanitize_sql_with_sqlglot already performs database-specific SQL parsing and regeneration (which is what sanitize_clause does), you should either:
- Skip calling
sanitize_clause()whenuse_sqlglot_escapingis enabled, OR - Only call
sanitize_clause()and skip the SQLGlot sanitization when they're disabled
The current implementation performs duplicate work and may cause the clause to differ from what SQLGlot sanitized.
if use_sqlglot_escaping:
database_backend = database.db_engine_spec.engine
clause = sanitize_sql_with_sqlglot(
clause,
database_backend,
validate_structure=True,
)
engine = database.db_engine_spec.engine
sanitized_clause = sanitize_clause(clause, engine)
if sanitized_clause != clause:
self.extras[param] = sanitized_clause
|
Thanks for the review @sadpandajoe !! |
betodealmeida
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amaannawab923 can you give more context on what problem we're trying to solve here?
I don't think we should try to figure out the meaning of adhoc SQL provided by users. If users provide adhoc SQL it should be valid; they should quote things correctly according to the DB dialect. In the past we had lenient parsers and that led to multiple security issues.
Can you give an example of where this would be needed?
I also don't think we need to worry about DML and other attacks in here. We should first build the whole query with whatever adhoc filters the user provided, and only then validate that it only queries tables the user has access to, is a single statement, and only includes DML if the database allows. We already do all of that, so we shouldn't need to do it here.
# Conflicts: # superset/common/query_object.py
betodealmeida
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thanks for the awesome work! One small nit, you can use r" strings in Python so that you don't need to double the backslash, it makes it easier to understand what's happening — otherwise people might think the database is adding the two backslashes instead of 1.
| # SQLite - SQL-92 standard double single quotes | ||
| ("name = 'O''Hara'", "sqlite", "name = 'O''Hara'"), | ||
| # Snowflake - backslash escaping | ||
| ("name = 'O''Hara'", "snowflake", "name = 'O\\'Hara'"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit, you can use r" strings here and in other places to make it easier to read:
| ("name = 'O''Hara'", "snowflake", "name = 'O\\'Hara'"), | |
| ("name = 'O''Hara'", "snowflake", r"name = 'O\'Hara'"), |
SUMMARY
This PR adds SQLGlot-based SQL transpilation for AG Grid filter clauses, enabling cross-database compatibility.
Problem: AG Grid's server-side filtering generates SQL clauses using PostgreSQL-style syntax (e.g., ILIKE for case-insensitive matching, '' for quote escaping). When the
target database is MySQL, Presto, Snowflake, or another dialect, these clauses may fail or behave incorrectly.
Solution: Transpile filter SQL from generic syntax to the target database dialect using SQLGlot.
Key changes:
- Quote escaping: 'O''Hara' → 'O'Hara' (for Snowflake, BigQuery, Databricks)
- Operator conversion: ILIKE '%test%' → LOWER(col) LIKE LOWER('%test%') (for MySQL, Presto)
Supported dialects: PostgreSQL, MySQL, SQLite, Snowflake, BigQuery, Databricks, Presto, Trino, MSSQL, and others in SQLGLOT_DIALECTS.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION