
fix: address escape literal issue #21516 #21599

Draft
SamAya21 wants to merge 2 commits into apache:main from SamAya21:fix-escape-literals

SamAya21 commented Apr 13, 2026

Which issue does this PR close?
Closes #21516

Rationale for this change
Issue #21516 reports that, in Spark mode, SQL string literals such as '\thello' and '\nhello' are treated as literal backslash text instead of going through Spark-compatible escape handling. Because the fix is applied to the literal itself, it should cover all string functions, not just soundex and length, and should bring datafusion-spark results closer to PySpark's.
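
To illustrate the behavior difference described above, here is a minimal sketch of backslash unescaping for the body of a string literal. The function name `unescape_spark_like` is hypothetical, it is not the code in this PR, and it covers only a small subset of escapes. How unknown escapes and a trailing backslash are treated here is an assumption for the sake of the example.

```rust
/// Hypothetical helper (not the PR's implementation): decode a small subset
/// of backslash escapes found in the body of a SQL string literal.
fn unescape_spark_like(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            // Ordinary character: copy it through unchanged.
            out.push(c);
            continue;
        }
        // A backslash was seen: look at the character that follows it.
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('r') => out.push('\r'),
            Some('\\') => out.push('\\'),
            Some('\'') => out.push('\''),
            Some('"') => out.push('"'),
            // Assumption: any other escaped character is kept without the backslash.
            Some(other) => out.push(other),
            // Assumption: a lone trailing backslash is kept as-is.
            None => out.push('\\'),
        }
    }
    out
}
```

For the literal from the issue, the raw text \thello is 7 characters long, while the decoded value is 6 characters (a tab followed by hello), which is why functions such as length are affected.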

What changes are included in this PR?

  • Added a Spark-only SQL parser option, 'datafusion.sql_parser.spark_string_literal_unescape'
  • Threaded the option through 'ParserOptions'
  • Updated 'datafusion/sql/src/expr/value.rs' to unescape string literals only when the Spark-only option is enabled
  • Enabled the option from 'with_spark_features()' in 'datafusion/spark/src/session_state.rs'
  • Updated the tests, sqllogictest expectations, and test snapshots
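
The gating described in the list above can be sketched roughly as follows. This is a self-contained illustration, not the PR's code: `ParserOptionsSketch` and `literal_value` are hypothetical stand-ins, the method `with_spark_features` here only borrows the name of the real entry point, and the decoder handles just \n, \t, and \\ while leaving every other sequence untouched.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for the parser options touched by the PR.
#[derive(Debug, Clone, Copy, Default)]
struct ParserOptionsSketch {
    /// Mirrors 'datafusion.sql_parser.spark_string_literal_unescape'; off by default.
    spark_string_literal_unescape: bool,
}

impl ParserOptionsSketch {
    /// Hypothetical analogue of enabling Spark features on a session.
    fn with_spark_features(mut self) -> Self {
        self.spark_string_literal_unescape = true;
        self
    }
}

/// Returns the literal unchanged unless the Spark-only flag is set.
fn literal_value<'a>(raw: &'a str, opts: &ParserOptionsSketch) -> Cow<'a, str> {
    if !opts.spark_string_literal_unescape {
        // Generic SQL mode: backslashes stay literal text.
        return Cow::Borrowed(raw);
    }
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        // Peek at the next character without consuming it.
        match (c, chars.clone().next()) {
            ('\\', Some('n')) => { chars.next(); out.push('\n'); }
            ('\\', Some('t')) => { chars.next(); out.push('\t'); }
            ('\\', Some('\\')) => { chars.next(); out.push('\\'); }
            // Assumption: anything else is copied through verbatim.
            _ => out.push(c),
        }
    }
    Cow::Owned(out)
}
```

With default options the literal '\thello' comes back unchanged, and after `with_spark_features()` it starts with a tab. Keeping the flag off by default is what leaves generic DataFusION-style SQL behavior untouched in this sketch.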

Are these changes tested?
Yes. I tested the change with these commands:

cargo test -p datafusion
cargo test -p datafusion-sql
cargo test -p datafusion-spark
cargo test --profile=ci --test sqllogictests

Are there any user-facing changes?
Yes, there are user-facing changes. In Spark mode, SQL string literals now use Spark-compatible backslash escape handling. Generic DataFusion SQL behavior is unchanged.

github-actions bot added the sql (SQL Planner) and core (Core DataFusion crate) labels Apr 13, 2026
@kumarUjjawal (Contributor)

A few points before the actual code review:

  1. Always run the linter locally and make sure all the tests pass
  2. Follow the PR template
  3. Please read the AI-assisted PR guidelines: https://datafusion.apache.org/contributor-guide/index.html#ai-assisted-contributions

@SamAya21 (Author)

Sorry about that; I hadn't realized I should have run sqllogictests. How would I run the linter locally, or where is that documented? I see that I forgot to add the header to a test file I created.

SamAya21 marked this pull request as draft April 15, 2026 05:16
@kumarUjjawal (Contributor)

> Sorry about that; I hadn't realized I should have run sqllogictests. How would I run the linter locally, or where is that documented? I see that I forgot to add the header to a test file I created.

  1. run cargo fmt: cargo fmt --all
  2. run clippy: cargo clippy --workspace --all-features --all-targets -- --deny=warnings
  3. run sqllogictest: make -C testing/runner test-sqlite

github-actions bot added the sqllogictest (SQL Logic Tests (.slt)), common (Related to common crate), and spark labels Apr 15, 2026
Development

Successfully merging this pull request may close these issues.

bug: datafusion-spark string literals don't interpret escape sequences like Spark