Is your feature request related to a problem or challenge?
The SQL unparser has no Snowflake dialect. Snowflake diverges from standard SQL in ways that go beyond identifier quoting and keyword support - certain logical plan nodes require structurally different SQL output.
The most impactful example is UNNEST. Snowflake has no UNNEST keyword. Its equivalent is LATERAL FLATTEN(INPUT => expr), a table function in the FROM clause whose output is accessed via alias."VALUE". This is not a syntax substitution - the generated SQL needs a different structure (a FROM-clause table factor with a CROSS JOIN, instead of a SELECT-clause expression).
Other Snowflake-specific behaviors that affect unparsing:
- No support for empty select lists (
SELECT FROM t is invalid)
- Column aliases in table alias definitions are silently ignored in join contexts
- Timestamp cast types differ from the default (
TIMESTAMP_TZ vs TIMESTAMP)
CustomDialectBuilder cannot express these - FLATTEN requires new SQL generation logic, not just configuration knobs.
Currently, users targeting Snowflake must post-process generated SQL or avoid plan shapes involving UNNEST entirely.
Describe the solution you'd like
A SnowflakeDialect for the SQL unparser that produces valid Snowflake SQL from DataFusion logical plans. Specifically:
Dialect settings: Double-quote identifier quoting, NULLS FIRST/NULLS LAST support, no empty select lists, no column aliases in table aliases, Snowflake-appropriate timestamp cast types.
UNNEST → LATERAL FLATTEN translation: When the logical plan contains UNNEST, the unparser should emit LATERAL FLATTEN(INPUT => expr, OUTER => bool) in the FROM clause and rewrite SELECT references to the FLATTEN output column (alias."VALUE").
This translation should handle the plan shapes that the DataFusion optimizer produces, including:
- Inline array sources (
UNNEST([1,2,3])) and table column sources
Limit or Sort nodes inserted between the outer Projection and the Unnest by the optimizer
SubqueryAlias nodes wrapping the unnest source (e.g. from virtual/passthrough tables)
- Outer projections that apply functions to the unnest output (e.g.
CAST(unnest_col AS INT))
- Outer projections with multiple expressions referencing the unnest output
Struct-type unnest should be rejected with a clear error - Snowflake FLATTEN expands array elements into rows (vertical), while DataFusion expands struct fields into columns (horizontal). These are incompatible semantics.
Describe alternatives you've considered
-
String-level post-processing of generated SQL to rewrite UNNEST(...) to LATERAL FLATTEN(...). Fragile - doesn't handle the structural difference between a SELECT expression and a FROM-clause table factor.
-
Extending unnest_as_table_factor (BigQuery path) to cover Snowflake. FLATTEN's syntax (INPUT => expr, OUTER => bool) and output model (alias."VALUE") are structurally different from BigQuery's UNNEST(expr) table factor. Conflating them would complicate both code paths.
-
CustomDialectBuilder with existing knobs. Insufficient — there is no existing configuration option for FLATTEN-style rendering, and the translation requires plan-aware SQL restructuring that configuration alone cannot express.
Additional context
PR coming.
This would be the first dialect that requires the unparser to emit a fundamentally different SQL structure for a core plan node (UNNEST), rather than just varying syntax. Tried to design in a way that can support how future dialects with similar structural divergences could plug in.
Is your feature request related to a problem or challenge?
The SQL unparser has no Snowflake dialect. Snowflake diverges from standard SQL in ways that go beyond identifier quoting and keyword support - certain logical plan nodes require structurally different SQL output.
The most impactful example is
UNNEST. Snowflake has noUNNESTkeyword. Its equivalent isLATERAL FLATTEN(INPUT => expr), a table function in theFROMclause whose output is accessed viaalias."VALUE". This is not a syntax substitution - the generated SQL needs a different structure (aFROM-clause table factor with aCROSS JOIN, instead of aSELECT-clause expression).Other Snowflake-specific behaviors that affect unparsing:
SELECT FROM tis invalid)TIMESTAMP_TZvsTIMESTAMP)CustomDialectBuildercannot express these -FLATTENrequires new SQL generation logic, not just configuration knobs.Currently, users targeting Snowflake must post-process generated SQL or avoid plan shapes involving
UNNESTentirely.Describe the solution you'd like
A
SnowflakeDialectfor the SQL unparser that produces valid Snowflake SQL from DataFusion logical plans. Specifically:Dialect settings: Double-quote identifier quoting,
NULLS FIRST/NULLS LASTsupport, no empty select lists, no column aliases in table aliases, Snowflake-appropriate timestamp cast types.UNNEST→LATERAL FLATTENtranslation: When the logical plan containsUNNEST, the unparser should emitLATERAL FLATTEN(INPUT => expr, OUTER => bool)in theFROMclause and rewriteSELECTreferences to theFLATTENoutput column (alias."VALUE").This translation should handle the plan shapes that the DataFusion optimizer produces, including:
UNNEST([1,2,3])) and table column sourcesLimitorSortnodes inserted between the outerProjectionand theUnnestby the optimizerSubqueryAliasnodes wrapping the unnest source (e.g. from virtual/passthrough tables)CAST(unnest_col AS INT))Struct-type unnest should be rejected with a clear error - Snowflake
FLATTENexpands array elements into rows (vertical), while DataFusion expands struct fields into columns (horizontal). These are incompatible semantics.Describe alternatives you've considered
String-level post-processing of generated SQL to rewrite
UNNEST(...)toLATERAL FLATTEN(...). Fragile - doesn't handle the structural difference between aSELECTexpression and aFROM-clause table factor.Extending
unnest_as_table_factor(BigQuery path) to cover Snowflake.FLATTEN's syntax (INPUT => expr, OUTER => bool) and output model (alias."VALUE") are structurally different from BigQuery'sUNNEST(expr)table factor. Conflating them would complicate both code paths.CustomDialectBuilderwith existing knobs. Insufficient — there is no existing configuration option forFLATTEN-style rendering, and the translation requires plan-aware SQL restructuring that configuration alone cannot express.Additional context
PR coming.
This would be the first dialect that requires the unparser to emit a fundamentally different SQL structure for a core plan node (
UNNEST), rather than just varying syntax. Tried to design in a way that can support how future dialects with similar structural divergences could plug in.