Commit 42d0f8e

fix(slt): update information_schema.slt config description to match shortened use_expression_analyzer doc
1 parent 04450ee commit 42d0f8e

1 file changed

+1
-1
lines changed

datafusion/sqllogictest/test_files/information_schema.slt

Lines changed: 1 addition & 1 deletion
@@ -476,7 +476,7 @@ datafusion.optimizer.repartition_windows true Should DataFusion repartition data
datafusion.optimizer.skip_failed_rules false When set to true, the logical plan optimizer will produce warning messages if any optimization rules produce errors and then proceed to the next rule. When set to false, any rules that produce errors will cause the query to fail
datafusion.optimizer.subset_repartition_threshold 4 Partition count threshold for subset satisfaction optimization. When the current partition count is >= this threshold, DataFusion will skip repartitioning if the required partitioning expression is a subset of the current partition expression such as Hash(a) satisfies Hash(a, b). When the current partition count is < this threshold, DataFusion will repartition to increase parallelism even when subset satisfaction applies. Set to 0 to always repartition (disable subset satisfaction optimization). Set to a high value to always use subset satisfaction. Example (subset_repartition_threshold = 4): ```text Hash([a]) satisfies Hash([a, b]) because (Hash([a, b]) is subset of Hash([a]) If current partitions (3) < threshold (4), repartition: AggregateExec: mode=FinalPartitioned, gby=[a, b], aggr=[SUM(x)] RepartitionExec: partitioning=Hash([a, b], 8), input_partitions=3 AggregateExec: mode=Partial, gby=[a, b], aggr=[SUM(x)] DataSourceExec: file_groups={...}, output_partitioning=Hash([a], 3) If current partitions (8) >= threshold (4), use subset satisfaction: AggregateExec: mode=SinglePartitioned, gby=[a, b], aggr=[SUM(x)] DataSourceExec: file_groups={...}, output_partitioning=Hash([a], 8) ```
datafusion.optimizer.top_down_join_key_reordering true When set to true, the physical plan optimizer will run a top down process to reorder the join keys
-datafusion.optimizer.use_expression_analyzer false When set to true, the pluggable `ExpressionAnalyzerRegistry` from `SessionState` is injected into exec nodes that use expression-level statistics (`FilterExec`, `ProjectionExec`, `AggregateExec`, join nodes) and re-injected after each physical optimizer rule so rebuilt nodes always carry it. Custom analyzers then influence `partition_statistics` in those operators.
+datafusion.optimizer.use_expression_analyzer false When set to true, the pluggable `ExpressionAnalyzerRegistry` from `SessionState` is used for expression-level statistics estimation (NDV, selectivity, min/max, null fraction) in physical plan operators.
datafusion.optimizer.use_statistics_registry false When set to true, the physical plan optimizer uses the pluggable `StatisticsRegistry` for a bottom-up statistics walk across operators, enabling more accurate cardinality estimates. Enabling `use_expression_analyzer` alongside this flag gives built-in providers access to custom expression-level analyzers (NDV, selectivity) for the operators they process.
datafusion.runtime.list_files_cache_limit 1M Maximum memory to use for list files cache. Supports suffixes K (kilobytes), M (megabytes), and G (gigabytes). Example: '2G' for 2 gigabytes.
datafusion.runtime.list_files_cache_ttl NULL TTL (time-to-live) of the entries in the list file cache. Supports units m (minutes), and s (seconds). Example: '2m' for 2 minutes.
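
Note on the `subset_repartition_threshold` context line in this hunk: read literally, its description makes `0` a special case, because a plain `current >= threshold` comparison would always hold at `0`, while the text says `0` disables the optimization. The following is a minimal Python sketch of that decision rule as the description states it. It is not DataFusion's implementation; the function name and the `subset_applies` parameter are hypothetical.

```python
def uses_subset_satisfaction(current_partitions: int,
                             threshold: int,
                             subset_applies: bool) -> bool:
    """Return True if repartitioning would be skipped (hypothetical helper).

    Mirrors only the prose of the config description:
    - threshold == 0 disables subset satisfaction (always repartition)
    - otherwise skip repartitioning when current_partitions >= threshold
      and the existing partitioning satisfies the required one by subset
    """
    if threshold == 0:
        # "Set to 0 to always repartition"
        return False
    if not subset_applies:
        # Nothing to reuse, e.g. existing Hash([c]) vs required Hash([a, b])
        return False
    return current_partitions >= threshold


# The two cases from the description's example (threshold = 4):
print(uses_subset_satisfaction(3, 4, True))  # 3 < 4: repartition
print(uses_subset_satisfaction(8, 4, True))  # 8 >= 4: keep Hash([a], 8)
```

Under this reading, "set to a high value" simply makes the `>=` comparison hold less often; whether the real optimizer treats very large thresholds in any other way is not stated in the description.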