Prototype combined Repartition/Filter + Coalesce (WIP) #11647

alamb · 2024-07-25T10:29:48Z

Builds on

Which issue does this PR close?

Rationale for this change

As described in #7957 and #11628, the current combination of filtering / repartitioning followed by coalescing requires copying the data twice. This PR is a prototype to:

See how much better performance would be if the two operations were combined into one, avoiding a copy
Figure out how big a change it would be / what the code would look like
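
To make the "copied twice" point concrete, here is a minimal sketch in plain Rust. It uses `Vec<i64>` as a stand-in for Arrow arrays, and `filter_then_coalesce` / `filter_into` are hypothetical names used only for illustration, not functions from DataFusion or this PR. It only counts how many values each shape has to copy.

```rust
/// Current shape: filter each input batch into its own small batch (copy 1),
/// then concatenate the small batches into one larger batch (copy 2).
/// Returns the output and the number of values copied along the way.
fn filter_then_coalesce(
    batches: &[Vec<i64>],
    keep: impl Fn(i64) -> bool,
) -> (Vec<i64>, usize) {
    let mut copied = 0;
    // Copy 1: materialize one filtered batch per input batch.
    let filtered: Vec<Vec<i64>> = batches
        .iter()
        .map(|b| {
            let out: Vec<i64> = b.iter().copied().filter(|v| keep(*v)).collect();
            copied += out.len();
            out
        })
        .collect();
    // Copy 2: concatenate the small batches into the final output.
    let mut output = Vec::new();
    for b in &filtered {
        output.extend_from_slice(b);
        copied += b.len();
    }
    (output, copied)
}

/// Combined shape: copy matching values straight into the output (one copy).
fn filter_into(batches: &[Vec<i64>], keep: impl Fn(i64) -> bool) -> (Vec<i64>, usize) {
    let mut output = Vec::new();
    for b in batches {
        output.extend(b.iter().copied().filter(|v| keep(*v)));
    }
    let copied = output.len();
    (output, copied)
}
```

For input batches `[1, 2, 3, 4]`, `[5, 6]`, `[7, 8, 9]` and a predicate that keeps odd values, both functions return `[1, 3, 5, 7, 9]`, but the first copies 10 values and the second copies 5.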

This is based on the code in #11610 and a bunch of discussion with @XiangpengHao @edmondop @2010YOUY01 and others

Plan

The theory is that enough non-trivial time is spent in coalescing batches and repartitioning that we could improve performance by several seconds (almost 1s of CPU in several queries); see the analysis below

My high level plan is to implement enough of this idea to run some ClickBench queries like Q20, Q15, and Q16, plus TPCH Q8, and see how they perform. If the results are promising, I will work to scope out how to make this into real PRs

High level plan:

Integrate the BatchCoalescer into the FilterExec control flow
Move actual calls to filter into BatchCoalescer
Integrate the BatchCoalescer into RepartitionExec
Move actual calls to take into BatchCoalescer
Test to make sure the results are the same
Test to make sure it doesn't slow things down
Test to make sure the CoalesceBatchesExec doesn't do any work now (work is shifted to the FilterExec and RepartitionExec)
Implement some special case coalesce batches for filter
Implement some special case coalesce batches for repartition (take)
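
As a rough picture of where the plan above is heading, here is a simplified stand-in for a coalescer that owns the filter and take steps. This is a sketch under assumptions, not the code in this PR: a "batch" is just a `Vec<i64>`, and the method names `push_filtered`, `push_taken`, and `finish` are hypothetical.

```rust
/// Simplified stand-in for a coalescer that performs the filter / take itself,
/// so matching rows are copied once, directly into the in-progress batch.
struct BatchCoalescer {
    target_batch_size: usize,
    in_progress: Vec<i64>,
    completed: Vec<Vec<i64>>,
}

impl BatchCoalescer {
    fn new(target_batch_size: usize) -> Self {
        assert!(target_batch_size > 0);
        Self {
            target_batch_size,
            in_progress: Vec::with_capacity(target_batch_size),
            completed: Vec::new(),
        }
    }

    /// Filter path: copy rows whose mask entry is true.
    fn push_filtered(&mut self, batch: &[i64], mask: &[bool]) {
        assert_eq!(batch.len(), mask.len());
        for (value, keep) in batch.iter().zip(mask) {
            if *keep {
                self.push_value(*value);
            }
        }
    }

    /// Repartition path: copy the rows at `indices` (a "take").
    fn push_taken(&mut self, batch: &[i64], indices: &[usize]) {
        for &i in indices {
            self.push_value(batch[i]);
        }
    }

    /// Append one row, emitting a completed batch once the target size is hit.
    fn push_value(&mut self, value: i64) {
        self.in_progress.push(value);
        if self.in_progress.len() == self.target_batch_size {
            let full = std::mem::replace(
                &mut self.in_progress,
                Vec::with_capacity(self.target_batch_size),
            );
            self.completed.push(full);
        }
    }

    /// Flush any partial batch and return all output batches.
    fn finish(mut self) -> Vec<Vec<i64>> {
        if !self.in_progress.is_empty() {
            self.completed.push(std::mem::take(&mut self.in_progress));
        }
        self.completed
    }
}
```

In this shape a downstream CoalesceBatchesExec would have nothing left to do, because every batch leaving the operator is already at the target size (except possibly the last one).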

Supporting Analysis


ClickBench Q16

This query has no filter

SELECT "UserID", COUNT(*) FROM "hits.parquet" GROUP BY "UserID" ORDER BY COUNT(*) DESC LIMIT 10;

(venv) andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion2/benchmarks/data$ datafusion-cli -c 'SELECT "UserID", COUNT(*) FROM "hits.parquet" GROUP BY "UserID" ORDER BY COUNT(*) DESC LIMIT 10;'
DataFusion CLI v40.0.0
+---------------------+----------+
| UserID              | count(*) |
+---------------------+----------+
| 1313338681122956954 | 29097    |
| 1907779576417363396 | 25333    |
| 2305303682471783379 | 10597    |
| 7982623143712728547 | 7584     |
| 6018350421959114808 | 6678     |
| 7280399273658728997 | 6411     |
| 1090981537032625727 | 6197     |
| 5730251990344211405 | 6019     |
| 835157184735512989  | 5211     |
| 770542365400669095  | 4906     |
+---------------------+----------+
10 row(s) fetched.
Elapsed 0.393 seconds.

CoalesceBatchesExec takes 92ms of the time, which in theory we can avoid entirely. Repartition takes 178ms

$ datafusion-cli -c 'explain analyze SELECT "UserID", COUNT(*) FROM "hits.parquet" GROUP BY "UserID" ORDER BY COUNT(*) DESC LIMIT 10;'
DataFusion CLI v40.0.0
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=10, elapsed_compute=13.583µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|                   |   SortPreservingMergeExec: [count(*)@1 DESC], fetch=10, metrics=[output_rows=10, elapsed_compute=3.5µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|                   |     SortExec: TopK(fetch=10), expr=[count(*)@1 DESC], preserve_partitioning=[true], metrics=[output_rows=160, elapsed_compute=145.487854ms, row_replacements=1570]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                   |       AggregateExec: mode=FinalPartitioned, gby=[UserID@0 as UserID], aggr=[count(*)], metrics=[output_rows=17630976, elapsed_compute=1.581519786s]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|                   |         CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=21164213, elapsed_compute=92.710954ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                   |           RepartitionExec: partitioning=Hash([UserID@0], 16), input_partitions=16, metrics=[send_time=483.37239ms, repart_time=178.203635ms, fetch_time=3.187560018s]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|                   |             AggregateExec: mode=Partial, gby=[UserID@0 as UserID], aggr=[count(*)], metrics=[output_rows=21164213, elapsed_compute=2.415851718s]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                   |               ParquetExec: file_groups={16 groups: [[Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:0..923748528], [Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:923748528..1847497056], [Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:1847497056..2771245584], [Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:2771245584..3694994112], [Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:3694994112..4618742640], ...]}, projection=[UserID], metrics=[output_rows=99997497, elapsed_compute=16ns, row_groups_pruned_bloom_filter=0, num_predicate_creation_errors=0, bytes_scanned=270116234, row_groups_matched_statistics=0, row_groups_matched_bloom_filter=0, predicate_evaluation_errors=0, row_groups_pruned_statistics=0, file_open_errors=0, pushdown_rows_filtered=0, file_scan_errors=0, page_index_rows_filtered=0, time_elapsed_opening=321.212666ms, page_index_eval_time=32ns, time_elapsed_scanning_total=2.856808586s, time_elapsed_scanning_until_data=15.897751ms, pushdown_eval_time=32ns, time_elapsed_processing=581.233243ms] |
|                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.412 seconds.

Under the covers, repartition eventually just calls take, and the internal implementation of take uses a mutable buffer
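
For readers less familiar with that path, the following is a minimal sketch of hash repartitioning expressed as "compute row indices per partition, then take". It is plain Rust over `Vec<i64>`, not the DataFusion or arrow-rs implementation; `partition_of`, `partition_indices`, `take`, and `repartition` are illustrative names, and `DefaultHasher` is only a stand-in for whatever hash the engine uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Output partition a key is routed to (stand-in for Hash partitioning).
fn partition_of(key: i64, num_partitions: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % num_partitions as u64) as usize
}

/// Step 1: for each output partition, collect the row indices that belong to it.
fn partition_indices(keys: &[i64], num_partitions: usize) -> Vec<Vec<usize>> {
    let mut indices = vec![Vec::new(); num_partitions];
    for (row, key) in keys.iter().enumerate() {
        indices[partition_of(*key, num_partitions)].push(row);
    }
    indices
}

/// Step 2: "take" copies the rows at `indices` into a freshly built array.
fn take(values: &[i64], indices: &[usize]) -> Vec<i64> {
    indices.iter().map(|&i| values[i]).collect()
}

/// Repartition a single key column: one take (one copy) per output partition.
fn repartition(keys: &[i64], num_partitions: usize) -> Vec<Vec<i64>> {
    partition_indices(keys, num_partitions)
        .iter()
        .map(|idx| take(keys, idx))
        .collect()
}
```

Each `take` here produces a small per-partition array, which is the copy a later coalesce step then has to copy again.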

So in theory we could make a FilteredArrayBuilder (or something similar) that takes one or more arrays and copies the rows which match into an in-progress output array. This would avoid the need to copy the data twice.
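
A minimal sketch of what such a builder might look like for string data is below. Everything here is an assumption for illustration: `StringArray` is a toy offsets-plus-bytes layout loosely modeled on Arrow's (with `usize` offsets for simplicity), and `append_filtered` is a hypothetical method name, not an existing arrow-rs API.

```rust
/// Toy Arrow-style string array: row i is `values[offsets[i]..offsets[i + 1]]`.
struct StringArray {
    offsets: Vec<usize>,
    values: Vec<u8>,
}

impl StringArray {
    fn from_strs(rows: &[&str]) -> Self {
        let mut offsets = vec![0];
        let mut values = Vec::new();
        for r in rows {
            values.extend_from_slice(r.as_bytes());
            offsets.push(values.len());
        }
        Self { offsets, values }
    }

    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    fn value(&self, i: usize) -> &str {
        std::str::from_utf8(&self.values[self.offsets[i]..self.offsets[i + 1]]).unwrap()
    }
}

/// Hypothetical builder: accumulates matching rows from many input arrays
/// into a single in-progress output array, copying each row's bytes once.
struct FilteredArrayBuilder {
    offsets: Vec<usize>,
    values: Vec<u8>,
}

impl FilteredArrayBuilder {
    fn new() -> Self {
        Self { offsets: vec![0], values: Vec::new() }
    }

    /// Copy the rows of `input` whose mask entry is true into the output.
    fn append_filtered(&mut self, input: &StringArray, mask: &[bool]) {
        assert_eq!(input.len(), mask.len());
        for (i, keep) in mask.iter().enumerate() {
            if *keep {
                let bytes = &input.values[input.offsets[i]..input.offsets[i + 1]];
                self.values.extend_from_slice(bytes);
                self.offsets.push(self.values.len());
            }
        }
    }

    /// Hand over the accumulated buffers as the finished array.
    fn finish(self) -> StringArray {
        StringArray { offsets: self.offsets, values: self.values }
    }
}
```

A take-based variant for repartitioning would look the same, with a list of row indices in place of the boolean mask.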

ClickBench Q15 (predicate)

Q15 should be one of the best cases to show this improvement, as the effect of copying should be especially pronounced (filtering / repartitioning strings)

SELECT "SearchEngineID", "SearchPhrase", COUNT(*) AS c FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "SearchPhrase" ORDER BY c DESC LIMIT 10;

 $ datafusion-cli -c "SELECT \"SearchEngineID\", \"SearchPhrase\", COUNT(*) AS c FROM \"hits.parquet\" WHERE \"SearchPhrase\" <> '' GROUP BY \"SearchEngineID\", \"SearchPhrase\" ORDER BY c DESC LIMIT 10;"
 DataFusion CLI v40.0.0
 +----------------+---------------------------+-------+
 | SearchEngineID | SearchPhrase              | c     |
 +----------------+---------------------------+-------+
 | 2              | карелки                   | 46258 |
 | 2              | мангу в зарабей грама     | 18871 |
 | 2              | смотреть онлайн           | 16905 |
 | 3              | албатрутдин               | 16748 |
 | 2              | смотреть онлайн бесплатно | 14909 |
 | 2              | албатрутдин               | 13716 |
 | 2              | экзоидные                 | 13414 |
 | 2              | смотреть                  | 13108 |
 | 3              | карелки                   | 12815 |
 | 2              | дружке помещение          | 11946 |
 +----------------+---------------------------+-------+
 10 row(s) fetched.
 Elapsed 0.682 seconds.

Here the elapsed compute in coalesce batches is non-trivial (both for filter and repartition)

 $ datafusion-cli -c "explain analyze SELECT \"SearchEngineID\", \"SearchPhrase\", COUNT(*) AS c FROM \"hits.parquet\" WHERE \"SearchPhrase\" <> '' GROUP BY \"SearchEngineID\", \"SearchPhrase\" ORDER BY c DESC LIMIT 10;"
 DataFusion CLI v40.0.0
 +-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 | plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
 +-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 | Plan with Metrics | GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=10, elapsed_compute=22µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
 |                   |   SortPreservingMergeExec: [c@2 DESC], fetch=10, metrics=[output_rows=10, elapsed_compute=5µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
 |                   |     SortExec: TopK(fetch=10), expr=[c@2 DESC], preserve_partitioning=[true], metrics=[output_rows=160, elapsed_compute=48.584286ms, row_replacements=738]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
 |                   |       ProjectionExec: expr=[SearchEngineID@0 as SearchEngineID, SearchPhrase@1 as SearchPhrase, count(*)@2 as c], metrics=[output_rows=6474212, elapsed_compute=123.825µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
 |                   |         AggregateExec: mode=FinalPartitioned, gby=[SearchEngineID@0 as SearchEngineID, SearchPhrase@1 as SearchPhrase], aggr=[count(*)], metrics=[output_rows=6474212, elapsed_compute=1.962629061s]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
 |                   |           CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=7559975, elapsed_compute=113.930353ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
 |                   |             RepartitionExec: partitioning=Hash([SearchEngineID@0, SearchPhrase@1], 16), input_partitions=16, metrics=[send_time=211.810042ms, fetch_time=6.817337071s, repart_time=352.1063ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
 |                   |               AggregateExec: mode=Partial, gby=[SearchEngineID@0 as SearchEngineID, SearchPhrase@1 as SearchPhrase], aggr=[count(*)], metrics=[output_rows=7559975, elapsed_compute=2.744048082s]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
 |                   |                 CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=13172392, elapsed_compute=91.456825ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
 |                   |                   FilterExec: SearchPhrase@1 != , metrics=[output_rows=13172392, elapsed_compute=656.500384ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
 |                   |                     ParquetExec: file_groups={16 groups: [[Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:0..923748528], [Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:923748528..1847497056], [Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:1847497056..2771245584], [Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:2771245584..3694994112], [Users/andrewlamb/Software/datafusion/benchmarks/data/hits.parquet:3694994112..4618742640], ...]}, projection=[SearchEngineID, SearchPhrase], predicate=SearchPhrase@39 != , pruning_predicate=CASE WHEN SearchPhrase_null_count@2 = SearchPhrase_row_count@3 THEN false ELSE SearchPhrase_min@0 !=  OR  != SearchPhrase_max@1 END, required_guarantees=[SearchPhrase not in ()], metrics=[output_rows=99997497, elapsed_compute=16ns, file_open_errors=0, row_groups_matched_statistics=226, file_scan_errors=0, page_index_rows_filtered=0, pushdown_rows_filtered=0, row_groups_pruned_statistics=0, bytes_scanned=391794592, predicate_evaluation_errors=0, num_predicate_creation_errors=0, row_groups_matched_bloom_filter=0, row_groups_pruned_bloom_filter=0, time_elapsed_processing=2.725566865s, page_index_eval_time=685ns, time_elapsed_scanning_total=6.044856135s, time_elapsed_opening=373.773501ms, time_elapsed_scanning_until_data=52.434752ms, pushdown_eval_time=32ns] |
 |                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
 +-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 1 row(s) fetched.
 Elapsed 0.757 seconds.
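
To total the time an operator such as CoalesceBatchesExec accounts for in a plan like the one above, the `elapsed_compute` values can be summed per operator. The following is a minimal sketch, not part of this PR: `parse_millis` and `elapsed_compute_millis` are hypothetical helpers written for illustration, and the input lines are abbreviated copies of the two CoalesceBatchesExec rows and the FilterExec row above. It assumes the metric is printed as `elapsed_compute=<number><unit>` with units `ns`, `µs`, `ms` or `s`, as in the output shown here. As far as I understand, these values are aggregated across partitions, so they are best read as CPU time rather than wall-clock time.

```rust
/// Parse a duration such as "113.930353ms", "1.962629061s", "16.917µs" or "16ns"
/// into milliseconds. Returns None if the text is not in one of those forms.
fn parse_millis(text: &str) -> Option<f64> {
    // Two-letter suffixes come first so that "ms" is not mistaken for "s".
    const UNITS: [(&str, f64); 4] = [("ns", 1e-6), ("µs", 1e-3), ("ms", 1.0), ("s", 1e3)];
    for &(suffix, factor) in UNITS.iter() {
        if let Some(number) = text.strip_suffix(suffix) {
            return number.parse::<f64>().ok().map(|v| v * factor);
        }
    }
    None
}

/// Return the `elapsed_compute` value (in ms) of one plan line,
/// but only if that line mentions `operator`.
fn elapsed_compute_millis(line: &str, operator: &str) -> Option<f64> {
    if !line.contains(operator) {
        return None;
    }
    let key = "elapsed_compute=";
    let start = line.find(key)? + key.len();
    let rest = &line[start..];
    // The value runs until the next ',' or the closing ']' of the metrics list.
    let end = rest
        .find(|c: char| c == ',' || c == ']')
        .unwrap_or(rest.len());
    parse_millis(rest[..end].trim())
}

fn main() {
    // Abbreviated operator lines taken from the plan above.
    let plan = [
        "CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=7559975, elapsed_compute=113.930353ms]",
        "CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=13172392, elapsed_compute=91.456825ms]",
        "FilterExec: SearchPhrase@1 != , metrics=[output_rows=13172392, elapsed_compute=656.500384ms]",
    ];
    let total: f64 = plan
        .iter()
        .filter_map(|line| elapsed_compute_millis(line, "CoalesceBatchesExec"))
        .sum();
    println!("CoalesceBatchesExec total: {:.3} ms", total);
}
```

For the plan above this adds 113.930353ms and 91.456825ms, roughly 205ms of coalescing work that the combined operators aim to eliminate.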

TPCH Q1 (thanks @2010YOUY01)

Query run:

(venv) andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion2/benchmarks/data/tpch_sf10$ datafusion-cli -c "select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order from lineitem where l_shipdate <= date '1998-09-02' group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus;"
DataFusion CLI v40.0.0
+--------------+--------------+--------------+------------------+--------------------+----------------------+-----------+--------------+----------+-------------+
| l_returnflag | l_linestatus | sum_qty      | sum_base_price   | sum_disc_price     | sum_charge           | avg_qty   | avg_price    | avg_disc | count_order |
+--------------+--------------+--------------+------------------+--------------------+----------------------+-----------+--------------+----------+-------------+
| A            | F            | 377518399.00 | 566065727797.25  | 537759104278.0656  | 559276670892.116819  | 25.500975 | 38237.151008 | 0.050006 | 14804077    |
| N            | F            | 9851614.00   | 14767438399.17   | 14028805792.2114   | 14590490998.366737   | 25.522448 | 38257.810660 | 0.049973 | 385998      |
| N            | O            | 743124873.00 | 1114302286901.88 | 1058580922144.9638 | 1100937000170.591854 | 25.498075 | 38233.902923 | 0.050000 | 29144351    |
| R            | F            | 377732830.00 | 566431054976.00  | 538110922664.7677  | 559634780885.086257  | 25.508384 | 38251.219273 | 0.049996 | 14808183    |
+--------------+--------------+--------------+------------------+--------------------+----------------------+-----------+--------------+----------+-------------+
4 row(s) fetched.
Elapsed 0.528 seconds.

And indeed CoalesceBatchesExec is consuming elapsed_compute=332.124162ms (after the filter) + elapsed_compute=1.997875ms (after the repartition):

(venv) andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion2/benchmarks/data/tpch_sf10$ datafusion-cli -c "explain analyze select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order from lineitem where l_shipdate <= date '1998-09-02' group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus;"
DataFusion CLI v40.0.0
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | SortPreservingMergeExec: [l_returnflag@0 ASC NULLS LAST,l_linestatus@1 ASC NULLS LAST], metrics=[output_rows=4, elapsed_compute=16.917µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                   |   SortExec: expr=[l_returnflag@0 ASC NULLS LAST,l_linestatus@1 ASC NULLS LAST], preserve_partitioning=[true], metrics=[output_rows=4, elapsed_compute=16ns, spill_count=0, spilled_bytes=0, spilled_rows=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|                   |     ProjectionExec: expr=[l_returnflag@0 as l_returnflag, l_linestatus@1 as l_linestatus, sum(lineitem.l_quantity)@2 as sum_qty, sum(lineitem.l_extendedprice)@3 as sum_base_price, sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)@4 as sum_disc_price, sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax)@5 as sum_charge, avg(lineitem.l_quantity)@6 as avg_qty, avg(lineitem.l_extendedprice)@7 as avg_price, avg(lineitem.l_discount)@8 as avg_disc, count(*)@9 as count_order], metrics=[output_rows=4, elapsed_compute=15.333µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|                   |       AggregateExec: mode=FinalPartitioned, gby=[l_returnflag@0 as l_returnflag, l_linestatus@1 as l_linestatus], aggr=[sum(lineitem.l_quantity), sum(lineitem.l_extendedprice), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax), avg(lineitem.l_quantity), avg(lineitem.l_extendedprice), avg(lineitem.l_discount), count(*)], metrics=[output_rows=4, elapsed_compute=1.543875ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|                   |         CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=64, elapsed_compute=1.997875ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                   |           RepartitionExec: partitioning=Hash([l_returnflag@0, l_linestatus@1], 16), input_partitions=16, metrics=[fetch_time=7.427854455s, send_time=83.11µs, repart_time=990.462µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                   |             AggregateExec: mode=Partial, gby=[l_returnflag@5 as l_returnflag, l_linestatus@6 as l_linestatus], aggr=[sum(lineitem.l_quantity), sum(lineitem.l_extendedprice), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax), avg(lineitem.l_quantity), avg(lineitem.l_extendedprice), avg(lineitem.l_discount), count(*)], metrics=[output_rows=64, elapsed_compute=3.467796344s]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|                   |               ProjectionExec: expr=[l_extendedprice@1 * (Some(1),20,0 - l_discount@2) as __common_expr_1, l_quantity@0 as l_quantity, l_extendedprice@1 as l_extendedprice, l_discount@2 as l_discount, l_tax@3 as l_tax, l_returnflag@4 as l_returnflag, l_linestatus@5 as l_linestatus], metrics=[output_rows=59142609, elapsed_compute=880.747072ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                   |                 CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=59142609, elapsed_compute=332.124162ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                   |                   FilterExec: l_shipdate@6 <= 1998-09-02, metrics=[output_rows=59142609, elapsed_compute=355.65294ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                   |                     ParquetExec: file_groups={16 groups: [[Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem/part-0.parquet:0..104118746], [Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem/part-0.parquet:104118746..104939977, Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem/part-1.parquet:0..103297515], [Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem/part-1.parquet:103297515..104722832, Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem/part-10.parquet:0..102693429], [Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem/part-10.parquet:102693429..103993618, Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem/part-11.parquet:0..102818557], [Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem/part-11.parquet:102818557..103951831, Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem/part-12.parquet:0..102985472], ...]}, projection=[l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate], predicate=l_shipdate@10 <= 1998-09-02, pruning_predicate=CASE WHEN l_shipdate_null_count@1 = l_shipdate_row_count@2 THEN false ELSE l_shipdate_min@0 <= 1998-09-02 END, required_guarantees=[], metrics=[output_rows=59986052, elapsed_compute=16ns, row_groups_matched_statistics=64, predicate_evaluation_errors=0, page_index_rows_filtered=0, row_groups_matched_bloom_filter=0, bytes_scanned=413089297, pushdown_rows_filtered=0, row_groups_pruned_statistics=0, file_scan_errors=0, row_groups_pruned_bloom_filter=0, file_open_errors=0, num_predicate_creation_errors=0, time_elapsed_opening=23.318374ms, time_elapsed_scanning_until_data=192.471334ms, pushdown_eval_time=62ns, time_elapsed_scanning_total=7.392932812s, time_elapsed_processing=2.191135021s, page_index_eval_time=352.571µs] |
|                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.499 seconds.

The types are:

CoalesceBatchesExec: target_batch_size=8192, schema=[l_returnflag:Utf8, l_linestatus:Utf8, sum(lineitem.l_quantity)[sum]:Decimal128(25, 2);N, sum(lineitem.l_extendedprice)[sum]:Decimal128(25, 2);N, sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)[sum]:Decimal128(38, 4);N, sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax)[sum]:Decimal128(38, 6);N, avg(lineitem.l_quantity)[count]:UInt64;N, avg(lineitem.l_quantity)[sum]:Decimal128(15, 2);N, avg(lineitem.l_extendedprice)[count]:UInt64;N, avg(lineitem.l_extendedprice)[sum]:Decimal128(15, 2);N, avg(lineitem.l_discount)[count]:UInt64;N, avg(lineitem.l_discount)[sum]:Decimal128(15, 2);N, count(*)[count]:Int64;N]
CoalesceBatchesExec: target_batch_size=8192, schema=[l_quantity:Decimal128(15, 2), l_extendedprice:Decimal128(15, 2), l_discount:Decimal128(15, 2), l_tax:Decimal128(15, 2), l_returnflag:Utf8, l_linestatus:Utf8, l_shipdate:Date32]
FilterExec: l_shipdate@6 <= 1998-09-02, schema=[l_quantity:Decimal128(15, 2), l_extendedprice:Decimal128(15, 2), l_discount:Decimal128(15, 2), l_tax:Decimal128(15, 2), l_returnflag:Utf8, l_linestatus:Utf8, l_shipdate:Date32]

So to make this query faster, we would have to support several types (Decimal128, Utf8 and Date32).
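As a rough illustration of what "supporting several types" could mean, here is a minimal, dependency-free sketch. The `Column` enum, `append_filtered`, and `copy_selected` are hypothetical names invented for this example; they are not Arrow or DataFusion APIs, and real Arrow arrays (validity bitmaps, offset buffers for strings) are considerably more involved. The sketch only shows that each type needs its own "copy the selected rows into the output" path.

```rust
/// Hypothetical, simplified column representation (NOT an Arrow type).
/// Plain `Vec`s are used so the sketch stays dependency-free.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    /// Decimal128 values stored as unscaled integers
    Decimal128(Vec<i128>),
    /// Utf8 strings
    Utf8(Vec<String>),
    /// Date32 values: days since the UNIX epoch
    Date32(Vec<i32>),
}

impl Column {
    /// Number of rows in this column.
    fn len(&self) -> usize {
        match self {
            Column::Decimal128(v) => v.len(),
            Column::Utf8(v) => v.len(),
            Column::Date32(v) => v.len(),
        }
    }

    /// Append the rows of `input` whose `mask` entry is true directly onto
    /// `self`, dispatching on the column type. Returns an error if the types
    /// differ or the mask does not cover every input row.
    fn append_filtered(&mut self, input: &Column, mask: &[bool]) -> Result<(), String> {
        if input.len() != mask.len() {
            return Err("mask length does not match input length".to_string());
        }
        match (self, input) {
            (Column::Decimal128(out), Column::Decimal128(inp)) => copy_selected(out, inp, mask),
            (Column::Utf8(out), Column::Utf8(inp)) => copy_selected(out, inp, mask),
            (Column::Date32(out), Column::Date32(inp)) => copy_selected(out, inp, mask),
            _ => return Err("column type mismatch".to_string()),
        }
        Ok(())
    }
}

/// Copy the selected values of `input` onto the end of `out`.
fn copy_selected<T: Clone>(out: &mut Vec<T>, input: &[T], mask: &[bool]) {
    out.extend(
        input
            .iter()
            .zip(mask)
            .filter(|(_, keep)| **keep)
            .map(|(value, _)| value.clone()),
    );
}
```

In this model, every type in the schemas above would need its own arm before the combined operator could handle this query end to end.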

alamb · 2024-07-25T10:31:06Z

datafusion/physical-plan/src/filter.rs

@@ -278,10 +279,12 @@ impl ExecutionPlan for FilterExec {
        trace!("Start FilterExec::execute for partition {} of context session_id {} and task_id {:?}", partition, context.session_id(), context.task_id());
        let baseline_metrics = BaselineMetrics::new(&self.metrics, partition);
        Ok(Box::pin(FilterExecStream {


The changes here show how to integrate Coalesce into Filter, which seems reasonable so far.

alamb · 2024-07-25T11:31:59Z

datafusion/physical-plan/src/coalescer/mod.rs

+            .map(|a| filter_array(a, &filter))
+            .collect::<Result<Vec<_>, _>>()?;
+        let options = RecordBatchOptions::default().with_row_count(Some(filter.count()));
+        let filtered_batch =


The key goal is to avoid this batch materialization. I'll keep hacking on it tomorrow / later
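To make the goal concrete, here is a minimal sketch of a coalescer that applies the filter while it accumulates rows, so no intermediate filtered batch is ever built. It is a toy model over a single `i64` column using plain `Vec`s. `FilteringCoalescer`, `push_filtered`, and `finish` are hypothetical names for illustration only, not the `BatchCoalescer` API from this PR.

```rust
/// Hypothetical stand-in for a coalescer that also applies the filter.
/// Models a single i64 column; NOT the DataFusion `BatchCoalescer`.
struct FilteringCoalescer {
    /// Number of rows each output batch should contain
    target_batch_size: usize,
    /// Rows accumulated so far for the next output batch
    in_progress: Vec<i64>,
    /// Completed output batches, ready to be emitted
    completed: Vec<Vec<i64>>,
}

impl FilteringCoalescer {
    fn new(target_batch_size: usize) -> Self {
        assert!(target_batch_size > 0, "target_batch_size must be non-zero");
        Self {
            target_batch_size,
            in_progress: Vec::with_capacity(target_batch_size),
            completed: Vec::new(),
        }
    }

    /// Copy the selected rows of `batch` straight into the in-progress
    /// buffer. Each surviving row is copied exactly once; there is no
    /// separate "filter, then concatenate" step.
    fn push_filtered(&mut self, batch: &[i64], mask: &[bool]) {
        assert_eq!(batch.len(), mask.len(), "mask must cover every row");
        for (value, keep) in batch.iter().zip(mask) {
            if *keep {
                self.in_progress.push(*value);
                if self.in_progress.len() == self.target_batch_size {
                    // The buffer is full: move it to the completed list
                    // and start a fresh one.
                    let full = std::mem::replace(
                        &mut self.in_progress,
                        Vec::with_capacity(self.target_batch_size),
                    );
                    self.completed.push(full);
                }
            }
        }
    }

    /// Flush any remaining rows at end of input and return all batches.
    fn finish(mut self) -> Vec<Vec<i64>> {
        if !self.in_progress.is_empty() {
            let rest = std::mem::take(&mut self.in_progress);
            self.completed.push(rest);
        }
        self.completed
    }
}
```

In this model, a `FilterExec`-like caller would evaluate its predicate to get `mask`, call `push_filtered` for each input batch, and emit batches as they complete. A real implementation would presumably work on Arrow arrays column by column rather than row by row.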

github-actions · 2024-11-02T01:58:47Z

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

This was referenced Jul 25, 2024

Add comments and tests for gc_string_view_batch XiangpengHao/datafusion#1

Merged

Minor: use ready! macro to simplify FilterExec #11649

Merged

andygrove mentioned this pull request Jul 25, 2024

[EPIC] Performance focus for 0.2.0 Release apache/datafusion-comet#717

Closed

5 tasks

alamb mentioned this pull request Aug 8, 2024

Improve performance of high cardinality grouping by reusing hash values #11680

Open

alamb added 3 commits August 18, 2024 08:45

Minor: Extract BatchCoalescer to its own module

3f9bf97

Add coalescing into FilterExec

f10db59

move filtering logic into the coalescer

e4d7e55

alamb force-pushed the alamb/combined_coalesce branch from 0fce388 to e4d7e55 Compare August 18, 2024 12:58

github-actions bot added the physical-expr Changes to the physical-expr crates label Aug 18, 2024

github-actions bot added the Stale PR has not had any activity for some time label Nov 2, 2024

github-actions bot closed this Nov 10, 2024

alamb commented Jul 25, 2024 (edited by yjshen)
