Further improve datafusion-cli memory usage when a huge value is set for maxrows. #14810

Open
Tracked by #15271
zhuqi-lucas opened this issue Feb 21, 2025 · 6 comments · Fixed by #14877 · May be fixed by #14954
Labels
enhancement New feature or request

Comments

@zhuqi-lucas
Contributor

Is your feature request related to a problem or challenge?

This is a follow-up to the comment below:

#14766 (comment)

Describe the solution you'd like

Make datafusion-cli print batches in a streaming fashion, as they arrive.

Describe alternatives you've considered

No response

Additional context

No response

@alamb
Contributor

alamb commented Feb 21, 2025

Basically the idea of this ticket is to print rows as they come in, batch by batch, rather than buffering them all up first

I think this will take some non-trivial work, as the formatter wants to know the width of all cells up front

I believe Postgres does something like "compute column widths based on the first 1000 cells" and then just has a crappy display if the rows after that happen to have wider columns
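
To make the "widths from a preview" idea concrete, here is a minimal sketch in plain Rust. It is not DataFusion's or Postgres's actual code: `preview_widths` and `format_row` are hypothetical helper names, and rows are modeled as `Vec<String>` rather than Arrow record batches. Widths are fixed from the header plus the first `preview_rows` rows, and any later cell that is wider simply overflows its column.

```rust
/// Compute per-column display widths from the header and only the first
/// `preview_rows` rows (hypothetical helper, not a DataFusion API).
fn preview_widths(header: &[&str], rows: &[Vec<String>], preview_rows: usize) -> Vec<usize> {
    // Start from the header widths so short data never shrinks a column.
    let mut widths: Vec<usize> = header.iter().map(|h| h.chars().count()).collect();
    for row in rows.iter().take(preview_rows) {
        for (w, cell) in widths.iter_mut().zip(row) {
            *w = (*w).max(cell.chars().count());
        }
    }
    widths
}

/// Format one row using the fixed widths. Cells wider than the previewed
/// width are not truncated; they just push the border out of alignment.
fn format_row(cells: &[String], widths: &[usize]) -> String {
    let mut line = String::from("|");
    for (cell, w) in cells.iter().zip(widths) {
        line.push_str(&format!(" {:<width$} |", cell, width = *w));
    }
    line
}
```

With a preview of one row, a later row containing a longer value still prints, only misaligned, which is the trade-off described above.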

@zhuqi-lucas
Contributor Author

Thank you @alamb for the great idea.

Besides this improvement, I also found a bug in the unlimited case that the buffer handling currently misses. I filed a ticket for it:

#14814

@zhuqi-lucas
Contributor Author

zhuqi-lucas commented Feb 25, 2025

Redesigned datafusion-cli execution and printing so that output is fully streamed, without buffering the results in memory.
After this PR, datafusion-cli adds almost no extra memory usage of its own.

  1. Change all datafusion-cli printing to streaming.
  2. Support computing the table column widths from a preview of the first rows; the default is 1000 lines.
  3. Support max rows for the csv, json, and other formats, which did not support it before.
  4. Remove the datafusion-cli memory overhead entirely: the only memory held is the current batch of the batch iterator.
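
As a rough illustration of points 1, 3, and 4, the sketch below streams batches to a writer while applying a row cap. It is an assumption-laden toy, not the code from the PR: `stream_print`, `MaxRows`, and `PrintSummary` are hypothetical names, and a batch is modeled as a `Vec<String>` of already formatted rows. Only the current batch is alive at any time, and rows past the cap are still counted so that a "N row(s) fetched, first M displayed" style summary can be reported.

```rust
use std::io::Write;

/// Hypothetical row cap, mirroring the idea of `--maxrows N` vs `--maxrows inf`.
enum MaxRows {
    Unlimited,
    Limited(usize),
}

/// Counters gathered while streaming (hypothetical type).
struct PrintSummary {
    rows_seen: usize,
    rows_printed: usize,
}

/// Print rows batch by batch; each batch is dropped before the next one
/// is pulled from the iterator, so nothing is buffered across batches.
fn stream_print<I, W>(batches: I, max_rows: MaxRows, out: &mut W) -> std::io::Result<PrintSummary>
where
    I: IntoIterator<Item = Vec<String>>,
    W: Write,
{
    let mut summary = PrintSummary { rows_seen: 0, rows_printed: 0 };
    for batch in batches {
        for row in &batch {
            summary.rows_seen += 1;
            let under_cap = match &max_rows {
                MaxRows::Unlimited => true,
                MaxRows::Limited(n) => summary.rows_printed < *n,
            };
            if under_cap {
                writeln!(out, "{}", row)?;
                summary.rows_printed += 1;
            }
        }
        // `batch` goes out of scope here; only the counters survive.
    }
    Ok(summary)
}
```

The same loop works for any output format, because the cap is applied per row before formatting-specific code would run.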

Submitted the PR, and I will continue testing the corner cases.

Basic tests are all good.

@zhuqi-lucas
Contributor Author

/usr/bin/time -l cargo run --profile release-nonlto --   -m 3G --mem-pool-type fair --maxrows 1   --format table -f '/Users/zhuqi/arrow-datafusion/benchmarks/data/external_sort.sql'
   Compiling datafusion-cli v45.0.0 (/Users/zhuqi/arrow-datafusion/datafusion-cli)
    Finished `release-nonlto` profile [optimized] target(s) in 23.18s
     Running `/Users/zhuqi/arrow-datafusion/target/release-nonlto/datafusion-cli -m 3G --mem-pool-type fair --maxrows 1 --format table -f /Users/zhuqi/arrow-datafusion/benchmarks/data/external_sort.sql`
DataFusion CLI v45.0.0
0 row(s) fetched.
Elapsed 0.005 seconds.

+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+------------+--------------+---------------+
| l_orderkey | l_partkey | l_suppkey | l_linenumber | l_quantity | l_extendedprice | l_discount | l_tax | l_shipdate | l_commitdate | l_receiptdate |
+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+------------+--------------+---------------+
| 1          | 1551894   | 76910     | 1            | 17.00      | 33078.94        | 0.04       | 0.02  | 1996-03-13 | 1996-02-12   | 1996-03-22    |
| .          | .         | .         | .            | .          | .               | .          | .     | .          | .            | .             |
| .          | .         | .         | .            | .          | .               | .          | .     | .          | .            | .             |
| .          | .         | .         | .            | .          | .               | .          | .     | .          | .            | .             |
+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+------------+--------------+---------------+
59986052 row(s) fetched. (First 1 displayed. Use --maxrows to adjust)
Elapsed 4.252 seconds.

       32.07 real        12.73 user         6.46 sys
          3856441344  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              994646  page reclaims
                   0  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                2279  voluntary context switches
               98027  involuntary context switches
        304329202988  instructions retired
         72023941353  cycles elapsed
          3847026176  peak memory footprint

/usr/bin/time -l cargo run --profile release-nonlto --   -m 3G --mem-pool-type fair --maxrows 1   --format csv -f '/Users/zhuqi/arrow-datafusion/benchmarks/data/external_sort.sql'
    Finished `release-nonlto` profile [optimized] target(s) in 0.32s
     Running `/Users/zhuqi/arrow-datafusion/target/release-nonlto/datafusion-cli -m 3G --mem-pool-type fair --maxrows 1 --format csv -f /Users/zhuqi/arrow-datafusion/benchmarks/data/external_sort.sql`
DataFusion CLI v45.0.0
0 row(s) fetched.
Elapsed 0.006 seconds.

l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_shipdate,l_commitdate,l_receiptdate
1,1551894,76910,1,17.00,33078.94,0.04,0.02,1996-03-13,1996-02-12,1996-03-22
59986052 row(s) fetched. (First 1 displayed. Use --maxrows to adjust)
Elapsed 3.840 seconds.

        8.47 real        12.51 user         6.20 sys
          3736174592  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              954842  page reclaims
                   0  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                1528  voluntary context switches
              104430  involuntary context switches
        293439737861  instructions retired
         69591166435  cycles elapsed
          3726603512  peak memory footprint

@zhuqi-lucas
Contributor Author

Even when we stream-print the entire result, the memory usage stays about the same:

/usr/bin/time -l cargo run --profile release-nonlto --   -m 3G --mem-pool-type fair --maxrows inf   --format table -f '/Users/zhuqi/arrow-datafusion/benchmarks/data/external_sort.sql'



| 60000000   | 1258565   | 33602     | 2            | 15.00      | 22852.50        | 0.03       | 0.08  | 1997-11-03 | 1997-11-18   | 1997-11-05    |
| 60000000   | 698651    | 48664     | 3            | 46.00      | 75882.52        | 0.00       | 0.06  | 1997-09-04 | 1997-11-12   | 1997-09-05    |
| 60000000   | 224200    | 24201     | 4            | 37.00      | 41595.03        | 0.08       | 0.02  | 1997-11-17 | 1997-11-12   | 1997-12-14    |
| 60000000   | 118838    | 93842     | 5            | 28.00      | 51991.24        | 0.00       | 0.08  | 1997-09-29 | 1997-11-06   | 1997-09-30    |
| 60000000   | 1294851   | 19864     | 6            | 48.00      | 88597.92        | 0.03       | 0.07  | 1997-11-28 | 1997-10-05   | 1997-12-06    |
| 60000000   | 558286    | 33302     | 7            | 12.00      | 16131.12        | 0.02       | 0.05  | 1997-10-09 | 1997-10-27   | 1997-10-21    |
+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+------------+--------------+---------------+
59986052 row(s) fetched.
Elapsed 309.217 seconds.

      336.02 real       150.53 user        58.27 sys
          3714342912  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              960345  page reclaims
                   0  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
             8604616  voluntary context switches
              104476  involuntary context switches
       2973463503658  instructions retired
        712748119117  cycles elapsed
          3704357512  peak memory footprint

@alamb
Contributor

alamb commented Mar 1, 2025

We had to revert this change temporarily to get the 46 release out

So reopening the PR
