
Commit e799097

Improve benchmark docs (#14820)
* Correct docs on subcommand help. The command as written gives you `cargo` help, not `tpch` help as the text above says. And the output shown was for the benchmark bin, not the subcommand. Also correct some inconsistencies and punctuation.
* Docs on how to add a new benchmark
* Improve wording and punctuation in benchmarks README
* Remove help text about /benchmark PR command that's disabled
1 parent cec457a commit e799097

File tree

1 file changed: +83 -21 lines changed


benchmarks/README.md

Lines changed: 83 additions & 21 deletions
@@ -85,7 +85,7 @@ git checkout main
 # Gather baseline data for tpch benchmark
 ./benchmarks/bench.sh run tpch

-# Switch to the branch the branch name is mybranch and gather data
+# Switch to the branch named mybranch and gather data
 git checkout mybranch
 ./benchmarks/bench.sh run tpch

@@ -157,22 +157,19 @@ Benchmark tpch_mem.json
 └──────────────┴──────────────┴──────────────┴───────────────┘
 ```

-Note that you can also execute an automatic comparison of the changes in a given PR against the base
-just by including the trigger `/benchmark` in any comment.
-
 ### Running Benchmarks Manually

-Assuming data in the `data` directory, the `tpch` benchmark can be run with a command like this
+Assuming data is in the `data` directory, the `tpch` benchmark can be run with a command like this:

 ```bash
 cargo run --release --bin dfbench -- tpch --iterations 3 --path ./data --format tbl --query 1 --batch-size 4096
 ```

-See the help for more details
+See the help for more details.

 ### Different features

-You can enable `mimalloc` or `snmalloc` (to use either the mimalloc or snmalloc allocator) as features by passing them in as `--features`. For example
+You can enable `mimalloc` or `snmalloc` (to use either the mimalloc or snmalloc allocator) as features by passing them in as `--features`. For example:

 ```shell
 cargo run --release --features "mimalloc" --bin tpch -- benchmark datafusion --iterations 3 --path ./data --format tbl --query 1 --batch-size 4096
@@ -184,6 +181,7 @@ The benchmark program also supports CSV and Parquet input file formats and a uti
 ```bash
 cargo run --release --bin tpch -- convert --input ./data --output /mnt/tpch-parquet --format parquet
 ```
+
 Or if you want to verify and run all the queries in the benchmark, you can just run `cargo test`.

 ### Comparing results between runs
@@ -206,7 +204,7 @@ $ cargo run --release --bin tpch -- benchmark datafusion --iterations 5 --path .
 ./compare.py /tmp/output_main/tpch-summary--1679330119.json /tmp/output_branch/tpch-summary--1679328405.json
 ```

-This will produce output like
+This will produce output like:

 ```
 ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
@@ -243,28 +241,92 @@ The `dfbench` program contains subcommands to run the various
 benchmarks. When benchmarking, it should always be built in release
 mode using `--release`.

-Full help for each benchmark can be found in the relevant sub
-command. For example to get help for tpch, run
+Full help for each benchmark can be found in the relevant
+subcommand. For example, to get help for tpch, run:

 ```shell
-cargo run --release --bin dfbench --help
+cargo run --release --bin dfbench -- tpch --help
 ...
-datafusion-benchmarks 27.0.0
-benchmark command
+dfbench-tpch 45.0.0
+Run the tpch benchmark.
+
+This benchmarks is derived from the [TPC-H][1] version
+[2.17.1]. The data and answers are generated using `tpch-gen` from
+[2].
+
+[1]: http://www.tpc.org/tpch/
+[2]: https://github.com/databricks/tpch-dbgen.git,
+[2.17.1]: https://www.tpc.org/tpc_documents_current_versions/pdf/tpc-h_v2.17.1.pdf

 USAGE:
-    dfbench <SUBCOMMAND>
+    dfbench tpch [FLAGS] [OPTIONS] --path <path>
+
+FLAGS:
+    -d, --debug
+            Activate debug mode to see more details

-SUBCOMMANDS:
-    clickbench      Run the clickbench benchmark
-    help            Prints this message or the help of the given subcommand(s)
-    parquet-filter  Test performance of parquet filter pushdown
-    sort            Test performance of parquet filter pushdown
-    tpch            Run the tpch benchmark.
-    tpch-convert    Convert tpch .slt files to .parquet or .csv files
+    -S, --disable-statistics
+            Whether to disable collection of statistics (and cost based optimizations) or not

+    -h, --help
+            Prints help information
+...
 ```

+# Writing a new benchmark
+
+## Creating or downloading data outside of the benchmark
+
+If you want to create or download the data with Rust as part of running the benchmark, see the next
+section on adding a benchmark subcommand and add code to create or download data as part of its
+`run` function.
+
+If you want to create or download the data with shell commands, in `benchmarks/bench.sh`, define a
+new function named `data_[your benchmark name]` and call that function in the `data` command case
+as a subcommand case named for your benchmark. Also call the new function in the `data all` case.
+
+## Adding the benchmark subcommand
+
+In `benchmarks/bench.sh`, define a new function named `run_[your benchmark name]` following the
+example of existing `run_*` functions. Call that function in the `run` command case as a subcommand
+case named for your benchmark. Also call the new function in the
+`run all` case. Add documentation for your benchmark to the text in the `usage` function.
+
+In `benchmarks/src/bin/dfbench.rs`, add a `dfbench` subcommand for your benchmark by:
+
+- Adding a new variant to the `Options` enum
+- Adding corresponding code to handle the new variant in the `main` function, similar to the other
+  variants
+- Adding a module to the `use datafusion_benchmarks::{}` statement
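As a rough illustration of that enum-plus-dispatch pattern, here is a standalone sketch using only the standard library. The names (`MyBench`, `dispatch`) are hypothetical; the real `dfbench.rs` derives its `Options` enum with structopt and each match arm awaits the benchmark module's `run` method.

```rust
// Hypothetical sketch of the dfbench dispatch pattern; "MyBench" is an
// illustrative variant, not a real benchmark, and the real code parses
// these options from the command line via structopt.
#[derive(Debug)]
enum Options {
    Tpch,
    MyBench, // the new variant you would add for your benchmark
}

fn dispatch(opt: Options) -> String {
    // In the real main(), each arm calls into the benchmark module
    // (e.g. tpch::RunOpt::run()); here we just report which arm ran.
    match opt {
        Options::Tpch => "running tpch".to_string(),
        Options::MyBench => "running mybench".to_string(),
    }
}

fn main() {
    println!("{}", dispatch(Options::MyBench));
}
```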
+
+In `benchmarks/src/lib.rs`, declare the new module you imported in `dfbench.rs` and create the
+corresponding file(s) for the module's code.
+
+In the module, following the pattern of other existing benchmarks, define a `RunOpt` struct with:
+
+- A doc comment that will become the `--help` output for the subcommand
+- A `run` method that the `dfbench` `main` function will call.
+- A `--path` structopt field that the `bench.sh` script should use with `${DATA_DIR}` to define
+  where the input data should be stored.
+- An `--output` structopt field that the `bench.sh` script should use with `"${RESULTS_FILE}"` to
+  define where the benchmark's results should be stored.
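A minimal sketch of such a `RunOpt` struct, using only the standard library: the field and method names follow the pattern described above, but the real structs derive `StructOpt` (so the doc comment and field attributes become `--help` text and CLI flags), and their `run` methods are async and return a DataFusion `Result`.

```rust
use std::path::PathBuf;

/// Run the mybench benchmark. (In the real code this doc comment becomes
/// the subcommand's --help text via structopt; everything here is an
/// illustrative sketch, not DataFusion's actual API.)
#[derive(Debug)]
struct RunOpt {
    /// Path to the input data; bench.sh would pass ${DATA_DIR} here
    path: PathBuf,
    /// Where to write results JSON, if requested; bench.sh would pass ${RESULTS_FILE}
    output: Option<PathBuf>,
}

impl RunOpt {
    // The dfbench main function calls this entry point for the subcommand.
    fn run(&self) -> Result<(), String> {
        println!("reading benchmark data from {}", self.path.display());
        // ... set up queries, run iterations, record timings ...
        Ok(())
    }
}

fn main() {
    let opt = RunOpt {
        path: PathBuf::from("./data"),
        output: None,
    };
    opt.run().unwrap();
}
```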
+
+### Creating or downloading data as part of the benchmark
+
+Use the `--path` structopt field defined on the `RunOpt` struct to know where to store or look for
+the data. Generate the data using whatever Rust code you'd like, before the code that will be
+measuring an operation.
+
+### Collecting data
+
+Your benchmark should create and use an instance of `BenchmarkRun` defined in `benchmarks/src/util/run.rs` as follows:
+
+- Call its `start_new_case` method with a string that will appear in the "Query" column of the
+  compare output.
+- Use `write_iter` to record elapsed times for the behavior you're benchmarking.
+- When all cases are done, call the `BenchmarkRun`'s `maybe_write_json` method, giving it the value
+  of the `--output` structopt field on `RunOpt`.
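The calling pattern above can be sketched with a minimal stand-in type. This is not the real `BenchmarkRun` from `benchmarks/src/util/run.rs` (which records run metadata and serializes the results to JSON); it only mirrors the three-method flow so the shape of a benchmark's measurement loop is visible.

```rust
use std::time::{Duration, Instant};

// Minimal stand-in for BenchmarkRun, mimicking its calling pattern only;
// the real type lives in benchmarks/src/util/run.rs.
#[derive(Default)]
struct BenchmarkRun {
    // One entry per case: (case name, elapsed time of each iteration)
    cases: Vec<(String, Vec<Duration>)>,
}

impl BenchmarkRun {
    fn start_new_case(&mut self, name: &str) {
        self.cases.push((name.to_string(), Vec::new()));
    }

    fn write_iter(&mut self, elapsed: Duration) {
        if let Some((_, iters)) = self.cases.last_mut() {
            iters.push(elapsed);
        }
    }

    fn maybe_write_json(&self, output: Option<&str>) {
        // The real method serializes the whole run to JSON at the --output path.
        if let Some(path) = output {
            println!("would write {} case(s) to {path}", self.cases.len());
        }
    }
}

fn main() {
    let mut run = BenchmarkRun::default();
    run.start_new_case("Query 1");
    let start = Instant::now();
    // ... the operation being measured would go here ...
    run.write_iter(start.elapsed());
    run.maybe_write_json(Some("results.json"));
}
```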
+
 # Benchmarks

 The output of `dfbench` help includes a description of each benchmark, which is reproduced here for convenience
