docs: expand profiling guide with JVM and async-profiler coverage #3628
andygrove wants to merge 2 commits into apache:main
Rename profiling_native_code.md to profiling.md and add sections for async-profiler (unified JVM + native flame graphs), Java Flight Recorder, a tool comparison table, and practical tips for profiling Comet's mixed JVM/Rust execution.
Thanks @andygrove. We might want to include a full working example.
For example, I use:
TPCH Local Q4 (Profiler)
JAVA_HOME=/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home \
SPARK_HOME=/opt/homebrew/Cellar/apache-spark/3.5.5/libexec \
spark-submit \
  --master="local[*]" \
  --conf spark.driver.memory=8G \
  --conf spark.executor.instances=1 \
  --conf spark.executor.cores=4 \
  --conf spark.cores.max=4 \
  --conf spark.executor.memory=16g \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=16g \
  --conf spark.comet.explainFallback.enabled=true \
  --conf spark.comet.logFallbackReasons.enabled=true \
  --conf spark.eventLog.enabled=true \
  --jars $COMET_JAR \
  --driver-class-path $COMET_JAR \
  --conf spark.driver.extraClassPath=$COMET_JAR \
  --conf spark.executor.extraClassPath=$COMET_JAR \
  --conf spark.plugins=org.apache.spark.CometPlugin \
  --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
  --conf spark.comet.enabled=true \
  --conf spark.comet.exec.shuffle.enableFastEncoding=true \
  --conf spark.comet.exec.shuffle.fallbackToColumnar=true \
  --conf spark.comet.exec.replaceSortMergeJoin=true \
  --conf spark.comet.cast.allowIncompatible=true \
  --conf spark.comet.scan.impl=native_iceberg_compat \
  --conf "spark.executor.extraJavaOptions=-agentpath:/Users/xxx/Downloads/async-profiler-4.3-macos/lib/libasyncProfiler.dylib=start,event=cpu,file=profile-executor.html,tree" \
  --conf "spark.driver.extraJavaOptions=-agentpath:/Users/xxx/Downloads/async-profiler-4.3-macos/lib/libasyncProfiler.dylib=start,event=cpu,file=profile-driver.html,tree" \
  dev/benchmarks/tpcbench.py \
  --name comet \
  --benchmark tpch \
  --data ../datafusion-benchmarks/tpch/data \
  --queries ../datafusion-benchmarks/tpch/queries \
  --output . \
  --iterations 1 \
  --query 4
The `file=profile-driver.html,tree` option also helps a bit: it renders the output as a tree hierarchy.
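The two `-agentpath` options in the command differ only in the output file name, so they can be generated from one helper. The following is a minimal sketch, not part of the PR: the `ASYNC_PROFILER_LIB` variable, its placeholder default path, and the `agent_opts` helper are assumptions made for illustration, while the agent arguments (`start,event=cpu,file=...,tree`) are copied from the command above.

```shell
# Hypothetical placeholder path; point this at your own libasyncProfiler.so / .dylib.
ASYNC_PROFILER_LIB="${ASYNC_PROFILER_LIB:-/path/to/libasyncProfiler.dylib}"

# Build the JVM agent option for one output file.
# The helper name is an assumption; the agent arguments come from the command above.
agent_opts() {
  printf -- '-agentpath:%s=start,event=cpu,file=%s,tree' "$ASYNC_PROFILER_LIB" "$1"
}

DRIVER_OPTS="$(agent_opts profile-driver.html)"
EXECUTOR_OPTS="$(agent_opts profile-executor.html)"

# Print the two --conf arguments so they can be pasted into spark-submit.
echo "--conf spark.driver.extraJavaOptions=$DRIVER_OPTS"
echo "--conf spark.executor.extraJavaOptions=$EXECUTOR_OPTS"
```

Keeping the library path in one variable makes it easier to switch between the macOS `.dylib` and a Linux `.so` without editing both options.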
| ### Integrated benchmark profiling | ||
|
|
||
| The TPC benchmark scripts in `benchmarks/tpc/` have built-in async-profiler support via | ||
| the `--async-profiler` flag. See [benchmarks/tpc/README.md](../../benchmarks/tpc/README.md) |
| the `--async-profiler` flag. See [benchmarks/tpc/README.md](../../benchmarks/tpc/README.md) | |
| the `--async-profiler` flag. See [benchmarks/tpc/README.md](../../../benchmarks/tpc/README.md) |
| ### Integrated benchmark profiling | ||
|
|
||
| The TPC benchmark scripts support `--jfr` for automatic JFR recording during benchmark | ||
| runs. See [benchmarks/tpc/README.md](../../benchmarks/tpc/README.md) for details. |
| runs. See [benchmarks/tpc/README.md](../../benchmarks/tpc/README.md) for details. | |
| runs. See [benchmarks/tpc/README.md](../../../benchmarks/tpc/README.md) for details. |
|
|
||
| ```shell | ||
| # Linux x64 | ||
| wget https://github.com/async-profiler/async-profiler/releases/download/v3.0/async-profiler-3.0-linux-x64.tar.gz |
| wget https://github.com/async-profiler/async-profiler/releases/download/v3.0/async-profiler-3.0-linux-x64.tar.gz | |
| wget https://github.com/async-profiler/async-profiler/releases/download/v3.0/async-profiler-3.0-linux-x64.tar.gz | |
| mkdir -p /opt/async-profiler |
Also, `/opt` usually requires sudo privileges. Maybe use `$HOME/opt/...` instead?
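A sudo-free variant of the install step might look like the sketch below. The `INSTALL_DIR` variable, the `install_async_profiler` function name, and the `/tmp` download location are assumptions for illustration; the URL is the Linux x64 one from the guide. It also assumes the archive unpacks into a single top-level directory, which is why `--strip-components=1` is used.

```shell
# Install under the home directory so no sudo is needed (assumed layout).
INSTALL_DIR="${INSTALL_DIR:-$HOME/opt/async-profiler}"

# Release URL taken from the guide's Linux x64 example.
URL="https://github.com/async-profiler/async-profiler/releases/download/v3.0/async-profiler-3.0-linux-x64.tar.gz"

# Download and unpack into INSTALL_DIR; nothing runs until the function is called.
install_async_profiler() {
  mkdir -p "$INSTALL_DIR" &&
    wget -O /tmp/async-profiler.tar.gz "$URL" &&
    tar -xzf /tmp/async-profiler.tar.gz -C "$INSTALL_DIR" --strip-components=1
}
```

Calling `install_async_profiler` performs the download. Under the assumptions above, the agent library would then be expected at `$INSTALL_DIR/lib/libasyncProfiler.so`.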
| harness = false | ||
| ``` | ||
|
|
||
| These benchmarks are useful when for comparing performance between releases or between feature branches and the |
| These benchmarks are useful when for comparing performance between releases or between feature branches and the | |
| These benchmarks are useful for comparing performance between releases or between feature branches and the |
| ... | ||
| ``` | ||
|
|
||
| ### Choosing an event type |
| ### Choosing an event type | |
| Note: If the executor is distributed then `executor.html` will be written on the remote node. | |
| ### Choosing an event type |
Summary
Add detailed profiling guide - rendered version
Changes
- Renames `profiling_native_code.md` to `profiling.md` and expands it into a comprehensive profiling guide
- async-profiler: `asprof`, Java agent usage, event types, output formats, platform notes
- Java Flight Recorder: `jcmd` dynamic recording, viewer options, useful JFR events for Comet debugging
- References `benchmarks/tpc/README.md` for integrated benchmark profiling