git submodule init
git submodule update
curl -LsSf https://astral.sh/uv/install.sh | sh
On NVIDIA GPUs:
UV_SKIP_WHEEL_FILENAME_CHECK=1 uv sync --extra cuda
. .venv/bin/activate
On AMD GPUs:
UV_SKIP_WHEEL_FILENAME_CHECK=1 uv sync --extra rocm
. .venv/bin/activate
Before proceeding, make sure you have activated the Python environment with `. .venv/bin/activate`.
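To confirm that the environment is active before running any experiment, a quick check such as the following may help. This is a minimal sketch: `venv_is_active` is a hypothetical helper name, and it assumes the activate script exports `VIRTUAL_ENV`, as standard virtual environment activate scripts do.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when VIRTUAL_ENV is set and non-empty.
# Assumption: the activate script exports VIRTUAL_ENV.
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if venv_is_active; then
    echo "virtual environment active: ${VIRTUAL_ENV}"
else
    echo "no virtual environment active; run: . .venv/bin/activate" >&2
fi
```

If the second message appears, activate the environment from the repository root and try again.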
cd fig13_14
- Start the experiments with `python run_experiment_nv.py` on NVIDIA or `python run_experiment_amd.py` on AMD. The results are both displayed in the terminal and stored in a JSON file.
- Plot Figure 13 with `python fig13.py`. The script automatically reads the JSON file and plots the AMD part of the figure.
- Plot Figure 14 with `python fig14.py`.
- Check the generated `fig13.pdf` and `fig14.pdf`.
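The final check can also be scripted. The sketch below assumes it is run from inside `fig13_14` after both plotting scripts have finished; `check_outputs` is a hypothetical helper, not part of the repository, and it only tests that each file exists and is non-empty.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reports every argument that is missing or empty
# and returns non-zero if at least one such file was found.
check_outputs() {
    local missing=0
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_outputs fig13.pdf fig14.pdf; then
    echo "both figures were generated"
else
    echo "at least one figure is missing; rerun the plotting scripts" >&2
fi
```

The same helper could be reused for `fig15.pdf` by passing a different file name.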
cd fig15
- Start the experiments with `bash run_exp_and_visualize.sh`.
- Plot Figure 15 with `python fig15.py`. The generated figure is `fig15.pdf`.
cd fig8
bash run.sh
The proton profile will be visualized and printed in the terminal output.
cd fig9
bash run.sh
Data and figures will be generated under `fig9/triton_kernels/bench/logs/gpt-oss-x2`.
cd fig10
bash run.sh
Note: This experiment does not work with ROCm 7.0 due to ROCm bugs; it requires ROCm 6.4. Please try `uv pip install https://download.pytorch.org/whl/rocm6.4/torch-2.9.0%2Brocm6.4-cp313-cp313-manylinux_2_28_x86_64.whl`.
Similar to Figure 9, data and figures will be generated under `fig10/triton_kernels/bench/logs/gpt-oss-x2`.
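On AMD, it may be worth checking which ROCm version the installed PyTorch build targets before running this experiment. The sketch below reads `torch.version.hip`, which is `None` on non-ROCm builds; `is_rocm_6_4` is a hypothetical helper, and matching on the `6.4` prefix is an assumption about how the version string is formatted.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when the given HIP/ROCm version string
# belongs to the 6.4 release series (for example "6.4" or "6.4.<patch>").
is_rocm_6_4() {
    case "$1" in
        6.4|6.4.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Query the active environment; the value stays empty if python or
# torch is unavailable.
hip_version=$(python -c 'import torch; print(torch.version.hip)' 2>/dev/null || true)

if is_rocm_6_4 "$hip_version"; then
    echo "PyTorch targets ROCm ${hip_version}"
else
    echo "PyTorch reports HIP version '${hip_version:-unknown}'; this experiment expects ROCm 6.4" >&2
fi
```

If the check reports a different version, installing the ROCm 6.4 wheel mentioned in the note is the suggested remedy.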
cd fig12
bash run.sh
Similar to Figure 8, the proton profile will be visualized and printed in the terminal.