Update docs for nvidia target fuse settings #2660

Merged 6 commits on Mar 1, 2025
4 changes: 2 additions & 2 deletions docs/sphinx/using/backends/sims/svsims.rst
@@ -107,7 +107,7 @@ setting the target. It is worth drawing attention to gate fusion, a powerful too
 - Description
 * - ``CUDAQ_FUSION_MAX_QUBITS``
 - positive integer
-- The max number of qubits used for gate fusion. The default value is `4`.
+- The max number of qubits used for gate fusion. The default value depends on `GPU Compute Capability <https://developer.nvidia.com/cuda-gpus>`__ (CC) and the floating point precision selected for the simulator. Specifically, for CC 8.0, 9.0, and 10.0 the defaults are `4`, `5`, and `5` for `FP32`. For `FP64` the corresponding defaults are `5`, `6`, and `4`. For all other CC, the default is `4` for both precision modes.
 * - ``CUDAQ_FUSION_DIAGONAL_GATE_MAX_QUBITS``
 - integer greater than or equal to -1
 - The max number of qubits used for diagonal gate fusion. The default value is set to `-1` and the fusion size will be automatically adjusted for the better performance. If 0, the gate fusion for diagonal gates is disabled.
@@ -232,7 +232,7 @@ prior to setting the target.
 - The qubit count threshold where state vector distribution is activated. Below this threshold, simulation is performed as independent (non-distributed) tasks across all MPI processes for optimal performance. Default is 25.
 * - ``CUDAQ_MGPU_FUSE``
 - positive integer
-- The max number of qubits used for gate fusion. The default value is `6` if there are more than one MPI processes or `4` otherwise.
+- The max number of qubits used for gate fusion. The default value depends on `GPU Compute Capability <https://developer.nvidia.com/cuda-gpus>`__ (CC) and the floating point precision selected for the simulator. Specifically, for CC 8.0, 9.0, and 10.0 the defaults are `4`, `5`, and `5` for `FP32`. For `FP64` the corresponding defaults are `5`, `6`, and `4`. For all other CC, the default is `4` for both precision modes.
 * - ``CUDAQ_MGPU_P2P_DEVICE_BITS``
 - positive integer
 - Specify the number of GPUs that can communicate by using GPUDirect P2P. Default value is 0 (P2P communication is disabled).
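For reviewers who want to sanity-check the new wording, the defaults described in both changed rows can be written out as a small lookup table. This is only a sketch of the documented values, not the simulator's actual selection logic. The helper names `documented_fusion_default` and `override_fusion_size` are hypothetical and exist only for this illustration; the environment variable name comes from the docs text in the diff.

```python
import os

# Defaults transcribed from the updated docs text in this PR.
# Keys are (major, minor) GPU Compute Capability; values are max fused qubits.
_DOCUMENTED_DEFAULTS = {
    "fp32": {(8, 0): 4, (9, 0): 5, (10, 0): 5},
    "fp64": {(8, 0): 5, (9, 0): 6, (10, 0): 4},
}
_FALLBACK_DEFAULT = 4  # "For all other CC, the default is 4 for both precision modes."


def documented_fusion_default(compute_capability, precision):
    """Return the documented default fusion size (hypothetical helper)."""
    key = precision.lower()
    if key not in _DOCUMENTED_DEFAULTS:
        raise ValueError(f"unknown precision: {precision!r}")
    return _DOCUMENTED_DEFAULTS[key].get(tuple(compute_capability), _FALLBACK_DEFAULT)


def override_fusion_size(max_qubits, env=os.environ):
    """Set CUDAQ_FUSION_MAX_QUBITS (hypothetical helper).

    Per the docs, the variable must be set before the target is set.
    """
    value = int(max_qubits)
    if value < 1:
        raise ValueError("CUDAQ_FUSION_MAX_QUBITS must be a positive integer")
    env["CUDAQ_FUSION_MAX_QUBITS"] = str(value)
```

Calling `override_fusion_size(5)` before selecting the target would pin the fusion size regardless of GPU or precision. The same table applies to ``CUDAQ_MGPU_FUSE`` according to the second hunk.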