
Commit 4dbe68b

Trajectory noisy sim docs

Signed-off-by: Thien Nguyen <[email protected]>
1 parent d2f3a95

File tree

6 files changed: +241 -22 lines
@@ -0,0 +1,44 @@
+/*******************************************************************************
+ * Copyright (c) 2022 - 2025 NVIDIA Corporation & Affiliates.                  *
+ * All rights reserved.                                                        *
+ *                                                                             *
+ * This source code and the accompanying materials are made available under   *
+ * the terms of the Apache License 2.0 which accompanies this distribution.    *
+ ******************************************************************************/
+
+// [Begin Documentation]
+#include <chrono>
+#include <cudaq.h>
+#include <iostream>
+
+struct xOp {
+  void operator()(int qubit_count) __qpu__ {
+    cudaq::qvector q(qubit_count);
+    x(q);
+    mz(q);
+  }
+};
+
+int main() {
+  // Add a simple bit-flip noise channel to the X gate
+  const double error_probability = 0.01;
+
+  cudaq::bit_flip_channel bit_flip(error_probability);
+  // Add noise channels to our noise model.
+  cudaq::noise_model noise_model;
+  // Apply the bit-flip channel to any X gate on any qubits
+  noise_model.add_all_qubit_channel<cudaq::types::x>(bit_flip);
+
+  const int qubit_count = 10;
+  const auto start_time = std::chrono::high_resolution_clock::now();
+  // Due to the impact of noise, our measurements will no longer be uniformly
+  // in the |1...1> state.
+  auto counts =
+      cudaq::sample({.shots = 1000, .noise = noise_model}, xOp{}, qubit_count);
+  const auto end_time = std::chrono::high_resolution_clock::now();
+  counts.dump();
+  const std::chrono::duration<double, std::milli> elapsed_time =
+      end_time - start_time;
+  std::cout << "Simulation elapsed time: " << elapsed_time.count() << "ms\n";
+  return 0;
+}

docs/sphinx/snippets/python/using/backends/trajectory.py

+3
@@ -10,6 +10,9 @@
 import cudaq

 # Use the `nvidia` target
+# Other targets capable of trajectory simulation are:
+# - `tensornet`
+# - `tensornet-mps`
 cudaq.set_target("nvidia")

 # Let's define a simple kernel that we will add noise to.
@@ -0,0 +1,46 @@
+# ============================================================================ #
+# Copyright (c) 2022 - 2024 NVIDIA Corporation & Affiliates.                   #
+# All rights reserved.                                                         #
+#                                                                              #
+# This source code and the accompanying materials are made available under    #
+# the terms of the Apache License 2.0 which accompanies this distribution.     #
+# ============================================================================ #
+
+#[Begin Docs]
+import time
+import cudaq
+# Use the `nvidia` target
+cudaq.set_target("nvidia")
+
+# Let's define a simple kernel that we will add noise to.
+qubit_count = 10
+
+
+@cudaq.kernel
+def kernel(qubit_count: int):
+    qvector = cudaq.qvector(qubit_count)
+    x(qvector)
+    mz(qvector)
+
+
+# Add a simple bit-flip noise channel to the X gate
+error_probability = 0.01
+bit_flip = cudaq.BitFlipChannel(error_probability)
+
+# Add noise channels to our noise model.
+noise_model = cudaq.NoiseModel()
+# Apply the bit-flip channel to any X gate on any qubits
+noise_model.add_all_qubit_channel("x", bit_flip)
+
+ideal_counts = cudaq.sample(kernel, qubit_count, shots_count=1000)
+
+start = time.time()
+# Due to the impact of noise, our measurements will no longer be uniformly
+# in the |1...1> state.
+noisy_counts = cudaq.sample(kernel,
+                            qubit_count,
+                            noise_model=noise_model,
+                            shots_count=1000)
+end = time.time()
+noisy_counts.dump()
+print(f"Simulation elapsed time: {(end - start) * 1000} ms")

docs/sphinx/snippets/python/using/backends/trajectory_observe.py

+3
@@ -11,6 +11,9 @@
 from cudaq import spin

 # Use the `nvidia` target
+# Other targets capable of trajectory simulation are:
+# - `tensornet`
+# - `tensornet-mps`
 cudaq.set_target("nvidia")

docs/sphinx/using/backends/sims/noisy.rst

+135-18
@@ -4,15 +4,12 @@ Noisy Simulators
 Trajectory Noisy Simulation
 ++++++++++++++++++++++++++++++++++

-The :code:`nvidia` target supports noisy quantum circuit simulations using
-quantum trajectory method across all configurations: single GPU, multi-node
-multi-GPU, and with host memory. When simulating many trajectories with small
-state vectors, the simulation is batched for optimal performance.
+The CUDA-Q GPU simulator backends :code:`nvidia`, :code:`tensornet`, and :code:`tensornet-mps`
+support noisy quantum circuit simulations using the quantum trajectory method.

-When a :code:`noise_model` is provided to CUDA-Q, the :code:`nvidia` target
+When a :code:`noise_model` is provided to CUDA-Q, the backend target
 will incorporate quantum noise into the quantum circuit simulation according
-to the noise model specified.
-
+to the noise model specified, as shown in the example below.

 .. tab:: Python
@@ -33,14 +30,70 @@

 .. code:: bash

+   # nvidia target
    nvq++ --target nvidia program.cpp [...] -o program.x
    ./program.x
    { 00:15 01:92 10:81 11:812 }
+   # tensornet target
+   nvq++ --target tensornet program.cpp [...] -o program.x
+   ./program.x
+   { 00:10 01:108 10:73 11:809 }
+   # tensornet-mps target
+   nvq++ --target tensornet-mps program.cpp [...] -o program.x
+   ./program.x
+   { 00:5 01:86 10:102 11:807 }
+
-In the case of bit-string measurement sampling as in the above example, each measurement 'shot' is executed as a trajectory, whereby Kraus operators specified in the noise model are sampled.
+In the case of bit-string measurement sampling, as in the above example, each measurement 'shot' is executed as a trajectory,
+whereby the Kraus operators specified in the noise model are sampled.
+
+Unitary Mixture vs. General Noise Channel
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Quantum noise channels can be classified into two categories:
+
+(1) Unitary mixture
+
+The noise channel is defined by a set of unitary matrices together with a list of probabilities associated with those matrices.
+The depolarizing channel is an example of a unitary mixture, whereby the `I` (no noise), `X`, `Y`, or `Z` unitaries may be applied to the
+quantum state with pre-defined probabilities.
+
-For observable expectation value estimation, the statistical error scales asymptotically as :math:`1/\sqrt{N_{trajectories}}`, where :math:`N_{trajectories}` is the number of trajectories.
+(2) General noise channel
+
+The channel is defined by a set of non-unitary Kraus matrices satisfying the completely positive and trace preserving (CPTP) condition.
+An example of this type of channel is the amplitude damping noise channel.
+
+In the trajectory simulation method, simulating unitary mixture noise channels is more efficient than
+simulating general noise channels, since trajectory sampling for the latter requires probability calculations based
+on the current quantum state.
+
+.. note::
+  The CUDA-Q noise channel utility automatically detects whether a list of Kraus matrices can be converted to
+  the unitary mixture representation for more efficient simulation.
+
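The two channel categories above can be made concrete with a small standalone check. The sketch below uses plain NumPy with hypothetical helper names; it is not the CUDA-Q detection utility itself, only an illustration of the conditions it tests:

```python
import numpy as np

def is_cptp(kraus_ops, tol=1e-9):
    """A valid channel's Kraus matrices satisfy sum_k K_k^dag K_k = I."""
    dim = kraus_ops[0].shape[0]
    total = sum(K.conj().T @ K for K in kraus_ops)
    return np.allclose(total, np.eye(dim), atol=tol)

def is_unitary_mixture(kraus_ops, tol=1e-9):
    """Each K_k = sqrt(p_k) U_k with U_k unitary, i.e. K_k^dag K_k is proportional to I."""
    for K in kraus_ops:
        prod = K.conj().T @ K
        p = prod[0, 0].real
        if not np.allclose(prod, p * np.eye(K.shape[0]), atol=tol):
            return False
    return True

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)

# Bit-flip channel: a unitary mixture of I (prob 1-p) and X (prob p).
p = 0.01
bit_flip = [np.sqrt(1 - p) * I2, np.sqrt(p) * X]

# Amplitude damping: general (non-unitary) Kraus operators.
gamma = 0.1
amp_damp = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
            np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)]

print(is_cptp(bit_flip), is_unitary_mixture(bit_flip))  # True True
print(is_cptp(amp_damp), is_unitary_mixture(amp_damp))  # True False
```

Both channels are CPTP, but only the bit-flip channel is recognized as a unitary mixture, matching the classification in the table below.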
+.. list-table:: **Noise Channel Support**
+  :widths: 20 30 50
+
+  * - Backend
+    - Unitary Mixture
+    - General Channel
+  * - :code:`nvidia`
+    - YES
+    - YES
+  * - :code:`tensornet`
+    - YES
+    - NO
+  * - :code:`tensornet-mps`
+    - YES
+    - YES (number of qubits > 1)
+
+
+Trajectory Expectation Value Calculation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In the trajectory simulation method, the statistical error of observable expectation value estimation scales asymptotically
+as :math:`1/\sqrt{N_{trajectories}}`, where :math:`N_{trajectories}` is the number of trajectories.
 Hence, depending on the required level of accuracy, the number of trajectories can be specified accordingly.

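A toy Monte Carlo model illustrates this scaling. The sketch below is NumPy only, using a hand-rolled single-qubit bit-flip model rather than a CUDA-Q simulation: a noisy X gate with flip probability `p` has exact expectation value ⟨Z⟩ = -(1 - 2p), and the spread of the trajectory estimator shrinks roughly as 1/sqrt(N):

```python
import numpy as np

# Single qubit prepared in |1> by a noisy X gate with bit-flip probability p.
# Each trajectory samples the Kraus outcome (flip or no flip), then measures Z.
p = 0.1
exact = -(1.0 - 2.0 * p)

rng = np.random.default_rng(0)

def estimate_z(n_trajectories: int) -> float:
    flips = rng.random(n_trajectories) < p  # sampled noise outcome per trajectory
    z_values = np.where(flips, 1.0, -1.0)   # an extra flip returns the qubit to |0>, so Z = +1
    return z_values.mean()

# Repeating the estimate many times shows the estimator's standard deviation
# at N = 10000 is roughly 10x smaller than at N = 100 (i.e. ~1/sqrt(N)).
spread = {n: np.std([estimate_z(n) for _ in range(200)]) for n in (100, 10_000)}
print(f"exact <Z> = {exact}")
print(f"std at N=100:   {spread[100]:.4f}")
print(f"std at N=10000: {spread[10_000]:.4f}")
```

This mirrors the behavior shown in the backend outputs below, where the 8192-trajectory estimates cluster more tightly around the true value than the 1024-trajectory ones.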
.. tab:: Python
@@ -63,24 +116,54 @@ Hence, depending on the required level of accuracy, the number of trajectories c

 .. code:: bash

+   # nvidia target
    nvq++ --target nvidia program.cpp [...] -o program.x
    ./program.x
    Noisy <Z> with 1024 trajectories = -0.810547
    Noisy <Z> with 8192 trajectories = -0.800049

+   # tensornet target
+   nvq++ --target tensornet program.cpp [...] -o program.x
+   ./program.x
+   Noisy <Z> with 1024 trajectories = -0.777344
+   Noisy <Z> with 8192 trajectories = -0.800537
+
+   # tensornet-mps target
+   nvq++ --target tensornet-mps program.cpp [...] -o program.x
+   ./program.x
+   Noisy <Z> with 1024 trajectories = -0.828125
+   Noisy <Z> with 8192 trajectories = -0.801758
+
+In the above example, as we increase the number of trajectories,
+the result of CUDA-Q `observe` approaches the true value.
+
+.. note::
+  With trajectory noisy simulation, the result of CUDA-Q `observe` is inherently stochastic.
+  For a small number of qubits, the exact expectation value can be computed with the :ref:`density matrix <density-matrix-cpu-backend>` simulator.
+
+Batched Trajectory Simulation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+On the :code:`nvidia` target, when simulating many trajectories with small
+state vectors, the simulation is batched for optimal performance.
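The benefit of batching can be seen in a toy NumPy sketch (illustrative only; this is not how the `nvidia` backend is implemented): applying the same gate to a batch of trajectory state vectors is one matrix-matrix product instead of many small matrix-vector products, which maps far better onto GPU hardware:

```python
import numpy as np

rng = np.random.default_rng(1)

n_qubits = 4
dim = 2 ** n_qubits
batch = 256  # number of trajectories simulated together

# One normalized random state vector per trajectory, as a (batch, dim) matrix.
states = rng.standard_normal((batch, dim)) + 1j * rng.standard_normal((batch, dim))
states /= np.linalg.norm(states, axis=1, keepdims=True)

# A random unitary on the full register (QR decomposition of a random matrix).
q, _ = np.linalg.qr(rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim)))

# Sequential: one matrix-vector product per trajectory.
sequential = np.stack([q @ s for s in states])

# Batched: a single matrix-matrix product covers every trajectory at once.
batched = states @ q.T

print(np.allclose(sequential, batched))  # True
```

Both paths produce identical states; the batched form simply exposes much more parallelism per kernel launch, which is why the backend batches trajectories when the state vectors are small.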
-The following environment variable options are applicable to the :code:`nvidia` target for trajectory noisy simulation. Any environment variables must be set
-prior to setting the target.
+.. note::
+
+  Batched trajectory simulation is only available in the single-GPU execution mode of the :code:`nvidia` target.
+
+  If batched trajectory simulation is not activated, e.g., due to the problem size, the number of trajectories,
+  or the nature of the circuit (dynamic circuits with mid-circuit measurements and conditional branching),
+  the required number of trajectories will be executed sequentially.
+
+The following environment variable options are applicable to the :code:`nvidia` target for batched trajectory noisy simulation.
+Any environment variables must be set prior to setting the target or running `import cudaq`.

 .. list-table:: **Additional environment variable options for trajectory simulation**
   :widths: 20 30 50

   * - Option
     - Value
     - Description
-  * - ``CUDAQ_OBSERVE_NUM_TRAJECTORIES``
-    - positive integer
-    - The default number of trajectories for observe simulation if none was provided in the `observe` call. The default value is 1000.
   * - ``CUDAQ_BATCH_SIZE``
     - positive integer or `NONE`
     - The number of state vectors in the batched mode. If `NONE`, the batch size will be calculated based on the available device memory. Default is `NONE`.
@@ -95,11 +178,45 @@
     - The minimum number of trajectories for batching. If the number of trajectories is less than this value, batched trajectory simulation will be disabled. Default value is 4.

 .. note::
-
-  Batched trajectory simulation is only available on the single-GPU execution mode of the :code:`nvidia` target.
-
-  If batched trajectory simulation is not activated, e.g., due to problem size, number of trajectories, or the nature of the circuit (dynamic circuits with mid-circuit measurements and conditional branching), the required number of trajectories will be executed sequentially.
+  The default batched trajectory simulation parameters have been chosen for optimal performance.
+
+In the example below, we demonstrate the use of these parameters to control trajectory batching.
+
+.. tab:: Python
+
+   .. literalinclude:: ../../../snippets/python/using/backends/trajectory_batching.py
+      :language: python
+      :start-after: [Begin Docs]
+
+   .. code:: bash
+
+      # Default batching parameter
+      python3 program.py
+      Simulation elapsed time: 45.75657844543457 ms
+
+      # Disable batching by setting batch size to 1
+      CUDAQ_BATCH_SIZE=1 python3 program.py
+      Simulation elapsed time: 716.090202331543 ms
+
+.. tab:: C++
+
+   .. literalinclude:: ../../../snippets/cpp/using/backends/trajectory_batching.cpp
+      :language: cpp
+      :start-after: [Begin Documentation]
+
+   .. code:: bash
+
+      nvq++ --target nvidia program.cpp [...] -o program.x
+      # Default batching parameter
+      ./program.x
+      Simulation elapsed time: 45.47ms
+      # Disable batching by setting batch size to 1
+      CUDAQ_BATCH_SIZE=1 ./program.x
+      Simulation elapsed time: 558.66ms
+
+.. note::
+
+  The :code:`CUDAQ_LOG_LEVEL` :doc:`environment variable <../../basics/troubleshooting>` can be used to
+  view detailed logs of batched trajectory simulation, e.g., the batch size.


 Density Matrix

docs/sphinx/using/backends/sims/svsims.rst

+10-4
@@ -96,8 +96,8 @@ To execute a program on the :code:`nvidia` backend, use the following commands:


 In the single-GPU mode, the :code:`nvidia` backend provides the following
-environment variable options. Any environment variables must be set prior to
-setting the target. It is worth drawing attention to gate fusion, a powerful tool for improving simulation performance which is discussed in greater detail `here <https://nvidia.github.io/cuda-quantum/latest/examples/python/performance_optimizations.html>`__.
+environment variable options. Any environment variables must be set prior to setting the target or running `import cudaq`.
+It is worth drawing attention to gate fusion, a powerful tool for improving simulation performance, which is discussed in greater detail `here <https://nvidia.github.io/cuda-quantum/latest/examples/python/performance_optimizations.html>`__.

 .. list-table:: **Environment variable options supported in single-GPU mode**
   :widths: 20 30 50
@@ -121,6 +121,7 @@ setting the target. It is worth drawing attention to gate fusion, a powerful too
     - positive integer, or `NONE`
     - GPU memory (in GB) allowed for on-device state-vector allocation. As the state-vector size exceeds this limit, host memory will be utilized for migration. `NONE` means unlimited (up to physical memory constraints). This is the default.

+
 .. deprecated:: 0.8
     The :code:`nvidia-fp64` target, which is equivalent to setting the `fp64` option on the :code:`nvidia` target,
     is deprecated and will be removed in a future release.
@@ -212,8 +213,8 @@ See the `Divisive Clustering <https://nvidia.github.io/cuda-quantum/latest/appli

 In addition to those environment variable options supported in the single-GPU mode,
 the :code:`nvidia` backend provides the following environment variable options particularly for
-the multi-node multi-GPU configuration. Any environment variables must be set
-prior to setting the target.
+the multi-node multi-GPU configuration. Any environment variables must be set prior to setting the target or running `import cudaq`.
+

 .. list-table:: **Additional environment variable options for multi-node multi-GPU mode**
   :widths: 20 30 50
@@ -276,3 +277,8 @@ environment variable to another integer value as shown below.

    nvq++ --target nvidia --target-option mgpu,fp64 program.cpp [...] -o program.x
    CUDAQ_MGPU_FUSE=5 mpiexec -np 2 ./program.x
+
+.. note::
+
+  On multi-node systems without `MNNVL` support, the `nvidia` target in `mgpu` mode may fail to allocate memory.
+  Users can disable `MNNVL` fabric-based memory sharing by setting the environment variable `UBACKEND_USE_FABRIC_HANDLE=0`.
