Commit 4cd7118

Fix AMDGPU synchronize in tests & update doc (#628)
* Add CUDA-aware MPI all-to-all tests and fix typo.
* Update ROCQueue init to latest syntax
1 parent 6ef9d6b commit 4cd7118

2 files changed, +7 -2 lines changed

docs/src/usage.md

+6 -1
@@ -74,7 +74,12 @@ If your MPI implementation has been compiled with CUDA support, then `CUDA.CuArr
 [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl) package) can be passed directly as
 send and receive buffers for point-to-point and collective operations (they may also work with one-sided operations, but these are not often supported).

-If using Open MPI, the status of CUDA support can be checked via the
+Successfully running the [alltoall_test_cuda.jl](https://gist.github.com/luraess/0063e90cb08eb2208b7fe204bbd90ed2)
+should confirm your MPI implementation to have the CUDA support enabled. Moreover, successfully running the
+[alltoall_test_cuda_multigpu.jl](https://gist.github.com/luraess/ed93cc09ba04fe16f63b4219c1811566) should confirm
+your CUDA-aware MPI implementation to use multiple Nvidia GPUs (one GPU per rank).
+
+If using OpenMPI, the status of CUDA support can be checked via the
 [`MPI.has_cuda()`](@ref) function.

 ## ROCm-aware MPI support
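
For readers who want to try the `MPI.has_cuda()` check mentioned in the hunk above, a minimal sketch might look like the following. It assumes MPI.jl is installed and configured against the MPI library under test; the variable name `cuda_aware` is illustrative and not part of this commit. A positive result only reflects what the MPI library reports, so running the linked all-to-all gists remains the more direct confirmation.

```julia
using MPI

MPI.Init()

# Ask MPI.jl whether the underlying MPI library reports CUDA support.
# The doc text in this commit describes this check for Open MPI; other
# implementations may not report their CUDA support the same way.
cuda_aware = MPI.has_cuda()

# Print once, from rank 0, to avoid one line per rank.
if MPI.Comm_rank(MPI.COMM_WORLD) == 0
    println("CUDA-aware MPI reported: ", cuda_aware)
end
```

Launched under the usual MPI launcher (for example with `mpiexec`), each rank performs the same query; only rank 0 prints.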

test/common.jl

+1 -1
@@ -10,7 +10,7 @@ elseif get(ENV,"JULIA_MPI_TEST_ARRAYTYPE","") == "ROCArray"
     ArrayType = AMDGPU.ROCArray
     function synchronize()
         # TODO: AMDGPU synchronization story is complicated. HSA does not provide a consistent notion of global queues. We need a mechanism for all GPUArrays.jl provided kernels to be synchronized.
-        queue = AMDGPU.get_default_queue()
+        queue = AMDGPU.ROCQueue()
         barrier = AMDGPU.barrier_and!(queue, AMDGPU.active_kernels(queue))
         AMDGPU.HIP.hipDeviceSynchronize() # Sync all HIP kernels e.g. BLAS. N.B. this is blocking Julia progress
         wait(barrier)
