at `__init__`. If set externally, it should be modified to exclude `SIGSEGV` from the list.
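For illustration, assuming the variable in question is UCX's `UCX_ERROR_SIGNALS` with its usual default list (an assumption here, as the start of this note is truncated), excluding `SIGSEGV` would look like:

```
# hypothetical: UCX's default signal list minus SIGSEGV
export UCX_ERROR_SIGNALS="SIGILL,SIGBUS,SIGFPE"
```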
## CUDA-aware MPI
### Memory pool
Using CUDA-aware MPI on multi-GPU nodes with recent CUDA.jl may trigger (see [here](https://github.com/JuliaGPU/CUDA.jl/issues/1053#issue-946826096))
```
The call to cuIpcGetMemHandle failed. This means the GPU RDMA protocol
cannot be used.
cuIpcGetMemHandle return value: 1
```
in the MPI layer, or fail with a segmentation fault (see [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060)).

This is due to the MPI implementation using legacy `cuIpc*` APIs, which are incompatible with the stream-ordered allocator that is now the default in CUDA.jl; see [UCX issue #7110](https://github.com/openucx/ucx/issues/7110).
To circumvent this, ensure that the CUDA memory pool is set to `none`:
```
export JULIA_CUDA_MEMORY_POOL=none
```
_More about CUDA.jl [memory environment variables](https://juliagpu.gitlab.io/CUDA.jl/usage/memory/#Environment-variables)._
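For a single run, the variable can also be set on the launch command line, e.g. (a sketch; `my_mpi_script.jl` is a placeholder):

```
JULIA_CUDA_MEMORY_POOL=none mpiexec -n 2 julia my_mpi_script.jl
```

The variable needs to be in the environment before CUDA.jl initializes, so setting it in the launching shell is the safest option.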
### Hints to ensure CUDA-aware MPI is functional
Make sure to:
- Have the MPI and CUDA installations that were used to build the CUDA-aware MPI on your path (or loaded as modules)
- Export the following environment variables:
  ```
  export JULIA_CUDA_MEMORY_POOL=none
  export JULIA_MPI_BINARY=system
  export JULIA_CUDA_USE_BINARYBUILDER=false
  ```
- Add the CUDA and MPI packages in Julia. Build MPI.jl in verbose mode to check whether the correct versions are built and used:
  ```
  julia -e 'using Pkg; pkg"add CUDA"; pkg"add MPI"; Pkg.build("MPI"; verbose=true)'
  ```
- Then, in Julia, after loading the MPI and CUDA modules, you can check (see the snippet after this list):
  - the CUDA version: `CUDA.versioninfo()`
  - whether MPI has CUDA support: `MPI.has_cuda()`
  - whether you are using the correct MPI implementation: `MPI.identify_implementation()`
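
These checks can be combined into a small sanity-check script, e.g. (a minimal sketch; the file name is a placeholder):

```
# check_setup.jl -- verify the CUDA / CUDA-aware MPI setup
using MPI
using CUDA

MPI.Init()

CUDA.versioninfo()                         # CUDA version and toolkit in use
println("MPI has CUDA: ", MPI.has_cuda())  # true for a CUDA-aware MPI build
println("MPI implementation: ", MPI.identify_implementation())
```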
After that, it may be preferable to run the Julia MPI script (as suggested [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/11)) by launching it from a shell script (as suggested [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/4)).
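
Such a wrapper could look like the following sketch (module names, process count, and script name are placeholders to adapt to your system):

```
#!/bin/bash
# runme.sh -- launch a Julia MPI script with CUDA-aware MPI
module load cuda openmpi   # hypothetical module names

export JULIA_CUDA_MEMORY_POOL=none
export JULIA_MPI_BINARY=system
export JULIA_CUDA_USE_BINARYBUILDER=false

mpirun -np 4 julia --project my_mpi_script.jl
```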
## Microsoft MPI
### Custom operators on 32-bit Windows
It is not possible to use [custom operators with 32-bit Microsoft MPI](https://github.com/JuliaParallel/MPI.jl/issues/246), as it uses the `stdcall` calling convention, which is not supported by [Julia's C-compatible function pointers](https://docs.julialang.org/en/v1/manual/calling-c-and-fortran-code/index.html#Creating-C-Compatible-Julia-Function-Pointers-1).
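
For reference, the affected pattern is a reduction with a user-defined Julia function, which MPI.jl must wrap as a C-callable function pointer (a minimal sketch; this works on 64-bit systems):

```
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

# A user-defined reduction function: MPI.jl wraps it as a custom MPI
# operator via a C-compatible function pointer, which is what fails on
# 32-bit Microsoft MPI due to the stdcall calling convention.
mysum(a, b) = a + b
result = MPI.Allreduce(Float64(rank), mysum, comm)

rank == 0 && println("sum of ranks = ", result)
```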