Buffered communication for CUDA-Aware MPI #167
Conversation
A new DistributedMix class is created with the aim of simplifying and unifying all communication calls in both DistributedArray and the operators (further hiding away all implementation details).
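A minimal sketch of the idea, assuming a mixin that centralizes communication calls so arrays and operators no longer branch locally; the class and method names and the dispatch logic below are illustrative assumptions, not the actual pylops-mpi implementation:

```python
from mpi4py import MPI
import numpy as np


class DistributedMix:
    """Mixin that hides whether a communication call is buffered or pickle-based."""

    base_comm: MPI.Comm  # assumed to be provided by the inheriting class

    def _allreduce(self, send_buf, recv_buf=None, op=MPI.SUM):
        # Buffered (uppercase) calls need array-like objects exposing the buffer
        # protocol or __cuda_array_interface__, e.g. NumPy or CuPy arrays.
        if isinstance(send_buf, np.ndarray) or hasattr(send_buf, "__cuda_array_interface__"):
            if recv_buf is None:
                # np.empty_like dispatches to CuPy via __array_function__,
                # so this also allocates on the GPU for device arrays.
                recv_buf = np.empty_like(send_buf)
            self.base_comm.Allreduce(send_buf, recv_buf, op=op)
            return recv_buf
        # Fallback: pickle-based (lowercase) call for generic Python objects.
        return self.base_comm.allreduce(send_buf, op=op)


class SimplifiedDistributedArray(DistributedMix):
    # Simplified stand-in for DistributedArray, only for illustration.
    def __init__(self, local_array, base_comm=MPI.COMM_WORLD):
        self.local_array = local_array
        self.base_comm = base_comm

    def sum_all(self):
        # Arrays and operators call the same helper instead of branching locally.
        return self._allreduce(self.local_array)
```

The point of such a mixin is that the NumPy-vs-CuPy (and buffered-vs-pickled) decision lives in one place, which is what the discussion below about the growing number of branches refers to.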
@tharittk great start! Regarding the setup, I completely agree with the need to change the installation process for CUDA-Aware MPI. I have personally so far mostly relied on conda to install
Regarding the code, as I briefly mentioned offline, whilst I think this is the right way to go:
I am starting to feel that the number of branches in the code is growing and it is about time to put it all in one place... What I am mostly concerned about is that this kind of branching will not only be present in
@astroC86 we have also talked a bit about this in the context of your
Rebuilding the mpi4py package is required to run pylops-mpi with CUDA-Aware MPI.
In my case, on NCSA Delta, I create a new conda environment and do
module load openmpi/5.0.5+cuda
then
MPICC=/path/to/mpicc pip install --no-cache-dir --force-reinstall mpi4py
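After the rebuild, a quick smoke test along these lines can confirm that the new mpi4py really passes GPU buffers directly; this snippet is only an illustrative check I am sketching here, not part of the PR or its test suite:

```python
# cuda_aware_check.py -- run with e.g.: mpirun -n 2 python cuda_aware_check.py
import cupy as cp
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

sendbuf = cp.arange(10, dtype=cp.float64) * (rank + 1)
recvbuf = cp.empty_like(sendbuf)
cp.cuda.get_current_stream().synchronize()  # make sure device data is ready

# Uppercase (buffered) call: with CUDA-Aware MPI this reads GPU memory directly;
# without it, this typically fails with an MPI error or a crash.
comm.Allreduce(sendbuf, recvbuf, op=MPI.SUM)

if rank == 0:
    print("CUDA-aware Allreduce OK:", recvbuf)
```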
And to run the test (assuming you're already on a compute node):
Note: allgather has not been implemented with the buffered version.
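For context, the difference between the two variants in mpi4py looks roughly like the sketch below (an illustration of the general API, not the PR's code): the lowercase allgather pickles objects through the host, while the uppercase Allgather works on a pre-allocated receive buffer and, with CUDA-Aware MPI, can stay on the GPU.

```python
import cupy as cp
from mpi4py import MPI

comm = MPI.COMM_WORLD
nproc = comm.Get_size()
local = cp.full(4, comm.Get_rank(), dtype=cp.float64)

# Pickle-based variant (what is still used here): returns a list of arrays
# and involves host-side serialization even for device data.
gathered_list = comm.allgather(local)

# Buffered variant: requires a receive buffer of size nproc * local.size;
# with CUDA-Aware MPI it avoids the host round-trip.
recvbuf = cp.empty(nproc * local.size, dtype=local.dtype)
comm.Allgather(local, recvbuf)
```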