
Conversation


@ax3l ax3l commented Jul 23, 2025

Add first-class support for zero-copy data exchange with ROCm and SYCL GPUs via DLPack interfaces.

Specs:

Note: we might want to implement a slightly older DLPack version if we do not want to bump up NumPy/CuPy/PyTorch/... to very recent versions. Do we have access to the 2025 Intel Python tools release on Aurora?
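For context, DLPack version negotiation happens through the `max_version` keyword of `__dlpack__`, as specified by the Python Array API standard: newer consumers pass `max_version=(major, minor)` and the producer may fall back to the legacy, unversioned capsule for older consumers. A minimal sketch with a hypothetical producer class (illustrative only, not pyAMReX code):

```python
class FakeProducer:
    """Hypothetical producer; pyAMReX's real classes differ."""

    def __dlpack__(self, *, stream=None, max_version=None, dl_device=None, copy=None):
        # DLPack >= 1.0 consumers pass max_version=(major, minor); legacy
        # consumers pass nothing and expect the unversioned "dltensor" capsule.
        if max_version is None or max_version[0] < 1:
            return "dltensor"            # stand-in for the legacy capsule
        return "dltensor_versioned"      # stand-in for the versioned capsule

a = FakeProducer()
assert a.__dlpack__() == "dltensor"                         # legacy consumer
assert a.__dlpack__(max_version=(1, 0)) == "dltensor_versioned"
```

Targeting an older spec version would mean taking the `max_version is None` branch unconditionally, which is what keeps older NumPy/CuPy/PyTorch releases working.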

Closes #9

Action Items

  • start by vibing while preparing dinner, then manually:
  • review and finish Array4
  • PODVector
  • Vector
  • ArrayOfStructs
  • BaseFab
  • SmallMatrix
  • SYCL: Implement .to_dpnp / .to_dpctl helper functions
  • Update .to_xp functions to use .to_dpnp or .to_dpctl for SYCL GPUs
  • Test on CUDA GPU
  • Test on ROCm GPU
  • Test on SYCL GPU (help wanted)
  • Search docs for needed updates.
  • Fix DLPack stubs: Fix PyCapsule for DLPack sizmailov/pybind11-stubgen#258, or bind manually in pyAMReX


roelof-groenewald commented Jul 25, 2025

I performed some testing of the new functionality on Perlmutter. After the latest commit, the following appears to work as intended:

def test_mfab_cuda_cupy(mfab_device):
    import cupy as cp

    # AMReX -> cupy
    for mfi in mfab_device:   
        marr_cupy_from_dlpack = cp.from_dlpack(mfab_device.array(mfi))
        marr_cupy_from_dlpack[0, 1, 3, 2] = 5

    for mfi in mfab_device:   
        marr_cupy_from_dlpack = cp.from_dlpack(mfab_device.array(mfi))
        print(marr_cupy_from_dlpack[0, 1, 3, 2])

It executes without failure and prints the modified value 5. Inspection of the DLDevice showed that the device was correctly identified as kDLCUDA, and the returned device id was 3, which is consistent with Perlmutter's standard rank-to-GPU mapping when running with a single MPI rank.
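For reference, the device inspection described above goes through the standard `__dlpack_device__` protocol, which returns a `(device_type, device_id)` tuple. A minimal sketch with a mock class (hypothetical, standing in for an `Array4` on CUDA device 3):

```python
# Minimal sketch of the __dlpack_device__ protocol used for the inspection
# above. DLDeviceType values from the DLPack spec: kDLCPU = 1, kDLCUDA = 2.
kDLCUDA = 2

class FakeCudaArray:
    """Hypothetical stand-in for an array living on CUDA device 3."""

    def __dlpack_device__(self):
        return (kDLCUDA, 3)  # (device_type, device_id)

dev_type, dev_id = FakeCudaArray().__dlpack_device__()
assert dev_type == kDLCUDA
assert dev_id == 3
```

Consumers such as `cp.from_dlpack` call `__dlpack_device__` first to decide which allocator and stream semantics apply before requesting the capsule itself.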


ax3l commented Jul 25, 2025

Awesome, then we are nearly there.

Try the dpnp logic for SYCL next?


roelof-groenewald commented Jul 26, 2025

I tested the DLPack functionality on Aurora (SYCL) and it now also produces the expected result. I also modified Array4_to_xp to take the GPU backend into account. We can now successfully access a MultiFab's Array4 from a SYCL device with

for mfi in mfab_device:
    mfab_device.array(mfi).to_dpnp()
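A hypothetical sketch of what backend-aware dispatch in a `to_xp`-style helper could look like (the function and parameter names here are illustrative, not pyAMReX's exact code):

```python
# Hypothetical sketch of backend dispatch for a to_xp-style helper:
# pick the array library that can wrap the device pointer zero-copy.
def to_xp(array4, backend):
    if backend in ("CUDA", "HIP"):
        import cupy as cp
        return cp.from_dlpack(array4)    # zero-copy on CUDA/ROCm devices
    if backend == "SYCL":
        import dpnp
        return dpnp.from_dlpack(array4)  # zero-copy on SYCL devices
    import numpy as np
    return np.from_dlpack(array4)        # CPU fallback
```

The imports are deferred so that, e.g., a CPU-only build never needs cupy or dpnp installed.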

@roelof-groenewald

I compiled WarpX on Aurora using this pyamrex branch. With it I was able to successfully run a multi-GPU simulation that uses fields.py to read MultiFab values 🎉 🚀

Comment on lines 235 to 240
/* TODO: Handle keyword arguments
[[maybe_unused]] py::handle stream,
[[maybe_unused]] std::tuple<int, int> max_version,
[[maybe_unused]] std::tuple<DLDeviceType, int32_t> dl_device,
[[maybe_unused]] bool copy
*/

Just want to flag this since copy=True doesn't yet work in the .to_dpnp() function.
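For context, the Array API standard defines `copy=True` as "always copy" and `copy=None` (the default) as producer's choice; until the `copy` keyword is handled, consumers get the default zero-copy view. A small NumPy-on-CPU sketch of that zero-copy behavior (illustrative only):

```python
import numpy as np

# With the copy keyword unhandled, from_dlpack returns a view that
# shares memory with the source array rather than an independent copy.
x = np.arange(3.0)
y = np.from_dlpack(x)   # zero-copy: y aliases x's buffer
x[0] = 42.0
assert y[0] == 42.0     # the write through x is visible in y: no copy was made
```

An honored `copy=True` would instead hand the consumer an independent buffer, so the assertion above would fail.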


ax3l commented Jul 28, 2025

We need to rebase against development after #455 was merged. I already added the DLDeviceType bindings now and the other PR adds capsule type hints.

Signed-off-by: Axel Huebl <[email protected]>
@ax3l ax3l mentioned this pull request Aug 1, 2025
Labels

  • backend: cuda (Specific to CUDA execution on GPUs)
  • backend: hip (Specific to ROCm execution on GPUs)
  • backend: sycl (Specific to DPC++/SYCL execution on CPUs/GPUs)
Development

Successfully merging this pull request may close these issues:

  • Discussion on mapping between amrex, numpy.ndarray, and torch.tensor data types