@benlwalker
Contributor

What?

Add GTest unit tests for several scenarios with multiple backends

Why?

From code inspection, I had a lot of questions about the exact behavior of many API calls when multiple backends are present. I wrote unit tests to clarify the behavior.

How?

I added some unit tests to the existing framework.

@benlwalker benlwalker requested a review from a team as a code owner November 21, 2025 18:20
@copy-pr-bot

copy-pr-bot bot commented Nov 21, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

👋 Hi benlwalker! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

Verify that multiple backends can be created in a unit test.

Signed-off-by: Ben Walker <[email protected]>

Test the case where two backends support the same memory type and the
case where each backend supports different memory types.

Signed-off-by: Ben Walker <[email protected]>

different backends

This test fails.

This test creates two backends that both support DRAM. It registers one
data buffer with the first backend and a different data buffer with the
second. It then calls prepXferDlist on a descriptor list that covers
both data buffers, and that call fails.
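
The failure can be pictured with a small stand-alone model. This is a hedged sketch, not NIXL code: `Region`, `ToyBackend`, and `prepAllOrNothing` are hypothetical names, and the rule that a prep succeeds only when a single backend covers every descriptor is an assumption chosen to match the failure described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy model only. None of these types are part of the NIXL API.
struct Region {
    std::uintptr_t addr;
    std::size_t len;
};

struct ToyBackend {
    std::string name;
    std::vector<Region> registered;

    // True if d lies entirely inside one region registered with this backend.
    bool covers(const Region &d) const {
        for (const Region &r : registered) {
            if (d.addr >= r.addr && d.addr + d.len <= r.addr + r.len) {
                return true;
            }
        }
        return false;
    }
};

// Assumed rule: prep succeeds only if a single backend covers every descriptor.
// Returns the index of the first such backend, or -1 if there is none.
int prepAllOrNothing(const std::vector<ToyBackend> &backends,
                     const std::vector<Region> &descs) {
    for (std::size_t i = 0; i < backends.size(); ++i) {
        bool all_covered = true;
        for (const Region &d : descs) {
            if (!backends[i].covers(d)) {
                all_covered = false;
                break;
            }
        }
        if (all_covered) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In this model, if backend A holds one buffer and backend B holds another, a list with one descriptor from each buffer yields -1, while a list confined to B's buffer yields index 1.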

This could be fixed by having prepXferDlist attempt to populate
each segment with each backend and allowing for failure at the backend
call without failing the entire operation. However, this just kicks
the can down the road, because now the user can get failures in
makeXferReq depending on which indices of the nixlXferDlistH they
choose.

I suggest fixing this with a different approach. First, we observe that
almost all real users only have one backend type anyway. Second, if some
user really has two separate pools of DRAM that are used by different
backends with NIXL, they could just as easily solve that problem by
creating two agents, each with one backend, rather than creating
multiple backends.

So I propose that, within an agent, all backends that support a given
memory type must support all registered memory of that type. In other
words, do not allow registerMem to register on specific backends.
Instead, registerMem should force the registration on all backends that
support the memory type. The end user then knows that for a given agent,
all backends that can access DRAM, for example, can interoperate via all
registered DRAM.
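
The proposed registration rule can be sketched in a stand-alone toy model. Everything here is hypothetical and for illustration only: `MemType`, `Region`, `ToyBackend`, and `ToyAgent` are not NIXL types, and this `registerMem` is a simplified stand-in, not the real signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Toy model only. None of these types are part of the NIXL API.
enum class MemType { Dram, Vram };

struct Region {
    std::uintptr_t addr;
    std::size_t len;
};

struct ToyBackend {
    std::string name;
    std::set<MemType> supported;
    std::vector<Region> registered;
};

struct ToyAgent {
    std::vector<ToyBackend> backends;

    // Proposed rule: the caller cannot pick a backend. The region is
    // registered with every backend that supports its memory type.
    // Returns how many backends received the registration.
    std::size_t registerMem(MemType type, const Region &region) {
        std::size_t count = 0;
        for (ToyBackend &b : backends) {
            if (b.supported.count(type) != 0) {
                b.registered.push_back(region);
                ++count;
            }
        }
        return count;
    }
};
```

Under this rule every DRAM-capable backend in the model ends up with the same set of DRAM regions, so no descriptor list over registered DRAM can be split across backends.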

Even with the above change, NIXL can still support memory registered
with different backends that don't overlap. The user would just need
to use multiple agents. I think that's far more intuitive anyway.

Signed-off-by: Ben Walker <[email protected]>
@benlwalker benlwalker force-pushed the ben/multi-backend-tests branch from 7257138 to a5a4075 November 21, 2025 20:16
@brminich brminich requested a review from ColinNV November 25, 2025 16:24
std::unique_ptr<char[]> buf2 = std::make_unique<char[]>(buf_size);

// Explicitly set addresses for the two DRAM blobs
uintptr_t addr1 = reinterpret_cast<uintptr_t>(buf1.get());

These and a couple of other variables can be made const.
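
A minimal sketch of what that could look like, assuming the surrounding test only reads these values after initialization. The helper `buffersAreDistinct` and the variable `addr2` are hypothetical; only `buf2`, `buf_size`, and `addr1` appear in the quoted diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper for illustration; the real test body is not shown here.
inline bool buffersAreDistinct(const std::size_t buf_size) {
    // A const unique_ptr cannot be reset or moved from, but get() still works.
    const std::unique_ptr<char[]> buf1 = std::make_unique<char[]>(buf_size);
    const std::unique_ptr<char[]> buf2 = std::make_unique<char[]>(buf_size);

    // The addresses are computed once and never reassigned.
    const std::uintptr_t addr1 = reinterpret_cast<std::uintptr_t>(buf1.get());
    const std::uintptr_t addr2 = reinterpret_cast<std::uintptr_t>(buf2.get());

    return addr1 != addr2;
}
```

Marking the owning pointers `const` only works if the test never moves or resets them, so that part of the suggestion depends on code outside the quoted lines.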

@ColinNV
Contributor

ColinNV commented Nov 27, 2025

Looks clean to me; didn't dive into the logic of the tests.
