cuda.core: expose mem-range attributes on ManagedBuffer (last_prefetch_location and read-side getters) #2109

@rparolin


Background

PR #1775 lands ManagedBuffer with a property-style advice API (buf.read_mostly = ..., buf.preferred_location = ..., buf.accessed_by.add(...)) for the write side of managed-memory advice. The corresponding read side is uneven:

  • buf.preferred_location — exists, returns Device | Host | None.
  • buf.read_mostly — exists as a getter (queries CU_MEM_RANGE_ATTRIBUTE_READ_MOSTLY).
  • buf.accessed_by — exists as AccessedBySetProxy.
  • last_prefetch_location — missing.

Because the last-prefetch query isn't exposed, the test suite reaches into cuda.bindings.driver directly:

last = _get_int_attr(buf, driver.CUmem_range_attribute.CU_MEM_RANGE_ATTRIBUTE_LAST_PREFETCH_LOCATION)

As @leofang noted in a comment on PR #1775:

The fact that these are needed at test time rings a bell. cuda.core tries hard to not leak the abstraction. This highlights a problem that we do not expose enough mem-range attributes for ManagedBuffer.

PR #1775 currently works around this with a private _last_prefetch_location(buf) helper in tests/memory/test_managed_ops.py carrying a TODO that points at this issue.

Proposal

Add ManagedBuffer.last_prefetch_location mirroring preferred_location's shape:

@property
def last_prefetch_location(self) -> Device | Host | None:
    """Location of the most recent prefetch on this range, or ``None``
    if no prefetch has been issued.
    """

Returns:

  • Device(i) for i >= 0
  • Host() for the legacy -1 ordinal
  • None for the "no prefetch yet" sentinel
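
The mapping above could be sketched roughly as follows. This is a minimal, GPU-free illustration, not the cuda.core implementation: the Device and Host classes are hypothetical stand-ins for the real cuda.core types, decode_legacy_location is an invented helper name, and the sentinel values (-1 for the CPU, -2 for "invalid / no prefetch") are assumed from the driver's CU_DEVICE_CPU and CU_DEVICE_INVALID constants and should be checked against the driver headers.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed driver sentinels (CU_DEVICE_CPU / CU_DEVICE_INVALID); verify before relying on them.
CU_DEVICE_CPU = -1
CU_DEVICE_INVALID = -2


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for cuda.core's Device."""
    device_id: int


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for cuda.core's Host."""
    numa_id: int | None = None


def decode_legacy_location(raw: int) -> Device | Host | None:
    """Map the raw legacy location ordinal to Device, Host, or None."""
    if raw >= 0:
        return Device(raw)       # non-negative ordinal: a GPU
    if raw == CU_DEVICE_CPU:
        return Host()            # legacy -1 ordinal: the host
    if raw == CU_DEVICE_INVALID:
        return None              # sentinel: no single prefetch location
    raise ValueError(f"unexpected location ordinal: {raw}")
```

Raising on unknown negative ordinals is a design choice of this sketch; the real property might prefer to return None for them instead.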

On CUDA 13, verify whether CU_MEM_RANGE_ATTRIBUTE_LAST_PREFETCH_LOCATION_TYPE / _ID exist; if they do, layer a v2 path the same way preferred_location does so Host(numa_id=N) round-trips. Otherwise document the legacy-attribute caveat consistently with preferred_location.
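
If the _TYPE / _ID attribute pair does turn out to be available, the v2 path might decode it along these lines. This is a sketch under stated assumptions: LocationType is a local enum whose numeric values are assumed to mirror the driver's CUmemLocationType and must be verified, Device and Host are hypothetical stand-ins for the cuda.core types, and decode_typed_location is an invented helper name.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class LocationType(IntEnum):
    """Local mirror of the driver's location-type enum (values are assumptions)."""
    INVALID = 0
    DEVICE = 1
    HOST = 2
    HOST_NUMA = 3


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for cuda.core's Device."""
    device_id: int


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for cuda.core's Host."""
    numa_id: int | None = None


def decode_typed_location(loc_type: int, loc_id: int) -> Device | Host | None:
    """Map a (type, id) attribute pair to Device, Host, or None."""
    kind = LocationType(loc_type)  # raises ValueError on an unknown type
    if kind is LocationType.DEVICE:
        return Device(loc_id)
    if kind is LocationType.HOST:
        return Host()              # id carries no meaning for a plain host location
    if kind is LocationType.HOST_NUMA:
        return Host(numa_id=loc_id)
    return None                    # INVALID: no prefetch location recorded
```

With a typed pair, Host(numa_id=N) survives the round trip, which the single legacy ordinal cannot express.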

Follow-on cleanup

Once this lands, in cuda_core/tests/memory/test_managed_ops.py:

  • Drop the private _last_prefetch_location(buf) helper.
  • Replace last == _HOST_LOCATION_ID / last == device.device_id assertions with buf.last_prefetch_location == Host() / ... == device.
  • Drop the from cuda.core._memory._managed_buffer import _get_int_attr import and most driver.CUmem_range_attribute.* references.

Scope notes

Labels

  • cuda.core: Everything related to the cuda.core module
  • feature: New feature or request
  • triage: Needs the team's attention
