Implement xrt-smi query to report preemption counters #368

Open
wants to merge 1 commit into
base: main

NishadSaraf
Member

Implement xrt-smi query to report preemption counters.

image

image

P.S. The "Subcommand not found" error comes from recent changes in XRT and is unrelated to this change.

@NishadSaraf NishadSaraf self-assigned this Jan 28, 2025
Contributor

@mamin506 mamin506 left a comment

Some minor changes are needed. I have not reviewed the SHIM changes.

src/include/uapi/drm_local/amdxdna_accel.h (three outdated, resolved review threads)
pci_dev_impl.ioctl(DRM_IOCTL_AMDXDNA_GET_INFO, &query_telemetry);

if (telemetry.major != NPU_TELEMETRY_VERSION_MAJOR || telemetry.minor != NPU_TELEMETRY_VERSION_MINOR) {
    memset(&telemetry, 0, sizeof(telemetry));
Collaborator

Why do you need to clear the buffer?

Member Author

This was done to align with the Windows driver implementation for Phoenix devices. Since Phoenix has no preemption support, the expectation is that querying the preemption counters returns zeros instead of throwing an exception.

Collaborator

You want to clear the buffer, send it down to the driver, and return to xrt-smi whatever the driver returns. On PHX, since there is no preemption support, the driver should have already put all zeros in the buffer, no?

Contributor

The XDNA driver does put all zeros in the buffer:

  1. buff = dma_alloc_noncoherent()
  2. memset(buff, 0, aligned_buffer_size)
  3. query telemetry data from firmware
  4. copy_to_user(args->buffer, buff, args->buffer_size)
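
The four steps above can be modeled in userspace as a rough sketch. This is not the driver code: `malloc` and `memcpy` stand in for `dma_alloc_noncoherent()` and `copy_to_user()`, and `get_telemetry`, `fw_query_fn`, and `fake_fw_partial` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the firmware query: writes into buf and
 * returns the number of bytes it filled. */
typedef size_t (*fw_query_fn)(uint8_t *buf, size_t size);

/* Userspace model of the four steps. Returns 0 on success, -1 on error. */
static int get_telemetry(uint8_t *user_buf, size_t user_size,
                         size_t aligned_size, fw_query_fn query_fw)
{
    assert(query_fw != NULL);
    if (user_size > aligned_size)
        return -1;

    uint8_t *buff = malloc(aligned_size);   /* 1. allocate (models dma_alloc_noncoherent) */
    if (buff == NULL)
        return -1;

    memset(buff, 0, aligned_size);          /* 2. zero the whole buffer */
    query_fw(buff, aligned_size);           /* 3. firmware overwrites what it reports */
    memcpy(user_buf, buff, user_size);      /* 4. copy out (models copy_to_user) */

    free(buff);
    return 0;
}

/* Hypothetical firmware that only reports the first four bytes. */
static size_t fake_fw_partial(uint8_t *buf, size_t size)
{
    size_t n = size < 4 ? size : 4;
    for (size_t i = 0; i < n; i++)
        buf[i] = (uint8_t)(i + 1);
    return n;
}
```

In this model, any byte the firmware does not write reaches the caller as zero. If the firmware instead fills the buffer itself, the zeroing in step 2 is overwritten and no longer determines what the caller sees.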

Member Author

As @mamin506 mentioned, the driver does initialize the buffer to all 0s. However, the firmware memsets it to all 0xFF values by default.
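
A minimal sketch of why the SHIM-side check still matters under that behavior: if the firmware leaves the buffer filled with 0xFF, the version fields read back as all ones, fail the comparison, and the structure is zeroed before it reaches xrt-smi. The struct layout, the constant values, and the names `telemetry_sketch` and `sanitize_telemetry` are assumptions for illustration, not the definitions in `amdxdna_accel.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical version constants; the real values are not shown in this PR. */
#define SKETCH_TELEMETRY_VERSION_MAJOR 1u
#define SKETCH_TELEMETRY_VERSION_MINOR 0u

/* Hypothetical layout, assumed only for this example. */
struct telemetry_sketch {
    uint32_t major;
    uint32_t minor;
    uint64_t preemption_counters[4];
};

/* Zero the structure when the reported version does not match.
 * Returns 1 if the data was kept, 0 if it was cleared. */
static int sanitize_telemetry(struct telemetry_sketch *t)
{
    assert(t != NULL);
    if (t->major != SKETCH_TELEMETRY_VERSION_MAJOR ||
        t->minor != SKETCH_TELEMETRY_VERSION_MINOR) {
        memset(t, 0, sizeof(*t));
        return 0;
    }
    return 1;
}
```

With a 0xFF-filled buffer, `major` reads as 0xFFFFFFFF, so the check fails and the caller gets zeros rather than garbage counters or an exception.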

Contributor

@mamin506 mamin506 left a comment

The driver change looks good.

Implement xrt-smi query to report preemption counters.

Signed-off-by: Nishad Saraf <[email protected]>
3 participants