Conversation

@mtennenhaus

What?

Add storage read/write support to the sequential KV bench (matrix indices ≥ world_size map to storage), with per-TP files and NIXL FILE transfers for rank→storage and storage→rank I/O.

Why?

  • Benchmark realistic tiered I/O (KV offload/load) alongside rank↔rank transfers.
  • Measure storage bandwidth/latency and interaction with GPU/CPU paths.
  • Ensure deterministic, isolated storage regions per TP for reproducible results.

How?

  • The storage endpoint is used as the base directory.
  • The CLI requires a base storage path; each TP uses base/tp_/obj_<storage_idx>.bin.
  • TP matrix rows/columns whose index is greater than or equal to the world size are storage endpoints (files for read/write); see the mapping sketch after this list.
    Example with world size 1:
    (0 0)
    (100m 0)
    Row 1 = storage file 0: 100 MB will be read from the file (created in the init phase) into rank 0 via the selected backend.
  • Rank 0 prep: delete and recreate the TP directory, and pre-create any files needed for reads (see the second sketch after this list).
  • Transfers:
    • Rank↔rank transfers use the existing UCX backend.
    • Storage transfers use POSIX or GDS.
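
To make the index-to-endpoint mapping concrete, here is a minimal Python sketch, assuming hypothetical names (classify_endpoint, base_path, tp_idx) and an assumed per-TP directory name of the form tp_<tp_idx>; none of these are confirmed identifiers from the benchmark code.

```python
# Illustrative sketch only: classify_endpoint, base_path, and tp_idx are
# hypothetical names, not the benchmark's actual identifiers.
from pathlib import Path

def classify_endpoint(idx: int, world_size: int, base_path: Path, tp_idx: int):
    """Resolve a TP-matrix row/column index to a rank or a storage file.

    Indices < world_size address ranks; indices >= world_size address
    per-TP storage files (obj_<storage_idx>.bin under the TP directory).
    """
    if idx < world_size:
        return ("rank", idx)
    storage_idx = idx - world_size
    # Assumed per-TP layout: <base>/tp_<tp_idx>/obj_<storage_idx>.bin
    return ("storage", base_path / f"tp_{tp_idx}" / f"obj_{storage_idx}.bin")

# With world_size == 1 and the matrix [[0, 0], [100M, 0]], index 1 resolves
# to storage file 0, so 100 MB is read from that file into rank 0.
```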
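
The rank-0 preparation step can be sketched with plain file operations as well; prepare_storage and read_sizes are hypothetical names, and zero-filling is an assumption about how the files are populated.

```python
# Hedged sketch of the rank-0 prep step; prepare_storage and read_sizes are
# hypothetical names, and zero-filled contents are an assumption.
import shutil
from pathlib import Path

def prepare_storage(tp_dir: Path, read_sizes: dict[int, int]) -> None:
    """Delete and recreate the per-TP directory, pre-creating every file
    that will be read so storage-to-rank transfers hit real data."""
    if tp_dir.exists():
        shutil.rmtree(tp_dir)  # wipe stale state for a deterministic run
    tp_dir.mkdir(parents=True)
    for storage_idx, nbytes in read_sizes.items():
        path = tp_dir / f"obj_{storage_idx}.bin"
        path.write_bytes(b"\x00" * nbytes)  # file must exist before it is read
```

Recreating the directory from scratch is what provides the deterministic, isolated per-TP storage regions called out under Why.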

@copy-pr-bot

copy-pr-bot bot commented Nov 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

👋 Hi mtennenhaus! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution and then trigger the CI to test your changes.

🚀

@brminich
Contributor

/ok to test 944f1e6

@brminich
Contributor

/build
