
Conversation

@ajcasagrande (Contributor) commented Dec 5, 2025

  • Add total_token_throughput metric
  • Rename prefill_throughput to prefill_throughput_per_user for clarity
# benchmark_duration is converted to seconds
total_token_throughput = (total_isl + total_osl) / benchmark_duration 
# time_to_first_token is converted to seconds, per request
prefill_throughput_per_user = input_sequence_length / time_to_first_token
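
For reference, a tiny runnable sketch of both formulas with made-up numbers (the values are hypothetical, not aiperf output):

# A benchmark that processed 1_000 input and 2_000 output tokens in 10 s:
total_isl, total_osl, benchmark_duration = 1_000, 2_000, 10.0
total_token_throughput = (total_isl + total_osl) / benchmark_duration  # 300.0 tokens/s

# One request with a 512-token prompt and a 0.25 s time-to-first-token:
input_sequence_length, time_to_first_token = 512, 0.25
prefill_throughput_per_user = input_sequence_length / time_to_first_token  # 2048.0 tokens/s/user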

github-actions bot commented Dec 5, 2025

Try out this PR

Quick install:

pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@337283b48215270edab950575919b53026ab70c2

Recommended with virtual environment (using uv):

uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@337283b48215270edab950575919b53026ab70c2

Last updated for commit: 337283b

@ajcasagrande ajcasagrande changed the title from "feat: add total_token_throughput metric" to "feat: add total_token_throughput metric, rename prefill_throughput_per_user" on Dec 5, 2025
@github-actions github-actions bot added the feat label Dec 5, 2025
coderabbitai bot commented Dec 5, 2025

Walkthrough

Renamed PrefillThroughputMetric to PrefillThroughputPerUserMetric with updated semantics and identifiers. Added new TotalTokenThroughputMetric derived metric class computing combined input/output token throughput. Included comprehensive unit tests validating throughput calculations and error handling.

Changes

  • Per-user metric refactoring (src/aiperf/metrics/types/prefill_throughput_per_user.py): Renamed the class from PrefillThroughputMetric to PrefillThroughputPerUserMetric; updated the tag, headers, unit, and flags to reflect per-user semantics; modified docstrings and error messages to align with the per-user calculation context.
  • Total token throughput metric (src/aiperf/metrics/types/total_token_throughput.py): Added the new derived metric class TotalTokenThroughputMetric, computing throughput as (total input + output tokens) / benchmark duration, with a zero-duration guard that raises NoMetricValue and with metadata declarations. A simplified sketch follows this list.
  • Total token throughput tests (tests/unit/metrics/test_total_token_throughput_metric.py): Added the unit test class TestTotalTokenThroughputMetric with parametrized tests validating throughput calculations across input/output/duration combinations and error handling for zero/None durations.
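
To make the new metric concrete, here is a minimal, self-contained sketch of the derivation logic described above. It is not the actual aiperf class (the real implementation derives from the framework's metric base class, declares required_metrics dependencies, and resolves values via get_or_raise/get_converted_or_raise); the dict keys and simplified structure here are assumptions for illustration.

class NoMetricValue(Exception):
    """Stand-in for aiperf.common.exceptions.NoMetricValue."""

class TotalTokenThroughputMetric:
    """Simplified sketch: (total input tokens + total output tokens) / duration."""

    def _derive_value(self, results: dict[str, float]) -> float:
        # Hypothetical keys; the real class pulls these from its
        # required metric dependencies with unit conversion.
        total_isl = results["total_input_sequence_length"]
        total_osl = results["total_output_sequence_length"]
        duration = results["benchmark_duration"]  # seconds

        # Zero/None-duration guard: without it the division below
        # would fail or report a meaningless throughput.
        if not duration:
            raise NoMetricValue("Benchmark duration is zero or missing")
        return (total_isl + total_osl) / duration

# 1_000 input + 2_000 output tokens over 10 s -> 300.0 tokens/s
metric = TotalTokenThroughputMetric()
print(metric._derive_value({
    "total_input_sequence_length": 1_000,
    "total_output_sequence_length": 2_000,
    "benchmark_duration": 10.0,
}))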

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • New metric implementation requires verification of _derive_value logic and metric dependency declarations
  • Ensure zero-duration guard and NoMetricValue exception handling are correctly implemented
  • Validate test coverage comprehensively exercises all calculation paths and edge cases
  • Verify consistency of per-user metric naming changes with framework conventions

Poem

🐰 A metric hops in, measuring tokens with care,
Throughput per user, a fresh calculation rare,
Total tokens flowing, in and out they stream,
Per-second precision fuels our performance dream! 📊✨

Pre-merge checks

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 75.00%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

  • Title Check ✅ Passed: The pull request title accurately summarizes the two main changes: adding a total_token_throughput metric and renaming prefill_throughput to prefill_throughput_per_user.
  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.


coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 669fb62 and ec4f211.

📒 Files selected for processing (3)
  • src/aiperf/metrics/types/prefill_throughput_per_user.py (3 hunks)
  • src/aiperf/metrics/types/total_token_throughput.py (1 hunks)
  • tests/unit/metrics/test_total_token_throughput_metric.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use async/await for all I/O operations; never use time.sleep() or blocking calls
Always use orjson for JSON operations: orjson.loads(s) and orjson.dumps(d)
All functions must have type hints on parameters and return types
Use Python 3.10+ union syntax (|) instead of typing.Union; use match/case for pattern matching; use @dataclass(slots=True) (a toy sketch follows the file list below)

Files:

  • src/aiperf/metrics/types/prefill_throughput_per_user.py
  • tests/unit/metrics/test_total_token_throughput_metric.py
  • src/aiperf/metrics/types/total_token_throughput.py
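
A toy illustration of those conventions (hypothetical class and function, not from the repo):

from dataclasses import dataclass

import orjson

@dataclass(slots=True)
class Payload:
    kind: str
    value: int | None  # 3.10+ union syntax instead of typing.Optional

def describe(raw: bytes) -> str:
    data = Payload(**orjson.loads(raw))  # orjson for JSON parsing
    match data.kind:  # match/case for pattern matching
        case "token":
            return f"token={data.value}"
        case _:
            return "unknown"

print(describe(b'{"kind": "token", "value": 42}'))  # token=42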
**/*test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test files must use pytest with fixtures, helpers, and @pytest.mark.parametrize; import statements at the top; use # fmt: skip for long parameterize blocks

Files:

  • tests/unit/metrics/test_total_token_throughput_metric.py
🧬 Code graph analysis (2)
src/aiperf/metrics/types/prefill_throughput_per_user.py (1)
src/aiperf/common/enums/metric_enums.py (2)
  • MetricOverTimeUnit (338-396)
  • MetricFlags (602-698)
tests/unit/metrics/test_total_token_throughput_metric.py (4)
src/aiperf/common/exceptions.py (1)
  • NoMetricValue (168-169)
src/aiperf/metrics/metric_dicts.py (1)
  • MetricResultsDict (120-140)
src/aiperf/metrics/types/input_sequence_length_metric.py (1)
  • TotalInputSequenceLengthMetric (45-63)
src/aiperf/metrics/types/total_token_throughput.py (1)
  • TotalTokenThroughputMetric (17-57)
🪛 Ruff (0.14.7)
src/aiperf/metrics/types/total_token_throughput.py

35-39: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)
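
For reference, RUF012 asks for annotations of this shape (hypothetical class; the review below treats the hint as a false positive for the established codebase pattern):

from typing import ClassVar

class ExampleMetric:
    # Annotating the class-level constant tells the linter it is
    # intentionally shared across instances, not per-instance state.
    required_metrics: ClassVar[set[str]] = {"benchmark_duration"}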


54-56: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: build (macos-latest, 3.10)
  • GitHub Check: build (macos-latest, 3.12)
  • GitHub Check: build (macos-latest, 3.13)
  • GitHub Check: build (macos-latest, 3.11)
  • GitHub Check: build (ubuntu-latest, 3.12)
  • GitHub Check: build (ubuntu-latest, 3.10)
  • GitHub Check: build (ubuntu-latest, 3.11)
  • GitHub Check: build (ubuntu-latest, 3.13)
  • GitHub Check: integration-tests (ubuntu-latest, 3.13)
  • GitHub Check: integration-tests (ubuntu-latest, 3.11)
  • GitHub Check: integration-tests (ubuntu-latest, 3.12)
  • GitHub Check: integration-tests (ubuntu-latest, 3.10)
🔇 Additional comments (6)
src/aiperf/metrics/types/total_token_throughput.py (3)

17-23: LGTM!

The class definition and docstring clearly explain the metric's purpose and formula.


25-39: LGTM!

The metadata attributes are well-defined:

  • Appropriate unit (TOKENS_PER_SECOND) and flags for a throughput metric
  • Correct required_metrics dependencies

Note: The static analysis hint RUF012 suggesting ClassVar is a false positive; these are immutable class-level constants that follow the established pattern in the codebase.


41-57: LGTM!

The implementation correctly calculates throughput with proper error handling:

  • Appropriate use of get_or_raise and get_converted_or_raise
  • Zero-duration validation prevents division by zero
  • Correct formula: (input_tokens + output_tokens) / duration

The type: ignore comments are justified due to the complex type unions in the metric framework. The TRY003 static analysis hint is a nitpick and can be safely ignored.

src/aiperf/metrics/types/prefill_throughput_per_user.py (2)

13-31: LGTM!

The rename from PrefillThroughputMetric to PrefillThroughputPerUserMetric is consistent across all identifiers:

  • Class name, tag, header, and short_header all updated
  • Correct unit (TOKENS_PER_SECOND_PER_USER) for per-user semantics
  • STREAMING_TOKENS_ONLY flag appropriately replaces STREAMING_ONLY

This clarification improves metric naming consistency and makes the per-user nature explicit.


42-53: LGTM!

Documentation and error messages properly updated to reflect per-user semantics. The calculation logic remains correct and unchanged.

tests/unit/metrics/test_total_token_throughput_metric.py (1)

19-45: LGTM!

The test structure follows pytest best practices:

  • Proper use of @pytest.mark.parametrize with comprehensive test cases
  • Correct application of # fmt: skip per coding guidelines
  • Type hints on test method parameters
  • Good coverage of edge cases (zero tokens, fractional duration, large numbers)
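
A parametrized test in that style might look like the following (a sketch of the pattern only, with invented case values, not the repository's actual test file):

import pytest

@pytest.mark.parametrize(
    ("total_isl", "total_osl", "duration_s", "expected"),
    [
        (100, 100, 2.0, 100.0),       # basic case
        (0, 50, 1.0, 50.0),           # zero input tokens
        (30, 30, 0.5, 120.0),         # fractional duration
        (10**9, 10**9, 10.0, 2.0e8),  # large numbers
    ],
)  # fmt: skip
def test_total_token_throughput(
    total_isl: int, total_osl: int, duration_s: float, expected: float
) -> None:
    assert (total_isl + total_osl) / duration_s == pytest.approx(expected)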

@ajcasagrande ajcasagrande force-pushed the ajc/throughput-metrics branch from ec4f211 to 337283b on December 5, 2025 at 04:49
codecov bot commented Dec 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

