-
Notifications
You must be signed in to change notification settings - Fork 11
feat: add total_token_throughput metric, rename prefill_throughput_per_user #500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Try out this PRQuick install: pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@337283b48215270edab950575919b53026ab70c2Recommended with virtual environment (using uv): uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@337283b48215270edab950575919b53026ab70c2Last updated for commit: |
WalkthroughRenamed PrefillThroughputMetric to PrefillThroughputPerUserMetric with updated semantics and identifiers. Added new TotalTokenThroughputMetric derived metric class computing combined input/output token throughput. Included comprehensive unit tests validating throughput calculations and error handling. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/aiperf/metrics/types/prefill_throughput_per_user.py(3 hunks)src/aiperf/metrics/types/total_token_throughput.py(1 hunks)tests/unit/metrics/test_total_token_throughput_metric.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use async/await for all I/O operations; never use time.sleep() or blocking calls
Always use orjson for JSON operations: orjson.loads(s) and orjson.dumps(d)
All functions must have type hints on parameters and return types
Use Python 3.10+ union syntax (|) instead of typing.Union; use match/case for pattern matching; use @DataClass(slots=True)
Files:
src/aiperf/metrics/types/prefill_throughput_per_user.pytests/unit/metrics/test_total_token_throughput_metric.pysrc/aiperf/metrics/types/total_token_throughput.py
**/*test*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test files must use pytest with fixtures, helpers, and @pytest.mark.parametrize; import statements at the top; use # fmt: skip for long parameterize blocks
Files:
tests/unit/metrics/test_total_token_throughput_metric.py
🧬 Code graph analysis (2)
src/aiperf/metrics/types/prefill_throughput_per_user.py (1)
src/aiperf/common/enums/metric_enums.py (2)
MetricOverTimeUnit(338-396)MetricFlags(602-698)
tests/unit/metrics/test_total_token_throughput_metric.py (4)
src/aiperf/common/exceptions.py (1)
NoMetricValue(168-169)src/aiperf/metrics/metric_dicts.py (1)
MetricResultsDict(120-140)src/aiperf/metrics/types/input_sequence_length_metric.py (1)
TotalInputSequenceLengthMetric(45-63)src/aiperf/metrics/types/total_token_throughput.py (1)
TotalTokenThroughputMetric(17-57)
🪛 Ruff (0.14.7)
src/aiperf/metrics/types/total_token_throughput.py
35-39: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
54-56: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: build (macos-latest, 3.10)
- GitHub Check: build (macos-latest, 3.12)
- GitHub Check: build (macos-latest, 3.13)
- GitHub Check: build (macos-latest, 3.11)
- GitHub Check: build (ubuntu-latest, 3.12)
- GitHub Check: build (ubuntu-latest, 3.10)
- GitHub Check: build (ubuntu-latest, 3.11)
- GitHub Check: build (ubuntu-latest, 3.13)
- GitHub Check: integration-tests (ubuntu-latest, 3.13)
- GitHub Check: integration-tests (ubuntu-latest, 3.11)
- GitHub Check: integration-tests (ubuntu-latest, 3.12)
- GitHub Check: integration-tests (ubuntu-latest, 3.10)
🔇 Additional comments (6)
src/aiperf/metrics/types/total_token_throughput.py (3)
17-23: LGTM!The class definition and docstring clearly explain the metric's purpose and formula.
25-39: LGTM!The metadata attributes are well-defined:
- Appropriate unit (TOKENS_PER_SECOND) and flags for a throughput metric
- Correct required_metrics dependencies
Note: The static analysis hint RUF012 suggesting ClassVar is a false positive; these are immutable class-level constants that follow the established pattern in the codebase.
41-57: LGTM!The implementation correctly calculates throughput with proper error handling:
- Appropriate use of
get_or_raiseandget_converted_or_raise- Zero-duration validation prevents division by zero
- Correct formula:
(input_tokens + output_tokens) / durationThe
type: ignorecomments are justified due to the complex type unions in the metric framework. The TRY003 static analysis hint is a nitpick and can be safely ignored.src/aiperf/metrics/types/prefill_throughput_per_user.py (2)
13-31: LGTM!The rename from
PrefillThroughputMetrictoPrefillThroughputPerUserMetricis consistent across all identifiers:
- Class name, tag, header, and short_header all updated
- Correct unit (
TOKENS_PER_SECOND_PER_USER) for per-user semanticsSTREAMING_TOKENS_ONLYflag appropriately replacesSTREAMING_ONLYThis clarification improves metric naming consistency and makes the per-user nature explicit.
42-53: LGTM!Documentation and error messages properly updated to reflect per-user semantics. The calculation logic remains correct and unchanged.
tests/unit/metrics/test_total_token_throughput_metric.py (1)
19-45: LGTM!The test structure follows pytest best practices:
- Proper use of
@pytest.mark.parametrizewith comprehensive test cases- Correct application of
# fmt: skipper coding guidelines- Type hints on test method parameters
- Good coverage of edge cases (zero tokens, fractional duration, large numbers)
ec4f211 to
337283b
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
total_token_throughputmetricprefill_throughputtoprefill_throughput_per_userfor clarity