feat(experiments): Filter experiments list by a rollup metric by shanaiabuggy · Pull Request #453 · NVIDIA-NeMo/nemo-platform

shanaiabuggy · 2026-06-25T00:17:15Z

What

Adds metric filtering to the experiments list — filter by rollup metrics (cost, latency, evaluator scores, run count), not just entity fields.

How

Reuses the platform's standard filter[field][op] bracket syntax, so it combines naturally with existing entity filters:

?filter[cost_usd.mean][$lte]=0.5&filter[run_count][$gte]=1&filter[experiment_group_id]=
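
A minimal sketch of how a client might assemble this query string with the Python standard library. The group ID `grp-123` is a hypothetical placeholder, not a value taken from this PR, and leaving the brackets and `$` unescaped is only a readability choice.

```python
from urllib.parse import urlencode

# Bracket-syntax filters as plain key/value pairs.
# "grp-123" is a hypothetical experiment group ID used only for illustration.
filters = {
    "filter[cost_usd.mean][$lte]": "0.5",
    "filter[run_count][$gte]": "1",
    "filter[experiment_group_id]": "grp-123",
}

# safe="[]$" keeps the bracket syntax readable. Without it, urlencode
# percent-encodes those characters, which servers normally decode back.
query = urlencode(filters, safe="[]$")
print("?" + query)
```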

Supported paths mirror the sort grammar: run_count, cost_usd.<stat>, latency_ms.<stat>, evaluators.<name>.<stat> (stat ∈ mean/median/p90/p95/p99/sum/count). Operators: $gte/$lte/$gt/$lt/$eq.
Metrics live in ClickHouse, not Postgres, so list_experiments splits the filter tree: entity predicates go to the entity store, and metric predicates are applied in-app after rollup hydration (compute-on-read, same plumbing as metric sort). Metric paths are declared via self-mapping namespaces on ExperimentFilter so they pass validation untranslated.
Added a NumberFilter range type ($gte/$lte/$gt/$lt/$eq) alongside DatetimeFilter/StringFilter.
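
The path grammar above could be checked along these lines. This is an illustrative sketch, not the code in this PR: `MetricPath`, `parse_metric_path`, and `is_supported` are hypothetical names, and it assumes evaluator names contain no dots.

```python
from typing import NamedTuple, Optional

STATS = {"mean", "median", "p90", "p95", "p99", "sum", "count"}
OPERATORS = {"$gte", "$lte", "$gt", "$lt", "$eq"}


class MetricPath(NamedTuple):
    metric: str               # "run_count", "cost_usd", "latency_ms", or "evaluators"
    evaluator: Optional[str]  # set only for evaluators.<name>.<stat>
    stat: Optional[str]       # None for run_count, which takes no stat


def parse_metric_path(path: str) -> Optional[MetricPath]:
    """Return a MetricPath for a supported rollup path, or None otherwise."""
    if path == "run_count":
        return MetricPath("run_count", None, None)
    parts = path.split(".")
    if len(parts) == 2 and parts[0] in {"cost_usd", "latency_ms"} and parts[1] in STATS:
        return MetricPath(parts[0], None, parts[1])
    # Assumption: evaluator names contain no dots.
    if len(parts) == 3 and parts[0] == "evaluators" and parts[1] and parts[2] in STATS:
        return MetricPath("evaluators", parts[1], parts[2])
    return None


def is_supported(path: str, op: str) -> bool:
    """True when both the path and the operator are in the supported sets."""
    return parse_metric_path(path) is not None and op in OPERATORS
```

For example, `is_supported("latency_ms.p95", "$lt")` is accepted, while `cost_usd.max` or a `$ne` operator would be rejected, which corresponds to the 400 case.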

Behavior

Metric filters must be AND-combined with entity filters (nested ANDs flatten); a metric under OR/NOT → 400 (can't split a boolean tree across two stores).
400 for an unsupported metric/stat/operator; 413 when the result set exceeds the in-memory bound; 503 when ClickHouse is unavailable for a metric filter. A missing metric never matches.
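
The AND-only rule can be sketched as a recursive split over a small filter tree. The `Comparison`, `Logical`, and `FilterError` types below are hypothetical stand-ins, not the platform's real filter classes, and `FilterError` plays the role of the 400 response.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Comparison:
    path: str      # e.g. "cost_usd.mean" or "experiment_group_id"
    op: str        # e.g. "$lte"
    value: object


@dataclass
class Logical:
    op: str                  # "and", "or", or "not"
    children: List["Node"]


Node = Union[Comparison, Logical]


class FilterError(ValueError):
    """Stands in for the HTTP 400 the endpoint returns."""


def is_metric_path(path: str) -> bool:
    return path == "run_count" or path.startswith(("cost_usd.", "latency_ms.", "evaluators."))


def references_metric(node: Node) -> bool:
    if isinstance(node, Comparison):
        return is_metric_path(node.path)
    return any(references_metric(child) for child in node.children)


def split_filter(node: Node) -> Tuple[List[Node], List[Comparison]]:
    """Return (entity nodes, metric predicates), flattening nested ANDs."""
    if isinstance(node, Comparison):
        return ([], [node]) if is_metric_path(node.path) else ([node], [])
    if node.op == "and":
        entity: List[Node] = []
        metric: List[Comparison] = []
        for child in node.children:
            child_entity, child_metric = split_filter(child)
            entity.extend(child_entity)
            metric.extend(child_metric)
        return entity, metric
    # OR/NOT subtrees go to the entity store whole, so they may not hold metrics.
    if references_metric(node):
        raise FilterError("metric filters cannot appear under OR/NOT")
    return [node], []
```

An OR or NOT subtree that touches only entity fields is passed through untouched, so existing entity filters keep working as before.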

Tests

Unit tests for the split/validate/match helpers + endpoint wiring (validation, 400/503), and an integration test combining entity + metric filters end to end against ClickHouse. OpenAPI specs regenerated.

Summary by CodeRabbit

New Features
- Experiment list filtering now supports numeric comparisons on metric rollups, including run_count, cost_usd, latency_ms, and evaluator metrics.
- Added support for comparison operators: $eq, $gt, $gte, $lt, $lte (e.g., range queries like cost_usd.mean <= 0.50).
Bug Fixes
- Improved error messaging for unsupported sort or filter fields.
- Metric-based sort/filter now returns 503 when telemetry data is unavailable.
Documentation
- Updated the OpenAPI contract to describe the new metric rollup filter syntax and updated response text.
Tests
- Added integration and unit test coverage for metric filtering and error cases.

Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>

coderabbitai · 2026-06-25T00:22:29Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6e8d96e4-7659-4ba5-abc5-e8fab6522eaf

📥 Commits

Reviewing files that changed from the base of the PR and between a965d84 and 139fc9e.

⛔ Files ignored due to path filters (10)

sdk/python/nemo-platform/.nmpcontext/openapi.yaml is excluded by !sdk/**
sdk/python/nemo-platform/.nmpcontext/stainless.yaml is excluded by !sdk/**
sdk/python/nemo-platform/src/nemo_platform/resources/experiments/api.md is excluded by !sdk/**
sdk/python/nemo-platform/src/nemo_platform/resources/experiments/experiments.py is excluded by !sdk/**
sdk/python/nemo-platform/src/nemo_platform/types/experiments/__init__.py is excluded by !sdk/**
sdk/python/nemo-platform/src/nemo_platform/types/experiments/experiment_filter_param.py is excluded by !sdk/**
sdk/python/nemo-platform/src/nemo_platform/types/experiments/experiment_list_params.py is excluded by !sdk/**
sdk/python/nemo-platform/src/nemo_platform/types/experiments/number_filter_param.py is excluded by !sdk/**
sdk/python/nemo-platform/tests/api_resources/test_experiments.py is excluded by !sdk/**
sdk/stainless.yaml is excluded by !sdk/**

📒 Files selected for processing (7)

openapi/ga/individual/platform.openapi.yaml
openapi/ga/openapi.yaml
openapi/openapi.yaml
packages/nmp_common/src/nmp/common/entities/values.py
services/intake/src/nmp/intake/api/v2/experiments/endpoints.py
services/intake/src/nmp/intake/api/v2/experiments/schemas.py
services/intake/tests/test_experiment_metric_filter.py

🚧 Files skipped from review as they are similar to previous changes (7)

packages/nmp_common/src/nmp/common/entities/values.py
services/intake/tests/test_experiment_metric_filter.py
services/intake/src/nmp/intake/api/v2/experiments/schemas.py
openapi/ga/openapi.yaml
services/intake/src/nmp/intake/api/v2/experiments/endpoints.py
openapi/ga/individual/platform.openapi.yaml
openapi/openapi.yaml

📝 Walkthrough

Walkthrough

Adds numeric rollup filtering to the experiments list API. OpenAPI, shared filter types, endpoint handling, and tests now cover run_count, cost_usd, latency_ms, and evaluators comparisons.

Changes

Metric Rollup Filters

| Layer / File(s) | Summary |
| --- | --- |
| Endpoint docs and error text `openapi/openapi.yaml`, `openapi/ga/openapi.yaml`, `openapi/ga/individual/platform.openapi.yaml` | `filter` docs add rollup-metric examples; `400` and `503` text now mention unsupported sort/filter fields and metric-based sort/filter. |
| Filter contract and schema shapes `openapi/openapi.yaml`, `openapi/ga/openapi.yaml`, `openapi/ga/individual/platform.openapi.yaml`, `packages/nmp_common/src/nmp/common/entities/values.py`, `services/intake/src/nmp/intake/api/v2/experiments/schemas.py` | `ExperimentFilter` gains rollup-metric fields, and `NumberFilter` adds `$gte`, `$lte`, `$gt`, `$lt`, and `$eq` operators. |
| Metric filter handling in list_experiments `services/intake/src/nmp/intake/api/v2/experiments/endpoints.py` | `list_experiments` splits entity filters from metric predicates, validates metric paths and numeric operators, hydrates rollups, and applies metric predicates before sorting and pagination. |
| Metric filter tests `services/intake/tests/test_experiment_metric_filter.py`, `services/intake/tests/integration/spans/test_experiment_metric_sort.py` | Unit tests cover filter extraction and validation, and integration tests cover combined metric filtering in the experiments list response. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ExperimentListEndpoint
  participant EntityStore
  participant ClickHouseTelemetryStore
  Client->>ExperimentListEndpoint: GET /experiments with metric filters
  ExperimentListEndpoint->>ExperimentListEndpoint: split entity filters and metric predicates
  ExperimentListEndpoint->>EntityStore: query entity operation
  EntityStore-->>ExperimentListEndpoint: experiments
  ExperimentListEndpoint->>ClickHouseTelemetryStore: hydrate rollups for metric fields
  ClickHouseTelemetryStore-->>ExperimentListEndpoint: rollup values
  ExperimentListEndpoint->>ExperimentListEndpoint: apply metric predicates, sort, paginate
  ExperimentListEndpoint-->>Client: filtered page
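
The last two steps of the diagram (apply metric predicates, then sort and paginate) might look roughly like this in memory. It is a sketch under stated assumptions: `matches` and `list_page` are hypothetical names, rollups are modelled as plain dicts rather than ClickHouse rows, and placing experiments without the sort metric last is an assumption, not confirmed behavior.

```python
import operator
from typing import Dict, List, Optional, Tuple

OPS = {
    "$gte": operator.ge,
    "$lte": operator.le,
    "$gt": operator.gt,
    "$lt": operator.lt,
    "$eq": operator.eq,
}

Predicate = Tuple[str, str, float]  # (metric path, operator, target)


def matches(value: Optional[float], op: str, target: float) -> bool:
    """A missing metric never matches, whatever the operator."""
    if value is None:
        return False
    return OPS[op](value, target)


def list_page(
    experiment_ids: List[str],
    rollups: Dict[str, Dict[str, float]],
    predicates: List[Predicate],
    sort_path: str,
    page_size: int,
) -> List[str]:
    # Keep experiments whose hydrated rollups satisfy every metric predicate.
    kept = [
        exp_id
        for exp_id in experiment_ids
        if all(
            matches(rollups.get(exp_id, {}).get(path), op, target)
            for path, op, target in predicates
        )
    ]

    # Sort ascending; experiments missing the sort metric go last (assumption).
    def sort_key(exp_id: str) -> Tuple[bool, float]:
        value = rollups.get(exp_id, {}).get(sort_path)
        return (value is None, value if value is not None else 0.0)

    kept.sort(key=sort_key)
    return kept[:page_size]
```

Filtering before pagination is what makes the in-memory bound necessary: the full entity result has to be hydrated before the first page can be cut.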

Possibly related PRs

NVIDIA-NeMo/nemo-platform#124 — Extends the same experiments filter schema with rollup metric comparisons.
NVIDIA-NeMo/nemo-platform#448 — Updates the same experiments list endpoint’s rollup hydration and metric-path handling.
NVIDIA-NeMo/nemo-platform#349 — Touches the same experiments list filter model and endpoint flow.

Suggested reviewers

BrianNewsom
callingmedic911

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title matches the main change: adding rollup-metric filtering to the experiments list endpoint. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (1)

packages/nmp_common/src/nmp/common/entities/values.py (1)
274-310: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

NumberFilter overlaps FloatFilter.

FloatFilter already provides $gte/$lte; NumberFilter adds $gt/$lt/$eq. Consider folding the extra operators into FloatFilter (or deriving one from the other) to avoid two near-identical numeric filters drifting apart.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/nmp_common/src/nmp/common/entities/values.py` around lines 274 -
310, `NumberFilter` duplicates most of `FloatFilter` and risks the two numeric
filter models drifting apart. Refactor the filter types so there is a single
source of truth for numeric comparisons, either by moving `$gt`/`$lt`/`$eq` into
`FloatFilter` or by making `NumberFilter` inherit/compose from `FloatFilter`;
update the `NumberFilter` and `FloatFilter` definitions in `values.py` so their
shared behavior lives in one place and their aliases/config stay consistent.
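
One way the "derive one from the other" option could look, sketched with standard-library dataclasses. The real `FloatFilter` and `NumberFilter` in `values.py` are not shown in this thread, so the class names, fields, and the `operators` helper below are illustrative assumptions only.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class FloatFilterSketch:
    """Hypothetical base with the two range operators."""

    gte: Optional[float] = None
    lte: Optional[float] = None

    def operators(self) -> Dict[str, float]:
        """Return only the operators that were set, keyed with the $ prefix."""
        return {
            f"${f.name}": getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


@dataclass
class NumberFilterSketch(FloatFilterSketch):
    """Adds the strict and equality operators on top of the shared base."""

    gt: Optional[float] = None
    lt: Optional[float] = None
    eq: Optional[float] = None
```

Because `operators` walks the dataclass fields, the subclass picks up `$gt`, `$lt`, and `$eq` without duplicating any logic from the base.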

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openapi/ga/individual/platform.openapi.yaml`:
- Around line 3738-3740: The filter examples in the OpenAPI docs use unprefixed
operators that do not match the schema. Update the example text in the
`NumberFilter`/rollup metric descriptions so the keys match the defined query
shape (`$gte`, `$lte`, `$gt`, `$lt`, `$eq`) everywhere this example appears,
including the related `NumberFilter` documentation block. Keep the surrounding
example values the same, but ensure the operator names in the docs are
consistent with the schema.
- Around line 14255-14279: The NumberFilter schema currently allows empty
objects because it lacks a minimum property constraint. Update the NumberFilter
definition in the openapi schema to require at least one predicate by adding
minProperties: 1 alongside the existing properties and additionalProperties:
false, so the schema still accepts $gte, $lte, $gt, $lt, or $eq but rejects {}.

In `@openapi/ga/openapi.yaml`:
- Around line 3734-3740: Update the filter documentation text in the OpenAPI
spec so the numeric range examples use the same $-prefixed operator keys defined
by the schema. In the affected description near the experiments filter section,
change the examples for run_count, cost_usd.mean, latency_ms.p95, and
evaluators.<name>.mean to use $gte/$lte/$gt/$lt/$eq consistently. Apply the same
wording cleanup anywhere the duplicated filter description appears so the
examples match the actual supported operators and do not point clients to
invalid keys.
- Around line 14255-14279: NumberFilter currently allows an empty object, so
update the NumberFilter schema in openapi/ga/openapi.yaml to require at least
one predicate operator. Add minProperties: 1 alongside the existing properties
definition so validation rejects {} while still allowing $gte, $lte, $gt, $lt,
or $eq. Use the NumberFilter schema block to locate the change.

In `@services/intake/src/nmp/intake/api/v2/experiments/endpoints.py`:
- Around line 992-1016: The metric filter validation in the LogicalOperation
handling is rejecting nested AND groups because
`_operation_references_metric(child)` treats a child AND containing metrics as
invalid, even though the parent combinator is already AND. Update the logic
around `LogicalOperation`, `_operation_references_metric`, and
`_validated_metric_predicate` to either flatten nested ANDs before validation or
explicitly recurse through AND children so metric comparisons inside sub-ANDs
are accepted; if nested ANDs remain unsupported, adjust the HTTPException detail
to clearly state that only flat metric comparisons are allowed.

---

Nitpick comments:
In `@packages/nmp_common/src/nmp/common/entities/values.py`:
- Around line 274-310: `NumberFilter` duplicates most of `FloatFilter` and risks
the two numeric filter models drifting apart. Refactor the filter types so there
is a single source of truth for numeric comparisons, either by moving
`$gt`/`$lt`/`$eq` into `FloatFilter` or by making `NumberFilter` inherit/compose
from `FloatFilter`; update the `NumberFilter` and `FloatFilter` definitions in
`values.py` so their shared behavior lives in one place and their aliases/config
stay consistent.
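
The schema constraints requested above (`minProperties: 1` together with `additionalProperties: false`) amount to the following check on a raw filter object. This is a hand-written sketch for illustration; `validate_number_filter` is a hypothetical helper, not something from the PR.

```python
from typing import Dict, Union

Number = Union[int, float]
ALLOWED_OPERATORS = {"$gte", "$lte", "$gt", "$lt", "$eq"}


def validate_number_filter(raw: Dict[str, Number]) -> Dict[str, Number]:
    """Reject {} and unknown keys, mirroring minProperties/additionalProperties."""
    if not raw:
        raise ValueError("NumberFilter needs at least one operator")
    unknown = set(raw) - ALLOWED_OPERATORS
    if unknown:
        # Unprefixed keys such as "gte" land here too.
        raise ValueError(f"unsupported operator(s): {sorted(unknown)}")
    for op, value in raw.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{op} expects a number")
    return raw
```

The unknown-key branch also catches the unprefixed operator names that the documentation examples used before they were corrected.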

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6359daae-f76d-4371-af1a-b831e0bf4a36

📥 Commits

Reviewing files that changed from the base of the PR and between b4473e8 and a965d84.

⛔ Files ignored due to path filters (1)

web/packages/sdk/generated/agents/schema/DeploymentLogsResponse.ts is excluded by !**/generated/**

📒 Files selected for processing (8)

openapi/ga/individual/platform.openapi.yaml
openapi/ga/openapi.yaml
openapi/openapi.yaml
packages/nmp_common/src/nmp/common/entities/values.py
services/intake/src/nmp/intake/api/v2/experiments/endpoints.py
services/intake/src/nmp/intake/api/v2/experiments/schemas.py
services/intake/tests/integration/spans/test_experiment_metric_sort.py
services/intake/tests/test_experiment_metric_filter.py

github-actions · 2026-06-25T00:27:33Z

| Suite | Lines Covered | Line Rate | Branch Rate |
| --- | --- | --- | --- |
| Unit Tests | 20917/27485 | 76.1% | 61.2% |
| Integration Tests | 12123/26254 | 46.2% | 19.6% |

feat(experiments): Filter experiments list by a rollup metric

a965d84

Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>

shanaiabuggy requested review from a team as code owners June 25, 2026 00:17

github-actions Bot added the feat label Jun 25, 2026

coderabbitai Bot reviewed Jun 25, 2026

Comment thread openapi/ga/individual/platform.openapi.yaml Outdated

Comment thread openapi/ga/individual/platform.openapi.yaml

Comment thread openapi/ga/openapi.yaml Outdated

Comment thread openapi/ga/openapi.yaml

Comment thread services/intake/src/nmp/intake/api/v2/experiments/endpoints.py

shanaiabuggy added 2 commits June 24, 2026 20:42

bunny

139fc9e

Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>

lint

65755bb

Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>

feat(experiments): Filter experiments list by a rollup metric #453

shanaiabuggy wants to merge 3 commits into main from sbuggy/ase-321
