feat(experiments): Filter experiments list by a rollup metric#453
feat(experiments): Filter experiments list by a rollup metric#453shanaiabuggy wants to merge 3 commits into
Conversation
Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: ⛔ Files ignored due to path filters (10)
📒 Files selected for processing (7)
🚧 Files skipped from review as they are similar to previous changes (7)
📝 WalkthroughWalkthroughAdds numeric rollup filtering to the experiments list API. OpenAPI, shared filter types, endpoint handling, and tests now cover ChangesMetric Rollup Filters
Sequence Diagram(s)sequenceDiagram
participant Client
participant ExperimentListEndpoint
participant EntityStore
participant ClickHouseTelemetryStore
Client->>ExperimentListEndpoint: GET /experiments with metric filters
ExperimentListEndpoint->>ExperimentListEndpoint: split entity filters and metric predicates
ExperimentListEndpoint->>EntityStore: query entity operation
EntityStore-->>ExperimentListEndpoint: experiments
ExperimentListEndpoint->>ClickHouseTelemetryStore: hydrate rollups for metric fields
ClickHouseTelemetryStore-->>ExperimentListEndpoint: rollup values
ExperimentListEndpoint->>ExperimentListEndpoint: apply metric predicates, sort, paginate
ExperimentListEndpoint-->>Client: filtered page
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
packages/nmp_common/src/nmp/common/entities/values.py (1)
274-310: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
NumberFilteroverlapsFloatFilter.
FloatFilteralready provides$gte/$lte;NumberFilteradds$gt/$lt/$eq. Consider folding the extra operators intoFloatFilter(or deriving one from the other) to avoid two near-identical numeric filters drifting apart.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/nmp_common/src/nmp/common/entities/values.py` around lines 274 - 310, `NumberFilter` duplicates most of `FloatFilter` and risks the two numeric filter models drifting apart. Refactor the filter types so there is a single source of truth for numeric comparisons, either by moving `$gt`/`$lt`/`$eq` into `FloatFilter` or by making `NumberFilter` inherit/compose from `FloatFilter`; update the `NumberFilter` and `FloatFilter` definitions in `values.py` so their shared behavior lives in one place and their aliases/config stay consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openapi/ga/individual/platform.openapi.yaml`:
- Around line 3738-3740: The filter examples in the OpenAPI docs use unprefixed
operators that do not match the schema. Update the example text in the
`NumberFilter`/rollup metric descriptions so the keys match the defined query
shape (`$gte`, `$lte`, `$gt`, `$lt`, `$eq`) everywhere this example appears,
including the related `NumberFilter` documentation block. Keep the surrounding
example values the same, but ensure the operator names in the docs are
consistent with the schema.
- Around line 14255-14279: The NumberFilter schema currently allows empty
objects because it lacks a minimum property constraint. Update the NumberFilter
definition in the openapi schema to require at least one predicate by adding
minProperties: 1 alongside the existing properties and additionalProperties:
false, so the schema still accepts $gte, $lte, $gt, $lt, or $eq but rejects {}.
In `@openapi/ga/openapi.yaml`:
- Around line 3734-3740: Update the filter documentation text in the OpenAPI
spec so the numeric range examples use the same $-prefixed operator keys defined
by the schema. In the affected description near the experiments filter section,
change the examples for run_count, cost_usd.mean, latency_ms.p95, and
evaluators.<name>.mean to use $gte/$lte/$gt/$lt/$eq consistently. Apply the same
wording cleanup anywhere the duplicated filter description appears so the
examples match the actual supported operators and do not point clients to
invalid keys.
- Around line 14255-14279: NumberFilter currently allows an empty object, so
update the NumberFilter schema in openapi/ga/openapi.yaml to require at least
one predicate operator. Add minProperties: 1 alongside the existing properties
definition so validation rejects {} while still allowing $gte, $lte, $gt, $lt,
or $eq. Use the NumberFilter schema block to locate the change.
In `@services/intake/src/nmp/intake/api/v2/experiments/endpoints.py`:
- Around line 992-1016: The metric filter validation in the LogicalOperation
handling is rejecting nested AND groups because
`_operation_references_metric(child)` treats a child AND containing metrics as
invalid, even though the parent combinator is already AND. Update the logic
around `LogicalOperation`, `_operation_references_metric`, and
`_validated_metric_predicate` to either flatten nested ANDs before validation or
explicitly recurse through AND children so metric comparisons inside sub-ANDs
are accepted; if nested ANDs remain unsupported, adjust the HTTPException detail
to clearly state that only flat metric comparisons are allowed.
---
Nitpick comments:
In `@packages/nmp_common/src/nmp/common/entities/values.py`:
- Around line 274-310: `NumberFilter` duplicates most of `FloatFilter` and risks
the two numeric filter models drifting apart. Refactor the filter types so there
is a single source of truth for numeric comparisons, either by moving
`$gt`/`$lt`/`$eq` into `FloatFilter` or by making `NumberFilter` inherit/compose
from `FloatFilter`; update the `NumberFilter` and `FloatFilter` definitions in
`values.py` so their shared behavior lives in one place and their aliases/config
stay consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 6359daae-f76d-4371-af1a-b831e0bf4a36
⛔ Files ignored due to path filters (1)
web/packages/sdk/generated/agents/schema/DeploymentLogsResponse.tsis excluded by!**/generated/**
📒 Files selected for processing (8)
openapi/ga/individual/platform.openapi.yamlopenapi/ga/openapi.yamlopenapi/openapi.yamlpackages/nmp_common/src/nmp/common/entities/values.pyservices/intake/src/nmp/intake/api/v2/experiments/endpoints.pyservices/intake/src/nmp/intake/api/v2/experiments/schemas.pyservices/intake/tests/integration/spans/test_experiment_metric_sort.pyservices/intake/tests/test_experiment_metric_filter.py
|
What
Adds metric filtering to the experiments list — filter by rollup metrics (cost, latency, evaluator scores, run count), not just entity fields.
How
Reuses the platform's standard
filter[field][op]bracket syntax, so it combines naturally with existing entity filters:?filter[cost_usd.mean][$lte]=0.5&filter[run_count][$gte]=1&filter[experiment_group_id]=
run_count,cost_usd.<stat>,latency_ms.<stat>,evaluators.<name>.<stat>(stat ∈ mean/median/p90/p95/p99/sum/count). Operators:$gte/$lte/$gt/$lt/$eq.list_experimentssplits the filter tree: entity predicates go to the entity store; metric predicates are applied in-app after rollup hydration (compute-on-read, same plumbing as metric sort). Declared via self-mapping namespaces onExperimentFilterso paths pass validation untranslated.NumberFilterrange type ($gte/$lte/$gt/$lt/$eq) alongsideDatetimeFilter/StringFilter.Behavior
Tests
Unit tests for the split/validate/match helpers + endpoint wiring (validation, 400/503), and an integration test combining entity + metric filters end to end against ClickHouse. OpenAPI specs regenerated.
Summary by CodeRabbit
New Features
run_count,cost_usd,latency_ms, and evaluator metrics.$eq,$gt,$gte,$lt,$lte(e.g., range queries likecost_usd.mean <= 0.50).Bug Fixes
503when telemetry data is unavailable.Documentation
Tests