fix(charts): cap group-by time-series to top-N series to prevent OOM by wrn14897 · Pull Request #2429 · hyperdxio/hyperdx

wrn14897 · 2026-06-09T01:55:05Z

Summary

High-cardinality group-by time charts could pull hundreds of thousands of series into a single tile — only ~60 were ever drawn (HARD_LINES_LIMIT), but every series was still fetched, JSON-parsed, and zero-filled into a dense buckets×series matrix in memory, OOMing the browser tab. This adds an opt-in query-level seriesLimit: when set on a group-by + granularity chart, renderChartConfig emits a CTE (__hdx_series_limit) that keeps only the top-N series ranked by max value in any bucket and restricts the outer query to those groups (mirroring the existing aggFn=increase TopGroups path). The time-chart display path (convertToTimeChartConfig) sets this cap, while alerts and other renderChartConfig consumers leave it unset so their series evaluation is unchanged.
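
To make the ranking rule concrete, here is a minimal in-memory sketch of "top-N series by max value in any bucket". It is illustrative only: the PR performs this ranking inside ClickHouse via the `__hdx_series_limit` CTE, and the names `SeriesPoint`, `topSeriesByPeak`, and `capSeries` are hypothetical, not part of the codebase.

```typescript
// Illustrative only: the PR does this ranking in SQL, not in the browser.
type SeriesPoint = { bucket: number; series: string; value: number };

// Return the names of the top `limit` series, ranked by peak value in any bucket.
function topSeriesByPeak(points: SeriesPoint[], limit: number): string[] {
  const peak = new Map<string, number>();
  for (const p of points) {
    const prev = peak.get(p.series);
    if (prev === undefined || p.value > prev) peak.set(p.series, p.value);
  }
  return Array.from(peak.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([series]) => series);
}

// Keep only rows whose series made the cut (the role of the outer-query restriction).
function capSeries(points: SeriesPoint[], limit: number): SeriesPoint[] {
  const keep = new Set(topSeriesByPeak(points, limit));
  return points.filter(p => keep.has(p.series));
}

const samplePoints: SeriesPoint[] = [
  { bucket: 0, series: 'api', value: 5 },
  { bucket: 1, series: 'api', value: 40 },
  { bucket: 0, series: 'web', value: 12 },
  { bucket: 1, series: 'web', value: 9 },
  { bucket: 0, series: 'cron', value: 1 },
];
console.log(topSeriesByPeak(samplePoints, 2)); // [ 'api', 'web' ]
```

Ranking by peak rather than by sum favours series that spike briefly, which matches the "max value in any bucket" wording in the description.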

The cap is also configurable per team: a new seriesLimit team setting (alongside the existing ClickHouse client settings) lets teams tune it, defaulting to 100 when unset and floored at 1 (no upper bound, so teams can intentionally fetch more series). The default and the rendered-line cap (HARD_LINES_LIMIT) move together; a team value only governs how many series are fetched, with any surplus available in the series selector. Verified end-to-end against real ClickHouse (50 series → top 5) plus unit tests; all existing SQL snapshots are unchanged.
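
The fallback and floor described above can be sketched as a small helper. This assumes the default of 100 stated in this description; `resolveSeriesLimit` is a hypothetical name and not necessarily how `convertToTimeChartConfig` is structured internally.

```typescript
// Assumed default, per the PR description (100 when the team value is unset).
const DEFAULT_SERIES_LIMIT = 100;

// Hypothetical helper: use the team value when set, else the default; floor at 1.
// There is deliberately no upper bound.
function resolveSeriesLimit(teamSeriesLimit?: number | null): number {
  return Math.max(1, teamSeriesLimit ?? DEFAULT_SERIES_LIMIT);
}

console.log(resolveSeriesLimit(undefined)); // 100
console.log(resolveSeriesLimit(0));         // 1
console.log(resolveSeriesLimit(5000));      // 5000
```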

How to test on Vercel preview

Preview routes: /team

Steps:

Open /team.
Scroll to the "ClickHouse Client Settings" section.
Verify a "Time Chart Series Limit" setting is listed with a default of 100.

References

Linear Issue: HDX-4499
Related PRs:

High-cardinality group-by time charts could pull hundreds of thousands of series into a single tile (only ~60 were ever drawn), zero-filling a dense buckets×series matrix in memory and OOMing the browser tab. Add an opt-in query-level `seriesLimit`: when set on a group-by + granularity chart, renderChartConfig emits a TopGroups CTE that keeps only the top-N series by max value in any bucket and restricts the outer query to those groups. The time-chart display path sets seriesLimit=60 (matching HARD_LINES_LIMIT); alerts and other renderChartConfig consumers leave it unset, so their series evaluation is unchanged.

changeset-bot · 2026-06-09T01:55:09Z

🦋 Changeset detected

Latest commit: 14e9921

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages

Name	Type
@hyperdx/common-utils	Patch
@hyperdx/app	Patch
@hyperdx/api	Patch
@hyperdx/otel-collector	Patch

vercel · 2026-06-09T01:55:11Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
hyperdx-oss	Ready	Preview, Comment	Jun 10, 2026 2:30pm
hyperdx-storybook	Ready	Preview, Comment	Jun 10, 2026 2:30pm

github-actions · 2026-06-09T01:55:28Z

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

Critical-path files (1):
- packages/api/src/models/team.ts
Cross-layer change: touches frontend (packages/app) + backend (packages/api) + shared utils (packages/common-utils)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats

Production files changed: 9
Production lines changed: 152 (+ 478 in test files, excluded from tier calculation)
Branch: sao-paulo
Author: wrn14897

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

greptile-apps · 2026-06-09T02:03:20Z

Greptile Summary

This PR caps group-by time-series queries to a configurable top-N series to prevent browser OOM on high-cardinality group-bys. It introduces a __hdx_series_limit CTE in renderChartConfig that ranks groups by peak aggregate value and restricts the outer query to those groups, while leaving alert evaluation and other renderChartConfig consumers unaffected.

CTE approach: renderSeriesLimitCte builds a ranking CTE with comprehensive guards (group-by present, granularity set, real table source, array select), correct alias-stripping for tuple()/IS NOT NULL predicates, and NULL exclusion to avoid unresolvable groups.
Team configurability: A new seriesLimit team setting (default 100, min 1, no upper bound) is wired through the Mongoose model, Zod schemas, and the Settings UI.
HARD_LINES_LIMIT raised 60 → 100: The rendering cap is unified with the fetch cap via DEFAULT_SERIES_LIMIT, intentionally increasing the number of drawable lines.
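
For orientation, the general shape of such a query can be sketched as follows. This is not the SQL that `renderChartConfig` emits: only the CTE name `__hdx_series_limit`, the `tuple(...) IN (...)` restriction, and the `IS NOT NULL` filter come from this PR. The interface, the `__group`/`__bucket`/`__value` aliases, and the exact clause layout are assumptions for illustration, and a real implementation would parameterize values rather than interpolate them.

```typescript
// Hypothetical inputs; none of these field names come from renderChartConfig.
interface SeriesCapSketch {
  table: string;       // source table
  bucketExpr: string;  // time-bucket expression
  valueExpr: string;   // aggregate being charted
  groupBy: string[];   // alias-free group-by expressions
  where: string;       // existing filter
  seriesLimit: number; // top-N cap
}

// Build an illustrative query: rank groups by peak bucket value, keep the top N,
// then restrict the outer query to those groups.
function sketchSeriesCapSql(c: SeriesCapSketch): string {
  const groups = c.groupBy.join(', ');
  const groupTuple = `tuple(${groups})`;
  const notNull = c.groupBy.map(g => `(${g} IS NOT NULL)`).join(' AND ');
  return [
    `WITH __hdx_series_limit AS (`,
    `  SELECT __group FROM (`,
    `    SELECT ${groupTuple} AS __group, ${c.bucketExpr} AS __bucket, ${c.valueExpr} AS __value`,
    `    FROM ${c.table}`,
    `    WHERE (${c.where}) AND ${notNull}`,
    `    GROUP BY __group, __bucket`,
    `  )`,
    `  GROUP BY __group`,
    `  ORDER BY max(__value) DESC`,
    `  LIMIT ${c.seriesLimit}`,
    `)`,
    `SELECT ${c.bucketExpr} AS __bucket, ${groups}, ${c.valueExpr} AS __value`,
    `FROM ${c.table}`,
    `WHERE (${c.where}) AND ${groupTuple} IN (SELECT __group FROM __hdx_series_limit)`,
    `GROUP BY __bucket, ${groups}`,
    `ORDER BY __bucket`,
  ].join('\n');
}

const sketchSql = sketchSeriesCapSql({
  table: 'otel_logs',
  bucketExpr: 'toStartOfInterval(Timestamp, INTERVAL 1 minute)',
  valueExpr: 'count()',
  groupBy: ['ServiceName', 'SeverityText'],
  where: 'Timestamp >= now() - INTERVAL 1 HOUR',
  seriesLimit: 5,
});
console.log(sketchSql);
```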

Confidence Score: 5/5

Safe to merge — the CTE is opt-in, the alert evaluation path is unchanged, and the implementation is backed by both unit and integration tests covering alias stripping, NULL exclusion, and multi-column group-bys.

The core CTE logic is well-guarded and thoroughly tested. The only finding is that the chart editor SQL preview calls convertToTimeChartConfig without passing the team's custom series limit, so the preview SQL may show a different LIMIT than the query that actually executes. This does not affect runtime behavior.

packages/app/src/components/DBEditTimeChartForm/utils.ts — the one call site not updated to pass the team series limit.

Important Files Changed

Filename	Overview
packages/common-utils/src/core/renderChartConfig.ts	Adds `renderSeriesLimitCte`: a new CTE that ranks group-by series by peak aggregate value and restricts the outer query to the top-N, guarded by comprehensive preconditions. Alias-stripping, null-filtering, and multi-column tuple handling are all correctly implemented and tested.
packages/app/src/ChartUtils.tsx	Adds optional `teamSeriesLimit` parameter to `convertToTimeChartConfig`; floored at 1, defaulting to `DEFAULT_SERIES_LIMIT`. `MAX_TIME_CHART_SERIES` and `HARD_LINES_LIMIT` both move to 100 (was 60).
packages/app/src/components/DBTimeChart.tsx	Moves `api.useMe()` before the `useMemo` so `me?.team?.seriesLimit` is available as a memo dependency; passes it to `convertToTimeChartConfig`. Query is already gated on `!isLoadingMe`.
packages/app/src/components/SearchTotalCountChart.tsx	Mirrors DBTimeChart: moves `api.useMe()` before the memo and passes `me?.team?.seriesLimit` to preserve React Query deduplication.
packages/common-utils/src/types.ts	Adds `seriesLimit: z.number().int().positive().optional()` to `SelectSQLStatementSchema`, `TeamClickHouseSettingsSchema`, and the nullable update schema. All three changes are consistent.
packages/app/src/components/TeamSettings/TeamQueryConfigSection.tsx	Adds a 'Time Chart Series Limit' setting form with min=1, defaultValue=DEFAULT_SERIES_LIMIT, and correct display value.
packages/api/src/models/team.ts	Adds `seriesLimit: Number` to the Mongoose team model, matching the Zod schema additions.
packages/app/src/defaults.ts	Adds `DEFAULT_SERIES_LIMIT = 100` as the single source of truth for both the fetch cap and the render cap.
packages/common-utils/src/tests/queryChartConfig.int.test.ts	Four new integration tests cover: basic top-N cap, multi-column string group-by, NULL group exclusion, and aliased group-by. All use finally blocks for table cleanup.
packages/common-utils/src/tests/renderChartConfig.test.ts	Seven new unit tests verify CTE emission, no-op conditions, multi-column tuple packing, alias stripping, and comma-separated string group-by splitting.
packages/app/src/tests/ChartUtils.test.ts	Three tests validate `seriesLimit` defaulting, team override, and unbounded large values in `convertToTimeChartConfig`.

Sequence Diagram

sequenceDiagram
    participant UI as DBTimeChart / SearchTotalCountChart
    participant API as api.useMe()
    participant CU as convertToTimeChartConfig
    participant RC as renderChartConfig
    participant CH as ClickHouse

    UI->>API: fetch team settings
    API-->>UI: "{ seriesLimit: N (or undefined) }"
    UI->>CU: convertToTimeChartConfig(config, N)
    Note over CU: seriesLimit = max(1, N ?? 100)
    CU-->>UI: config + seriesLimit
    UI->>RC: renderChartConfig(config+seriesLimit)
    Note over RC: renderSeriesLimitCte() emits<br/>__hdx_series_limit CTE<br/>(top-N by max per bucket)
    RC-->>UI: SQL with WITH __hdx_series_limit AS (...)
    UI->>CH: execute SQL
    CH-->>UI: "<= N series"

Reviews (12): Last reviewed commit: "Merge branch 'main' into sao-paulo"

github-actions · 2026-06-09T02:04:28Z

E2E Test Results

✅ All tests passed • 199 passed • 3 skipped • 1330s

Status	Count
✅ Passed	199
❌ Failed	0
⚠️ Flaky	3
⏭️ Skipped	3

Tests ran across 4 shards in parallel.

Add a `seriesLimit` team setting alongside the existing ClickHouse client settings so teams can tune the top-N series cap that bounds time-chart memory. `convertToTimeChartConfig` now reads `me.team.seriesLimit`, falling back to the default (60) when unset and floored at 1. No upper bound — teams may intentionally fetch more series than the 60 rendered at once (the surplus is available in the series selector). The setting flows through the existing team-settings pipeline (Zod schema, Mongoose model, /me, PATCH /team/clickhouse-settings, ClickhouseSettingForm); no controller or endpoint changes were needed.

karl-power

Just missing a changeset

…Config

Replace the `baseWithClauses` rename and two nested ternaries with a plain `let withClauses`/`let where` plus a single `if (seriesCap)` fold. Behaviour is unchanged: the ranking CTE is still appended to the already-rendered WITH clause (rather than chartConfig.with, which would disable the main SELECT's materialized-column optimization). Byte-identical SQL; all snapshots unchanged.

A multi-column string group-by (e.g. "LogAttributes['cap'],ServiceName") made the series-cap ranking CTE emit an invalid two-argument toString() and a malformed NULL check, because renderSelectList returns a comma-joined string as a single expression. Split it into per-column expressions with splitAndTrimWithBracket (which respects []/()/quotes) so the NULL/empty filter applies per column. Array and single-column-string group-bys are unchanged (snapshots unchanged); covered by a renderChartConfig unit test and a real-CH integration test mirroring the failing Map-access case.
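
The splitting problem described above can be illustrated with a small stand-alone function. This is a sketch of the technique (split on commas only at nesting depth zero and outside quotes), not the actual `splitAndTrimWithBracket` implementation, whose signature and edge-case behaviour are not shown in this PR; `splitTopLevelCommas` is a hypothetical name.

```typescript
// Split a comma-joined expression list on top-level commas only, ignoring commas
// inside (), [] or quoted strings. Each part is trimmed.
function splitTopLevelCommas(input: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let quote: string | null = null;
  let current = '';
  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (quote !== null) {
      current += ch;
      if (ch === '\\' && i + 1 < input.length) {
        current += input[++i]; // keep the escaped character verbatim
      } else if (ch === quote) {
        quote = null; // closing quote
      }
    } else if (ch === "'" || ch === '"' || ch === '`') {
      quote = ch;
      current += ch;
    } else if (ch === '(' || ch === '[') {
      depth++;
      current += ch;
    } else if (ch === ')' || ch === ']') {
      depth = Math.max(0, depth - 1);
      current += ch;
    } else if (ch === ',' && depth === 0) {
      parts.push(current.trim());
      current = '';
    } else {
      current += ch;
    }
  }
  if (current.trim() !== '') parts.push(current.trim());
  return parts;
}

console.log(splitTopLevelCommas("LogAttributes['cap'],ServiceName"));
// [ "LogAttributes['cap']", 'ServiceName' ]
```

With per-column expressions in hand, a NULL filter can be applied to each column separately instead of to the joined string.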

…es cap

The series-cap ranking CTE previously dropped both NULL and empty-string group values. Empty-string groups (e.g. a missing Map key like LogAttributes['x'] -> '') are real data, so silently hiding them was surprising. Now only NULL components are excluded, which is the genuine technical need, since the outer `tuple(...) IN (...)` is NULL-unsafe (transform_null_in=0) and would otherwise waste a top-N slot on a group it can't match. Empty-string groups now compete for a slot like any other value.
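
The rule in this commit message (drop NULL components, keep empty strings) can be sketched as a predicate over group keys. The type and function names here are hypothetical, and the real filter is expressed in SQL inside the CTE rather than in TypeScript.

```typescript
// One value per group-by column; null models a SQL NULL component.
type GroupKeySketch = Array<string | null>;

// Hypothetical predicate: a key may be ranked only if no component is NULL.
// Empty strings are treated as real data and pass through.
function isRankableGroup(key: GroupKeySketch): boolean {
  return key.every(part => part !== null);
}

const candidateKeys: GroupKeySketch[] = [
  ['checkout', ''],   // empty string: competes for a slot
  ['checkout', null], // NULL component: could not match the outer IN filter
  ['', ''],
];
const rankableKeys = candidateKeys.filter(isRankableGroup);
console.log(rankableKeys.length); // 2
```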

If a group-by item carried an alias, renderSelectList appended ` AS "alias"`, which then leaked into tuple(...) and `(... IS NOT NULL)` in the series-cap CTE — both invalid SQL there (the outer GROUP BY tolerates aliases, these positions do not). Strip the alias before building the tuple and null filter, matching how the rank value is already rendered. No alias is set on group-bys today, so this is a defensive fix; covered by a new unit test.
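
A minimal sketch of alias stripping, assuming the alias is rendered as a trailing double-quoted ` AS "alias"` suffix as described above. `stripTrailingAlias` is a hypothetical helper; the PR does not show how the actual code removes the alias.

```typescript
// Remove a trailing ` AS "alias"` suffix, if present. Only a suffix at the very
// end of the expression is touched, so `CAST(x AS String)` is left alone.
function stripTrailingAlias(expr: string): string {
  return expr.replace(/\s+AS\s+"(?:[^"\\]|\\.)*"\s*$/i, '');
}

console.log(stripTrailingAlias('ServiceName AS "svc"')); // ServiceName
console.log(stripTrailingAlias("LogAttributes['cap']")); // LogAttributes['cap']
```

The stripped form is what is safe to place inside `tuple(...)` and `(... IS NOT NULL)`, while the aliased form can remain in the outer SELECT and GROUP BY.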

…rray, metric gating)

Add integration tests against real ClickHouse for the corner cases the recent series-cap fixes target but unit tests can't fully prove:
- NULL group component excluded (NULL-unsafe `tuple() IN (...)` would otherwise waste the only top-N slot and yield an empty chart);
- multi-column *array* group-by with an alias (proves the 2-col tuple()/IN executes and the alias is stripped inside the CTE yet preserved in output).

Add a unit test that a metric source does not emit the series-cap CTE (gating).

Bumps DEFAULT_SERIES_LIMIT 60 -> 100. Since MAX_TIME_CHART_SERIES and HARD_LINES_LIMIT derive from it, this raises both the default query-time cap and the rendered-line cap (and ServicesDashboard's MAX_NUM_SERIES) to 100. Teams can still override the query cap via the seriesLimit setting.

… tooltip

Drop the explanatory comments added across the series-cap changes (keeping only the ones inside renderSeriesLimitCte), and simplify the "Time Chart Series Limit" tooltip to "Maximum number of series fetched per time chart."

greptile-apps · 2026-06-09T19:12:19Z

github-actions Bot added the review/tier-3 Standard — full human review required label Jun 9, 2026

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 01:56 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 01:58 View deployment

greptile-apps Bot reviewed Jun 9, 2026

Comment thread packages/common-utils/src/core/renderChartConfig.ts Outdated

Comment thread packages/common-utils/src/core/renderChartConfig.ts Outdated

github-actions Bot added review/tier-4 Critical — deep review + domain expert sign-off and removed review/tier-3 Standard — full human review required labels Jun 9, 2026

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 02:18 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 02:19 View deployment

wrn14897 force-pushed the sao-paulo branch from 30e5362 to 094729b Compare June 9, 2026 05:01

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 05:03 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 05:04 View deployment

karl-power previously approved these changes Jun 9, 2026

chore: add changeset for time-chart series limit

b08ef0f

wrn14897 dismissed karl-power’s stale review via b08ef0f June 9, 2026 15:10

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 16:00 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 16:01 View deployment

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 16:21 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 16:21 View deployment

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 16:36 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 16:37 View deployment

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 16:41 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 16:43 View deployment

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 16:49 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 16:50 View deployment

knudtty reviewed Jun 9, 2026

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 18:24 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 18:25 View deployment

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 18:44 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 18:45 View deployment

wrn14897 force-pushed the sao-paulo branch 2 times, most recently from b672cf2 to 7d363be Compare June 9, 2026 18:59

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 19:01 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 19:02 View deployment

wrn14897 force-pushed the sao-paulo branch from 7d363be to 00561b1 Compare June 9, 2026 19:04

wrn14897 force-pushed the sao-paulo branch from 00561b1 to 072d3ad Compare June 9, 2026 19:06

vercel Bot deployed to Preview – hyperdx-storybook June 9, 2026 19:08 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 9, 2026 19:09 View deployment

wrn14897 added the automerge label Jun 9, 2026

knudtty approved these changes Jun 10, 2026

Merge branch 'main' into sao-paulo

14e9921

vercel Bot deployed to Preview – hyperdx-storybook June 10, 2026 14:29 View deployment

vercel Bot deployed to Preview – hyperdx-oss June 10, 2026 14:30 View deployment

kodiakhq Bot merged commit 81e524c into main Jun 10, 2026
19 checks passed

kodiakhq Bot deleted the sao-paulo branch June 10, 2026 14:36

kodiakhq[bot] merged 11 commits into main from sao-paulo

3 participants
