feat: use text index to power filters and autocomplete #2376
🔴 Tier 4: Critical
Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.
Why this tier:
Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
Stats
E2E Test Results
✅ All tests passed • 191 passed • 3 skipped • 1310s
Tests ran across 4 shards in parallel.
Deep Review
🔴 P0/P1: must fix
🟡 P2: recommended
🔵 P3 nitpicks (19)
Reviewers (7): adversarial, api-contract, correctness, kieran-typescript, maintainability, performance, project-standards, reliability, testing
Testing gaps:
Greptile Summary
This PR wires the new
Confidence Score: 4/5
The text-index fast path and the three-level cascade (text-index → MV → scan) are architecturally sound, but open items from earlier review rounds should be resolved before merging: the incomplete time-range overlap predicate in partsOverlapFilter, the absent caching in the new getMapValues, and the removal of attribute-map branches from the MV for fresh CH < 26.3 deployments.
The cascade fallback logic and the new getMapColumnTextIndexes guard (version check + Distributed check) are correct. Two main concerns remain from the review history. First, the system.parts overlap filter misses the 'part fully contains query range' case, which causes the text-index path to silently under-read on any query whose window is fully bracketed by a single large part. Second, rollup coverage for attribute map columns has been removed on CH < 26.3 deployments, where no text-index path exists.
packages/common-utils/src/core/metadata.ts: the partsOverlapFilter predicate and the absence of caching in getMapValues.
docker/otel-collector/schema/seed/00006_otel_logs_rollups.sql: the MV now omits attribute map columns, which matters for fresh deployments below CH 26.3.
Important Files Changed
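To make the 'part fully contains query range' concern concrete, here is a minimal sketch of the interval logic involved. It is not the code from metadata.ts: the `TimeRange` shape and both function names are hypothetical, and the "endpoint inside" variant is only an assumed example of how an overlap predicate can miss that case.

```typescript
// Hypothetical shape: a closed time interval in epoch milliseconds.
interface TimeRange {
  min: number;
  max: number;
}

// Assumed example of an incomplete check: it matches only parts that have
// at least one endpoint inside the query window.
function hasEndpointInside(part: TimeRange, query: TimeRange): boolean {
  const minInside = part.min >= query.min && part.min <= query.max;
  const maxInside = part.max >= query.min && part.max <= query.max;
  return minInside || maxInside;
}

// Complete check: two closed intervals overlap exactly when each one
// starts no later than the other one ends.
function overlaps(part: TimeRange, query: TimeRange): boolean {
  return part.min <= query.max && part.max >= query.min;
}

// A single large part that fully brackets a narrow query window.
const largePart: TimeRange = { min: 0, max: 1000 };
const narrowQuery: TimeRange = { min: 400, max: 600 };

const endpointResult = hasEndpointInside(largePart, narrowQuery);
const overlapResult = overlaps(largePart, narrowQuery);
```

With these inputs the endpoint-based check rejects the part while the two-comparison check accepts it, which is the under-read scenario the review describes.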
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[getMapKeys / getMapValues / getAllKeyValues] --> B{getMapColumnTextIndexes\nCH >= 26.3 & local table?}
B -- No index / CH < 26.3 --> D{metadataMVs\n& dateRange?}
B -- keysIndex exists --> C1[mergeTreeTextIndex\ntoken AS key]
B -- itemsIndex exists --> C2[mergeTreeTextIndex\nsplitByString sep token]
C1 -- empty --> C2
C2 -- empty --> D
C1 -- results --> Z[Return results]
C2 -- results --> Z
D -- yes --> E[kvRollupTable\nfiltered by bucket]
D -- no --> F[Main-table scan\nSELECT DISTINCT col key]
E -- empty --> F
E -- results --> Z
F --> Z
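The cascade in the flowchart can be read as "try each available source in order and return the first non-empty result". The sketch below illustrates that reading only. `Lookup`, `CascadeInput`, and `resolveWithFallback` are hypothetical names, not the PR's API, and the sketch assumes that an unavailable source is simply skipped.

```typescript
// A lookup is any async source of candidate keys or values.
type Lookup = () => Promise<string[]>;

// Hypothetical input: the sources available for this column and query.
interface CascadeInput {
  keysIndex?: Lookup; // text index over map keys, if one exists
  itemsIndex?: Lookup; // text index over key/value items, if one exists
  rollup?: Lookup; // rollup table, only with metadataMVs and a dateRange
  scan: Lookup; // main-table scan, always available
}

// Tries each available source in order and returns the first non-empty result.
// The scan runs last, so its result is returned even when it is empty.
async function resolveWithFallback(input: CascadeInput): Promise<string[]> {
  const ordered = [input.keysIndex, input.itemsIndex, input.rollup].filter(
    (lookup): lookup is Lookup => lookup !== undefined,
  );
  for (const lookup of ordered) {
    const rows = await lookup();
    if (rows.length > 0) return rows;
  }
  return input.scan();
}
```

One consequence of this shape is that an empty result from a fast path is treated as "no answer" rather than "no data", so a genuinely empty column pays for every stage down to the scan.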
Reviews (4): Last reviewed commit: "fix fallback behavior"
Summary
The new text indexes can power filters and autocomplete and take load off the metadataMVs. So let's do it!
References