fix: unknown lucene field falls through in search #2422
🦋 Changeset detected
Latest commit: 6f2a0c4
The changes in this PR will be included in the next version bump.
This PR includes changesets to release 1 package
🔵 Tier 2: Low Risk
Small, isolated change with no API route or data model modifications.
Why this tier:
Review process: AI review + quick human skim (target: 5–15 min). Reviewer validates AI assessment and checks for domain-specific concerns.
Stats
09e5e96 to a6ed715
E2E Test Results
✅ All tests passed • 197 passed • 3 skipped • 1354s
Tests ran across 4 shards in parallel.
a6ed715 to bd9a8b6
Deep Review
Gating the unknown-field fall-through is the right call, and the implementation is sound: every …
✅ No critical issues found.
🟡 P2 -- recommended
🔵 P3 nitpicks (4)
Reviewers (6): correctness, testing, maintainability, kieran-typescript, security, performance.
Testing gaps:
bd9a8b6 to 4eaae4c
Greptile Summary
This PR fixes a correctness bug in the Lucene-to-SQL serializer where an unresolvable field was emitted as a raw SQL identifier (causing ClickHouse …

Confidence Score: 5/5
Safe to merge: all Lucene call sites now receive the alias set, and unknown fields consistently resolve to the no-match predicate instead of raw SQL identifiers.

The fix is precise and well-bounded: the serializer fall-through is gated on a Set computed from the chart config's own select aliases and expression WITH clauses. Every Lucene-capable call site in renderChartConfig (where, filters, aggCondition, valueExpression, having) now threads the alias set through. Comprehensive tests cover all paths, including the saved-search alert pattern. The null-safety fix for getMaterializedColumnsLookupTable is a strict improvement. No existing behaviour is regressed.

No files require special attention.

Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["Lucene WHERE field lookup"] --> B{Real column or JSON field?}
B -- Yes --> C[Return columnExpression with correct type]
B -- No --> D{field in selectAliases?}
D -- Yes --> E["Return found:true, columnExpression = field"]
D -- No --> F["Return found:false → (1 = 0)"]
G["extractSelectAliases(config)"] --> H[Array-form selects: col.alias]
G --> I[String-form selects: chSqlToAliasMap]
G --> J["WITH clauses (isSubquery === false only)"]
H & I & J --> K["Set passed to CustomSchemaSQLSerializerV2"]
K --> A
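The alias-collection half of the flow above (the extractSelectAliases branch) can be sketched roughly as follows. This is a minimal illustration rather than the PR's code: the SelectItem, SelectList and WithClause shapes, the `name` field, and both function names are assumptions, and aliasesFromSelectString is a naive stand-in for chSqlToAliasMap(), whose real signature is not shown in this PR page.

```typescript
// Hypothetical, simplified shapes; the real chart-config types may differ.
type SelectItem = { valueExpression: string; alias?: string };
type SelectList = string | SelectItem[];
type WithClause = { name: string; isSubquery: boolean };

// Naive stand-in for chSqlToAliasMap(): splits on top-level commas and
// recognises a trailing "<expr> AS <ident>". This is not a real SQL parser.
function aliasesFromSelectString(select: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let current = "";
  for (let i = 0; i < select.length; i++) {
    const ch = select.charAt(i);
    if (ch === "(") depth++;
    if (ch === ")") depth--;
    if (ch === "," && depth === 0) {
      parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current);

  const aliases: string[] = [];
  for (const part of parts) {
    const match = /\s+as\s+([A-Za-z_][A-Za-z0-9_]*)\s*$/i.exec(part);
    const alias = match ? match[1] : undefined;
    if (alias) aliases.push(alias);
  }
  return aliases;
}

// Collects identifiers a Lucene WHERE could reference, from the three
// sources in the flowchart: array-form selects, string-form selects, and
// expression WITH clauses (subquery CTEs are skipped).
function extractSelectAliasesSketch(input: {
  selectLists: SelectList[];
  withClauses?: WithClause[];
}): Set<string> {
  const aliases = new Set<string>();
  for (const list of input.selectLists) {
    if (typeof list === "string") {
      for (const a of aliasesFromSelectString(list)) aliases.add(a);
    } else {
      for (const item of list) {
        if (item.alias) aliases.add(item.alias);
      }
    }
  }
  for (const clause of input.withClauses ?? []) {
    if (!clause.isSubquery) aliases.add(clause.name);
  }
  return aliases;
}
```

In this sketch, a string-form select such as 'Timestamp, ServiceName as service, Body' contributes only service, and a WITH clause contributes its name only when isSubquery is false.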
Reviews (2): Last reviewed commit: "greptile feedback: other call sites were..."
Summary
Why
When a Lucene search field couldn't be resolved to a real column, the CustomSchemaSQLSerializerV2 fall-through emitted it verbatim as a raw SQL identifier (queryParser.ts, the old "// It might be an alias, let's just try the column" branch). That had two problems:
- A mistyped field produced WHERE myTypo = '...', which ClickHouse rejects with a confusing Unknown identifier error instead of simply returning no rows.
- The fall-through was meant for aliases (… WHERE), but it couldn't tell an alias apart from a typo; both were emitted blindly.

WITH-clause aliases matter for saved-search alerts: a SAVED_SEARCH alert's query selects count(), not the saved search's select, so the saved search's select aliases are injected as expression WITH clauses by computeAliasWithClauses (in the alert task, unchanged here). Including those in the alias set keeps a Lucene alert WHERE such as body:wrong (where the saved search declares toString(Body) AS body) resolving to the alias instead of collapsing to (1 = 0). Without this, gating the fall-through would have silently regressed saved-search alerts that reference a select alias in their WHERE: they would stop firing.

This change makes that distinction explicit: a field that matches a known SELECT alias is emitted as a bare identifier; anything genuinely unknown resolves to the no-match predicate (1 = 0).

What changed
- Gate the fall-through on known aliases (queryParser.ts): CustomSchemaConfig / CustomSchemaSQLSerializerV2 gain an optional selectAliases: Set<string>. The resolver now returns the bare identifier only when selectAliases.has(field); otherwise it returns found: false, which renders as (1 = 0).
- Collect aliases from the chart config (renderChartConfig.ts): a new extractSelectAliases({ selectLists, withClauses }) helper gathers the identifiers a Lucene WHERE may legally reference, from three sources:
  - Array-form selects ({ valueExpression, alias }[]): reads alias directly;
  - String-form selects (a defaultTableSelectExpression such as 'Timestamp, ServiceName as service, Body'): parsed via the existing chSqlToAliasMap() helper to recover declared aliases. This matters because the default search/events view uses string-form selects, so without it the default view would lose alias resolution;
  - WITH clauses: contributes a clause's name only when it declares an expression alias (isSubquery === false, i.e. WITH (expr) AS ident). Subquery CTEs (WITH ident AS (subquery)) are excluded, since they name a table-like source rather than a column usable in WHERE.

  The set is threaded through renderWhereExpressionStr into the serializer.
- Null-safety fix (queryParser.ts): the exact-match branch now treats a null/undefined materialized-columns lookup the same as the existing catch path (?? new Map()), so a missing lookup proceeds with no materialized columns instead of throwing on .entries().

Behavior change
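As a rough sketch of the gated fall-through behind the cases in this section: the names below are hypothetical, knownColumns stands in for the real column and JSON-field lookup, and the equality rendering and quoting are simplified assumptions rather than the serializer's actual output.

```typescript
type FieldResolution =
  | { found: true; columnExpression: string }
  | { found: false };

// Hypothetical resolver: real columns win, then known SELECT aliases;
// anything else is reported as not found instead of being emitted as-is.
function resolveFieldSketch(
  field: string,
  knownColumns: Set<string>,
  selectAliases?: Set<string>,
): FieldResolution {
  if (knownColumns.has(field)) {
    return { found: true, columnExpression: field };
  }
  if (selectAliases !== undefined && selectAliases.has(field)) {
    // Known alias: emit the bare identifier.
    return { found: true, columnExpression: field };
  }
  return { found: false };
}

// Simplified rendering of `field:value` as an equality predicate.
// An unresolved field becomes the no-match predicate (1 = 0).
function renderEqualsSketch(resolution: FieldResolution, value: string): string {
  if (!resolution.found) return "(1 = 0)";
  const escaped = value.replace(/'/g, "''");
  return `${resolution.columnExpression} = '${escaped}'`;
}

const columns = new Set(["ServiceName", "Body"]);
const aliases = new Set(["Content"]);

const realColumn = renderEqualsSketch(resolveFieldSketch("ServiceName", columns, aliases), "foo");
const selectAlias = renderEqualsSketch(resolveFieldSketch("Content", columns, aliases), "foo");
const typo = renderEqualsSketch(resolveFieldSketch("myTypo", columns, aliases), "foo");
```

Because selectAliases is optional in this sketch, a caller that does not pass the set gets (1 = 0) for an alias as well, which is why every Lucene call site needs the set threaded through.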
- … (ServiceName:foo)
- … (Content:foo where Body AS Content)
- … (myTypo:foo): (1 = 0) → no rows

Example that now works end-to-end:
SELECT Body AS Content … WHERE Content = '…'

How to test on Vercel preview
Preview routes:
/search
Steps:
References