Commit 88c9f06

matt-aitken and claude committed
fix(webapp): recover from ClickHouse JSON parse failures on out-of-range integers
Second class of poisoned-row failure in the runs replication path. PR #3708 handled lone UTF-16 surrogates; this one handles bare JSON integer literals that exceed ClickHouse's Int64/UInt64 range.

ClickHouse's `JSON(max_dynamic_paths=...)` column fits each bare integer token into Int64 (signed) or UInt64 (unsigned). Bare integers strictly outside `[-2^63, 2^64 - 1]` are rejected with `INCORRECT_DATA` (no silent fallback to Float64). JS Numbers that are integer-valued but above `Number.MAX_SAFE_INTEGER` still serialise via JSON.stringify as bare integer tokens (no exponent) while `|value| < 1e21`, so any such Number whose token falls outside that range lands on the wire as a token CH cannot accept.

Customer-facing symptom: `scan-social-profiles` runs continued to be stranded in `EXECUTING` on the Tasks page even after the surrogate fix landed. CloudWatch showed `Dropped batch — ClickHouse JSON parse error but sanitizer found nothing to fix` firing 8/8 times since the previous deploy.

Root cause: upstream JS Number precision loss on a 21-digit Google Plus ID (`117039831458782873093` → `117039831458782870000`). The precision-lossy value still serialises as a bare integer that exceeds UInt64.MAX, which CH rejects. Reproduced end-to-end against ClickHouse 25.12.11.4 in Docker with the exact `Cannot parse JSON object here` error from prod.

`apps/webapp/app/v3/eventRepository/sanitizeRowsOnParseError.server.ts`:
- New private `isUnsafeJsonInteger(value)` helper: true iff value is a finite, integer-valued JS Number where `|value| < 1e21` (i.e. JSON.stringify emits integer form, not exponent) AND `value` falls outside `[Int64.MIN, UInt64.MAX]`.
- `sanitizeUnknownInPlace` gains a number-branch: when the predicate holds, replace the Number with its string form. CH's dynamic JSON column accepts a `String` subtype on the same path, so the row inserts cleanly on retry. The numeric value was already precision-lossy upstream (JS Number can't represent integers above 2^53 faithfully), so type-flipping to string is information-preserving relative to what arrived.
- Numbers with a fractional part, numbers at or above 1e21 in magnitude, NaN, and Infinity are left alone: JSON.stringify emits them with a decimal point, with an exponent, or as `null`, all of which CH accepts.

Recovery stays purely reactive, with no extra cost on the hot replication path. The sanitizer only runs after a ClickHouse parse-error rejection, so healthy rows pay nothing.

`apps/webapp/test/sanitizeRowsOnParseError.test.ts`: five new unit tests covering positive and negative out-of-range integers, boundary values (MAX_SAFE_INTEGER, 2^63), non-integer numbers, and the actual `scan-social-profiles` nested shape with `gp_id: 117039831458782870000`. Plus an extension to `sanitizeRows` that verifies surrogate and integer fixes are counted together across rows.

`.server-changes/runs-replication-bigint-recovery.md`: release notes.

Refs TRI-9755.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
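
The serialisation behaviour described above can be checked without ClickHouse. The following is a minimal sketch, not code from this commit: `digitsGreaterThan` and `tokenFitsInt64OrUInt64` are hypothetical helpers that compare the decimal token `JSON.stringify` produces against the `[-2^63, 2^64 - 1]` range using plain string comparison, so no precision is lost in the check itself.

```typescript
// Illustrative only: these helper names are hypothetical, not part of the commit.
const UINT64_MAX_DIGITS = "18446744073709551615"; // 2^64 - 1
const INT64_MIN_DIGITS = "9223372036854775808"; // magnitude of -2^63

/** Compares two non-negative decimal digit strings that have no leading zeros. */
function digitsGreaterThan(a: string, b: string): boolean {
  if (a.length !== b.length) return a.length > b.length;
  return a > b; // equal length: lexicographic order matches numeric order
}

/** True when a bare JSON integer token lies inside [-2^63, 2^64 - 1]. */
function tokenFitsInt64OrUInt64(token: string): boolean {
  if (!/^-?(0|[1-9][0-9]*)$/.test(token)) return false; // not a bare integer token
  return token.startsWith("-")
    ? !digitsGreaterThan(token.slice(1), INT64_MIN_DIGITS)
    : !digitsGreaterThan(token, UINT64_MAX_DIGITS);
}

const prodToken = JSON.stringify(117039831458782870000); // "117039831458782870000"
const prodFits = tokenFitsInt64OrUInt64(prodToken); // false: 21 digits, above UInt64.MAX
const exponentToken = JSON.stringify(1e21); // "1e+21": no longer a bare integer token
const twoPow63Fits = tokenFitsInt64OrUInt64(JSON.stringify(2 ** 63)); // true: fits UInt64
```

Checking the emitted token rather than the Number sidesteps float64 rounding of the range limits, which is why this sketch works on strings.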
1 parent eefb96c commit 88c9f06

3 files changed

Lines changed: 176 additions & 0 deletions

.server-changes/runs-replication-bigint-recovery.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+---
+area: webapp
+type: fix
+---
+
+Extend the runs-replication sanitizer (`sanitizeUnknownInPlace`) to detect
+JS Numbers that JSON-serialise as bare integer tokens outside the
+Int64..UInt64 range and replace them with their string form, so a
+following retry insert no longer trips ClickHouse's
+`INCORRECT_DATA` parser failure on `JSON(max_dynamic_paths)` columns.
+
+This is the second class of poisoned-row failure that was stranding
+`scan-social-profiles` runs in `EXECUTING` on the Tasks page even after
+the UTF-16 surrogate fix (#3708 / TRI-9755). Root cause: upstream JS
+Number precision loss on a 21-digit Google Plus ID
+(`117039831458782873093` → `117039831458782870000`); the precision-lossy
+value still serialises as a bare integer that exceeds UInt64.MAX,
+which CH's JSON column rejects with `Cannot parse JSON object here`.
+
+Recovery stays purely reactive (no extra cost on the hot replication
+path); the sanitizer only runs after a ClickHouse parse-error rejection.
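
The reactive flow described in these notes can be sketched as follows. This is an illustration under stated assumptions rather than the webapp's actual replication code: `insertWithReactiveRecovery`, `isJsonParseError`, and the `Outcome` type are hypothetical, the insert is synchronous for brevity (a real ClickHouse client would be async), and matching on the error message is an assumption about how the rejection is detected. The "dropped" outcome mirrors the `Dropped batch` log line quoted in the commit message.

```typescript
// Hypothetical names throughout; synchronous for brevity (a real client would be async).
type Row = Record<string, unknown>;
type SanitizeResult = { rowsTouched: number; fieldsSanitized: number };
type Outcome = "inserted" | "recovered" | "dropped";

/** Assumption: the parse rejection can be recognised from the error message. */
function isJsonParseError(err: unknown): boolean {
  return err instanceof Error && err.message.includes("Cannot parse JSON object here");
}

function insertWithReactiveRecovery(
  rows: Row[],
  insert: (rows: Row[]) => void,
  sanitize: (rows: Row[]) => SanitizeResult
): Outcome {
  try {
    insert(rows); // healthy batches never touch the sanitizer
    return "inserted";
  } catch (err) {
    if (!isJsonParseError(err)) throw err; // unrelated failures propagate
    const { fieldsSanitized } = sanitize(rows); // mutates rows in place
    if (fieldsSanitized === 0) return "dropped"; // nothing to fix, so a retry would fail again
    insert(rows); // single retry with the sanitized rows
    return "recovered";
  }
}
```

Because the sanitizer sits only in the `catch` branch, its cost scales with the number of rejected batches, not with total throughput.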

apps/webapp/app/v3/eventRepository/sanitizeRowsOnParseError.server.ts

Lines changed: 36 additions & 0 deletions
@@ -7,6 +7,38 @@ import { detectBadJsonStrings } from "~/utils/detectBadJsonStrings";
  */
 export const INVALID_UTF16_SENTINEL = "[invalid-utf16]";
 
+/**
+ * ClickHouse's `JSON(max_dynamic_paths)` column fits each bare-integer
+ * JSON token into Int64 (signed) or UInt64 (unsigned). Bare integers
+ * outside `[-2^63, 2^64 - 1]` are rejected with `INCORRECT_DATA` (no
+ * silent fallback to Float64). `JSON.stringify` emits any integer-valued
+ * Number with `|value| < 1e21` as a bare integer (no exponent), so a JS
+ * Number above ~1.8e19 or below ~-9.2e18 (and under 1e21 in magnitude)
+ * lands on the wire as a token CH cannot accept.
+ *
+ * The fix: replace such Numbers with their string form. CH's dynamic
+ * JSON column accepts a `String` subtype on the same path, so the row
+ * inserts cleanly on retry. The numeric value was already
+ * precision-lossy upstream (JS Number can't represent integers above
+ * 2^53 faithfully), so type-flipping to string is information-preserving
+ * relative to what arrived.
+ *
+ * Numbers with a fractional part serialise with a decimal point, and
+ * very large ones like `1e25` serialise with an exponent; CH accepts
+ * both, so they're left alone.
+ */
+function isUnsafeJsonInteger(value: number): boolean {
+  if (!Number.isFinite(value)) return false;
+  if (!Number.isInteger(value)) return false;
+  // JSON.stringify emits integer-valued Numbers as bare integer tokens
+  // (no exponent) only while `|value| < 1e21`; at or above that
+  // threshold `Number.prototype.toString` switches to exponential form,
+  // which CH accepts as Float64 at any magnitude. So the dangerous band
+  // lies between the Int64/UInt64 boundary and 1e21.
+  if (Math.abs(value) >= 1e21) return false;
+  return value >= 2 ** 64 || value <= -(2 ** 63); // inclusive: 2^64 and -2^63 both stringify to out-of-range tokens
+}
+
 export type SanitizeResult = {
   /** How many rows had at least one string field replaced. */
   rowsTouched: number;
@@ -62,6 +94,10 @@ export function sanitizeUnknownInPlace(value: unknown): { value: unknown; fixed:
     return { value, fixed: 0 };
   }
 
+  if (typeof value === "number" && isUnsafeJsonInteger(value)) {
+    return { value: String(value), fixed: 1 };
+  }
+
   if (Array.isArray(value)) {
     let fixed = 0;
     for (let i = 0; i < value.length; i++) {
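
The diff shows only the new number branch, so here is a self-contained sketch of how such a branch composes with an in-place recursive walk. It is not the file's actual implementation: `rendersOutOfRange` and `stringifyUnsafeIntegers` are hypothetical names, and the surrogate handling of the real sanitizer is left out. The bounds are written inclusively as an assumption of this sketch, because `JSON.stringify(2 ** 64)` yields `18446744073709552000` and `JSON.stringify(-(2 ** 63))` yields `-9223372036854776000`, both of which fall outside `[-2^63, 2^64 - 1]`.

```typescript
type Fixed = { value: unknown; fixed: number };

/** Hypothetical predicate: true when the Number stringifies to an out-of-range integer token. */
function rendersOutOfRange(n: number): boolean {
  if (!Number.isInteger(n)) return false; // also rejects NaN and +/-Infinity
  if (Math.abs(n) >= 1e21) return false; // serialised in exponent form
  return n >= 2 ** 64 || n <= -(2 ** 63); // inclusive float64 bounds (assumption of this sketch)
}

/** Walks arrays and plain objects in place, replacing offending Numbers with strings. */
function stringifyUnsafeIntegers(value: unknown): Fixed {
  if (typeof value === "number") {
    return rendersOutOfRange(value) ? { value: String(value), fixed: 1 } : { value, fixed: 0 };
  }
  if (Array.isArray(value)) {
    let fixed = 0;
    for (let i = 0; i < value.length; i++) {
      const r = stringifyUnsafeIntegers(value[i]);
      value[i] = r.value;
      fixed += r.fixed;
    }
    return { value, fixed };
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    let fixed = 0;
    for (const key of Object.keys(obj)) {
      const r = stringifyUnsafeIntegers(obj[key]);
      obj[key] = r.value;
      fixed += r.fixed;
    }
    return { value, fixed };
  }
  return { value, fixed: 0 }; // strings, booleans, null, undefined pass through
}
```

Returning a `{ value, fixed }` pair lets a top-level primitive be replaced by the caller while nested containers are mutated in place, which matches the shape of the return type visible in the hunk header.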

apps/webapp/test/sanitizeRowsOnParseError.test.ts

Lines changed: 119 additions & 0 deletions
@@ -105,6 +105,93 @@ describe("sanitizeUnknownInPlace", () => {
     expect(sanitizeUnknownInPlace(null)).toEqual({ value: null, fixed: 0 });
     expect(sanitizeUnknownInPlace(undefined)).toEqual({ value: undefined, fixed: 0 });
   });
+
+  // ─── Out-of-range integers (TRI-9755) ──────────────────────────────────────
+  // ClickHouse's JSON(max_dynamic_paths) column rejects bare integer tokens
+  // outside [Int64.MIN, UInt64.MAX]. Such Numbers serialise as bare integer
+  // form via JSON.stringify (no exponent, since |value| < 1e21) so they reach
+  // ClickHouse as unquoted oversized ints. Sanitizer replaces them with the
+  // string form, which ClickHouse's dynamic JSON column accepts as a String
+  // subtype on that path.
+
+  it("replaces an integer-valued Number above UInt64.MAX with its string form", () => {
+    // 117039831458782870000 is the actual prod value (Google Plus ID after
+    // upstream JS-Number precision loss from 117039831458782873093).
+    const result = sanitizeUnknownInPlace(117039831458782870000);
+    expect(result.value).toBe("117039831458782870000");
+    expect(result.fixed).toBe(1);
+  });
+
+  it("replaces an integer-valued Number below Int64.MIN with its string form", () => {
+    // -9223372036854775809 is the first failing negative, but in float64 it
+    // rounds to the same representation as Int64.MIN (-9223372036854775808),
+    // so we check a clearly out-of-range negative instead.
+    const result = sanitizeUnknownInPlace(-1e20);
+    expect(result.value).toBe("-100000000000000000000");
+    expect(result.fixed).toBe(1);
+  });
+
+  it("leaves safe integers and boundary values untouched", () => {
+    // 42: safe integer
+    expect(sanitizeUnknownInPlace(42)).toEqual({ value: 42, fixed: 0 });
+    // Number.MAX_SAFE_INTEGER (2^53 - 1): JSON.stringify still emits it as an integer
+    expect(sanitizeUnknownInPlace(Number.MAX_SAFE_INTEGER)).toEqual({
+      value: Number.MAX_SAFE_INTEGER,
+      fixed: 0,
+    });
+    // 2^63 (Int64.MAX + 1): still fits in UInt64, CH accepts it
+    expect(sanitizeUnknownInPlace(2 ** 63)).toEqual({ value: 2 ** 63, fixed: 0 });
+  });
+
+  it("leaves non-integer numbers untouched (floats, NaN, Infinity)", () => {
+    // Numbers with a fractional part: emitted with `.` in JSON
+    expect(sanitizeUnknownInPlace(3.14)).toEqual({ value: 3.14, fixed: 0 });
+    // Very large float-form (>= 1e21): JSON.stringify uses exponent form,
+    // CH parses as Float64 successfully
+    expect(sanitizeUnknownInPlace(1e25)).toEqual({ value: 1e25, fixed: 0 });
+    // NaN / Infinity: JSON.stringify emits `null`, so harmless on the wire
+    expect(sanitizeUnknownInPlace(Number.NaN)).toEqual({ value: Number.NaN, fixed: 0 });
+    expect(sanitizeUnknownInPlace(Number.POSITIVE_INFINITY)).toEqual({
+      value: Number.POSITIVE_INFINITY,
+      fixed: 0,
+    });
+  });
+
+  it("finds an oversized integer nested deep inside the actual scan-social-profiles shape", () => {
+    const row = {
+      output: {
+        data: {
+          profiles: [
+            { module: "linktree", query: "x@example.com" },
+            {
+              module: "poshmark",
+              spec_format: [
+                {
+                  platform_variables: [
+                    {
+                      key: "gp_id",
+                      proper_key: "Gp Id",
+                      // The actual prod value: bare JSON integer > UInt64.MAX
+                      value: 117039831458782870000,
+                      type: "int",
+                    },
+                  ],
+                },
+              ],
+            },
+          ],
+        },
+      },
+    };
+    const result = sanitizeUnknownInPlace(row);
+    expect(result.fixed).toBe(1);
+    expect(
+      (row.output.data.profiles[1].spec_format![0].platform_variables[0] as any).value
+    ).toBe("117039831458782870000");
+    // Untouched neighbours
+    expect(row.output.data.profiles[0].module).toBe("linktree");
+    expect(row.output.data.profiles[1].spec_format![0].platform_variables[0].type).toBe("int");
+  });
 });
 
 describe("sanitizeRows", () => {
@@ -158,4 +245,36 @@ describe("sanitizeRows", () => {
     expect(result.rowsTouched).toBe(1);
     expect(result.fieldsSanitized).toBe(2);
   });
+
+  it("counts surrogate fixes and out-of-range integer fixes together (TRI-9755)", () => {
+    const rows = [
+      {
+        id: "r0",
+        attributes: {
+          surrogate: `bad ${HIGH_SURROGATE}`,
+          bigint: 117039831458782870000,
+          clean: "fine",
+          safe: 42,
+        },
+      },
+      {
+        id: "r1",
+        attributes: {
+          bigint: -1e20,
+          clean: "still fine",
+        },
+      },
+      {
+        id: "r2",
+        attributes: { clean: "no fixes needed" },
+      },
+    ];
+    const result = sanitizeRows(rows);
+    expect(result.rowsTouched).toBe(2);
+    expect(result.fieldsSanitized).toBe(3);
+    expect(rows[0].attributes.surrogate).toBe(INVALID_UTF16_SENTINEL);
+    expect(rows[0].attributes.bigint).toBe("117039831458782870000");
+    expect(rows[0].attributes.safe).toBe(42);
+    expect(rows[1].attributes.bigint).toBe("-100000000000000000000");
+  });
 });
