RFC: compress serialized payload refs — zstd (gzip fallback), specVersion 5#2394
Add a composable 'gzip' format prefix layer to the serialization pipeline (compress before encrypt: encr(gzip(devl))), cutting stored payload bytes by ~70-87% on real-world-style workloads. Compression is gated on run specVersion 5 (new SPEC_VERSION_SUPPORTS_COMPRESSION) and on target-deployment capabilities for cross-deployment writes; payloads under 1KB or that don't compress meaningfully are stored unchanged. Reads dispatch on the format prefix so both compressed and uncompressed data are always readable. WORKFLOW_DISABLE_COMPRESSION=1 disables writes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
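The write-side rules in this description (skip payloads under 1 KB, keep the original bytes when compression does not pay off, dispatch reads on the prefix) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: it uses node:zlib's synchronous gzip instead of CompressionStream for brevity, assumes the format prefix is a 4-byte ASCII tag, uses the 5% savings threshold mentioned in review, and `compressIfWorthwhile`, `decompressIfPrefixed`, and `hasPrefix` are hypothetical names.

```typescript
import { gzipSync, gunzipSync } from 'node:zlib';

const COMPRESSION_MIN_BYTES = 1024; // 1 KB floor (value assumed from "under 1KB")
const MIN_SAVINGS = 0.05; // keep the original unless at least 5% is saved
const GZIP_PREFIX = new TextEncoder().encode('gzip'); // assumed 4-byte ASCII tag

function hasPrefix(data: Uint8Array, prefix: Uint8Array): boolean {
  return data.length >= prefix.length && prefix.every((b, i) => data[i] === b);
}

// Write side: returns the input unchanged when compression is not worthwhile.
function compressIfWorthwhile(data: Uint8Array): Uint8Array {
  if (data.length < COMPRESSION_MIN_BYTES) return data;
  const body = gzipSync(data);
  const total = GZIP_PREFIX.length + body.length;
  if (total > data.length * (1 - MIN_SAVINGS)) return data;
  const out = new Uint8Array(total);
  out.set(GZIP_PREFIX, 0);
  out.set(body, GZIP_PREFIX.length);
  return out;
}

// Read side: dispatch on the prefix, so unprefixed data passes through.
// Assumes every stored payload starts with some format tag (e.g. 'devl'),
// so an uncompressed payload never begins with the bytes 'gzip' by accident.
function decompressIfPrefixed(data: Uint8Array): Uint8Array {
  if (!hasPrefix(data, GZIP_PREFIX)) return data;
  return new Uint8Array(gunzipSync(data.subarray(GZIP_PREFIX.length)));
}
```

Because the read side only looks at the prefix, data written before compression existed and data written with the kill switch set both take the pass-through branch.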
🦋 Changeset detected. Latest commit: 2df737a. The changes in this PR will be included in the next version bump. This PR includes changesets to release 20 packages.
📊 Benchmark Results (result tables not captured). Each scenario was run on 💻 Local Development and ▲ Production (Vercel), with 🔍 Observability links for Express, Next.js (Turbopack), and Nitro:
- workflow with no steps; workflow with 1 step; workflow with 10, 25, and 50 sequential steps
- Promise.all with 10, 25, and 50 concurrent steps
- Promise.race with 10, 25, and 50 concurrent steps
- workflow with 10, 25, and 50 sequential data payload steps (10KB)
- workflow with 10, 25, and 50 concurrent data payload steps (10KB)
- Stream Benchmarks (includes TTFB metrics): workflow with stream; stream pipeline with 5 transform steps (1MB); 10 parallel streams (1MB each); fan-out fan-in 10 streams (1MB each)
Summary: Fastest Framework by World and Fastest World by Framework (winner determined by most benchmark wins).
🧪 E2E Test Results: ✅ All tests passed
Details by Category: ✅ ▲ Vercel Production, ✅ 💻 Local Development, ✅ 📦 Local Production, ✅ 🐘 Local Postgres, ✅ 🪟 Windows, ✅ 📋 Other
Pull request overview
This PR introduces a new composable gzip serialization format prefix in @workflow/core to gzip-compress serialized payload refs before encryption, gated behind specVersion 5 to keep compatibility explicit. It also bumps @workflow/world’s SPEC_VERSION_CURRENT to 5 and wires compression gating through runtime write paths (start args, step I/O, errors, hooks), including cross-deployment capability probing.
Changes:
- Add a gzip compression/decompression layer to the serialization pipeline and integrate it across workflow/step/client serialization + hydration paths.
- Bump spec version to 5 (SPEC_VERSION_SUPPORTS_COMPRESSION) and export it from @workflow/world, with unit tests validating the compatibility contract.
- Add capability-table gating for cross-deployment payload writes, plus benchmarks and test coverage for compression behavior.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/world/src/spec-version.ts | Adds spec v5 constant for compression and bumps SPEC_VERSION_CURRENT to 5. |
| packages/world/src/spec-version.test.ts | Tests new spec constants and requiresNewerWorld contract (incl. v4-reader simulation). |
| packages/world/src/index.ts | Re-exports SPEC_VERSION_SUPPORTS_COMPRESSION. |
| packages/core/src/workflow.ts | Gates workflow return-value payload compression on run specVersion >= 5. |
| packages/core/src/serialization/types.ts | Adds SerializationFormat.GZIP format prefix constant. |
| packages/core/src/serialization/step.ts | Applies compress-before-encrypt on step serialization; decompress-after-decrypt on reads. |
| packages/core/src/serialization/index.ts | Re-exports compression utilities from serialization module. |
| packages/core/src/serialization/compression.ts | Implements composable gzip wrapper + conditional compression thresholds/kill switch. |
| packages/core/src/serialization/compression.test.ts | Adds tests for compression layer behavior, nesting with encryption, hydration, and capability gating. |
| packages/core/src/serialization/codec.ts | Adds compression?: boolean option to write-side serialization options. |
| packages/core/src/serialization/client.ts | Applies compress-before-encrypt and decompress-after-decrypt for client-mode serializer. |
| packages/core/src/serialization.ts | Threads compression flag through dehydrate APIs; adds read-side decompression where appropriate. |
| packages/core/src/serialization-format.ts | Adds browser-safe gzip detection and sync/async hydration support for gzip-prefixed values. |
| packages/core/src/runtime/suspension-handler.ts | Gates step argument/metadata compression on run specVersion >= 5. |
| packages/core/src/runtime/step-handler.ts | Gates step error/output compression on step entity specVersion >= 5. |
| packages/core/src/runtime/step-handler.test.ts | Updates expectations for new dehydrateStepError call signature/args. |
| packages/core/src/runtime/step-executor.ts | Threads run specVersion into step execution and gates compression accordingly. |
| packages/core/src/runtime/start.ts | Adds cross-deployment capability probing for gzip support; compresses workflow args when safe. |
| packages/core/src/runtime/resume-hook.ts | Gates hook payload compression on run specVersion and deployment capabilities. |
| packages/core/src/runtime.ts | Gates run error compression where run specVersion is available; threads run specVersion into step execution. |
| packages/core/src/capabilities.ts | Adds gzip to FORMAT_VERSION_TABLE with min core version gating. |
| packages/core/scripts/benchmark-compression.mjs | Adds deterministic benchmark script to measure compression savings on representative workloads. |
| .changeset/gzip-ref-compression-world.md | Changeset for @workflow/world spec version bump to 5. |
| .changeset/gzip-ref-compression-core.md | Changeset for @workflow/core gzip-compressed serialized payload refs feature. |
/**
 * Whether the current runtime can compress/decompress. CompressionStream
 * is a web standard available in Node.js 18+, browsers, and edge
 * runtimes; this guard exists for exotic runtimes only.
 */
function isCompressionAvailable(): boolean {
  return (
    typeof CompressionStream === 'function' &&
    typeof DecompressionStream === 'function'
  );
}
Fixed in 2bc368c. Split the availability checks: decompress()'s gzip branch now gates on isGzipDecompressAvailable() (only DecompressionStream), while compress() uses isGzipCompressAvailable() (CompressionStream). Reads no longer require compression support.
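A rough sketch of what the split checks might look like. The two function names come from the reply above; the bodies are an assumption (the actual commit is not shown here), and the globals are looked up on `globalThis` so the sketch also loads on runtimes that lack them.

```typescript
// Look the constructors up on globalThis so a missing global yields
// `undefined` instead of a ReferenceError on exotic runtimes.
const g = globalThis as unknown as Record<string, unknown>;

/** Reads only need DecompressionStream. */
function isGzipDecompressAvailable(): boolean {
  return typeof g.DecompressionStream === 'function';
}

/** Writes only need CompressionStream. */
function isGzipCompressAvailable(): boolean {
  return typeof g.CompressionStream === 'function';
}
```

With the checks split this way, a runtime that can inflate but not deflate still reads gzip-prefixed payloads and simply writes uncompressed.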
if (!(data instanceof Uint8Array)) return data;
if (peekFormatPrefix(data) !== SerializationFormat.GZIP) return data;

if (!isCompressionAvailable()) {
Done in 2bc368c — the gzip read path now checks DecompressionStream only (see the sibling thread). Thanks!
/**
 * Synchronously gunzip a payload when running on Node.js.
 *
 * This module is browser-safe, so `node:zlib` is resolved dynamically via
 * `process.getBuiltinModule` (no static Node dependency, invisible to
 * browser bundlers). Returns `undefined` when sync decompression isn't
 * available in the current runtime — callers fall back to leaving the
 * data un-hydrated (the async `hydrateDataWithKey` path handles
 * decompression in browsers via `DecompressionStream`).
 */
Clarified in 2bc368c. Renamed to decompressSyncIfAvailable with a doc that states it's best-effort, not "always on Node": it inflates only when process.getBuiltinModule('node:zlib') resolves and exposes the needed fn (gunzipSync; zstdDecompressSync on Node ≥ 22.15), and returns undefined otherwise (browser/edge, Node without getBuiltinModule, or zstd on Node < 22.15) — callers then fall back to the async path.
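The best-effort behaviour described in this reply could look roughly like the sketch below. `decompressSyncIfAvailable` is the name given above; the signature, the `ZlibLike` type, and the `resolveZlib` helper are assumptions made for illustration.

```typescript
// Only the zlib functions this sketch needs; both are optional because
// zstdDecompressSync is missing on Node < 22.15.
type ZlibLike = {
  gunzipSync?: (buf: Uint8Array) => Uint8Array;
  zstdDecompressSync?: (buf: Uint8Array) => Uint8Array;
};

type ProcessLike = { getBuiltinModule?: (id: string) => unknown };

// Resolve node:zlib without a static import, so browser bundlers never see it.
function resolveZlib(): ZlibLike | undefined {
  const proc = (globalThis as unknown as { process?: ProcessLike }).process;
  if (typeof proc?.getBuiltinModule !== 'function') return undefined;
  return proc.getBuiltinModule('node:zlib') as ZlibLike | undefined;
}

// Returns undefined when the runtime cannot inflate synchronously;
// the caller is then expected to fall back to an async path.
function decompressSyncIfAvailable(
  codec: 'gzip' | 'zstd',
  body: Uint8Array,
): Uint8Array | undefined {
  const zlib = resolveZlib();
  const fn = codec === 'gzip' ? zlib?.gunzipSync : zlib?.zstdDecompressSync;
  if (typeof fn !== 'function') return undefined;
  return new Uint8Array(fn(body));
}
```

Returning `undefined` rather than throwing keeps the decision about falling back with the caller, which matches the "best-effort" wording in the reply.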
TooTallNate
left a comment
Approve — the right design at the right seam; three concrete convergence asks with the snapshot-runtime compression work
I built the branch, ran all the suites locally (20 compression tests, 55 world, 1209 core — all green), and reproduced the benchmark table exactly (deterministic/seeded as claimed, including the 402.1 KB → 108.4 KB simulated agent run). The design fundamentals are correct and well-argued:
- SDK-side, pre-encryption placement is the only placement that works: encr(gzip(devl)), verified in step.ts/client.ts where compress() runs between prefixing and encryptData(). This matches the layering the snapshot-runtime compression work (#1300) arrived at independently, for the same reason: ciphertext doesn't compress.
- The conditional logic is production-minded: 1 KB floor, ≥5% savings requirement (so already-compressed binary never pays a permanent decompression tax), env kill switch that only affects writes. The benchmark's "random binary still wins 24.7%" result, gzip clawing back devalue's base64 4/3× penalty almost exactly, is a great catch and a real argument for compressing even apparently-incompressible payloads.
- pipeThroughTransform gets the classic CompressionStream deadlock right (write-before-read with a handled mirrored rejection); large payloads would wedge a naive await write(); read() sequence.
- specVersion 5 as the compatibility contract is the honest mechanism: old SDKs get a typed RunNotSupportedError up front instead of per-payload format errors. I verified the gating at every call site in the table (start's probe-AND-spec gate, runSpecVersion threading through StepExecutorParams, resume-hook's target-spec-AND-capability gate matching the encr precedent). Community worlds are unaffected since start() derives the run's spec from world.specVersion: a spec-4 world keeps creating spec-4 runs with no compression. The local-dev writer-stamped V1-step-handler edge is honestly documented and matches the existing encr/framing behavior.
- The TODO(release) cutoff (5.0.0-beta.16) is correct as of today (beta.15 shipped yesterday), and the marker pattern has already proven its worth twice on the framing PR.
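For readers unfamiliar with the deadlock mentioned in the pipeThroughTransform bullet: a transform stream applies backpressure, so awaiting the write before anything reads the output can stall once the internal buffers fill. A minimal sketch of the safe ordering follows; it is an illustration of the pattern, not the PR's implementation, and the parameter types are assumptions.

```typescript
async function pipeThroughTransform(
  input: Uint8Array,
  transform: {
    readable: ReadableStream<Uint8Array>;
    writable: WritableStream<Uint8Array>;
  },
): Promise<Uint8Array> {
  const writer = transform.writable.getWriter();
  // Start the write but do NOT await it yet: the transform may refuse to
  // accept more input until its output is being drained.
  const written = writer.write(input).then(() => writer.close());
  // Attach a handler now so a write-side failure is not reported as an
  // unhandled rejection while we are busy reading.
  written.catch(() => {});

  const reader = transform.readable.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    total += value.length;
  }
  await written; // surface any write-side error after draining

  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

A call such as `pipeThroughTransform(bytes, new CompressionStream('gzip'))` compresses, and the same helper with `new DecompressionStream('gzip')` inflates.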
Convergence with #1300 (snapshot-runtime compression) — three asks
#1300 carries its own packages/core/src/serialization/compression.ts with a different shape: sync node:zlib codecs, zstd-preferred with gzip fallback (feature-detected at zlib.zstdCompressSync, Node ≥ 22.15), prefix-dispatched reads. On its 8 MB QuickJS heap snapshots, zstd-3 beats gzip-6 on ratio (4.29× vs 4.02×) and compress speed (18 ms vs 127 ms — 7×). These two modules will collide at the same path, and the second to land has to reconcile. Asks:
1. Reserve the zstd prefix now. Add ZSTD: 'zstd' to SerializationFormat (both serialization/types.ts and the browser-safe serialization-format.ts copy) even with no write path using it here. Cost: two lines. Benefit: the format namespace can't drift between the two branches, and a reader hitting a zstd payload from a snapshot-runtime writer gets a precise "zstd payload requires …" error instead of generic Unsupported serialization format.
2. Fix the semantic conflict in the GZIP constant docs before they fork. This PR documents gzip as "inner payload has its own format prefix"; #1300 documents it as "inner is raw bytes". Same prefix, contradictory contracts. Both are locally true, which is exactly the resolution: compression prefixes mark the codec only; inner structure is caller-defined (refs recurse into hydrateData, snapshots hand raw bytes to the snapshot loader). One sentence of wording alignment now saves a confusing archaeology session later.
3. Plan the module merge as one file, two APIs. The natural shape: this PR's async conditional gzip API for browser-reachable payload refs, plus #1300's sync zstd-preferred API for Node-only consumers. The browser-safety trick is already in this PR: gunzipSyncIfAvailable resolves node:zlib via process.getBuiltinModule, which is precisely how the merged module can host sync zstd codecs without growing a static Node dependency.
On gzip-only for payload refs: I think it's the right call, but the stated reason should be sharpened. The out-of-scope note says zstd is "not a web standard"; the operative constraint is specifically the browser read path: the web o11y UI decompresses post-decrypt via DecompressionStream, which standardizes only gzip/deflate/deflate-raw. Snapshots never reach a browser, which is why #1300 can prefer zstd unconditionally and this path can't. Worth a sentence in the module docs, because it's the criterion a future "add zstd here?" decision turns on: when browsers ship zstd in DecompressionStream (or a wasm decoder becomes acceptable in the web bundle), the prefix system makes it a drop-in for refs too. Node-side availability is already a non-issue for writes.
Smaller notes
- pipeThroughTransform/gunzipAsync are duplicated between compression.ts and serialization-format.ts. If deliberate (keeping serialization-format.ts dependency-free for browser bundles), a cross-reference comment would prevent drift; the module-merge in ask 3 is the natural moment to unify.
- On the world-local open question: I'd keep it uniform (as this PR does). The "local files were greppable" ship sailed at spec v2 base64-encoding; hydrateData already gives the CLI sync decompression, so a --raw-style inspection view is cheap if local-debugging demand materializes. Option B (CBOR for world-local) is attractive but orthogonal; neither should block this.
- The rare uncompressed error-write paths (max-deliveries, replay-budget) are fine as-is: error payloads under 1 KB would pass through anyway.
CI
Still in progress as I write this. The nextjs-webpack dev lane and express prod lane failures match this cycle's known baseline flakes; the nextjs-turbopack canary failure has no logs yet — worth a look once the run completes, since canary lanes exercise the freshest Next integration against these serialization changes. Flagging for a re-check, not blocking on it given the full e2e suite passed locally per the validation notes (and my local unit runs corroborate).
Approving — please land the zstd prefix reservation and the constant-docs alignment with this PR; the module merge can be whoever-lands-second's job as long as both sides know the plan.
Amending one conclusion from my review after discussing with Nathan: I framed the browser read path as the constraint keeping this gzip-only — that's a soft constraint, not a hard one. The web o11y UI only ever reads payloads, so the browser story needs only a decoder, and a wasm zstd decoder is a solved problem — Nathan has a working implementation with exactly the right shape: That changes the zstd calculus from "blocked on web standards" to "a follow-up with known cost":
Why it's worth the follow-up: compression runs at every step boundary, so write-side CPU is a per-step tax — the snapshot-runtime benchmarks showed zstd ~7× faster than gzip at compression with a better ratio, and the ratio delta compounds with this PR's motivating workload (the same growing payload re-serialized N times). The wasm asset cost (~133 KB, lazy-loaded only when a zstd payload is actually encountered) is modest for an o11y dashboard. None of this blocks the current PR — the prefix system is what makes the whole sequence migration-free, which is the strongest argument for the design as submitted. But "zstd: out of scope" should read as "zstd: staged follow-up with a working decoder in hand," and the module-merge with the snapshot-runtime compression work (ask 3 in my review) is the natural vehicle for step 2's Node side. |
Split the compression benchmark into reproducible size and CPU scripts sharing deterministic workloads (lib/workloads.mjs). The CPU benchmark measures serialize/deserialize overhead per payload, total CPU across thousands of events, and compares gzip levels/brotli/deflate. Documents how to run the size, CPU, and end-to-end (bench.bench.ts) benchmarks against local and Vercel in scripts/README.md. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Switch the payload compression codec to zstd, which benchmarks 3–7× faster than gzip at an equal-or-better ratio on representative workloads (compression runs at every step boundary, so the write CPU is a per-step tax). zstd uses node:zlib (>= 22.15); gzip via the portable CompressionStream remains the fallback when zstd is unavailable, and WORKFLOW_COMPRESSION_CODEC=gzip forces it. Reads dispatch on the format prefix, so 'zstd' and 'gzip' payloads are both always decodable. zstd is Node-only (Web CompressionStream has no zstd), so the browser o11y read path registers a WASM-backed decoder (@tootallnate/zstd-wasm) via a new registerZstdDecoder hook; node:zlib handles Node-side reads (runtime replay, CLI, server o11y). A new workflow.serialization.codec span attribute reports which codec applied. gzip and zstd read support co-ship, so the existing specVersion-5 capability gate is unchanged. Verified end-to-end: spec-5 runs store zstd-prefixed payloads on disk and replay/complete correctly; the WASM decoder round-trips node:zlib zstd output. Benchmarks updated to compare zstd vs gzip. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
  data.length < COMPRESSION_MIN_BYTES ||
  isCompressionDisabledByEnv()
) {
  recordStats(stats, 'none', data.length, data.length);
Cross-deployment writes (start.ts, resume-hook.ts) can emit zstd-compressed payloads to a capability-approved target that cannot decode zstd, because the write-side codec is chosen from the writer's own Node runtime while zstd read support (Node 22.15+) is a runtime property not implied by the target's SDK version.
Good catch — fixed in 2bc368c. Root engines allows Node 18/20, which lack node:zlib zstd, so the SDK version genuinely doesn't imply the reader's runtime can decode zstd. Cross-deployment writes (start({deploymentId}) and resumeHook) now pass a new compressionPortableOnly flag that restricts compress() to gzip (decodable on every supported runtime). Same-deployment writes keep zstd, since there the reader is the same runtime as the writer — a zstd-capable writer implies a zstd-capable reader. Added a unit test for the forced-gzip path.
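The selection rule described in this reply can be sketched as a pure function. `compressionPortableOnly` is the flag named above; `pickWriteCodec`, the `CodecEnv` shape, and passing runtime availability in as plain booleans are assumptions made to keep the example self-contained and testable.

```typescript
type Codec = 'zstd' | 'gzip' | 'none';

interface CodecEnv {
  zstdAvailable: boolean; // node:zlib exposes zstd (Node >= 22.15)
  gzipAvailable: boolean; // CompressionStream exists
  forcedCodec?: string; // value of WORKFLOW_COMPRESSION_CODEC, if set
}

// Hypothetical helper: zstd only when the reader is known to be the same
// runtime as the writer; otherwise the portable codec.
function pickWriteCodec(
  env: CodecEnv,
  opts: { compressionPortableOnly?: boolean } = {},
): Codec {
  const zstdAllowed =
    env.zstdAvailable &&
    !opts.compressionPortableOnly &&
    env.forcedCodec !== 'gzip';
  if (zstdAllowed) return 'zstd';
  return env.gzipAvailable ? 'gzip' : 'none';
}
```

A cross-deployment write would pass `{ compressionPortableOnly: true }` and therefore never select zstd, regardless of what the writer's own runtime supports.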
* origin/main:
  Small detail panel cleanup (#2459)
  Fix lazy Next workflow HMR (#2438)
  Prevent peer dependency-only major bumps (#2437)
  fix(changesets): only major-bump peer dependents when out of range (#2439)
  Version Packages (beta) (#2428)
  otel: explicit traceparent injection + linked-trace mode for bounded per-invocation traces (#2363)
  [next] Clarify `serverExternalPackages` warning (#2417)
  Add .swc gitignore handling to builder (#2427)
  Version Packages (beta) (#2390)
  [ci] Increase dev.test.ts cleanup hook timeout (#2416)
  [world-vercel] Switch event endpoints to v4 wire format (#2055)
  docs: document run idempotency (#2011)
  Render attr_set events and run attributes in observability UI (#2393)
  [ci] Fix backport job model slug (#2403)
  [ci] Comment on PR when backport fails, revert to use opus 4.8 (#2400)
  Update queue client to 0.3.1 (#2399)
  fix(deps): upgrade esbuild to 0.28.1 (GHSA-gv7w-rqvm-qjhr) (#2395)
  test: e2e coverage for run-idempotency conflict-handling strategies (#2387)
# Conflicts:
#	pnpm-lock.yaml
karthikscale3
left a comment
Browser decompression review notes.
function loadWasmModule(): Promise<WebAssembly.Module> {
  if (!modulePromise) {
    const url = new URL('@tootallnate/zstd-wasm/zstd.wasm', import.meta.url);
This bare package specifier does not look like a bundler asset URL. I ran a Vite production-build probe from packages/web, and Vite left new URL("@tootallnate/zstd-wasm/zstd.wasm", import.meta.url) unchanged; in the browser that resolves relative to the JS chunk, not to the package asset, so the WASM fetch will likely 404 in production. Can we switch this to a bundler-verified asset import, e.g. @tootallnate/zstd-wasm/zstd.wasm?url or equivalent, and add a browser/build smoke test that actually fetches and compiles the emitted WASM?
Fixed in 2bc368c. The bundler couldn't rewrite the bare specifier, so I vendor zstd.wasm into web-shared/dist/lib/ (build-script cp) and reference it relatively: new URL('./zstd.wasm', import.meta.url) — the form Vite/webpack/Turbopack all rewrite. Verified against a Vite production build of packages/web: it now emits build/client/assets/zstd-*.wasm and the decoder chunk references that hashed asset URL. Added a Node test compiling the emitted WASM and round-tripping node:zlib zstd output through it.
const { ensureZstdDecoderRegistered } = await import(
  './zstd-browser-decoder.js'
);
ensureZstdDecoderRegistered();
This registers the browser zstd decoder only inside the decrypt path, but compression can be enabled without encryption. serialize() compresses before optional encryption, and encrypt() returns the compressed data unchanged when no key exists; local worlds also advertise spec v5. In the web app, no-key paths call plain hydrateResourceIO(), whose browser sync path returns compressed Uint8Arrays untouched. That means an unencrypted zstd/gzip payload can render as raw bytes instead of hydrated data. Could we add an async web hydration path for compressed data even when no encryption key is present, or otherwise register/decompress in the normal web hydration path, plus a browser test for unencrypted compressed payloads?
Fixed in 2bc368c. hydrateResourceIOWithKey now takes an optional key and always registers the zstd decoder + decompresses, so it inflates compressed payloads with or without encryption. The web no-key paths that render resolved payloads (use-resource-data, run-detail-view event-data loaders) now route through it. (The trace-viewer/events-list use withData:false/resolveData:'none', so they carry no payload bodies and stay on the sync path.) Added a web-shared test hydrating an unencrypted zstd payload with no key.
No backport to This feature builds on To override, re-run the Backport to stable workflow manually via |
TooTallNate
left a comment
Post-merge review — one shipping-blocking cutoff bug to fix before beta.18; design is otherwise sound
This merged in a much stronger form than what I reviewed (it was gzip-only then; it now ships zstd-preferred with gzip fallback plus a WASM browser decoder — closing exactly the loop from my pre-merge amendment, using the published @tootallnate/zstd-wasm). I re-reviewed the merged commit 5f0b84521 end to end: built core/world/web-shared, ran the suites (1236 core, 36 compression, 2 zstd-decoder, 5 spec-version — all green), and confirmed zstd round-trips on Node 24. The codec layering, prefix-dispatched reads, telemetry, and browser-decode wiring are all correct.
But there is one concrete bug that will cause silent payload corruption once compression activates, and it should be fixed before the next beta publishes.
🛑 The FORMAT_VERSION_TABLE cutoff is one beta too low
capabilities.ts gates both gzip and zstd at minVersion: '5.0.0-beta.16'. But:
- workflow@5.0.0-beta.16 was tagged June 15 (Version Packages #2390)
- workflow@5.0.0-beta.17 was tagged June 15 (Version Packages #2428)
- this PR merged June 16: git merge-base --is-ancestor 5f0b84521 workflow@5.0.0-beta.16 → NOT in beta.16; same for beta.17
So neither published beta contains the compression read path. The pending Version Packages PR (#2451) bumps @workflow/core to beta.18 — that's the first version that can decode these payloads.
With the cutoff at beta.16, getRunCapabilities() reports a target running beta.16 or beta.17 as compression-capable. A cross-deployment start() / resumeHook() to such a target (or the resilient-start probe resolving to one) will then write zstd/gzip payloads that the target cannot decode — a ReferenceError-class failure at replay, silent until it bites. This is precisely the hazard the TODO(release) comment on those lines warns about, and the same cutoff-lag bug that hit the byte-stream framing PR twice (it shipped without #1853 in beta.14, then again in beta.15).
Fix: bump both entries to 5.0.0-beta.18. I've prepared that change and will push it to a follow-up branch (see below) — flagging here so it's on record against this PR.
The cutoff being one-too-low is only dangerous once a published SDK actually starts writing compressed payloads, i.e. beta.18. Since beta.18 hasn't shipped, there's no corrupted data in the wild yet — but the fix must land in the same release that first enables compression, not after.
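To make the off-by-two concrete, here is a small sketch of a minVersion comparison over `x.y.z-beta.N` strings. It is a hypothetical stand-in (the real capability check may well use a semver library, and `parseVersion` / `meetsMinVersion` are invented names); it only handles the version shapes that appear in this thread.

```typescript
// Hypothetical parser for "x.y.z" and "x.y.z-beta.N" only.
function parseVersion(v: string): number[] {
  const m = /^(\d+)\.(\d+)\.(\d+)(?:-beta\.(\d+))?$/.exec(v);
  if (!m) throw new Error(`unsupported version string: ${v}`);
  // A release without a prerelease tag sorts after every beta of the same x.y.z.
  const beta = m[4] === undefined ? Number.POSITIVE_INFINITY : Number(m[4]);
  return [Number(m[1]), Number(m[2]), Number(m[3]), beta];
}

// True when `actual` is at or above `minVersion`.
function meetsMinVersion(actual: string, minVersion: string): boolean {
  const a = parseVersion(actual);
  const b = parseVersion(minVersion);
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return a[i] > b[i];
  }
  return true;
}
```

Under a `5.0.0-beta.16` cutoff, targets on beta.16 and beta.17 both pass this check even though neither contains the read path; under `5.0.0-beta.18` they are correctly rejected.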
Everything else checks out
- Codec selection (selectWriteCodec): zstd via process.getBuiltinModule('node:zlib') (≥22.15), gzip fallback via portable CompressionStream, WORKFLOW_COMPRESSION_CODEC override. It is bundler-safe (no static node:zlib import), which is what keeps the module importable in browser/edge targets where it correctly degrades to gzip/none.
- Read dispatch is codec-agnostic and complete: Node sync path (serialization-format.ts:205), Node async/replay path (compression.ts decompress), and the browser path (hydration.ts → ensureZstdDecoderRegistered → WASM decoder, lazy + idempotent). The "gzip and zstd co-ship, so a run that reads one reads both" claim that justifies the single shared minVersion is structurally true: all three reader surfaces gained both codecs in this one PR.
- @tootallnate/zstd-wasm@0.0.2 is the right pin: the browser only ever decodes, and 0.0.2 is decode-only, so the heavier compression code added in v0.1.0 is correctly not pulled into the web bundle. (Noting for the record since the v0.1.0 publish is what prompted this look: nothing here needs it; the write side uses Node's native zstd, never the WASM module.)
- Compress-before-encrypt preserved (encr(zstd(devl))), conditional gating intact (1KB floor, ≥5% savings, env kill switch), telemetry sizes measured at the compression boundary (pre-encryption) as documented.
- world-vercel stamps new runs specVersion: 5 with a clear comment pointing at the server-side spec-5 support (workflow-server#520); the cross-repo coordination is documented.
- specVersion-5 contract, requiresNewerWorld reject matrix, and CPU/size benchmark scripts all landed as described.
Bottom line
Strong implementation that correctly absorbed the zstd follow-up. The single must-fix is the beta.16 → beta.18 cutoff bump, which I'm pushing as a follow-up since this PR is already merged. Once that lands in the beta.18 release, this is good.
Follow-up fix for the cutoff bug is up: #2470 (bumps gzip/zstd |
RFC: Compress serialized payload refs (zstd, gzip fallback)
Summary
Worlds now store compressed payloads. Every serialized payload (step inputs/outputs, workflow arguments/return values, errors, hook payloads) is wrapped in a composable codec format prefix before it reaches the World storage layer, cutting stored bytes by ~73–89% on real-world-style workloads (benchmarks below). zstd is the preferred codec — 3–7× faster than gzip at an equal-or-better ratio — with gzip as the portable fallback; reads dispatch on the prefix so both are always decodable. Compression is gated on a new specVersion 5 so the compatibility contract is explicit and testable.
Motivation
The event log re-serializes full payloads at every step boundary — an AI agent workflow that threads a growing chat history through 10 steps stores the conversation 10 times. Payloads are devalue-encoded JSON-ish text, which is highly compressible, and several storage backends amplify the bytes further (DynamoDB inline refs and world-local JSON files base64-encode binary, a 4/3× penalty that compression also claws back). Smaller payloads also push more Vercel-world refs under the 3750-byte inline cutoff, which means fewer S3 round-trips on replay — a latency win, not just storage.
Design
Compression must live in the SDK, not the server
On Vercel, payloads are AES-256-GCM encrypted client-side with a per-run key before the world ever sees them. Encrypted bytes are incompressible, so server-side compression (e.g. in remote-ref.ts S3/DynamoDB writes) would be a no-op for the dominant case. Compressing in @workflow/core's serialization pipeline, before encryption, is the only placement that works, and it benefits every world (vercel, postgres, local) from one seam.
A composable format layer, mirroring encryption
The serialization pipeline already supports composable format prefixes (devl, encr). This PR adds compression as a sibling layer in packages/core/src/serialization/compression.ts, mirroring encryption.ts:
Because readers dispatch on the prefix at every layer, one read path transparently handles compressed (either codec), uncompressed, encrypted, and any nesting of these. New SDKs read both old and new data structurally, not via special cases.
Codec choice: zstd preferred, gzip fallback
compress() picks zstd when node:zlib zstd is available (Node ≥ 22.15, the production runtime), since it benchmarks 3–7× faster than gzip at an equal-or-better ratio and compression runs at every step boundary. It falls back to gzip via the portable CompressionStream on runtimes without node:zlib zstd, and WORKFLOW_COMPRESSION_CODEC=gzip forces the portable codec (handy for A/B or pinning). Both codecs are always decodable on read: the per-payload prefix means a mixed-codec event log is a non-event. gzip and zstd read support co-ship, so a single specVersion-5 capability gate covers both (no per-codec version skew is possible).
Conditional compression
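Codec selection and the conditional rules (small-payload passthrough, incompressible-discard, kill switch) might look roughly like this. It is a sketch under assumptions: the 1024-byte constant, the "must be strictly smaller" cutoff, and the pickCodec/maybeCompress names are illustrative, and the gzip branch uses node:zlib's sync API rather than the portable CompressionStream to keep the example short.

```typescript
import * as zlib from "node:zlib";

type Codec = "zstd" | "gzip";
type Env = Record<string, string | undefined>;

// Assumed constant: the PR mentions a "<1 KB" threshold but not its exact value.
const MIN_COMPRESS_BYTES = 1024;

// Feature-detect zstd instead of sniffing the Node version.
const zstdCompressSync: ((buf: Uint8Array) => Uint8Array) | undefined =
  (zlib as any).zstdCompressSync;

function pickCodec(env: Env): Codec {
  if (env.WORKFLOW_COMPRESSION_CODEC === "gzip") return "gzip";
  return typeof zstdCompressSync === "function" ? "zstd" : "gzip";
}

// Returns the codec actually used ("none" for passthrough) and the bytes to store.
function maybeCompress(data: Uint8Array, env: Env): { codec: Codec | "none"; bytes: Uint8Array } {
  if (env.WORKFLOW_DISABLE_COMPRESSION === "1") return { codec: "none", bytes: data };
  if (data.length < MIN_COMPRESS_BYTES) return { codec: "none", bytes: data };
  const codec = pickCodec(env);
  const packed = codec === "zstd" ? zstdCompressSync!(data) : zlib.gzipSync(data);
  // Assumed cutoff: discard the result unless it is strictly smaller than the input.
  return packed.length < data.length ? { codec, bytes: packed } : { codec: "none", bytes: data };
}

const big = new TextEncoder().encode("abc".repeat(2000));
console.log(maybeCompress(big, { WORKFLOW_COMPRESSION_CODEC: "gzip" }).codec);
```

Passing the environment in as a parameter keeps the decision testable without mutating process.env.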
WORKFLOW_DISABLE_COMPRESSION=1 is a write-side kill switch. Reads are unaffected.
specVersion 5: the compatibility contract
SPEC_VERSION_CURRENT is bumped to 5 (SPEC_VERSION_SUPPORTS_COMPRESSION). The contract:
requiresNewerWorld() → RunNotSupportedError machinery: a clear, typed "this run requires a newer SDK" instead of a cryptic per-payload format error.
Since v5 is still in beta, this is the natural cut point: the v4 SDK cannot read/write/cancel v5 runs, but the first non-beta v5 client handles both v4 and v5 runs.
Write-side gating, per call site
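The write-side gate reduces to two checks, sketched below. The constant name comes from this PR; the RunCapabilities shape and both function names are hypothetical, chosen only to show how a same-deployment gate differs from a cross-deployment one.

```typescript
// Constant name taken from the PR text; everything else here is illustrative.
const SPEC_VERSION_SUPPORTS_COMPRESSION = 5;

// Hypothetical shape for the result of a target-deployment capability lookup.
interface RunCapabilities {
  supportsCompression: boolean;
}

// Same-deployment write: the run's specVersion alone decides.
function canCompressForRun(runSpecVersion: number | undefined): boolean {
  return (runSpecVersion ?? 0) >= SPEC_VERSION_SUPPORTS_COMPRESSION;
}

// Cross-deployment write: the run AND the target deployment must both qualify.
function canCompressForTarget(
  runSpecVersion: number | undefined,
  target: RunCapabilities
): boolean {
  return canCompressForRun(runSpecVersion) && target.supportsCompression;
}

console.log(canCompressForRun(4), canCompressForRun(5));
```

Treating a missing specVersion as "too old" keeps the default on the safe, uncompressed side.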
start() workflow arguments gzip)
run.specVersion ≥ 5 (run record in scope)
run.specVersion ≥ 5 threaded via StepExecutorParams.runSpecVersion
specVersion ≥ 5 (stamped by the same-deployment orchestrator)
run.specVersion ≥ 5
run_failed) run.specVersion ≥ 5 where the run record is in scope; otherwise uncompressed
resumeHook) run.specVersion ≥ 5 AND target deployment capability (getRunCapabilities), the same pattern as the existing encr gate
Cross-deployment writes reuse the existing capabilities machinery: gzip and zstd entries in FORMAT_VERSION_TABLE (capabilities.ts) keyed on the target run's workflowCoreVersion, exactly like encr and framedByteStreams before it. (The table column in this section reads "supports compression"; since the codecs co-ship, the gate is one boolean.)
Read paths (incl. browser zstd via WASM)
zstd is Node-only (the Web CompressionStream/DecompressionStream has no zstd), so each reader decodes appropriately:
On Node: node:zlib zstd/gzip, resolved through process.getBuiltinModule (no static Node dependency, so the modules stay browser-safe).
In the browser: gzip via the web-standard DecompressionStream; zstd via a WASM decoder (@tootallnate/zstd-wasm, ~160 KB, compiled lazily on first use) that @workflow/web-shared registers with core through a new registerZstdDecoder hook. Core stays free of the WASM dependency; the o11y host supplies it.
Observability
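Resolving node:zlib through process.getBuiltinModule, rather than a static import, can be sketched as follows. The ZlibLike type and both function names are assumptions for illustration; the point is only that the lookup happens at runtime and yields undefined where the builtin is missing.

```typescript
// Minimal structural type for the two zlib calls this sketch uses.
type ZlibLike = {
  gzipSync(data: Uint8Array): Uint8Array;
  gunzipSync(data: Uint8Array): Uint8Array;
};

// Runtime lookup: returns undefined in browsers, or on Node versions without
// process.getBuiltinModule, so bundlers never see a static "node:zlib" import.
function getNodeZlib(): ZlibLike | undefined {
  const proc = (globalThis as any).process;
  if (typeof proc?.getBuiltinModule !== "function") return undefined;
  return proc.getBuiltinModule("node:zlib") as ZlibLike | undefined;
}

function gunzipIfAvailable(data: Uint8Array): Uint8Array {
  const zlib = getNodeZlib();
  if (!zlib) throw new Error("node:zlib unavailable; use DecompressionStream instead");
  return zlib.gunzipSync(data);
}

console.log(getNodeZlib() ? "node:zlib resolved" : "no node:zlib on this runtime");
```

A caller that gets undefined back can fall through to the web-standard DecompressionStream path instead of failing at import time.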
hydrateData (sync, used by the CLI and server o11y on Node) decompresses synchronously via node:zlib resolved through process.getBuiltinModule (no static Node dependency, so the module stays browser-safe).
hydrateDataWithKey (async, used by the web UI's decrypt flow) decompresses via the web-standard DecompressionStream, handling encr(gzip(devl)) and bare gzip(devl).
The isCompressedData() helper mirrors isEncryptedData() for UI affordances.
Backwards/forwards compatibility
RunNotSupportedError
Known edge (documented tradeoff): in local dev, upgrading the SDK mid-run and continuing an old spec-4 run keeps its payloads uncompressed on the orchestrator paths (run-record gating), but the V1 step-handler path gates on the step entity's writer-stamped specVersion, which can compress step outputs of an old run after an upgrade. Deployed runs are pinned to their deployment (skew protection), so this cannot happen in production. The same writer-stamped behavior already exists for encr and byte-stream framing.
Benchmarks
All benchmark code lives in packages/core/scripts/ and is reproducible: shared deterministic workloads in lib/workloads.mjs, run instructions in scripts/README.md. Two dimensions: storage size and CPU cost.
Storage size: benchmark-compression-size.mjs
Raw serialized payload bytes handed to World storage, compression off vs on (zstd, the shipped codec):
Simulated 10-step AI agent run (event log total): 402.1 KB → 109.4 KB (72.8% smaller). Incompressible binary still wins ~25% because devalue base64-encodes Uint8Array (4/3×) and compression recovers that. Backends that base64 binary (DynamoDB inline refs, world-local JSON) see ~33% larger absolute savings. Text workloads are seeded non-repetitive prose to avoid overstating ratios.
CPU cost: benchmark-compression-cpu.mjs
Compression is a world-independent CPU cost on the serialize (write) and deserialize (read) paths: the same @workflow/core code runs before any World is touched. So these absolute numbers hold for every backend; the world only sets the baseline the cost is compared against.
Real shipping path (zstd), µs per op, on an M-series laptop:
Stress: thousands of events (≈6.6 KB e-commerce payload, ser+deser per event, modelling a long workflow + replay):
A 10,000-event run adds only ~250 ms of total CPU across its entire lifetime, ~4× less than gzip (which added ~1.1 s). Costs scale with payload size; the <1 KB threshold makes small payloads free.
zstd vs gzip (the reason for the switch; node:zlib sync, level 3 zstd vs level 6 gzip = what each codec actually ships):
zstd is faster on both compress and decompress with equal-or-better ratios; the win compounds because compression runs at every step boundary. zstd-19 was far too slow (14–23 ms) to consider; level 3 (default) is the sweet spot. The script also reports gzip levels 1/9 and brotli for reference.
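For readers who want to eyeball the zstd-3 vs gzip-6 comparison without the full script, a rough micro-benchmark could look like this. It is not benchmark-compression-cpu.mjs: the payload, iteration count, and timeUs helper are assumptions for illustration, and the zstd branch only runs where node:zlib exposes zstdCompressSync.

```typescript
import * as zlib from "node:zlib";
import { performance } from "node:perf_hooks";

// Illustrative payload only; the PR's real workloads live in lib/workloads.mjs.
const payload = Buffer.from(
  JSON.stringify(
    Array.from({ length: 150 }, (_, i) => ({ sku: `item-${i}`, qty: i % 7, price: 9.99 + i }))
  )
);

// Average wall-clock microseconds per call over a fixed number of iterations.
function timeUs(fn: () => void, iterations = 200): number {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  return ((performance.now() - start) * 1000) / iterations;
}

// zstd APIs may be absent from older Node versions and type definitions.
const z = zlib as any;
const results: Record<string, { bytes: number; compressUs: number }> = {};

const gz = zlib.gzipSync(payload, { level: 6 });
results.gzip6 = {
  bytes: gz.length,
  compressUs: timeUs(() => zlib.gzipSync(payload, { level: 6 })),
};

if (typeof z.zstdCompressSync === "function") {
  const opts = { params: { [z.constants.ZSTD_c_compressionLevel]: 3 } };
  const zs: Buffer = z.zstdCompressSync(payload, opts);
  results.zstd3 = {
    bytes: zs.length,
    compressUs: timeUs(() => z.zstdCompressSync(payload, opts)),
  };
}

console.log({ rawBytes: payload.length, ...results });
```

Timings from a loop this small are noisy, so the numbers are only useful as a relative comparison on one machine.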
End-to-end runtime: bench.bench.ts (pnpm bench:local)
The existing stress-workflow harness, run twice against a local nextjs-turbopack dev server, compression on (default) vs off (WORKFLOW_DISABLE_COMPRESSION=1), diffing bench-timings-*.json:
End-to-end, compression's CPU is within run-to-run noise (±10%), and net faster on the larger cases, because the highly compressible 10 KB payload shrinks enough that smaller filesystem IO outweighs the compression CPU. (Caveat: the harness's 'x'.repeat(10240) payload is pathologically compressible; the microbenchmark above with realistic payloads is the cleaner CPU signal.) The takeaway: orchestration + queue + IO dominate per-step wall-clock, so the sub-millisecond compression CPU disappears in the noise.
Vercel
Now enabled. @workflow/world-vercel advertises specVersion: 5 (this PR), so new Vercel runs are created compressible: payloads on Vercel are now encr(zstd(devl)). This is unblocked by the server-side companion vercel/workflow-server#520 (merged), which formally declared spec-5 support (payloads stay opaque to the server; the bump is the contract that lets the SDK stamp spec-5 runs). The ordering held: server first, then this SDK bump.
Measurement: the bench.bench.ts harness runs against a real Vercel labs deployment via the Benchmark Vercel (nextjs-turbopack / nitro-v3 / express) jobs in .github/workflows/benchmarks.yml on every PR, comparing this branch against the main baseline (spec-4, no compression) and posting the delta as a sticky PR comment; that comment is the Vercel compression result. For ad-hoc A/B on a real app, deploy the PR tarball and toggle WORKFLOW_DISABLE_COMPRESSION=1 as a Vercel project env var (off baseline) vs unset (on); scripts/README.md documents the command. Expectation: the relative impact on Vercel is the smallest of any backend. A Vercel step's wall-clock is dominated by queue dispatch, AES-GCM encryption, S3/DynamoDB writes, and HTTP round-trips (100s of ms), so the sub-ms compression CPU is a tiny fraction.
Verified end-to-end against the nextjs-turbopack workbench (world-local): new runs are created with specVersion: 5, large step outputs and error payloads appear on disk with the zstd prefix (gzip when forced), small payloads stay as readable devl passthrough, and workflows complete and replay correctly.
Observability
The serialize (write) and deserialize (read) paths emit OpenTelemetry span attributes so compression's impact shows up per-step in any OTel backend (incl. Vercel's), without manual storage inspection:
workflow.serialization.operation: serialize (write) or deserialize (read)
workflow.serialization.compressed
workflow.serialization.codec: zstd/gzip/none
workflow.serialization.uncompressed_bytes
workflow.serialization.stored_bytes
workflow.serialization.compression_ratio
Sizes are at the compression boundary (pre-encryption), so they measure compression's effect, not at-rest size (which adds the encr envelope + base64 on some backends). Attributes land on the active span, typically the dedicated step.dehydrate/step.hydrate span, otherwise the enclosing run/start span. The compression codec stays pure: compress/decompress populate a CompressionStats sink threaded through CodecOptions; the dehydrate/hydrate wrappers set the attributes. Telemetry failures are swallowed and never affect serialization.
A "Compressed / saved N%" badge in the web trace viewer (deriving from hydrateDataWithKey after it peels encr→codec) is a possible follow-up; the attributes above are the foundation.
Open question: should world-local compress? (please discuss)
This PR compresses uniformly across all worlds, including world-local — no special casing. That's a deliberate simplification, but it's up for debate:
fs.ts jsonReplacer), so compression doesn't change much in practice: base64(devalue) was already opaque to grep.
devl text payloads as plain readable strings in the JSON files. Best DX for local debugging and LLM agents grepping run data.
Either follow-up is cheap; the read path handles both formats forever regardless.
Out of scope / future work
FORMAT_VERSION_TABLE. But independently-deployed older dashboards/CLI have no zstd decoder until updated, so they'd show a zstd payload un-hydrated (degrades to "can't display", not a crash) until redeployed. Acceptable while spec 5 is beta; WORKFLOW_COMPRESSION_CODEC=gzip forces the portable codec if a fleet needs it.
remote-ref.ts (workflow-server): only worth revisiting if DynamoDB WCU data says so.
Testing
compression.test.ts: round-trips, encr(zstd(devl)) nesting order (white-box), zstd-preferred + gzip-fallback codec selection, mixed-codec decode, small-payload passthrough, incompressible-discard, kill switch, mode serializers, o11y sync+async hydration, capability-table gating (both codecs).
compression-telemetry.test.ts: serialize/deserialize span attributes incl. codec.
zstd-decoder.test.ts (web-shared): the WASM decoder round-trips node:zlib zstd output, which locks the cross-codec contract for the browser path.
spec-version.test.ts: spec 5 constants, requiresNewerWorld accept/reject matrix, v4-reader simulation.
@workflow/core (1225 tests), @workflow/world, @workflow/world-local, @workflow/world-vercel, @workflow/world-postgres, workflow, @workflow/cli, @workflow/web-shared.
nextjs-turbopack dev server: spec-5 runs store zstd-prefixed payloads on disk (verified magic bytes) and replay/complete correctly.
packages/core/scripts/: see scripts/README.md.
🤖 Generated with Claude Code