docs(ai-chat): slim-wire HITL continuations + field-level merge contract (#3721)

ericallam · web-flow · commit 9f64bf404b90 · 2026-05-23T17:27:15.000+01:00
## Summary

Updates the AI chat docs to match the slim-wire + field-level merge behavior shipped in #3719 and the precise `.in/append` cap + CORS-readable 413 shipped in #3720. No behavior changes here: code is correct in `main`; the docs were lagging on three patterns customers copy out of the page.

## What changed

- **`hydrateMessages` examples upsert by id** (in `lifecycle-hooks.mdx`, `patterns/database-persistence.mdx`, and `patterns/persistence-and-replay.mdx`). The previous `stored.push(newMsg)` pattern duplicated the assistant id on HITL continuations and caused the LLM to receive a tool call with no `arguments`. The new examples include the rationale inline.
- **`onValidateMessages` example filters to user messages** (`lifecycle-hooks.mdx`). The previous example called `validateUIMessages({ messages, tools })` directly, which now throws on HITL slim wires (the AI SDK schema requires `input` on resolved tool parts). New example shows the filter pattern, with a Warning callout explaining why.
- **Merge contract description updated** (`lifecycle-hooks.mdx`). The old wording said incoming messages are "auto-merged" / "replaced"; the new description explains the actual field-level overlay (state advances only).
- **Approval-responded wire example slimmed** (`client-protocol.mdx`). Shows the minimum shape the agent reads: `state` + `approval` (or `output` / `errorText` for HITL). Notes that the built-in transports ship this slim shape by default and that fuller shapes are still accepted.
- **`/in/append` 413 row and FAQ updated** (`client-protocol.mdx`, `patterns/trusted-edge-signals.mdx`). Reflects the new precise S2 cap and the CORS-readable 413.
- **New changelog entry** at the top of `changelog.mdx` covering all of the above. The historical `## 512 KiB ceiling removed` entry further down the changelog is left as-is (it's a snapshot of the prior transition), and the v4.5 upgrade-guide section is skipped because the merge contract is backwards compatible.
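
For reviewers, the upsert-by-id and field-level overlay behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the runtime's or the SDK's implementation: the `Part` / `Message` shapes are simplified stand-ins for the AI SDK's `UIMessage`, and `upsertById` and `overlayToolState` are hypothetical names invented for this example.

```typescript
// Simplified, hypothetical shapes; the real `UIMessage` type from `ai` is richer.
type ToolState =
  | "approval-requested"
  | "approval-responded"
  | "output-available"
  | "output-error"
  | "output-denied";

interface Part {
  type: string;
  text?: string;
  toolCallId?: string;
  state?: ToolState;
  input?: unknown;
  output?: unknown;
  errorText?: string;
  approval?: { id: string; approved: boolean };
}

interface Message {
  id: string;
  role: "user" | "assistant";
  parts: Part[];
}

// The states the docs list as tool-state advances.
const ADVANCES: ReadonlySet<string> = new Set([
  "output-available",
  "output-error",
  "approval-responded",
  "output-denied",
]);

// Hypothetical hand-written upsert: push only when the id is new.
// Returns true when `stored` was mutated, so the caller knows whether to persist.
function upsertById(stored: Message[], incoming: Message): boolean {
  if (stored.some((m) => m.id === incoming.id)) return false; // HITL continuation: leave it for the overlay
  stored.push(incoming);
  return true;
}

// Hypothetical field-level overlay: copy only the resolution fields from the
// slim wire copy onto the first hydrated entry with the same id.
function overlayToolState(hydrated: Message[], wire: Message): Message[] {
  const index = hydrated.findIndex((m) => m.id === wire.id);
  if (index === -1) return hydrated;
  const target = hydrated[index]!;
  const parts = target.parts.map((part) => {
    const advance = wire.parts.find(
      (p) =>
        p.toolCallId !== undefined &&
        p.toolCallId === part.toolCallId &&
        ADVANCES.has(p.state ?? "")
    );
    if (!advance) return part; // text, reasoning, unresolved tools stay put
    const next: Part = { ...part, state: advance.state }; // hydrated `input` survives
    if (advance.output !== undefined) next.output = advance.output;
    if (advance.errorText !== undefined) next.errorText = advance.errorText;
    if (advance.approval !== undefined) next.approval = advance.approval;
    return next;
  });
  return hydrated.map((m, i) => (i === index ? { ...m, parts } : m));
}

// Stored chain as it might come back from a database.
const stored: Message[] = [
  { id: "user-msg-1", role: "user", parts: [{ type: "text", text: "Email the user." }] },
  {
    id: "asst-msg-1",
    role: "assistant",
    parts: [
      { type: "text", text: "I'll send that email for you." },
      {
        type: "tool-sendEmail",
        toolCallId: "call-1",
        state: "approval-requested",
        input: { to: "user@example.com", subject: "Hello" },
      },
    ],
  },
];

// Slim wire copy: same assistant id, resolved tool part only.
const slimWire: Message = {
  id: "asst-msg-1",
  role: "assistant",
  parts: [
    {
      type: "tool-sendEmail",
      toolCallId: "call-1",
      state: "approval-responded",
      approval: { id: "approval-1", approved: true },
    },
  ],
};

const mutated = upsertById(stored, slimWire);
const merged = overlayToolState(stored, slimWire);
const toolPart = merged[1]!.parts[1]!;
console.log(mutated, toolPart.state, toolPart.input);
```

In this sketch the upsert is a no-op on the continuation, so the chain keeps a single `asst-msg-1` entry, and the overlay advances the tool part's `state` while leaving its hydrated `input` in place. A blind `stored.push(slimWire)` would instead leave a second `asst-msg-1` entry with no `input`.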

## Test plan

- Mintlify dev preview renders cleanly with no broken anchors
- Linked references resolve (`/ai-chat/lifecycle-hooks#hydratemessages`, `/ai-chat/lifecycle-hooks#onvalidatemessages`, `/ai-chat/patterns/database-persistence#alternative-hydratemessages`, `/ai-chat/client-protocol#step-3-send-messages-stops-and-actions`, `/ai-chat/patterns/large-payloads`)
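
When checking the updated 413 wording, the cap arithmetic from the FAQ (`8 + body_bytes_after_JSON_wrap`, capped at 1 MiB) can be approximated with the sketch below. The exact JSON envelope is not shown in this PR, so the sketch assumes the wrap behaves like `JSON.stringify` applied to the raw body string; `estimateMeteredBytes` and `fitsRecordCap` are hypothetical names, and the real server-side check may differ.

```typescript
const RECORD_CAP_BYTES = 1024 * 1024; // 1 MiB per-record cap, as stated in the docs
const RECORD_OVERHEAD_BYTES = 8; // the "8 +" term from the docs

// Assumption: the wrap embeds the body as a JSON string, so quotes and
// backslashes each cost one extra byte. The real envelope may add more.
function estimateMeteredBytes(body: string): number {
  const wrapped = JSON.stringify(body);
  return RECORD_OVERHEAD_BYTES + new TextEncoder().encode(wrapped).length;
}

function fitsRecordCap(body: string): boolean {
  return estimateMeteredBytes(body) <= RECORD_CAP_BYTES;
}

// 900 KiB of low-escape content versus 600 KiB of all-quote content.
const plainBody = "a".repeat(900 * 1024);
const quoteHeavyBody = '"'.repeat(600 * 1024);

console.log(fitsRecordCap(plainBody), fitsRecordCap(quoteHeavyBody));
```

Under that assumption the 900 KiB plain body fits, while the 600 KiB all-quote body roughly doubles after escaping and does not, which matches the "low JSON-escape overhead" versus "escapes heavily" distinction in the FAQ.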
diff --git a/docs/ai-chat/changelog.mdx b/docs/ai-chat/changelog.mdx
@@ -4,6 +4,54 @@ sidebarTitle: "Changelog"
 description: "Pre-release updates for AI chat agents."
 ---
 
+<Update label="May 23, 2026" description="4.5.0-rc.2" tags={["SDK", "Webapp", "Bug fix"]}>
+
+## HITL continuations — slim wire by default + field-level merge
+
+`chat.addToolOutput(...)` and `chat.addToolApproveResponse(...)` continuations on reasoning-heavy agent loops used to fail in two ways: either the wire body crossed the `/in/append` cap (encrypted reasoning blobs plus tool input routinely exceed 512 KiB), or apps that slimmed the wire as a workaround landed a tool call with no `arguments` on the next LLM step (the per-turn merge replaced the hydrated message wholesale instead of overlaying only the new tool-state advance). Both failure modes are fixed.
+
+The transport (`TriggerChatTransport.sendMessages`, `AgentChat.sendRaw`) now slims the assistant message itself on `submit-message` turns whose assistant carries resolved or approval-responded tool parts. The wire shape ships as `{ id, role: "assistant", parts: [<resolved tool part only>] }` — `state` plus `output` / `errorText` / `approval`, depending on the new state. Everything else (reasoning blobs, prior text, tool `input`, provider metadata) is reconstructed server-side from `hydrateMessages` or the durable snapshot. Continuation payloads typically drop from 600 KiB – 1 MiB to ~1 KiB.
+
+The per-turn merge now overlays only the tool-part state advances (`output-available` / `output-error` / `approval-responded` / `output-denied`) from the wire copy onto the matching hydrated entry. Hydrated `input`, text, reasoning, and provider metadata stay put. The agent still accepts a fuller `UIMessage` on the wire (the merge only reads the resolved fields), so custom transports that ship more don't break — they just waste bytes.
+
+### `hydrateMessages` upsert-by-id
+
+If your `hydrateMessages` hook persists the incoming message, **upsert by id** — don't unconditionally push. HITL continuations ship the existing assistant's id with a slim payload; a blind `stored.push(newMsg)` duplicates the row in the chain you return, the merge updates the first match, and the slim duplicate hits `toModelMessages` with no `input`.
+
+A new `upsertIncomingMessage` helper is exported from `@trigger.dev/sdk/ai` to handle this for the common case:
+
+```ts
+import { chat, upsertIncomingMessage } from "@trigger.dev/sdk/ai";
+
+chat.agent({
+  hydrateMessages: async ({ chatId, trigger, incomingMessages }) => {
+    const record = await db.chat.findUnique({ where: { id: chatId } });
+    const stored = record?.messages ?? [];
+    if (upsertIncomingMessage(stored, { trigger, incomingMessages })) {
+      await db.chat.update({ where: { id: chatId }, data: { messages: stored } });
+    }
+    return stored;
+  },
+});
+```
+
+The helper pushes fresh user messages, no-ops on HITL continuations (so the runtime can overlay the new tool-state advance), and skips on non-`submit-message` triggers. Returns `true` if it mutated `stored`. The examples in [lifecycle hooks](/ai-chat/lifecycle-hooks#hydratemessages), [Database persistence](/ai-chat/patterns/database-persistence#alternative-hydratemessages), and [Persistence and replay](/ai-chat/patterns/persistence-and-replay) have all been updated. Custom hydrate logic (branching, rollback, etc.) can still write the upsert by hand — the helper is a convenience for the common shape.
+
+### `onValidateMessages` slim wire caveat
+
+The slim wire is what arrives in `onValidateMessages` on HITL turns. `validateUIMessages` from `ai` rejects the slim shape (the AI SDK schema requires `input` on resolved tool parts), so filter to user messages first (or skip validation entirely on those turns). See the updated example in [lifecycle hooks](/ai-chat/lifecycle-hooks#onvalidatemessages).
+
+### `/in/append` 413 + precise cap
+
+In parallel:
+
+- The 413 response now carries CORS headers, so browser fetches can read the status instead of failing as opaque `TypeError: Failed to fetch`. App-side retry-on-disconnect loops no longer spin forever on a permanently-rejected payload.
+- The per-record cap is now computed precisely against S2's actual ceiling instead of the conservative 512 KiB floor. Legitimate ~600 – 900 KiB tool outputs (search results, file content) now succeed; pathological all-quote content that would double in size under JSON escaping is still rejected cleanly with a clear error.
+
+See the updated [413 row in the client protocol](/ai-chat/client-protocol#step-3-send-messages-stops-and-actions).
+
+</Update>
+
 <Update label="May 21, 2026" description="4.5.0-rc.1" tags={["SDK", "Bug fix"]}>
 
 ## v4.5.0-rc.1 — two bug fixes
diff --git a/docs/ai-chat/client-protocol.mdx b/docs/ai-chat/client-protocol.mdx
@@ -692,7 +692,7 @@ The body is a JSON-serialized [`ChatInputChunk`](#chatinputchunk) — a tagged u
 | `401` | Missing or invalid `Authorization` header. |
 | `403` | Token doesn't carry `write:sessions:{externalId}`. |
 | `409` | The session is closed — `{ "ok": false, "error": "Cannot append to a closed session" }`. |
-| `413` | Body exceeds 512 KiB. A normal `kind: "message"` payload is a few KB; if you hit this you're shipping more than one message per record. |
+| `413` | Body exceeds 1 MiB **or** the wrapped record would exceed S2's ~1 MiB per-record metered ceiling. A normal `kind: "message"` payload is a few KB; if you hit this you're shipping more than one message per record or pushing a single tool output that's itself oversized. Carries CORS headers so browser fetches can read the status. |
 | `500` | Transient backend failure on the durable stream. Safe to retry — appends are idempotent on `(externalId, X-Part-Id)` if you set the optional `X-Part-Id` request header (the built-in clients set it from a UUID). |
 
 <Warning>
@@ -851,7 +851,7 @@ The agent trims trailing assistant messages from its accumulator and re-streams
 
 ### Tool approval responses
 
-When a tool requires approval (`needsApproval: true`), the agent streams the tool call with an `approval-requested` state and completes the turn. After the user approves or denies, send the **updated assistant message** (with `approval-responded` tool parts) back as a `kind: "message"` chunk — singular, not the full chain:
+When a tool requires approval (`needsApproval: true`), the agent streams the tool call with an `approval-requested` state and completes the turn. After the user approves or denies, send the **updated assistant message** back as a `kind: "message"` chunk — singular, not the full chain. The minimum shape the agent reads is just the resolved tool parts:
 
 ```json
 {
@@ -861,12 +861,10 @@ When a tool requires approval (`needsApproval: true`), the agent streams the too
       "id": "asst-msg-1",
       "role": "assistant",
       "parts": [
-        { "type": "text", "text": "I'll send that email for you." },
         {
           "type": "tool-sendEmail",
           "toolCallId": "call-1",
           "state": "approval-responded",
-          "input": { "to": "user@example.com", "subject": "Hello" },
           "approval": { "id": "approval-1", "approved": true }
         }
       ]
@@ -878,7 +876,11 @@ When a tool requires approval (`needsApproval: true`), the agent streams the too
 }
 ```
 
-The agent matches the incoming message by `id` against the rebuilt accumulator. If a match is found, it **replaces** the existing message instead of appending.
+The agent matches the incoming message by `id` against the rebuilt accumulator (or hydrated chain) and **overlays the tool-state advance** onto the matching entry — `state` plus `output` / `errorText` / `approval`, depending on the new state. Hydrated `input`, text, reasoning, and provider metadata stay put. This is what makes the slim shape above sufficient: the agent rebuilds everything else from the snapshot or from your `hydrateMessages` hook.
+
+The same shape applies to HITL `addToolOutput` answers — substitute `state: "output-available"` and `output: <result>` for the approval pair above. Single-tool HITL `addToolOutput` continuation payloads are typically ~1 KiB on the wire.
+
+The built-in transports (`TriggerChatTransport`, `AgentChat`) ship the slim shape by default on `submit-message` continuations. Custom transports can ship a fuller `UIMessage` — the agent still only reads the resolved tool-part fields — but the slim shape is the most efficient and avoids brushing the per-record cap on reasoning-heavy turns.
 
 <Note>
   The message `id` must match the one the agent assigned during streaming. `TriggerChatTransport` keeps IDs in sync automatically. Custom transports should use the `messageId` from the stream's `start` chunk.
@@ -938,7 +940,7 @@ To bridge that gap, the head-start route handler ships **full UIMessage history*
 
 Two reasons this exception is safe:
 
-1. **The route handler runs against the customer's own HTTP endpoint**, not `/realtime/v1/sessions/{id}/in/append`. The 512 KiB body cap on the realtime route doesn't apply.
+1. **The route handler runs against the customer's own HTTP endpoint**, not `/realtime/v1/sessions/{id}/in/append`. The per-record cap on the realtime route doesn't apply.
 2. **`headStartMessages` is only honored on `trigger: "handover-prepare"`**. The runtime ignores the field on every other trigger — the one-message-per-record rule still holds for normal turns.
 
 After turn 1 completes, the snapshot is written and turn 2+ run as a normal single-message-per-record chat.
@@ -1067,7 +1069,7 @@ No. `seq_num` is monotonic across the entire session — turn 1 might emit seq 0
 </Expandable>
 
 <Expandable title="What's the maximum size of a single `.in/append` body?">
-512 KiB. A typical `kind: "message"` is a few KB. If you're brushing the cap you're shipping more than one message per record, which the protocol forbids. The headStart path (`trigger: "handover-prepare"`) sends through the customer's own HTTP route handler, not `.in/append`, so the cap doesn't apply there.
+The HTTP body is capped at 1 MiB as a DoS guard. The actual ceiling is at the storage layer: each `.in/append` becomes a single S2 record, metered as `8 + body_bytes_after_JSON_wrap`, capped at 1 MiB. So the practical limit on the raw HTTP body is about 1023 KiB for content with low JSON-escape overhead (ASCII, base64) and about 512 KiB for content that escapes heavily (all quotes / backslashes). A typical `kind: "message"` is a few KiB. If you're brushing the cap, you're either shipping a single tool output that's itself oversized (see [Large payloads](/ai-chat/patterns/large-payloads)) or shipping more than one message per record, which the protocol forbids. The 413 response carries CORS headers so browser fetches can read the status. The headStart path (`trigger: "handover-prepare"`) sends through the customer's own HTTP route handler, not `.in/append`, so the cap doesn't apply there.
 </Expandable>
 
 ## See also
diff --git a/docs/ai-chat/lifecycle-hooks.mdx b/docs/ai-chat/lifecycle-hooks.mdx
@@ -242,14 +242,22 @@ import { validateUIMessages } from "ai";
 export const myChat = chat.agent({
   id: "my-chat",
   onValidateMessages: async ({ messages }) => {
-    return validateUIMessages({ messages, tools: chatTools });
+    const userMessages = messages.filter((m) => m.role === "user");
+    if (userMessages.length > 0) {
+      await validateUIMessages({ messages: userMessages, tools: chatTools });
+    }
+    return messages;
   },
   run: async ({ messages, signal }) => {
     return streamText({ model: anthropic("claude-sonnet-4-5"), messages, tools: chatTools, abortSignal: signal });
   },
 });
 ```
 
+<Warning>
+  On HITL continuations (`addToolOutput` / `addToolApproveResponse`) the assistant entry in `messages` is **slim** — `state` + `output` / `errorText` / `approval` only, no `input` or other parts. `validateUIMessages` against the AI SDK schema rejects that shape (the schema requires `input` on resolved tool parts), so filter to user messages first (or skip validation entirely on those turns). The example above does the filter.
+</Warning>
+
 <Note>
   `onValidateMessages` fires **before** `onTurnStart` and message accumulation. If you need to validate messages loaded from a database, do the loading in `onChatStart` or `onPreload` and let `onValidateMessages` validate the full incoming set each turn.
 </Note>
@@ -272,16 +280,15 @@ Use this when the backend should be the source of truth for message history: abu
 | `previousRunId`    | `string \| undefined`                                 | The previous run ID (if continuation)                     |
 
 ```ts
+import { chat, upsertIncomingMessage } from "@trigger.dev/sdk/ai";
+
 export const myChat = chat.agent({
   id: "my-chat",
   hydrateMessages: async ({ chatId, trigger, incomingMessages }) => {
     const record = await db.chat.findUnique({ where: { id: chatId } });
     const stored = record?.messages ?? [];
 
-    // Append the new user message and persist
-    if (trigger === "submit-message" && incomingMessages.length > 0) {
-      const newMsg = incomingMessages[incomingMessages.length - 1]!;
-      stored.push(newMsg);
+    if (upsertIncomingMessage(stored, { trigger, incomingMessages })) {
       await db.chat.update({
         where: { id: chatId },
         data: { messages: stored },
@@ -296,9 +303,13 @@ export const myChat = chat.agent({
 });
 ```
 
+`upsertIncomingMessage` (exported from `@trigger.dev/sdk/ai`) handles the three cases that matter — fresh user messages get pushed, HITL continuations (`addToolOutput` / `addToolApproveResponse`) no-op because the incoming wire shares the existing assistant's id and the runtime overlays the new tool-state advance onto that entry, and non-`submit-message` triggers (`regenerate-message` / `action`) skip persistence. It returns `true` when it mutated `stored`, so the caller knows whether to persist.
+
+If you need branching, rollback, or other custom hydrate logic, you can still write the upsert by hand — `upsertIncomingMessage` is a convenience for the common case, not the only supported shape.
+
 **Lifecycle position:** `onValidateMessages` → **`hydrateMessages`** → `onChatStart` (chat's first message only) → `onTurnStart` → `run()`
 
-After the hook returns, any incoming wire message whose ID matches a hydrated message is auto-merged. This makes [tool approvals](/ai-chat/frontend#tool-approvals) work transparently with hydration.
+After the hook returns, the runtime overlays the wire's tool-state advances (`output-available` / `output-error` / `approval-responded` / `output-denied`) onto matching hydrated entries by id. Everything else on the hydrated entry (text, reasoning, tool `input`, provider metadata) stays put. This makes [tool approvals](/ai-chat/frontend#tool-approvals) and HITL `addToolOutput` continuations work transparently: ship a slim resolution on the wire, and the agent merges the new state onto your DB-backed copy.
 
 <Note>
   `hydrateMessages` also fires for [action](/ai-chat/actions) turns (`trigger: "action"`) with empty `incomingMessages`. This lets the action handler work with the latest DB state.
diff --git a/docs/ai-chat/patterns/database-persistence.mdx b/docs/ai-chat/patterns/database-persistence.mdx
@@ -178,14 +178,19 @@ For apps that need the backend to be the single source of truth for message hist
 With hydration, the hook loads messages from your database on every turn. The frontend's messages are ignored (except for the new user message, which arrives in `incomingMessages`):
 
 ```ts
+import { chat, upsertIncomingMessage } from "@trigger.dev/sdk/ai";
+
 export const myChat = chat.agent({
   id: "my-chat",
   hydrateMessages: async ({ chatId, trigger, incomingMessages }) => {
     const record = await db.chat.findUnique({ where: { id: chatId } });
     const stored = record?.messages ?? [];
 
-    if (trigger === "submit-message" && incomingMessages.length > 0) {
-      stored.push(incomingMessages[incomingMessages.length - 1]!);
+    // `upsertIncomingMessage` pushes a fresh user message and no-ops
+    // on HITL continuations (the runtime overlays the new tool-state
+    // advance onto the existing entry). See lifecycle hooks for the
+    // full pattern: /ai-chat/lifecycle-hooks#hydratemessages
+    if (upsertIncomingMessage(stored, { trigger, incomingMessages })) {
       await db.chat.update({ where: { id: chatId }, data: { messages: stored } });
     }
 
diff --git a/docs/ai-chat/patterns/persistence-and-replay.mdx b/docs/ai-chat/patterns/persistence-and-replay.mdx
@@ -131,16 +131,17 @@ If `onAction` mutates `chat.history.*` and then the run crashes before the next
 When the customer registers a [`hydrateMessages`](/ai-chat/lifecycle-hooks#hydratemessages) hook, the runtime trusts the hook to be the source of truth for history. Snapshot read and replay are **skipped entirely** at boot. The hook fires per turn, returns the canonical chain from the customer's database, and the accumulator is set to whatever the hook returned.
 
 ```ts
-import { chat } from "@trigger.dev/sdk/ai";
+import { chat, upsertIncomingMessage } from "@trigger.dev/sdk/ai";
 import { db } from "@/lib/db";
 
 export const myChat = chat.agent({
   id: "my-chat",
   hydrateMessages: async ({ chatId, trigger, incomingMessages }) => {
     const stored = (await db.chat.findUnique({ where: { id: chatId } }))?.messages ?? [];
 
-    if (trigger === "submit-message" && incomingMessages.length > 0) {
-      stored.push(incomingMessages[0]!);
+    // See lifecycle-hooks for the full upsert pattern + rationale:
+    // /ai-chat/lifecycle-hooks#hydratemessages
+    if (upsertIncomingMessage(stored, { trigger, incomingMessages })) {
       await db.chat.update({ where: { id: chatId }, data: { messages: stored } });
     }
 
diff --git a/docs/ai-chat/patterns/trusted-edge-signals.mdx b/docs/ai-chat/patterns/trusted-edge-signals.mdx
@@ -115,7 +115,7 @@ The body is a JSON-serialized `ChatInputChunk`. The proxy parses it, checks `kin
 }
 ```
 
-Both bodies stay well under the [512 KiB cap on `/in/append`](/ai-chat/client-protocol#step-3-send-messages-stops-and-actions) — a typical trust object is ~200 bytes.
+Both bodies stay well under the [per-record cap on `/in/append`](/ai-chat/client-protocol#step-3-send-messages-stops-and-actions) — a typical trust object is ~200 bytes.
 
 Other paths — `.out` SSE, `/api/v1/auth/jwt/claims`, anything else — pass through the proxy untouched. The SSE stream in particular must not be buffered; preserve the response body as-is.
 
