## Summary
Updates the AI chat docs to match the slim-wire + field-level merge
behavior shipped in #3719 and the precise `.in/append` cap +
CORS-readable 413 shipped in #3720. No behavior changes here — code is
correct in `main`; the docs were lagging on three patterns customers
copy out of the page.
## What changed
- **`hydrateMessages` examples upsert by id** (in `lifecycle-hooks.mdx`,
`patterns/database-persistence.mdx`, and
`patterns/persistence-and-replay.mdx`). The previous
`stored.push(newMsg)` pattern duplicated the assistant id on HITL
continuations and caused the LLM to receive a tool call with no
`arguments`. The new examples include the rationale inline.
- **`onValidateMessages` example filters to user messages**
(`lifecycle-hooks.mdx`). The previous example called
`validateUIMessages({ messages, tools })` directly, which now throws on
HITL slim wires (the AI SDK schema requires `input` on resolved tool
parts). New example shows the filter pattern, with a Warning callout
explaining why.
- **Merge contract description updated** (`lifecycle-hooks.mdx`). The
old wording said incoming messages are "auto-merged" / "replaced"; the
new description explains the actual field-level overlay (state advances
only).
- **Approval-responded wire example slimmed** (`client-protocol.mdx`).
Shows the minimum shape the agent reads — `state` + `approval` (or
`output` / `errorText` for HITL). Notes that the built-in transports
ship this slim shape by default and that fuller shapes are still
accepted.
- **`/in/append` 413 row and FAQ updated** (`client-protocol.mdx`,
`patterns/trusted-edge-signals.mdx`). Reflects the new precise S2 cap
and the CORS-readable 413.
- **New changelog entry** at the top of `changelog.mdx` covering all of
the above.
The historical `## 512 KiB ceiling removed` entry further down the
changelog is left as-is (it's a snapshot of the prior transition), and
the v4.5 upgrade-guide section is skipped — the merge contract is
backwards compatible.
## Test plan
- Mintlify dev preview renders cleanly with no broken anchors
- Linked references resolve (`/ai-chat/lifecycle-hooks#hydratemessages`,
`/ai-chat/lifecycle-hooks#onvalidatemessages`,
`/ai-chat/patterns/database-persistence#alternative-hydratemessages`,
`/ai-chat/client-protocol#step-3-send-messages-stops-and-actions`,
`/ai-chat/patterns/large-payloads`)
`chat.addToolOutput(...)` and `chat.addToolApproveResponse(...)` continuations on reasoning-heavy agent loops used to fail in one of two ways. Either the wire body crossed the `/in/append` cap (encrypted reasoning blobs plus tool input routinely exceed 512 KiB), or apps that slimmed the wire as a workaround landed a tool call with no `arguments` on the next LLM step, because the per-turn merge replaced the hydrated message wholesale instead of overlaying only the new tool-state advance. Both failure modes are fixed.
The transport (`TriggerChatTransport.sendMessages`, `AgentChat.sendRaw`) now slims the assistant message itself on `submit-message` turns whose assistant carries resolved or approval-responded tool parts. The wire shape ships as `{ id, role: "assistant", parts: [<resolved tool part only>] }` — `state` plus `output` / `errorText` / `approval`, depending on the new state. Everything else (reasoning blobs, prior text, tool `input`, provider metadata) is reconstructed server-side from `hydrateMessages` or the durable snapshot. Continuation payloads typically drop from 600 KiB – 1 MiB to ~1 KiB.
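As a rough illustration of the slimming described above (not the transport's actual code), the sketch below reduces an assistant message to its resolved tool parts. The `Message` and `Part` types and the `slimAssistantMessage` name are hypothetical, and keeping `type` and `toolCallId` on the slim part is an assumption made so the part can still be matched server-side.

```typescript
// Hypothetical, simplified types: the real UIMessage type in the AI SDK is richer.
interface Part {
  type: string;
  state?: string;
  toolCallId?: string;
  [key: string]: unknown;
}

interface Message {
  id: string;
  role: "user" | "assistant";
  parts: Part[];
}

// The tool-state advances named in this changelog entry.
const RESOLVED_STATES: ReadonlySet<string> = new Set([
  "output-available",
  "output-error",
  "approval-responded",
  "output-denied",
]);

// Fields copied onto the slim part. `toolCallId` is an assumption for matching.
const KEPT_FIELDS = ["toolCallId", "output", "errorText", "approval"];

// Keep only resolved tool parts, and on each one keep only the listed fields.
function slimAssistantMessage(msg: Message): Message {
  const parts = msg.parts
    .filter((p) => p.state !== undefined && RESOLVED_STATES.has(p.state))
    .map((p) => {
      const slim: Part = { type: p.type, state: p.state };
      for (const key of KEPT_FIELDS) {
        if (p[key] !== undefined) slim[key] = p[key];
      }
      return slim;
    });
  return { id: msg.id, role: msg.role, parts };
}
```

Reasoning blobs, prior text, and tool `input` are dropped here because the text says they are reconstructed server-side.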
The per-turn merge now overlays only the tool-part state advances (`output-available` / `output-error` / `approval-responded` / `output-denied`) from the wire copy onto the matching hydrated entry. Hydrated `input`, text, reasoning, and provider metadata stay put. The agent still accepts a fuller `UIMessage` on the wire (the merge only reads the resolved fields), so custom transports that ship more don't break — they just waste bytes.
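The overlay described above can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the runtime's implementation: the `ChatMessage` and `ToolPart` shapes and the `overlayToolState` name are hypothetical, and matching parts by `toolCallId` is an assumption (the text only states that messages are matched by id).

```typescript
// Hypothetical, simplified shapes for illustration only.
interface ToolPart {
  type: string;
  toolCallId?: string;
  state?: string;
  [key: string]: unknown;
}

interface ChatMessage {
  id: string;
  role: string;
  parts: ToolPart[];
}

const ADVANCES = new Set(["output-available", "output-error", "approval-responded", "output-denied"]);
const OVERLAY_FIELDS = ["state", "output", "errorText", "approval"];

// Copy only the resolved tool-part fields from the wire copy onto the hydrated
// entry. Hydrated input, text, reasoning, and provider metadata are untouched.
function overlayToolState(hydrated: ChatMessage, wire: ChatMessage): ChatMessage {
  const parts = hydrated.parts.map((part) => {
    const incoming = wire.parts.find(
      (w) => w.toolCallId !== undefined && w.toolCallId === part.toolCallId,
    );
    if (!incoming || !ADVANCES.has(incoming.state ?? "")) return part;
    const merged: ToolPart = { ...part };
    for (const field of OVERLAY_FIELDS) {
      if (incoming[field] !== undefined) merged[field] = incoming[field];
    }
    return merged;
  });
  return { ...hydrated, parts };
}
```

Because only the listed fields are read from the wire copy, a fuller wire message produces the same result as a slim one.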
### `hydrateMessages` upsert-by-id
If your `hydrateMessages` hook persists the incoming message, **upsert by id** — don't unconditionally push. HITL continuations ship the existing assistant's id with a slim payload; a blind `stored.push(newMsg)` duplicates the row in the chain you return, the merge updates the first match, and the slim duplicate hits `toModelMessages` with no `input`.
A new `upsertIncomingMessage` helper is exported from `@trigger.dev/sdk/ai` to handle this for the common case:
The helper pushes fresh user messages, no-ops on HITL continuations (so the runtime can overlay the new tool-state advance), and skips on non-`submit-message` triggers. Returns `true` if it mutated `stored`. The examples in [lifecycle hooks](/ai-chat/lifecycle-hooks#hydratemessages), [Database persistence](/ai-chat/patterns/database-persistence#alternative-hydratemessages), and [Persistence and replay](/ai-chat/patterns/persistence-and-replay) have all been updated. Custom hydrate logic (branching, rollback, etc.) can still write the upsert by hand — the helper is a convenience for the common shape.
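For custom hydrate logic, a hand-written upsert might look roughly like the sketch below. It mirrors the three cases described above but is not the SDK helper: `upsertById`, `StoredMessage`, and the parameter list are hypothetical, and the real `upsertIncomingMessage` signature may differ.

```typescript
// Hypothetical stored-message shape, for illustration only.
interface StoredMessage {
  id: string;
  role: "user" | "assistant" | "system";
  parts: unknown[];
}

// Hand-written upsert by id. Returns true when `stored` was mutated,
// so the caller knows whether to persist.
function upsertById(
  stored: StoredMessage[],
  incoming: StoredMessage,
  trigger: string,
): boolean {
  // Non-`submit-message` triggers (regenerate, action): skip persistence.
  if (trigger !== "submit-message") return false;
  // HITL continuation: the id already exists, so leave the stored row alone
  // and let the runtime overlay the new tool state onto it.
  if (stored.some((m) => m.id === incoming.id)) return false;
  // Fresh message: append it.
  stored.push(incoming);
  return true;
}
```

The important property is that a continuation carrying an existing assistant id never produces a second row in the chain returned from the hook.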
### `onValidateMessages` slim wire caveat
The slim wire is what arrives in `onValidateMessages` on HITL turns. `validateUIMessages` from `ai` rejects the slim shape (the AI SDK schema requires `input` on resolved tool parts), so filter to user messages first (or skip validation entirely on those turns). See the updated example in [lifecycle hooks](/ai-chat/lifecycle-hooks#onvalidatemessages).
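A minimal sketch of the filter pattern follows, with the validator passed in as a callback so the example stays self-contained. The `validateUserMessagesOnly` name and `IncomingMessage` shape are hypothetical; in a real hook the callback would wrap a validator such as `validateUIMessages` from `ai`.

```typescript
// Hypothetical, simplified incoming-message shape.
interface IncomingMessage {
  id: string;
  role: string;
  parts: unknown[];
}

// Validate only user messages. Slim assistant entries from HITL continuations
// pass through unvalidated, since a strict schema would reject them.
async function validateUserMessagesOnly(
  messages: IncomingMessage[],
  validate: (userMessages: IncomingMessage[]) => Promise<void>,
): Promise<IncomingMessage[]> {
  const userMessages = messages.filter((m) => m.role === "user");
  if (userMessages.length > 0) {
    await validate(userMessages); // expected to throw on invalid input
  }
  return messages;
}
```

On a pure HITL turn there are no user messages, so the validator is not called at all, which matches the "skip validation entirely on those turns" option.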
### `/in/append` 413 + precise cap
In parallel:
- The 413 response now carries CORS headers, so browser fetches can read the status instead of failing as opaque `TypeError: Failed to fetch`. App-side retry-on-disconnect loops no longer spin forever on a permanently-rejected payload.
- The per-record cap is now computed precisely against S2's actual ceiling instead of the conservative 512 KiB floor. Legitimate ~600 – 900 KiB tool outputs (search results, file content) now succeed; pathological all-quote content that would double under JSON escape still rejects cleanly with a clear error.
See the updated [413 row in the client protocol](/ai-chat/client-protocol#step-3-send-messages-stops-and-actions).
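With the 413 status now readable in the browser, an app-side retry loop can tell a permanent rejection from a transient failure. The sketch below is one possible classification, assuming the status meanings from the client protocol table (`409` closed session, `413` oversized, `500` transient). The `classifyAppendStatus` name is hypothetical, and treating every other 4xx as permanent is an assumption.

```typescript
type AppendOutcome = "ok" | "retry" | "give-up";

// Decide what a retry loop should do with an append response status.
function classifyAppendStatus(status: number): AppendOutcome {
  if (status >= 200 && status < 300) return "ok";
  // Permanent rejections: resending the same payload cannot succeed.
  if (status === 413 || status === 409) return "give-up";
  // Transient backend failure: documented as safe to retry.
  if (status >= 500) return "retry";
  // Assumption: any other client error is also not worth retrying.
  return "give-up";
}
```

A loop that only retries on `"retry"` stops after the first 413 instead of resending the same oversized payload.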
| `409` | The session is closed — `{ "ok": false, "error": "Cannot append to a closed session" }`. |
-| `413` | Body exceeds 512 KiB. A normal `kind: "message"` payload is a few KB; if you hit this you're shipping more than one message per record. |
+| `413` | Body exceeds 1 MiB **or** the wrapped record would exceed S2's ~1 MiB per-record metered ceiling. A normal `kind: "message"` payload is a few KB; if you hit this you're shipping more than one message per record or pushing a single tool output that's itself oversized. Carries CORS headers so browser fetches can read the status. |
| `500` | Transient backend failure on the durable stream. Safe to retry: appends are idempotent on `(externalId, X-Part-Id)` if you set the optional `X-Part-Id` request header (the built-in clients set it from a UUID). |

<Warning>
@@ -851,7 +851,7 @@ The agent trims trailing assistant messages from its accumulator and re-streams

### Tool approval responses

-When a tool requires approval (`needsApproval: true`), the agent streams the tool call with an `approval-requested` state and completes the turn. After the user approves or denies, send the **updated assistant message** (with `approval-responded` tool parts) back as a `kind: "message"` chunk (singular, not the full chain):
+When a tool requires approval (`needsApproval: true`), the agent streams the tool call with an `approval-requested` state and completes the turn. After the user approves or denies, send the **updated assistant message** back as a `kind: "message"` chunk (singular, not the full chain). The minimum shape the agent reads is just the resolved tool parts:

```json
{
@@ -861,12 +861,10 @@ When a tool requires approval (`needsApproval: true`), the agent streams the too
"id": "asst-msg-1",
"role": "assistant",
"parts": [
-{ "type": "text", "text": "I'll send that email for you." },
@@ -878,7 +876,11 @@ When a tool requires approval (`needsApproval: true`), the agent streams the too
}
```

-The agent matches the incoming message by `id` against the rebuilt accumulator. If a match is found, it **replaces** the existing message instead of appending.
+The agent matches the incoming message by `id` against the rebuilt accumulator (or hydrated chain) and **overlays the tool-state advance** onto the matching entry: `state` plus `output` / `errorText` / `approval`, depending on the new state. Hydrated `input`, text, reasoning, and provider metadata stay put. This is what makes the slim shape above sufficient: the agent rebuilds everything else from the snapshot or from your `hydrateMessages` hook.
+
+The same shape applies to HITL `addToolOutput` answers: substitute `state: "output-available"` and `output: <result>` for the approval pair above. Single-tool HITL `addToolOutput` continuation payloads are typically ~1 KiB on the wire.
+
+The built-in transports (`TriggerChatTransport`, `AgentChat`) ship the slim shape by default on `submit-message` continuations. Custom transports can ship a fuller `UIMessage` (the agent still only reads the resolved tool-part fields), but the slim shape is the most efficient and avoids brushing the per-record cap on reasoning-heavy turns.

<Note>
The message `id` must match the one the agent assigned during streaming. `TriggerChatTransport` keeps IDs in sync automatically. Custom transports should use the `messageId` from the stream's `start` chunk.
@@ -938,7 +940,7 @@ To bridge that gap, the head-start route handler ships **full UIMessage history*

Two reasons this exception is safe:

-1. **The route handler runs against the customer's own HTTP endpoint**, not `/realtime/v1/sessions/{id}/in/append`. The 512 KiB body cap on the realtime route doesn't apply.
+1. **The route handler runs against the customer's own HTTP endpoint**, not `/realtime/v1/sessions/{id}/in/append`. The per-record cap on the realtime route doesn't apply.
2. **`headStartMessages` is only honored on `trigger: "handover-prepare"`**. The runtime ignores the field on every other trigger, so the one-message-per-record rule still holds for normal turns.

After turn 1 completes, the snapshot is written and turn 2+ run as a normal single-message-per-record chat.
@@ -1067,7 +1069,7 @@ No. `seq_num` is monotonic across the entire session — turn 1 might emit seq 0
</Expandable>
<Expandable title="What's the maximum size of a single `.in/append` body?">
-512 KiB. A typical `kind: "message"` is a few KB. If you're brushing the cap you're shipping more than one message per record, which the protocol forbids. The headStart path (`trigger: "handover-prepare"`) sends through the customer's own HTTP route handler, not `.in/append`, so the cap doesn't apply there.
+The HTTP body is capped at 1 MiB as a DoS guard. The actual ceiling is at the storage layer: each `.in/append` becomes a single S2 record, metered as `8 + body_bytes_after_JSON_wrap`, capped at 1 MiB. So the practical limit on the raw HTTP body sits around 1023 KiB for content with low JSON-escape overhead (ASCII, base64) and around 512 KiB for content that escapes heavily (all quotes / backslashes). A typical `kind: "message"` is a few KiB. If you're brushing the cap, you're either shipping a single tool output that's itself oversized (see [Large payloads](/ai-chat/patterns/large-payloads)) or shipping more than one message per record, which the protocol forbids. The 413 response carries CORS headers so browser fetches can read the status. The headStart path (`trigger: "handover-prepare"`) sends through the customer's own HTTP route handler, not `.in/append`, so the cap doesn't apply there.
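The metering arithmetic can be approximated client-side to pre-check a payload before sending. The sketch below is an estimate under a loud assumption: it models the "JSON wrap" as encoding the raw body as a JSON string, which may not match the server's exact wrapping. The function and constant names are hypothetical.

```typescript
const S2_RECORD_CAP_BYTES = 1024 * 1024; // 1 MiB per-record cap
const RECORD_OVERHEAD_BYTES = 8; // the fixed "8 +" term in the metering formula

// Assumption: the wrap behaves like JSON string encoding of the raw body,
// so quotes and backslashes each cost two bytes after escaping.
function estimateMeteredBytes(rawBody: string): number {
  const wrapped = JSON.stringify(rawBody);
  return RECORD_OVERHEAD_BYTES + new TextEncoder().encode(wrapped).length;
}

// True when the estimate fits under the per-record cap.
function fitsInRecord(rawBody: string): boolean {
  return estimateMeteredBytes(rawBody) <= S2_RECORD_CAP_BYTES;
}
```

Under this model a 1023 KiB body of plain ASCII fits, while a 600 KiB body made entirely of quote characters roughly doubles and is rejected, which lines up with the two practical limits given above.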
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, tools: chatTools, abortSignal: signal });
  },
});
```

<Warning>
On HITL continuations (`addToolOutput` / `addToolApproveResponse`) the assistant entry in `messages` is **slim**: `state` + `output` / `errorText` / `approval` only, no `input` or other parts. `validateUIMessages` against the AI SDK schema rejects that shape (the schema requires `input` on resolved tool parts), so filter to user messages first (or skip validation entirely on those turns). The example above does the filter.
</Warning>

<Note>
`onValidateMessages` fires **before** `onTurnStart` and message accumulation. If you need to validate messages loaded from a database, do the loading in `onChatStart` or `onPreload` and let `onValidateMessages` validate the full incoming set each turn.
</Note>
@@ -272,16 +280,15 @@ Use this when the backend should be the source of truth for message history: abu
| `previousRunId` | `string \| undefined` | The previous run ID (if continuation) |
`upsertIncomingMessage` (exported from `@trigger.dev/sdk/ai`) handles the three cases that matter — fresh user messages get pushed, HITL continuations (`addToolOutput` / `addToolApproveResponse`) no-op because the incoming wire shares the existing assistant's id and the runtime overlays the new tool-state advance onto that entry, and non-`submit-message` triggers (`regenerate-message` / `action`) skip persistence. It returns `true` when it mutated `stored`, so the caller knows whether to persist.
If you need branching, rollback, or other custom hydrate logic, you can still write the upsert by hand — `upsertIncomingMessage` is a convenience for the common case, not the only supported shape.
-After the hook returns, any incoming wire message whose ID matches a hydrated message is auto-merged. This makes [tool approvals](/ai-chat/frontend#tool-approvals) work transparently with hydration.
+After the hook returns, the runtime overlays the wire's tool-state advances (`output-available` / `output-error` / `approval-responded` / `output-denied`) onto matching hydrated entries by id. Everything else on the hydrated entry (text, reasoning, tool `input`, `providerMetadata`) stays put. This makes [tool approvals](/ai-chat/frontend#tool-approvals) and HITL `addToolOutput` continuations work transparently: ship a slim resolution on the wire, and the agent merges the new state onto your DB-backed copy.

<Note>
`hydrateMessages` also fires for [action](/ai-chat/actions) turns (`trigger: "action"`) with empty `incomingMessages`. This lets the action handler work with the latest DB state.
File: `docs/ai-chat/patterns/database-persistence.mdx` (7 additions, 2 deletions)
@@ -178,14 +178,19 @@ For apps that need the backend to be the single source of truth for message hist
With hydration, the hook loads messages from your database on every turn. The frontend's messages are ignored (except for the new user message, which arrives in `incomingMessages`):
File: `docs/ai-chat/patterns/persistence-and-replay.mdx` (4 additions, 3 deletions)
@@ -131,16 +131,17 @@ If `onAction` mutates `chat.history.*` and then the run crashes before the next
When the customer registers a [`hydrateMessages`](/ai-chat/lifecycle-hooks#hydratemessages) hook, the runtime trusts the hook to be the source of truth for history. Snapshot read and replay are **skipped entirely** at boot. The hook fires per turn, returns the canonical chain from the customer's database, and the accumulator is set to whatever the hook returned.
File: `docs/ai-chat/patterns/trusted-edge-signals.mdx` (1 addition, 1 deletion)
@@ -115,7 +115,7 @@ The body is a JSON-serialized `ChatInputChunk`. The proxy parses it, checks `kin
}
```

-Both bodies stay well under the [512 KiB cap on `/in/append`](/ai-chat/client-protocol#step-3-send-messages-stops-and-actions); a typical trust object is ~200 bytes.
+Both bodies stay well under the [per-record cap on `/in/append`](/ai-chat/client-protocol#step-3-send-messages-stops-and-actions); a typical trust object is ~200 bytes.

Other paths — `.out` SSE, `/api/v1/auth/jwt/claims`, anything else — pass through the proxy untouched. The SSE stream in particular must not be buffered; preserve the response body as-is.