
Stop → Continue: resume truncated or aborted assistant turns #49

Open
vahid-ahmadi wants to merge 1 commit into main from feat/stop-continue

Conversation

@vahid-ahmadi
Contributor

Summary

A long policy analysis that hits the 16k `max_tokens` cap, or one the user kills with Stop, currently dies there — they have to start a new chat and re-explain context. This PR adds a Continue affordance below those messages that resumes from exactly where the answer stopped.

Closes #44.

Behaviour

  • Truncation detection (backend): `backend/routes/chatbot.py` captures `final.stop_reason` from each Anthropic stream and propagates it on the `done` SSE event. `stop_reason === "max_tokens"` means truncated; `"end_turn"` / `"stop_sequence"` mean the model finished naturally.
  • Manual stop detection (frontend): existing `AbortController` flow now also tags the in-progress message with `stopped: true` before flushing.
  • Continue button: renders below the cost line on assistant messages where `stop_reason === "max_tokens" || stopped`, hidden if any tool in the message is still pending (avoids orphan tool calls).
  • Resume in place: `continueMessage(idx)` posts the conversation up to and including the partial assistant turn back to `/chat/message`. Anthropic's assistant-prefill behaviour means the model continues the same logical turn — no extra "continue from where you stopped" user nudge needed. Streamed deltas append into the same bubble. Cost is summed onto the existing `cost_gbp`.
  • Persistence: `stop_reason` / `stopped` / `cost_gbp` are now serialised by `saveConversation` and restored by `loadConversation`, so the Continue affordance survives a page reload.
  • Per the issue's "out of scope": re-truncation is allowed (button reappears), but no auto-looping — every continuation requires a user click.
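The visibility rule described above can be sketched as a pure predicate. This is a minimal illustration, not the actual `ChatPage.tsx` code — the `Message` and `ToolCall` shapes and the `showContinue` name are assumptions for the sketch:

```typescript
// Illustrative shapes; the real Message type in ChatPage.tsx may differ.
type ToolCall = { name: string; status: "pending" | "done" | "error" };

type Message = {
  role: "user" | "assistant";
  content: string;
  stop_reason?: "max_tokens" | "end_turn" | "stop_sequence";
  stopped?: boolean;
  cost_gbp?: number;
  tools?: ToolCall[];
};

// Show Continue only on truncated or manually stopped assistant turns,
// and never while a tool call is still pending (orphan-tool guard).
function showContinue(msg: Message): boolean {
  if (msg.role !== "assistant") return false;
  const interrupted = msg.stop_reason === "max_tokens" || msg.stopped === true;
  const toolPending = (msg.tools ?? []).some((t) => t.status === "pending");
  return interrupted && !toolPending;
}
```

Keeping the rule as a standalone predicate (rather than inlining it in JSX) makes the orphan-tool guard trivially unit-testable.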

Implementation notes

  • No structural backend changes — same loop, same tools, same prompt cache.
  • `continueMessage` deliberately reuses the streaming protocol but skips the typewriter drain animation: resumed text is appended directly to the message rather than animated, which feels right (the user has already read the prefix).
  • Tool-call caveat: the partial assistant turn may have triggered tool calls. The new request only carries serialized text (the existing pattern for `apiMessages`), so the model continues from the text alone. That's lossy but consistent with how the project already handles multi-turn conversations.
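Given that text-only constraint, the resume payload is just a prefix slice of the serialized history that ends on the partial assistant turn. A hedged sketch — `buildResumePayload` and `ApiMessage` are illustrative names, not the actual `continueMessage` implementation:

```typescript
// Illustrative shape of the serialized history sent to /chat/message.
type ApiMessage = { role: "user" | "assistant"; content: string };

function buildResumePayload(messages: ApiMessage[], idx: number): ApiMessage[] {
  // Everything up to AND including the partial assistant turn at idx.
  const history = messages.slice(0, idx + 1);
  const last = history[history.length - 1];
  if (last?.role !== "assistant") {
    throw new Error("continue target must be an assistant message");
  }
  // Ending the request on an assistant message relies on Anthropic's
  // assistant-prefill behaviour: the model extends this turn instead of
  // starting a new one, so no "continue" user nudge is needed.
  return history;
}
```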

Test plan

  • `docker-compose up`. Ask a question that produces a long answer (e.g. "Walk through the entire UK income tax code with examples"). Watch for max_tokens truncation → Continue appears → click → answer resumes inline, cost sums up.
  • Mid-stream click Stop → partial text preserved, Continue appears, click → resumes from exactly where it stopped.
  • Click Stop while a tool is pending → Continue button is hidden for that message (orphan-tool guard).
  • Reload the page (or open from history) → Continue affordance is still there on truncated/stopped messages.
  • Plan-mode replies don't show Continue (they end with `stop_reason: "end_turn"`).
  • Existing chat tests (`pytest backend/tests/test_api.py`) still pass — the only backend change is an additive field on the `done` event.

Out of scope

  • "Continue indefinitely" auto-looping — explicitly user-driven only.
  • Reconstructing tool_use / tool_result blocks across the request boundary (would let the model continue with full tool context but is a bigger refactor).

Backend (chatbot.py):
- Capture `final.stop_reason` from each Anthropic stream and propagate it
  on the `done` SSE event. The frontend uses "max_tokens" to detect
  truncation; "end_turn" / "stop_sequence" mean the model finished cleanly.

Frontend (ChatPage.tsx):
- Extend Message with `stop_reason` and `stopped` flags. The `done` handler
  stores `stop_reason`; the AbortError catch (user clicks Stop) sets
  `stopped: true`. Both flags survive saveConversation/loadConversation
  via the untyped messages JSON column on the backend.
- Render a Continue affordance below any message where
  `stop_reason === "max_tokens" || stopped`, hidden if a tool is still
  pending in the message (no orphan tool calls).
- New `continueMessage(idx)` posts the conversation up to and including
  the partial assistant turn back to /chat/message. Anthropic's
  assistant-prefill behaviour means the model continues the same logical
  turn — no "Continue from where you stopped" nudge needed. Streamed
  content appends into the SAME message bubble; cost is summed onto the
  existing `cost_gbp`. If continue itself truncates or is stopped, the
  affordance comes back (user-driven, not auto-loop).
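The in-place merge above can be sketched as a small reducer over the existing bubble (names are illustrative; the real `done` handler in ChatPage.tsx may differ):

```typescript
// Illustrative bubble shape for the continuation merge.
type Bubble = {
  content: string;
  cost_gbp: number;
  stop_reason?: string;
  stopped?: boolean;
};

function mergeContinuation(
  msg: Bubble,
  deltaText: string,
  deltaCost: number,
  stopReason: string
): Bubble {
  return {
    ...msg,
    content: msg.content + deltaText, // same bubble, appended text
    cost_gbp: msg.cost_gbp + deltaCost, // summed onto existing cost
    stop_reason: stopReason, // re-truncation re-shows the affordance
    stopped: false, // cleared; a new Stop click can set it again
  };
}
```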

Acceptance criteria:
- max_tokens → Continue button appears.
- User clicks Stop mid-stream → partial preserved + Continue appears.
- Continue resumes in-place, single bubble, summed cost.
- Out of scope: indefinitely auto-continuing — kept user-triggered.

Closes #44

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| policyengine-uk-chat | Ready | Preview, Comment | May 6, 2026 11:47am |


@github-actions

github-actions Bot commented May 6, 2026

Beta preview is ready.

@vahid-ahmadi vahid-ahmadi self-assigned this May 6, 2026
@vahid-ahmadi vahid-ahmadi requested a review from SakshiKekre May 6, 2026 12:00
