
feat: add plan mode — agent asks clarifying questions before running#38

Merged
vahid-ahmadi merged 2 commits into main from feat/plan-mode on May 6, 2026

Conversation

@vahid-ahmadi
Contributor

Summary

Adds a per-message Plan mode toggle to the chat input. When on, the agent asks 1–3 numbered clarifying questions before running any tools, then proceeds normally on the next turn once the user replies. Modeled after Claude Code's plan mode.

Policy questions are routinely ambiguous ("model a wealth tax", "abolish NI", "raise the personal allowance"). Running a 30-second simulation on the wrong assumptions is worse than asking a 5-second clarifying question first.

Behaviour

  1. User toggles plan mode on, types "abolish NI", sends.
  2. Request includes plan_mode: true.
  3. Backend appends PLAN_MODE_DIRECTIVE to the system prompt — agent responds with 1–3 numbered questions, no tool calls.
  4. Toggle auto-resets to off after send.
  5. User's follow-up answer proceeds with full tool use.

Backend

  • ChatRequest gains plan_mode: bool = False.
  • PLAN_MODE_DIRECTIVE tells the model: don't call tools; ask 1–3 clarifying questions as a numbered list, with no preamble.
  • _build_system_blocks(plan_mode) appends the directive after the cache_control breakpoint so toggling plan mode never invalidates the cached base prompt.

Frontend

  • planMode state in ChatPage.tsx, sent as plan_mode in the POST body.
  • Toggle button (IconBulb) in the input footer — always visible, styled as a pill that flips to the accent colour when on.
  • sendMessage resets planMode to false after capturing it for the current send.

Smoke test (passed)

  • curl with plan_mode=true and "Abolish national insurance" → 3 numbered clarifying questions (year, which NICs, what metric), zero tool calls.
  • curl with plan_mode=false and a normal question → immediate run_python tool use.
  • Second turn (toggle reset) proceeds with full analysis.

Test plan

  • docker-compose up → open http://localhost:3006
  • Toggle plan mode on → ask "model a wealth tax" → expect 1–3 numbered questions, no tool call panel
  • Confirm toggle visually returns to off after send
  • Reply with answers → expect the agent to proceed with simulation (tool calls visible)
  • With plan mode off, ask an unambiguous question → expect normal tool use

🤖 Generated with Claude Code

Per-message toggle in the chat input. When on, the request includes
plan_mode=true and the backend appends a directive to the system
prompt telling the agent to ask 1–3 numbered clarifying questions and
skip all tool use for that turn. Toggle auto-resets to off after send
so follow-ups proceed normally.

Backend:
- ChatRequest gains plan_mode: bool = False
- _build_system_blocks(plan_mode) appends PLAN_MODE_DIRECTIVE AFTER the
  cache_control breakpoint, so toggling plan mode does not invalidate
  the cached base prompt

Frontend:
- planMode state in ChatPage, plan_mode in the fetch body
- Toggle button (IconBulb) in the input footer, always visible, reset
  in sendMessage

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented Apr 21, 2026

The latest updates on your projects.

Project: policyengine-uk-chat · Deployment: Ready · Actions: Preview, Comment · Updated (UTC): May 5, 2026 8:55am


@github-actions

github-actions Bot commented Apr 21, 2026

Beta preview has been cleaned up because this PR was closed.

@vahid-ahmadi vahid-ahmadi self-assigned this Apr 24, 2026

@SakshiKekre left a comment


Great idea adding the plan mode! The UI shape looks reasonable, I just think the backend should enforce plan mode rather than relying only on prompt text.

Right now plan_mode=true appends PLAN_MODE_DIRECTIVE, but the request still sends the normal tool list to Anthropic. If the model ignores the directive and emits a tool call, the existing loop will execute it. Since the feature promise is “ask clarifying questions before running tools,” could we make that invariant code-level?

Non-blocking suggestion: could we also add a small backend regression test for that contract? Suggested shape:

  • assert _build_system_blocks(plan_mode=True) includes PLAN_MODE_DIRECTIVE
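The suggested test could look something like this sketch. In the repo these names would be imported from the backend module; the inline stub here is a stand-in so the sketch is self-contained:

```python
# Stand-in stub -- the real test would import PLAN_MODE_DIRECTIVE and
# _build_system_blocks from the backend instead of defining them here.
PLAN_MODE_DIRECTIVE = "Ask 1-3 clarifying questions as a numbered list; no tools."


def _build_system_blocks(plan_mode: bool) -> list[dict]:  # stub
    blocks = [
        {"type": "text", "text": "base prompt", "cache_control": {"type": "ephemeral"}}
    ]
    if plan_mode:
        blocks.append({"type": "text", "text": PLAN_MODE_DIRECTIVE})
    return blocks


def test_plan_mode_directive_present_iff_enabled():
    on = _build_system_blocks(plan_mode=True)
    off = _build_system_blocks(plan_mode=False)
    assert any(PLAN_MODE_DIRECTIVE in b["text"] for b in on)
    assert not any(PLAN_MODE_DIRECTIVE in b["text"] for b in off)


def test_cache_breakpoint_block_unchanged():
    # Toggling plan mode must not touch the cached base block.
    assert _build_system_blocks(True)[0] == _build_system_blocks(False)[0]
```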

Address review feedback: plan mode previously relied on the model
obeying a system-prompt directive while still being given the full
tool list. Now plan_mode=True omits `tools` from the Anthropic call
entirely, so tool_use blocks cannot be emitted. A defence-in-depth
guard drops any tool_use that surfaces anyway. Adds regression tests
covering the directive, cache breakpoint, and request schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vahid-ahmadi
Contributor Author

Thanks @SakshiKekre — good catch. Pushed 82baaf4:

  • Structural enforcement: when plan_mode=True, the Anthropic request now omits tools entirely (backend/routes/chatbot.py). Without a tools list, the API cannot emit tool_use blocks — the "no tool execution in plan mode" promise is now a property of the request shape, not a prompt-level hope.
  • Defence-in-depth guard: if a tool_use block ever surfaces in plan mode anyway, it's logged and skipped instead of executed.
  • Regression tests (backend/tests/test_api.py::TestPlanMode): directive present iff plan_mode=True, base-prompt cache breakpoint unchanged across toggle, and ChatRequest accepts plan_mode with default False.
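The enforcement described above can be sketched as follows. `run_python` and `plan_mode` come from the PR; the request-builder shape, tool schema, and helper names are illustrative assumptions, not the code in backend/routes/chatbot.py:

```python
# Sketch only -- helper names and shapes are assumptions.
TOOLS = [
    {"name": "run_python", "description": "Run a simulation", "input_schema": {"type": "object"}}
]


def build_request_kwargs(plan_mode: bool) -> dict:
    kwargs = {"model": "claude-sonnet", "max_tokens": 1024}
    if not plan_mode:
        # Structural enforcement: plan-mode requests omit `tools` entirely,
        # so the API has nothing to emit tool_use blocks against.
        kwargs["tools"] = TOOLS
    return kwargs


def filter_blocks(plan_mode: bool, blocks: list[dict]) -> list[dict]:
    # Defence-in-depth: if a tool_use block surfaces in plan mode anyway,
    # drop it (the real code also logs it) instead of executing it.
    if not plan_mode:
        return blocks
    return [b for b in blocks if b.get("type") != "tool_use"]
```

The point of the two layers is that the guard should never fire in practice; the request shape alone makes tool use impossible in plan mode.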

@vahid-ahmadi vahid-ahmadi merged commit 37ff283 into main May 6, 2026
4 checks passed