feat(coordinator): MCP orchestration engine, REST API hardening, and task model #120

Open
brooksc wants to merge 1 commit into johannesjo:main from brooksc:coordinator-2-mcp-backend

brooksc (Contributor) commented May 16, 2026

Overview

This is PR 2 of 4 in the coordinator series splitting #100 as requested in the round-4 review. It is stacked on PR 1 (coordinator-1-security) and should be merged after that one. The diff shown here includes PR 1's content; the meaningful delta for this PR is the coordinator engine and REST hardening described below.

PR sequence:

| PR | Branch | Status | Contents |
| --- | --- | --- | --- |
| 1 | coordinator-1-security | Open | Atomic writes, input validators, static analysis configs |
| 2 (this PR) | coordinator-2-mcp-backend | Open | MCP coordinator engine + REST API hardening |
| 3 | coordinator-3-store-ipc | Pending | Frontend store wiring + IPC handlers |
| 4 | coordinator-4-ui | Pending | UI components + coordinator entry points |

PRs 3–4 are stacked on this one. Nothing coordinator-related is user-visible until PR 4 adds the NewTaskDialog checkbox and Settings toggle (coordinatorModeEnabled defaults to false).


What's in this PR (delta over PR 1)

MCP orchestration engine (electron/mcp/)

coordinator.ts — core orchestrator singleton

  • Creates and manages coordinated sub-tasks, each in its own git worktree
  • Three-class token RBAC: coordinator / subtask / mobile
  • Per-task done tokens (24-byte random, timing-safe comparison)
  • PTY output subscription for idle/prompt detection
  • Batched review notifications with configurable delay and restaging
  • signal_done / waitForSignalDone with replay cache (requestId dedup for safe retries)
  • Atomic preamble injection and strip via atomic.ts
  • setTaskControl / blockedByHumanControl state machine (coordinator ↔ human hand-off)
  • cleanupTask with double-resolve guard on anySignalResolvers
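As a rough illustration of the per-task done tokens listed above, the following sketch mints a 24-byte random token and compares it in constant time. The helper names `createDoneToken` and `doneTokenMatches` and the hex encoding are assumptions made for this example; the PR does not show how coordinator.ts encodes or stores its tokens.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical helper: mint a 24-byte random token, hex-encoded (48 characters).
function createDoneToken(): string {
  return randomBytes(24).toString("hex");
}

// Hypothetical helper: compare tokens without leaking timing information.
// timingSafeEqual throws when the buffers differ in length, so check that first.
function doneTokenMatches(expected: string, presented: string): boolean {
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(presented, "utf8");
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```

The early length check does reveal the token length, which is harmless here because every token in this sketch has the same fixed length.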

server.ts — MCP stdio entry point

  • Speaks MCP over stdio to Claude Code; delegates to Electron app via HTTP
  • CLI arg validation: rejects \r/\n in --coordinator-id / --task-id (header injection prevention)
  • Exposes tools: create_task, list_tasks, get_task_status, get_task_diff, get_task_output, send_prompt, wait_for_idle, wait_for_signal_done, merge_task, close_task, review_and_merge_task, signal_done
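The CR/LF rejection for `--coordinator-id` and `--task-id` might look roughly like the sketch below. The function name `assertHeaderSafe` and the error wording are hypothetical; only the rule itself (no `\r` or `\n` in values that end up in HTTP headers) comes from the description above.

```typescript
// Hypothetical validator: reject CR/LF so the value cannot smuggle extra
// header lines when it is later sent as an HTTP header value.
function assertHeaderSafe(flag: string, value: string): string {
  if (/[\r\n]/.test(value)) {
    throw new Error(`${flag} must not contain CR or LF characters`);
  }
  return value;
}
```

Validating once at argument-parsing time means every later use of the value in a header can assume it is already safe.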

config.ts — MCP JSON config generation

  • selectMcpJsonDir: selects the right directory for the per-coordinator config
  • writeMcpConfig: writes config (mode 0o600) atomically
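One common way to get an atomic, owner-only config write is to create a temporary file in the same directory and rename it over the target. The sketch below shows that pattern under stated assumptions: `writeFileAtomic`, the temp-file naming, and the example payload are all hypothetical, and the PR's own atomic.ts may work differently.

```typescript
import { closeSync, fsyncSync, mkdtempSync, openSync, readFileSync, renameSync, statSync, writeSync } from "node:fs";
import { randomBytes } from "node:crypto";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper: write to a sibling temp file created with mode 0o600,
// flush it, then rename it into place so readers never see a partial file.
function writeFileAtomic(targetPath: string, contents: string): void {
  const tmpPath = join(dirname(targetPath), `.${randomBytes(6).toString("hex")}.tmp`);
  const fd = openSync(tmpPath, "wx", 0o600); // "wx" fails if the temp file already exists
  try {
    writeSync(fd, contents);
    fsyncSync(fd);
  } finally {
    closeSync(fd);
  }
  renameSync(tmpPath, targetPath); // atomic when both paths are on the same filesystem
}

// Usage: write a placeholder config into a scratch directory and read it back.
const scratchDir = mkdtempSync(join(tmpdir(), "mcp-config-"));
const configPath = join(scratchDir, "mcp.json");
writeFileAtomic(configPath, JSON.stringify({ example: true }));
console.log(readFileSync(configPath, "utf8"), (statSync(configPath).mode & 0o777).toString(8));
```

Keeping the temp file in the target's directory matters because a rename across filesystems is not atomic.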

preamble.ts — preamble injection and strip

  • Atomically appends a <sub-task-mode> block to CLAUDE.md, AGENTS.md, GEMINI.md, .agent.md, or settings.local.json depending on agent type
  • stripPreambleFromBranch: atomically strips the block on merge/close
  • buildNormalizedPreambleFileDiff: git diff --no-index with anchored-regex path rewriting (prevents false substitutions when tmpdir path appears in file content)

sub-task-preamble.ts — sub-task-side preamble injection
prompt-detect.ts — sliding-window idle/prompt detector
replay-cache.ts — deduplicates wait_for_signal_done retries by requestId
client.ts — MCPClient used by the MCP stdio process to call the REST API
mcp-tool-list.ts — builds the tool list advertised to coordinator vs. sub-task roles
types.ts — shared types: CoordinatedTask, ApiTaskSummary, ApiTaskDetail, token classes, etc.
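The requestId deduplication attributed to replay-cache.ts can be pictured with a minimal sketch like the one below. The class shape and method name are assumptions for illustration; the real module's API, eviction policy, and error handling are not shown in this PR description.

```typescript
// Hypothetical replay cache: the first call for a requestId starts the work;
// retries that reuse the same requestId receive the same pending promise.
class ReplayCache<T> {
  private readonly entries = new Map<string, Promise<T>>();

  run(requestId: string, work: () => Promise<T>): Promise<T> {
    const existing = this.entries.get(requestId);
    if (existing) return existing;
    const pending = work();
    this.entries.set(requestId, pending);
    return pending;
  }
}
```

This sketch never evicts entries; a long-running process would need a TTL or size bound so the map does not grow without limit.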


REST API hardening (electron/remote/server.ts)

New coordinator task routes (all require auth):

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/tasks | Create sub-task |
| GET | /api/tasks | List sub-tasks (coordinator-scoped) |
| GET | /api/tasks/:id | Get status |
| POST | /api/tasks/:id/prompt | Send prompt |
| POST | /api/tasks/:id/wait | Wait for idle |
| GET | /api/tasks/:id/diff | Get diff |
| GET | /api/tasks/:id/output | Get scrollback |
| POST | /api/tasks/:id/merge | Merge branch |
| POST | /api/tasks/:id/review-merge | Diff + merge |
| DELETE | /api/tasks/:id | Close/cleanup |
| POST | /api/tasks/:id/done | Signal done (subtask token + X-Done-Token) |
| POST | /api/wait-signal | Wait for any signal_done |

Auth scoping hardening:

  • callerCoordinatorId extracted from verified X-Coordinator-Id header and enforced before all coordinator routes (including wait-signal)
  • create_task: body coordinatorTaskId ignored; header is authoritative; mismatch → 403
  • wait-signal: body coordinatorTaskId ignored entirely; header-only
  • Mobile token restricted to /api/agents only — task routes removed (mobile token is embedded in a QR-code URL reachable by anyone on the local network)
  • Coordinator token without X-Coordinator-Id → 403 on all task routes
  • task.name: 200-char max, control characters stripped (prompt injection prevention)
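The header-is-authoritative rule and the task.name limits above could be sketched as follows. Both function names are hypothetical, and treating the 200-character maximum as truncation (rather than rejection) is an assumption; the description does not say which the server does.

```typescript
type ScopeResult = { ok: true; coordinatorId: string } | { ok: false; status: 403 };

// Hypothetical scoping check: the verified header decides the coordinator.
// A missing header, or a body value that disagrees with it, yields 403.
function resolveCoordinatorScope(headerId: string | undefined, bodyId?: string): ScopeResult {
  if (!headerId) return { ok: false, status: 403 };
  if (bodyId !== undefined && bodyId !== headerId) return { ok: false, status: 403 };
  return { ok: true, coordinatorId: headerId };
}

// Hypothetical sanitizer: strip C0 control characters and DEL, then cap at
// 200 characters (assumed here to truncate rather than reject).
function sanitizeTaskName(name: string): string {
  return name.replace(/[\u0000-\u001f\u007f]/g, "").slice(0, 200);
}
```

Note that the body value never becomes the resolved coordinator ID in this sketch; it can only cause a rejection.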

IPC handlers (electron/ipc/register.ts)

New handlers wired up: StartMCPServer, StopMCPServer, GetMCPStatus, GetMCPLogs, HydrateCoordinatedTask, MCP_CoordinatedTaskClosed, MCP_TaskHydrated, MCP_CoordinatorNotificationAck, MCP_CoordinatorNotificationDropAck, MCP_CoordinatorRestageAfterUserSend


Store / type changes (src/store/)

New Task fields: coordinatorMode, coordinatedBy, controlledBy, mcpConfigPath, preambleFileExistedBefore, signalDone*, needsReview, mcpStartupStatus/Error

New PersistedState / AppStore fields: coordinatorModeEnabled, coordinatorNotificationDelayMs, coordinatorControlHintDismissed, MCPStatus

core.ts, remote.ts, persistence.ts, store.ts, ui.ts — wired up coordinator store state with defaults and persistence round-trip


OpenSpec

openspec/changes/coordinator-mcp-backend/proposal.md — documents the MCP orchestration server, REST task API, three-token auth model, signal/wait lifecycle, preamble injection, and new IPC channels per CLAUDE.md requirement


Tests

| File | Cases | Coverage |
| --- | --- | --- |
| electron/mcp/coordinator.test.ts | 180+ | Lifecycle, preamble injection, waitForIdle, signal/wait, notifications, hydrateTask, setTaskControl, cleanupTask |
| electron/mcp/coordinator-sequence.test.ts | 10+ | End-to-end create → signal → merge |
| electron/mcp/config.test.ts | 15 | selectMcpJsonDir, writeMcpConfig |
| electron/mcp/mcp-tool-list.test.ts | 10 | Tool selection by role |
| electron/mcp/prompt-detect.test.ts | 15 | Idle detection patterns |
| electron/remote/coordinator-scoping.test.ts | 40 | HTTP integration: coordinator scoping, subtask token restrictions, mobile token restrictions, create_task body-vs-header scoping |
| electron/ipc/register-mcp.test.ts | 10 | StartMCPServer input validation |
| electron/ipc/register.test.ts | 5 | IPC handler registration |
| electron/ipc/docker-config.test.ts | 10 | Docker MCP config paths |
| electron/mcp/docker.integration.test.ts | | Docker lifecycle (skipped without Docker daemon) |

Total: 295+ test cases. npm test → 859 pass, 12 skipped.

Test plan

  • npm run compile && npm run typecheck && npm run lint && npm run format:check — pass
  • npm test — 859 pass, 12 skipped (Docker integration skipped without daemon)
  • git diff --check johannesjo/main...HEAD — clean

🤖 Generated with Claude Code

johannesjo (Owner) commented May 16, 2026

Thank you very much for your work on this! <3

Follow-up review after splitting the pass across remote auth, coordinator lifecycle/persistence, and MCP execution paths. I think these need attention before merge:

  1. Connect Phone cannot authenticate with the new mobile token. The QR URL now embeds mobileToken (electron/remote/server.ts:867), but WebSocket auth still accepts only coordinator tokens (electron/remote/server.ts:760). The phone SPA sends the QR token as its first WS auth message (src/remote/ws.ts:41), so scanning the QR gets 4001 Unauthorized and reloads as unauthenticated. Either allow mobile-scoped WS access to the existing agent protocol, or move the mobile UI to a REST-only read-only flow.

  2. Connect Phone can return unreachable LAN URLs while MCP is active. If MCP started the shared remote server on 127.0.0.1, StartRemoteServer intentionally skips rebinding while a coordinator is active (electron/ipc/register.ts:1061) but still returns wifiUrl/tailscaleUrl from that same loopback-bound server (electron/remote/server.ts:878). The modal will show QR URLs that other devices cannot reach. Track the bind host and either return an explicit unavailable state, run a separate LAN/mobile server, or only expose LAN URLs after a real 0.0.0.0 bind.

  3. Coordinator state is typed but not persisted/restored. PersistedState/PersistedTask gained coordinator fields, and main-process startup checks coordinatorModeEnabled (electron/ipc/register.ts:1306), but saveState() does not write those top-level fields (src/store/persistence.ts:86) and task restore omits coordinatedBy, controlledBy, mcpConfigPath, signalDone*, needsReview, etc. (src/store/persistence.ts:558, src/store/persistence.ts:641). Restart loses the coordinator setting, child nesting, MCP config paths, signal state, and review state, so hydration cannot work reliably.

  4. Deregistering a coordinator drops live child-task backend state. deregisterCoordinator() unsubscribes each child PTY and deletes the task from this.tasks (electron/mcp/coordinator.ts:1361, electron/mcp/coordinator.ts:1391) without necessarily killing those child agents/worktrees. A later signal_done for that still-running child will be task-not-found, and real PTY output will no longer drive orphan/review notifications. Keep child task records until each child is closed, or explicitly transfer them to human/orphaned review state while preserving the done endpoint.

  5. Sub-task agent args are Claude-specific even when the coordinator agent is not Claude. createTask() appends --mcp-config and --dangerously-skip-permissions to every configured agent command (electron/mcp/coordinator.ts:633, electron/mcp/coordinator.ts:637). The rest of the app models skip-permission args per agent (src/components/TaskAITerminal.tsx:592), so Codex/Gemini/opencode sub-tasks will be misconfigured or fail to start. Either restrict coordinator sub-tasks to supported agents, or make MCP config and skip-permission injection agent-specific.

Secondary follow-ups from the pass: ConnectPhoneModal ignores { stopped: false, reason: 'coordinator_active' } from stopRemoteAccess() (src/components/ConnectPhoneModal.tsx:145), and StartMCPServer should validate skipPermissions/propagateSkipPermissions as booleans plus UUID-check coordinatorTaskId before using it in temp paths.

…task model

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@brooksc brooksc force-pushed the coordinator-2-mcp-backend branch from 2465ab6 to b43a427 Compare May 17, 2026 06:10
brooksc (Contributor, Author) commented May 17, 2026

Thank you for the thorough review — all five issues and both secondary items are addressed in the latest push. Additional correctness issues surfaced during internal review and are also fixed; listed below.


1. Connect Phone cannot authenticate with the new mobile token

The WS auth handler in electron/remote/server.ts now accepts both 'coordinator' and 'mobile' token types via classifyCandidate(). A Map<WebSocket, 'coordinator' | 'mobile'> tracks the token type per connection. A read-only guard before the message dispatch closes the socket with 4003 Forbidden if a mobile client sends input, resize, or kill — so phone clients get live terminal output with no write access.
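The read-only guard described above amounts to a small decision function. The sketch below is an assumption about its shape: `closeCodeFor` is a hypothetical name, and the real handler works on live WebSocket connections rather than plain strings.

```typescript
type TokenType = "coordinator" | "mobile";

// Message types that write to the terminal, per the description above.
const WRITE_MESSAGE_TYPES: ReadonlySet<string> = new Set(["input", "resize", "kill"]);

// Hypothetical guard: return the close code to use, or null to let the
// message through to the normal dispatch.
function closeCodeFor(tokenType: TokenType, messageType: string): number | null {
  if (tokenType === "mobile" && WRITE_MESSAGE_TYPES.has(messageType)) return 4003;
  return null;
}
```

Running the guard before dispatch keeps the read-only policy in one place instead of scattering checks across individual message handlers.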

2. Connect Phone can return unreachable LAN URLs while MCP is active

bindHost: string is now tracked on the RemoteServer object (set from opts.host at bind time). In StartRemoteServer, when the existing server is loopback-bound (bindHost === '127.0.0.1'), we return { wifiUrl: null, tailscaleUrl: null, unavailableReason: 'coordinator_active' } without mutating remoteServerRequestedManually or remoteServerPendingStop — preserving the coordinator's cleanup path. GetRemoteStatus also returns { enabled: false } when the server is loopback-only, so the Connect Phone modal cannot re-activate via status polling. On the frontend, startRemoteAccess() throws a descriptive error when unavailableReason is present. stopRemoteAccess() returns { stopped: boolean; reason?: string } and ConnectPhoneModal.handleDisconnect() checks the return value, showing an informational message rather than silently closing when the coordinator is blocking the stop.
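A minimal sketch of the loopback check, assuming a hypothetical `urlsForBind` helper and a simplified result type; the real StartRemoteServer handler also manages the manual-request flags mentioned above, which this example leaves out.

```typescript
interface RemoteUrls {
  wifiUrl: string | null;
  tailscaleUrl: string | null;
  unavailableReason?: "coordinator_active";
}

// Hypothetical helper: a loopback-bound server cannot be reached from other
// devices, so report it as unavailable instead of returning LAN URLs.
function urlsForBind(bindHost: string, candidate: { wifiUrl: string | null; tailscaleUrl: string | null }): RemoteUrls {
  if (bindHost === "127.0.0.1") {
    return { wifiUrl: null, tailscaleUrl: null, unavailableReason: "coordinator_active" };
  }
  return candidate;
}
```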

3. Coordinator state is typed but not persisted/restored

saveState() in src/store/persistence.ts now writes all coordinator fields:

  • 3 global fields: coordinatorModeEnabled, coordinatorNotificationDelayMs, coordinatorControlHintDismissed
  • 10 per-task fields in both active and collapsed task sections: coordinatorMode, propagateSkipPermissions, coordinatedBy, controlledBy, mcpConfigPath, preambleFileExistedBefore, signalDoneReceived, signalDoneAt, signalDoneConsumed, needsReview

loadState() restores all of the above with appropriate type guards.

4. Deregistering a coordinator drops live child-task backend state

deregisterCoordinator() in electron/mcp/coordinator.ts no longer calls this.tasks.delete(taskId) on children. Instead each child is marked orphaned: task.needsReview = true, controlMap set to 'human', and MCP_TaskStateSync fired to the renderer. The PTY output subscription is removed but the task record is preserved, so subsequent signal_done calls for still-running children resolve correctly.
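The orphaning behaviour can be illustrated with a stripped-down registry. Everything here (the `TaskRegistry` class, its fields, and `orphanChildren`) is a hypothetical stand-in for the coordinator's internal state; renderer sync and PTY unsubscription are omitted.

```typescript
interface ChildTask {
  id: string;
  needsReview: boolean;
}

// Hypothetical registry: orphaned children stay in the task map, flagged for
// human review, so a later lookup by task ID still succeeds.
class TaskRegistry {
  readonly tasks = new Map<string, ChildTask>();
  readonly controlMap = new Map<string, "coordinator" | "human">();

  orphanChildren(childIds: string[]): void {
    for (const id of childIds) {
      const task = this.tasks.get(id);
      if (!task) continue;
      task.needsReview = true;
      this.controlMap.set(id, "human");
    }
  }
}
```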

5. Sub-task agent args are Claude-specific even when the coordinator agent is not Claude

electron/ipc/agents.ts exports getSkipPermissionsArgs(command: string): string[], which normalises via path.basename() (handling path-qualified executables) and returns the correct per-agent flag from AgentDef.skip_permissions_args. createTask() in coordinator.ts spreads the result instead of hardcoding --dangerously-skip-permissions. Codex gets --dangerously-bypass-approvals-and-sandbox; Gemini/Copilot get --yolo; opencode gets nothing.
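A simplified sketch of the lookup described above, using a flat table in place of the real AgentDef.skip_permissions_args definitions. The table and the function body are assumptions for illustration; only the signature and the per-agent flags are taken from this comment and the review it answers.

```typescript
import { basename } from "node:path";

// Hypothetical flat table standing in for the per-agent definitions.
const SKIP_PERMISSIONS_ARGS: Readonly<Record<string, readonly string[]>> = {
  claude: ["--dangerously-skip-permissions"],
  codex: ["--dangerously-bypass-approvals-and-sandbox"],
  gemini: ["--yolo"],
  copilot: ["--yolo"],
  opencode: [],
};

// Normalise path-qualified commands to their basename before the lookup.
// Unknown agents get no extra arguments.
function getSkipPermissionsArgs(command: string): string[] {
  return [...(SKIP_PERMISSIONS_ARGS[basename(command)] ?? [])];
}
```

This sketch does not strip extensions such as `.exe`, so Windows-style executable names would need extra handling.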


Secondary: ConnectPhoneModal ignores { stopped: false } from stopRemoteAccess()

Addressed in item 2 above — handleDisconnect() now checks the return value and surfaces the coordinator-active reason.

Secondary: StartMCPServer validation

validateStartMCPServerArgs() now uses the existing validateUUID() helper (strict UUID v4 regex) for coordinatorTaskId, and adds assertOptionalBoolean checks for skipPermissions and propagateSkipPermissions. Test fixtures in register-mcp.test.ts were migrated from short stub IDs to a valid UUID constant to match the stricter validation.
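For illustration, strict UUID v4 and optional-boolean checks might look like the sketch below. The signatures and error messages are assumptions; the actual validateUUID() and assertOptionalBoolean helpers are not shown in this thread.

```typescript
// Version nibble must be 4 and the variant nibble one of 8, 9, a, b.
const UUID_V4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Hypothetical signature: returns the value when it is a well-formed UUID v4.
function validateUUID(value: unknown, field: string): string {
  if (typeof value !== "string" || !UUID_V4.test(value)) {
    throw new Error(`${field} must be a UUID v4`);
  }
  return value;
}

// Hypothetical signature: undefined is allowed, anything else must be boolean.
function assertOptionalBoolean(value: unknown, field: string): void {
  if (value !== undefined && typeof value !== "boolean") {
    throw new Error(`${field} must be a boolean when provided`);
  }
}
```

Because the pattern is anchored and admits only hex digits and hyphens, a value that passes cannot contain path separators or `..`, which is what makes it safe to embed in temp paths.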


Additional fixes from internal review

  • Compile blocker: duplicate token classifier — An earlier draft introduced a second classifyToken(candidate: string) overload that called an undefined safeCompare. Removed; the auth handler uses classifyCandidate() directly throughout.

  • GetRemoteStatus re-enabling dead QR — Before this fix, GetRemoteStatus returned enabled: true even when the server was loopback-only, allowing the Connect Phone modal to display a reachable-looking status after a coordinator-active rejection. Fixed by returning enabled: false when bindHost === '127.0.0.1'.

  • Failed manual start requests poisoning MCP cleanup: remoteServerRequestedManually and remoteServerPendingStop were being set before the loopback check, so a coordinator-active rejection would leave the flags in the wrong state and interfere with the MCP server's own teardown. Fixed by moving those assignments to after the check.

  • Write UI visible on read-only mobile sessions: src/remote/AgentDetail.tsx previously rendered an input field and quick-action buttons even though mobile is always read-only. Removed all write UI; the toolbar now shows only A−/A+ font-size controls.

  • Path-qualified commands getting no skip args: getSkipPermissionsArgs('/usr/local/bin/claude') returned [] because the exact-match lookup failed on full paths. Fixed with path.basename() normalisation before the lookup.

johannesjo (Owner) commented May 18, 2026

Thank you very much for your ongoing work on this! <3

Additional review of 2465ab6..b43a427 after splitting the pass across coordinator lifecycle, remote/mobile auth, and tooling/agent compatibility.

The mobile WebSocket/auth and loopback remote-status changes look good from the additional pass. I still think these need attention before merge:

  1. Orphaned sub-tasks still lose the signal_done path. deregisterCoordinator() now keeps child task records (electron/mcp/coordinator.ts:1362), but the IPC deregistration handler stops the shared MCP HTTP server as soon as the last coordinator exits when it was MCP-started and not manually requested (electron/ipc/register.ts:1187, electron/ipc/register.ts:1192). In the normal host path that leaves still-running sub-tasks with a dead 127.0.0.1:<port> endpoint, so a later signal_done fails before Coordinator.signalDone() can use the retained task record. The same orphaning path also unlinks and clears each child mcpConfigPath (electron/mcp/coordinator.ts:1388, electron/mcp/coordinator.ts:1394), so delayed/lazy MCP startup can lose its config too. Either keep the MCP transport/config alive until all children are closed, or explicitly terminate/convert the children so the system no longer promises that later signal_done calls will resolve.

  2. Sub-task agent args are still partly Claude-specific. The skip-permission flag is now selected via getSkipPermissionsArgs() (electron/mcp/coordinator.ts:636, electron/ipc/agents.ts:85), but createTask() still appends --mcp-config <path> to every selected agent (electron/mcp/coordinator.ts:638). Non-Claude or custom agents that do not support that Claude-style flag can still fail or start with an unknown argument. This should be gated to supported agents or made per-agent along with skip-permission propagation.

  3. The coordinator notification delay default does not round-trip to the backend. The renderer default is 60_000 (src/store/core.ts:88) and saveState() omits that value when it equals the default (src/store/persistence.ts:129), but the main-process coordinator default remains 30_000 (electron/mcp/coordinator.ts:90) and syncTaskNamesFromJson() only calls setNotificationDelayMs() when the persisted field exists (electron/ipc/register.ts:651). With the default setting, the UI restores 60s while the backend silently uses 30s. Persist the default value or align the backend default.

  4. The static-analysis coverage regresses. no-inner-html-without-sanitize now scans only **/electron/** (.semgrep/electron-security.yml:10), dropping renderer coverage even though renderer innerHTML assignments still exist, e.g. src/components/PlanViewerDialog.tsx:117. The token URL rule also excludes all of electron/remote/server.ts (.semgrep/ipc-auth.yml:14), which is the file where token-bearing URLs are constructed. Narrow exclusions would preserve the intended protection without muting the highest-risk locations.
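The default mismatch in item 3 can be reproduced in isolation. The sketch below uses hypothetical function names and a simplified persisted shape; only the two default values and the omit-when-default behaviour come from the review comment.

```typescript
const RENDERER_DEFAULT_MS = 60_000; // renderer default cited in item 3
const BACKEND_DEFAULT_MS = 30_000; // main-process default cited in item 3

interface PersistedDelay {
  coordinatorNotificationDelayMs?: number;
}

// Mirrors the described save behaviour: the field is omitted at the default.
function saveDelay(value: number): PersistedDelay {
  return value === RENDERER_DEFAULT_MS ? {} : { coordinatorNotificationDelayMs: value };
}

// Each side falls back to its own default when the field is absent.
function rendererDelay(persisted: PersistedDelay): number {
  return persisted.coordinatorNotificationDelayMs ?? RENDERER_DEFAULT_MS;
}

function backendDelay(persisted: PersistedDelay): number {
  return persisted.coordinatorNotificationDelayMs ?? BACKEND_DEFAULT_MS;
}
```

With the default setting, `rendererDelay(saveDelay(60_000))` is 60000 while `backendDelay(saveDelay(60_000))` is 30000; any non-default value round-trips identically on both sides.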
