Skip to content

extensions: add agent_spawn tool for bounded dev-agent startup#188

Merged
benvinegar merged 2 commits intomainfrom
feat/agent-spawn-runbook
Mar 1, 2026
Merged

extensions: add agent_spawn tool for bounded dev-agent startup#188
benvinegar merged 2 commits intomainfrom
feat/agent-spawn-runbook

Conversation

@benvinegar
Copy link
Member

@benvinegar benvinegar commented Mar 1, 2026

Summary

  • add a new agent_spawn extension tool that launches a tmux-backed pi session with validated inputs and bounded timeouts
  • perform readiness verification via ~/.pi/session-control/<alias>.alias + socket probe and return structured stage/error details
  • update control-agent runbook to require agent_spawn for worker startup and forbid invalid/fragile spawn patterns
  • add focused tests for successful readiness, readiness timeout behavior (session/logs retained), and validation failures
  • add runbook guard assertions in memory.test.mjs and include new test file in test config

Testing

  • npm run test:js
  • npm run typecheck
  • npm run lint:js (existing unrelated warning remains in slack-bridge/broker-bridge.mjs for unused err)

Refs #185

@greptile-apps
Copy link

greptile-apps bot commented Mar 1, 2026

Greptile Summary

This PR introduces a new agent_spawn extension tool that wraps tmux session creation with validated inputs, a bounded readiness poll (alias file + Unix socket probe), and structured error details, then updates the control-agent runbook to require agent_spawn for dev-agent startup and adds focused tests covering success, timeout, and validation failure paths.

Key observations:

  • The AbortSignal (signal) passed to execute is forwarded to pi.exec for the tmux spawn but is not propagated into waitForSessionReadiness. A caller cancellation after the tmux spawn will not interrupt the polling loop, which can run for up to 60 seconds unimpeded.
  • The readiness polling while loop re-checks Date.now() <= deadline only at the top of each iteration; within a single iteration isSocketAlive can block for 300 ms and sleep for 200 ms, causing the actual wait to overshoot the requested timeout by up to 500 ms.
  • The test suite writes real alias symlinks and Unix socket files into ~/.pi/session-control/, the live production control directory, rather than an isolated temp directory. A test crash before afterEach could leave stale files that confuse the heartbeat extension.
  • The sentry-agent in SKILL.md (line 332) still uses the raw tmux new-session inline command that the PR explicitly forbids for dev-agents, creating an inconsistency in the runbook the new guard tests do not catch.

Confidence Score: 3/5

  • The core logic is sound but missing signal propagation and a sentry-agent runbook inconsistency warrant fixes before merge.
  • The implementation is well-structured with good input validation and shell-quoting, and the tests cover the main scenarios. However, the AbortSignal is not forwarded through the readiness loop (a logic issue that could cause long hangs on cancellation), the timeout can overshoot by up to 500 ms, tests contaminate the real production control directory, and the sentry-agent section of SKILL.md still uses the forbidden raw tmux pattern. These collectively reduce confidence.
  • Pay close attention to pi/extensions/agent-spawn.ts (signal propagation and timeout precision) and pi/skills/control-agent/SKILL.md (sentry-agent still uses raw tmux).

Important Files Changed

Filename Overview
pi/extensions/agent-spawn.ts Core implementation of the agent_spawn tool; well-structured with input validation and socket probing, but the AbortSignal is not forwarded to the readiness polling loop and the loop can overshoot its deadline by up to 500 ms per iteration.
pi/extensions/agent-spawn.test.mjs Good coverage of spawn success, readiness timeout, and input validation; however tests write alias and socket files to the real ~/.pi/session-control/ production directory, which could contaminate the live environment if a test crashes before afterEach cleanup.
pi/skills/control-agent/SKILL.md Dev-agent spawn section correctly migrated to agent_spawn tool pattern, but sentry-agent startup still uses the raw tmux new-session incantation that this PR explicitly forbids.
pi/extensions/memory.test.mjs Adds three guard assertions to enforce runbook correctness (agent_spawn reference, pi session spawn prohibition, send_to_session clarification); straightforward and low risk.

Sequence Diagram

sequenceDiagram
    participant C as Control Agent
    participant T as agent_spawn tool
    participant X as pi.exec (tmux)
    participant FS as ~/.pi/session-control/
    participant S as Unix Socket

    C->>T: execute(session_name, cwd, skill_path, model, ready_timeout_sec)
    T->>T: Validate inputs (session_name, ready_alias, model, cwd, skill_path)
    T->>X: tmux new-session -d -s <session_name> <command>
    X-->>T: {code: 0}
    loop Poll every 200ms until deadline
        T->>FS: existsSync(<alias>.alias)
        alt alias exists
            T->>FS: readlinkSync → resolve socket path
            T->>FS: existsSync(<socket>.sock)
            alt socket exists
                T->>S: net.createConnection (probe, 300ms timeout)
                S-->>T: connect / error / timeout
            end
        end
    end
    alt Ready
        T-->>C: {ready: true, spawned: true, socket_path, alias_path}
    else Timeout
        T-->>C: {isError: true, ready: false, error: "readiness_timeout", session/logs intact}
    end
Loading

Last reviewed commit: 1ada492

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 files reviewed, 4 comments

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link

greptile-apps bot commented Mar 1, 2026

Additional Comments (1)

pi/skills/control-agent/SKILL.md
Sentry-agent still uses raw tmux new-session spawn

The PR's stated goal is to require agent_spawn for worker startup and forbid ad-hoc shell spawn patterns. The dev-agent section has been updated accordingly, but the sentry-agent spawn on this line still uses the old raw tmux new-session incantation with inline export PATH=.... This means the sentry-agent bypasses the bounded readiness check and validated inputs that agent_spawn provides, and creates an inconsistency that the runbook guard tests don't catch (the guard only asserts the string agent_spawn appears somewhere in the file).

Consider migrating the sentry-agent spawn to an agent_spawn tool call block, matching the pattern used for dev-agents above.

Prompt To Fix With AI
This is a comment left during a code review.
Path: pi/skills/control-agent/SKILL.md
Line: 332

Comment:
**Sentry-agent still uses raw `tmux new-session` spawn**

The PR's stated goal is to require `agent_spawn` for worker startup and forbid ad-hoc shell spawn patterns. The dev-agent section has been updated accordingly, but the sentry-agent spawn on this line still uses the old raw `tmux new-session` incantation with inline `export PATH=...`. This means the sentry-agent bypasses the bounded readiness check and validated inputs that `agent_spawn` provides, and creates an inconsistency that the runbook guard tests don't catch (the guard only asserts the string `agent_spawn` appears somewhere in the file).

Consider migrating the sentry-agent spawn to an `agent_spawn` tool call block, matching the pattern used for dev-agents above.

How can I resolve this? If you propose a fix, please make it concise.

@benvinegar
Copy link
Member Author

Addressed the additional runbook concern in d324512: the sentry-agent section now uses the agent_spawn tool pattern (no raw tmux new-session), startup checklist wording was updated to match, and memory guard tests now assert this so it does not regress.

@benvinegar benvinegar merged commit 680b076 into main Mar 1, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant