✨ feat: add --no-sandbox to run without the gVisor runtime by yeazelm · Pull Request #14 · papercomputeco/agentd

yeazelm · 2026-06-27T03:45:35Z

Summary

agentd unconditionally required nix and runsc (gVisor) at startup and treated
a failed tmux start as fatal, so it could not come up where those are
unavailable — e.g. a minimal capstan microVM where the VM boundary is the
sandbox (no nix/gVisor, no sudo/agent user).

--no-sandbox skips the runsc/nix initialization. Sandboxed agents then fail
to launch (guarded); the API and native/tmux agents are unaffected.
tmux startup becomes best-effort and non-fatal: agentd still serves its
control-plane API, and native agents simply can't launch until a tmux server
and the agent user are available.

Both are opt-in — default startup is unchanged.

Context: lets agentd come up for control-plane bring-up inside a capstan microVM
(a minimal Rust PID 1 for microVM sandboxes).

Test plan

gofmt / make lint (go vet ./...) clean
make test — all suites pass; added a spec for SetNoSandbox
Verified agentd serves its API under capstan with --no-sandbox (no
nix/gVisor present)

Part of PCC-777.

linear-code · 2026-06-27T03:45:38Z

PCC-777

greptile-apps · 2026-06-27T03:48:44Z

Greptile Summary

This PR adds a --no-sandbox flag that lets agentd start in environments where gVisor (runsc) and Nix are unavailable (e.g. a minimal capstan microVM), and makes tmux startup non-fatal so the control-plane API can come up even without the agent user or tmux binary.

--no-sandbox skips the runsc/nix binary checks and leaves d.runner nil; sandboxed agent launches are correctly rejected by the existing nil-runner guard in createManager.
tmux failure at startup is downgraded from fatal to a logged warning, but this applies to all deployments, not only --no-sandbox, which silently changes the failure mode for standard stereOS hosts that rely on native (tmux) agents.

Confidence Score: 4/5

Safe to merge; the --no-sandbox path is cleanly gated and the nil-runner guard in createManager correctly rejects sandboxed agents when the runtime is disabled.

The tmux non-fatal change applies to every agentd startup, not only the --no-sandbox path. In a standard stereOS deployment with native agents, a broken tmux or missing agent user at boot will no longer surface as a startup failure, making misconfigured hosts harder to diagnose.

agentd/agentd.go — specifically the unconditional tmux non-fatal change around line 273.

Important Files Changed

| Filename | Overview |
| --- | --- |
| agentd/agentd.go | Adds noSandbox field and SetNoSandbox() method; gates gVisor/nix init on the flag; makes tmux startup non-fatal unconditionally (not limited to --no-sandbox). Sandboxed agent launch is correctly guarded by a nil-runner check in createManager. |
| agentd/agentd_test.go | Adds a minimal no-panic spec for SetNoSandbox, consistent with the existing test style in the file. |
| main.go | Adds --no-sandbox flag wired to daemon.SetNoSandbox(true); placement and conditional are consistent with other flags. |
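For readers who want to see the `main.go` wiring in isolation, here is a minimal sketch using the standard library `flag` package. Which flag library agentd actually uses is not stated in this PR, so the `configure` helper, the flag set name, and the help text are assumptions; only the `--no-sandbox` flag name and the conditional `SetNoSandbox(true)` call are taken from the review.

```go
package main

import (
	"flag"
	"fmt"
)

// daemon is a hypothetical stand-in for agentd's daemon type.
type daemon struct{ noSandbox bool }

// SetNoSandbox opts out of the gVisor/nix initialization.
func (d *daemon) SetNoSandbox(v bool) { d.noSandbox = v }

// configure parses command-line arguments and applies them to a new daemon.
// SetNoSandbox is called only when the flag is present, so the default
// startup path stays unchanged.
func configure(args []string) (*daemon, error) {
	fs := flag.NewFlagSet("agentd", flag.ContinueOnError)
	noSandbox := fs.Bool("no-sandbox", false, "run without the gVisor runtime")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	d := &daemon{}
	if *noSandbox {
		d.SetNoSandbox(true)
	}
	return d, nil
}

func main() {
	d, err := configure([]string{"--no-sandbox"})
	if err != nil {
		panic(err)
	}
	fmt.Println("noSandbox:", d.noSandbox)
}
```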

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[agentd Run] --> B{noSandbox?}
    B -- yes --> C[Log: sandbox disabled\nskip runsc + nix checks]
    B -- no --> D[LookPath nix]
    D --> E[sandbox.NewRunner]
    E --> F[runner.Cleanup]
    C --> G[tmux.Start]
    F --> G
    G -- success --> H[API server start]
    G -- failure --> I[Log warning\nnative agents unavailable]
    I --> H
    H --> J[reconcileLoop]
    J --> K{agent type?}
    K -- sandboxed + runner==nil --> L[Error: runner not initialized]
    K -- sandboxed + runner ok --> M[sandbox.NewManager]
    K -- native --> N[native.NewManager\nwith tmux ref]
    M --> O[mgr.Start]
    N --> O

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
agentd/agentd.go:273-280
**Unconditional tmux non-fatal change affects all deployments**

The tmux startup failure is now silently downgraded for every agentd invocation, not only when `--no-sandbox` is active. In a standard stereOS deployment where `type = "native"` agents are configured, a broken agent user or missing tmux binary no longer causes a visible startup failure — agentd starts, the reconcile loop runs, and every native agent logs a per-cycle `"error starting agent"` error indefinitely, while the root cause (tmux never started) is only printed once at boot. Previously this was a hard startup failure via `return fmt.Errorf(...)`.

Consider gating the non-fatal path on `d.noSandbox`, so standard sandboxed deployments retain the original fail-fast behavior and only the capstan control-plane path is lenient.

```suggestion
	if err := d.tmux.Start(); err != nil {
		if d.noSandbox {
			// Non-fatal: agentd still serves its API and status. Native (tmux)
			// agents need a working tmux server + the agent user, so they cannot
			// launch until that's available; sandboxed agents are unaffected. This
			// lets agentd run in a minimal environment (e.g. a capstan VM with no
			// sudo/agent user) for control-plane bring-up.
			log.Printf("agentd: warning: tmux server unavailable (%v); native agents cannot launch", err)
		} else {
			return fmt.Errorf("starting tmux server: %w", err)
		}
	}
```

greptile-apps · 2026-06-27T03:48:47Z

 	if err := d.tmux.Start(); err != nil {
-		return fmt.Errorf("starting tmux server: %w", err)
+		// Non-fatal: agentd still serves its API and status. Native (tmux)
+		// agents need a working tmux server + the agent user, so they cannot
+		// launch until that's available; sandboxed agents are unaffected. This
+		// lets agentd run in a minimal environment (e.g. a capstan VM with no
+		// sudo/agent user) for control-plane bring-up.
+		log.Printf("agentd: warning: tmux server unavailable (%v); native agents cannot launch", err)
 	}


Unconditional tmux non-fatal change affects all deployments

The tmux startup failure is now silently downgraded for every agentd invocation, not only when --no-sandbox is active. In a standard stereOS deployment where type = "native" agents are configured, a broken agent user or missing tmux binary no longer causes a visible startup failure — agentd starts, the reconcile loop runs, and every native agent logs a per-cycle "error starting agent" error indefinitely, while the root cause (tmux never started) is only printed once at boot. Previously this was a hard startup failure via return fmt.Errorf(...).

Consider gating the non-fatal path on d.noSandbox, so standard sandboxed deployments retain the original fail-fast behavior and only the capstan control-plane path is lenient.

Suggested change

 	if err := d.tmux.Start(); err != nil {
 		if d.noSandbox {
 			// Non-fatal: agentd still serves its API and status. Native (tmux)
 			// agents need a working tmux server + the agent user, so they cannot
 			// launch until that's available; sandboxed agents are unaffected. This
 			// lets agentd run in a minimal environment (e.g. a capstan VM with no
 			// sudo/agent user) for control-plane bring-up.
 			log.Printf("agentd: warning: tmux server unavailable (%v); native agents cannot launch", err)
 		} else {
 			return fmt.Errorf("starting tmux server: %w", err)
 		}
 	}

yeazelm requested a review from a team June 27, 2026 03:45

greptile-apps Bot reviewed Jun 27, 2026

🔧 fix: bump dagger v0.20.6 → v0.20.8

b26584d

The daggerverse ghcontrib module now requires dagger v0.20.8, so the PR checks (title conformance, Linear magic word) error out under the pinned v0.20.6 with "module requires dagger v0.20.8, but you have v0.20.6".

yeazelm wants to merge 2 commits into main from apple_vm/agentd
