Skip to content

✨ feat: add --no-sandbox to run without the gVisor runtime#14

Open
yeazelm wants to merge 2 commits into
mainfrom
apple_vm/agentd
Open

✨ feat: add --no-sandbox to run without the gVisor runtime#14
yeazelm wants to merge 2 commits into
mainfrom
apple_vm/agentd

Conversation

@yeazelm

@yeazelm yeazelm commented Jun 27, 2026

Copy link
Copy Markdown

Summary

agentd unconditionally required nix and runsc (gVisor) at startup and treated
a failed tmux start as fatal, so it could not come up where those are
unavailable — e.g. a minimal capstan microVM where the VM boundary is the
sandbox (no nix/gVisor, no sudo/agent user).

  • --no-sandbox skips the runsc/nix initialization. Sandboxed agents then fail
    to launch (guarded); the API and native/tmux agents are unaffected.
  • tmux startup becomes best-effort and non-fatal: agentd still serves its
    control-plane API, and native agents simply can't launch until a tmux server
    and the agent user are available.

Both are opt-in — default startup is unchanged.

Context: lets agentd come up for control-plane bring-up inside a capstan microVM
(a minimal Rust PID 1 for microVM sandboxes).

Test plan

  • gofmt / make lint (go vet ./...) clean
  • make test — all suites pass; added a spec for SetNoSandbox
  • Verified agentd serves its API under capstan with --no-sandbox (no
    nix/gVisor present)

Part of PCC-777.

agentd unconditionally required `nix` and `runsc` (gVisor) at startup and
treated a failed tmux start as fatal, so it could not come up anywhere those are
unavailable — e.g. a minimal capstan microVM where the VM boundary is the
sandbox and there is no nix/gVisor or sudo/agent user.

`--no-sandbox` skips the runsc/nix initialization; sandboxed agents then fail to
launch (guarded), while the API and native/tmux agents are unaffected. tmux
startup becomes best-effort and non-fatal for the same reason: agentd still
serves its control-plane API, and native agents simply can't launch until a tmux
server and the agent user are available.

Both behaviors are opt-in — default startup is unchanged.

Part of PCC-777.
@yeazelm yeazelm requested a review from a team June 27, 2026 03:45
@linear-code

linear-code Bot commented Jun 27, 2026

Copy link
Copy Markdown

PCC-777

@greptile-apps

greptile-apps Bot commented Jun 27, 2026

Copy link
Copy Markdown

Greptile Summary

This PR adds a --no-sandbox flag that lets agentd start in environments where gVisor (runsc) and Nix are unavailable (e.g. a minimal capstan microVM), and makes tmux startup non-fatal so the control-plane API can come up even without the agent user or tmux binary.

  • --no-sandbox skips the runsc/nix binary checks and leaves d.runner nil; sandboxed agent launches are correctly rejected by the existing nil-runner guard in createManager.
  • tmux failure at startup is downgraded from fatal to a logged warning \u2014 but this applies to all deployments, not only --no-sandbox, which silently changes the failure mode for standard stereOS hosts that rely on native (tmux) agents.

Confidence Score: 4/5

Safe to merge; the --no-sandbox path is cleanly gated and the nil-runner guard in createManager correctly rejects sandboxed agents when the runtime is disabled.

The tmux non-fatal change applies to every agentd startup, not only the --no-sandbox path. In a standard stereOS deployment with native agents, a broken tmux or missing agent user at boot will no longer surface as a startup failure, making misconfigured hosts harder to diagnose.

agentd/agentd.go — specifically the unconditional tmux non-fatal change around line 273.

Important Files Changed

Filename Overview
agentd/agentd.go Adds noSandbox field and SetNoSandbox() method; gates gVisor/nix init on the flag; makes tmux startup non-fatal unconditionally (not limited to --no-sandbox). Sandboxed agent launch is correctly guarded by a nil-runner check in createManager.
agentd/agentd_test.go Adds a minimal no-panic spec for SetNoSandbox, consistent with the existing test style in the file.
main.go Adds --no-sandbox flag wired to daemon.SetNoSandbox(true); placement and conditional are consistent with other flags.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[agentd Run] --> B{noSandbox?}
    B -- yes --> C[Log: sandbox disabled\nskip runsc + nix checks]
    B -- no --> D[LookPath nix]
    D --> E[sandbox.NewRunner]
    E --> F[runner.Cleanup]
    C --> G[tmux.Start]
    F --> G
    G -- success --> H[API server start]
    G -- failure --> I[Log warning\nnative agents unavailable]
    I --> H
    H --> J[reconcileLoop]
    J --> K{agent type?}
    K -- sandboxed + runner==nil --> L[Error: runner not initialized]
    K -- sandboxed + runner ok --> M[sandbox.NewManager]
    K -- native --> N[native.NewManager\nwith tmux ref]
    M --> O[mgr.Start]
    N --> O
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A[agentd Run] --> B{noSandbox?}
    B -- yes --> C[Log: sandbox disabled\nskip runsc + nix checks]
    B -- no --> D[LookPath nix]
    D --> E[sandbox.NewRunner]
    E --> F[runner.Cleanup]
    C --> G[tmux.Start]
    F --> G
    G -- success --> H[API server start]
    G -- failure --> I[Log warning\nnative agents unavailable]
    I --> H
    H --> J[reconcileLoop]
    J --> K{agent type?}
    K -- sandboxed + runner==nil --> L[Error: runner not initialized]
    K -- sandboxed + runner ok --> M[sandbox.NewManager]
    K -- native --> N[native.NewManager\nwith tmux ref]
    M --> O[mgr.Start]
    N --> O
Loading
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
agentd/agentd.go:273-280
**Unconditional tmux non-fatal change affects all deployments**

The tmux startup failure is now silently downgraded for every agentd invocation, not only when `--no-sandbox` is active. In a standard stereOS deployment where `type = "native"` agents are configured, a broken agent user or missing tmux binary no longer causes a visible startup failure — agentd starts, the reconcile loop runs, and every native agent logs a per-cycle `"error starting agent"` error indefinitely, while the root cause (tmux never started) is only printed once at boot. Previously this was a hard startup failure via `return fmt.Errorf(...)`.

Consider gating the non-fatal path on `d.noSandbox`, so standard sandboxed deployments retain the original fail-fast behavior and only the capstan control-plane path is lenient.

```suggestion
	if err := d.tmux.Start(); err != nil {
		if d.noSandbox {
			// Non-fatal: agentd still serves its API and status. Native (tmux)
			// agents need a working tmux server + the agent user, so they cannot
			// launch until that's available; sandboxed agents are unaffected. This
			// lets agentd run in a minimal environment (e.g. a capstan VM with no
			// sudo/agent user) for control-plane bring-up.
			log.Printf("agentd: warning: tmux server unavailable (%v); native agents cannot launch", err)
		} else {
			return fmt.Errorf("starting tmux server: %w", err)
		}
	}
```

Reviews (1): Last reviewed commit: "✨ feat: add --no-sandbox to run without ..." | Re-trigger Greptile

Comment thread agentd/agentd.go
Comment on lines 273 to 280
if err := d.tmux.Start(); err != nil {
return fmt.Errorf("starting tmux server: %w", err)
// Non-fatal: agentd still serves its API and status. Native (tmux)
// agents need a working tmux server + the agent user, so they cannot
// launch until that's available; sandboxed agents are unaffected. This
// lets agentd run in a minimal environment (e.g. a capstan VM with no
// sudo/agent user) for control-plane bring-up.
log.Printf("agentd: warning: tmux server unavailable (%v); native agents cannot launch", err)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unconditional tmux non-fatal change affects all deployments

The tmux startup failure is now silently downgraded for every agentd invocation, not only when --no-sandbox is active. In a standard stereOS deployment where type = "native" agents are configured, a broken agent user or missing tmux binary no longer causes a visible startup failure — agentd starts, the reconcile loop runs, and every native agent logs a per-cycle "error starting agent" error indefinitely, while the root cause (tmux never started) is only printed once at boot. Previously this was a hard startup failure via return fmt.Errorf(...).

Consider gating the non-fatal path on d.noSandbox, so standard sandboxed deployments retain the original fail-fast behavior and only the capstan control-plane path is lenient.

Suggested change
if err := d.tmux.Start(); err != nil {
return fmt.Errorf("starting tmux server: %w", err)
// Non-fatal: agentd still serves its API and status. Native (tmux)
// agents need a working tmux server + the agent user, so they cannot
// launch until that's available; sandboxed agents are unaffected. This
// lets agentd run in a minimal environment (e.g. a capstan VM with no
// sudo/agent user) for control-plane bring-up.
log.Printf("agentd: warning: tmux server unavailable (%v); native agents cannot launch", err)
}
if err := d.tmux.Start(); err != nil {
if d.noSandbox {
// Non-fatal: agentd still serves its API and status. Native (tmux)
// agents need a working tmux server + the agent user, so they cannot
// launch until that's available; sandboxed agents are unaffected. This
// lets agentd run in a minimal environment (e.g. a capstan VM with no
// sudo/agent user) for control-plane bring-up.
log.Printf("agentd: warning: tmux server unavailable (%v); native agents cannot launch", err)
} else {
return fmt.Errorf("starting tmux server: %w", err)
}
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: agentd/agentd.go
Line: 273-280

Comment:
**Unconditional tmux non-fatal change affects all deployments**

The tmux startup failure is now silently downgraded for every agentd invocation, not only when `--no-sandbox` is active. In a standard stereOS deployment where `type = "native"` agents are configured, a broken agent user or missing tmux binary no longer causes a visible startup failure — agentd starts, the reconcile loop runs, and every native agent logs a per-cycle `"error starting agent"` error indefinitely, while the root cause (tmux never started) is only printed once at boot. Previously this was a hard startup failure via `return fmt.Errorf(...)`.

Consider gating the non-fatal path on `d.noSandbox`, so standard sandboxed deployments retain the original fail-fast behavior and only the capstan control-plane path is lenient.

```suggestion
	if err := d.tmux.Start(); err != nil {
		if d.noSandbox {
			// Non-fatal: agentd still serves its API and status. Native (tmux)
			// agents need a working tmux server + the agent user, so they cannot
			// launch until that's available; sandboxed agents are unaffected. This
			// lets agentd run in a minimal environment (e.g. a capstan VM with no
			// sudo/agent user) for control-plane bring-up.
			log.Printf("agentd: warning: tmux server unavailable (%v); native agents cannot launch", err)
		} else {
			return fmt.Errorf("starting tmux server: %w", err)
		}
	}
```

How can I resolve this? If you propose a fix, please make it concise.

The daggerverse ghcontrib module now requires dagger v0.20.8, so the PR
checks (title conformance, Linear magic word) error out under the pinned
v0.20.6 with "module requires dagger v0.20.8, but you have v0.20.6".
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant