spike: agents setting maximum policies for spawned agents #2025

## Problem Statement

This spike explores **provable delegated authority for spawned agents**.

Agents should be able to spawn other agents with narrower, task-specific authority. A spawned agent must never receive authority beyond the human/admin maximum policy, beyond the spawning agent’s boundary, or beyond whatever subset of authority the spawning agent is allowed to delegate.

The goal is least privilege for multi-agent workflows: a parent agent can hand off work without handing over its full sandbox authority, credentials, providers, or communication reach.

This is a research spike. The output should define the product/security model, core use cases, policy invariants, policy prover requirements, and follow-up issues before any implementation is chosen.

## Why This Matters

Agent systems increasingly decompose work across multiple specialized agents. Without explicit delegation rules, a spawned agent can easily become an accidental privilege expansion path.

This functionality should support workflows where:

* A parent agent spawns a helper to read a subset of files, inspect logs, run a test, summarize a document, or analyze a repository.
* A child agent receives only the authorities needed for its task.
* Credentials and write-capable providers are withheld, narrowed, or made review-required.
* The parent and child communicate through observable, policy-controlled channels.
* Human/admin boundaries remain authoritative across the entire delegation chain.
* Security teams can audit who delegated what, why, under which policy version, and with which approvals.

## Policy Prover Role

The policy prover is central to this feature.

The system should not rely on an LLM judge to decide whether delegation “seems safe.” An agent may propose a delegated policy, but the gateway should use policy evaluation and prover-backed containment checks to decide whether to apply, ask, or reject.
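
As a minimal sketch of the apply/ask/reject split, the gateway decision could be a pure function of the prover result plus an approval flag. The names `Decision` and `decide`, and the rule that a failed containment check is always rejected rather than escalated to a human, are assumptions for illustration; the spike has not chosen a mapping.

```python
from enum import Enum


class Decision(Enum):
    APPLY = "apply"
    ASK = "ask"
    REJECT = "reject"


def decide(contained: bool, needs_approval: bool) -> Decision:
    """Map a containment result to a gateway decision (hypothetical mapping)."""
    if not contained:
        # Assumption: a proposal that escapes the boundary is rejected outright,
        # because approval should not widen authority past the maximum policy.
        return Decision.REJECT
    if needs_approval:
        # Contained, but some rule (still to be defined) asks for a human/admin.
        return Decision.ASK
    return Decision.APPLY
```

The point of the shape is that no step consults a model's opinion: both inputs come from deterministic policy evaluation.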

The prover should establish containment across the delegation chain, roughly as follows, where `<=` reads as "grants no more authority than":

`child_policy <= parent_delegable_view <= parent_boundary <= human_admin_maximum_policy`

The spike should explore what this proof needs to cover, including:

* Network, filesystem, process, tool, provider, and credential authority.
* Explicit deny rules.
* Review requirements.
* Credential target constraints.
* Provider constraints.
* Communication authority between parent and child sandboxes.
* Transitive delegation if spawned agents can spawn more agents.
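
To make the proof obligations above concrete, here is a toy containment check in Python. Everything in it is an assumption for illustration: the `Policy` shape, the string tokens, and the use of plain set inclusion. A real prover would need to reason about patterns such as path globs, host wildcards, and credential targets rather than exact strings.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    # Hypothetical encoding: dimension name -> set of authority tokens.
    allow: dict[str, set[str]] = field(default_factory=dict)
    deny: dict[str, set[str]] = field(default_factory=dict)
    review: dict[str, set[str]] = field(default_factory=dict)


def effective(policy: Policy, dim: str) -> set[str]:
    """Grants in one dimension after explicit denies are subtracted."""
    return policy.allow.get(dim, set()) - policy.deny.get(dim, set())


def contained_in(inner: Policy, outer: Policy) -> bool:
    """True if `inner` grants no more than `outer` and keeps its review rules."""
    for dim in inner.allow:
        grants = effective(inner, dim)
        # Every grant must exist in the outer policy; outer denies win.
        if not grants <= effective(outer, dim):
            return False
        # Anything review-required in the outer policy stays review-required.
        must_review = outer.review.get(dim, set()) & grants
        if not must_review <= inner.review.get(dim, set()):
            return False
    return True


def chain_holds(chain: list[Policy]) -> bool:
    """Check child <= delegable view <= parent boundary <= admin maximum."""
    return all(contained_in(a, b) for a, b in zip(chain, chain[1:]))
```

For example, a child that asks for `/repo:write` without the review requirement its parent carries would fail `contained_in`, even though the write grant itself is inside the parent's allow set.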

## Core Research Questions

This spike should explore:

* What does it mean for one agent to delegate authority to another agent?
* Should the child boundary derive from the parent’s maximum policy, current effective policy, or a separate delegable subset?
* Which authorities should be delegable, and which should never be delegable?
* How should the parent describe the child’s task scope?
* How should the system prove that a child policy is contained within the allowed delegation boundary?
* How do explicit denies, review requirements, provider constraints, and credential constraints carry over?
* When should delegation require human/admin approval?
* Can spawned agents spawn further agents? If so, how does transitive delegation work?
* What policy should govern communication between parent and child agents?
* What should the parent be able to see about the child’s policy, denials, approvals, and results?
* What should the child be able to see about the parent’s policy?
* What audit records are required to reconstruct delegation and communication decisions later?
* What requirements does this create for sandbox-to-sandbox communication in #1049?

## Areas To Explore

The spike may compare several models:

* **Spawner-narrowed policy:** the parent proposes a task-specific maximum policy for the child.
* **Delegable capability view:** only an explicit subset of the parent’s authority can be passed to children.
* **Task-scoped policy:** the child boundary is derived from the work being delegated.
* **Review-preserving delegation:** review-required capabilities remain review-required after delegation.
* **Credential-aware delegation:** credentials have explicit delegation rules and cannot be treated like ordinary reachability.
* **Communication-aware policy:** parent-child communication is modeled as a separate policy-controlled capability.
* **Transitive delegation:** spawned agents may or may not be allowed to spawn additional agents, with containment proven at each step.
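
One way to picture the delegable-view and transitive models together is a grant that carries its own delegable subset and a spawn budget. This is a sketch under stated assumptions, not a proposed design: `AgentGrant`, `spawn`, and the `spawn_depth` counter are hypothetical names, and authority is again reduced to exact string tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentGrant:
    authority: frozenset[str]   # what this agent may do
    delegable: frozenset[str]   # the subset it may pass to children
    spawn_depth: int            # remaining levels of spawning allowed


def spawn(
    parent: AgentGrant,
    requested: frozenset[str],
    requested_delegable: frozenset[str] = frozenset(),
) -> Optional[AgentGrant]:
    """Return the child's grant, or None if the request is not contained."""
    if parent.spawn_depth <= 0:
        return None  # this agent may not spawn at all
    if not requested <= parent.delegable:
        return None  # child asked for something the parent cannot delegate
    if not requested_delegable <= requested:
        return None  # a child cannot delegate what it does not hold
    return AgentGrant(requested, requested_delegable, parent.spawn_depth - 1)
```

Because each step only checks the child against its direct parent's delegable set, containment against the root follows by transitivity, provided the root grant itself satisfies `delegable <= authority`.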

## Expected Output

The output should be a short research note or design sketch that covers:

* The main product use cases for spawned-agent delegation.
* The security invariants required for safe delegation.
* Tradeoffs between maximum policy, effective policy, and delegable subsets.
* How the policy prover should prove containment.
* How to prevent privilege expansion through spawned agents.
* How credentials and providers should behave when delegated.
* How review requirements should carry across spawned-agent boundaries.
* What spawned-agent workflows require from sandbox-to-sandbox communication in #1049.
* Recommended follow-up issues.

## Non-Goals

* Do not implement production spawned-agent policy delegation.
* Do not implement sandbox-to-sandbox communication in this issue.
* Do not design a complete multi-agent orchestration framework.
* Do not allow agents to expand beyond the human/admin maximum policy boundary.
* Do not rely on an LLM judge to decide whether delegation is safe.
* Do not focus primarily on OpenShell code changes before the delegation model and prover obligations are clear.

## Related

* #1049
* #1840
