spike: agents setting maximum policies for spawned agents #2025

@johnnygreco

Problem Statement

This spike explores provable delegated authority for spawned agents.

Agents should be able to spawn other agents with narrower, task-specific authority. A spawned agent must never receive authority beyond the human/admin maximum policy, beyond the spawning agent’s boundary, or beyond whatever subset of authority the spawning agent is allowed to delegate.

The goal is least privilege for multi-agent workflows: a parent agent can hand off work without handing over its full sandbox authority, credentials, providers, or communication reach.

This is a research spike. The output should define the product/security model, core use cases, policy invariants, policy prover requirements, and follow-up issues before any implementation is chosen.

Why This Matters

Agent systems increasingly decompose work across multiple specialized agents. Without explicit delegation rules, a spawned agent can easily become an accidental privilege expansion path.

This functionality should support workflows where:

  • A parent agent spawns a helper to read a subset of files, inspect logs, run a test, summarize a document, or analyze a repository.
  • A child agent receives only the authorities needed for its task.
  • Credentials and write-capable providers are withheld, narrowed, or made review-required.
  • The parent and child communicate through observable, policy-controlled channels.
  • Human/admin boundaries remain authoritative across the entire delegation chain.
  • Security teams can audit who delegated what, why, under which policy version, and with which approvals.
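As a rough illustration of the last point, an audit record could carry one field per question a security team would ask. This is only a sketch: the `DelegationRecord` type, its field names, the capability strings, and the JSON-lines encoding are all assumptions made for illustration, not an existing OpenShell format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DelegationRecord:
    """Hypothetical audit record: who delegated what, why, under which policy version."""
    delegator: str              # who delegated
    delegate: str               # who received authority
    granted: tuple[str, ...]    # what was delegated (illustrative capability strings)
    reason: str                 # why (task description supplied by the parent)
    policy_version: str         # which policy version the decision was made under
    approvals: tuple[str, ...]  # which human/admin approvals were attached
    decided_at: str             # ISO 8601 timestamp of the decision


def to_audit_line(record: DelegationRecord) -> str:
    # Sorted keys keep the line stable, so records can be diffed or hashed later.
    return json.dumps(asdict(record), sort_keys=True)


record = DelegationRecord(
    delegator="agent:parent",
    delegate="agent:log-helper",
    granted=("fs:read:/repo/logs",),
    reason="inspect failing test logs",
    policy_version="v7",
    approvals=(),
    decided_at="2025-01-01T00:00:00Z",
)
print(to_audit_line(record))
```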

Policy Prover Role

The policy prover is central to this feature.

The system should not rely on an LLM judge to decide whether delegation “seems safe.” An agent may propose a delegated policy, but the gateway should use policy evaluation and prover-backed containment checks to decide whether to apply, ask, or reject.
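A minimal sketch of that three-way gateway outcome, assuming policies can be flattened into sets of capability strings. The `decide` function, the `Decision` enum, and the capability names are hypothetical, and a real prover would reason over patterns and constraints rather than flat sets.

```python
from enum import Enum


class Decision(Enum):
    APPLY = "apply"    # contained, and nothing in it needs approval
    ASK = "ask"        # contained, but some part needs human/admin approval
    REJECT = "reject"  # not contained in the delegable boundary


def decide(proposed: frozenset[str],
           delegable: frozenset[str],
           approval_required: frozenset[str]) -> Decision:
    """Deterministic decision for a child policy proposed by an agent."""
    # Containment is checked first: anything outside the boundary is never negotiable.
    if not proposed <= delegable:
        return Decision.REJECT
    # Contained proposals that touch approval-gated capabilities go to a human/admin.
    if proposed & approval_required:
        return Decision.ASK
    return Decision.APPLY


delegable = frozenset({"fs:read:/repo", "proc:run:pytest", "net:connect:pypi.org"})
approval_required = frozenset({"net:connect:pypi.org"})

print(decide(frozenset({"fs:read:/repo"}), delegable, approval_required))
print(decide(frozenset({"net:connect:pypi.org"}), delegable, approval_required))
print(decide(frozenset({"fs:write:/etc"}), delegable, approval_required))
```

The point of the sketch is that the outcome depends only on set relations between policies, not on any judgment about whether the request "seems safe."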

The prover should establish containment across the delegation chain, where A <= B means that A grants no authority that B does not also grant, roughly:

child_policy <= parent_delegable_view <= parent_boundary <= human_admin_maximum_policy
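The chain above can be sketched as a sequence of pairwise subset checks. This assumes, purely for illustration, that each policy is a flat set of capability strings, so Python's set `<=` stands in for policy containment; the `first_broken_link` helper and the capability names are hypothetical.

```python
from typing import Optional

# Order follows the chain: narrowest policy first, broadest last.
CHAIN = (
    "child_policy",
    "parent_delegable_view",
    "parent_boundary",
    "human_admin_maximum_policy",
)


def first_broken_link(policies: dict[str, frozenset[str]]) -> Optional[tuple[str, str]]:
    """Return the first (inner, outer) pair where inner is not contained in outer."""
    for inner, outer in zip(CHAIN, CHAIN[1:]):
        if not policies[inner] <= policies[outer]:
            return (inner, outer)
    return None  # every link holds, so the child is contained in the admin maximum


policies = {
    "child_policy": frozenset({"fs:read:/repo/logs"}),
    "parent_delegable_view": frozenset({"fs:read:/repo/logs", "proc:run:pytest"}),
    "parent_boundary": frozenset({"fs:read:/repo/logs", "proc:run:pytest", "fs:write:/repo"}),
    "human_admin_maximum_policy": frozenset(
        {"fs:read:/repo/logs", "proc:run:pytest", "fs:write:/repo", "net:connect:pypi.org"}
    ),
}
print(first_broken_link(policies))
```

Reporting the failing link, rather than a bare true/false, is a design choice that would make rejections easier to explain and audit.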

The spike should explore what this proof needs to cover, including:

  • Network, filesystem, process, tool, provider, and credential authority.
  • Explicit deny rules.
  • Review requirements.
  • Credential target constraints.
  • Provider constraints.
  • Communication authority between parent and child sandboxes.
  • Transitive delegation if spawned agents can spawn more agents.
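To show why plain allow-set containment is not enough once explicit denies and review requirements enter the picture, here is a small sketch. The `Policy` type, its three fields, and the `violations` helper are assumptions for illustration; real policies would involve path, host, and credential-target patterns rather than flat strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    allow: frozenset[str]
    deny: frozenset[str] = frozenset()    # explicit denies override allows
    review: frozenset[str] = frozenset()  # capabilities that require review before use

    def effective(self) -> frozenset[str]:
        return self.allow - self.deny


def violations(child: Policy, parent: Policy) -> list[str]:
    """List the ways a child policy escapes the parent policy (empty means contained)."""
    problems = []
    # 1. The child may not reach anything the parent cannot reach after its denies.
    extra = child.effective() - parent.effective()
    if extra:
        problems.append(f"exceeds parent: {sorted(extra)}")
    # 2. Anything review-required for the parent stays review-required for the child.
    dropped = (parent.review & child.effective()) - child.review
    if dropped:
        problems.append(f"drops review requirement: {sorted(dropped)}")
    return problems


parent = Policy(
    allow=frozenset({"fs:read:/repo", "fs:write:/repo", "net:connect:internal"}),
    deny=frozenset({"net:connect:internal"}),
    review=frozenset({"fs:write:/repo"}),
)
child = Policy(allow=frozenset({"fs:read:/repo", "fs:write:/repo"}))
print(violations(child, parent))
```

In this example the child's allow set is a subset of the parent's, yet the proposal is still flagged because it silently drops the review requirement on the write capability.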

Core Research Questions

This spike should explore:

  • What does it mean for one agent to delegate authority to another agent?
  • Should the child boundary derive from the parent’s maximum policy, current effective policy, or a separate delegable subset?
  • Which authorities should be delegable, and which should never be delegable?
  • How should the parent describe the child’s task scope?
  • How should the system prove that a child policy is contained within the allowed delegation boundary?
  • How do explicit denies, review requirements, provider constraints, and credential constraints carry over?
  • When should delegation require human/admin approval?
  • Can spawned agents spawn further agents? If so, how does transitive delegation work?
  • What policy should govern communication between parent and child agents?
  • What should the parent be able to see about the child’s policy, denials, approvals, and results?
  • What should the child be able to see about the parent’s policy?
  • What audit records are required to reconstruct delegation and communication decisions later?
  • What requirements does this create for sandbox-to-sandbox communication (see Sandbox to Sandbox Communications #1049)?

Areas To Explore

The spike may compare several models:

  • Spawner-narrowed policy: the parent proposes a task-specific maximum policy for the child.
  • Delegable capability view: only an explicit subset of the parent’s authority can be passed to children.
  • Task-scoped policy: the child boundary is derived from the work being delegated.
  • Review-preserving delegation: review-required capabilities remain review-required after delegation.
  • Credential-aware delegation: credentials have explicit delegation rules and cannot be treated like ordinary reachability.
  • Communication-aware policy: parent-child communication is modeled as a separate policy-controlled capability.
  • Transitive delegation: spawned agents may or may not be allowed to spawn additional agents, with containment proven at each step.
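The delegable-view and transitive models can be combined in a sketch where containment is checked at every spawn, so that by induction no descendant exceeds the root's delegable view. Everything here is hypothetical: the `Agent` type, the `spawn` function, the `MAX_DEPTH` limit, and the use of flat capability sets are assumptions, not a proposed design.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 2  # hypothetical limit on how deep a delegation chain may go


@dataclass
class Agent:
    name: str
    policy: frozenset[str]     # what this agent may do
    delegable: frozenset[str]  # the subset it may pass on to children
    depth: int = 0
    children: list["Agent"] = field(default_factory=list)


def spawn(parent: Agent, name: str, policy: frozenset[str],
          delegable: frozenset[str] = frozenset()) -> Agent:
    """Create a child agent only if containment holds at this step."""
    if parent.depth + 1 > MAX_DEPTH:
        raise PermissionError(f"{name}: delegation depth limit reached")
    if not policy <= parent.delegable:
        raise PermissionError(f"{name}: policy exceeds what {parent.name} may delegate")
    if not delegable <= policy:
        raise PermissionError(f"{name}: cannot re-delegate authority it does not hold")
    child = Agent(name, policy, delegable, parent.depth + 1)
    parent.children.append(child)
    return child


root = Agent(
    "parent",
    policy=frozenset({"fs:read:/repo", "fs:write:/repo", "proc:run:pytest"}),
    delegable=frozenset({"fs:read:/repo", "proc:run:pytest"}),
)
tester = spawn(root, "tester", frozenset({"fs:read:/repo", "proc:run:pytest"}),
               delegable=frozenset({"fs:read:/repo"}))
reader = spawn(tester, "reader", frozenset({"fs:read:/repo"}))
```

Because each child's delegable view must sit inside its own policy, and its policy inside the parent's delegable view, a grandchild can never regain authority that was dropped higher up the chain.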

Expected Output

The output should be a short research note or design sketch that covers:

  • The main product use cases for spawned-agent delegation.
  • The security invariants required for safe delegation.
  • Tradeoffs between maximum policy, effective policy, and delegable subsets.
  • How the policy prover should prove containment.
  • How to prevent privilege expansion through spawned agents.
  • How credentials and providers should behave when delegated.
  • How review requirements should carry across spawned-agent boundaries.
  • What spawned-agent workflows require from sandbox-to-sandbox communication (see Sandbox to Sandbox Communications #1049).
  • Recommended follow-up issues.

Non-Goals

  • Do not implement production spawned-agent policy delegation.
  • Do not implement sandbox-to-sandbox communication in this issue.
  • Do not design a complete multi-agent orchestration framework.
  • Do not allow agents to expand beyond the human/admin maximum policy boundary.
  • Do not rely on an LLM judge to decide whether delegation is safe.
  • Do not focus primarily on OpenShell code changes before the delegation model and prover obligations are clear.

Labels: area:policy (Policy engine and policy lifecycle work), area:sandbox (Sandbox runtime and isolation work), roadmap, spike
