
[BUG] Subagent session isolation #2137

Description

@dimetron

📋 Prerequisites

  • I have searched the existing issues to avoid creating a duplicate
  • By submitting this issue, you agree to follow our Code of Conduct
  • I am using the latest version of the software
  • I have tried to clear cache/cookies or used incognito mode (if ui-related)
  • I can consistently reproduce this issue

🎯 Affected Service(s)

Not Sure

🚦 Impact/Severity

No impact (Default)

🐛 Bug Description

EP: Per-call session isolation for Agent tools (isolateSessions)

Goal

Let a declarative agent fan out parallel, isolated sub-agent calls. Today every
call the parent makes to a given sub-agent (Agent tool) reuses one A2A
context_id, so all calls land in a single sub-agent session and are processed
as one conversation instead of independent sessions.

Add an opt-in isolateSessions flag on Agent tools. When enabled, each call to that
sub-agent uses a fresh context_id, so every invocation runs in its own isolated
sub-agent session — the Go equivalent of ADK's AgentTool "fresh session/context per
invocation" (the pattern the Python coordinator/dispatcher uses for per-repository
context isolation).

Problem (root cause)

go/adk/pkg/tools/remote_a2a_tool.go generates lastContextID once, at tool
construction (pod startup), and stamps it on every outbound A2A message:

state := &remoteA2AState{ /* ... */ lastContextID: a2atype.NewContextID() }
// ...
message.ContextID = s.lastContextID   // same id for every call, forever

The worker derives its session id straight from that context id
(go/adk/pkg/a2a/executor.go: sessionID := reqCtx.ContextID). So N parallel calls
→ 1 shared worker session, serialized/interleaved rather than N parallel sessions.

Observed: a coordinator emitted 11 sub-agent calls in one model turn; all 11 collapsed
into a single worker session (11 A2A tasks in one context) instead of 11 isolated
sessions.
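
The collapse described above can be reproduced in miniature. The sketch below is a simplified model, not the actual remote_a2a_tool.go code: newContextID stands in for a2atype.NewContextID(), and sharedTool, fanOut, and distinctSessions are hypothetical names introduced only for illustration. It assumes, as stated above, that the worker uses the context id directly as its session id.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
)

// newContextID is a stand-in for a2atype.NewContextID(): it returns a random id.
func newContextID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// sharedTool models the current behavior: one context id minted at construction.
type sharedTool struct {
	lastContextID string
}

func newSharedTool() *sharedTool {
	return &sharedTool{lastContextID: newContextID()}
}

// fanOut returns the context id stamped on each of n outbound calls.
func fanOut(t *sharedTool, n int) []string {
	ids := make([]string, 0, n)
	for i := 0; i < n; i++ {
		ids = append(ids, t.lastContextID) // same id for every call
	}
	return ids
}

// distinctSessions counts worker sessions, assuming sessionID == contextID.
func distinctSessions(contextIDs []string) int {
	seen := make(map[string]struct{}, len(contextIDs))
	for _, id := range contextIDs {
		seen[id] = struct{}{}
	}
	return len(seen)
}
```

In this model, distinctSessions(fanOut(newSharedTool(), 11)) evaluates to 1, which matches the observed 11 tasks in one context.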

The Python low-level tool (_remote_a2a_tool.py) has the identical shared-context
behavior; the Python side only gets isolation because a custom dispatcher wraps the
worker in ADK AgentTool, which mints a fresh session per call. This EP brings the
same capability to the Go declarative runtime without a custom agent.

Design

isolateSessions is a per-Agent-tool boolean.

  • false (default): unchanged. One stable context_id per tool for its lifetime
    → session continuity for stateful sub-agents.
  • true: context_id is minted per call via a2atype.NewContextID() → each call
    is an isolated sub-agent session; parallel fan-out no longer shares state/history.
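
A minimal sketch of the two modes, assuming a reduced tool struct. remoteTool, contextIDForCall, and parallelFanOut are hypothetical names, and newContextID is a local stand-in for a2atype.NewContextID(); the real tool carries more state than shown here.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"sync"
)

// newContextID is a stand-in for a2atype.NewContextID(): it returns a random id.
func newContextID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// remoteTool is a hypothetical, reduced version of the Agent tool state.
type remoteTool struct {
	isolateSessions bool
	stableContextID string // used only when isolateSessions is false
}

func newRemoteTool(isolate bool) *remoteTool {
	return &remoteTool{isolateSessions: isolate, stableContextID: newContextID()}
}

// contextIDForCall picks the context id for one outbound call.
func (t *remoteTool) contextIDForCall() string {
	if t.isolateSessions {
		return newContextID() // fresh id, so a fresh sub-agent session
	}
	return t.stableContextID // default: one session for the tool's lifetime
}

// parallelFanOut issues n calls concurrently and returns how many distinct
// context ids (and therefore sub-agent sessions) they used.
func parallelFanOut(t *remoteTool, n int) int {
	ids := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			ids[i] = t.contextIDForCall() // each goroutine writes its own slot
		}(i)
	}
	wg.Wait()

	seen := make(map[string]struct{}, n)
	for _, id := range ids {
		seen[id] = struct{}{}
	}
	return len(seen)
}
```

With isolation on, parallelFanOut(newRemoteTool(true), 11) yields 11 distinct ids; with the default it yields 1. Minting the id inside the call path rather than at construction is the only behavioral difference between the two modes in this sketch.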

Cross-conversation/turn continuity that stateful workers need does not depend on
the message context_id: it rides the x-kagent-root-context-id header
(lineageHeadersInterceptor), which stays stable per conversation regardless of this
flag. So isolation only changes which sub-agent session a turn is recorded in.
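
To illustrate why isolation does not break conversation-level continuity, the sketch below keys worker-side state on the root context header instead of the message context id. It assumes the header travels as an ordinary HTTP header; outboundCall, conversationState, and record are hypothetical names and do not describe how lineageHeadersInterceptor is actually implemented.

```go
package main

import "net/http"

// rootContextHeader is the lineage header named in the proposal.
const rootContextHeader = "x-kagent-root-context-id"

// outboundCall is a hypothetical view of one sub-agent request.
type outboundCall struct {
	MessageContextID string      // varies per call when isolation is on
	Headers          http.Header // carries the stable root context id
}

// newOutboundCall builds a call for the given conversation.
func newOutboundCall(rootContextID, messageContextID string) outboundCall {
	h := http.Header{}
	h.Set(rootContextHeader, rootContextID)
	return outboundCall{MessageContextID: messageContextID, Headers: h}
}

// conversationState is a hypothetical worker-side store keyed on the root
// context id, so it is unaffected by per-call message context ids.
type conversationState struct {
	notes map[string][]string
}

func newConversationState() *conversationState {
	return &conversationState{notes: make(map[string][]string)}
}

// record appends a note under the call's root context id and returns how
// many notes that conversation now holds.
func (s *conversationState) record(c outboundCall, note string) int {
	root := c.Headers.Get(rootContextHeader)
	s.notes[root] = append(s.notes[root], note)
	return len(s.notes[root])
}
```

Two calls with different MessageContextID values but the same root context id accumulate under one key, while a call from another conversation starts a new one.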

UI linking

The executor pre-stamps outgoing function_call parts with a pre-known
subagent_session_id from a startup map (subagentSessionIDs, keyed by tool name).
With per-call ids there is no single pre-known id, so an isolated tool contributes
no map entry: the UI links each sub-agent card via the per-call
subagent_session_id returned in the tool's function_response (already consumed by
AgentCallDisplay).
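
The map-building rule can be sketched as follows. The types and function names are hypothetical and reduced for illustration. The fallback order in sessionIDForCard is an assumption about how a consumer might combine the two sources, not a description of AgentCallDisplay.

```go
package main

// agentToolConfig is a hypothetical, reduced view of one Agent tool.
type agentToolConfig struct {
	Name            string
	IsolateSessions bool
	StableContextID string // pre-known only when sessions are shared
}

// buildSubagentSessionIDs models the startup map keyed by tool name.
// Isolated tools contribute no entry because no single id is known up front.
func buildSubagentSessionIDs(tools []agentToolConfig) map[string]string {
	m := make(map[string]string)
	for _, t := range tools {
		if t.IsolateSessions {
			continue
		}
		m[t.Name] = t.StableContextID
	}
	return m
}

// functionResponse is a hypothetical shape for the tool result.
type functionResponse struct {
	ToolName          string
	SubagentSessionID string // per-call id returned by the tool
}

// sessionIDForCard prefers the per-call id from the response and falls back
// to the pre-stamped startup id (assumed ordering, for illustration only).
func sessionIDForCard(startup map[string]string, resp functionResponse) string {
	if resp.SubagentSessionID != "" {
		return resp.SubagentSessionID
	}
	return startup[resp.ToolName]
}
```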

Changes

| File | Change |
| --- | --- |
| go/api/v1alpha2/agent_types.go | Add IsolateSessions *bool to Tool (+ CEL validation: only for type=Agent) |
| go/api/adk/types.go | Add IsolateSessions bool (isolate_sessions) to RemoteAgentConfig |
| go/core/internal/controller/translator/agent/compiler.go | Copy tool.IsolateSessions → RemoteAgentConfig.IsolateSessions |
| go/adk/pkg/agent/agent.go | Pass flag to NewKAgentRemoteA2ATool; skip stamp-map entry when isolated |
| go/adk/pkg/tools/remote_a2a_tool.go | isolateSessions field; per-call context_id; thread it into processResult |
| python/packages/kagent-adk/src/kagent/adk/types.py | Add isolate_sessions field for config-schema parity (Go runtime honors it) |
| generated | CRD YAML + DeepCopy via make controller-manifests; golden files via UPDATE_GOLDEN=true make -C go test |
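
A rough sketch of how the flag might flow from the API type to the runtime config. Only the IsolateSessions field names and the isolate_sessions JSON key come from the proposal. The other fields, the validate and compile helpers, and the error text are hypothetical, and the real CEL rule lives on the CRD rather than in Go code.

```go
package main

import (
	"encoding/json"
	"errors"
)

// Tool is a reduced, hypothetical view of the v1alpha2 Tool type.
type Tool struct {
	Type            string
	AgentName       string
	IsolateSessions *bool // nil means "unset", treated as false
}

// RemoteAgentConfig is a reduced view of the ADK config type.
type RemoteAgentConfig struct {
	Name            string `json:"name"`
	IsolateSessions bool   `json:"isolate_sessions"`
}

// validate mirrors the intent of the CEL rule: the flag is only allowed
// on tools of type Agent.
func validate(t Tool) error {
	if t.IsolateSessions != nil && t.Type != "Agent" {
		return errors.New("isolateSessions is only valid for type=Agent")
	}
	return nil
}

// compile copies tool.IsolateSessions into RemoteAgentConfig.IsolateSessions.
func compile(t Tool) (RemoteAgentConfig, error) {
	if err := validate(t); err != nil {
		return RemoteAgentConfig{}, err
	}
	cfg := RemoteAgentConfig{Name: t.AgentName}
	if t.IsolateSessions != nil {
		cfg.IsolateSessions = *t.IsolateSessions
	}
	return cfg, nil
}

// configJSON renders the config as the runtime would read it.
func configJSON(cfg RemoteAgentConfig) (string, error) {
	b, err := json.Marshal(cfg)
	return string(b), err
}
```

Using a pointer on the API side keeps "unset" distinguishable from an explicit false, while the runtime config only needs the resolved boolean.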

Deployment / configuration

apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: coordinator
spec:
  declarative:
    runtime: go
    tools:
      - type: Agent
        agent:
          name: worker
        isolateSessions: true   # each worker call = its own isolated session
  • Unset / false → current behavior (one shared sub-agent session).
  • true → fresh sub-agent session per call; enables parallel isolated fan-out.

Scope / non-goals

  • Runtime: honored by the Go declarative runtime. The field is accepted by the
    Python config model (schema parity) but the Python low-level tool is unchanged.
  • Not a concurrency limiter — see max-concurrency.md for capping parallel
    invocations per pod. Isolation and concurrency limiting are complementary.
  • No change to HITL resume: resume still targets the original sub-agent session via
    the context_id stored in the confirmation payload.
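
The HITL point can be sketched as a precedence rule: a resume reuses the stored context id, and only new calls mint a fresh one. confirmationPayload, its ContextID field, and contextIDFor are hypothetical names; the actual payload shape is not shown in this proposal.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
)

// newContextID is a stand-in for a2atype.NewContextID(): it returns a random id.
func newContextID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// confirmationPayload is a hypothetical shape for the stored HITL state.
type confirmationPayload struct {
	ContextID string // context id of the sub-agent session awaiting confirmation
}

// remoteTool is a hypothetical, reduced version of the Agent tool state.
type remoteTool struct {
	isolateSessions bool
	stableContextID string
}

// contextIDFor selects the context id for a call. A resume always targets
// the original session; otherwise the isolateSessions flag decides.
func (t *remoteTool) contextIDFor(resume *confirmationPayload) string {
	if resume != nil && resume.ContextID != "" {
		return resume.ContextID
	}
	if t.isolateSessions {
		return newContextID()
	}
	return t.stableContextID
}
```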

Behavior notes & caveats

  • Stateful sub-agents that rely on reused session history across calls must keep
    isolateSessions: false (or key state on x-kagent-root-context-id).
  • With isolation on, a sub-agent's per-call sessions have source='agent' and are
    hidden from the worker's sidebar by design; they render inline in the parent chat.
  • Parallelism is bounded by how many tool calls the parent LLM emits in a turn and by
    any KAGENT_MAX_CONCURRENCY on the worker pod.

🔄 Steps To Reproduce

  1. Create a worker agent (WorkerAgent).
  2. Create a coordinator agent with WorkerAgent configured as an Agent tool.
  3. Ask the coordinator agent to call the worker with 3 tasks in parallel.

🤔 Expected Behavior

No response

📱 Actual Behavior

No response

💻 Environment

No response

🔧 CLI Bug Report

No response

🔍 Additional Context

No response

📋 Logs

📷 Screenshots

No response

🙋 Are you willing to contribute?

  • I am willing to submit a PR to fix this issue
