OBO + M2M token propagation: mechanism and configuration #2108

@QuentinBisson

Summary

The STS token-propagation plugin (Go: go/adk/pkg/sts; Python: agentsts-adk)
supports on-behalf-of (OBO) delegation today, but only when a request carries a
user token. There is no machine-to-machine (M2M) path: when no user token is
present, the plugin propagates nothing, so neither an autonomous/cron request
nor the agent's own connection to a protected MCP server can authenticate.

Before implementing, I would like to agree on the config surface and the mode
semantics. RFC 8707 resource/audience scoping is being handled separately
and is already up as a small, self-contained change: #2106 (Go) and #2107
(Python).

What main does today

  • BeforeRunCallback (Go) / before_run_callback (Python) extracts the user
    bearer token. If there is no user token it returns early and propagates
    nothing.
  • When a user token is present and an STS is configured, it runs an RFC 8693
    exchange with subject = user token and actor = the workload ServiceAccount
    token, and injects the exchanged token into MCP requests.
  • The SA token is read from /var/run/secrets/kubernetes.io/serviceaccount/token
    (ActorTokenService).

Gaps:

  1. No M2M fallback. Tokenless requests get no credential.
  2. The connection/discovery handshake (initialize + tools/list) runs with no
    session and no user, so the per-session header provider returns nothing.
    Today only a static headersFrom secret can authenticate that connection.
  3. Mode is effectively a process-wide env toggle; there is no per-request
    OBO/M2M selection.

Goal

Per request, emit either OBO (subject = user, actor = SA) or a machine
credential, selected by whether a user token is present, with clear failure
semantics, and let the agent authenticate its own connection to a protected MCP
server.

Mechanism

The agent runtime performs the token exchange in-process: it calls the STS
itself (RFC 8693) and injects the resulting token into MCP requests. This is
what main already does for OBO, so M2M is a natural extension of the same
path: exchange the SA token (as subject, no actor) or forward it raw. Token
handling stays self-contained in the runtime and assumes nothing about the
downstream topology.
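
To make the difference between the two exchanges concrete, here is a minimal sketch of the RFC 8693 request body for each case. The parameter names and the grant type URN come from RFC 8693. The function name exchangeForm is hypothetical, and the jwt token type for both tokens is an assumption: an STS may expect the access_token type for the user token instead.

```go
package main

import "net/url"

const (
	// Grant type and token type identifiers defined by RFC 8693.
	grantTypeTokenExchange = "urn:ietf:params:oauth:grant-type:token-exchange"
	tokenTypeJWT           = "urn:ietf:params:oauth:token-type:jwt" // assumed for both tokens
)

// exchangeForm builds the form body for a token-exchange request.
//   - userToken != "": OBO, subject = user token, actor = SA token.
//   - userToken == "": M2M, subject = SA token, no actor.
func exchangeForm(userToken, saToken string) url.Values {
	form := url.Values{}
	form.Set("grant_type", grantTypeTokenExchange)
	if userToken != "" {
		form.Set("subject_token", userToken)
		form.Set("subject_token_type", tokenTypeJWT)
		form.Set("actor_token", saToken)
		form.Set("actor_token_type", tokenTypeJWT)
		return form
	}
	form.Set("subject_token", saToken)
	form.Set("subject_token_type", tokenTypeJWT)
	return form
}
```

The result of form.Encode() would be POSTed to the STS token endpoint as application/x-www-form-urlencoded. The resource and audience parameters from #2106/#2107 are left out here to keep the sketch focused on subject and actor.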

Secondary design questions (assuming a mechanism is chosen)

  1. Modes and default. Proposal: three modes.

    • obo (default): use the user token; if there is no user token on a tool
      call, error. Fail closed on exchange failure.
    • auto: OBO when a user token is present, M2M otherwise.
    • m2m: always machine identity; ignore any user token.

    Default to obo so that an interactive agent that loses its token errors
    rather than silently escalating to the agent's machine identity. With auto
    as the default, every agent would silently acquire M2M capability without an
    explicit opt-in, which is the wrong default in a multi-tenant cluster.

  2. Failure semantics. A request carrying a user token must never silently
    downgrade to machine identity. auto's no-token branch is the only place
    machine identity is used implicitly, which is why it is opt-in.

  3. Connection/discovery carve-out. initialize/tools/list are tokenless
    infrastructure, not user requests. The machine token should always be allowed
    for the handshake regardless of mode, otherwise obo agents could never
    connect to a protected backend. Consequence: if the backend gates tool
    visibility by caller identity, the agent's tool list (enumerated under machine
    identity) may not match what an OBO user can actually invoke. Backends that
    scope tool visibility by identity are not supported in this model; callers
    must expect tool-call authorization errors for tools the user cannot access.

  4. MachineIdentity applies in all modes. Because machine identity is always used
    for the initialize/tools/list handshake (see point 3), MachineIdentity must
    be specifiable even when mode: obo. Without it, the controller has nothing
    to render for the carve-out credential, and an OBO agent cannot connect to an
    audience-restricted backend. MachineIdentity should be valid (and in practice
    required) regardless of mode.

  5. Machine credential strategy. OBO needs an STS; M2M does not. A backend doing
    Kubernetes-native authz (TokenReview/SAR, or an audience-scoped projected SA
    token) wants the raw SA token, not an exchanged one. Suggest a narrow
    exchange | passthrough choice for the M2M path, independent of whether an
    STS is configured for OBO.

  6. Workload identity source. The actor/machine identity should not be
    hardcoded to the cluster SA token. STSIntegration already accepts a
    fetchActorToken hook, so the source can be pluggable: the default projected
    SA token, an audience-scoped projected token, or a SPIFFE JWT-SVID fetched
    from the Workload API. (Only JWT-SVIDs apply; X.509-SVIDs are mTLS, not a
    token-exchange input.) The config surface below should be able to name the
    source, not assume /var/run/secrets/.../token.

    Scope note: SPIFFE JWT-SVID support (source: spiffe) adds a Workload API
    socket mount and a new rotation lifecycle, meaningfully larger than the SA token
    variants. Proposal: land serviceAccountToken and projectedToken in the
    initial implementation and defer spiffe to a follow-up so the first PR stays
    self-contained.

  7. Config surface. Prefer a schema-validated
    Agent.spec.auth { mode, machineIdentity { credential, source, audience } }
    rendered by the controller into agent-runtime env, over raw env vars as the
    public API (see appendix). Token scoping (resource/audience) stays in the env
    vars from #2106/#2107, independent of mode. Precedence: with spec.auth set,
    the managed token wins over a static headersFrom Authorization; without it,
    static wins (unchanged). No opt-out flag: pinning a static token means not
    setting spec.auth.

  8. STS authorization prerequisite. When mode is auto or m2m and the machine
    credential is exchanged, the STS must be configured to accept the agent's
    ServiceAccount token as the exchange subject (in OBO it appears only as the
    actor) before exchanges succeed. This is an operational prerequisite that
    must be addressed in tandem with any code change.

  9. Lifecycle, caching, and concurrency. SA tokens rotate in place and
    exchanged tokens expire, so the machine credential must re-read the SA file and
    refresh before expiry; concurrent first-use should not stampede the STS
    (singleflight / per-key lock). For OBO, the cache key must include the user's
    identity (sub, and iss if tokens from multiple issuers are possible), not
    just the mode, to prevent two users on the same agent from sharing a cached
    exchanged token.
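
The caching and concurrency point in the last item can be sketched as follows, assuming a cache keyed on mode, issuer, subject, and audience, plus a small stdlib-only stand-in for singleflight. The names cacheKey and exchangeGroup are hypothetical. A real implementation might use golang.org/x/sync/singleflight and would also track token expiry, which this sketch omits.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"strings"
	"sync"
)

// cacheKey derives a key that differs per user, so two users on the same
// agent never share an exchanged token. For M2M, iss and sub can be empty.
func cacheKey(mode, iss, sub, audience string) string {
	joined := strings.Join([]string{mode, iss, sub, audience}, "\x00")
	sum := sha256.Sum256([]byte(joined))
	return hex.EncodeToString(sum[:])
}

// call tracks one in-flight exchange and its result.
type call struct {
	done  chan struct{}
	token string
	err   error
}

// exchangeGroup coalesces concurrent exchanges for the same key so that
// first use by many goroutines results in a single STS request.
type exchangeGroup struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs exchange once per key at a time; concurrent callers with the same
// key wait for the in-flight call and receive its result.
func (g *exchangeGroup) Do(key string, exchange func() (string, error)) (string, error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		g.mu.Unlock()
		<-c.done // result fields are written before done is closed
		return c.token, c.err
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.token, c.err = exchange()
	close(c.done)

	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	return c.token, c.err
}
```

The group only deduplicates in-flight work. A separate expiry-aware cache in front of it would serve repeat requests, and a panic inside exchange would leave waiters blocked, which a production version would need to handle.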

Proposed rollout

Land RFC 8707 resource/audience first (#2106, #2107, already open). Then
implement the agreed mechanism in Go, with Python parity to follow. Each change
keeps main working and secure by default. SPIFFE source support follows in a
separate PR once the initial implementation is stable.

Questions for maintainers

  • Is obo-as-default acceptable, or do you prefer auto?
  • Is spec.auth the right place for this config, or do you want it elsewhere?
  • Confirm: defer source: spiffe to a follow-up PR?

Appendix: example spec.auth shape (for discussion)

A concrete starting point, not a final API. The controller renders these fields
into the agent-runtime env the plugin already reads, so the runtime change is
small and the CRD is the user-facing surface.

// AuthConfig configures how the agent authenticates to MCP backends.
type AuthConfig struct {
    // Mode selects how each request maps to the credential sent to MCP backends.
    //   obo  (default): use the user token; error if absent (fail closed).
    //   auto:           OBO when a user token is present, M2M otherwise.
    //   m2m:            always machine identity; ignore any user token.
    // +kubebuilder:validation:Enum=obo;auto;m2m
    // +kubebuilder:default=obo
    // +optional
    Mode string `json:"mode,omitempty"`

    // MachineIdentity configures the workload credential used when no user token is
    // present and for the connection/discovery handshake (all modes).
    // +optional
    MachineIdentity *MachineIdentityConfig `json:"machineIdentity,omitempty"`
}

// MachineIdentityConfig configures how the agent presents its own workload identity.
type MachineIdentityConfig struct {
    // Credential selects whether the workload identity is exchanged at the STS
    // (default when an STS is configured) or sent raw to the backend.
    // +kubebuilder:validation:Enum=exchange;passthrough
    // +optional
    Credential string `json:"credential,omitempty"`

    // Source selects the workload identity token.
    //   serviceAccountToken (default): the projected cluster SA token.
    //   projectedToken:                an audience-scoped projected token (see Audience).
    // +kubebuilder:validation:Enum=serviceAccountToken;projectedToken
    // +kubebuilder:default=serviceAccountToken
    // +optional
    Source string `json:"source,omitempty"`

    // Audience for the projected token when Source=projectedToken.
    // +optional
    Audience string `json:"audience,omitempty"`
}
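
The mode semantics documented on AuthConfig.Mode, combined with the connection/discovery carve-out from the design questions, can be sketched as a single decision function. This is illustrative only: resolveCredential, credentialKind, and the handshake flag are hypothetical names, and treating an empty mode as obo mirrors the proposed default.

```go
package main

import (
	"errors"
	"fmt"
)

// credentialKind names which credential a request should carry.
type credentialKind int

const (
	credentialNone credentialKind = iota
	credentialOBO                 // subject = user token, actor = SA token
	credentialMachine             // workload identity only
)

var errNoUserToken = errors.New("obo mode: no user token on tool call")

// resolveCredential picks the credential for one request.
// handshake is true for initialize/tools/list, which always use the machine
// identity regardless of mode.
func resolveCredential(mode string, hasUserToken, handshake bool) (credentialKind, error) {
	if handshake {
		return credentialMachine, nil
	}
	switch mode {
	case "", "obo":
		if !hasUserToken {
			return credentialNone, errNoUserToken // fail closed
		}
		return credentialOBO, nil
	case "auto":
		if hasUserToken {
			return credentialOBO, nil // never downgrade when a user token exists
		}
		return credentialMachine, nil
	case "m2m":
		return credentialMachine, nil
	default:
		return credentialNone, fmt.Errorf("unknown auth mode %q", mode)
	}
}
```

Keeping the decision in one pure function makes the "never silently downgrade" rule easy to test: the only path from a tool call to credentialMachine without an explicit m2m mode is the auto branch with no user token.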

Example Agent that serves interactive and cron traffic against one backend,
delegating as the user when a token is present and exchanging its own SA token
otherwise:

apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: sre-agent
spec:
  auth:
    mode: auto
    machineIdentity:
      credential: exchange
      source: serviceAccountToken
  # ... model, tools, etc.

Controller rendering (illustrative) into the runtime Deployment env, which is
what the plugin reads:

KAGENT_TOKEN_MODE=auto
KAGENT_MACHINE_CREDENTIAL=exchange
KAGENT_MACHINE_SOURCE=serviceAccountToken
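
A rough sketch of that rendering step, assuming the controller maps each field to one variable. The types are trimmed copies of the appendix structs and renderEnv is a hypothetical helper. The audience field is deliberately not rendered because no env variable name for it is given here.

```go
package main

// Trimmed copies of the appendix types, without JSON tags or markers.
type MachineIdentityConfig struct {
	Credential string
	Source     string
	Audience   string
}

type AuthConfig struct {
	Mode            string
	MachineIdentity *MachineIdentityConfig
}

// renderEnv turns spec.auth into KEY=VALUE pairs for the runtime Deployment.
// A nil config renders nothing, which keeps today's behavior unchanged.
func renderEnv(cfg *AuthConfig) []string {
	if cfg == nil {
		return nil
	}
	mode := cfg.Mode
	if mode == "" {
		mode = "obo" // proposed default
	}
	env := []string{"KAGENT_TOKEN_MODE=" + mode}
	if mi := cfg.MachineIdentity; mi != nil {
		if mi.Credential != "" {
			env = append(env, "KAGENT_MACHINE_CREDENTIAL="+mi.Credential)
		}
		source := mi.Source
		if source == "" {
			source = "serviceAccountToken" // proposed default
		}
		env = append(env, "KAGENT_MACHINE_SOURCE="+source)
	}
	return env
}
```

Applied to the sre-agent example, this yields the three variables listed above, in the same order.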

Token scoping (KAGENT_TOKEN_RESOURCE / KAGENT_TOKEN_AUDIENCE, RFC 8707/8693)
stays in the env vars introduced in #2106/#2107, independent of mode; it is not
part of spec.auth here.

Outbound Authorization precedence: with no spec.auth, a static headersFrom
Authorization wins (today's behavior, unchanged). With spec.auth set, the
managed token wins. A user who wants a pinned static token simply does not set
spec.auth; no opt-out flag is needed.
