# OBO + M2M token propagation: mechanism and configuration

## Summary

The STS token-propagation plugin (Go `go/adk/pkg/sts`, Python `agentsts-adk`)
supports on-behalf-of (OBO) delegation today, but only when a request carries a
user token. There is no machine-to-machine (M2M) path: when no user token is
present, the plugin propagates nothing, so an autonomous/cron request, and the
agent's own connection to a protected MCP server, cannot authenticate.

Before implementing, I would like to agree on the config surface and the mode
semantics. RFC 8707 `resource`/`audience` scoping is being handled separately
and is already up as a small, self-contained change: #2106 (Go) and #2107
(Python).

## What `main` does today

- `BeforeRunCallback` (Go) / `before_run_callback` (Python) extracts the user
  bearer token. If there is no user token it returns early and propagates
  nothing.
- When a user token is present and an STS is configured, it runs an RFC 8693
  exchange with `subject` = user token and `actor` = the workload ServiceAccount
  token, and injects the exchanged token into MCP requests.
- The SA token is read from `/var/run/secrets/kubernetes.io/serviceaccount/token`
  (`ActorTokenService`).

Gaps:

1. No M2M fallback. Tokenless requests get no credential.
2. The connection/discovery handshake (`initialize` + `tools/list`) runs with no
   session and no user, so the per-session header provider returns nothing.
   Today only a static `headersFrom` secret can authenticate that connection.
3. Mode is effectively a process-wide env toggle; there is no per-request
   OBO/M2M selection.

## Goal

Per request, emit either OBO (`subject` = user, `actor` = SA) or a machine
credential, selected by whether a user token is present, with clear failure
semantics, and let the agent authenticate its own connection to a protected MCP
server.

## Mechanism

The agent runtime performs the token exchange in-process: it calls the STS
itself (RFC 8693) and injects the resulting token into MCP requests. This is
what `main` already does for OBO, so M2M is a natural extension of the same
path: exchange the SA token (as `subject`, no actor) or forward it raw. Token
handling stays self-contained in the runtime and assumes nothing about the
downstream topology.
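
As a rough illustration of how close the two exchange shapes are, the sketch below builds the RFC 8693 form body for both cases. The parameter names and URNs come from RFC 8693; the helper name `exchangeForm` is hypothetical, and treating the user token as a JWT (`token-type:jwt`) is an assumption, since the plugin may declare a different subject token type.

```go
package main

import (
	"fmt"
	"net/url"
)

// Identifiers defined by RFC 8693.
const (
	grantTypeTokenExchange = "urn:ietf:params:oauth:grant-type:token-exchange"
	tokenTypeJWT           = "urn:ietf:params:oauth:token-type:jwt"
)

// exchangeForm is a hypothetical helper that builds the form body of an
// RFC 8693 token-exchange request.
//   - With a user token: OBO shape (subject = user, actor = SA).
//   - Without one:       M2M shape (subject = SA, no actor).
//
// ASSUMPTION: both tokens are declared as JWTs.
func exchangeForm(userToken, saToken string) url.Values {
	form := url.Values{}
	form.Set("grant_type", grantTypeTokenExchange)
	if userToken != "" {
		form.Set("subject_token", userToken)
		form.Set("subject_token_type", tokenTypeJWT)
		form.Set("actor_token", saToken)
		form.Set("actor_token_type", tokenTypeJWT)
		return form
	}
	form.Set("subject_token", saToken)
	form.Set("subject_token_type", tokenTypeJWT)
	return form
}

func main() {
	fmt.Println("OBO:", exchangeForm("user-jwt", "sa-jwt").Encode())
	fmt.Println("M2M:", exchangeForm("", "sa-jwt").Encode())
}
```

The only difference between the two requests is which token sits in `subject_token` and whether `actor_token` is sent, which is why M2M can reuse the existing exchange path.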

## Secondary design questions (assuming a mechanism is chosen)

1. **Modes and default.** Proposal: three modes.
   - `obo` (default): use the user token; if there is no user token on a tool
     call, error. Fail closed on exchange failure.
   - `auto`: OBO when a user token is present, M2M otherwise.
   - `m2m`: always machine identity; ignore any user token.
   Default `obo` so an interactive agent that loses its token errors rather than
   silently escalating to the agent's machine identity. `auto` as a default
   would mean every agent silently acquires M2M capability without an explicit
   opt-in, which is the wrong default in a multi-tenant cluster.

2. **Failure semantics.** A request carrying a user token must never silently
   downgrade to machine identity. `auto`'s no-token branch is the only place
   machine identity is used implicitly, which is why it is opt-in.

3. **Connection/discovery carve-out.** `initialize`/`tools/list` are tokenless
   infrastructure, not user requests. The machine token should always be allowed
   for the handshake regardless of mode, otherwise `obo` agents could never
   connect to a protected backend. Consequence: if the backend gates tool
   visibility by caller identity, the agent's tool list (enumerated under machine
   identity) may not match what an OBO user can actually invoke. Backends that
   scope tool visibility by identity are not supported in this model; callers
   must expect tool-call authorization errors for tools the user cannot access.

4. **`MachineIdentity` applies in all modes.** Because machine identity is always used
   for the `initialize`/`tools/list` handshake (see point 3), `MachineIdentity` must
   be specifiable even when `mode: obo`. Without it, the controller has nothing
   to render for the carve-out credential, and an OBO agent cannot connect to an
   audience-restricted backend. `MachineIdentity` should be valid (and in practice
   required) regardless of mode.

5. **Machine credential strategy.** OBO needs an STS; M2M need not. A backend doing
   Kubernetes-native authz (TokenReview/SAR, or an audience-scoped projected SA
   token) wants the raw SA token, not an exchanged one. Suggest a narrow
   `exchange | passthrough` choice for the M2M path, independent of whether an
   STS is configured for OBO.

6. **Workload identity source.** The actor/machine identity should not be
   hardcoded to the cluster SA token. `STSIntegration` already accepts a
   `fetchActorToken` hook, so the source can be pluggable: the default projected
   SA token, an audience-scoped projected token, or a SPIFFE JWT-SVID fetched
   from the Workload API. (Only JWT-SVIDs apply; X.509-SVIDs are mTLS, not a
   token-exchange input.) The config surface below should be able to name the
   source, not assume `/var/run/secrets/.../token`.

   **Scope note:** SPIFFE JWT-SVID support (`source: spiffe`) adds a Workload API
   socket mount and a new rotation lifecycle, meaningfully larger than the SA token
   variants. Proposal: land `serviceAccountToken` and `projectedToken` in the
   initial implementation and defer `spiffe` to a follow-up so the first PR stays
   self-contained.

7. **Config surface.** Prefer a schema-validated `Agent.spec.auth { mode, machineIdentity{
   credential, source, audience } }` rendered by the controller into agent-runtime
   env, over raw env vars as the public API (see appendix). Token scoping
   (`resource`/`audience`) stays the env vars from #2106/#2107, independent of
   mode. Precedence: with `spec.auth` set the managed token wins over a static
   `headersFrom` Authorization; without it, static wins (unchanged). No opt-out
   flag: pinning a static token means not setting `spec.auth`.

8. **STS authorization prerequisite.** When `mode` is `auto` or `m2m` and the
   machine credential is `exchange`, the STS must be configured to accept the
   agent's ServiceAccount token as the `subject` of an exchange (in OBO it is
   only the `actor`) before M2M exchanges succeed. This is an operational
   prerequisite that must be addressed in tandem with any code change.

9. **Lifecycle, caching, and concurrency.** SA tokens rotate in place and
   exchanged tokens expire, so the machine credential must re-read the SA file and
   refresh before expiry; concurrent first-use should not stampede the STS
   (singleflight / per-key lock). For OBO, the cache key must include the user's
   identity (`sub`, and `iss` if tokens from multiple issuers are possible), not
   just the mode, so that two users on the same agent never share a cached
   exchanged token.
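
The caching and concurrency requirements in point 9 could be sketched as follows. This is a minimal stdlib-only illustration, not the plugin's implementation: `tokenCache`, `cacheKey`, `fetchFunc`, and `countingFetch` are hypothetical names, and a real version might use `golang.org/x/sync/singleflight` instead of the hand-rolled in-flight map. The `fetch` callback is where the SA file would be re-read and the STS called.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cacheKey identifies one exchanged token. For OBO it carries the user's
// issuer and subject so two users never share an entry.
type cacheKey struct {
	Mode    string // "obo" or "m2m"
	Issuer  string // user token `iss`; empty for M2M
	Subject string // user token `sub`; empty for M2M
}

// fetchFunc does the real work (for example re-reading the SA token and
// calling the STS) and reports when the returned token expires.
type fetchFunc func() (token string, expires time.Time, err error)

type entry struct {
	token   string
	expires time.Time
}

// call tracks one in-flight fetch so concurrent callers can wait on it.
type call struct {
	done  chan struct{}
	token string
	err   error
}

type tokenCache struct {
	mu       sync.Mutex
	skew     time.Duration // refresh this long before expiry
	entries  map[cacheKey]entry
	inflight map[cacheKey]*call
}

func newTokenCache(skew time.Duration) *tokenCache {
	return &tokenCache{skew: skew, entries: map[cacheKey]entry{}, inflight: map[cacheKey]*call{}}
}

// get returns a cached token if it is still fresh; otherwise exactly one
// caller per key runs fetch while the others wait for its result.
func (c *tokenCache) get(key cacheKey, fetch fetchFunc) (string, error) {
	c.mu.Lock()
	if e, ok := c.entries[key]; ok && time.Now().Add(c.skew).Before(e.expires) {
		c.mu.Unlock()
		return e.token, nil
	}
	if cl, ok := c.inflight[key]; ok {
		c.mu.Unlock()
		<-cl.done // another goroutine is already fetching this key
		return cl.token, cl.err
	}
	cl := &call{done: make(chan struct{})}
	c.inflight[key] = cl
	c.mu.Unlock()

	token, expires, err := fetch()

	c.mu.Lock()
	delete(c.inflight, key)
	if err == nil {
		c.entries[key] = entry{token: token, expires: expires}
	}
	c.mu.Unlock()

	cl.token, cl.err = token, err
	close(cl.done) // publish the result to waiters
	return token, err
}

// countingFetch stands in for an STS exchange: it returns a fixed token valid
// for one hour and counts how often it runs.
func countingFetch(token string, calls *int) fetchFunc {
	return func() (string, time.Time, error) {
		*calls++
		return token, time.Now().Add(time.Hour), nil
	}
}

func main() {
	cache := newTokenCache(30 * time.Second)
	calls := 0
	alice := cacheKey{Mode: "obo", Issuer: "https://idp.example", Subject: "alice"}
	bob := cacheKey{Mode: "obo", Issuer: "https://idp.example", Subject: "bob"}

	a1, _ := cache.get(alice, countingFetch("token-for-alice", &calls))
	a2, _ := cache.get(alice, countingFetch("unused", &calls)) // served from cache
	b1, _ := cache.get(bob, countingFetch("token-for-bob", &calls))
	fmt.Println(a1, a2, b1, "fetches:", calls)
}
```

Errors are deliberately not cached, so a failed exchange is retried on the next request rather than pinned until expiry.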

## Proposed rollout

Land RFC 8707 `resource`/`audience` first (#2106, #2107, already open). Then
implement the agreed mechanism in Go, with Python parity to follow. Each change
keeps `main` working and secure by default. SPIFFE source support follows in a
separate PR once the initial implementation is stable.

## Questions for maintainers

- Is `obo`-as-default acceptable, or do you prefer `auto`?
- Is `spec.auth` the right place for this config, or do you want it elsewhere?
- Confirm: defer `source: spiffe` to a follow-up PR?

## Appendix: example `spec.auth` shape (for discussion)

A concrete starting point, not a final API. The controller renders these fields
into the agent-runtime env the plugin already reads, so the runtime change is
small and the CRD is the user-facing surface.

```go
// AuthConfig configures how the agent authenticates to MCP backends.
type AuthConfig struct {
    // Mode selects how each request maps to the credential sent to MCP backends.
    //   obo  (default): use the user token; error if absent (fail closed).
    //   auto:           OBO when a user token is present, M2M otherwise.
    //   m2m:            always machine identity; ignore any user token.
    // +kubebuilder:validation:Enum=obo;auto;m2m
    // +kubebuilder:default=obo
    // +optional
    Mode string `json:"mode,omitempty"`

    // MachineIdentity configures the workload credential used when no user token is
    // present and for the connection/discovery handshake (all modes).
    // +optional
    MachineIdentity *MachineIdentityConfig `json:"machineIdentity,omitempty"`
}

// MachineIdentityConfig configures how the agent presents its own workload identity.
type MachineIdentityConfig struct {
    // Credential selects whether the workload identity is exchanged at the STS
    // (default when an STS is configured) or sent raw to the backend.
    // +kubebuilder:validation:Enum=exchange;passthrough
    // +optional
    Credential string `json:"credential,omitempty"`

    // Source selects the workload identity token.
    //   serviceAccountToken (default): the projected cluster SA token.
    //   projectedToken:                an audience-scoped projected token (see Audience).
    // +kubebuilder:validation:Enum=serviceAccountToken;projectedToken
    // +kubebuilder:default=serviceAccountToken
    // +optional
    Source string `json:"source,omitempty"`

    // Audience for the projected token when Source=projectedToken.
    // +optional
    Audience string `json:"audience,omitempty"`
}
```
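
To make the `Mode` semantics documented on `AuthConfig` concrete, here is a minimal sketch of per-request credential selection, including the connection/discovery carve-out. The function `selectCredential`, the `credentialKind` type, and the `isHandshake` flag are hypothetical; how the runtime would actually detect a handshake call is not specified in this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// credentialKind names which credential a request should carry.
type credentialKind string

const (
	kindOBO     credentialKind = "obo"     // subject = user, actor = SA
	kindMachine credentialKind = "machine" // the agent's own identity
)

var errNoUserToken = errors.New("mode obo: tool call carries no user token")

// selectCredential is a hypothetical decision function for the three modes.
// isHandshake marks `initialize`/`tools/list`, which always use machine
// identity regardless of mode (the carve-out).
func selectCredential(mode string, hasUserToken, isHandshake bool) (credentialKind, error) {
	if isHandshake {
		return kindMachine, nil
	}
	switch mode {
	case "", "obo": // empty means the default
		if !hasUserToken {
			return "", errNoUserToken // fail closed, never fall back to machine
		}
		return kindOBO, nil
	case "auto":
		if hasUserToken {
			return kindOBO, nil
		}
		return kindMachine, nil
	case "m2m":
		return kindMachine, nil // any user token is ignored
	default:
		return "", fmt.Errorf("unknown auth mode %q", mode)
	}
}

func main() {
	for _, mode := range []string{"obo", "auto", "m2m"} {
		kind, err := selectCredential(mode, false, false)
		fmt.Printf("mode=%s, no user token: kind=%q err=%v\n", mode, kind, err)
	}
}
```

Note that a request carrying a user token resolves to machine identity only under `m2m`, where that is the explicit configuration; exchange failures would be handled after this step and should likewise not trigger a fallback.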

Example Agent that serves interactive and cron traffic against one backend,
delegating as the user when a token is present and exchanging its own SA token
otherwise:

```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: sre-agent
spec:
  auth:
    mode: auto
    machineIdentity:
      credential: exchange
      source: serviceAccountToken
  # ... model, tools, etc.
```

Controller rendering (illustrative) into the runtime Deployment env, which is
what the plugin reads:

```
KAGENT_TOKEN_MODE=auto
KAGENT_MACHINE_CREDENTIAL=exchange
KAGENT_MACHINE_SOURCE=serviceAccountToken
```
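
A minimal sketch of that rendering step is shown below, assuming the illustrative env names above. The types are trimmed local copies of the appendix structs, `envVar` stands in for the Kubernetes `corev1.EnvVar` type to keep the example dependency-free, and `renderAuthEnv` is a hypothetical name. The projected-token audience is left out because this issue does not name an env var for it.

```go
package main

import "fmt"

// Trimmed local copies of the appendix types, for illustration only.
type MachineIdentityConfig struct {
	Credential string
	Source     string
	Audience   string
}

type AuthConfig struct {
	Mode            string
	MachineIdentity *MachineIdentityConfig
}

// envVar stands in for corev1.EnvVar so the sketch has no dependencies.
type envVar struct {
	Name  string
	Value string
}

// renderAuthEnv is a hypothetical controller helper that maps spec.auth to
// the runtime env, applying the same defaults as the kubebuilder markers.
// A nil config renders nothing, which keeps today's behavior.
func renderAuthEnv(auth *AuthConfig) []envVar {
	if auth == nil {
		return nil
	}
	mode := auth.Mode
	if mode == "" {
		mode = "obo"
	}
	env := []envVar{{Name: "KAGENT_TOKEN_MODE", Value: mode}}

	mi := auth.MachineIdentity
	if mi == nil {
		return env
	}
	if mi.Credential != "" {
		env = append(env, envVar{Name: "KAGENT_MACHINE_CREDENTIAL", Value: mi.Credential})
	}
	source := mi.Source
	if source == "" {
		source = "serviceAccountToken"
	}
	env = append(env, envVar{Name: "KAGENT_MACHINE_SOURCE", Value: source})
	return env
}

func main() {
	auth := &AuthConfig{
		Mode:            "auto",
		MachineIdentity: &MachineIdentityConfig{Credential: "exchange", Source: "serviceAccountToken"},
	}
	for _, e := range renderAuthEnv(auth) {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```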

Token scoping (`KAGENT_TOKEN_RESOURCE` / `KAGENT_TOKEN_AUDIENCE`, RFC 8707/8693)
stays in the env vars introduced in #2106/#2107, independent of mode; it is not part
of `spec.auth` here.

Outbound `Authorization` precedence: with no `spec.auth`, a static `headersFrom`
Authorization wins (today's behavior, unchanged). With `spec.auth` set, the
managed token wins. A user who wants a pinned static token simply does not set
`spec.auth`; no opt-out flag is needed.
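
The precedence rule is small enough to state as a function. The sketch below uses a hypothetical `outboundAuthorization` helper; the case where `spec.auth` is set but no managed token is available is not specified above, so sending no header (rather than falling back to the static one) is an assumption made here.

```go
package main

import "fmt"

// outboundAuthorization returns the Authorization header value to send to an
// MCP backend, or "" to send none.
func outboundAuthorization(specAuthSet bool, managedToken, staticHeader string) string {
	if !specAuthSet {
		return staticHeader // today's behavior, unchanged
	}
	if managedToken == "" {
		// ASSUMPTION: with spec.auth set, never fall back to the static header.
		return ""
	}
	return "Bearer " + managedToken // managed token wins over headersFrom
}

func main() {
	fmt.Println(outboundAuthorization(false, "", "Bearer static-secret"))
	fmt.Println(outboundAuthorization(true, "managed-jwt", "Bearer static-secret"))
}
```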

