OBO + M2M token propagation: mechanism and configuration
Summary
The STS token-propagation plugin (Go: go/adk/pkg/sts, Python: agentsts-adk)
supports on-behalf-of (OBO) delegation today, but only when a request carries a
user token. There is no machine-to-machine (M2M) path: when no user token is
present, the plugin propagates nothing, so neither an autonomous/cron request
nor the agent's own connection to a protected MCP server can authenticate.
Before implementing, I would like to agree on the config surface and the mode
semantics. RFC 8707 resource/audience scoping is being handled separately
and is already up as a small, self-contained change: #2106 (Go) and #2107
(Python).
What main does today
BeforeRunCallback (Go) / before_run_callback (Python) extracts the user
bearer token. If there is no user token it returns early and propagates
nothing.
When a user token is present and an STS is configured, it runs an RFC 8693
exchange with subject = user token and actor = the workload ServiceAccount
token, and injects the exchanged token into MCP requests.
The SA token is read from /var/run/secrets/kubernetes.io/serviceaccount/token
(ActorTokenService).
Gaps:
No M2M fallback. Tokenless requests get no credential.
The connection/discovery handshake (initialize + tools/list) runs with no
session and no user, so the per-session header provider returns nothing.
Today only a static headersFrom secret can authenticate that connection.
Mode is effectively a process-wide env toggle; there is no per-request
OBO/M2M selection.
Goal
Per request, emit either OBO (subject = user, actor = SA) or a machine
credential, selected by whether a user token is present, with clear failure
semantics, and let the agent authenticate its own connection to a protected MCP
server.
Mechanism
The agent runtime performs the token exchange in-process: it calls the STS
itself (RFC 8693) and injects the resulting token into MCP requests. This is
what main already does for OBO, so M2M is a natural extension of the same
path: exchange the SA token (as subject, no actor) or forward it raw. Token
handling stays self-contained in the runtime and assumes nothing about the
downstream topology.
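As a rough illustration of the two exchange shapes described above, the sketch below builds the RFC 8693 form body for OBO (user token as subject, SA token as actor) and for M2M (SA token as subject, no actor). The parameter names and URN values come from RFC 8693; the helper name exchangeForm and the choice of the jwt token type for both tokens are assumptions, not what the plugin necessarily sends.

```go
package main

import (
	"fmt"
	"net/url"
)

// Values defined by RFC 8693.
const (
	grantTypeTokenExchange = "urn:ietf:params:oauth:grant-type:token-exchange"
	tokenTypeJWT           = "urn:ietf:params:oauth:token-type:jwt"
)

// exchangeForm builds the form body of an RFC 8693 token-exchange request.
// OBO passes the user token as subject and the SA token as actor; M2M passes
// the SA token as subject and an empty actor.
// Assumption: both tokens are JWTs; the real plugin may use other token types.
func exchangeForm(subjectToken, actorToken string) url.Values {
	form := url.Values{}
	form.Set("grant_type", grantTypeTokenExchange)
	form.Set("subject_token", subjectToken)
	form.Set("subject_token_type", tokenTypeJWT)
	if actorToken != "" {
		form.Set("actor_token", actorToken)
		form.Set("actor_token_type", tokenTypeJWT)
	}
	return form
}

func main() {
	obo := exchangeForm("user-token", "sa-token")
	m2m := exchangeForm("sa-token", "")
	fmt.Println("OBO has actor:", obo.Has("actor_token"))
	fmt.Println("M2M has actor:", m2m.Has("actor_token"))
}
```

The only difference between the two paths is which token sits in subject_token and whether an actor is attached, which is why M2M fits the existing code path.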
Secondary design questions (assuming a mechanism is chosen)
Modes and default. Proposal: three modes.
obo (default): use the user token; if there is no user token on a tool
call, error. Fail closed on exchange failure.
auto: OBO when a user token is present, M2M otherwise.
m2m: always machine identity; ignore any user token.
Default obo so an interactive agent that loses its token errors rather than
silently escalating to the agent's machine identity. auto as a default
would mean every agent silently acquires M2M capability without an explicit
opt-in, which is the wrong default in a multi-tenant cluster.
Failure semantics. A request carrying a user token must never silently
downgrade to machine identity. auto's no-token branch is the only place
machine identity is used implicitly, which is why it is opt-in.
Connection/discovery carve-out. initialize/tools/list are tokenless
infrastructure, not user requests. The machine token should always be allowed
for the handshake regardless of mode; otherwise obo agents could never
connect to a protected backend. Consequence: if the backend gates tool
visibility by caller identity, the agent's tool list (enumerated under machine
identity) may not match what an OBO user can actually invoke. Backends that
scope tool visibility by identity are not supported in this model; callers
must expect tool-call authorization errors for tools the user cannot access.
MachineIdentity applies in all modes. Because machine identity is always used
for the initialize/tools/list handshake (see the connection/discovery carve-out
above), MachineIdentity must be specifiable even when mode: obo. Without it, the
controller has nothing to render for the carve-out credential, and an OBO agent
cannot connect to an audience-restricted backend. MachineIdentity should be
valid (and in practice required) regardless of mode.
Machine credential strategy. OBO needs an STS; M2M does not. A backend doing
Kubernetes-native authz (TokenReview/SAR, or an audience-scoped projected SA
token) wants the raw SA token, not an exchanged one. Suggest a narrow
exchange | passthrough choice for the M2M path, independent of whether an
STS is configured for OBO.
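One way the exchange | passthrough choice could be isolated is a single helper that either forwards the SA token or hands it to the STS as the subject. This is a sketch only: machineToken and the Exchanger signature are hypothetical, and falling back to passthrough when no STS is configured is an assumption (the proposal only states that exchange is the default when an STS is configured).

```go
package main

import (
	"errors"
	"fmt"
)

// Exchanger exchanges a subject token at the STS (hypothetical signature).
// A nil Exchanger means no STS is configured.
type Exchanger func(subjectToken string) (string, error)

// machineToken returns the credential sent on the M2M path.
func machineToken(credential, saToken string, exchange Exchanger) (string, error) {
	if credential == "" {
		// Assumption: exchange when an STS is configured, otherwise passthrough.
		if exchange != nil {
			credential = "exchange"
		} else {
			credential = "passthrough"
		}
	}
	switch credential {
	case "passthrough":
		return saToken, nil // raw SA token, e.g. for TokenReview-based backends
	case "exchange":
		if exchange == nil {
			return "", errors.New("credential=exchange requires an STS")
		}
		return exchange(saToken) // SA token as subject, no actor
	default:
		return "", fmt.Errorf("unknown credential %q", credential)
	}
}

func main() {
	fakeSTS := func(subject string) (string, error) { return "exchanged(" + subject + ")", nil }

	raw, _ := machineToken("passthrough", "sa-token", fakeSTS)
	exchanged, _ := machineToken("exchange", "sa-token", fakeSTS)
	fmt.Println(raw)
	fmt.Println(exchanged)
}
```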
Workload identity source. The actor/machine identity should not be
hardcoded to the cluster SA token. STSIntegration already accepts a
fetchActorToken hook, so the source can be pluggable: the default projected
SA token, an audience-scoped projected token, or a SPIFFE JWT-SVID fetched
from the Workload API. (Only JWT-SVIDs apply; X.509-SVIDs are mTLS, not a
token-exchange input.) The config surface below should be able to name the
source, not assume /var/run/secrets/.../token.
Scope note: SPIFFE JWT-SVID support (source: spiffe) adds a Workload API
socket mount and a new rotation lifecycle, meaningfully larger than the SA token
variants. Proposal: land serviceAccountToken and projectedToken in the
initial implementation and defer spiffe to a follow-up so the first PR stays
self-contained.
Config surface. Prefer a schema-validated
Agent.spec.auth { mode, machineIdentity { credential, source, audience } }
rendered by the controller into agent-runtime env, over raw env vars as the
public API (see appendix). Token scoping (resource/audience) stays in the env
vars from #2106/#2107, independent of mode. Precedence: with spec.auth set, the
managed token wins over a static headersFrom Authorization; without it, static
wins (unchanged). No opt-out flag: pinning a static token means not setting
spec.auth.
STS actor authorization prerequisite. When mode is auto or m2m, the
STS must be configured to permit the agent's ServiceAccount, which on the M2M
path is the subject of the exchange with no actor (see Mechanism), before
exchanges succeed. This is an operational prerequisite that must be
addressed in tandem with any code change.
Lifecycle, caching, and concurrency. SA tokens rotate in place and
exchanged tokens expire, so the machine credential must re-read the SA file and
refresh before expiry; concurrent first-use should not stampede the STS
(singleflight / per-key lock). For OBO, the cache key must include the user's
identity (sub, and iss if tokens from multiple issuers are possible) — not
just the mode — to prevent two users on the same agent from sharing a cached
exchanged token.
Proposed rollout
Land RFC 8707 resource/audience first (#2106, #2107, already open). Then
implement the agreed mechanism in Go, with Python parity to follow. Each change
keeps main working and secure by default. SPIFFE source support follows in a
separate PR once the initial implementation is stable.
Questions for maintainers
Is obo-as-default acceptable, or do you prefer auto?
Is spec.auth the right place for this config, or do you want it elsewhere?
Confirm: defer source: spiffe to a follow-up PR?
Appendix: example spec.auth shape (for discussion)
A concrete starting point, not a final API. The controller renders these fields
into the agent-runtime env the plugin already reads, so the runtime change is
small and the CRD is the user-facing surface.
// AuthConfig configures how the agent authenticates to MCP backends.
type AuthConfig struct {
	// Mode selects how each request maps to the credential sent to MCP backends.
	// obo (default): use the user token; error if absent (fail closed).
	// auto: OBO when a user token is present, M2M otherwise.
	// m2m: always machine identity; ignore any user token.
	// +kubebuilder:validation:Enum=obo;auto;m2m
	// +kubebuilder:default=obo
	// +optional
	Mode string `json:"mode,omitempty"`

	// MachineIdentity configures the workload credential used when no user token is
	// present and for the connection/discovery handshake (all modes).
	// +optional
	MachineIdentity *MachineIdentityConfig `json:"machineIdentity,omitempty"`
}

// MachineIdentityConfig configures how the agent presents its own workload identity.
type MachineIdentityConfig struct {
	// Credential selects whether the workload identity is exchanged at the STS
	// (default when an STS is configured) or sent raw to the backend.
	// +kubebuilder:validation:Enum=exchange;passthrough
	// +optional
	Credential string `json:"credential,omitempty"`

	// Source selects the workload identity token.
	// serviceAccountToken (default): the projected cluster SA token.
	// projectedToken: an audience-scoped projected token (see Audience).
	// +kubebuilder:validation:Enum=serviceAccountToken;projectedToken
	// +kubebuilder:default=serviceAccountToken
	// +optional
	Source string `json:"source,omitempty"`

	// Audience for the projected token when Source=projectedToken.
	// +optional
	Audience string `json:"audience,omitempty"`
}
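Example: an Agent that serves interactive and cron traffic against one backend,
delegating as the user when a token is present and exchanging its own SA token
otherwise, would set mode: auto with machineIdentity.credential: exchange. The
controller would then render those fields (illustratively) into the runtime
Deployment env, which is what the plugin reads.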
Token scoping (KAGENT_TOKEN_RESOURCE / KAGENT_TOKEN_AUDIENCE, RFC 8707/8693)
stays in the env vars introduced in #2106/#2107, independent of mode; it is not
part of spec.auth here.
Outbound Authorization precedence: with no spec.auth, a static headersFrom
Authorization wins (today's behavior, unchanged). With spec.auth set, the
managed token wins. A user who wants a pinned static token simply does not set
spec.auth; no opt-out flag is needed.
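The precedence rule is small enough to state as code. A minimal sketch, with outboundAuthorization as a hypothetical helper; returning an empty value when spec.auth is set but no managed token is available (rather than falling back to the static header) is an assumption in the spirit of failing closed.

```go
package main

import "fmt"

// outboundAuthorization picks the Authorization header value for an MCP request.
// specAuthSet reports whether Agent.spec.auth is set, managed is the value
// produced by the plugin, and static is the value from headersFrom.
func outboundAuthorization(specAuthSet bool, managed, static string) string {
	if specAuthSet {
		// Managed token wins. Assumption: no fallback to static when it is empty.
		return managed
	}
	return static // today's behavior, unchanged
}

func main() {
	fmt.Println(outboundAuthorization(false, "", "Bearer static-token"))
	fmt.Println(outboundAuthorization(true, "Bearer managed-token", "Bearer static-token"))
}
```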