aip-dev · ahmedtd · Jan 6, 2025 · sai-sunder-s · Jan 22, 2025 · sai-sunder-s
@@ -20,6 +20,11 @@ ADC, developers should be able to run the code in different environments and the
 supporting systems fetch the appropriate credentials based on each environment
 in an effortless manner.
 
+When running on a Google Cloud Platform compute environment such as GCE, GKE, or
+App Engine, in general no configuration input is required for auth libraries as
+described in this AIP.  The auth library will automatically load the correct
+credentials from the environment's metadata server.
+
 Auth libraries following the standards in these AIPs are known as _"Google
 Unified Auth Clients"_, or _GUAC_ for short. The resulting libraries are
 colloquially called _GUACs_.
@@ -42,6 +47,12 @@ auth libraries **must** support this credential type.
 needs to authenticate to access Google APIs. The auth libraries **must** support
 this credential type.
 
+- **Workload Access Token**: A credential retrieved from the metadata server
+running in your GCP compute environment.  This identifies either a service
+account (typical in case of GCE and App Engine) or a federated non-Google
+credential (typical in case of GKE).  The auth libraries **must** support this
+credential type.
+
 - **OAuth Client ID**: A credential that identifies the client application which
 allows human users to sign-in through [3-legged OAuth flow][1], which grants the
 permissions to the application to access Google APIs on behalf of the human
@@ -133,7 +144,7 @@ digraph d_front_back {
     1. If the credential is [an external account][8] JSON, go to step (4)
     1. If the credential is unknown type, return an error saying that _[END]_
   1. Credentials not found _[END]_
-1. **Check workload credentials (on GCE, GKE, GAE and Serverless)**
+1. **Check workload access token (on GCE, GKE, GAE and Serverless)**
   1. If true,
     1. If identity binding is enabled, by meeting the requirements in
        [mTLS Token Binding][9], use the mTLS Token Binding flow to fetch an

@@ -7,33 +7,150 @@ created: 2020-08-13
 
 # Default Credentials For Google Cloud Virtual Environments
 
-If the client runs on Google cloud virtual environments such as [Google Compute Engine (GCE)][0], 
-[Serverless][1], or [Google Kubernetes Engine (GKE)][2], the auth library **may** leverage 
-Google’s default mutual TLS (mTLS) credentials and obtain bound tokens for the instance. 
-The auth library **may** use the default mTLS credentials and bound tokens to access Google APIs. 
-
-mTLS authentication enables authentication of both client and server identities in a TLS handshake. 
-Applications running in Google virtual environments can authenticate to Google APIs using X.509 
-SPIFFE Verifiable Identity Documents (SVIDs). These SVIDs are X.509 certificates that contain SPIFFE 
-IDs specifying the identity of the certificate owner.
-
-Bound tokens are access tokens that are bound to some property of the credentials used to establish 
-the mTLS connection. The advantage of bound tokens is that they can be used over secure channels 
-established via mTLS credentials with the correct binding information, when appropriate access 
-policies have been put in place. Therefore, using bound tokens is more secure than bearer tokens,
-which can be stolen and adversarially replayed.
+If the client runs on Google Cloud compute environments such as [Google Compute
+Engine (GCE)][0], [Serverless][1], or [Google Kubernetes Engine (GKE)][2],
+absent any explicit configuration, the auth library will follow the Application
+Default Credentials flow described in AIP-4110.  It will detect that it is
+running on a platform with an available metadata server API, and configure
+itself to retrieve workload credentials from the metadata server.
+
+Typically, these workload credentials will be Google OAuth access tokens, which
+are opaque tokens (only decodable by Google) that start with the fixed string
+`ya29.`.  Depending on the configuration of the workload and the Google service
+being called, the auth library may use additional features supported on the
+metadata server, such as mTLS-bound access tokens.
 
-This AIP describes the flow of:
+This AIP describes how to:
 
-1. Retrieving a configuration through a metadata server (MDS) endpoint. The configuration specifies 
-   how to access Google’s default mTLS credentials.
-2. Requesting bound tokens.
+1. Retrieve and cache workload access tokens from the metadata server.
+2. Retrieve mTLS-specific configuration from the metadata server.
+3. Request mTLS-bound access tokens from the metadata server.
 
-**Note:** Because this AIP describes guidance and requirements in a language-neutral way, it uses 
-generic terminology which may be imprecise or inappropriate in certain languages or environments.
+**Note:** Because this AIP describes guidance and requirements in a
+language-neutral way, it uses generic terminology which may be imprecise or
+inappropriate in certain languages or environments.
 
 ## Guidance
 
+### Metadata Server API
+
+The metadata server is an API that your workload can reach at the hostname
+`metadata.google.internal`.  This hostname is configured to resolve to the
+address `169.254.169.254` across all GCP compute environments.
+
+The metadata server serves an HTTP API.  The precise set of paths available on
+this API is platform-specific, but the main paths used for authenticating to
+Google APIs are described below.
+
+#### Workload Access Token
+
+The access token endpoint returns an opaque access token that can be used as a
+bearer token to authenticate to Google APIs.
+
+Request: `GET
+http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token`
+
+Response: A JSON object with the following keys:
+* `access_token` (String): The access token.
+* `expires_in` (Number): The number of seconds until the token expires.
+* `token_type` (String): Always the static string `Bearer`.
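A minimal Python sketch of this request and response handling, under stated assumptions: the function names `parse_token_response` and `fetch_access_token` are hypothetical, and the `Metadata-Flavor: Google` request header is not mentioned in the text above but is required by the Compute Engine metadata server. The relative `expires_in` value is converted to an absolute expiry so that later freshness checks do not depend on when the response was parsed.

```python
import json
import time
import urllib.request

# Endpoint quoted from the description above.
TOKEN_URL = ("http://169.254.169.254/computeMetadata/v1/"
             "instance/service-accounts/default/token")


def parse_token_response(body, now):
    """Turn the JSON response body into (access_token, absolute_expiry)."""
    data = json.loads(body)
    if data.get("token_type") != "Bearer":
        raise ValueError("unexpected token_type: %r" % (data.get("token_type"),))
    # expires_in is relative, so anchor it to the time the request was made.
    return data["access_token"], now + float(data["expires_in"])


def fetch_access_token(timeout=5.0):
    """Fetch a workload access token. Only works on a GCP compute platform."""
    # Assumption: the metadata server rejects requests lacking this header.
    request = urllib.request.Request(
        TOKEN_URL, headers={"Metadata-Flavor": "Google"})
    now = time.monotonic()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_token_response(response.read(), now)
```

Keeping the parsing separate from the network call makes the response handling testable off-platform.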
+
+#### Workload Identity Token
+
+The identity token endpoint returns a JWT asserting the workload's identity in a
+way that can be verified by non-Google third parties.  Third parties that you
+present the token to will expect a specific audience set on the token.
+
+Request: `GET
+http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/identity`.
+Accepts the following query parameters:
+* `audience` (Required): The audience for which this token should be issued.
+* `format` (Optional): `full` or `standard`.  Only understood on GCE.
+* `licenses` (Optional): `TRUE` or `FALSE`.  Only understood on GCE.
+
+Response: A JSON Web Token, no additional framing.
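As an illustration of the query parameters listed above, the following Python sketch builds the identity token request URL. The helper name `build_identity_url` and its argument names are hypothetical; only the path and the parameter names and values come from the text. As with the access token endpoint, an actual request would also need the `Metadata-Flavor: Google` header (an assumption based on documented metadata server behavior, not on this text).

```python
from urllib.parse import urlencode

# Endpoint quoted from the description above.
IDENTITY_URL = ("http://169.254.169.254/computeMetadata/v1/"
                "instance/service-accounts/default/identity")


def build_identity_url(audience, token_format=None, licenses=None):
    """Build the identity token URL; audience is the only required parameter."""
    if not audience:
        raise ValueError("audience is required")
    params = [("audience", audience)]
    if token_format is not None:
        if token_format not in ("full", "standard"):
            raise ValueError("format must be 'full' or 'standard'")
        params.append(("format", token_format))
    if licenses is not None:
        # The endpoint expects the literal strings TRUE or FALSE.
        params.append(("licenses", "TRUE" if licenses else "FALSE"))
    # urlencode percent-escapes the audience, which is often itself a URL.
    return IDENTITY_URL + "?" + urlencode(params)
```

Because `format` and `licenses` are only understood on GCE, a caller targeting other platforms would presumably leave them unset.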
+
+Note that the claims in the JWT can vary based on which compute platform you are
+using:
+* On GCE, the returned JWT is the [VM Identity Token](https://cloud.google.com/compute/docs/instances/verifying-instance-identity).
+* On GKE:
+  * When Workload Identity Federation for GKE is disabled, your workload talks
+    to the underlying GCE metadata server, so the behavior is the same as the
+    GCE case.
+  * When Workload Identity Federation for GKE is enabled, and the pod is not
+    configured with service account impersonation (the default), the identity
+    token endpoint always returns an error.
+  * When Workload Identity Federation for GKE is enabled, and the pod is
+    configured with service account impersonation, the identity token endpoint
+    returns a JWT as issued by
+    [generateIdToken](https://cloud.google.com/iam/docs/reference/credentials/rest/v1/projects.serviceAccounts/generateIdToken)
+    for the Google service account being impersonated.
+
+### Retrieve and Cache Workload Access Tokens
+
+When retrieving workload access tokens from the metadata server on GCE, GKE, App
+Engine, or other GCP compute platforms, the following recommendations apply.
+Google-provided auth libraries adhere to these recommendations.  If your
+workload directly communicates with the metadata server in order to retrieve
+workload access tokens, you should adhere to these recommendations.  If you do
+not, your workload may suffer from intermittent and difficult-to-debug
+authentication errors.
+
+**Cache the access token in memory:** Your workload should not make a call to
+the metadata server every time you make a call to a GCP API.  Doing so will
+cause your workload to be rate-limited by the metadata server, or GCP IAM.
+Instead, you should maintain an in-memory cache of the access token, and use the
+cached token across all of your outbound requests.
+
+**Use a robust refresh strategy:**  Each time you attempt to use the cached
+access token, check the remaining lifetime of the token.
+* If it is greater than 225 seconds, the token is fresh.  Proceed to use it.
+* If it is 0 seconds or less, the token is expired.  Refresh the cached
+  token, then proceed with the current thread.
+* If it is between 0 and 225 seconds, the token is stale.  You can handle the
+  current thread in one of two ways:
+  * (Background Refresh) If the token has more than 120 seconds of validity
+    remaining, you can refresh the cached token in the background, allowing the
+    current thread to immediately proceed with the stale token, OR
+  * (Blocking Refresh) Refresh the cached token, blocking the current thread
+    until the refresh is complete.
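The decision rules above can be sketched as a small pure function in Python. The function name `refresh_action` and its return values are hypothetical. The thresholds (225 and 120 seconds) come from the text; how to treat a remaining lifetime of exactly 225 or exactly 0 seconds is an assumption, and this sketch conservatively classifies them as stale and expired respectively.

```python
FRESH_THRESHOLD_SECONDS = 225        # above this, the token is fresh
BACKGROUND_REFRESH_MIN_SECONDS = 120  # above this, a stale token may still be used


def refresh_action(remaining_seconds):
    """Classify a cached token by its remaining lifetime in seconds.

    Returns one of:
      "use"                 token is fresh, use it as-is
      "background_refresh"  token is stale but usable while a refresh runs
      "blocking_refresh"    refresh before the current request proceeds
    """
    if remaining_seconds > FRESH_THRESHOLD_SECONDS:
        return "use"
    if remaining_seconds <= 0:
        # Expired: the current thread must wait for a new token.
        return "blocking_refresh"
    if remaining_seconds > BACKGROUND_REFRESH_MIN_SECONDS:
        # Stale with enough validity left. A blocking refresh is also allowed here.
        return "background_refresh"
    # Stale with two minutes or less remaining: do not proceed on the old token.
    return "blocking_refresh"
```

Running this check on every use of the cached token, rather than relying on a timer alone, is what keeps the strategy safe under clock skew.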
+
+It is not safe to refresh the cached access token in the background on a
+schedule without additionally checking the status of the token before using it
+to make a request. Clock skew between your workload and the metadata server may
+result in your background refresh attempt still receiving a stale access token.
+
+Note that the access token returned by the metadata server may itself be stale.
+Certain implementations of the metadata server use the "Background Refresh"
+strategy described above for managing their own internal caches of tokens.  For
+example, when running on GKE with Workload Identity Federation for GKE enabled,
+gke-metadata-server will not reliably return a refreshed access token until 120
+seconds before the token expires.  As long as your workload follows the robust
+refresh strategy described above, this will not cause problems.
+
+**Limit concurrency in initial fill and refresh of the cached token:** Use a
+single-flight mechanism, or locking, to ensure that your workload has at most
+one call to the metadata server in flight at a time to retrieve the access
+token, no matter how many threads or coroutines might have triggered the
+refresh.  This ensures that your workload won't accidentally get rate-limited
+by the metadata server when your workload is under high load.
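One way to get this behavior is a lock with a second freshness check inside it, sketched below in Python. The class name `TokenCache` and the injected `fetch` callable are hypothetical; `fetch` is assumed to return `(token, absolute_expiry)` measured on the same clock as `clock`. This sketch implements only the blocking refresh variant and does not attempt background refresh.

```python
import threading
import time


class TokenCache:
    """In-memory token cache that allows one refresh call in flight at a time."""

    def __init__(self, fetch, clock=time.monotonic, fresh_seconds=225):
        self._fetch = fetch          # hypothetical: returns (token, expiry)
        self._clock = clock
        self._fresh_seconds = fresh_seconds
        self._lock = threading.Lock()
        self._state = (None, 0.0)    # replaced as one tuple so readers see a consistent pair

    def _fresh_token(self):
        token, expiry = self._state
        if token is not None and expiry - self._clock() > self._fresh_seconds:
            return token
        return None

    def get(self):
        token = self._fresh_token()
        if token is not None:
            return token  # fast path: no lock, no metadata server call
        with self._lock:
            # Re-check: another thread may have refreshed while we waited.
            token = self._fresh_token()
            if token is None:
                self._state = self._fetch()
                # The new token may itself be stale, but it is still valid.
                token = self._state[0]
            return token
```

Threads that arrive during a refresh wait on the lock and then reuse the result, so a burst of concurrent requests produces a single call to the metadata server.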
+
+### mTLS and Bound Tokens
+
+mTLS authentication enables authentication of both client and server identities
+in a TLS handshake. Applications running in Google virtual environments can
+authenticate to Google APIs using X.509 SPIFFE Verifiable Identity Documents
+(SVIDs). These SVIDs are X.509 certificates that contain SPIFFE IDs specifying
+the identity of the certificate owner.
+
+Bound tokens are access tokens that are bound to some property of the
+credentials used to establish the mTLS connection. The advantage of bound tokens
+is that they can be used over secure channels established via mTLS credentials
+with the correct binding information, when appropriate access policies have been
+put in place. Therefore, using bound tokens is more secure than bearer tokens,
+which can be stolen and adversarially replayed.
+
 ### Access Default mTLS Credentials
 
 **Note:** Before trying to use Google’s default mTLS credentials, the client **must** first check if the remote 
@@ -118,6 +235,7 @@ tokens expire.
 - **2020-12-14**: Replace note on scopes with more detailed discussion.
 - **2021-07-13**: Clarify GCE equivalent runtimes
 - **2023-02-16**: Add mTLS configuration endpoint and unify the token binding flow.
+- **2025-01-09**: Describe how to retrieve and cache standard access tokens.
 
 <!-- prettier-ignore-start -->
 [0]: https://cloud.google.com/compute