Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPIKE] DT-909: Can we switch to using workload identity to auth as SAs across TDR GHAs? #1838

Closed
wants to merge 49 commits into from

Conversation

snf2ye
Copy link
Contributor

@snf2ye snf2ye commented Oct 16, 2024

Jira ticket: https://broadworkbench.atlassian.net/browse/DT-909

Addresses

Ideally, we would use workload identity to authenticate as service accounts across our Github actions rather than maintaining and rotating service account keys. This PR explores in what cases this is possible. I found that it is generally possible to move away from maintaining keys for external services, but without a lot of further work, maintaining keys for the TDR SA continues to be the best approach.

We currently have 9 Github action (GHA) workflows that we maintain in TDR. The approach looks different depending on the GHA.

  1. Int and connected test run - We could move away from using the RBS SA, but still need to maintain the TDR SA.

  2. Cherry pick image action - This PR demonstrates how to use workload identity to auth as TDR GCR SA - this will be implemented in a separate PR.

  3. Dev Image Update action - we again just need the TDR SA, so it makes sense to use the same methodology as used in the test run action. I believe it would be easy to switch to workload identity here if desired, but I don't think it makes sense to introduce another auth mechanism if we can't totally switch to workload identity for this service account.

  4. Staging Smoke tests - This action auths both as the TDR SA and a different "Test runner" SA. I worked to update client tests to work with just one SA, but its not set up to work with only one. My recommendation is to retire clienttests and create a new smoke test suite out of our existing integration/connected/unit junit tests. This would then just use the TDR SA. Work to be done in this ticket: https://broadworkbench.atlassian.net/browse/DT-895.

The remaining actions do not reference a service account

Obstacles encountered for TDR SA

I found it hard to pinpoint the root issue for why tests were not passing with workload identity. I have a few theories, but it was hard to test because you can't exactly replicate authenticating as a workload identity for a local run. I would argue that the different authentication methods per running environment would a blocker for moving forward with this work (local impersonation of SA, workload identity in GHA, and key in running service). But, it seems likely that I am missing something here - Could you auth as workload identity in all 3 cases? I have run out of time to continuing exploring this, but this seems like a good starting place: replicate GHA environment locally and pinpoint issue.

One theory - It seems possible that we are using an old Google API for google billing, particularly for testing if a user has access to a google billing project. However, it doesn't appear that google billing is fully supported by workload identity: https://cloud.google.com/iam/docs/federated-identity-supported-services#cloud-billing

Summary of changes

  • Following these instructions
  • Attempting to get all tests running with workload identity, but not successful

Related Changes

Cherry pick action

Successful test run: https://github.com/DataBiosphere/jade-data-repo/actions/runs/11441504960/job/31829699245

@snf2ye snf2ye requested a review from a team as a code owner October 16, 2024 20:22
Copy link
Contributor

@fboulnois fboulnois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor early suggestion:

Comment on lines 9 to 11
permissions:
contents: 'read'
id-token: 'write'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can put these permissions at the job level:

jobs:
  test-runner-staging:
    runs-on: ubuntu-latest
    permissions:
      contents: 'read'
      id-token: 'write'

@snf2ye snf2ye force-pushed the sh/dcj-755-staging-testrunnersa branch from ead90ca to 4e15e49 Compare October 21, 2024 13:58
@snf2ye snf2ye force-pushed the sh/dcj-755-staging-testrunnersa branch from 70e0d9b to 0db617f Compare November 8, 2024 18:20
Copy link

sonarqubecloud bot commented Nov 8, 2024

@snf2ye snf2ye changed the title DCJ-755: Use workload identity to auth as staging test runner SA for staging smoke tests [SPIKE] DCJ-755: Switch to using workload identity to auth as SAs across GHAs Dec 13, 2024
@snf2ye snf2ye changed the title [SPIKE] DCJ-755: Switch to using workload identity to auth as SAs across GHAs [SPIKE] DT-909: Switch to using workload identity to auth as SAs across GHAs Dec 13, 2024
@snf2ye snf2ye force-pushed the sh/dcj-755-staging-testrunnersa branch from e04b7e1 to 965b50e Compare December 17, 2024 15:13
@snf2ye snf2ye force-pushed the sh/dcj-755-staging-testrunnersa branch from 965b50e to 200768f Compare December 17, 2024 15:15
Comment on lines -435 to +440
if (defaultCredentials instanceof ServiceAccountCredentials) {
return ((ServiceAccountCredentials) defaultCredentials).getClientEmail();
if (defaultCredentials instanceof ServiceAccountCredentials serviceAccountCredentials) {
return serviceAccountCredentials.getClientEmail();
}
if (defaultCredentials instanceof ExternalAccountCredentials externalAccountCredentials) {
return externalAccountCredentials.getServiceAccountEmail();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would have to handle three cases:

  1. service account for service account running in environments
  2. ExternalAccountCredentials for workload identity
  3. Impersonated credentials for local run if we auth while impersonating the TDR SA

@snf2ye snf2ye changed the title [SPIKE] DT-909: Switch to using workload identity to auth as SAs across GHAs [SPIKE] DT-909: Can we switch to using workload identity to auth as SAs across TDR GHAs? Dec 17, 2024
@snf2ye snf2ye closed this Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants