DR-3404: Spring Boot 3 upgrade + More #1581

Merged Feb 13, 2024 (31 commits)

@snf2ye snf2ye commented Jan 23, 2024

https://broadworkbench.atlassian.net/browse/DC-816


Remaining TODOs

  • Investigate why the logs on BootRun appear to be really verbose [Answer: more widespread issue w/ opentelemetry in TCL]
  • Fix failing unit test testCopyFileWithGcsFile()
  • Do we want to include open telemetry? I don't think I could separate those dependencies from other terra common lib dependencies [Answer: We can't easily do this because the datarepo namespace is "bio.terra", so we can't easily exclude "bio.terra.common" items from the package scan]
  • Investigate: Does opentelemetry work with these changes?
  • Upgrade junit vintage - This supports strict mocking
  • Do we want tracing turned on in every environment? What sample rate? Next step: Add helm PRs to override locally turning off tracing
  • Add tracing instrumentation to our Sam api
  • Document manual changes to terraform (i.e. adding the Cloud Trace Agent role on dev) and determine if this is relevant to BEEs - [Answer: Actually, this is already terraformed]
  • A week or two after merge: Make PRs against other repos to point at latest datarepo client (instead of datarepo-jakarta-client) [https://broadworkbench.atlassian.net/browse/DC-839]
  • Determine next steps for verbose bean warnings w/ TCL & otel (ticket made in PF)

Upgrades

Spring Boot

  • Spring boot 2.7.18 -> 3.2.2
  • Unpin logback-core & logback-classic - Effective upgrade from 1.2.13 -> 1.4.11
  • Unpin org.codehaus.janino:janino - stays 3.1.11
  • Javax -> jakarta.validation:jakarta.validation-api
  • Add jakarta.annotation:jakarta.annotation-api
  • Unpin org.springframework:spring-jdbc - Effective upgrade from 5.1.9 -> 6.0.14
  • Unpin/switch to jakarta servlet jstl - Effective upgrade from 1.2 -> 3.0.0
  • Unpin org.glassfish.jersey - Effective upgrade from 2.30.1 -> 3.1.3
  • Unpin jersey-hk2 - Same version - 4.10.0
  • Unpin jackson-core & jackson-databind - Effective upgrade from 2.13.2 -> 2.14.3
  • Unpin webjars-locator-core - Effective upgrade 0.46 -> 0.52
  • Upgrade Spring dependency management plugin (this doesn't appear to be a required upgrade, but we were quite far behind. 1.0.11 was no longer listed on maven) - 1.0.11-RELEASE -> 1.1.4
  • Unpin snakeyaml - Effective upgrade from 1.33 -> 2.2
  • Added implementation 'org.apache.httpcomponents.client5:httpclient5' (note as to why below)

Gradle

  • Gradle 7.3 -> 8.5; Upgrade some of the formatting in our build.gradle
  • Upgrade liquibase gradle plugin, otherwise gradle commands such as dropall fail - 2.1.1 -> 2.2.1
  • Upgrade gradle test retry plugin (required after upgrading gradle)
  • Jacoco & format for jacocoTestReport action - 0.8.7 -> 0.8.9

Swagger

  • Upgrade swagger-codegen to v3 - 2.4.27 -> 3.0.52
  • Upgrade swagger-annotations - 2.1.12 -> 2.2.20
  • Upgrade swagger-codegen-cli - 3.0.47 -> 3.0.51
  • Set swagger generation to use jakarta
  • Swagger-ui-dist: I ultimately didn't update the swagger ui distribution. Upgrading to 5.11.0 introduced formatting bugs in the swagger ui. Switching to springdoc fixed this issue, but then we would need to support a slightly different swagger url (swagger-ui/index.html vs. swagger-ui.html). There is a redirect that we can use so that no users should be interrupted, but it would require changes to the proxy, which is out of scope for this ticket.

Spring cloud GCP starter logging

  • org.springframework.cloud:spring-cloud-gcp-starter-logging has been replaced by com.google.cloud:spring-cloud-gcp-starter-logging
  • More details here
  • Setup instructions here

Spring cloud sleuth replaced by micrometer

  • removed spring-cloud-starter-sleuth

Azure upgrades

This would be a very reasonable thing to break into another PR - they're just listed as having CVE vulnerabilities. All of these are minor or patch level upgrades.
Upgraded:

  • com.azure:azure-identity
  • com.azure.resourcemanager:azure-resourcemanager
  • com.azure.resourcemanager:azure-resourcemanager-loganalytics
  • com.azure:azure-storage-common
  • com.azure:azure-storage-file-datalake
  • com.azure:azure-data-tables

Test dependencies

  • Unpin awaitility
  • Unpin junit-vintage
  • Upgraded Kubernetes java client in order to point testrunner tests to jakarta client
  • Major version upgrade to zonky embedded postgres

Terra upgrades

  • Upgrade terra-common-lib - 0.0.89 -> 0.1.10
  • Upgrade sam client - 2.13:0.1-ae94a59 -> 2.13:0.1-a91acd0 - This git hash maps to upgrading from v0.0.119 to v0.0.186 (And it appears I could remove the exclusion of the sam reference from TCL)
  • Unpin stairway - Effective upgrade from 0.0.78 -> 0.0.80
  • TPS (terra-policy-client) - Upgrade from 1.0.4 -> 1.0.11
  • ECM (externalcreds) - Upgrade from 0.72.0 -> 1.3.0
  • RBS (resource-buffer) - Upgrade from 0.4.3 -> 0.198.42

Needed Changes

  • Update from javax to jakarta

  • Added implementation 'org.apache.httpcomponents.client5:httpclient5'

Apache HttpClient in RestTemplate
Support for Apache HttpClient has been removed in Spring Framework 6.0, immediately replaced by org.apache.httpcomponents.client5:httpclient5 (note: this dependency has a different groupId). If you are noticing issues with HTTP client behavior, it could be that RestTemplate is falling back to the JDK client. org.apache.httpcomponents:httpclient can be brought transitively by other dependencies, so your application might rely on this dependency without declaring it.

  • HttpStatus -> HttpStatusCode
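
To check whether RestTemplate could be falling back to the JDK client, one option is a quick classpath probe. The sketch below is only a diagnostic under assumptions: the class name `HttpClientClasspathCheck` is hypothetical, and the two probed names are the `HttpClients` factory classes assumed to ship in httpclient5 and in the older httpclient 4.x artifact respectively. It does not reproduce how Spring itself selects a request factory.

```java
/** Hypothetical diagnostic helper; not part of this PR. */
final class HttpClientClasspathCheck {
  // Assumed marker class for org.apache.httpcomponents.client5:httpclient5
  static final String HTTPCLIENT5_CLASS = "org.apache.hc.client5.http.impl.classic.HttpClients";
  // Assumed marker class for the older org.apache.httpcomponents:httpclient (4.x)
  static final String HTTPCLIENT4_CLASS = "org.apache.http.impl.client.HttpClients";

  /** Returns true if the named class can be located without initializing it. */
  static boolean isPresent(String className) {
    try {
      Class.forName(className, false, HttpClientClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("httpclient5 present: " + isPresent(HTTPCLIENT5_CLASS));
    System.out.println("httpclient 4.x present: " + isPresent(HTTPCLIENT4_CLASS));
  }
}
```

If only the 4.x class is reported as present, that would be consistent with the situation the migration note describes: the old client arrives transitively while the httpclient5 dependency is missing.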

Move back to publishing single, jakarta-based client: datarepo-client

With this change, we will move back to only publishing a single client library, now Jakarta-based: datarepo-client. This change will be marked with a major version change - v2.0.0. This follows in the steps of ECM.

This means that any service that wants the latest datarepo client library will also need to move to Spring Boot 3. Currently, these services include: Terra CLI (no longer under our ownership), datarepo cli (appears to no longer be maintained), java template project (which will want to upgrade to jakarta to model best practice), cbas, and a number of POCs.

To help alleviate confusion, I'll put up PRs against the other repos in DataBiosphere that are referencing the datarepo-jakarta-client. They'll instead point to the datarepo-client:2.0.0 or later versions. [https://broadworkbench.atlassian.net/browse/DC-839]

Open Telemetry (otel)

By upgrading terra-common-lib, we automatically included the tracing work. It would error without adding the opentelemetry imports. So, why not get it working? I followed the instructions in Doug's DSP blog. I think it's helpful that we integrate with the stairway trace hook. Otherwise, I added a sample of @WithSpan tags focusing on the snapshot dao, where we have found some high db load, and on integration with other APIs (TPS), to see what was helpful. I support a more complete look into otel once we have an initial reading on what's helpful with these changes.

Other required changes:

  • Cloud Trace Agent role has been added to our api and k8 SAs in dev and integration. However, it's unclear if we should leave this enabled in our testing environments.
  • We should disable tracing by default and enable it per environment. This will require helm PRs.

Known Issues:

  • There are some really verbose logs on app startup around otel. This appears to be a more widespread issue, as these same logs appear in WSM's logs. It's unclear which team would own looking into a fix for this, so I think I should make a ticket - I guess in the tdr jira project?

Example warning message:

WARN  [main] o.s.c.s.PostProcessorRegistrationDelegate$BeanPostProcessorChecker: Bean 'terraMeterProvider' of type [io.opentelemetry.sdk.metrics.SdkMeterProvider] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying). Is this bean getting eagerly injected into a currently created BeanPostProcessor [otelRestTemplateBeanPostProcessor]? Check the corresponding BeanPostProcessor declaration and its dependencies.

Example trace (with many more details if you click on one of the dots) 🎉
image

Migrate datarepo-clienttests to use jakarta client

The main change required here was to upgrade the kubernetes client.

Changes in Error Messages

As you'll see in the commits, there were some changes in order to still correctly surface error messages. When we were expecting error details to be empty, they actually returned unhelpful ProblemDetail models. This is one thing for us to look out for as we move forward with this change -- are our error messages correct?

Resulting Tickets

@snf2ye snf2ye force-pushed the sh/dr-3404-spring-boot-3-upgrade branch 3 times, most recently from b3e335b to e02378a Compare January 25, 2024 20:08
@snf2ye snf2ye force-pushed the sh/dr-3404-spring-boot-3-upgrade branch 8 times, most recently from dbd1b0a to e7a3040 Compare February 6, 2024 20:22
Comment on lines -181 to -179
// Fix for CVE-2022-25857 which identifies a vulnerability in snake-yaml 1.30
// This override can likely be removed or bumped to 2.0 when we upgrade to spring boot 3
ext['snakeyaml.version'] = '1.33'
Contributor Author

snakeyaml is now at 2.2
image

implementation 'jakarta.validation:jakarta.validation-api'
implementation 'jakarta.annotation:jakarta.annotation-api'

implementation 'org.apache.httpcomponents.client5:httpclient5'
Contributor Author

Why did we need to add this?

Contributor Author

There is a note about this in the description:

Apache HttpClient in RestTemplate
Support for Apache HttpClient has been removed in Spring Framework 6.0 (spring-projects/spring-framework#28925), immediately replaced by org.apache.httpcomponents.client5:httpclient5 (note: this dependency has a different groupId). If you are noticing issues with HTTP client behavior, it could be that RestTemplate is falling back to the JDK client. org.apache.httpcomponents:httpclient can be brought transitively by other dependencies, so your application might rely on this dependency without declaring it.

Comment on lines +44 to 53
var responseBody = body;
// Without specifically grabbing the exception message for validation errors,
// TDR will swallow the error and return a 400 with no error message.
if (responseBody == null || status.isSameCodeAs(HttpStatus.BAD_REQUEST)) {
  responseBody =
      new ErrorModel()
          .message(status + " - see error details")
          .addErrorDetailItem(ex.getMessage());
}

return new ResponseEntity<>(responseBody, headers, status);
Contributor Author

With these changes, a ProblemDetail is returned for the response body, making the responseBody no longer null for bad request exceptions. So, I've added a special case for BAD_REQUESTS to continue the work originally added in this PR: #869
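
The special case described above can be sketched without Spring on the classpath. This is a minimal illustration, not the code from this PR: `ErrorModelSketch` and `BadRequestBodySketch` are hypothetical stand-ins for TDR's `ErrorModel` and the exception handler, the status is modeled as a plain `int` rather than `HttpStatusCode`, and the framework-supplied body is modeled as an arbitrary `Object`.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for TDR's generated ErrorModel; not the real class. */
record ErrorModelSketch(String message, List<String> errorDetail) {}

/** Hypothetical stand-in for the handler logic shown in the diff above. */
final class BadRequestBodySketch {
  static final int BAD_REQUEST = 400;

  /**
   * Chooses the response body. A body supplied by the framework is kept, except
   * when it is missing or the status is 400, where the exception message is
   * surfaced instead of a generic problem description.
   */
  static Object resolveBody(Object frameworkBody, int status, String exceptionMessage) {
    if (frameworkBody == null || status == BAD_REQUEST) {
      // singletonList tolerates a null exception message, unlike List.of
      return new ErrorModelSketch(
          status + " - see error details", Collections.singletonList(exceptionMessage));
    }
    return frameworkBody;
  }

  public static void main(String[] args) {
    // A 400 with a non-null framework body is still replaced.
    System.out.println(resolveBody("generic problem detail", 400, "name must not be blank"));
    // Other statuses keep the body they already have.
    System.out.println(resolveBody("existing body", 404, "not found"));
  }
}
```

The point of the extra `status == BAD_REQUEST` condition is that a null check alone no longer fires once the framework starts filling in a body for bad requests.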

@snf2ye snf2ye force-pushed the sh/dr-3404-spring-boot-3-upgrade branch 5 times, most recently from 1e1bd77 to 9dd9e62 Compare February 9, 2024 14:20
@snf2ye snf2ye marked this pull request as ready for review February 9, 2024 16:43
@snf2ye snf2ye changed the title from "[In progress] DR-3404: Spring Boot 3 upgrade" to "DR-3404: Spring Boot 3 upgrade + More" Feb 9, 2024
@snf2ye snf2ye force-pushed the sh/dr-3404-spring-boot-3-upgrade branch 5 times, most recently from 1666a5b to 604677e Compare February 9, 2024 19:50
Member

@pshapiro4broad pshapiro4broad left a comment

Looks good to me, just a few remaining comments

.github/workflows/dev-image-update.yaml (resolved)
gradle/wrapper/gradle-wrapper.properties (resolved)
@@ -736,15 +735,13 @@ public static ErrorReportException convertSAMExToDataRepoEx(final ApiException s
logger.warn("SAM client exception details: {}", samEx.getResponseBody());

// Sometimes the sam message is buried several levels down inside of the error report object.
// If we find an empty message then we try to deserialize the error report and use that message.
String message = samEx.getMessage();
Member

Right now this is overwritten unless an exception is thrown. I think it would be clearer to only assign it once instead. Something like

    final String message;
    try {
      ...
    } catch (...) {
      message = Objects.requireNonNullElse(samEx.getMessage(), "SAM client exception");
    }
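
One way to get this single-assignment shape is a small helper that returns from either branch, since Java does not let a `final` local that is assigned inside `try` be assigned again in `catch`. The sketch below is illustrative only: `SamMessageSketch`, `extractNestedMessage`, and the use of `IllegalArgumentException` are hypothetical stand-ins for the elided deserialization step, not the actual SAM handling code.

```java
import java.util.Objects;

/** Hypothetical illustration of assigning the message exactly once. */
final class SamMessageSketch {

  /** Stand-in for deserializing the error report; throws when nothing usable is found. */
  static String extractNestedMessage(String responseBody) {
    if (responseBody == null || responseBody.isBlank()) {
      throw new IllegalArgumentException("no error report to parse");
    }
    return responseBody.trim();
  }

  /** Prefers the nested message and falls back to the top-level one, then to a default. */
  static String resolveMessage(String topLevelMessage, String responseBody) {
    try {
      return extractNestedMessage(responseBody);
    } catch (IllegalArgumentException e) {
      return Objects.requireNonNullElse(topLevelMessage, "SAM client exception");
    }
  }
}
```

A call site could then read `final String message = SamMessageSketch.resolveMessage(samEx.getMessage(), samEx.getResponseBody());`, which keeps the variable effectively immutable after that line.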

build.gradle (resolved)
Contributor

@okotsopoulos okotsopoulos left a comment

Fantastic work on this… looking forward to walking through together, but nothing major is jumping out at me.

build.gradle (resolved)
build.gradle (outdated, resolved)
@@ -273,17 +257,7 @@ public static void deleteRandomPod() throws ApiException, IOException {
}

public static void deletePod(String podNameToDelete) throws ApiException, IOException {
// known issue with java api "deleteNamespacedPod()" endpoint
Contributor

🎉

Slight change in status message from sam
…requests

The response body used to be null, but now it is returning a Problem detail that hides the actual exception message
- Required upgrading to k8 20.0.0 to support snakeyaml 2.2
- We could now additionally upgrade the kubernetes client in the main service
tracing/stairway env variables

Add otel instrumentation for SAM

Following the example in BPM https://github.com/DataBiosphere/terra-billing-profile-manager/blob/9074f11e45187f18000b024208d2bb118874e187/service/src/main/java/bio/terra/profile/service/iam/SamService.java

update to otel noop

by default we'll turn off tracing with open telemetry
This appears to now implement strict stubbing
Turns out the python client publish GHA has been failing for 3+ weeks, so this change is not tested. A separate ticket has been added to handle this fix.
https://broadworkbench.atlassian.net/browse/DR-3455
From @okotsopoulos -
I'll flag that this latest version of ECM client includes this change:

DataBiosphere/terra-external-credentials-manager#141

Which means that we can remove this 2 year old stopgap (a unit test in EcmServiceTest would need to be updated):

      String passport = oidcApiService.getOidcApi(userReq).getProviderPassport(RAS_PROVIDER);
      // Passports returned by OidcApi have a bug in their formatting:
      // double quotes must be stripped if passing back to PassportApi for validation,
      // otherwise the passport will not be considered valid JWT.
      // This stopgap can be removed when the client is fixed:
      // https://broadworkbench.atlassian.net/browse/ID-128
      return StringUtils.strip(passport, "\"");
…boot 3

- Fixes swagger-ui wonkiness
- Can remove swagger-ui-dist handlers
- Can still keep the swagger-ui.html path
- more details: https://springdoc.org/
upgrading causes ui bugs; switching to springdoc points to a different url (instead of swagger-ui.html, it's swagger-ui/index.html). There is a redirect, but we'd also have to make proxy changes to allow the new url to pass through.

Revert "Using springdoc seems to be the recommended path forward with spring boot 3"
@snf2ye snf2ye force-pushed the sh/dr-3404-spring-boot-3-upgrade branch from a2ad0d5 to 462ec39 Compare February 13, 2024 04:03

@snf2ye snf2ye merged commit 6b6db0f into develop Feb 13, 2024
12 checks passed
@snf2ye snf2ye deleted the sh/dr-3404-spring-boot-3-upgrade branch February 13, 2024 15:34
snf2ye added a commit to DataBiosphere/terra-data-catalog that referenced this pull request Mar 4, 2024
)

https://broadworkbench.atlassian.net/browse/DC-839
_______
With the DataBiosphere/jade-data-repo#1581 for
data repo, we're switching back to only publishing to one Jakarta backed
client. This PR moves your references to the jakarta client back to the
main data repo client and bumps to the latest version. This is an effort
to alleviate any confusion around upgrading past the last supported
datarepo-jakarta-client version.