Skip to content

DT-1194: Speed up integration tests by running them in parallel #1896

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 54 commits into from
Feb 6, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
54 commits
Select commit Hold shift + click to select a range
c49d0a1
Convert two integration tests to no longer require a full application…
pshapiro4broad Jan 23, 2025
3f9f507
WIP integration test migration
pshapiro4broad Jan 23, 2025
3b3f3f3
spotless
pshapiro4broad Jan 23, 2025
745ea66
more integration WIP
pshapiro4broad Jan 23, 2025
587b22d
spotless
pshapiro4broad Jan 24, 2025
2c11234
fix more issues
pshapiro4broad Jan 24, 2025
3797f55
fix more issues
pshapiro4broad Jan 24, 2025
5db6e56
spotless
pshapiro4broad Jan 24, 2025
8936b03
misc integration test changes
pshapiro4broad Jan 27, 2025
731142a
DT-1184: update db retry catch to use new exception class
pshapiro4broad Jan 27, 2025
4c6196a
Merge branch 'develop' into ps/int-test-improvements
pshapiro4broad Jan 27, 2025
e8c1e89
fix map deserialization issue
pshapiro4broad Jan 27, 2025
57f6ee3
merge fixes
pshapiro4broad Jan 27, 2025
83ba235
merge fixes
pshapiro4broad Jan 27, 2025
b7f4cb2
Merge branch 'ps/dt-1184-sql-retry-fix' into ps/int-test-improvements
pshapiro4broad Jan 27, 2025
d7171be
merge fixes
pshapiro4broad Jan 27, 2025
d897097
Fix cases where TransactionSystemException was thrown but not checked…
pshapiro4broad Jan 27, 2025
6b0bb49
Merge branch 'ps/dt-1184-sql-retry-fix' into ps/int-test-improvements
pshapiro4broad Jan 27, 2025
1cd45e3
add retry to db step in delete snapshot flight
pshapiro4broad Jan 27, 2025
c2a3474
thread safety
pshapiro4broad Jan 27, 2025
274fda2
minor fixes
pshapiro4broad Jan 27, 2025
270d479
log test users
pshapiro4broad Jan 27, 2025
1babe2d
spotless
pshapiro4broad Jan 27, 2025
c3464ff
fix NPE
pshapiro4broad Jan 28, 2025
6afd840
convert to concurrent execution
pshapiro4broad Jan 28, 2025
117b5b5
another retry case found while testing
pshapiro4broad Jan 28, 2025
bb20139
convert to concurrent execution
pshapiro4broad Jan 28, 2025
0a79c38
convert DatasetControlFilesIntegrationTest to concurrent execution
pshapiro4broad Jan 29, 2025
a165a85
convert SelfHostedDatasetIntegrationTest to concurrent execution
pshapiro4broad Jan 29, 2025
b3668b3
disable configuration (fault injection) tests
pshapiro4broad Jan 29, 2025
58252cf
concurrent azure tests (not working yet); longer timout for service run
pshapiro4broad Jan 29, 2025
0c1e42f
Merge branch 'develop' into ps/int-test-improvements
pshapiro4broad Jan 31, 2025
af71817
more work to get azure integration tests working in parallel
pshapiro4broad Jan 31, 2025
d544f3a
more parallel work
pshapiro4broad Feb 1, 2025
8a4edf5
more parallel work
pshapiro4broad Feb 1, 2025
2356f3f
use remove over set(null)
pshapiro4broad Feb 3, 2025
908fbcf
back out changes to dao code
pshapiro4broad Feb 3, 2025
547e49a
revert changes to make azure integration tests concurrent
pshapiro4broad Feb 3, 2025
64ebcde
spotless
pshapiro4broad Feb 3, 2025
d675c1c
back out some changes to reduce diffs
pshapiro4broad Feb 3, 2025
e4cbf25
back out some changes to reduce diffs
pshapiro4broad Feb 3, 2025
31cd7fd
back out some changes to reduce diffs
pshapiro4broad Feb 3, 2025
a77bde9
back out some changes to reduce diffs
pshapiro4broad Feb 3, 2025
3f725c2
back out some changes to reduce diffs
pshapiro4broad Feb 3, 2025
ac802eb
back out some changes to reduce diffs
pshapiro4broad Feb 3, 2025
da24845
back out some changes to reduce diffs
pshapiro4broad Feb 3, 2025
763b0c1
back out some changes to reduce diffs
pshapiro4broad Feb 3, 2025
dadfd2e
back out some changes to reduce diffs
pshapiro4broad Feb 3, 2025
96c3d71
back out some changes to reduce diffs
pshapiro4broad Feb 4, 2025
90111ad
back out some changes to reduce diffs
pshapiro4broad Feb 4, 2025
86b4d4b
use threadlocal for http headers
pshapiro4broad Feb 4, 2025
d2c0c8d
fix null testUsers
pshapiro4broad Feb 4, 2025
ed3f2bf
restrict Users component to "testintegration" profile; back out integ…
pshapiro4broad Feb 4, 2025
cfada0e
add back TestConfiguration dependency to IntegrationTestConfiguration…
pshapiro4broad Feb 4, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -484,6 +484,7 @@ tasks.register('testIntegration', Test) {
}
maxParallelForks = 8
outputs.upToDateWhen { false }
systemProperty 'junit.jupiter.execution.parallel.enabled', 'true'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is required to enable concurrent test execution. If we decide to use concurrent execution for other test types (like connected tests) this configuration can be moved to a shared scope, or moved to a property file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it the case that this just allows for parallel execution and tests still need to explicitly make use of it?

}

tasks.register('testUnit', Test) {
Expand Down
1 change: 1 addition & 0 deletions scripts/compose.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ services:
- GOOGLE_SA_CERT
- RBS_POOLID
- RBS_INSTANCEURL
- TDR_LOG_APPENDER
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enable passthrough to support non-json logging in the TDR container

ports:
- "8080:8080"
volumes:
Expand Down
12 changes: 12 additions & 0 deletions scripts/run
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,9 @@ source "$(dirname "$0")"/init.sh
die() { log_error "$*" >&2; echo ""; usage; exit 2; } # complain to STDERR and exit with error
needs_arg() { if [ -z "$OPTARG" ]; then die "No arg for --$OPT option"; fi; }

# output plain logs instead of json
export TDR_LOG_APPENDER=Console-Standard

while getopts h-: OPT; do
# support long options: https://stackoverflow.com/a/28466267/519360
if [ "$OPT" = "-" ]; then # long option: reformulate OPT and OPTARG
Expand All @@ -51,17 +54,24 @@ RUNTYPE=${1}

run_tests() {
cd "${ROOT_DIR}"
eval "$("${SCRIPTS_DIR}"/render-configs.sh -e)"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed for both unit and connected tests as well, since these depend on GOOGLE_APPLICATION_CREDENTIALS to provide the bean tdrServiceAccountEmail.


# shellcheck disable=SC2086 # The array syntax and quoting seems to break gradle
./gradlew testUnit ${GRADLE_ARGS}

#./gradlew --build-cache srcclr
}

wait_until_server_up() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, integration tests would sometimes fail as the server wasn't up by the time the tests started running.

timeout 60 bash -c "until curl -s http:/localhost:8080/status; do sleep 1; done"
}

run_integration_tests() {
TEST_ENV=${TEST_ENV:-local}

cd "${ROOT_DIR}"
start_docker integration
wait_until_server_up
# shellcheck disable=SC2086 # The array syntax and quoting seems to break gradle
./gradlew testIntegration ${GRADLE_ARGS}

Expand All @@ -75,6 +85,8 @@ run_connected_tests() {
TEST_ENV=${TEST_ENV:-local}

cd "${ROOT_DIR}"
eval "$("${SCRIPTS_DIR}"/render-configs.sh -e)"

# shellcheck disable=SC2086 # The array syntax and quoting seems to break gradle
./gradlew testConnected ${GRADLE_ARGS}

Expand Down
6 changes: 2 additions & 4 deletions scripts/run-db
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
usage() {
cat <<-'EOF'
usage: run-db [--clean-db] [-h|--help] COMMAND
[--clean-db] clean the database
[--clean-db] clean the database (only with `stop` or `start`)
[-h|--help] print this help text
COMMAND is one of:
- start start a local database server
Expand Down Expand Up @@ -34,8 +34,6 @@ while getopts hc-: OPT; do # allow -h and -- "with arg"
# support long options: https://stackoverflow.com/a/28466267/519360
if [ "$OPT" = "-" ]; then # long option: reformulate OPT and OPTARG
OPT="${OPTARG%%=*}" # extract long option name
OPTARG="${OPTARG#"$OPT"}" # extract long option argument (may be empty)
OPTARG="${OPTARG#=}" # if long option argument, remove assigning `=`
fi
case "$OPT" in
c | clean-db ) clean_db=true ;;
Expand All @@ -53,7 +51,7 @@ COMMAND=$1
remove_database() {
if [ "$clean_db" = "true" ]; then
log_info "erasing existing database state"
rm -r "$TMP_DIR"/postgres-data
rm -r "$ROOT_DIR"/build/tmp/postgres-data
Copy link
Member Author

@pshapiro4broad pshapiro4broad Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the actual location of the container's database, see https://github.com/DataBiosphere/jade-data-repo/blob/develop/scripts/compose.yaml#L36

fi
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@
import java.util.stream.Collectors;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.transaction.TransactionSystemException;

public class CreateDatasetGetOrCreateBucketStep implements Step {
private static final Logger logger =
Expand Down Expand Up @@ -79,8 +80,8 @@ public StepResult doStep(FlightContext context) throws InterruptedException {
googleProjectResource.getServiceAccount());

workingMap.put(FileMapKeys.BUCKET_INFO, bucketForFile);
} catch (BucketLockException blEx) {
return new StepResult(StepStatus.STEP_RESULT_FAILURE_RETRY, blEx);
} catch (BucketLockException | TransactionSystemException e) {
return new StepResult(StepStatus.STEP_RESULT_FAILURE_RETRY, e);
} catch (GoogleResourceNamingException ex) {
return new StepResult(StepStatus.STEP_RESULT_FAILURE_FATAL, ex);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,12 @@
import bio.terra.stairway.FlightMap;
import bio.terra.stairway.Step;
import bio.terra.stairway.StepResult;
import bio.terra.stairway.StepStatus;
import java.util.UUID;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.dao.TransientDataAccessException;
import org.springframework.transaction.TransactionSystemException;

public class CreateDatasetGetOrCreateStorageAccountStep implements Step {
private static Logger logger =
Expand All @@ -40,11 +43,16 @@ public StepResult doStep(FlightContext context) throws InterruptedException {
workingMap.get(ProfileMapKeys.PROFILE_MODEL, BillingProfileModel.class);
UUID datasetId = workingMap.get(DatasetWorkingMapKeys.DATASET_ID, UUID.class);

AzureStorageAccountResource storageAccount =
resourceService.getOrCreateDatasetStorageAccount(
DatasetJsonConversion.datasetRequestToDataset(datasetRequestModel, datasetId),
profileModel,
context.getFlightId());
AzureStorageAccountResource storageAccount;
try {
storageAccount =
resourceService.getOrCreateDatasetStorageAccount(
DatasetJsonConversion.datasetRequestToDataset(datasetRequestModel, datasetId),
profileModel,
context.getFlightId());
} catch (TransientDataAccessException | TransactionSystemException e) {
return new StepResult(StepStatus.STEP_RESULT_FAILURE_RETRY, e);
}
Comment on lines +53 to +55
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there were a different exception thrown, could storageAccount still be unitialized?

Copy link
Member Author

@pshapiro4broad pshapiro4broad Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a different exception was thrown, it wouldn't be caught by the catch, so the method would exit with the thrown exception.

Either the getOrCreateDatasetStorageAccount completes successfully and storageAccount has a value, or an exception was thrown and we catch and return early, or allow the exception to be thrown.


logger.info("Enabling Azure storage account logging");
// Log files will reside in the storage account's $logs container
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,8 @@ public DatasetCreateFlight(FlightMap inputParameters, Object applicationContext)
if (platform.isAzure()) {
addStep(
new CreateDatasetGetOrCreateStorageAccountStep(
resourceService, datasetRequest, azureBlobStorePdao));
resourceService, datasetRequest, azureBlobStorePdao),
getDefaultRandomBackoffRetryRule(appConfig.getMaxStairwayThreads()));

// Create the top level container
addStep(
Expand All @@ -130,7 +131,7 @@ public DatasetCreateFlight(FlightMap inputParameters, Object applicationContext)
// Create dataset metadata objects in postgres and lock the dataset
addStep(
new CreateDatasetMetadataStep(datasetDao, datasetRequest),
getDefaultExponentialBackoffRetryRule());
getDefaultRandomBackoffRetryRule(appConfig.getMaxStairwayThreads()));

// For azure backed datasets, add a link co connect the storage account to the dataset
if (platform.isAzure()) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,12 @@
import bio.terra.stairway.FlightMap;
import bio.terra.stairway.Step;
import bio.terra.stairway.StepResult;
import bio.terra.stairway.StepStatus;
import bio.terra.stairway.exception.RetryException;
import java.util.List;
import java.util.UUID;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.dao.TransientDataAccessException;
import org.springframework.transaction.TransactionSystemException;

public class DeleteSnapshotMarkProjectStep implements Step {

Expand All @@ -26,19 +27,21 @@ public DeleteSnapshotMarkProjectStep(
this.snapshotService = snapshotService;
}

private static final Logger logger = LoggerFactory.getLogger(DeleteSnapshotMarkProjectStep.class);

@Override
public StepResult doStep(FlightContext context) throws InterruptedException, RetryException {
FlightMap workingMap = context.getWorkingMap();
UUID projectId = workingMap.get(SnapshotWorkingMapKeys.PROJECT_RESOURCE_ID, UUID.class);

List<UUID> projectsToBeDeleted =
resourceService.markUnusedProjectsForDelete(List.of(projectId));
try {
List<UUID> projectsToBeDeleted =
resourceService.markUnusedProjectsForDelete(List.of(projectId));

workingMap.put(SnapshotWorkingMapKeys.PROJECTS_MARKED_FOR_DELETE, projectsToBeDeleted);
workingMap.put(SnapshotWorkingMapKeys.PROJECTS_MARKED_FOR_DELETE, projectsToBeDeleted);

return StepResult.getStepResultSuccess();
return StepResult.getStepResultSuccess();
} catch (TransientDataAccessException | TransactionSystemException e) {
return new StepResult(StepStatus.STEP_RESULT_FAILURE_RETRY, e);
}
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,14 +5,16 @@
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;

import bio.terra.common.auth.Users;
import bio.terra.common.category.Integration;
import bio.terra.common.configuration.TestConfiguration.User;
import bio.terra.integration.DataRepoFixtures;
import bio.terra.integration.IntegrationTestConfiguration;
import bio.terra.integration.UsersBase;
import bio.terra.model.ConfigGroupModel;
import bio.terra.model.ConfigModel;
import bio.terra.model.ConfigParameterModel;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Disabled;
import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
Expand All @@ -30,14 +32,27 @@
@SpringBootTest(classes = IntegrationTestConfiguration.class)
@ActiveProfiles({"google", "integrationtest"})
@Tag(Integration.TAG)
class RepositoryApiControllerAccessTest extends UsersBase {
@Disabled
class RepositoryApiControllerAccessTest {

@Autowired private DataRepoFixtures dataRepoFixtures;
@Autowired private Users users;

private User admin;
private User reader;

private User admin() {
return admin;
}

private User reader() {
return reader;
}

@BeforeEach
@Override
public void setup() throws Exception {
super.setup();
admin = users.admin();
reader = users.reader();
}

@Test
Expand Down
117 changes: 71 additions & 46 deletions src/test/java/bio/terra/common/auth/Users.java
Original file line number Diff line number Diff line change
Expand Up @@ -2,49 +2,38 @@

import bio.terra.common.configuration.TestConfiguration;
import bio.terra.common.configuration.TestConfiguration.User;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.stream.Collectors;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Component;

@Component
@Profile("integrationtest")
public class Users {
private static final Logger logger = LoggerFactory.getLogger(Users.class);

private Map<String, List<TestConfiguration.User>> usersByRole = new HashMap<>();
private Map<String, TestConfiguration.User> userByName = new HashMap<>();
private static final String ADMIN_ROLE = "admin";
private static final String STEWARD_ROLE = "steward";
private static final String CUSTODIAN_ROLE = "custodian";
private static final String READER_ROLE = "reader";
private static final String DISCOVERER_ROLE = "discoverer";

@Autowired
public Users(TestConfiguration testConfig) {
buildUsersByRole(testConfig.getUsers());
}
private final Map<String, List<User>> usersByRole;

private void buildUsersByRole(List<TestConfiguration.User> users) {
users.forEach(
user -> {
String role = user.getRole();
List<TestConfiguration.User> newList = new ArrayList<>();

if (usersByRole.containsKey(role)) newList = usersByRole.get(role);
newList.add(user);
usersByRole.put(role, newList);
userByName.put(user.getName(), user);
});
}
public record TestUsers(User admin, User steward, User custodian, User reader, User discoverer) {}

public TestConfiguration.User getUserForRole(String role) {
return getUserForRole(role, true);
}

public TestConfiguration.User getUserForRole(String role, boolean shuffle) {
return getUsersForRole(role, 1, shuffle).get(0);
@Autowired
public Users(TestConfiguration testConfig) {
usersByRole = testConfig.getUsers().stream().collect(Collectors.groupingBy(User::getRole));
}

public TestConfiguration.User getUserForRole(String name, String role) {
List<User> usersForRole = getUsersForRole(role, -1, false);
private User getUserForRole(String name, String role) {
var usersForRole = usersByRole.get(role);
return usersForRole.stream()
.filter(u -> u.getName().equals(name))
.findFirst()
Expand All @@ -60,24 +49,60 @@ public TestConfiguration.User getUserForRole(String name, String role) {
.collect(Collectors.joining(", ")))));
}

public TestConfiguration.User getUser(String name) {
return userByName.get(name);
private User getUserForRole(String role) {
var users = usersByRole.get(role);
return users.get(new Random().nextInt(users.size()));
}

public User admin() {
return getUserForRole(ADMIN_ROLE);
}

public User admin(String name) {
return getUserForRole(name, ADMIN_ROLE);
}

public User steward() {
return getUserForRole(STEWARD_ROLE);
}

public User steward(String name) {
return getUserForRole(name, STEWARD_ROLE);
}

public User custodian() {
return getUserForRole(CUSTODIAN_ROLE);
}

public User custodian(String name) {
return getUserForRole(name, CUSTODIAN_ROLE);
}

public User reader() {
return getUserForRole(READER_ROLE);
}

public User reader(String name) {
return getUserForRole(name, READER_ROLE);
}

public User discoverer() {
return getUserForRole(DISCOVERER_ROLE);
}

public User discoverer(String name) {
return getUserForRole(name, DISCOVERER_ROLE);
}

public List<TestConfiguration.User> getUsersForRole(String role, int numUsers, boolean shuffle) {
if (role == null) {
throw new RuntimeException("Role not specified");
}
List<TestConfiguration.User> usersList = usersByRole.get(role);
if (numUsers != -1 && usersList.size() < numUsers) {
throw new RuntimeException("not enough users for " + role);
}
if (shuffle) {
Collections.shuffle(usersList);
}
if (numUsers != -1) {
return usersList.subList(0, numUsers);
}
return usersList;
public TestUsers testUsers() {
TestUsers testUsers = new TestUsers(admin(), steward(), custodian(), reader(), discoverer());
logger.info(
"admin: {}; steward: {}; custodian: {}; reader: {}; discoverer: {}",
testUsers.admin().getName(),
testUsers.steward().getName(),
testUsers.custodian().getName(),
testUsers.reader().getName(),
testUsers.discoverer().getName());
return testUsers;
}
}
Loading
Loading