Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[controller] Add deferred version swap service to parent controller #1421

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

misyel
Copy link
Contributor

@misyel misyel commented Jan 8, 2025

Summary, imperative, start upper case, don't end with a period

Add in a new background service, DeferredVersionSwapService, to the parent controller to perform version swap for the remaining regions after the specified wait time has passed in target region push

How was this PR tested?

Unit tests for the new class

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • [] Yes. Make sure to explain your proposed changes and call out the behavior change.

List<Store> stores = veniceParentHelixAdmin.getAllStores(cluster);
for (Store store: stores) {
int parentVersion = store.getLargestUsedVersionNumber();
if (store.getVersion(parentVersion).getStatus() == VersionStatus.ONLINE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should only check STARTED in the loop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to account for PUSHED too?

Copy link
Contributor

@majisourav99 majisourav99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an e2e integration test for it.

@misyel misyel force-pushed the add-version-swap-service branch from 33dee5e to e6dbf34 Compare January 15, 2025 20:14
return targetRegions.iterator().next();
}

private StoreResponse getStoreForRegion(String clusterName, Set<String> targetRegions, String storeName) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it return a Map of store responses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only returns one store response because we only need to get the store for one target region to check the dvc heartbeat status

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in that case can you change the function signature to take in only a region string instead of set of strings

for (String targetRegion: targetRegions) {
String executionStatus = pushStatusInfo.getExtraInfo().get(targetRegion);
if (!executionStatus.equals(ExecutionStatus.COMPLETED.toString())) {
didPushCompleteInTargetRegions = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just continue here without needing this variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually if you have multiple target regions then its ok. Are we planning to support that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we should support multiple regions as well. Currently, the user can input a comma separated list of regions to perform the target region push in & if we choose to not support it right now, we should go back and add a check to prevent the user from doing that

}

Map<String, Integer> coloToVersions =
veniceParentHelixAdmin.getCurrentVersionsForMultiColos(cluster, store.getName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should get the lastest version status instead of current version status

Copy link
Contributor Author

@misyel misyel Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code again, it's ok for us to keep this as is. We only use the value from this to infer the remaining regions to perform the version swap in (getRegionsForVersionSwap). We use getOffLinePushStatus to verify whether the push is completed using the ExecutionStatus before checking if the wait time has elapsed. There might be a slight delay with the version being marked ONLINE w/ this method, but it should only be a few secs after the push is completed and the default wait time is one hour

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants