Skip to content

Use WaitUntilReady retry strategy for test pool #7643

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 23, 2025

Conversation

torcolvin
Copy link
Collaborator

Followup to #7642. I don't know why this only hits MasterIntegration and not Sync Gateway Integration Tests but I could reproduce locally.

@torcolvin torcolvin requested a review from bbrks July 22, 2025 20:15
@Copilot Copilot AI review requested due to automatic review settings July 22, 2025 20:15
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR updates the cluster agent initialization to use a new CouchbaseClusterWaitUntilReadyOptions struct instead of a simple timeout duration, and adds retry strategy configuration for improved reliability during test execution. This is a follow-up to address integration test issues that were occurring in MasterIntegration tests.

  • Introduces CouchbaseClusterWaitUntilReadyOptions struct to encapsulate timeout and retry strategy configuration
  • Updates NewClusterAgent function signature to accept the new options struct instead of a simple timeout
  • Adds BestEffortRetryStrategy for test pool initialization to improve test reliability

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
base/gocb_utils.go Defines new options struct and updates NewClusterAgent to use retry strategy configuration
base/main_test_cluster.go Updates test cluster creation to use new options struct with BestEffortRetryStrategy
rest/x509_test.go Updates x509 test to use new options struct with retry strategy
rest/server_context.go Updates server context cluster agent initialization to use new options struct

},
base.CouchbaseClusterWaitUntilReadyOptions{
// this timeout is pretty short since we are doing external retries
Timeout: 5 * time.Second,
Copy link
Preview

Copilot AI Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server context initialization doesn't specify a RetryStrategy in CouchbaseClusterWaitUntilReadyOptions, which means it will use the zero value (nil). This is inconsistent with test code that explicitly sets BestEffortRetryStrategy. Consider adding an explicit RetryStrategy for consistency.

Suggested change
Timeout: 5 * time.Second,
Timeout: 5 * time.Second,
RetryStrategy: gocbcore.NewBestEffortRetryStrategy(nil),

Copilot uses AI. Check for mistakes.

@bbrks
Copy link
Member

bbrks commented Jul 23, 2025

I'm not sure I understand what this change actually does.

Everywhere RetryStrategy is set is nil, either explicitly or implicitly?

agent, err := NewClusterAgent(ctx, clusterSpec,
CouchbaseClusterWaitUntilReadyOptions{
Timeout: TestClusterReadyTimeout,
RetryStrategy: gocbcore.NewBestEffortRetryStrategy(nil),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm, I see - nil is a parameter...

@bbrks bbrks merged commit 316a57e into main Jul 23, 2025
27 checks passed
@bbrks bbrks deleted the no-timeout-test-bucket-pool branch July 23, 2025 13:28
@torcolvin
Copy link
Collaborator Author

I'm not sure I understand what this change actually does.

Everywhere RetryStrategy is set is nil, either explicitly or implicitly?

The default retry strategy is a fast failure retry strategy, from https://github.com/couchbase/gocbcore/blob/e5f9929c8e4dcd4a4d6dcfda751f0748db71cced/agent.go#L182

It is potentially not correct to have a fast failure retry strategy for the cluster agent. We do not have one for the bucket connections, but it is out of scope to change this for this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants