
Flexible retry backoff policies #182


Open · wants to merge 2 commits into master from backoff

Conversation

MattWhelan

Extends the existing retry limit mechanism to allow for delays between retries. Includes implementations for fixed delays and exponential backoff.
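
For illustration, here is a minimal sketch of what such policies can look like, assuming a policy maps the attempt number to a delay plus a keep-retrying decision. The names `RetryPolicy`, `FixedDelay`, and `ExponentialBackoff` are illustrative, not necessarily the PR's exact API:

```go
package retrypolicy

import (
	"math/rand"
	"time"
)

// RetryPolicy decides, for a given attempt number, whether to retry and how
// long to wait first. (Illustrative shape; the PR's actual type may differ.)
type RetryPolicy func(attempt int) (delay time.Duration, retry bool)

// FixedDelay retries up to maxRetries times, waiting d between attempts.
func FixedDelay(maxRetries int, d time.Duration) RetryPolicy {
	return func(attempt int) (time.Duration, bool) {
		return d, attempt < maxRetries
	}
}

// ExponentialBackoff doubles base on each attempt, caps the result at max
// (assumed > 0), and applies full jitter so that transactions aborted by the
// same conflict don't all retry in lockstep.
func ExponentialBackoff(maxRetries int, base, max time.Duration) RetryPolicy {
	return func(attempt int) (time.Duration, bool) {
		if attempt >= maxRetries {
			return 0, false
		}
		d := base << uint(attempt)
		if d <= 0 || d > max { // d <= 0 guards against shift overflow
			d = max
		}
		return time.Duration(rand.Int63n(int64(d)) + 1), true
	}
}
```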

@cockroach-teamcity
Member

This change is Reviewable

@MattWhelan force-pushed the backoff branch 2 times, most recently from 3034fdf to d7039d1 on June 10, 2025 at 15:32
@MattWhelan requested a review from sean- on June 10, 2025 at 16:01
@dikshant

We haven't implemented this because cockroach-go uses savepoint-based retries, and with savepoint-based retries a transaction keeps holding its locks between attempts. With exponential backoff you'd therefore be holding locks for longer, putting the app at even greater risk of contention.

If you could show us that this is not the case with backoffs, we will consider adding this! Thanks!
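
To make the lock-holding point concrete, here is a rough sketch of what a savepoint-based retry loop does, written against cockroach-go's `crdb.Tx` interface (sketch only; the library's real loop also checks that the error is a retryable SQLSTATE 40001 before going around again):

```go
package main

import (
	"context"

	"github.com/cockroachdb/cockroach-go/v2/crdb"
)

// savepointRetrySketch shows why savepoint-based retries hold locks: the
// enclosing transaction stays open across every attempt, so any locks it
// has acquired are held for the whole loop, including any sleep inserted
// between attempts.
func savepointRetrySketch(ctx context.Context, tx crdb.Tx, fn func(crdb.Tx) error) error {
	if err := tx.Exec(ctx, "SAVEPOINT cockroach_restart"); err != nil {
		return err
	}
	for {
		if err := fn(tx); err != nil {
			// Rewind to the savepoint; the transaction itself, and its
			// locks, remain open while we retry.
			if rbErr := tx.Exec(ctx, "ROLLBACK TO SAVEPOINT cockroach_restart"); rbErr != nil {
				return rbErr
			}
			continue
		}
		return tx.Exec(ctx, "RELEASE SAVEPOINT cockroach_restart")
	}
}
```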

@MattWhelan
Author

> We haven't implemented this because cockroach-go uses savepoint-based retries, and with savepoint-based retries a transaction keeps holding its locks between attempts. With exponential backoff you'd therefore be holding locks for longer, putting the app at even greater risk of contention.
>
> If you could show us that this is not the case with backoffs, we will consider adding this! Thanks!

I just pushed 9dda97b. Could you take a look at that? I'm not sure of all the implications of separating the database transaction lifecycle from the Tx object, but aside from that I think it's sound, and it avoids the problem of holding locks during backoff.
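
The shape of the idea is roughly this (a sketch, not the actual code in 9dda97b; `delay` and `isRetryable` here are stand-ins for the PR's policy plumbing):

```go
package main

import (
	"context"
	"time"

	"github.com/cockroachdb/cockroach-go/v2/crdb"
)

// backoffRetrySketch rolls the transaction back *before* the backoff sleep,
// so no locks are held while waiting, then begins a fresh database
// transaction on the same Tx object for the next attempt.
func backoffRetrySketch(
	ctx context.Context,
	tx crdb.Tx,
	delay func(attempt int) time.Duration, // stand-in for the PR's RetryPolicy
	isRetryable func(error) bool, // stand-in; crdb retries on SQLSTATE 40001
	fn func(crdb.Tx) error,
) error {
	for attempt := 0; ; attempt++ {
		err := fn(tx)
		if err == nil {
			return tx.Commit(ctx)
		}
		if !isRetryable(err) {
			return err
		}
		// Release the transaction's locks before sleeping.
		if rbErr := tx.Exec(ctx, "ROLLBACK"); rbErr != nil {
			return rbErr
		}
		select {
		case <-time.After(delay(attempt)):
		case <-ctx.Done():
			return ctx.Err()
		}
		// Start a new database transaction on the same Tx object.
		if beginErr := tx.Exec(ctx, "BEGIN"); beginErr != nil {
			return beginErr
		}
	}
}
```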

@MattWhelan requested a review from rafiss on June 23, 2025 at 20:05
}

// Vargo converts a go-retry style Delay provider into a RetryPolicy
func Vargo(fn func() VargoBackoff) RetryPolicy {
Contributor

I'm wondering if we could think of a different name to go with here. Users of cockroach-go who don't already know about the sethvargo/go-retry library may be confused to see this.

Author

I'm open to suggestions. I was trying to build an explicit integration without adding a transitive dependency.
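
For context, the no-dependency integration works because Go interfaces are satisfied structurally; a sketch of the idea (this `VargoBackoff` definition is a guess at what the PR declares):

```go
package retrypolicy

import "time"

// VargoBackoff mirrors the Next method of sethvargo/go-retry's Backoff
// interface. Because Go interfaces are satisfied structurally, any go-retry
// backoff implements this locally declared interface, so cockroach-go never
// has to import go-retry itself.
type VargoBackoff interface {
	Next() (next time.Duration, stop bool)
}
```

A caller that already imports go-retry can hand one of its backoffs to Vargo through the factory, while everyone else pays no dependency cost.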

if restartErr := tx.Exec(ctx, "ROLLBACK"); restartErr != nil {
return newTxnRestartError(restartErr, err)
}
if restartErr := tx.Exec(ctx, "BEGIN"); restartErr != nil {
Contributor

This seems like it could have issues: some libraries may not like it if we use the transaction after rolling it back. For example, pgx returns an error if you try to use a transaction that was already rolled back: https://github.com/jackc/pgx/blob/ecc9203ef42fbba50507e773901b5aead75288ef/tx.go#L205-L224

Author

It returns an error only when you use a transaction object that has the closed flag set, and that flag is set by pgx's own Commit/Rollback methods. Since we issue ROLLBACK through Exec, we're not setting it, so it ought to be OK, I would expect.
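
Concretely, the distinction being made (a sketch against pgx v4, the version in the linked code; not part of this PR):

```go
package main

import (
	"context"

	"github.com/jackc/pgx/v4"
)

// closedFlagSketch: pgx refuses to use a Tx whose internal closed flag is
// set, and that flag is set by pgx's own Commit/Rollback methods, not by
// SQL issued through Exec.
func closedFlagSketch(ctx context.Context, conn *pgx.Conn) {
	tx, _ := conn.Begin(ctx)
	_ = tx.Rollback(ctx)               // sets the closed flag
	_, err := tx.Exec(ctx, "SELECT 1") // fails with pgx.ErrTxClosed
	_ = err

	tx2, _ := conn.Begin(ctx)
	_, _ = tx2.Exec(ctx, "ROLLBACK") // plain statement: flag untouched
	_, _ = tx2.Exec(ctx, "BEGIN")    // tx2 stays usable for the next attempt
}
```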
