Upper Bound on attempted_by / Dangers of Snoozing
#972
jackHedaya started this conversation in General
Hey @jackHedaya, thanks for reporting this, and sorry you ran into this issue. I think your issue highlights that it is likely prudent for us to put limits on all jsonb arrays to prevent unbounded growth. I think we should be able to cleanly support ~indefinite snoozing with a few minor tweaks like this. Thoughts @brandur?
Hi all!
I want to start by acknowledging that this issue resulted from our misuse of RiverQueue, not a bug in the library itself. However, I'm sharing this experience in case there's an opportunity to add safeguards against similar misuse patterns or to document them somewhere.
Context
My team uses RiverQueue for jobs that wait for incoming webhooks. When a job runs and the webhook hasn't been received yet (checked via database state), the job snoozes itself to retry later.
The Problem
We set an (admittedly too aggressive) 10-second snooze duration, expecting webhooks to arrive quickly after requests. This worked fine at first, but when staging was misconfigured, jobs snoozed indefinitely in tight loops.
While endless retrying is conceptually problematic on its own, the real impact was much worse: River appends to the attempted_by field on every execution, without bounds, so our job records silently grew to enormous sizes. In a single month we incurred 51 TB of inter-AZ traffic on our AWS staging account before discovering the issue.
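Back-of-the-envelope arithmetic shows how a handful of jobs can produce traffic at this scale. These are illustrative assumptions, not measurements from the incident: one job snoozing every 10 seconds for 30 days, appending one ~20-byte entry to attempted_by per execution, with the whole array rewritten each time.

```go
package main

import "fmt"

// Illustrative assumptions (not measured values from the incident).
const (
	snoozeSecs = 10 // snooze interval
	days       = 30 // duration of the tight loop
	entryBytes = 20 // assumed size of one attempted_by entry
)

// attempts returns how many executions the loop performs.
func attempts() int64 {
	return int64(days * 24 * 3600 / snoozeSecs)
}

// finalArrayBytes is the size attempted_by reaches on the last write.
func finalArrayBytes(n int64) int64 { return n * entryBytes }

// cumulativeBytes sums the array size over every rewrite: execution i
// writes i entries, so total traffic grows quadratically, n*(n+1)/2
// entries in all.
func cumulativeBytes(n int64) int64 { return n * (n + 1) / 2 * entryBytes }

func main() {
	n := attempts()
	fmt.Printf("executions: %d\n", n)                                                        // 259200
	fmt.Printf("final attempted_by size: ~%.1f MB\n", float64(finalArrayBytes(n))/1e6)       // ~5.2 MB
	fmt.Printf("cumulative bytes written: ~%.0f GB per job\n", float64(cumulativeBytes(n))/1e9) // ~672 GB
}
```

Under these assumptions a single runaway job accounts for roughly 672 GB of rewrites over the month, because the traffic is quadratic in the number of executions, not linear. A few dozen such jobs would plausibly reach the 51 TB we observed.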
Ideas for Safeguards
I'm also wondering: are there other implementation details in River that make endless snoozing dangerous?
cc @magaldima @themaxgoldman for vis