
Slow processing of large batches of jobs #651

@gaffneyc

Description


We're looking at River as a replacement for an existing Redis-based worker system. We often enqueue a large batch of jobs (using InsertMany) that we want processed quickly. In testing, it looks like River always waits FetchPollInterval between fetches even when there are jobs available in the queue.
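For reference, here's roughly how we enqueue a batch. The job args type and helper below are illustrative, not our real code:

```go
package main

import (
	"context"

	"github.com/jackc/pgx/v5"
	"github.com/riverqueue/river"
)

// Illustrative job args; our real args are different.
type ProcessRecordArgs struct {
	RecordID int64 `json:"record_id"`
}

func (ProcessRecordArgs) Kind() string { return "process_record" }

// enqueueBatch inserts one job per record in a single InsertMany call.
func enqueueBatch(ctx context.Context, client *river.Client[pgx.Tx], recordIDs []int64) error {
	params := make([]river.InsertManyParams, 0, len(recordIDs))
	for _, id := range recordIDs {
		params = append(params, river.InsertManyParams{Args: ProcessRecordArgs{RecordID: id}})
	}
	_, err := client.InsertMany(ctx, params)
	return err
}
```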

The test we've set up: a single worker process with 32 workers pulling from the default queue, connected directly to Postgres 17. We log every job that runs, and each takes ~25ms. When we batch-produce 5k records, completions come through in batches every second. If we increase FetchPollInterval to 5 seconds, the batches arrive every 5 seconds instead. Dropping FetchPollInterval to 100ms makes everything process quickly, but our batches are far enough apart (small ones every minute, larger ones every hour) that such a small interval would put a lot of unnecessary load on the database.
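For context, a minimal sketch of the kind of client configuration we're testing with (assuming the riverpgxv5 driver; the helper name and URL handling are illustrative):

```go
package main

import (
	"context"
	"time"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
	"github.com/riverqueue/river"
	"github.com/riverqueue/river/riverdriver/riverpgxv5"
)

// newClient builds the test client: 32 workers on the default queue.
// Dropping FetchPollInterval to 100ms drains the batch quickly, but it
// also means polling Postgres ten times a second even when idle.
func newClient(ctx context.Context, databaseURL string, workers *river.Workers) (*river.Client[pgx.Tx], error) {
	dbPool, err := pgxpool.New(ctx, databaseURL)
	if err != nil {
		return nil, err
	}
	return river.NewClient(riverpgxv5.New(dbPool), &river.Config{
		FetchPollInterval: 100 * time.Millisecond, // with 5s here, batches arrive every 5s
		Queues: map[string]river.QueueConfig{
			river.QueueDefault: {MaxWorkers: 32},
		},
		Workers: workers,
	})
}
```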

What knobs do we have here other than FetchPollInterval? I'm assuming we should (roughly) match worker counts to available cores.

Would it make sense for River to ignore the interval and do an immediate fetch when the previous fetch returned a full set of records? That is, only apply FetchPollInterval when fewer records were returned than there are available workers.
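In pseudocode, the behavior I'm imagining looks something like this (a sketch of the idea, not River's actual fetcher internals):

```go
package main

import (
	"context"
	"time"
)

// Proposed behavior: after a fetch that filled every available worker
// slot, fetch again immediately; only fall back to sleeping for
// FetchPollInterval when a fetch came back short.
func fetchLoop(ctx context.Context, fetch func(limit int) (n int, err error), limit int, pollInterval time.Duration) {
	for {
		n, err := fetch(limit)
		if err == nil && n >= limit {
			// Full batch: more jobs are probably waiting, skip the poll interval.
			continue
		}
		select {
		case <-ctx.Done():
			return
		case <-time.After(pollInterval):
		}
	}
}
```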
