Description
Hello, we're using sequences to ensure that only one job runs at a time per entity. We have around 20 workers that all use an ID field tagged river:"sequence". Some of them enqueue follow-up jobs for the same entity: think regular maintenance jobs, or jobs that first create things, call an external service, and then start things. This works fine in production so far, but our test suite is becoming slower and slower.
The docs state that
If there are no actively running jobs in a sequence, the first job in that sequence may encounter a higher latency before being moved to available by the sequence maintenance process. This latency does not apply to subsequent jobs in the sequence if they are already enqueued when the previous job completes; such subsequent jobs will be scheduled immediately.
but in our case this latency seems to apply to every job that is inserted from within a running sequenced job for the same entity.
For example (I'm using the same TaskArgs here, but this also happens with different TaskArgs/Workers):
type TaskArgs struct {
	EntityId          int `json:"entityId" river:"sequence"`
	RescheduleCounter int `json:"rescheduleCounter"`
}

func (worker *TaskWorker) Work(ctx context.Context, job *river.Job[TaskArgs]) error {
	// ...
	_, err := client.Insert(context.Background(), TaskArgs{...}, nil)
	// ...
}

When run, the roughly one-second gap between consecutive jobs is visible in the logs. For example, 10 jobs usually take around 10-12 seconds.
$ go test
2025/06/24 10:57:12 INFO Work() started entityId=13 rescheduleCounter=10
2025/06/24 10:57:12 INFO Scheduled next job nextArgs="{EntityId:13 RescheduleCounter:9}"
2025/06/24 10:57:13 INFO Work() started entityId=13 rescheduleCounter=9
2025/06/24 10:57:13 INFO Scheduled next job nextArgs="{EntityId:13 RescheduleCounter:8}"
2025/06/24 10:57:14 INFO Work() started entityId=13 rescheduleCounter=8
<snip>
2025/06/24 10:57:21 INFO We're done! entityId=13
PASS
ok rivertestexecution 10.267s
I've built a small River test execution project to reproduce the issue in isolation. It has a basic worker implementation that re-enqueues itself X times to showcase the issue.
Is there a way to work around this timer / issue?