Introduce `Client.JobUpdate` function that can store output incrementally #1098

brandur · 2025-12-02T12:53:47Z

Here, take a stab at a solution for #1064. We introduce a new client
function `Client.JobUpdate` that takes any output currently recorded in
context and stores it to the given job row. `JobUpdate` currently only
sets output, but the idea is that it could be expanded in the future in
case it's useful to do so.

The reason that we don't have a `river.PersistOutput` in line with
`river.RecordOutput` is that once we're talking about storing data,
we're dealing with the usual persistence semantics. We need a client
instance with access to an executor, and we probably want to have a
`*Tx` variant like we have for all other functions like this one.

Fixes #1064.
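For readers skimming the thread, here is a minimal sketch of the difference between buffering output in context and persisting it partway through work. It does not use River itself: `outputKey`, `outputBuffer`, `newWorkContext`, `recordOutput`, `jobStore`, and `jobUpdate` are all hypothetical stand-ins written for illustration, and the in-memory map only approximates a job row.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
)

// outputKey is a hypothetical context key for the buffered output.
type outputKey struct{}

// outputBuffer holds output recorded during work until something persists it.
type outputBuffer struct{ raw []byte }

// newWorkContext returns a context carrying an empty output buffer, standing
// in for the context a work function receives.
func newWorkContext() context.Context {
	return context.WithValue(context.Background(), outputKey{}, &outputBuffer{})
}

// recordOutput marshals the value and buffers it in the context. Nothing is
// written to storage here; this models the lazy half of the flow.
func recordOutput(ctx context.Context, output any) error {
	buf, ok := ctx.Value(outputKey{}).(*outputBuffer)
	if !ok {
		return errors.New("recordOutput called outside a work context")
	}
	raw, err := json.Marshal(output)
	if err != nil {
		return err
	}
	buf.raw = raw
	return nil
}

// jobStore is an in-memory stand-in for the job table: job ID -> stored output.
type jobStore map[int64][]byte

// jobUpdate persists whatever output is currently buffered in the context to
// the given job, without waiting for the work function to return.
func (s jobStore) jobUpdate(ctx context.Context, id int64) error {
	if _, ok := s[id]; !ok {
		return fmt.Errorf("job %d not found", id)
	}
	if buf, ok := ctx.Value(outputKey{}).(*outputBuffer); ok && buf.raw != nil {
		s[id] = buf.raw
	}
	return nil
}

func main() {
	store := jobStore{1: nil}
	ctx := newWorkContext()

	if err := recordOutput(ctx, map[string]int{"rows_done": 100}); err != nil {
		panic(err)
	}
	fmt.Printf("after record: %q\n", store[1]) // still empty: only buffered

	if err := store.jobUpdate(ctx, 1); err != nil {
		panic(err)
	}
	fmt.Printf("after update: %s\n", store[1]) // {"rows_done":100}
}
```

The point of the sketch is only the ordering: recording changes nothing in storage, and the explicit update call is what makes the output visible before the job finishes.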

brandur · 2025-12-02T12:54:43Z

@bgentry I think I broke some tests here, but wanted to get your thoughts on the rough shape of this approach. I didn't add tests yet.

riverdriver/riversqlite/internal/dbsqlc/river_job.sql

bgentry · 2025-12-03T15:24:05Z

CHANGELOG.md


+### Added
+
+- Added `Client.JobUpdate` which can be used to persist job output partway through a running work function instead of having to wait until the job is completed. [PR #1093](https://github.com/riverqueue/river/pull/1093).


Suggested change

- Added `Client.JobUpdate` which can be used to persist job output partway through a running work function instead of having to wait until the job is completed. [PR #1093](https://github.com/riverqueue/river/pull/1093).

- Added `Client.JobUpdate` which can be used to persist job output partway through a running work function instead of having to wait until the job is completed. [PR #1098](https://github.com/riverqueue/river/pull/1098).

Fixed, thx.

bgentry · 2025-12-03T15:36:51Z

riverdriver/riverpgxv5/internal/dbsqlc/river_job.sql

+-- name: JobUpdateFast :one
+WITH locked_job AS (
+    SELECT id
+    FROM /* TEMPLATE: schema */river_job
+    WHERE river_job.id = @id
+    FOR UPDATE
+)
+UPDATE /* TEMPLATE: schema */river_job
+SET
+    metadata = CASE WHEN @metadata_do_merge::boolean THEN metadata || @metadata::jsonb ELSE metadata END
+FROM
+    locked_job
+WHERE river_job.id = locked_job.id
+RETURNING river_job.*;


The name JobUpdateFast doesn't seem right: it's not really faster than an alternative option, and in its current form it merely updates metadata. Maybe we will expand its purpose, I'm not sure, but in the meantime we could probably still pick a better suffix.

One idea is to rename the current JobUpdate to JobUpdateFull or something of that sort and let this one take the JobUpdate name.

K, I changed this to JobUpdate + JobUpdateFull.
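As a side note on the query discussed in this thread, the `CASE WHEN @metadata_do_merge ... metadata || @metadata::jsonb` expression can be approximated in Go as below. This is a sketch for illustration, not code from the PR: `mergeMetadata` is a hypothetical helper, and it models only the object-on-object case of the jsonb `||` operator, where top-level keys from the right-hand side replace those on the left and nested objects are not merged recursively.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeMetadata mirrors the CASE expression in the query: when doMerge is
// false the existing metadata is returned untouched; otherwise top-level keys
// from updates overwrite those in existing (a shallow merge).
func mergeMetadata(existing, updates []byte, doMerge bool) ([]byte, error) {
	if !doMerge {
		return existing, nil
	}

	base := map[string]json.RawMessage{}
	if err := json.Unmarshal(existing, &base); err != nil {
		return nil, fmt.Errorf("decoding existing metadata: %w", err)
	}
	if base == nil { // existing was JSON null
		base = map[string]json.RawMessage{}
	}

	patch := map[string]json.RawMessage{}
	if err := json.Unmarshal(updates, &patch); err != nil {
		return nil, fmt.Errorf("decoding metadata updates: %w", err)
	}

	for key, value := range patch {
		base[key] = value
	}
	return json.Marshal(base)
}

func main() {
	merged, err := mergeMetadata(
		[]byte(`{"attempt":2,"output":"old"}`),
		[]byte(`{"output":"new"}`),
		true,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(merged)) // {"attempt":2,"output":"new"}
}
```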

bgentry · 2025-12-03T15:54:18Z

client.go

+// JobUpdate updates the job with the given ID.
+//
+// Currently, no fields are explicitly updatable via this method, but if called
+// inside a work function, it will set output recorded with RecordOutput, if
+// any.
+func (c *Client[TTx]) JobUpdate(ctx context.Context, id int64, params *JobUpdateParams) (*rivertype.JobRow, error) {
+	return c.jobUpdate(ctx, c.driver.GetExecutor(), id, params)
+}


I'm fully on board with providing an API to make it easy to write metadata updates, particularly for output. At the same time I'm wondering if we want to couple this functionality for "store the recorded output immediately instead of waiting for job completion" to something that's named as a general-purpose JobUpdate. People will probably want to use this to make arbitrary updates to jobs, including those that aren't even the job currently being executed. That'd lead to this method having some surprising and potentially undesired behavior just depending on whether it's called for the current running job or another one.

It also feels a little weird/confusing/unnatural to have to first call the river.RecordOutput() function (which buffers up or lazily records the output) followed by a subsequent call to some client method to immediately save the already-buffered output.

Despite all this, I understand why you went down this route. A special-purpose API like Client.JobRecordOutput / Client.JobRecordOutputNow feels quite specific. And if you exposed metadata as a field on a generalized JobUpdateParams it'd be unclear whether it fully replaces or merges with metadata. Additionally, the nicety of river.RecordOutput is that it both marshals for you and enforces length restrictions that would have to be reproduced elsewhere.

I don't really have a strong preference on an approach here, but I do think it's important to raise these points and hear your perspective before we commit to one. I think I have a slight preference for a single client method that'd allow directly passing output or other metadata updates to be merged in (while applying the same rules & restrictions we do in RecordOutput / with generalized metadata updates from middleware), primarily for two reasons:

- Avoid the two-step process of river.RecordOutput + client.JobUpdate without params, in favor of a single explicit call. Let river.RecordOutput be clearly documented to do so lazily, and cross-reference w/ the new method documentation to explain when to use each.

- Avoid behavioral differences depending on which job you're updating (the currently running one or another, say a dependency).

Thoughts?

Yeah, I can think of a good half dozen alternatives to this, but I couldn't think of any that are better than the proposed route, just different. For example:

- Specific Client.JobRecordOutput functions: definitely thought about this, but just feels too specific and unnecessary given it wouldn't have a major advantage over JobUpdate and we might want to expand JobUpdate.

- Client.JobUpdate (or Client.JobRecordOutput) instead of river.RecordOutput + Client.JobUpdate: I see this as a possibility, but it definitely felt to me that it'd be surprising if the normal river.RecordOutput didn't work as expected when doing a mid-work output update.

I've pushed a new change in which I augment the new JobUpdate so it now has two modes of operation:

- If river.RecordOutput has been used, it uses the output out of that to save to the job row.

- A new JobUpdateParams.Output can be used to set output directly, thereby allowing a one-step operation.

In case both context output and JobUpdateParams.Output are set, we error as it's ambiguous which one was intended.

This feels like a pretty good compromise to me: you get shorter usage if you want it, but we don't introduce any API footguns.

I still need to make a few augmentations like checking output length and such, but do you want to take a look and see if you're okay with this direction?
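The marshaling and length restrictions mentioned in this thread could look roughly like the sketch below. It is an assumption-laden illustration rather than River's code: `marshalOutput` is a hypothetical helper, and the limit is passed in as a parameter because the thread does not state the actual value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalOutput encodes output as JSON and rejects it when the encoded form
// exceeds limit bytes. Both the helper and the limit are illustrative.
func marshalOutput(output any, limit int) ([]byte, error) {
	raw, err := json.Marshal(output)
	if err != nil {
		return nil, fmt.Errorf("marshaling output: %w", err)
	}
	if len(raw) > limit {
		return nil, fmt.Errorf("output is %d bytes, over the %d byte limit", len(raw), limit)
	}
	return raw, nil
}

func main() {
	raw, err := marshalOutput(map[string]int{"done": 3}, 64)
	fmt.Println(string(raw), err) // {"done":3} <nil>

	_, err = marshalOutput("0123456789", 4)
	fmt.Println(err != nil) // true: the encoded string is 12 bytes
}
```

Sharing one helper like this between the buffered path and the direct-params path is one way to avoid reproducing the restrictions in two places.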

bgentry

Minor comments only, let's ship this! I think this is a good approach generally, and I like that it gives you flexibility.

We should probably also tweak docs on RecordOutput to make its lazy-write behavior clear and could call out JobUpdate / JobUpdateTx for cases where you want an immediate extra write.
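The JobUpdate / JobUpdateTx pairing follows a common shape: two public entry points that differ only in which executor they hand to a shared private implementation. A minimal sketch of that shape, using entirely hypothetical types (`executor`, `poolExecutor`, `txExecutor`, `client`) rather than River's real driver interfaces:

```go
package main

import "fmt"

// executor is a hypothetical stand-in for whatever runs the update query.
type executor interface {
	exec(query string) string
}

// poolExecutor models running against the client's own connection pool.
type poolExecutor struct{}

func (poolExecutor) exec(query string) string { return "pool: " + query }

// txExecutor models running inside a caller-supplied transaction.
type txExecutor struct{ name string }

func (t txExecutor) exec(query string) string { return "tx " + t.name + ": " + query }

type client struct{ pool poolExecutor }

// jobUpdate holds the shared logic; it only cares that it has an executor.
func (c *client) jobUpdate(exec executor, id int64) string {
	return exec.exec(fmt.Sprintf("update job %d", id))
}

// JobUpdate uses the client's pool.
func (c *client) JobUpdate(id int64) string { return c.jobUpdate(c.pool, id) }

// JobUpdateTx uses the transaction the caller passes in.
func (c *client) JobUpdateTx(tx txExecutor, id int64) string { return c.jobUpdate(tx, id) }

func main() {
	c := &client{}
	fmt.Println(c.JobUpdate(1))                            // pool: update job 1
	fmt.Println(c.JobUpdateTx(txExecutor{name: "tx1"}, 1)) // tx tx1: update job 1
}
```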

bgentry · 2025-12-15T14:59:31Z

client.go

+// .JobUpdateFull updates the job with the given ID.
+//
+// Currently, no fields are explicitly updatable via this method, but if called
+// inside a work function, it will set output recorded with RecordOutput, if
+// any.


Typo at the start and slightly outdated given the new behavior here. Tx variant below also needs updating.

Yep, good call. Fixed.

bgentry · 2025-12-15T14:59:52Z

client.go

+
+	var (
+		metadataDoMerge      bool
+		metadataUpdatesBytes = []byte("{}") // even in the even of no update, still valid jsonb


Suggested change

metadataUpdatesBytes = []byte("{}") // even in the even of no update, still valid jsonb

metadataUpdatesBytes = []byte("{}") // even in the event of no update, still valid jsonb

Thx, fixed.

bgentry · 2025-12-15T15:04:27Z

client.go

+	if outputFromWorkContext != nil && params.Output != nil {
+		return nil, errors.New("should not set job output both from work context (via RecordOutput) and in JobUpdateParams")


Hmm, feels a bit surprising, but then so is the alternative of quietly allowing JobUpdate to take precedence over the previously recorded output. I think I lean slightly in the direction of allowing it with documentation instead of erroring as done here since that would only happen if you're explicitly updating the job with an Output after you've already called RecordOutput, but I could go either way as long as clearly documented.

For reference we already make the last-write-wins behavior clear in RecordOutput so IMO allowing it here too would be consistent with that:

// Only one output can be stored per job. If this function is called more than
// once, the output will be overwritten with the latest value. The output also
// must be recorded _before_ the job finishes executing so that it can be stored
// when the job's row is updated.

Alright, works for me. I tweaked the code so it's no longer an error and params output always takes precedence.
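The precedence rule settled on here can be sketched as follows. This is a hedged illustration, not the merged implementation: `resolveOutput` and `buildMetadataUpdate` are hypothetical helpers, and the `"output"` metadata key is an assumption. The `{}` default mirrors the `metadataUpdatesBytes` initialization quoted earlier in the review.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resolveOutput picks the output to persist. An explicit value from params
// wins over output buffered in the work context; nil means nothing to store.
func resolveOutput(fromContext, fromParams []byte) []byte {
	if fromParams != nil {
		return fromParams
	}
	return fromContext
}

// buildMetadataUpdate returns the merge flag and the JSON to merge into the
// job's metadata. With no output, it returns "{}" so the value is still
// valid JSON even though no merge happens.
func buildMetadataUpdate(fromContext, fromParams []byte) (bool, []byte, error) {
	output := resolveOutput(fromContext, fromParams)
	if output == nil {
		return false, []byte("{}"), nil
	}
	// "output" as the metadata key is an assumption made for this sketch.
	updates, err := json.Marshal(map[string]json.RawMessage{"output": output})
	if err != nil {
		return false, nil, fmt.Errorf("encoding metadata update: %w", err)
	}
	return true, updates, nil
}

func main() {
	doMerge, updates, err := buildMetadataUpdate(
		[]byte(`"from context"`),
		[]byte(`"from params"`),
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(doMerge, string(updates)) // true {"output":"from params"}
}
```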

brandur · 2025-12-16T16:23:32Z

Great, thx!

> We should probably also tweak docs on RecordOutput to make its lazy-write behavior clear and could call out JobUpdate / JobUpdateTx for cases where you want an immediate extra write.

Done.

bgentry

Looks great!

brandur · 2025-12-17T13:13:38Z

Amazing! 🚢

brandur requested a review from bgentry December 2, 2025 12:54

brandur force-pushed the brandur-job-update branch 2 times, most recently from ecb6ed1 to c016eff on December 3, 2025 12:41

bgentry reviewed Dec 3, 2025

brandur force-pushed the brandur-job-update branch 2 times, most recently from 08478f0 to 166b496 on December 9, 2025 15:31

bgentry reviewed Dec 15, 2025

brandur force-pushed the brandur-job-update branch from 166b496 to 8a1d016 on December 16, 2025 15:48

brandur force-pushed the brandur-job-update branch from 8a1d016 to 8365c8a on December 16, 2025 16:22

brandur requested a review from bgentry December 16, 2025 16:35

bgentry approved these changes Dec 17, 2025

brandur merged commit eb0b985 into master Dec 17, 2025
23 of 25 checks passed

brandur deleted the brandur-job-update branch December 17, 2025 13:13

