[Feature Request] Record job results #621

@rhodeon

Currently, there's no way to save or return the result of a job during/after it has been executed.
For example, a job which involves uploading files to an S3 bucket has no way of exposing where the files were stored.
I'm currently using a workaround inspired by this discussion:

-- name: SetJobResult :exec
UPDATE tasks.river_job
-- `ns_result` is namespaced with `ns` to avoid a potential naming conflict with any future fields set internally by River.
SET "metadata" = jsonb_set("metadata", '{ns_result}', to_jsonb(@result::text), true)
WHERE "id" = @job_id;
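To make the effect of that query concrete, here is a minimal Go sketch of the same merge done in memory: it sets a namespaced `ns_result` key on a JSON metadata object and reads it back. The helper names `setResult` and `getResult` are made up for illustration and are not part of River; treating empty or null metadata as an empty object is a convenience of this sketch, not something the SQL above does.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resultKey mirrors the namespaced key used in the SQL workaround.
const resultKey = "ns_result"

// setResult returns a copy of metadata (a JSON object) with resultKey set to
// result, stored as a JSON string, as to_jsonb(@result::text) would store it.
func setResult(metadata []byte, result string) ([]byte, error) {
	fields := map[string]json.RawMessage{}
	if len(metadata) > 0 {
		if err := json.Unmarshal(metadata, &fields); err != nil {
			return nil, fmt.Errorf("metadata is not a JSON object: %w", err)
		}
	}
	if fields == nil { // metadata was the JSON literal null
		fields = map[string]json.RawMessage{}
	}
	encoded, err := json.Marshal(result)
	if err != nil {
		return nil, err
	}
	fields[resultKey] = encoded
	return json.Marshal(fields)
}

// getResult extracts the stored result; ok is false when none was set.
func getResult(metadata []byte) (result string, ok bool, err error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(metadata, &fields); err != nil {
		return "", false, err
	}
	raw, found := fields[resultKey]
	if !found {
		return "", false, nil
	}
	if err := json.Unmarshal(raw, &result); err != nil {
		return "", false, err
	}
	return result, true, nil
}

func main() {
	updated, err := setResult([]byte(`{"source":"cli"}`), "s3://example-bucket/sorted.csv")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(updated))
}
```

Other keys already present in the metadata are preserved, which is the main reason for namespacing the result key rather than overwriting the whole column.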

But it would be even better to have this supported in the River API.

Ideally, this would work by changing the signature of Work() to return a result which can be handled by River internally or with a middleware:

Work(ctx context.Context, job *Job[T]) ([]byte, error)

But this seems to be a non-starter as it would be backward-incompatible.
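As a rough illustration of how a result-returning function could coexist with the existing error-only signature, here is a sketch of an adapter that wraps one into the other and hands the result to a sink. Every name in it (`Job`, `ResultFunc`, `ErrorFunc`, `ResultSink`, `Adapt`, `UploadArgs`, `demo`) is a stand-in defined for this example; none of them are River types, and this is not a claim about how River would implement it.

```go
package main

import (
	"context"
	"fmt"
)

// Job is a stand-in for a generic job type; it is not River's river.Job.
type Job[T any] struct {
	ID   int64
	Args T
}

// ResultFunc is the hypothetical result-returning form of Work.
type ResultFunc[T any] func(ctx context.Context, job *Job[T]) ([]byte, error)

// ErrorFunc has the shape of the existing error-only Work signature.
type ErrorFunc[T any] func(ctx context.Context, job *Job[T]) error

// ResultSink receives a job's result; a real one might write to the database.
type ResultSink func(ctx context.Context, jobID int64, result []byte) error

// Adapt wraps work so callers still see the error-only signature, while the
// returned result is passed to sink.
func Adapt[T any](work ResultFunc[T], sink ResultSink) ErrorFunc[T] {
	return func(ctx context.Context, job *Job[T]) error {
		result, err := work(ctx, job)
		if err != nil {
			return err // failed attempts record nothing in this sketch
		}
		return sink(ctx, job.ID, result)
	}
}

// UploadArgs is a made-up argument type for the demo.
type UploadArgs struct{ Key string }

// demo runs one wrapped job and returns the result captured by the sink.
func demo() (string, error) {
	stored := map[int64][]byte{}
	sink := func(_ context.Context, jobID int64, result []byte) error {
		stored[jobID] = result
		return nil
	}
	work := func(_ context.Context, job *Job[UploadArgs]) ([]byte, error) {
		return []byte("s3://example-bucket/" + job.Args.Key), nil
	}
	wrapped := Adapt[UploadArgs](work, sink)

	job := &Job[UploadArgs]{ID: 7, Args: UploadArgs{Key: "report.csv"}}
	if err := wrapped(context.Background(), job); err != nil {
		return "", err
	}
	return string(stored[job.ID]), nil
}

func main() {
	location, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(location)
}
```

The trade-off is that the result only exists once the function returns, so a wrapper like this cannot record intermediate results during a run.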

Alternatively, a method can be exposed to allow the user to explicitly set the result when they need to:

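func (w *SortWorker) Work(ctx context.Context, job *river.Job[SortArgs]) error {
    // ...
    // Proposed: set the result on the job...
    if err := job.SetResult(result); err != nil {
        return err
    }
    // ...or, alternatively, on the worker:
    //     if err := w.SetJobResult(result); err != nil {
    //         return err
    //     }
    // ...
    return nil
}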

I haven't had a deep dive into the River codebase yet, so I'm not sure if it would be better suited as a method on the worker or on the job.

In any case, I can think of 3 approaches to represent this in the database:

  1. Making use of the metadata column like in the hack above (but more robustly).
  2. Having a separate result column under jobs with the jsonb type.
  3. Having a dedicated job_results table.

My preferred option would be the second: extracting the result would be easier than pulling it out of the metadata, and querying would be simpler with a self-contained jobs table than with N additional queries to fetch the results.

Another point of consideration is storing a separate result for each job attempt.
The 3rd option of having a separate results table would admittedly be more flexible for this.

And on a related note, if SetResult() is exposed to be called in the job, there would have to be some accommodation for storing multiple results in a single attempt as the method could be called numerous times in a single run.
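One possible way to handle both per-attempt results and repeated calls is to key results by job ID and attempt number and append on every call instead of overwriting. The sketch below models that with an in-memory store standing in for a hypothetical job_results table; `ResultStore`, `attemptKey`, and the method names are assumptions made for illustration, not River API.

```go
package main

import (
	"fmt"
	"sync"
)

// attemptKey identifies one execution attempt of a job.
type attemptKey struct {
	JobID   int64
	Attempt int
}

// ResultStore is an in-memory stand-in for a hypothetical job_results table.
type ResultStore struct {
	mu      sync.Mutex
	results map[attemptKey][][]byte
}

// NewResultStore returns an empty store.
func NewResultStore() *ResultStore {
	return &ResultStore{results: map[attemptKey][][]byte{}}
}

// SetResult appends result to the given attempt, so calling it several times
// in one run keeps every value in call order.
func (s *ResultStore) SetResult(jobID int64, attempt int, result []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	key := attemptKey{JobID: jobID, Attempt: attempt}
	// Copy the bytes so later changes by the caller do not alter stored data.
	s.results[key] = append(s.results[key], append([]byte(nil), result...))
}

// Results returns every result recorded for one attempt, oldest first.
func (s *ResultStore) Results(jobID int64, attempt int) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	stored := s.results[attemptKey{JobID: jobID, Attempt: attempt}]
	out := make([]string, 0, len(stored))
	for _, r := range stored {
		out = append(out, string(r))
	}
	return out
}

// Latest returns the most recent result for one attempt, if any was recorded.
func (s *ResultStore) Latest(jobID int64, attempt int) (string, bool) {
	all := s.Results(jobID, attempt)
	if len(all) == 0 {
		return "", false
	}
	return all[len(all)-1], true
}

func main() {
	store := NewResultStore()
	store.SetResult(1, 1, []byte("part-1"))
	store.SetResult(1, 1, []byte("part-2")) // second call in the same attempt
	store.SetResult(1, 2, []byte("final"))  // a retry records its own result

	fmt.Println(store.Results(1, 1))
	fmt.Println(store.Latest(1, 2))
}
```

Appending keeps the full history at the cost of more rows; a last-write-wins policy would fit a single result column more naturally, so the choice here is tied to which storage option is picked.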

I'm up for further discussions about this, and would be happy to work on a PR if this is something the main team is open to.
