bug: Tasks stuck in queue and duplicated indefinitely after nightly server restart #1566

Open
@lpkobamn

Provide environment information

System:
  OS: Linux 6.5 Ubuntu 22.04.4 LTS 22.04.4 LTS (Jammy Jellyfish)
  CPU: (4) x64 unknown
  Memory: 14.89 GB / 19.34 GB
  Container: Yes
  Shell: 5.1.16 - /bin/bash
Binaries:
  Node: 22.6.0 - ~/.nvm/versions/node/v22.6.0/bin/node
  npm: 10.8.2 - ~/.nvm/versions/node/v22.6.0/bin/npm
  bun: 1.1.22 - ~/.bun/bin/bun

Describe the bug

I'm running a self-hosted Trigger.dev, following the setup instructions [here](https://trigger.dev/docs/open-source-self-hosting) and deploying using [triggerdotdev/docker](https://github.com/triggerdotdev/docker).

The issue occurs every night after the server restarts:

  1. A random task gets stuck in the queued state and does not execute.
  2. The same task repeatedly appears in the queue, leading to thousands of duplicates over time (see the attached screenshots).
  3. Manually canceling all queued tasks is the only way to get the task to start properly again, but canceling 3,800+ tasks by hand is time-consuming and impractical.

I've already tried the following steps without success:

  • Running ./stop.sh, ./update.sh, and ./start.sh.
  • Ensuring I'm on the latest version of the self-hosted stack.

Steps to Reproduce:

  1. Run the self-hosted Trigger.dev stack.
  2. Restart the server while a task is executing.
  3. Observe that tasks get stuck in the queued state and are duplicated indefinitely.

Expected Behavior:

  • The task should either resume or fail cleanly after restart.
  • Queued tasks should not duplicate endlessly.

Screenshots:

  • Tasks Dashboard: Showing 3800+ queued tasks.
  • Task Runs List: Evidence of duplication and stalled executions.

Additional Information:
Please advise where I should look to troubleshoot this issue further:

  1. Could this be related to database locking or an issue with worker recovery after restart?
  2. Are there configurations or logs I should check to identify the root cause?
  3. Is there a way to bulk cancel thousands of queued tasks efficiently?
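For question 3, what I have in mind is roughly the sketch below: page through the queued runs and cancel them in small concurrent batches. Note that `RunsClient`, `listQueued`, and `cancel` are hypothetical names I made up for illustration, not Trigger.dev APIs; they would need to be wired to whatever list and cancel endpoints the self-hosted instance actually exposes. It also assumes cursor-based pagination, so that canceling runs does not shift later pages.

```typescript
// Hypothetical minimal client interface (NOT a Trigger.dev API).
// Adapt it to the real list/cancel calls available on your instance.
interface RunsClient {
  // Returns one page of queued run IDs plus a cursor for the next page, if any.
  listQueued(cursor?: string): Promise<{ ids: string[]; next: string | undefined }>;
  // Cancels a single run by ID; rejects if the cancel fails.
  cancel(id: string): Promise<void>;
}

// Cancels every queued run, at most `concurrency` requests in flight at a time.
// Failures are collected rather than aborting the whole sweep.
async function cancelAllQueued(
  client: RunsClient,
  concurrency = 10
): Promise<{ cancelled: number; failed: string[] }> {
  let cancelled = 0;
  const failed: string[] = [];
  let cursor: string | undefined = undefined;

  do {
    const page: { ids: string[]; next: string | undefined } =
      await client.listQueued(cursor);

    // Work through the page in fixed-size batches to avoid flooding the server.
    for (let i = 0; i < page.ids.length; i += concurrency) {
      const batch = page.ids.slice(i, i + concurrency);
      await Promise.all(
        batch.map(async (id) => {
          try {
            await client.cancel(id);
            cancelled++;
          } catch {
            failed.push(id);
          }
        })
      );
    }

    cursor = page.next;
  } while (cursor !== undefined);

  return { cancelled, failed };
}
```

The returned `failed` list could be retried or logged, which would at least make clearing 3,800+ runs a one-command job instead of a manual one.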

Any guidance on fixing this or preventing task duplication would be greatly appreciated.


Thank you!
Attachments: (Screenshots included)

  1. Tasks Dashboard view.
  2. Task Runs list view.

Let me know if you need more details!

Reproduction repo

https://github.com/triggerdotdev/docker

To reproduce

  1. Run the self-hosted Trigger.dev stack.
  2. Restart the server while a task is executing.
  3. Observe that tasks get stuck in the queued state and are duplicated indefinitely.

Additional information

Screenshot: 2024-12-16_11-56-04
Screenshot: 2024-12-16_11-58-54
