"Failure updating submission data" and Queue Congestion #1657

@archettialberto

Dear Codabench Team,

We are running the competition https://www.codabench.org/competitions/4430. Submissions are getting stuck: whenever a worker tries to update a submission's status, the Codabench API responds with an Internal Server Error (500). All workers on our servers return errors like the ones in the log below. So far we have tried:

  • Restarting the workers
  • Changing the queue

How should we handle this issue? This looks very similar to #1471 and #1446.
Thanks for the support!

an2dl-worker0  | [2024-11-11 14:28:52,863: INFO/ForkPoolWorker-1] Updating submission @ https://www.codabench.org/api/submissions/138752/ with data = {'status': 'Preparing', 'status_details': None, 'secret': '...'}
an2dl-worker0  | [2024-11-11 14:28:53,150: INFO/ForkPoolWorker-1] Submission patch failed with status = 500, and response =
an2dl-worker0  | b'<h1>Server Error (500)</h1>'
an2dl-worker0  | [2024-11-11 14:28:53,150: INFO/ForkPoolWorker-1] Updating submission @ https://www.codabench.org/api/submissions/138752/ with data = {'status': 'Failed', 'status_details': 'Failure updating submission data.', 'secret': '...'}
an2dl-worker0  | [2024-11-11 14:28:53,216: INFO/ForkPoolWorker-1] Submission patch failed with status = 500, and response =
an2dl-worker0  | b'<h1>Server Error (500)</h1>'
an2dl-worker0  | [2024-11-11 14:28:53,217: INFO/ForkPoolWorker-1] Destroying submission temp dir: /codabench/tmpih0msgw6
an2dl-worker0  | [2024-11-11 14:28:53,220: ERROR/ForkPoolWorker-1] Task compute_worker_run[3a3bfe66-e7db-484d-8ee3-d5bde8162a4c] raised unexpected: SubmissionException('Failure updating submission data.')
an2dl-worker0  | Traceback (most recent call last):
an2dl-worker0  |   File "/compute_worker.py", line 112, in run_wrapper
an2dl-worker0  |     run.prepare()
an2dl-worker0  |   File "/compute_worker.py", line 765, in prepare
an2dl-worker0  |     self._update_status(STATUS_PREPARING)
an2dl-worker0  |   File "/compute_worker.py", line 356, in _update_status
an2dl-worker0  |     self._update_submission(data)
an2dl-worker0  |   File "/compute_worker.py", line 339, in _update_submission
an2dl-worker0  |     raise SubmissionException("Failure updating submission data.")
an2dl-worker0  | compute_worker.SubmissionException: Failure updating submission data.
an2dl-worker0  |
an2dl-worker0  | During handling of the above exception, another exception occurred:
an2dl-worker0  |
an2dl-worker0  | Traceback (most recent call last):
an2dl-worker0  |   File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 385, in trace_task
an2dl-worker0  |     R = retval = fun(*args, **kwargs)
an2dl-worker0  |   File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 650, in __protected_call__
an2dl-worker0  |     return self.run(*args, **kwargs)
an2dl-worker0  |   File "/compute_worker.py", line 120, in run_wrapper
an2dl-worker0  |     run._update_status(STATUS_FAILED, str(e))
an2dl-worker0  |   File "/compute_worker.py", line 356, in _update_status
an2dl-worker0  |     self._update_submission(data)
an2dl-worker0  |   File "/compute_worker.py", line 339, in _update_submission
an2dl-worker0  |     raise SubmissionException("Failure updating submission data.")
an2dl-worker0  | compute_worker.SubmissionException: Failure updating submission data.
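
In the log above, the status update fails on the first 500 and the follow-up "Failed" update is sent within the same second, so as far as the log shows there is no delay between attempts. A minimal sketch of a retry with exponential backoff around the update call follows. It is not Codabench's implementation: `update_with_retry`, `send_patch`, `attempts`, `base_delay` and `sleep` are hypothetical names chosen for illustration, and only the `SubmissionException` name and its message are taken from the traceback. It would help only if the 500 is transient; it does not fix a server-side fault.

```python
import time
from typing import Callable, Optional


class SubmissionException(Exception):
    """Stand-in for the exception name that appears in the traceback."""


def update_with_retry(
    send_patch: Callable[[], int],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send_patch() until it returns a non-5xx status or attempts run out.

    send_patch is a hypothetical callable that performs the PATCH request
    and returns the HTTP status code. Statuses below 500 are returned to
    the caller unchanged, so 4xx handling stays with the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    status: Optional[int] = None
    for attempt in range(attempts):
        status = send_patch()
        if status < 500:
            return status
        # Back off before the next try: base_delay, 2*base_delay, 4*base_delay, ...
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    raise SubmissionException(
        f"Failure updating submission data (last status {status})."
    )


if __name__ == "__main__":
    # Simulated server: two 500 responses, then success.
    responses = iter([500, 500, 200])
    delays = []
    status = update_with_retry(lambda: next(responses), sleep=delays.append)
    print(status, delays)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in a worker it would default to `time.sleep`.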
