SolidQueue::Processes::ProcessExitError #333
Looks like the workers processing these jobs are crashing or being killed somehow. Can you access logs in that instance to see what it might be?
Do you have local logs for what your Solid Queue worker is doing?
@rosa I'm trying that now. Looking at the docs to see how I could mute the solid_queue logs as they're drowning out my local logs 😆
Or if you want to mute the Solid Queue logs completely, you can set: config.solid_queue.logger = ActiveSupport::Logger.new(nil)
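For anyone following along, that setting would typically go in an environment file or an initializer. A minimal sketch (the file location is a conventional choice, not something specified in the thread):

```ruby
# config/environments/development.rb (or an initializer).
# Routes Solid Queue's own logging to a null device so it no longer
# drowns out application logs.
Rails.application.configure do
  config.solid_queue.logger = ActiveSupport::Logger.new(nil)
end
```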
@rosa thanks for those! I managed to use the logs to figure out what was happening. The job was trying to find records to sync, but there weren't any (due to a faulty where clause). However, it's still odd that the job would throw a SolidQueue::Processes::ProcessExitError for that.
Yes, that's what would happen.
@rosa When we're deploying a new version of our app, we kill off the queue. Is there a more graceful way to restart the queue that we should be doing? We have all the processes (supervisor, scheduler, workers, etc.) running on the one machine. Appreciate your thoughts on graceful shutdown of executing jobs, especially as we have some long-running ones.
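For context, Solid Queue's supervisor handles a TERM signal by starting a graceful shutdown, waiting up to a configurable timeout for in-flight jobs before forcefully terminating workers. A sketch of the relevant setting (the 10-minute value is an assumption for long-running jobs, not a recommendation from the thread):

```ruby
# config/environments/production.rb
# How long the supervisor waits for running jobs to finish after
# receiving TERM before force-killing its workers.
Rails.application.configure do
  config.solid_queue.shutdown_timeout = 10.minutes # assumed value
end
```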
Hey @dglancy! Could you send a TERM signal to the supervisor so it can shut down gracefully?
Hi @rosa |
We got a similar error when ActiveRecord ran out of connections in the pool.
We're still investigating the issue, although we have a hunch that SQ might need a few more connections to support its processes/threads. AFAIK the pool should be per-process, but I also need to double-check that. Anyway, my understanding is that if the worker has no access to the database, it will hardly be able to store the exception details, and the supervisor will just see the process being terminated. @rosa (👋❕) what do you think about suggesting checking the worker logs in the error message?
Oops, sorry for the delay here! I missed the last notification back in October 😳 About the connection pool errors: the configuration now has a validation step for the connection pool size and will show an error when you start Solid Queue if the pool is too small, so that should prevent that error, @elia. 20 threads per worker will need more than 20 connections in the pool, because the worker also needs to poll and send the heartbeat, and will use two different connections for that.
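That sizing rule can be summed up as a back-of-the-envelope calculation: each job thread needs a connection, plus separate connections for polling and the heartbeat. A rough sketch (the exact accounting is an assumption; treat the result as a floor, not an exact figure):

```ruby
# Rough pool-size floor for one Solid Queue worker process: one
# connection per job thread, plus one each for polling and heartbeat.
def estimated_pool_size(job_threads)
  polling_connections   = 1
  heartbeat_connections = 1
  job_threads + polling_connections + heartbeat_connections
end

puts estimated_pool_size(20) # => 22
```

So a worker configured with 20 threads should have a `pool` of at least 22 in `database.yml`.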
It won't be able to store the exception details for a job, but this error will happen on the worker itself, outside the job (in your case above it happened on polling), and the error will be reported via `->(exception) { Rails.error.report(exception, handled: false) }`
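That `Rails.error.report` call goes through Rails' error reporter, which fans out to registered subscribers, so worker-level errors can be observed by subscribing. A minimal plain-Ruby sketch of the subscriber interface (the class name and bookkeeping here are illustrative assumptions; in an app you would pass an instance to `Rails.error.subscribe`):

```ruby
# Minimal error-report subscriber sketch. Rails' error reporter calls
# #report on each subscriber with these keyword arguments.
class WorkerErrorSubscriber
  attr_reader :reports

  def initialize
    @reports = []
  end

  def report(error, handled:, severity: :error, context: {}, source: nil)
    # Record just enough to inspect later; a real subscriber might
    # forward to an error-tracking service instead.
    @reports << { error: error.class.name, handled: handled }
  end
end

sub = WorkerErrorSubscriber.new
sub.report(RuntimeError.new("process exited"), handled: false)
puts sub.reports.first[:handled] # => false
```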
Going to close this one as there's nothing else to do on the Solid Queue side! 🙏
I set up my `solid_queue.yml` to run a recurring task every 30s that checks for records to sync. The `SyncRecordsJob` does a `SyncBatchJob.perform_later(record_ids)`, and it's expected to take some time to run as they have to process quite a number of records (approx. 30s to 1 min).

This has been deployed to Render on a $25/mo instance with 1 CPU and 2GB RAM. Initially, on deploy, some of the jobs execute successfully; after some time, `SyncRecordsJob` accumulates in the In Progress list and seems to never process. They don't seem to accumulate anymore either. Locally, these jobs run well and don't seem to have any issues. It seems that the recurring job is getting enqueued but never executing.

Here's what I see from Mission Control in production for the jobs that failed:

Questions
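The recurring task described in the issue might be declared like this, following Solid Queue's recurring-task configuration format (the task name and dispatcher settings here are assumptions for illustration; only `SyncRecordsJob` and the 30-second schedule come from the issue):

```yml
# solid_queue.yml (sketch)
dispatchers:
  - polling_interval: 1
    recurring_tasks:
      sync_records: # assumed task name
        class: SyncRecordsJob
        schedule: every 30 seconds
```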