Investigate RabbitMQ restarts #481

Closed
jpmckinney opened this issue Jan 22, 2024 · 3 comments
Labels
services Relating to common services like NTP, Apache, uWSGI, cron, etc.

Comments

@jpmckinney
Member

The info messages in /var/log/rabbitmq/rabbit@ocp##.log don't seem relevant. To see only the other log levels (notice, warning, error), filter the info lines out with grep -v info, or with zgrep -v info for compressed logs.
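
A minimal Python sketch of the same filtering, in case it is easier to reuse than the grep one-liner. It assumes each log line carries its level in square brackets (e.g. `[warning]`); that line shape, and the helper names `non_info_lines` and `read_log`, are assumptions for illustration rather than anything taken from the server's actual logs. Unlike grep -v info, it matches only the bracketed level, so a warning whose message happens to contain the word "info" is kept, and lines without a recognizable level are dropped.

```python
import gzip
import re
from pathlib import Path

# Assumed line shape: "<timestamp> [<level>] <pid> <message>".
LEVEL = re.compile(r"\[(debug|info|notice|warning|error|critical)\]")


def non_info_lines(lines):
    """Yield lines that have a bracketed level other than "info"."""
    for line in lines:
        match = LEVEL.search(line)
        if match and match.group(1) != "info":
            yield line.rstrip("\n")


def read_log(path):
    """Open a plain or gzip-compressed (rotated) log file as text."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    return opener(path, "rt", encoding="utf-8", errors="replace")
```

Usage would be along the lines of `with read_log(path) as f:` followed by `for line in non_info_lines(f): print(line)`, where `path` is whichever log file is being inspected.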

On 2024-01-18 at 10:10:46, the registry server (ocp13) logged "RabbitMQ is asked to stop...". It had stopped by 10:10:51, and it started again at 10:10:54.

Looking in Prometheus, the only signals are that memory usage and swap dropped after the restart (not surprising), but memory usage was not high before the restart (40%, 175MB).

Looking at /var/log/syslog, I see messages relating to apt around the same time, so I assume RabbitMQ was upgraded and therefore restarted.

This generated messages in Kingfisher Collect, because it uses a blocking connection and not an async client (only the latter can handle connection close events). To resolve that, we need to close open-contracting/kingfisher-collect#1033


I'll keep this issue open to investigate any other restarts. #238 explains another restart scenario.

jpmckinney added the services label Jan 22, 2024
@ghost

ghost commented Jan 23, 2024

This was due to a rabbitmq-server patch.

@jpmckinney
Member Author

I closed open-contracting/kingfisher-collect#1033, so I'll close this issue.

If there are any new RabbitMQ-related messages in Sentry, I can reuse this issue in the future.

@jpmckinney
Member Author

jpmckinney commented Jan 25, 2024

RabbitMQ restarts might still cause errors to be reported. If so, I think the solution is here: open-contracting/yapw#2 (comment)

Kingfisher Collect has had issues with restarts because it only publishes messages, and it does so over a long period of time. The others only ack, nack or publish messages after consuming a message. Since RabbitMQ cancels consumers when restarting, there may be only a narrow window in which a consumer can attempt a method on a closing or closed connection.
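
To illustrate why a long-lived publisher is exposed to restarts, here is a rough sketch of a publisher that reconnects when a publish hits a closed connection. It is not the approach taken in the linked issues, and it does not use any real client library: `ConnectionClosed`, `RetryingPublisher` and the `connect` factory are hypothetical stand-ins, and a real client would raise its own exception types.

```python
import time


class ConnectionClosed(Exception):
    """Hypothetical stand-in for a client library's "connection closed" error."""


class RetryingPublisher:
    """Keep one connection open and reconnect if a publish finds it closed.

    `connect` is a hypothetical zero-argument factory that returns an object
    with a `publish(message)` method.
    """

    def __init__(self, connect, attempts=3, delay=1.0, sleep=time.sleep):
        self._connect = connect
        self._attempts = attempts
        self._delay = delay
        self._sleep = sleep
        self._connection = None

    def publish(self, message):
        """Publish `message` and return the number of attempts it took."""
        for attempt in range(1, self._attempts + 1):
            try:
                if self._connection is None:
                    self._connection = self._connect()
                self._connection.publish(message)
                return attempt
            except ConnectionClosed:
                # Drop the dead connection so the next attempt reconnects.
                self._connection = None
                if attempt == self._attempts:
                    raise
                self._sleep(self._delay)
```

With a restart that takes a few seconds, as in the timeline above, a short delay and a handful of attempts would be enough for the broker to come back. If the connection closes after the broker accepted the message but before the client noticed, a retry like this can publish the same message twice.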
