[Questions] Message rates drop to 0/s across all queues randomly #13479
Unanswered
VinayakSomvanshi asked this question in Questions
Replies: 0 comments
Community Support Policy
RabbitMQ version used
other (please specify)
Erlang version used
26.2.x
Operating system (distribution) used
OpenShift
How is RabbitMQ deployed?
Community Docker image
rabbitmq-diagnostics status output
See https://www.rabbitmq.com/docs/cli to learn how to use rabbitmq-diagnostics
Logs from node 1 (with sensitive values edited out)
See https://www.rabbitmq.com/docs/logging to learn how to collect logs
Logs from node 2 (if applicable, with sensitive values edited out)
See https://www.rabbitmq.com/docs/logging to learn how to collect logs
Logs from node 3 (if applicable, with sensitive values edited out)
See https://www.rabbitmq.com/docs/logging to learn how to collect logs
rabbitmq.conf
See https://www.rabbitmq.com/docs/configure#config-location to learn how to find rabbitmq.conf file location
Steps to deploy RabbitMQ cluster
I have a RabbitMQ cluster running on OpenShift in a data center, with approximately 3,000 queues. The cluster was deployed with the RabbitMQ Cluster Operator using a near-default configuration and has been allocated sufficient resources.
Messages are being published to the queues, but they are not consumed at a consistent rate. At times there are around 500K messages in the queue, while the consumption rate fluctuates just under 100 messages/s. In addition, at random intervals and with no apparent pattern, message throughput drops to 0/s for incoming messages, deliveries, and acknowledgments alike.
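To pin down exactly when the stalls occur, so they can be lined up against storage or node metrics, the rates could be sampled at a fixed interval and the windows where all three are zero extracted afterwards. The following is only a minimal sketch under that assumption: the `Sample` structure and the `find_stalls` helper are hypothetical names, and the samples are assumed to have been collected already (for example from the management UI or its HTTP API).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    """One reading of cluster-wide rates (hypothetical structure)."""
    timestamp: float     # seconds since epoch
    publish_rate: float  # incoming messages/s
    deliver_rate: float  # deliveries/s
    ack_rate: float      # acknowledgments/s


def find_stalls(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Return (first, last) sample timestamps of each run where every rate is zero."""
    stalls: List[Tuple[float, float]] = []
    start: Optional[float] = None
    last: Optional[float] = None
    for s in sorted(samples, key=lambda x: x.timestamp):
        stalled = s.publish_rate == 0 and s.deliver_rate == 0 and s.ack_rate == 0
        if stalled:
            if start is None:
                start = s.timestamp  # a new stall window begins here
            last = s.timestamp
        elif start is not None:
            stalls.append((start, last))  # traffic resumed, close the window
            start = None
    if start is not None:
        stalls.append((start, last))  # data ended while still stalled
    return stalls
```

Each returned pair marks the first and last sample of a zero-throughput window, which can then be compared with disk latency, CPU, or network graphs for the same period.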
The cluster has persistence enabled, which creates an LDEV on the SAN storage pool for each of the three node volumes. Disk IOPS on each volume reaches around 4,000, with fluctuations. The limit for each LDEV is 33K IOPS, assuming a 50:50 read/write ratio.
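For reference, taking the figures above at face value, each volume is using only a small fraction of its IOPS limit. A quick sketch of that arithmetic (the function name is hypothetical, and the inputs are just the approximate numbers quoted above):

```python
def iops_utilization(observed_iops: float, limit_iops: float) -> float:
    """Fraction of the per-volume IOPS limit currently in use."""
    if limit_iops <= 0:
        raise ValueError("limit_iops must be positive")
    return observed_iops / limit_iops


observed = 4_000   # approximate observed IOPS per volume
limit = 33_000     # stated per-LDEV limit at a 50:50 read/write ratio
util = iops_utilization(observed, limit)
print(f"utilization: {util:.1%}, headroom: {limit - observed} IOPS")
# prints "utilization: 12.1%, headroom: 29000 IOPS"
```

This only shows that the average IOPS is well under the limit; it says nothing about I/O latency, which would need to be measured separately.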
Despite these issues, the RabbitMQ cluster pods do not report any warnings or errors related to this behavior. I would appreciate any insights on potential causes and troubleshooting steps.
Environment Details:
RabbitMQ Version: 3.13.7
Erlang Version: 26.2.5.9
Nodes in Cluster: 3
Steps to reproduce the behavior in question
advanced.config
See https://www.rabbitmq.com/docs/configure#config-location to learn how to find advanced.config file location
Application code
# PASTE CODE HERE, BETWEEN BACKTICKS
Kubernetes deployment file
What problem are you trying to solve?