[Questions] Message rates drop to 0/s across all queues randomly #13479

VinayakSomvanshi · 2025-03-11T13:24:16Z

Community Support Policy

I have read RabbitMQ's Community Support Policy
I run RabbitMQ 4.x, the only series currently covered by community support
I promise to provide all relevant information (versions, logs from all nodes, rabbitmq-diagnostics output, detailed reproduction steps)

RabbitMQ version used

other (please specify)

Erlang version used

26.2.x

Operating system (distribution) used

OpenShift

How is RabbitMQ deployed?

Community Docker image

Steps to deploy RabbitMQ cluster

I have a RabbitMQ cluster running on OpenShift in a data center, with approximately 3,000 queues. The cluster was deployed with the RabbitMQ Cluster Operator using a nearly default configuration and has been allocated sufficient resources.

Messages are being published to the queues, but they are not consumed at a consistent rate. At times there are around 500K messages in the queue, and the consumption rate fluctuates below 100 messages/s. At random intervals, with no apparent pattern, message throughput drops to 0/s across all queues, including incoming messages, deliveries, and acknowledgments.

The cluster has persistence enabled, which creates an LDEV on the SAN storage pool for each of the three node volumes. Disk IOPS for each volume reach around 4,000 but fluctuate. The limit for each LDEV is 33K IOPS, assuming a 50:50 read/write ratio.
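For reference, the headroom implied by these figures can be worked out as below. This is only a back-of-the-envelope sketch using the two numbers quoted above; the function name is made up for illustration, and the result covers average IOPS only, not per-operation latency.

```python
def iops_utilization(observed_iops: float, limit_iops: float) -> float:
    """Return observed IOPS as a fraction of the per-LDEV limit."""
    if limit_iops <= 0:
        raise ValueError("limit_iops must be positive")
    return observed_iops / limit_iops


# Figures from the description: ~4,000 observed IOPS, 33K limit per LDEV.
utilization = iops_utilization(4_000, 33_000)
print(f"Per-volume IOPS utilization: {utilization:.1%}")
```

On these numbers each volume sits well below its IOPS limit, so the limit itself does not look like the constraint, although short latency spikes would not show up in an IOPS average.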

Despite these issues, the RabbitMQ cluster pods do not report any warnings or errors related to this behavior. I would appreciate any insights on potential causes and troubleshooting steps.

Environment Details:

RabbitMQ Version: 3.13.7

Erlang Version: 26.2.5.9

Nodes in Cluster: 3
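Since the pods log nothing during the drops, one possible way to pin down when they happen is to poll the management HTTP API and record each moment when all rates read zero. The sketch below assumes the management plugin is enabled and reachable, and that `GET /api/overview` reports `message_stats` with `publish_details`, `deliver_get_details`, and `ack_details` rates. The base URL, credentials, and polling interval are placeholders, not values from this cluster.

```python
import base64
import json
import time
import urllib.request

# Rate fields read from the "message_stats" object of /api/overview.
RATE_KEYS = ("publish_details", "deliver_get_details", "ack_details")


def extract_rates(overview: dict) -> dict:
    """Pull publish/deliver/ack rates out of an overview payload (0.0 if absent)."""
    stats = overview.get("message_stats", {})
    return {key: stats.get(key, {}).get("rate", 0.0) for key in RATE_KEYS}


def is_stalled(rates: dict) -> bool:
    """True when every tracked rate is exactly zero."""
    return all(rate == 0 for rate in rates.values())


def fetch_overview(base_url: str, user: str, password: str) -> dict:
    """Fetch /api/overview using HTTP basic auth."""
    request = urllib.request.Request(f"{base_url}/api/overview")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def watch(base_url: str, user: str, password: str, interval_s: float = 5.0) -> None:
    """Poll forever and print a timestamp whenever all rates are zero."""
    while True:
        rates = extract_rates(fetch_overview(base_url, user, password))
        if is_stalled(rates):
            print(time.strftime("%Y-%m-%dT%H:%M:%S%z"), "all rates at 0/s", rates)
        time.sleep(interval_s)


# Example call (placeholder URL and credentials):
# watch("http://rabbitmq.example.internal:15672", "monitor", "secret")
```

The printed timestamps could then be compared against SAN latency graphs, node memory and disk alarms, and OpenShift node events for the same moments.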

Replies: 0 comments

Select a reply

[Questions] Message rates drops to 0/s across all queues randomly #13479

VinayakSomvanshi Mar 11, 2025

Community Support Policy

RabbitMQ version used

Erlang version used

Operating system (distribution) used

How is RabbitMQ deployed?

rabbitmq-diagnostics status output

Logs from node 1 (with sensitive values edited out)

Logs from node 2 (if applicable, with sensitive values edited out)

Logs from node 3 (if applicable, with sensitive values edited out)

rabbitmq.conf

Steps to deploy RabbitMQ cluster

Steps to reproduce the behavior in question

advanced.config

Application code

Kubernetes deployment file

What problem are you trying to solve?

Replies: 0 comments

VinayakSomvanshi
Mar 11, 2025