Re-evaluate stream SAC group after connection down event #13657
The same connection can contain several consumers belonging to a SAC group (group key = vhost + stream + consumer name). After the consumers of a down connection are removed from a group, the whole new group must be re-evaluated to select a new active consumer.
The previous behavior did not re-evaluate the new group and could select a consumer from the down connection. This left the group with only inactive consumers, because the selected active consumer would never receive the activation message from the stream SAC coordinator.
This commit fixes the problem by removing the consumers of the down connection from the affected groups and then performing the appropriate operations for the groups to keep on consuming (e.g. notifying an active consumer that it needs to step down).
References #13372
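The fix described above can be illustrated with a minimal Python sketch. It is not the broker's Erlang implementation: the `Consumer` and `Group` types, the `handle_connection_down` function, and the "first remaining consumer becomes active" rule are all assumptions made for illustration, and the step-down handling mentioned for some groups is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical group key, mirroring the description: (vhost, stream, consumer name).
GroupKey = Tuple[str, str, str]


@dataclass
class Consumer:
    connection: str        # hypothetical identifier of the owning connection
    subscription_id: int
    active: bool = False


@dataclass
class Group:
    consumers: List[Consumer] = field(default_factory=list)


def handle_connection_down(
    groups: Dict[GroupKey, Group], connection: str
) -> List[Tuple[GroupKey, Consumer]]:
    """Remove every consumer of `connection`, then re-evaluate each affected group.

    Returns the consumers that must be told they are now active.
    """
    to_activate = []
    for key, group in groups.items():
        remaining = [c for c in group.consumers if c.connection != connection]
        if len(remaining) == len(group.consumers):
            continue  # this group had no consumer on the down connection
        # Step 1: drop ALL consumers of the down connection first.
        group.consumers = remaining
        # Step 2: only then pick a new active consumer, so the choice can
        # never land on a consumer that belongs to the down connection.
        if remaining and not any(c.active for c in remaining):
            remaining[0].active = True  # assumed selection rule
            to_activate.append((key, remaining[0]))
    return to_activate
```

The ordering is the point of the sketch: if the active consumer were replaced before all consumers of the down connection had been removed, the replacement could be another consumer on that same connection, which is the failure mode described above.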
Acceptance steps
Get the branch and run the broker with the stream plugin:
    cd /tmp
    git clone git@github.com:rabbitmq/rabbitmq-server.git
    cd rabbitmq-server
    git checkout stream-sac-re-evaluate-group-after-connection-down
    make run-broker PLUGINS="rabbitmq_stream"
Get Stream PerfTest and run a first instance with 2 consumers on a 3-partition super stream:
    cd /tmp
    wget https://github.com/rabbitmq/rabbitmq-java-tools-binaries-dev/releases/download/v-stream-perf-test-latest/stream-perf-test-latest.jar
    java -jar stream-perf-test-latest.jar --producers 0 --consumers 2 \
        --stream-count 1 --super-streams --super-stream-partitions 3 \
        --single-active-consumer --consumer-names my-app \
        --uris rabbitmq-stream://$(hostname):5552 \
        --consumers-by-connection 100
Start another identical instance:
    cd /tmp
    java -jar stream-perf-test-latest.jar --producers 0 --consumers 2 \
        --stream-count 1 --super-streams --super-stream-partitions 3 \
        --single-active-consumer --consumer-names my-app \
        --uris rabbitmq-stream://$(hostname):5552 \
        --consumers-by-connection 100
List the registered consumers for the group on the first partition:
    cd /tmp/rabbitmq-server
    sbin/rabbitmqctl list_stream_group_consumers --reference my-app --stream stream-0
There should be 1 active consumer and 3 inactive consumers.
Do the same thing for the second partition:
    sbin/rabbitmqctl list_stream_group_consumers --reference my-app --stream stream-1
And for the third partition:
    sbin/rabbitmqctl list_stream_group_consumers --reference my-app --stream stream-2
Now list the Java processes:
    jps
Pick the PID of one of the Stream PerfTest instances and kill it, for example:
    kill -9 9721
List the consumers of the group for each partition. There should be 1 active consumer and 1 inactive consumer for each partition (no partition should have 2 inactive consumers).
    sbin/rabbitmqctl list_stream_group_consumers --reference my-app --stream stream-0
    sbin/rabbitmqctl list_stream_group_consumers --reference my-app --stream stream-1
    sbin/rabbitmqctl list_stream_group_consumers --reference my-app --stream stream-2
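The expected counts in these steps follow from simple arithmetic: two instances with 2 consumers each give 4 consumers per partition (1 active, 3 inactive), and killing one instance leaves 2 (1 active, 1 inactive). A small Python sketch of that check is shown below. The `partition_is_healthy` helper is hypothetical, and it assumes the consumer states have already been copied by hand from the command output as the strings "active" and "inactive"; it does not parse `rabbitmqctl` output itself.

```python
from collections import Counter
from typing import List


def partition_is_healthy(states: List[str], expected_consumers: int) -> bool:
    """Check that a partition's group has exactly one active consumer.

    `states` holds one entry per registered consumer, "active" or "inactive"
    (assumed labels, transcribed manually from the listing).
    """
    counts = Counter(states)
    return (
        len(states) == expected_consumers
        and counts["active"] == 1
        and counts["inactive"] == expected_consumers - 1
    )


# Before the kill: 2 instances x 2 consumers = 4 consumers per partition.
before_kill = ["active", "inactive", "inactive", "inactive"]
# After the kill: one instance left, so 2 consumers per partition.
after_kill = ["active", "inactive"]
# The failure mode this change addresses: nobody is active any more.
broken = ["inactive", "inactive"]
```

Calling `partition_is_healthy(after_kill, 2)` for each of the three partitions corresponds to the final acceptance check, while the `broken` list represents the state the previous behavior could produce.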
Re-evaluate stream SAC group after connection down event (backport #13657)