create new topic-partition list comparison method based on hashing #5014

lucas-sonnabend · 2025-04-02T21:45:10Z

The new method is much faster than the old one: O(Na + Nb) compared to O(Na * Nb), where Na and Nb are the sizes of the two lists being compared.

Background:
The old one was causing problems with partition counts of ~3600 in the cooperative-sticky partition assignment.
The group leader ended up being kicked out of the group because it spent too much time calculating the partition assignment and didn't send any heartbeats in between.

Reducing the complexity brings the runtime down. Concrete example with 3600 partitions and 1800 consumers:
before: ~240s
after: ~3s
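
To illustrate the complexity difference described above, here is a minimal, self-contained sketch of a hash-based list comparison. It is not the code from this PR, which uses librdkafka's internal map types; `toppar_t`, `toppar_hash`, `toppar_eq`, and `toppar_list_cmp_hashed` are hypothetical names invented for this example, and the sketch assumes neither list contains duplicate entries.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a topic-partition entry (not a librdkafka type). */
typedef struct {
        const char *topic;
        int32_t partition;
} toppar_t;

/* FNV-1a over the topic name, then mixed with the partition number. */
static uint32_t toppar_hash(const toppar_t *tp) {
        uint32_t h = 2166136261u;
        const unsigned char *p;
        for (p = (const unsigned char *)tp->topic; *p; p++) {
                h ^= *p;
                h *= 16777619u;
        }
        h ^= (uint32_t)tp->partition;
        h *= 16777619u;
        return h;
}

static int toppar_eq(const toppar_t *x, const toppar_t *y) {
        return x->partition == y->partition && !strcmp(x->topic, y->topic);
}

/* Returns 0 if a and b hold the same entries in any order, 1 if they
 * differ, -1 on allocation failure. Assumes no duplicates in either list.
 * Expected cost is O(Na + Nb) instead of O(Na * Nb) for nested loops. */
static int toppar_list_cmp_hashed(const toppar_t *a, size_t a_cnt,
                                  const toppar_t *b, size_t b_cnt) {
        size_t cap = 8, i, slot;
        const toppar_t **table;
        int differs = 0;

        if (a_cnt != b_cnt)
                return 1;

        /* Power-of-two capacity of at least 2x the entry count, so the
         * table always has free slots and probe chains stay short. */
        while (cap < a_cnt * 2)
                cap *= 2;
        table = calloc(cap, sizeof(*table));
        if (!table)
                return -1;

        /* Pass 1: insert every entry of a (linear probing). */
        for (i = 0; i < a_cnt; i++) {
                slot = toppar_hash(&a[i]) & (cap - 1);
                while (table[slot])
                        slot = (slot + 1) & (cap - 1);
                table[slot] = &a[i];
        }

        /* Pass 2: look up every entry of b; stop at the first miss. */
        for (i = 0; i < b_cnt && !differs; i++) {
                slot = toppar_hash(&b[i]) & (cap - 1);
                while (table[slot] && !toppar_eq(table[slot], &b[i]))
                        slot = (slot + 1) & (cap - 1);
                if (!table[slot])
                        differs = 1;
        }

        free(table);
        return differs;
}
```

The table stores pointers into `a` rather than copies, so there is nothing to free per entry; an implementation that copies entries into the map would also need a destructor for them.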

Fixes the problem in #4629.

I have tried to stick to the code style, but ran into problems getting the right version of clang-format (10 doesn't seem to be available via Homebrew).

I have also run the unit tests; they pass.

I am currently running the 0113-cooperative-rebalance.cpp test case.

I'm not sure if I should completely replace the old comparison function, but I'm happy to do so based on comments.

confluent-cla-assistant · 2025-04-02T21:45:21Z

🎉 All Contributor License Agreements have been signed. Ready to merge.
✅ lucas-sonnabend
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

Copilot

Pull Request Overview

This PR introduces a new, faster topic-partition list comparison method based on hashing in order to improve performance during cooperative-sticky partition assignment by reducing the time complexity from O(Na * Nb) to O(Na + Nb).

Replace the old list comparison with a new function that leverages a hash-based approach.
Add a new function declaration in the header and its corresponding implementation in the source file.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File	Description
src/rdkafka_sticky_assignor.c	Updated to use the new hash-based comparison function.
src/rdkafka_partition.h	Added declaration of the new comparison function.
src/rdkafka_partition.c	Implemented the new hash-based comparison function.

Comments suppressed due to low confidence (1)

src/rdkafka_partition.c:3174

The initializer is provided with a NULL destructor, causing the topic partition copies created with rd_kafka_topic_partition_copy to potentially leak memory. Consider supplying an appropriate destructor (e.g., rd_kafka_topic_partition_destroy) to manage the allocated memory.

map_toppar_void_t hashmap = RD_MAP_INITIALIZER(a->cnt, cmp, hash, NULL, NULL);

lucas-sonnabend · 2025-04-02T22:37:57Z

I'm trying to run the 0113 test, but it keeps timing out locally, both with the changes and on main.
I'm also running a Kafka cluster via docker-compose for it rather than the recommended way.

Copilot bot review requested due to automatic review settings April 2, 2025 21:45

lucas-sonnabend requested a review from a team as a code owner April 2, 2025 21:45

Copilot AI reviewed Apr 2, 2025
