Skip to content

create new topic-partition list comparison method based on hashing #5014

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

lucas-sonnabend
Copy link

@lucas-sonnabend lucas-sonnabend commented Apr 2, 2025

the new method is much faster than the old, O(Na + Nb) compared to the older one, which is O(Na * Nb)

Background:
The old one was causing problems with partition counts ~3600 in the cooperative-sticky partition assignment.
The group leader ended up being kicked out of the group because he spent too much time calculating the partition assignment, and didn't sent any heartbeat inbetween.

reducing the complexity of this pushes the runtime down. Concrete example with 3600 partitions and 1800 consumers before: ~240s
after: ~3s

fixes problem in #4629

I have tried to stick to the styles, but ran into problems with getting the right version of clang-format (10 doesn't seem to be available via homebrew)

I have have also run the unit tests, they pass.

I am currently running 0113-cooperative-rebalance.cpp test case.

I'm not sure if I should be completely replacing the old comparison function, but happy to do so based on comments

the new method is much faster than the old, O(Na + Nb)
compared to the older one, which is O(Na * Nb)

Background:
The old one was causing problems with partition counts ~3600
in the cooperative-sticky partition assignment.
The group leader ended up being kicked out of the group because he spent too
much time calculating the partition assignment, and didn't sent any heartbeat inbetween.

reducing the complexity of this pushes the runtime down.
Concrete example with 3600 partitions and 1800 consumers
before: ~240s
after: ~3s
@Copilot Copilot bot review requested due to automatic review settings April 2, 2025 21:45
@lucas-sonnabend lucas-sonnabend requested a review from a team as a code owner April 2, 2025 21:45
@confluent-cla-assistant
Copy link

confluent-cla-assistant bot commented Apr 2, 2025

🎉 All Contributor License Agreements have been signed. Ready to merge.
✅ lucas-sonnabend
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new, faster topic-partition list comparison method based on hashing in order to improve performance during cooperative-sticky partition assignment by reducing the time complexity from O(Na * Nb) to O(Na + Nb).

  • Replace the old list comparison with a new function that leverages a hash-based approach.
  • Add a new function declaration in the header and its corresponding implementation in the source file.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
src/rdkafka_sticky_assignor.c Updated to use the new hash-based comparison function.
src/rdkafka_partition.h Added declaration of the new comparison function.
src/rdkafka_partition.c Implemented the new hash-based comparison function.
Comments suppressed due to low confidence (1)

src/rdkafka_partition.c:3174

  • The initializer is provided with a NULL destructor, causing the topic partition copies created with rd_kafka_topic_partition_copy to potentially leak memory. Consider supplying an appropriate destructor (e.g., rd_kafka_topic_partition_destroy) to manage the allocated memory.
map_toppar_void_t hashmap = RD_MAP_INITIALIZER(a->cnt, cmp, hash, NULL, NULL);

@lucas-sonnabend
Copy link
Author

so, I'm trying to run the 0113 test, but it keeps timing out locally, both with the changes and on main.
I'm also running a kafka cluster via docker-compose for it rather than the recommended way

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant