Add ducktape benchmark tests for consumer (sync + async) #2045

fangnx · 2025-09-11T00:01:59Z

What

This PR implements comprehensive performance benchmarking for AIOConsumer and adds critical performance guidance to help developers optimize their async Kafka consumer applications.

What This PR Does

- Adds parameterized benchmarking for AIOConsumer with configurable batch sizes (batch_size=[1, 5, 20])
- Implements comprehensive metrics collection and reporting for consumer performance analysis
- Documents performance characteristics directly in AIOConsumer.poll() and consume() method docstrings
- Creates a complete testing framework for comparing sync vs async consumer performance across different configurations

Performance Results

| Consumer Type | Method | Batch Size | Throughput (msg/s) | Performance Ratio |
|---|---|---|---|---|
| SYNC | `poll()` | 1 | 111,683 | 1.0x (baseline) |
| SYNC | `consume()` | 1 | 110,995 | 1.0x |
| SYNC | `consume()` | 5 | 110,663 | 1.0x |
| SYNC | `consume()` | 20 | 111,017 | 1.0x |
| ASYNC | `poll()` | 1 | 16,559 | 0.15x (7x slower) |
| ASYNC | `consume()` | 1 | 14,903 | 0.13x (7.5x slower) |
| ASYNC | `consume()` | 5 | 65,066 | 0.58x (1.7x slower) |
| ASYNC | `consume()` | 20 | 112,532 | 1.0x (matches sync) |
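The ratio column can be reproduced from the throughput column. The snippet below is a small worked calculation, not part of the PR's test suite; the names `RESULTS`, `ratio`, and `slowdown` are illustrative only. It takes sync `poll()` as the baseline, as the table does. Note that the async `poll()` slowdown works out to about 6.7x, which the table rounds to 7x.

```python
# Throughput figures (msg/s) copied from the results table.
RESULTS = {
    ("SYNC", "poll()", 1): 111_683,
    ("SYNC", "consume()", 1): 110_995,
    ("SYNC", "consume()", 5): 110_663,
    ("SYNC", "consume()", 20): 111_017,
    ("ASYNC", "poll()", 1): 16_559,
    ("ASYNC", "consume()", 1): 14_903,
    ("ASYNC", "consume()", 5): 65_066,
    ("ASYNC", "consume()", 20): 112_532,
}

# The table uses sync poll() as the 1.0x baseline.
BASELINE = RESULTS[("SYNC", "poll()", 1)]


def ratio(throughput, baseline=BASELINE):
    """Throughput as a fraction of the baseline (the 'Performance Ratio' column)."""
    return throughput / baseline


def slowdown(throughput, baseline=BASELINE):
    """How many times slower than the baseline a configuration is."""
    return baseline / throughput


if __name__ == "__main__":
    for (kind, method, batch), tput in RESULTS.items():
        print(f"{kind:5} {method:10} batch={batch:<2} "
              f"{ratio(tput):.2f}x of baseline, {slowdown(tput):.1f}x slower")
```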

Performance Explanation

The performance difference stems from AIOConsumer's use of ThreadPoolExecutor to make blocking librdkafka calls async-compatible. For single-message operations (poll() or consume() with batch_size=1), every message pays the full thread-pool handoff cost, which makes throughput roughly 7x lower than the sync consumer. With larger batch sizes, one handoff is amortized across the whole batch: throughput reaches 0.58x of sync at batch_size=5 and parity with sync at batch_size=20.
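The amortization argument can be illustrated with a minimal sketch that counts thread-pool handoffs. This is not AIOConsumer's implementation: `FakeBlockingConsumer`, `drain`, and `run` are hypothetical names, and the fake consumer serves messages from memory instead of calling librdkafka. It only shows that wrapping a blocking call with `loop.run_in_executor` costs one handoff per call, so the number of handoffs falls in proportion to the batch size.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


class FakeBlockingConsumer:
    """Hypothetical stand-in for a blocking client; serves messages from memory."""

    def __init__(self, total):
        self._remaining = total

    def consume(self, num_messages):
        # Return up to num_messages items, as a blocking batch fetch would.
        n = min(num_messages, self._remaining)
        self._remaining -= n
        return [b"payload"] * n


async def drain(total, batch_size):
    """Consume `total` messages; return (messages consumed, executor handoffs)."""
    consumer = FakeBlockingConsumer(total)
    loop = asyncio.get_running_loop()
    consumed = handoffs = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        while consumed < total:
            # One thread-pool round trip per call, whatever the batch size.
            batch = await loop.run_in_executor(pool, consumer.consume, batch_size)
            handoffs += 1
            consumed += len(batch)
    return consumed, handoffs


def run(total, batch_size):
    """Synchronous entry point for the sketch."""
    return asyncio.run(drain(total, batch_size))


if __name__ == "__main__":
    for batch_size in (1, 5, 20):
        consumed, handoffs = run(1000, batch_size)
        print(f"batch_size={batch_size}: {consumed} messages, {handoffs} handoffs")
```

The handoff count says nothing about absolute throughput, which depends on the real client and broker; it only shows why the fixed per-call cost matters less as batches grow.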

Developer Guidance Added

- High-throughput applications: Use consume() with batch_size >= 20 for optimal async performance (about 112K msg/s, on par with sync)
- Latency-sensitive applications: Use consume() with batch_size=5 for balanced performance (65K msg/s)
- Avoid: AIOConsumer.poll() for high-throughput scenarios (about 7x slower than sync)
- Sync consumers: Throughput stays at about 111K msg/s regardless of batch size
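In application code, the guidance above amounts to a batched consume loop. The sketch below is written against a hypothetical consumer object and makes two assumptions: that the async consumer exposes `consume(num_messages, timeout)` with the same parameter names as the sync `Consumer.consume()`, and that messages offer `error()` and `value()` as the sync `Message` does. `FakeMessage` and `FakeAsyncConsumer` exist only to make the example runnable without a broker.

```python
import asyncio


class FakeMessage:
    """Hypothetical message exposing value() and error() like a Kafka message."""

    def __init__(self, value, error=None):
        self._value, self._error = value, error

    def value(self):
        return self._value

    def error(self):
        return self._error


class FakeAsyncConsumer:
    """Hypothetical async consumer that serves a fixed list of messages."""

    def __init__(self, messages):
        self._messages = list(messages)
        self.calls = 0

    async def consume(self, num_messages=1, timeout=-1):
        # Assumed signature, mirroring the sync Consumer.consume().
        self.calls += 1
        batch = self._messages[:num_messages]
        self._messages = self._messages[num_messages:]
        return batch


async def consume_in_batches(consumer, batch_size=20, timeout=1.0):
    """Drain `consumer` in batches and return the payloads of error-free messages."""
    values = []
    while True:
        batch = await consumer.consume(num_messages=batch_size, timeout=timeout)
        if not batch:
            break  # nothing arrived within the timeout; this sketch simply stops
        for msg in batch:
            if msg.error() is None:
                values.append(msg.value())
    return values


if __name__ == "__main__":
    msgs = [FakeMessage(i) for i in range(45)] + [FakeMessage(None, error="boom")]
    fake = FakeAsyncConsumer(msgs)
    payloads = asyncio.run(consume_in_batches(fake, batch_size=20))
    print(len(payloads), "payloads in", fake.calls, "consume() calls")
```

A long-running service would typically keep polling after an empty batch instead of stopping; the early exit here just keeps the example finite.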

This PR provides developers with concrete, data-driven guidance for optimizing their Kafka consumer performance based on their specific throughput and latency requirements.

Checklist

Contains customer facing changes? Including API/behavior changes
Did you add sufficient unit test and/or integration test coverage for this PR?
- If not, please explain why it is not required

References

JIRA: https://confluentinc.atlassian.net/browse/DGS-22195

Test & Review

Open questions / Follow-ups

confluent-cla-assistant · 2025-09-11T00:02:10Z

🎉 All Contributor License Agreements have been signed. Ready to merge.
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

k-raina

Thanks for the PR! Added initial review comments.

k-raina · 2025-09-11T07:14:07Z

tests/ducktape/consumer_benchmark_metrics.py

@@ -0,0 +1,363 @@
+"""


Maybe we can use a single file, tests/ducktape/benchmark_metrics.py, for both producer and consumer benchmarks, and list the default producer and consumer bounds in bounds.json.

Reason: The metrics for producer and consumer benchmarks should mostly be the same, e.g. "Latency", "Throughput", "Messages processed", etc.

fangnx · 2025-09-11

Yes, I created this new file to avoid the merge conflicts we would otherwise have to deal with :). Let's refactor the test suite (consolidating shared code, renaming files properly, etc.) once both of our PRs are merged.

sonarqube-confluent · 2025-09-12T18:14:22Z

68.00% Coverage on New Code (is less than 80.00%)

Analysis Details

21 Issues

2 Bugs
0 Vulnerabilities
19 Code Smells

Coverage and Duplications

68.00% Coverage (64.60% Estimated after merge)
No duplication information (5.20% Estimated after merge)

Project ID: confluent-kafka-python

MSeal · 2025-09-12T23:07:58Z

tests/ducktape/consumer_strategy.py

+        return messages_consumed
+
+
+class AsyncConsumerStrategy(ConsumerStrategy):


A lot of duplicate code in this and the prior class. Would be nice if we could remove some more of that duplication. Not blocking merge on it though

draft

5351814

update and cleanup

c45cd75

k-raina reviewed Sep 11, 2025

fangnx marked this pull request as ready for review September 11, 2025 14:30

fangnx requested review from MSeal and a team as code owners September 11, 2025 14:30

fangnx changed the title ~~WIP: ducktape benchmark tests for consumer (sync + async)~~ Add ducktape benchmark tests for consumer (sync + async) Sep 11, 2025

add perf comments, add batch_size param to consume test, lint fix

4dbfb9f

linter fix

a46b11e

MSeal approved these changes Sep 12, 2025
