Add semaphore block for ducktape tests #2037
@@ -275,6 +275,106 @@ blocks:
             - python3 -m venv _venv && source _venv/bin/activate
             - chmod u+r+x tools/source-package-verification.sh
             - tools/source-package-verification.sh
+  - name: "Ducktape Performance Tests (Linux x64)"
+    dependencies: []
+    task:
+      agent:
+        machine:
+          type: s1-prod-ubuntu24-04-amd64-3
+      env_vars:
+        - name: OS_NAME
+          value: linux
+        - name: ARCH
+          value: x64
+        - name: BENCHMARK_BOUNDS_CONFIG
+          value: tests/ducktape/benchmark_bounds.json
+        - name: BENCHMARK_ENVIRONMENT
+          value: ci
+      prologue:
+        commands:
+          - '[[ -z $DOCKERHUB_APIKEY ]] || docker login --username $DOCKERHUB_USER --password $DOCKERHUB_APIKEY'
+      jobs:
+        - name: Build and Tests
+          commands:
+            # Setup Python environment
+            - sem-version python 3.9

I guess using an older Python version will catch more perf issues, but I worry that newer versions might have behavioral differences we'd miss, e.g. some of the C bindings we use are deprecated in 3.13.

I'd be OK with leaving a TODO to matrix this across a couple of Python versions down the line. What do you think?

Makes sense! We could add a couple of Python versions, e.g.

+            - python3 -m venv _venv && source _venv/bin/activate
+
+            # Install ducktape framework and additional dependencies
+            - pip install ducktape psutil
+
+            # Install existing test requirements
+            - pip install -r requirements/requirements-tests.txt
+
+            # Build and install confluent-kafka from source
+            - lib_dir=dest/runtimes/$OS_NAME-$ARCH/native
+            - tools/wheels/install-librdkafka.sh "${LIBRDKAFKA_VERSION#v}" dest
+            - export CFLAGS="$CFLAGS -I${PWD}/dest/build/native/include"
+            - export LDFLAGS="$LDFLAGS -L${PWD}/${lib_dir}"
+            - export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$PWD/$lib_dir"
+            - python3 -m pip install -e .
+
+            # Store project root for reliable navigation
+            - PROJECT_ROOT="${PWD}"
+
+            # Start Kafka cluster and Schema Registry using dedicated ducktape compose file (KRaft mode)
+            - cd "${PROJECT_ROOT}/tests/docker"
+            - docker-compose -f docker-compose.ducktape.yml up -d kafka schema-registry
+
+            # Debug: Check container status and logs
+            - echo "=== Container Status ==="
+            - docker-compose -f docker-compose.ducktape.yml ps
+            - echo "=== Kafka Logs ==="
+            - docker-compose -f docker-compose.ducktape.yml logs kafka | tail -50
+
+            # Wait for Kafka to be ready (using PLAINTEXT listener for external access)
+            - |
+              timeout 1800 bash -c '
+                counter=0
+                until docker-compose -f docker-compose.ducktape.yml exec -T kafka kafka-topics --bootstrap-server localhost:9092 --list >/dev/null 2>&1; do
+                  echo "Waiting for Kafka... (attempt $((counter+1)))"
+
+                  # Show logs every 4th attempt (every 20 seconds)
+                  if [ $((counter % 4)) -eq 0 ] && [ $counter -gt 0 ]; then
+                    echo "=== Recent Kafka Logs ==="
+                    docker-compose -f docker-compose.ducktape.yml logs --tail=10 kafka
+                    echo "=== Container Status ==="
+                    docker-compose -f docker-compose.ducktape.yml ps kafka
+                  fi
+
+                  counter=$((counter+1))
+                  sleep 5
+                done
+              '
+            - echo "Kafka cluster is ready!"
+
+            # Wait for Schema Registry to be ready
+            - echo "=== Waiting for Schema Registry ==="
+            - |
+              timeout 300 bash -c '
+                counter=0
+                until curl -f http://localhost:8081/subjects >/dev/null 2>&1; do
+                  echo "Waiting for Schema Registry... (attempt $((counter+1)))"
+
+                  # Show logs every 3rd attempt (every 15 seconds)
+                  if [ $((counter % 3)) -eq 0 ] && [ $counter -gt 0 ]; then
+                    echo "=== Recent Schema Registry Logs ==="
+                    docker-compose -f docker-compose.ducktape.yml logs --tail=10 schema-registry
+                    echo "=== Schema Registry Container Status ==="
+                    docker-compose -f docker-compose.ducktape.yml ps schema-registry
+                  fi
+
+                  counter=$((counter+1))
+                  sleep 5
+                done
+              '
+            - echo "Schema Registry is ready!"
+
+            # Run standard ducktape tests with CI bounds
+            - cd "${PROJECT_ROOT}" && PYTHONPATH="${PROJECT_ROOT}" python tests/ducktape/run_ducktape_test.py
+
+            # Cleanup
+            - cd "${PROJECT_ROOT}/tests/docker" && docker-compose -f docker-compose.ducktape.yml down -v || true
   - name: "Packaging"
     run:
       when: "tag =~ '.*'"
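
The two readiness loops in the job (one for Kafka, one for Schema Registry) follow the same poll-until-ready pattern: try a probe, sleep 5 seconds, give up after an overall timeout. As a rough illustration only, the same pattern could be sketched in Python as below. The names `wait_until_ready` and `schema_registry_ready` are hypothetical and are not part of this PR or of `run_ducktape_test.py`; the URL and the 300-second and 5-second values are copied from the Schema Registry step.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(check, timeout_s=300.0, interval_s=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout_s elapses.

    Returns the number of attempts made; raises TimeoutError otherwise.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if check():
            return attempts
        if clock() >= deadline:
            raise TimeoutError(f"not ready after {attempts} attempts")
        sleep(interval_s)


def schema_registry_ready(url="http://localhost:8081/subjects"):
    """Hypothetical probe, roughly equivalent to `curl -f <url>`."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, HTTP 4xx/5xx, or socket timeout: not ready yet.
        return False
```

Calling `wait_until_ready(schema_registry_ready)` would block until the endpoint responds or five minutes pass. Unlike the shell version, this sketch does not dump container logs between attempts.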
@@ -0,0 +1,36 @@
+services:
+  kafka:
+    image: confluentinc/cp-kafka:latest
+    container_name: kafka-ducktape
+    ports:
+      - "9092:9092"
+      - "29092:29092"
+    environment:
+      KAFKA_NODE_ID: 1
+      KAFKA_PROCESS_ROLES: broker,controller
+      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka:9093
+      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
+      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093,PLAINTEXT_HOST://0.0.0.0:29092
+      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,PLAINTEXT_HOST://dockerhost:29092
+      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
+      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
+      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
+      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
+      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
+      CLUSTER_ID: 4L6g3nShT-eMCtK--X86sw

For my knowledge, is this the cluster dedicated to Kafka testing? https://github.com/search?q=org%3Aconfluentinc%204L6g3nShT-eMCtK--X86sw&type=code

Yes. Each CI job spins up its own isolated Kafka container with this ID.

+
+  schema-registry:
+    image: confluentinc/cp-schema-registry:latest
+    container_name: schema-registry-ducktape
+    depends_on:
+      - kafka
+    ports:
+      - "8081:8081"
+    extra_hosts:
+      - "dockerhost:172.17.0.1"
+    environment:
+      SCHEMA_REGISTRY_HOST_NAME: schema-registry
+      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: dockerhost:29092
+      SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081
+      SCHEMA_REGISTRY_KAFKASTORE_TOPIC_REPLICATION_FACTOR: 1
+      SCHEMA_REGISTRY_DEBUG: 'true'
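
The compose file binds three listeners but advertises only two of them to clients: `localhost:9092` (the address the CI readiness check uses) and `dockerhost:29092` (the address Schema Registry bootstraps from, resolved through `extra_hosts`). As a small illustration, the sketch below parses the two listener strings copied from the file and shows which listener is never advertised. The helper `parse_listeners` is hypothetical and exists only for this example.

```python
def parse_listeners(value):
    """Map listener name -> (host, port) from 'NAME://host:port,...'."""
    result = {}
    for entry in value.split(","):
        name, _, address = entry.strip().partition("://")
        host, _, port = address.rpartition(":")
        result[name] = (host, int(port))
    return result


# Values copied verbatim from the compose file in this PR.
LISTENERS = (
    "PLAINTEXT://0.0.0.0:9092,"
    "CONTROLLER://0.0.0.0:9093,"
    "PLAINTEXT_HOST://0.0.0.0:29092"
)
ADVERTISED = "PLAINTEXT://localhost:9092,PLAINTEXT_HOST://dockerhost:29092"

bound = parse_listeners(LISTENERS)
advertised = parse_listeners(ADVERTISED)

# Listeners that bind a port but are not handed out to clients.
internal_only = sorted(set(bound) - set(advertised))

for name, (host, port) in advertised.items():
    print(f"{name}: clients are told to connect to {host}:{port}")
print("not advertised:", internal_only)
```

Here only `CONTROLLER` is left unadvertised, which matches its role as the KRaft controller listener named in `KAFKA_CONTROLLER_LISTENER_NAMES`.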
@@ -1,11 +1,26 @@
 {
-  "_comment": "Default performance bounds for benchmark tests",
-  "min_throughput_msg_per_sec": 1500.0,
-  "max_p95_latency_ms": 1500.0,
-  "max_error_rate": 0.01,
-  "min_success_rate": 0.99,
-  "max_p99_latency_ms": 2500.0,
-  "max_memory_growth_mb": 600.0,
-  "max_buffer_full_rate": 0.03,
-  "min_messages_per_poll": 15.0
+  "_comment": "Performance bounds for benchmark tests by environment",
+  "local": {
+    "_comment": "Default bounds for local development - more relaxed thresholds",
+    "min_throughput_msg_per_sec": 1000.0,
+    "max_p95_latency_ms": 2000.0,
+    "max_error_rate": 0.02,
+    "min_success_rate": 0.98,
+    "max_p99_latency_ms": 3000.0,
+    "max_memory_growth_mb": 800.0,
+    "max_buffer_full_rate": 0.05,
+    "min_messages_per_poll": 10.0
+  },
+  "ci": {
+    "_comment": "Stricter bounds for CI environment - production-like requirements",
+    "min_throughput_msg_per_sec": 1500.0,
+    "max_p95_latency_ms": 1500.0,
+    "max_error_rate": 0.01,
+    "min_success_rate": 0.99,
+    "max_p99_latency_ms": 2500.0,
+    "max_memory_growth_mb": 600.0,
+    "max_buffer_full_rate": 0.03,
+    "min_messages_per_poll": 10.0
+  },
+  "_default_environment": "local"
 }
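
The bounds file now nests one set of thresholds per environment, and the CI job selects one through `BENCHMARK_ENVIRONMENT=ci`. How the test runner actually reads the file is not shown in this diff, so the following is only a sketch of one plausible lookup: an explicit argument first, then the environment variable, then `_default_environment`. The functions `select_bounds` and `violations`, the metric names passed to them, and the trimmed config are all assumptions made for illustration.

```python
import json
import os

# Trimmed copy of the new file layout; values taken from the diff.
CONFIG = json.loads("""
{
  "_comment": "Performance bounds for benchmark tests by environment",
  "local": {"_comment": "relaxed",
            "min_throughput_msg_per_sec": 1000.0,
            "max_p95_latency_ms": 2000.0},
  "ci":    {"_comment": "strict",
            "min_throughput_msg_per_sec": 1500.0,
            "max_p95_latency_ms": 1500.0},
  "_default_environment": "local"
}
""")


def select_bounds(config, environment=None):
    """Pick one environment's bounds (hypothetical lookup order)."""
    env = (environment
           or os.environ.get("BENCHMARK_ENVIRONMENT")
           or config.get("_default_environment", "local"))
    section = config.get(env)
    if not isinstance(section, dict):
        raise KeyError(f"no bounds defined for environment {env!r}")
    # Drop annotation keys such as "_comment".
    return {k: v for k, v in section.items() if not k.startswith("_")}


def violations(metrics, bounds):
    """List every bound that the measured metrics break."""
    problems = []
    for name, limit in bounds.items():
        kind, _, metric = name.partition("_")  # "min"/"max" + metric name
        if metric not in metrics:
            continue
        value = metrics[metric]
        if kind == "min" and value < limit:
            problems.append(f"{metric}={value} is below minimum {limit}")
        elif kind == "max" and value > limit:
            problems.append(f"{metric}={value} is above maximum {limit}")
    return problems
```

With these numbers, a run measuring 1200 msg/s at a p95 of 1800 ms would pass the `local` bounds and fail both `ci` bounds, which is the practical effect of splitting the file by environment.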
Out of curiosity, how is the machine type chosen?
"s1-prod-" is a production-grade system that other Semaphore jobs also use. IMO it's already time-tested for this repository, so we should keep using it.