Initialize ducktape setup #2021


Merged
merged 4 commits into master on Aug 19, 2025

Conversation

@k-raina (Member) commented Aug 14, 2025

Summary

This pull request introduces a new Ducktape-based integration test suite for the Confluent Kafka Python client, enabling automated producer tests against a locally running Kafka instance.

Ducktape test infrastructure and runner:

  • Added run_ducktape_test.py script to automate running Ducktape producer tests, with support for running all or specific test cases, environment setup, and result directory management.
  • Created services/kafka.py with a KafkaClient wrapper to interact with an external Kafka instance, including topic management and connection verification.

Producer test cases:

  • Implemented test_producer.py with the SimpleProducerTest class, covering basic message production, parameterized batch message tests, and matrix tests for different compression types.

Documentation and structure:

  • Added a README (README.md) explaining prerequisites, usage, and available test cases for the Ducktape producer tests.
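For context on what verifying "basic message production" involves: producer tests like these typically collect per-message results through a delivery callback. Below is a minimal sketch of that pattern with the broker interaction simulated; in the real test, `producer.produce(topic, value, on_delivery=on_delivery)` plus `producer.flush()` would drive the callbacks, and all names here are illustrative rather than taken from the PR.

```python
# Delivery-tracking pattern used by producer integration tests.
# The broker acknowledgements are simulated here so the sketch is
# self-contained; a real test wires on_delivery into produce().
delivered_messages = []
failed_messages = []

def on_delivery(err, msg):
    # Invoked once per message; err is None on successful delivery
    if err is not None:
        failed_messages.append((msg, err))
    else:
        delivered_messages.append(msg)

# Simulate five acknowledgements, one of which reports an error
for i in range(5):
    err = "delivery timeout" if i == 3 else None
    on_delivery(err, f"message-{i}")

print(len(delivered_messages), len(failed_messages))  # → 4 1
```

The test then asserts that `failed_messages` is empty and that the delivered count matches the number produced, which is the pattern visible in the review snippets further down.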

@Copilot Copilot AI review requested due to automatic review settings August 14, 2025 16:10
@k-raina k-raina requested review from MSeal and a team as code owners August 14, 2025 16:10
@confluent-cla-assistant

🎉 All Contributor License Agreements have been signed. Ready to merge.
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces a new Ducktape-based integration test suite for the Confluent Kafka Python client, providing automated producer testing capabilities against an external Kafka instance running on localhost:9092.

  • Adds comprehensive test infrastructure with a test runner script and Kafka client wrapper service
  • Implements producer test cases covering basic message production, batch processing, and compression scenarios
  • Provides documentation and setup instructions for running the integration tests

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Summary per file:

  • tests/ducktape/test_producer.py: Core producer test class with basic, parameterized batch, and matrix compression tests
  • tests/ducktape/services/kafka.py: Kafka client wrapper service for interacting with an external Kafka instance
  • tests/ducktape/services/__init__.py: Package initialization for ducktape services
  • tests/ducktape/run_ducktape_test.py: Test runner script with environment setup and command execution
  • tests/ducktape/__init__.py: Package initialization for ducktape tests
  • tests/ducktape/README.md: Documentation explaining prerequisites, usage, and available test cases


# Verify topic exists
topics = self.kafka.list_topics()
assert topic_name in topics, f"Topic {topic_name} was not created. Available topics: {topics}"

Copilot AI commented Aug 14, 2025

Using a fixed sleep time is unreliable and can cause flaky tests. Consider implementing a retry mechanism or polling for topic existence instead of a hard-coded sleep.

Suggested change:

    # Wait for topic to be created by polling
    timeout = 10  # seconds
    poll_interval = 0.5  # seconds
    start_time = time.time()
    while True:
        topics = self.kafka.list_topics()
        if topic_name in topics:
            break
        if time.time() - start_time > timeout:
            raise AssertionError(f"Topic {topic_name} was not created within {timeout} seconds. Available topics: {topics}")
        time.sleep(poll_interval)



# Create topic
self.kafka.create_topic(topic_name, partitions=2, replication_factor=1)
time.sleep(2)
Copilot AI commented Aug 14, 2025

Using a fixed sleep time is unreliable and can cause flaky tests. Consider implementing a retry mechanism or polling for topic existence instead of a hard-coded sleep.

Suggested change (replacing the time.sleep(2)):

    # Poll for topic existence instead of fixed sleep
    timeout = 30  # seconds
    poll_interval = 0.5  # seconds
    start_time = time.time()
    while True:
        if self.kafka.topic_exists(topic_name):
            break
        if time.time() - start_time > timeout:
            raise Exception(f"Topic {topic_name} did not become available within {timeout} seconds")
        time.sleep(poll_interval)


A contributor left a comment

Overkill copilot suggestion


# Create topic
self.kafka.create_topic(topic_name, partitions=1, replication_factor=1)
time.sleep(2)
Copilot AI commented Aug 14, 2025

Using a fixed sleep time is unreliable and can cause flaky tests. Consider implementing a retry mechanism or polling for topic existence instead of a hard-coded sleep.

Suggested change (replacing the time.sleep(2)):

    self._wait_for_topic(topic_name, timeout=30)


# Wait for topic creation to complete
for topic_name, f in fs.items():
    try:
        f.result(timeout=30)  # Wait up to 30 seconds
Copilot AI commented Aug 14, 2025

[nitpick] The 30-second timeout for topic creation might be too long for test environments. Consider using a shorter timeout or making it configurable.

Suggested change (replacing the fixed 30-second timeout):

    f.result(timeout=self.topic_creation_timeout)  # Wait up to the configured timeout


import json

# Add the project root to Python path
project_root = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
Copilot AI commented Aug 14, 2025

Using multiple os.path.dirname() calls makes the path calculation fragile and hard to understand. Consider using os.path.join() with relative paths or pathlib.Path for more robust path handling.

Suggested change (replacing the nested os.path.dirname() calls):

    from pathlib import Path
    # Add the project root to Python path
    project_root = str(Path(__file__).resolve().parents[2])



self.logger.info("Successfully connected to Kafka")

def test_basic_produce(self):
A member left a comment

Let's add a test_produce_with_schema_registry() so we can benchmark tests with SR 👍

A contributor left a comment

Can also do this as a follow-up

import json

# Add the project root to Python path
project_root = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
A contributor left a comment

This shouldn't be necessary given you have an __init__.py file to include this in the packaging, such that python -m tests.ducktape.run_ducktape_test should work with all the modules loaded.

A contributor left a comment

If need be, we can move the integration tests within the module scope for path loading as an alternative.


# Check if ducktape is installed
try:
    import ducktape
A contributor left a comment

Can just move this to the top of the file and make it a dev dependency; no need for import catches.


self.logger.info("Successfully connected to Kafka")

def test_basic_produce(self):
A contributor left a comment

Can also do this as a follow-up

print()

# Set up environment with proper Python path
env = os.environ.copy()
A contributor left a comment

If we use a python -m command below we probably don't need this either

self.logger.info("Successfully produced %d messages to topic %s",
                 len(delivered_messages), topic_name)

@parametrize(num_messages=5)
A contributor left a comment

We'll probably want to do time-based max throughput rather than time to complete N messages, but this is fine for the first PR setup.
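A time-based throughput measurement could look like the sketch below; measure_throughput and the stubbed produce_one callable are hypothetical names, not part of the PR, and a real test would wrap producer.produce() plus periodic poll() in the loop body.

```python
import time

def measure_throughput(produce_one, test_duration=1.0):
    """Call produce_one() as fast as possible for test_duration seconds
    and return (messages_sent, messages_per_second)."""
    start = time.time()
    sent = 0
    while time.time() - start < test_duration:
        produce_one()
        sent += 1
    elapsed = time.time() - start
    return sent, sent / elapsed

# Stub producer call; a real test would wrap producer.produce() here
sent, rate = measure_throughput(lambda: None, test_duration=0.2)
print(sent > 0 and rate > 0)  # → True
```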

self.logger.info("Delivered: %d, Failed: %d", len(delivered_messages), len(failed_messages))

assert len(failed_messages) == 0, f"Some messages failed to deliver: {failed_messages}"
assert len(delivered_messages) == num_messages, \
A contributor left a comment

TODO we'll want to extract execution metrics and assert against bounds therein. A follow-up effort
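One shape the bounds-assertion follow-up could take, sketched with hypothetical metric names and helper (nothing below is from the PR):

```python
def assert_within_bounds(metrics, bounds):
    """Check each metric against its (lower, upper) bound and report
    every violation at once rather than failing on the first."""
    violations = []
    for name, (low, high) in bounds.items():
        value = metrics[name]
        if not (low <= value <= high):
            violations.append(f"{name}={value} outside [{low}, {high}]")
    assert not violations, "Metric bounds violated: " + "; ".join(violations)

# Hypothetical execution metrics and acceptable ranges
metrics = {"throughput_msgs_per_s": 1200.0, "p99_latency_ms": 45.0}
bounds = {"throughput_msgs_per_s": (1000.0, float("inf")),
          "p99_latency_ms": (0.0, 100.0)}
assert_within_bounds(metrics, bounds)  # passes silently
```

Collecting all violations before asserting keeps a single failed run informative when several metrics drift at once.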


# Create topic
self.kafka.create_topic(topic_name, partitions=2, replication_factor=1)
time.sleep(2)
A contributor left a comment

Overkill copilot suggestion

@k-raina k-raina requested review from MSeal and fangnx August 18, 2025 15:18
@@ -98,3 +98,29 @@ def list_topics(self):
    except Exception as e:
        self.logger.error("Failed to list topics: %s", e)
        return []

def wait_for_topic(self, topic_name, max_wait_time=30, initial_wait=0.1):
A contributor left a comment

Probably overkill out of the copilot recommendation, but nice to have the abstraction
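The diff hunk above shows only wait_for_topic's signature. One plausible implementation consistent with the initial_wait parameter is an exponential-backoff poll; this is a guess at the shape, not the PR's actual code, with list_topics passed in as a callable so the sketch stays self-contained:

```python
import time

def wait_for_topic(list_topics, topic_name, max_wait_time=30, initial_wait=0.1):
    """Poll list_topics() until topic_name appears, doubling the delay
    between polls (capped at 1s). Returns True once the topic exists,
    False if max_wait_time elapses first."""
    deadline = time.time() + max_wait_time
    delay = initial_wait
    while time.time() < deadline:
        if topic_name in list_topics():
            return True
        time.sleep(delay)
        delay = min(delay * 2, 1.0)
    return False

# Stub broker that only reports the topic on the third poll
calls = {"n": 0}
def fake_list_topics():
    calls["n"] += 1
    return ["orders"] if calls["n"] >= 3 else []

print(wait_for_topic(fake_list_topics, "orders", max_wait_time=5, initial_wait=0.01))  # → True
```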

start_time = time.time()
messages_sent = 0

while time.time() - start_time < test_duration:
A contributor left a comment

Let's do a follow-up later which pulls this into an async context so we don't need to write the while/try loop in each test.
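One non-async way to factor the loop out in the meantime (run_for is a hypothetical helper, not from the PR; the eventual follow-up may look different in an async context):

```python
import time

def run_for(duration):
    """Yield successive iteration indices until `duration` seconds have
    elapsed, replacing the explicit while time.time() loop in each test."""
    start = time.time()
    i = 0
    while time.time() - start < duration:
        yield i
        i += 1

messages_sent = 0
for _ in run_for(0.1):  # stand-in for the produce-and-poll loop body
    messages_sent += 1
print(messages_sent > 0)  # → True
```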

@k-raina k-raina merged commit 0173a54 into master Aug 19, 2025
1 of 2 checks passed
@k-raina k-raina deleted the ducktape-setup branch August 19, 2025 04:24
@rayokota (Member)

@k-raina , @MSeal , this PR broke the build, can you fix?

Please run flake8 before merging in the future.

@k-raina (Member, Author) commented Aug 19, 2025

@rayokota Thanks for the catch.
Raised PR #2026
