Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dl/translator: preparatory change for translator porting #25077

Draft
wants to merge 9 commits into
base: dev
Choose a base branch
from

Conversation

bharathv
Copy link
Contributor

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

  • none

@bharathv
Copy link
Contributor Author

/dt

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Feb 12, 2025

CI test results

test results on build#61812
test_id test_kind job_url test_status passed
datalake_cloud_rpfixture.datalake_cloud_rpfixture unit https://buildkite.com/redpanda/redpanda/builds/61812#0194f809-a850-4ae8-be27-dc244c7b7d7d FAIL 0/2
datalake_cloud_rpfixture.datalake_cloud_rpfixture unit https://buildkite.com/redpanda/redpanda/builds/61812#0194f809-a851-4d8d-994c-4b22503e0924 FAIL 0/2
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61812#0194f853-0be3-430e-bf2d-8b43e45b00e0 FLAKY 1/3
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_HADOOP ducktape https://buildkite.com/redpanda/redpanda/builds/61812#0194f853-0be3-430e-bf2d-8b43e45b00e0 FLAKY 1/2
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61812#0194f853-0be0-466f-9b2e-9f40164b3f99 FLAKY 1/4
rptest.tests.enterprise_features_license_test.EnterpriseFeaturesTest.test_enable_features.feature=Feature.oidc.install_license=False.disable_trial=False ducktape https://buildkite.com/redpanda/redpanda/builds/61812#0194f853-0be3-430e-bf2d-8b43e45b00e0 FLAKY 1/2
test results on build#61827
test_id test_kind job_url test_status passed
kafka_server_rpfixture.kafka_server_rpfixture unit https://buildkite.com/redpanda/redpanda/builds/61827#0194fc77-e54c-4094-ac06-db8d2e99455b FLAKY 1/2
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcbc-1cb2-48b9-b97c-63a10f74c063 FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=10 ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcc1-9d0c-4df8-b00c-cf2f01da7d89 FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcc1-9d0e-4eed-b13a-b0603a75e846 FLAKY 1/3
rptest.tests.partition_balancer_test.PartitionBalancerTest.test_recovery_mode_rebalance_finish ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcc1-9d0c-4df8-b00c-cf2f01da7d89 FLAKY 1/2
rptest.tests.write_caching_fi_e2e_test.WriteCachingFailureInjectionE2ETest.test_crash_all_with_consumer_group ducktape https://buildkite.com/redpanda/redpanda/builds/61827#0194fcbc-1cb2-48b9-b97c-63a10f74c063 FLAKY 1/2

@bharathv
Copy link
Contributor Author

/dt

This method loops through the column writers to check if any of them are
flush worthy, computes the memory usage in the same loop. Useful for a
latter commit that avoids this loop again and needs stats right after
append.
The interface implementations keep track of the current memory used by
the writer and related reservations.
Adds the following

- flush() - flushes all the buffered bytes to the output stream
- methods to fetch buffered/flushed bytes
The implementation works with the scheduler to reserve memory as needed.
.. instead of lazy_abort_source. To be used later, they are both
connected anyway.
Currently multiplexer is a one shot class with pattern as follows

mux = create_mux();
co_await reader.consume(mux...)

With the new changes, we want multiplexer to multiplex across scheduling
iterations and release resouces inbetween. This commit makes changes to
the API support this port. The new pattern would look something like
this..

mux = create_mux();
mux.multiplex(reader1...)
mux.flush_writers(); // optional
mux.multiplex(reader2..)
mux.flush_writers(); // optional
...
...

result = co_await std::move(mux).finish();

The ability to temporarily flush all the intermediate state and
multiplex across multiple readers enables porting to the new scheduler
API.
Make the task long running to support batching of data across multiple
iterations of scheduler
@bharathv
Copy link
Contributor Author

/dt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants