[CORE-8161] storage: add min.cleanable.dirty.ratio, schedule compaction by dirty_ratio #24991

Open
wants to merge 12 commits into
base: dev

WillemKauf
Contributor

@WillemKauf WillemKauf commented Jan 31, 2025

Based on PR #24649.

WIP, tests to be added.

Instead of unconditionally compacting every log during a round of housekeeping, the log_manager can now optionally schedule log compaction according to the cluster property min_cleanable_dirty_ratio or the topic property min.cleanable.dirty.ratio.

As mentioned in the above PR,

The dirty ratio of a log is defined as the ratio between the number of bytes in "dirty" segments and the total number of bytes in closed segments.
Dirty segments are closed segments which have not yet been cleanly compacted, i.e., duplicates for keys in this segment could be found in the prefix of the log up to this segment.
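
As a rough illustration of the definition above, the ratio could be computed as follows. This is a minimal sketch, not the code in this PR: `closed_segment` and its fields are hypothetical stand-ins for the real segment type, and returning 0.0 for a log with no closed bytes is an assumption.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a closed segment (not Redpanda's type).
struct closed_segment {
    uint64_t size_bytes;
    bool cleanly_compacted;
};

// dirty ratio = bytes in dirty closed segments / bytes in all closed segments.
// Assumption: a log with no closed bytes is reported as 0.0 to avoid
// dividing by zero.
double dirty_ratio(const std::vector<closed_segment>& segments) {
    uint64_t dirty_bytes = 0;
    uint64_t total_bytes = 0;
    for (const auto& s : segments) {
        total_bytes += s.size_bytes;
        if (!s.cleanly_compacted) {
            dirty_bytes += s.size_bytes;
        }
    }
    if (total_bytes == 0) {
        return 0.0;
    }
    return static_cast<double>(dirty_bytes) / static_cast<double>(total_bytes);
}
```

For example, one cleanly compacted 100-byte segment followed by one dirty 300-byte segment gives 300 / 400 = 0.75.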

By setting the min.cleanable.dirty.ratio on a per topic basis, users can avoid unnecessary read/write amplification during compaction as the log grows in size.

A housekeeping scan will still be performed every log_compaction_interval_ms, and the log's dirty_ratio will be tested against min.cleanable.dirty.ratio to determine its eligibility for compaction. Additionally, logs are now compacted in descending order of dirty ratio, which offers a better "bang for buck" heuristic for compaction scheduling.
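
The scheduling described above can be sketched roughly as a filter followed by a sort. Everything here is illustrative: `log_candidate` and `schedule_compaction` are hypothetical names, a single threshold stands in for the per-topic setting, and treating a ratio equal to the threshold as eligible is an assumption.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record of one log seen during a housekeeping scan.
struct log_candidate {
    std::string ntp;
    double dirty_ratio;
};

// Keep only logs whose dirty ratio meets the threshold, then order them
// from dirtiest to cleanest so the most rewarding compactions run first.
std::vector<log_candidate> schedule_compaction(
  std::vector<log_candidate> logs, double min_cleanable_dirty_ratio) {
    logs.erase(
      std::remove_if(
        logs.begin(),
        logs.end(),
        [min_cleanable_dirty_ratio](const log_candidate& l) {
            // Assumption: ratio == threshold counts as eligible.
            return l.dirty_ratio < min_cleanable_dirty_ratio;
        }),
      logs.end());
    std::stable_sort(
      logs.begin(),
      logs.end(),
      [](const log_candidate& a, const log_candidate& b) {
          return a.dirty_ratio > b.dirty_ratio;
      });
    return logs;
}
```

With candidates at 0.2, 0.9 and 0.5 and a threshold of 0.5, the 0.2 log is skipped and the remaining two are compacted in the order 0.9, 0.5.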

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

Improvements

  • Allow for optional scheduling of compaction via min_cleanable_dirty_ratio

@WillemKauf WillemKauf requested a review from a team as a code owner January 31, 2025 17:48
@WillemKauf WillemKauf force-pushed the min_cleanable_dirty_ratio branch 2 times, most recently from db600d4 to 78b2928 Compare January 31, 2025 21:02
@vbotbuildovich
Collaborator

vbotbuildovich commented Feb 1, 2025

Retry command for Build#61464

please wait until all jobs are finished before running the slash command


/ci-repeat 1
tests/rptest/tests/describe_topics_test.py::DescribeTopicsTest.test_describe_topics_with_documentation_and_types
tests/rptest/tests/partition_force_reconfiguration_test.py::NodeWiseRecoveryTest.test_node_wise_recovery@{"dead_node_count":2}

@vbotbuildovich
Collaborator

vbotbuildovich commented Feb 1, 2025

CI test results

test results on build#61464
test_id test_kind job_url test_status passed
gtest_raft_rpunit.gtest_raft_rpunit unit https://buildkite.com/redpanda/redpanda/builds/61464#0194be30-4fe3-4201-a506-11eca43a36e9 FLAKY 1/2
kafka_server_rpfixture.kafka_server_rpfixture unit https://buildkite.com/redpanda/redpanda/builds/61464#0194be30-4fe3-4201-a506-11eca43a36e9 FAIL 0/2
rptest.tests.cloud_storage_scrubber_test.CloudStorageScrubberTest.test_scrubber.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/61464#0194be87-f8fb-4fbd-9207-0641084924fe FLAKY 1/2
rptest.tests.describe_topics_test.DescribeTopicsTest.test_describe_topics_with_documentation_and_types ducktape https://buildkite.com/redpanda/redpanda/builds/61464#0194be87-f8f9-4197-b75d-9e2b0e669087 FAIL 0/20
rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/61464#0194be87-f8fb-4fbd-9207-0641084924fe FLAKY 1/2
rptest.tests.partition_force_reconfiguration_test.NodeWiseRecoveryTest.test_node_wise_recovery.dead_node_count=2 ducktape https://buildkite.com/redpanda/redpanda/builds/61464#0194be87-f8fb-4fbd-9207-0641084924fe FAIL 0/20
rptest.tests.retention_policy_test.RetentionPolicyTest.test_changing_topic_retention_with_restart.cloud_storage_type=CloudStorageType.ABS ducktape https://buildkite.com/redpanda/redpanda/builds/61464#0194be87-f8f9-4197-b75d-9e2b0e669087 FLAKY 1/3
test_compat_rpunit.test_compat_rpunit unit https://buildkite.com/redpanda/redpanda/builds/61464#0194be2d-b364-4273-8107-604c1a24978e FAIL 0/2
test results on build#61480
test_id test_kind job_url test_status passed
gtest_raft_rpunit.gtest_raft_rpunit unit https://buildkite.com/redpanda/redpanda/builds/61480#0194c30c-66c7-425f-882b-a62f163db808 FLAKY 1/2
gtest_raft_rpunit.gtest_raft_rpunit unit https://buildkite.com/redpanda/redpanda/builds/61480#0194c30c-66c8-4db9-a2da-7dc5f0262638 FLAKY 1/2
rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/61480#0194c368-0af7-4d62-86e0-9ef129bfeb88 FLAKY 1/2
rptest.tests.offset_for_leader_epoch_archival_test.OffsetForLeaderEpochArchivalTest.test_querying_archive ducktape https://buildkite.com/redpanda/redpanda/builds/61480#0194c368-0af6-4ff1-92d6-ad77df6dc1f7 FLAKY 1/5
rptest.tests.partition_force_reconfiguration_test.NodeWiseRecoveryTest.test_node_wise_recovery.dead_node_count=2 ducktape https://buildkite.com/redpanda/redpanda/builds/61480#0194c368-0af7-4d62-86e0-9ef129bfeb88 FAIL 0/20
test results on build#61769
test_id test_kind job_url test_status passed
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61769#0194ed98-e02d-49fb-815b-8a92ef12fe4a FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.delete.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/61769#0194ed9d-c07e-41c7-ad37-873a07105423 FLAKY 1/2
rptest.tests.mirror_maker_test.TestMirrorMakerService.test_consumer_group_mirroring.source_type=kafka ducktape https://buildkite.com/redpanda/redpanda/builds/61769#0194ed98-e02c-45df-bba3-2bf6f621197c FAIL 0/20
rptest.tests.partition_movement_test.PartitionMovementTest.test_availability_when_one_node_down ducktape https://buildkite.com/redpanda/redpanda/builds/61769#0194ed98-e02c-45df-bba3-2bf6f621197c FLAKY 1/2
test results on build#61775
test_id test_kind job_url test_status passed
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.delete.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/61775#0194effa-4c71-49e2-a5c3-06cce4574f93 FLAKY 1/2
rptest.tests.mirror_maker_test.TestMirrorMakerService.test_consumer_group_mirroring.source_type=kafka ducktape https://buildkite.com/redpanda/redpanda/builds/61775#0194eff5-6746-4bf1-aa34-e78dd6b67522 FAIL 0/20
rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic ducktape https://buildkite.com/redpanda/redpanda/builds/61775#0194effa-4c73-4464-9bfb-d54df885c3c0 FLAKY 1/2
test_compat_rpunit.test_compat_rpunit unit https://buildkite.com/redpanda/redpanda/builds/61775#0194efb0-cf5d-48a4-9e93-707da0399798 FLAKY 1/2
test results on build#61791
test_id test_kind job_url test_status passed
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61791#0194f362-3879-4838-b35b-39ecd5051d85 FLAKY 1/2
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61791#0194f362-387a-4e5d-9f71-62729ec6a9da FLAKY 1/4
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.delete.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/61791#0194f362-387b-41cd-954d-e30904bab43d FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=10 ducktape https://buildkite.com/redpanda/redpanda/builds/61791#0194f362-387b-41cd-954d-e30904bab43d FLAKY 1/4
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/61791#0194f362-3879-4838-b35b-39ecd5051d85 FLAKY 1/2
test results on build#61800
test_id test_kind job_url test_status passed
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61800#0194f54b-039e-4ea0-858f-4f25d07def12 FLAKY 1/2
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_HADOOP ducktape https://buildkite.com/redpanda/redpanda/builds/61800#0194f539-0f33-48da-a1a6-50bb0c690fe9 FLAKY 1/3
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61800#0194f539-0f34-4233-b49f-a1d8f5323962 FLAKY 1/3
rptest.tests.datalake.simple_connect_test.RedpandaConnectIcebergTest.test_translating_avro_serialized_records.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/61800#0194f54b-039e-4ea0-858f-4f25d07def12 FLAKY 1/2
rptest.tests.delete_records_test.DeleteRecordsTest.test_delete_records_concurrent_truncations.cloud_storage_enabled=True.truncate_point=start_offset ducktape https://buildkite.com/redpanda/redpanda/builds/61800#0194f54b-039c-47ca-bd97-e67be6e3dac2 FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/61800#0194f539-0f33-48da-a1a6-50bb0c690fe9 FLAKY 1/2
test results on build#61803
test_id test_kind job_url test_status passed
rptest.tests.cloud_storage_timing_stress_test.CloudStorageTimingStressTest.test_cloud_storage_with_partition_moves.cleanup_policy=compact.delete ducktape https://buildkite.com/redpanda/redpanda/builds/61803#0194f69c-0782-4a80-b2b3-956b1e980961 FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.delete.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=10 ducktape https://buildkite.com/redpanda/redpanda/builds/61803#0194f69c-0780-4965-bfd9-81ca3854161e FLAKY 1/2
rptest.tests.log_compaction_test.LogCompactionTest.compaction_stress_test.cleanup_policy=compact.delete.key_set_cardinality=1000.storage_compaction_key_map_memory_kb=3 ducktape https://buildkite.com/redpanda/redpanda/builds/61803#0194f69c-0781-4818-a649-7df74eb7ce0f FLAKY 1/2
rptest.tests.partition_force_reconfiguration_test.NodeWiseRecoveryTest.test_node_wise_recovery.dead_node_count=2 ducktape https://buildkite.com/redpanda/redpanda/builds/61803#0194f697-41c0-46d0-a978-c550d40c25dd FLAKY 1/2
rptest.tests.partition_movement_test.SIPartitionMovementTest.test_cross_shard.num_to_upgrade=0.cloud_storage_type=CloudStorageType.ABS ducktape https://buildkite.com/redpanda/redpanda/builds/61803#0194f69c-0780-4965-bfd9-81ca3854161e FLAKY 1/2
rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic ducktape https://buildkite.com/redpanda/redpanda/builds/61803#0194f69c-0780-4965-bfd9-81ca3854161e FLAKY 1/2

@WillemKauf WillemKauf force-pushed the min_cleanable_dirty_ratio branch from 78b2928 to 7a6aab3 Compare February 1, 2025 19:43
@WillemKauf WillemKauf changed the title storage: schedule compaction by dirty_ratio storage: add min.cleanable.dirty.ratio, schedule compaction by dirty_ratio Feb 1, 2025
@WillemKauf WillemKauf changed the title storage: add min.cleanable.dirty.ratio, schedule compaction by dirty_ratio [CORE-8161] storage: add min.cleanable.dirty.ratio, schedule compaction by dirty_ratio Feb 1, 2025
@vbotbuildovich
Collaborator

Retry command for Build#61480

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/partition_force_reconfiguration_test.py::NodeWiseRecoveryTest.test_node_wise_recovery@{"dead_node_count":2}

src/v/storage/probe.cc (outdated review thread, resolved)
src/v/cluster/metadata_cache.cc (review thread, resolved)
src/v/storage/log_manager.cc (review thread, resolved)
@WillemKauf WillemKauf force-pushed the min_cleanable_dirty_ratio branch from 7a6aab3 to 8b2ac50 Compare February 8, 2025 10:17
@WillemKauf
Contributor Author

Force push to:

@WillemKauf WillemKauf force-pushed the min_cleanable_dirty_ratio branch from 8b2ac50 to 0110d6c Compare February 10, 2025 00:45
@WillemKauf
Contributor Author

Force push to:

  • Use map of ntp_to_compaction_heuristic to sort _logs_list in log_manager.cc for compaction ordering
  • Add compaction_scheduling storage fixture test

@vbotbuildovich
Collaborator

Retry command for Build#61769

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/mirror_maker_test.py::TestMirrorMakerService.test_consumer_group_mirroring@{"source_type":"kafka"}

@WillemKauf WillemKauf force-pushed the min_cleanable_dirty_ratio branch from 0110d6c to e99aa91 Compare February 10, 2025 11:47
@WillemKauf
Contributor Author

Force push to:

  • Make recommended docs changes
  • Add log_housekeeping_meta::bitflags::compaction_checked and use it in main compacting loop in log_manager.cc

@vbotbuildovich
Collaborator

Retry command for Build#61775

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/mirror_maker_test.py::TestMirrorMakerService.test_consumer_group_mirroring@{"source_type":"kafka"}

@WillemKauf
Contributor Author

WillemKauf commented Feb 10, 2025

test_compat is failing, and it's a little strange. Every so often we get a somewhat truncated double back from serde:

Expected: min_cleanable_dirty_ratio: {0.9553594929310676}

Decoded: min_cleanable_dirty_ratio: {0.9553594929310675}

This is the only diff between expected and decoded in the serde test.

Do we have a serde bug with double-precision floating-point values?
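
For context, one general way a double can change across an encode/decode cycle is a round trip through decimal text with too few significant digits. The sketch below only illustrates that effect; it does not diagnose the failure above, and nothing in this thread establishes that the test encodes through text this way. `to_text` and `round_trips` are hypothetical helpers.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Format a double with a fixed number of significant digits.
std::string to_text(double v, int significant_digits) {
    std::ostringstream os;
    os << std::setprecision(significant_digits) << v;
    return os.str();
}

// True if parsing the formatted text yields exactly the original double.
bool round_trips(double v, int significant_digits) {
    return std::stod(to_text(v, significant_digits)) == v;
}
```

With `std::numeric_limits<double>::max_digits10` (17) significant digits, a correctly rounding formatter and parser reproduce every finite double exactly. With fewer digits, some values come back as a different double: `0.1 + 0.2`, for instance, does not survive a 15-digit round trip.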

@WillemKauf WillemKauf force-pushed the min_cleanable_dirty_ratio branch from e99aa91 to 1df39dd Compare February 11, 2025 03:37
@WillemKauf WillemKauf force-pushed the min_cleanable_dirty_ratio branch from 1df39dd to 70c37b7 Compare February 11, 2025 12:14
@WillemKauf
Contributor Author

Force push to:

  • Add min_cleanable_dirty_ratio_validator to kafka config utils
  • Add test_min_cleanable_dirty_ratio_validation ducktape test
  • Represent the min_cleanable_dirty_ratio cluster property as a decimal rather than a percentage, to align with Kafka
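
A validator of this kind might look roughly like the sketch below. It is not the `min_cleanable_dirty_ratio_validator` added in this PR: `parse_dirty_ratio` is a hypothetical name, and the accepted range of 0 to 1 inclusive is taken from Kafka's documented range for min.cleanable.dirty.ratio.

```cpp
#include <cstddef>
#include <exception>
#include <optional>
#include <string>

// Parse a ratio given as decimal text. Returns the value if the whole
// string is a number in [0, 1], and std::nullopt otherwise.
std::optional<double> parse_dirty_ratio(const std::string& text) {
    try {
        std::size_t consumed = 0;
        const double value = std::stod(text, &consumed);
        if (consumed != text.size()) {
            return std::nullopt; // trailing characters, e.g. "0.5x"
        }
        // Written this way so that NaN is rejected as well.
        if (!(value >= 0.0 && value <= 1.0)) {
            return std::nullopt;
        }
        return value;
    } catch (const std::exception&) {
        // std::stod throws on non-numeric or out-of-range input.
        return std::nullopt;
    }
}
```

Under this sketch "0.5" is accepted, while "50" (a percentage), "-0.1" and "abc" are rejected.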

These call sites were previously hardcoded to `int64_t` for the
type used in a `boost::lexical_cast`.

With the advent of the first `tristate<double>`, this is no longer
a valid default.

Use a conditional type in order to account for the possibility of
a floating point tristate value conversion.
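
The idea of a conditional conversion type can be sketched as below. This is an illustration under assumptions, not the change itself: `parse_type` and `parse_value` are hypothetical names, and a `std::istringstream` stands in for `boost::lexical_cast` so that the example needs only the standard library.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <type_traits>

// Choose the type to parse into: double for floating-point values,
// int64_t (the previous hardcoded choice) for everything else.
template<typename T>
using parse_type
  = std::conditional_t<std::is_floating_point_v<T>, double, int64_t>;

// Stand-in for a lexical cast: parse the whole string or throw.
template<typename T>
T parse_value(const std::string& text) {
    parse_type<T> parsed{};
    std::istringstream is(text);
    is >> parsed;
    if (is.fail() || !is.eof()) {
        throw std::invalid_argument("cannot parse: " + text);
    }
    return static_cast<T>(parsed);
}
```

Here `parse_value<double>("0.5")` parses as a double, while `parse_value<int64_t>("0.5")` throws, which is the failure a hardcoded `int64_t` would produce for a floating-point property.
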

With some minor modifications in `kafka_cli_tools.py` and `types.py`.

The housekeeping loop will no longer compact every partition indiscriminately,
but instead evaluate which partitions are worth compacting via a heuristic.

Add a new `bitflag` to `log_housekeeping_meta::bitflags` that indicates
the meta has been evaluated for compaction (but does not indicate that it
has been compacted).
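
The distinction between "checked" and "compacted" can be sketched like this. Only the flag name `compaction_checked` comes from this PR; the `compacted` flag, the numeric values, `housekeeping_meta`, and `evaluate_for_compaction` are hypothetical and exist only for illustration.

```cpp
#include <cstdint>

// Hypothetical flag set; values and the `compacted` flag are assumptions.
enum class bitflags : uint32_t {
    none = 0,
    compacted = 1u << 0,
    compaction_checked = 1u << 1,
};

constexpr bitflags operator|(bitflags a, bitflags b) {
    return static_cast<bitflags>(
      static_cast<uint32_t>(a) | static_cast<uint32_t>(b));
}

constexpr bitflags operator&(bitflags a, bitflags b) {
    return static_cast<bitflags>(
      static_cast<uint32_t>(a) & static_cast<uint32_t>(b));
}

constexpr bool has_flag(bitflags set, bitflags flag) {
    return (set & flag) == flag;
}

struct housekeeping_meta {
    bitflags flags = bitflags::none;
};

// Every evaluated log is marked as checked; only logs that pass the
// heuristic are marked as compacted.
void evaluate_for_compaction(
  housekeeping_meta& meta, double dirty_ratio, double min_ratio) {
    meta.flags = meta.flags | bitflags::compaction_checked;
    if (dirty_ratio >= min_ratio) {
        // The actual compaction would run here.
        meta.flags = meta.flags | bitflags::compacted;
    }
}
```

A log that falls below the threshold ends the round with `compaction_checked` set and `compacted` clear, so the loop can tell it has already been considered without treating it as compacted.
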
@WillemKauf WillemKauf force-pushed the min_cleanable_dirty_ratio branch from 70c37b7 to 1af7ac0 Compare February 11, 2025 18:15
4 participants