[opt](merge-io) Implement adaptive merge window sizing for MergeRangeFileReader to prevent read amplification #57303

morningman · 2025-10-24T06:58:14Z

What problem does this PR solve?

Summary

Introduced adaptive logic to dynamically control merge window sizing in
MergeRangeFileReader, preventing severe read amplification in sparse data
scenarios while maintaining efficient merging for dense data patterns.

Problem

The original implementation used a fixed merge window (e.g., 8MB) which
worked well for dense columnar data but caused severe read amplification
with sparse ranges:

Example: Large Gap Scenario (3 ranges × 100KB with 600KB gaps)

User needs: 300KB total
If the original logic merged all three ranges into one read: 1500KB (300KB of data plus two 600KB gaps)
Overall amplification: 5.0x (1500KB / 300KB)
Problem: the read fetches 5x more data than needed
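
The arithmetic behind this example can be written out as a short worked equation. The symbols n (number of ranges), s (range size) and g (gap size) are introduced here only for illustration, and the formula assumes a single merged read that spans from the start of the first range to the end of the last one:

```latex
\text{physical} = n\,s + (n-1)\,g, \qquad
\text{amplification} = \frac{n\,s + (n-1)\,g}{n\,s}

% n = 3, s = 100\,\text{KB}, g = 600\,\text{KB}:
\text{physical} = 3 \cdot 100 + 2 \cdot 600 = 1500\,\text{KB}, \qquad
\text{amplification} = \frac{1500}{300} = 5.0
```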

Solution

Implemented a three-layer adaptive defense mechanism in read_at_impl():

1. Hard Gap Limit (max_single_gap = 512KB)
   Immediately rejects merging if a single gap exceeds 512KB, preventing
   catastrophic amplification from huge gaps.
2. Original Threshold (SMALL_IO = 2MB)
   Stops merging when accumulated data > 2MB and next gap ≥ 2MB, maintaining
   backward compatibility for typical use cases.
3. Predictive Gap Ratio Check (adaptive_shrink_threshold = 0.4)
   Key Innovation: Proactively checks if including the next gap would push
   the gap/content ratio above 40%. Stops merging BEFORE including problematic
   gaps, not after.
   - Only activates after accumulating ≥512KB content (min_content_for_adaptive)
   - Prevents over-conservative behavior with small initial ranges
   - Formula: if (hollow_size + next_gap) / content_size > 0.4 → STOP
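
The three checks above can be sketched as a single predicate. This is a minimal illustration, not the code in read_at_impl(): the function name should_stop_merging, its parameters, and the k-prefixed constant names are hypothetical, and the constant values are simply the ones quoted in this description. The first check uses `>=` to match the diff snippet quoted in the review below.

```cpp
#include <cstddef>

using std::size_t;

// Values quoted in the PR description; the real definitions may differ.
constexpr size_t KB = 1024;
constexpr size_t MB = 1024 * KB;
constexpr size_t kSmallIo = 2 * MB;                  // SMALL_IO
constexpr size_t kMaxSingleGap = 512 * KB;           // max_single_gap
constexpr size_t kMinContentForAdaptive = 512 * KB;  // min_content_for_adaptive
constexpr double kAdaptiveShrinkThreshold = 0.4;     // adaptive_shrink_threshold

// Hypothetical helper: decides whether to stop merging BEFORE `next_gap`
// (and the range behind it) is added to the current merge window.
//   content_size: requested bytes already in the window
//   hollow_size:  gap bytes already in the window
bool should_stop_merging(size_t content_size, size_t hollow_size, size_t next_gap) {
    // Layer 1: a single huge gap is never merged.
    if (next_gap >= kMaxSingleGap) {
        return true;
    }
    // Layer 2: the original threshold.
    if (content_size > kSmallIo && next_gap >= kSmallIo) {
        return true;
    }
    // Layer 3: predictive ratio check, active only once enough content exists.
    if (content_size >= kMinContentForAdaptive) {
        double ratio = static_cast<double>(hollow_size + next_gap) /
                       static_cast<double>(content_size);
        if (ratio > kAdaptiveShrinkThreshold) {
            return true;
        }
    }
    return false;
}
```

For example, with 560KB of content and 300KB of gaps already accumulated, a further 50KB gap gives a ratio of 350/560 = 0.625, so this sketch stops merging before that gap is included.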

Performance Results

Scenario 1: Sparse Gaps (15 ranges × 80KB with 50KB gaps)

Before (without adaptive logic):
- Physical IO: 1950KB (merges all ranges in 1 IO)
- User requests: 1200KB
- Overall amplification: 1.625x
- IO count: 1

After (with adaptive logic):
- Physical IO: 1800KB (partial merges in 3 IOs)
- User requests: 1200KB
- Overall amplification: 1.5x
- IO count: 3

Scenario 2: Dense Gaps (10 ranges × 100KB with 5KB gaps)

Before & After (identical behavior):
- Physical IO: 1045KB
- User requests: 1000KB
- Overall amplification: 1.045x
- IO count: 1

Scenario 3: Large Gaps (3 ranges × 100KB with 600KB gaps)

Before (if original logic merged):
- Physical IO: 1500KB
- User requests: 300KB
- Overall amplification: 5.0x (catastrophic!)
- IO count: 1

After (with max_single_gap=512KB):
- Physical IO: 300KB (each range separate)
- User requests: 300KB
- Overall amplification: 1.0x
- IO count: 3
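
The "After" figures can be approximated with a small standalone simulation. This is only a sketch under stated assumptions: Range, PlanStats, make_ranges and plan_merged_reads are hypothetical names that do not come from the PR, the constants are the values quoted in the Solution section, ranges are assumed sorted and non-overlapping, and the overall merge window cap is ignored because none of these scenarios comes close to it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using std::size_t;

struct Range { size_t start; size_t end; };  // half-open [start, end)
struct PlanStats { size_t io_count; size_t physical_bytes; size_t requested_bytes; };

// Values quoted in the PR description; the real definitions may differ.
constexpr size_t KB = 1024;
constexpr size_t kSmallIo = 2 * 1024 * KB;
constexpr size_t kMaxSingleGap = 512 * KB;
constexpr size_t kMinContentForAdaptive = 512 * KB;
constexpr double kAdaptiveShrinkThreshold = 0.4;

// Builds n ranges of `size` bytes separated by `gap` bytes.
std::vector<Range> make_ranges(size_t n, size_t size, size_t gap) {
    std::vector<Range> ranges;
    size_t offset = 0;
    for (size_t i = 0; i < n; ++i) {
        ranges.push_back({offset, offset + size});
        offset += size + gap;
    }
    return ranges;
}

// Greedily groups ranges into merged reads using the three checks.
PlanStats plan_merged_reads(const std::vector<Range>& ranges) {
    PlanStats stats{0, 0, 0};
    size_t i = 0;
    while (i < ranges.size()) {
        size_t content = ranges[i].end - ranges[i].start;
        size_t hollow = 0;
        size_t j = i;
        while (j + 1 < ranges.size()) {
            assert(ranges[j + 1].start >= ranges[j].end);  // sorted, no overlap
            size_t gap = ranges[j + 1].start - ranges[j].end;
            if (gap >= kMaxSingleGap) break;                       // layer 1
            if (content > kSmallIo && gap >= kSmallIo) break;      // layer 2
            if (content >= kMinContentForAdaptive &&
                static_cast<double>(hollow + gap) / static_cast<double>(content) >
                        kAdaptiveShrinkThreshold) {
                break;                                             // layer 3
            }
            hollow += gap;
            content += ranges[j + 1].end - ranges[j + 1].start;
            ++j;
        }
        stats.io_count += 1;
        stats.physical_bytes += content + hollow;
        stats.requested_bytes += content;
        i = j + 1;
    }
    return stats;
}
```

Under these assumptions, `plan_merged_reads(make_ranges(15, 80 * KB, 50 * KB))` groups the ranges as 7 + 7 + 1, which gives 3 IOs and 1800KB of physical IO, in line with Scenario 1. The same call with the Scenario 2 and Scenario 3 inputs gives 1 IO of 1045KB and 3 IOs totalling 300KB respectively.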

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

Thearas · 2025-10-24T06:58:21Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

morningman · 2025-10-24T09:51:49Z

run buildall

doris-robot · 2025-10-24T10:29:09Z

ClickBench: Total hot run time: 27.81 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 198a459c9f2605de577197aa6fc37035efe7f756, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.04	0.05
query2	0.10	0.06	0.06
query3	0.26	0.08	0.08
query4	1.61	0.12	0.12
query5	0.29	0.26	0.25
query6	1.20	0.66	0.66
query7	0.03	0.03	0.03
query8	0.06	0.05	0.05
query9	0.64	0.53	0.52
query10	0.59	0.59	0.59
query11	0.16	0.11	0.12
query12	0.16	0.13	0.13
query13	0.62	0.60	0.60
query14	1.01	1.03	1.02
query15	0.86	0.84	0.86
query16	0.40	0.41	0.39
query17	1.06	1.05	1.03
query18	0.23	0.20	0.21
query19	1.93	1.80	1.81
query20	0.01	0.01	0.01
query21	15.47	0.18	0.12
query22	5.18	0.07	0.04
query23	15.65	0.27	0.10
query24	3.13	0.54	0.59
query25	0.07	0.07	0.06
query26	0.14	0.12	0.13
query27	0.07	0.06	0.06
query28	4.46	1.13	0.94
query29	12.57	4.03	3.34
query30	0.28	0.14	0.11
query31	2.83	0.59	0.39
query32	3.25	0.56	0.46
query33	3.03	3.03	3.07
query34	15.73	5.23	4.53
query35	4.63	4.57	4.58
query36	0.68	0.51	0.49
query37	0.10	0.07	0.07
query38	0.07	0.04	0.04
query39	0.03	0.03	0.03
query40	0.17	0.14	0.14
query41	0.09	0.03	0.04
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 98.99 s
Total hot run time: 27.81 s

liutang123 · 2025-10-24T11:11:46Z

be/src/io/fs/buffered_reader.cpp

+            if (gap >= max_single_gap) {
+                break;
+            }
+


gap >= max_single_gap means gap >= SMALL_IO is always true

morningman added 7 commits October 23, 2025 17:16

1

a86ff26

2

a96c1e2

add origin test

9063d6f

change logic 1

70da7b4

change logic 2

986659e

change logic 3

35e5c8d

change logic 4 pass

fcbc77b

morningman requested review from dataroaring and gavinchou as code owners October 24, 2025 06:58

3

efa4604

morningman changed the title ~~[opt](merge-io) adaptive merge io~~ [opt](merge-io) Implement adaptive merge window sizing for MergeRangeFileReader to prevent read amplification Oct 24, 2025

morningman added dev/3.0.x dev/3.1.x dev/4.0.x labels Oct 24, 2025

morningman and others added 2 commits October 24, 2025 17:47

new

e0b665e

fix format

198a459

liutang123 reviewed Oct 24, 2025
