[ENH] Make roll dirty log always converge to coalesce everything. #4927
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
Make Dirty Log Rollup Deterministically Converge and Refactor Log Compaction Path
This PR overhauls the dirty log rollup and compaction mechanics in the Rust log service, making dirty log coalescence deterministic: after a single rollup, the dirty log always converges regardless of further write activity. The changes extend to multi-shard rollup concurrency, cache management for coalesced cursor witnesses, test improvements, and several minor refactors and parameter tunings.
This summary was automatically generated by @propel-code-bot
rust/log-service/src/lib.rs
Outdated
            {
                selected_rollups.push((*collection_id, *rollup));
            }
        }
    }
    // Then allocate the collection ID strings outside the lock.
    let mut all_collection_info = Vec::with_capacity(selected_rollups.len());
-   for (collection_id, rollup) in selected_rollups {
+   for (collection_id, rollup) in selected_rollups.into_iter().take(100) {
magic constant. Why do we take the first k (and in what order)?
I left this in by mistake.
e6ca1a1 to 27b7292
    let mut forget = HashSet::default();
    DirtyMarker::coalesce_markers(&records, &mut rollup.rollups, &mut forget)?;
    rollup.forget.extend(forget);
q: why not coalesce_markers(&records, &mut rollup.rollups, &mut rollup.forget), or maybe coalesce_markers(&records, &mut rollup)?
Forget needs to be maintained across futures. Will document.
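One possible reading of this reply, sketched under assumptions: the forget set has to outlive any single batch, because a purge seen in a later batch must still cancel a marker recorded by an earlier one. Everything below is hypothetical and for illustration only: `Marker`, `Rollup`, `coalesce`, and `roll_up` are stand-ins, not the log service's `DirtyMarker` or its `coalesce_markers` signature. The per-batch temporary set mirrors the shape of the hunk above.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a dirty-log record (not the real `DirtyMarker`).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Marker {
    /// The collection has unprocessed log entries up to `limit_offset`.
    Dirty { collection: u64, limit_offset: u64 },
    /// The collection should be dropped from the rollup entirely.
    Purge { collection: u64 },
}

/// Hypothetical accumulator that lives longer than any one batch.
#[derive(Debug, Default)]
struct Rollup {
    rollups: HashMap<u64, u64>,
    forget: HashSet<u64>,
}

/// Fold one batch into `rollups`, recording purged collections in `forget`.
fn coalesce(batch: &[Marker], rollups: &mut HashMap<u64, u64>, forget: &mut HashSet<u64>) {
    for marker in batch {
        match *marker {
            Marker::Dirty { collection, limit_offset } => {
                // Keep only the largest offset seen for each collection.
                let slot = rollups.entry(collection).or_insert(limit_offset);
                *slot = (*slot).max(limit_offset);
            }
            Marker::Purge { collection } => {
                forget.insert(collection);
            }
        }
    }
}

/// Process batches one at a time while keeping `forget` cumulative.
fn roll_up(batches: &[Vec<Marker>]) -> Rollup {
    let mut rollup = Rollup::default();
    for batch in batches {
        // Per-batch set, merged into the long-lived one afterwards.
        let mut forget = HashSet::new();
        coalesce(batch, &mut rollup.rollups, &mut forget);
        rollup.forget.extend(forget);
    }
    // Apply the accumulated forget set only after every batch has been seen.
    let forget = &rollup.forget;
    rollup.rollups.retain(|collection, _| !forget.contains(collection));
    rollup
}
```

In this sketch, a `Dirty` marker for a collection in the first batch followed by a `Purge` for it in the second leaves that collection out of the final `rollups`, which would not happen if the forget set were discarded after each batch.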
b414912 to 8695b78
The dirty log was not guaranteed to converge even in the absence of further writes. I confirmed this empirically with local testing and inspection of the dirty log. Post-patch it always converges within one rollup.
8695b78 to 4306673
otherwise it gets compacted quicker than you can see it
Description of changes
The dirty log was not guaranteed to converge even in the absence of
further writes. I confirmed this empirically with local testing and
inspection of the dirty log. Post-patch it always converges within one
rollup.
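As a toy model of what "converges within one rollup" could mean, the sketch below treats the dirty log as a list of (collection, offset) entries and a rollup as a pass that keeps one entry per collection. This is an assumption made for illustration, not the log service's algorithm; `Entry`, `roll_once`, and `has_converged` are hypothetical names. The property of interest is idempotence: with no further writes, rolling an already rolled log changes nothing.

```rust
use std::collections::BTreeMap;

/// Hypothetical dirty-log entry: (collection id, highest dirty log offset).
type Entry = (u64, u64);

/// Coalesce a dirty log to one entry per collection, keeping the largest offset.
/// A BTreeMap makes the output order deterministic (sorted by collection id).
fn roll_once(log: &[Entry]) -> Vec<Entry> {
    let mut merged: BTreeMap<u64, u64> = BTreeMap::new();
    for &(collection, offset) in log {
        let slot = merged.entry(collection).or_insert(offset);
        *slot = (*slot).max(offset);
    }
    merged.into_iter().collect()
}

/// A log has converged when rolling it again would change nothing.
fn has_converged(log: &[Entry]) -> bool {
    roll_once(log).as_slice() == log
}
```

Under this model, a check such as `has_converged(&roll_once(&log))` holds for any input, which is the kind of invariant a regression test for the behavior described above might assert.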
Test plan
CI
pytest for python, yarn test for js, cargo nextest for rust
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?