Skip to content

feat: trim fragment reuse index after remapping #3911

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 3, 2025

Conversation

jackye1995
Copy link
Contributor

Closes #3836

@jackye1995 jackye1995 marked this pull request as draft May 30, 2025 07:43
@github-actions github-actions bot added the enhancement New feature or request label May 30, 2025
@codecov-commenter
Copy link

codecov-commenter commented May 30, 2025

Codecov Report

Attention: Patch coverage is 81.97674% with 31 lines in your changes missing coverage. Please review.

Project coverage is 78.25%. Comparing base (72f9a9b) to head (6bb53b4).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
rust/lance/src/dataset/index/frag_reuse.rs 81.59% 22 Missing and 8 partials ⚠️
rust/lance-index/src/lib.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3911      +/-   ##
==========================================
+ Coverage   78.24%   78.25%   +0.01%     
==========================================
  Files         283      284       +1     
  Lines      109206   109376     +170     
  Branches   109206   109376     +170     
==========================================
+ Hits        85450    85595     +145     
- Misses      20388    20408      +20     
- Partials     3368     3373       +5     
Flag Coverage Δ
unittests 78.25% <81.97%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jackye1995 jackye1995 marked this pull request as ready for review June 2, 2025 04:57
build_frag_reuse_index_metadata(dataset, frag_reuse_index_details, fragment_bitmaps)
.await?;

let transaction = Transaction::new(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that this PR does not include the conflict resolution logic when there is a CreateIndex transaction vs a Rewrite transaction that both tries to create a new frag reuse index. That is handled in the next PR.

Copy link
Contributor

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Do you think we will eventually tie this in with cleanup_old_versions or will it remain a separate cleanup function even up at the user level?

Comment on lines +29 to +33
let indices = dataset.load_indices().await?;
let frag_reuse_index_meta = match indices.iter().find(|idx| idx.name == FRAG_REUSE_INDEX_NAME) {
None => return Ok(()),
Some(idx) => idx,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a load_indices_by_name method. Maybe we should convert it to load_index_by_name and then you can use it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I originally used that but ended up doing this because that one returns a list and I still need to check empty or select the first one. I think it is definitely worth adding a load_index_by_name which can remove a lot of similar code like this even for other indices.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should even start considering name to be unique. We have APIs like drop_index which take the index name as a parameter and sort of imply that a unique index name is already the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The APIs are a bit inconsistent at this point, for example drop_index will drop all the indices sharing the same name, so technically drop_indices(&mut self, name: &str) is a more correct name:

let transaction = Transaction::new(
            self.manifest.version,
            Operation::CreateIndex {
                new_indices: vec![],
                removed_indices: indices.clone(),
            },
            /*blobs_op= */ None,
            None,
        );

But in prewarm_index, it assumes there is only 1 index per name and only warms the first index of that name:

let index = self
            .open_generic_index(name, &indices[0].uuid.to_string(), &NoOpMetricsCollector)
            .await?;
        index.prewarm().await?;

Maybe we need to just re-examine all of them to make them at least consistent. But my understanding is that there could be delta indices that share the same name but are different UUIDs, so we cannot really enforce unique index name?


let mut retained_versions = Vec::new();
let mut fragment_bitmaps = RoaringBitmap::new();
for version in frag_reuse_details.versions.iter() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep track of whether any versions were cleaned up? If nothing is cleaned up we could probably return early instead of proceeding with a transaction?

}
None => {
warn!(
"Index {} ({}) missing fragment bitmap, cannot determine if it is caught up with the fragment reuse version, consider it caught up to unblock cleanup",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Index {} ({}) missing fragment bitmap, cannot determine if it is caught up with the fragment reuse version, consider it caught up to unblock cleanup",
"Index {} ({}) missing fragment bitmap, cannot determine if it is caught up with the fragment reuse version, consider retraining the index",

We should maybe suggest a retrain here so users have a way to remove the warning.

Comment on lines +106 to +130
if old_frag_in_index != frag_reuse_version.old_frags.len() {
// This should never happen because we always commit a full rewrite group
// and we always reindex either the entire group or nothing.
// We use invalid input to be consistent with
// dataset::transaction::recalculate_fragment_bitmap
return Err(Error::invalid_input(
format!("The compaction plan included a rewrite group that was a split of indexed and non-indexed data: {:?}",
frag_reuse_version.old_frags.iter().map(|frag| frag.id).collect::<Vec<_>>()),
location!()));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should be "return error" or just "log warning and skip"? Hopefully it just never happens, but if it did happen what would be the cleanup? Should the user just remove the frag reuse index at that point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah if this ever happens that means the reuse version is corrupted. I kept the logic here, but made the cleanup function remove the version when this error is triggered.

@jackye1995
Copy link
Contributor Author

Looks good. Do you think we will eventually tie this in with cleanup_old_versions or will it remain a separate cleanup function even up at the user level?

Yeah I was considering doing that, but I also felt it might be more flexible to keep them separated because you can cleanup this index much more eagerly as soon as you know you have updated all indices, compared to the old versions cleanup the user might not really run that often if they keep the default of 14 days.

@jackye1995 jackye1995 merged commit 70de37e into lancedb:main Jun 3, 2025
27 checks passed
jackye1995 added a commit that referenced this pull request Jun 3, 2025
Fix rebase error created by #3911 and #3920
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Remap Separation] Support trimming the row ID maps after all indexes have been remapped
3 participants