Skip to content

Conversation

@teskje
Copy link
Contributor

@teskje teskje commented Aug 28, 2023

This PR makes merging of batches with empty batches efficient. Previously this operation would be handled like a regular merge, by allocating a new batch and merging the input batches into it. However, if one of the batches is empty, there is no need to allocate a new batch, we can simply extend the bounds of the existing non-empty batch accordingly.

The implementation of this optimization is complicated by the fact that there are Batch implementations for Rc<Batch> and Abomonated<Batch>. Both of these only provide an immutable reference to the underlying batch, making it impossible to adjust the batch description to extend the batch's bounds. This commit solves this by introducing new RcBatch and AbomonatedBatch wrappers that, in addition to the Rc/Abomonated batch, hold their own Descriptions that override the inner batch's description.

Some adjustments to logging of merge events is also necessary, to account for the new merge path.

Fixes https://github.com/MaterializeInc/database-issues/issues/6368.

teskje added 2 commits August 28, 2023 10:10
This commit makes merging of batches with empty batches efficient.
Previously this operation would be handled like a regular merge, by
allocating a new batch and merging the input batches into it. However,
if one of the batches is empty, there is no need to allocate a new
batch, we can simply extend the bounds of the existing non-empty batch
accordingly.

The implementation of this optimization is complicated by the fact that
there are `Batch` implementations for `Rc<Batch>` and
`Abomonated<Batch>`. Both of these only provide an immutable reference
to the underlying batch, making it impossible to adjust the batch
description to extend the batch's bounds. This commit solves this by
introducing new `RcBatch` and `AbomonatedBatch` wrappers that, in
addition to the `Rc`/`Abomonated` batch, hold their own `Description`s
that override the inner batch's description.
This commit makes sure completed merges are logged also for merges that
used the empty batch optimization.

Also, if we log the dropping of a merged batch, we shouldn't forget to
also log the merge event, so consumers don't become confused about the
whereabouts of the input batches.
@teskje
Copy link
Contributor Author

teskje commented Aug 28, 2023

Materialize CI is happy with this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant