Implement PostgreSQL datafile delta storage and optimize overlay performance #14
Merged
- Remove unused `read_incremental_block()` from overlay.rs (~65 lines)
- Remove unused `writable` field from OpenHandle in fuse.rs
- Move `has_pending()` to #[cfg(test)] as it's only used in tests

- Add `server_version` field to BackupMetadata
- Add POSTGRES_VERSION_MIN/MAX constants (14-18)
- Add UnsupportedPostgresVersion error variant
- Validate PostgreSQL version during backup store validation
- Remove unused `uncompressed_bytes`, `wal_bytes` fields from ShowBackupJson
- Fix chain_resolution_tests to include server_version field
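The version check described in the second commit could look roughly like the sketch below. Only the constant names, the 14-18 range, and the `UnsupportedPostgresVersion` variant name come from the commit message. The `ValidationError` enum, the `InvalidVersion` variant, both function names, and the assumption that `server_version` is a string such as "16" or "16.2" are hypothetical and are not taken from the PR's code.

```rust
use std::fmt;

// Range taken from the commit message (14-18). Everything else in this
// sketch is hypothetical and may differ from the real implementation.
pub const POSTGRES_VERSION_MIN: u32 = 14;
pub const POSTGRES_VERSION_MAX: u32 = 18;

#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    /// The version string did not start with a parsable major number.
    InvalidVersion(String),
    /// Parsed fine, but the major version is outside MIN..=MAX.
    UnsupportedPostgresVersion(u32),
}

impl fmt::Display for ValidationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ValidationError::InvalidVersion(raw) => {
                write!(f, "cannot parse PostgreSQL version {raw:?}")
            }
            ValidationError::UnsupportedPostgresVersion(major) => write!(
                f,
                "PostgreSQL {major} is outside the supported range \
                 {POSTGRES_VERSION_MIN}-{POSTGRES_VERSION_MAX}"
            ),
        }
    }
}

impl std::error::Error for ValidationError {}

/// Extract the major version from strings such as "16", "16.2" or "17beta1"
/// by taking the leading run of ASCII digits.
pub fn parse_major(server_version: &str) -> Result<u32, ValidationError> {
    let trimmed = server_version.trim();
    let digits: String = trimmed
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits
        .parse::<u32>()
        .map_err(|_| ValidationError::InvalidVersion(server_version.to_string()))
}

/// Return the major version if it lies within the supported range.
pub fn validate_server_version(server_version: &str) -> Result<u32, ValidationError> {
    let major = parse_major(server_version)?;
    if (POSTGRES_VERSION_MIN..=POSTGRES_VERSION_MAX).contains(&major) {
        Ok(major)
    } else {
        Err(ValidationError::UnsupportedPostgresVersion(major))
    }
}
```

Rejecting an out-of-range version at store-validation time, rather than at first read, means a backup from an unsupported server fails early with a specific error instead of surfacing later as a page-layout problem.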
Summary
This PR implements page-level delta storage for PostgreSQL datafiles, integrates it into the overlay, and tightens several performance-critical paths used by pbkfs when serving large relations (e.g. pgbench_accounts). It also removes the experimental PBKFS_MATERIALIZE_ON_READ path, which was unstable under real workloads, in favour of a single well-tested read path.
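To make "page-level delta storage" concrete, here is a minimal sketch of the general idea: compare a base page with its new contents, keep only the changed byte runs when they are small, and fall back to storing the whole page otherwise. It is not the PR's `compute_delta` or its v2 encoding. `PageDelta`, `Segment`, `diff_page`, `apply_delta`, and the `max_patch_bytes` threshold are all hypothetical names chosen for illustration, and 8192 is PostgreSQL's default block size, not necessarily what a given cluster uses.

```rust
/// Default PostgreSQL block size (BLCKSZ); clusters may be built with another.
pub const PAGE_SIZE: usize = 8192;

/// One contiguous run of bytes that differs from the base page.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Segment {
    pub offset: usize,
    pub bytes: Vec<u8>,
}

/// Hypothetical per-page outcome of a diff.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PageDelta {
    /// New page is byte-identical to the base page.
    Unchanged,
    /// Only a few runs changed; store just those.
    Patch(Vec<Segment>),
    /// Too much changed; store the whole page.
    Full(Vec<u8>),
}

/// Diff `new` against `base`, returning a patch if the changed bytes fit
/// within `max_patch_bytes`, and a full page otherwise.
pub fn diff_page(base: &[u8], new: &[u8], max_patch_bytes: usize) -> PageDelta {
    assert_eq!(base.len(), new.len(), "pages must have equal length");
    let mut segments: Vec<Segment> = Vec::new();
    let mut i = 0;
    while i < new.len() {
        if base[i] == new[i] {
            i += 1;
            continue;
        }
        // Extend the run for as long as the bytes keep differing.
        let start = i;
        while i < new.len() && base[i] != new[i] {
            i += 1;
        }
        segments.push(Segment { offset: start, bytes: new[start..i].to_vec() });
    }
    if segments.is_empty() {
        return PageDelta::Unchanged;
    }
    let changed: usize = segments.iter().map(|s| s.bytes.len()).sum();
    if changed > max_patch_bytes {
        PageDelta::Full(new.to_vec())
    } else {
        PageDelta::Patch(segments)
    }
}

/// Rebuild the new page from the base page and a delta.
pub fn apply_delta(base: &[u8], delta: &PageDelta) -> Vec<u8> {
    match delta {
        PageDelta::Unchanged => base.to_vec(),
        PageDelta::Full(page) => page.clone(),
        PageDelta::Patch(segments) => {
            let mut page = base.to_vec();
            for seg in segments {
                let end = seg.offset + seg.bytes.len();
                page[seg.offset..end].copy_from_slice(&seg.bytes);
            }
            page
        }
    }
}
```

The patch-versus-full split in this sketch loosely corresponds to the `.patch` and `.full` files mentioned under Changes, though the actual on-disk layout and size thresholds are defined by the PR's code, not by this example.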
Changes
Overlay integration for PostgreSQL datafiles
- … `Overlay::read_datafile_delta` and integrate it into `read_block_data` so that: …
- … `Overlay::write_datafile_delta` to:
  - … `compute_delta`.
  - … `.full`, and shrink tails where possible.
- … `truncate_pg_datafile`, `rename_pg_datafile`, and `unlink_pg_datafile` to:
  - … `.patch`/`.full` and sparse diff files in sync with PostgreSQL truncates, renames, and unlinks.
Delta v2 implementation and tooling
- … (`DeltaDiff::PatchV2`) and corresponding decoding helpers.
- … `.patch`/`.full` sparse file layout with:
  - … `.full` file for whole-page writes.
- … `DeltaIndex` and `BlockBitmap` for thread-safe, per-file bitmap caching of PATCH/FULL/EMPTY slots.
- … `utils/analyze_pg_delta.py` to:
  - … `.patch`/`.full`.
  - … `diff_bytes`, v1-style patch length, and v2-encoded length.
Logical length and block index optimizations
- … `BlockIndexEntry` and `Overlay::pg_block_index` to: …
- … `BlockCacheEntry::logical_len`, with:
  - … (`record_write`) and truncates (`truncate_pg_datafile`).
  - … `logical_len()` for repeated calls against hot relations.
- … (`backup_content.control`) is loaded once per layer and used to bound logical length and compression.
Sparse diff and truncate behaviour
- `read_block_data` now probes for sparse regions and falls back to base layers instead of returning zeros.
- `truncate_pg_datafile(size == 0)` shadows the base relation with zeroed pages.
- `truncate_pg_datafile`: … `.patch` and `.full` when truncating to zero.
Removal of PBKFS_MATERIALIZE_ON_READ
- … `PBKFS_MATERIALIZE_ON_READ` flag from `OverlayInner` and all call sites.
Performance impact
On a large pgbench_accounts table (~1 GiB) backed by pg_probackup: