feat: self-describing blocks for index recovery #99
Storage blocks now carry their owning {hash,pos} key (v2 layout,
signatures 173/174), folded into the block checksum. This lets
compio_repair rebuild files from blocks alone when index nodes are
lost, instead of dumping unattributable orphan_*.bin.
- file: v2 read/write, checksum covers back-ref, meta_size_for(sig)
- allocator/reader/remove_file: version-aware on-disk footprint so
mixed v1/v2 archives deallocate correctly
- repair: re-attribute orphan v2 blocks via back-ref
- block read: verify block hash matches looked-up key (catches
index pointing at the wrong file)
- v1 blocks (171/172) still read unchanged (backward compatible)
Overhead: 16 bytes per block (~0.1% at 16KB block size).
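A minimal sketch of the version-aware footprint and the overhead figure above, not the project's actual code. The signature numbers and the 16-byte back-ref come from this commit message. `V1_META_SIZE` is a made-up placeholder, and this `meta_size_for` only mirrors the name used in the PR; its real signature and return values are not shown here.

```python
# Signatures named in the commit message.
V1_SIGNATURES = {171, 172}
V2_SIGNATURES = {173, 174}

BACKREF_SIZE = 16   # from the commit message: {hash,pos} adds 16 bytes per block
V1_META_SIZE = 8    # HYPOTHETICAL placeholder; the real v1 metadata size is not stated


def meta_size_for(sig: int) -> int:
    """Return the per-block metadata size for a given block signature."""
    if sig in V1_SIGNATURES:
        return V1_META_SIZE
    if sig in V2_SIGNATURES:
        return V1_META_SIZE + BACKREF_SIZE
    raise ValueError(f"unknown block signature: {sig}")


def on_disk_footprint(sig: int, stored_payload_size: int) -> int:
    """Bytes a block occupies on disk, so deallocation frees the right amount."""
    return meta_size_for(sig) + stored_payload_size


def overhead_fraction(block_size: int) -> float:
    """Relative cost of the back-ref for a given block size."""
    return BACKREF_SIZE / block_size
```

With a 16 KB block, `overhead_fraction(16 * 1024)` is 16 / 16384, roughly 0.098%, which matches the ~0.1% quoted above. Keying the footprint on the signature is what would let a mixed v1/v2 archive free the correct number of bytes for each block.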
All in all, it is probably not worth doing for now.
mozhaa
left a comment
The format change itself is fine. The fact that pos can be stale is a real concern, though there seems to be nothing we can do about it. It does not create any problems; it just does not fully solve the question of what to do when the index is lost. So this can be merged.
Blocks rebuilt purely from self-describing back-refs can carry stale absolute positions: lazy add_to_range shifts never rewrite evicted blocks, so their on-disk pos drifts while sort order stays intact. The logical stream is densely tiled, so the correct offset of each block is the running sum of preceding block sizes. When no surviving index node anchors a file, re-derive every position this way, recovering the file byte-for-byte regardless of drift. Partial-index files keep their true offsets untouched.

Co-authored-by: mozhaa <mozhay2005@gmail.com>
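The re-derivation described in this commit could be sketched roughly as follows. This is an illustration under assumptions, not the repair tool's code: the `Orphan` record and `rederive_positions` are hypothetical names, `size` is assumed to be each block's logical (uncompressed) length, and every file is assumed to start at offset 0 with no surviving index node to anchor it.

```python
from collections import defaultdict
from typing import NamedTuple


class Orphan(NamedTuple):
    """Hypothetical record for a v2 block recovered without its index node."""
    key_hash: int    # owning file hash, read from the back-ref
    stored_pos: int  # position from the back-ref; may be stale
    size: int        # assumed: logical length of the block's data


def rederive_positions(blocks):
    """Group blocks by file, order by stored pos, assign running-sum offsets."""
    by_file = defaultdict(list)
    for block in blocks:
        by_file[block.key_hash].append(block)

    placed = {}
    for key_hash, group in by_file.items():
        # Stale positions drift, but their relative order is still correct.
        group.sort(key=lambda b: b.stored_pos)
        offset = 0
        layout = []
        for block in group:
            layout.append((offset, block))
            offset += block.size  # dense tiling: next block starts where this ends
        placed[key_hash] = layout
    return placed
```

For a file whose blocks have sizes 100, 50, 70 and stored positions 0, 100, 250 (the last one drifted), the derived offsets are 0, 100, 150. Files that still have a partial index would skip this step and keep their recorded offsets, as the commit message says.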
Added a partial fix for the pos problem. But that is not the whole problem; this still remains: So it stays this way for now, and that part will need more thought later.
Problem
A storage_block carries no key. When a B-tree index node is lost/corrupted, its blocks become orphans: bytes recoverable, but file owner and position unknown. compio_repair dumps them as unordered orphan_*.bin. Full index reconstruction from blocks alone was impossible.
Change
Blocks become self-describing: each v2 block stores its owning {hash,pos} key (new signatures 173/174), folded into the block checksum.
- file: v2 read/write, checksum covers back-ref, meta_size_for(sig)
- allocator/reader/remove_file: version-aware on-disk footprint so mixed v1/v2 archives deallocate correctly
- repair: re-attribute orphan v2 blocks via back-ref instead of dumping orphan_*.bin
- block read: verify block hash matches looked-up key (catches index pointing at the wrong file)
- v1 blocks (171/172) still read unchanged, backward compatible
Cost
16 bytes per block (~0.1% at 16KB block size). No new syscalls; compression/checksum dominate CPU. No structural change.
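To make the self-describing layout concrete, here is a small sketch of a block whose checksum covers the back-ref and whose reader checks the owning hash against the looked-up key. The field widths (8-byte hash, 8-byte pos, chosen to total the 16 bytes quoted above), the field order, the use of CRC32, and both function names are assumptions for illustration; the PR does not specify the real on-disk format.

```python
import struct
import zlib
from typing import Optional, Tuple

V2_SIGNATURES = {173, 174}  # from the PR text

# ASSUMED layout: 1-byte signature, 8-byte hash, 8-byte pos, 4-byte CRC32, payload.
HEADER = struct.Struct("<BQQI")
BACKREF = struct.Struct("<QQ")


def encode_v2_block(sig: int, key_hash: int, pos: int, payload: bytes) -> bytes:
    """Serialize a block; the checksum covers the back-ref as well as the payload."""
    if sig not in V2_SIGNATURES:
        raise ValueError("not a v2 signature")
    checksum = zlib.crc32(BACKREF.pack(key_hash, pos) + payload)
    return HEADER.pack(sig, key_hash, pos, checksum) + payload


def decode_v2_block(raw: bytes,
                    expected_hash: Optional[int] = None) -> Tuple[int, int, bytes]:
    """Parse a block, verify its checksum, and optionally verify its owner."""
    sig, key_hash, pos, checksum = HEADER.unpack_from(raw)
    payload = raw[HEADER.size:]
    if sig not in V2_SIGNATURES:
        raise ValueError("not a v2 signature")
    if zlib.crc32(BACKREF.pack(key_hash, pos) + payload) != checksum:
        raise ValueError("checksum mismatch")
    # Catches an index entry that points at another file's block.
    if expected_hash is not None and key_hash != expected_hash:
        raise ValueError("block belongs to a different file")
    return key_hash, pos, payload
```

A normal read would pass the key it looked up as `expected_hash`. A repair pass would call `decode_v2_block(raw)` without it and use the returned hash and pos to attribute the block to a file. Because the back-ref is inside the checksummed region, a corrupted hash or pos is rejected rather than silently misattributing the block.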