Reject embedding dims where dim*nbits isn't a multiple of 8 (instead of panicking)#111
Closed
raphaelsty wants to merge 1 commit into
Closed
Reject embedding dims where dim*nbits isn't a multiple of 8 (instead of panicking)#111raphaelsty wants to merge 1 commit into
raphaelsty wants to merge 1 commit into
Conversation
`quantize_residuals` packs exactly `8 / nbits` dimensions per byte and the
decompress path reads whole bytes, so the codec only supports embedding
dimensions where `dim * nbits` is a multiple of 8 (always true for real
ColBERT dims like 96/128). For other dims, `packed_dim = dim * nbits / 8`
floor-divided below the number of bits actually written, so the packing
loop wrote past the end of the row and panicked with an out-of-bounds
index — deep inside a rayon worker, crashing the indexing task (observed
via the API as a panicked update batch on 3-dim toy embeddings).
Validate the constraint up front and return a descriptive `Error::Codec`
instead. The API now surfaces a clean error ("unsupported embedding
dimension 3 for nbits=4: dim * nbits (12) must be a multiple of 8") and
the worker stays healthy, rather than panicking.
Adds tests for the rejected (dim=3) and accepted byte-aligned (dim=2) cases.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Found while testing #86: posting documents with embeddings whose dimension makes
dim * nbitsnot a multiple of 8 (e.g. a 3-dim toy embedding withnbits=4) crashed the indexing worker with a panic:ResidualCodec::quantize_residualscomputespacked_dim = dim * nbits / 8(floor division) but the packing loop writesdim * nbitsbits =ceil(dim*nbits/8)bytes. Whendim * nbitsisn't a multiple of 8,packed_dimis one byte too small and the loop indexes past the end of the row. The bit-packing format stores exactly8 / nbitsdimensions per byte (anddecompressreads whole bytes), so the codec fundamentally only supports dims wheredim * nbits % 8 == 0— always true for real ColBERT dims (96, 128), but a panic for arbitrary inputs.The panic happened deep inside a rayon worker, taking down the indexing task. Via the API (post-#104) it showed up as a panicked update batch rather than a normal error.
Fix
Validate the constraint at the top of
quantize_residualsand return a descriptiveError::Codecinstead of writing out of bounds:This is a defensive hardening — it does not change the on-disk format or behavior for supported dims. (Supporting arbitrary dims would require a packing/unpacking format rewrite for partial trailing bytes, which isn't needed for real models.)
Verification
quantize_rejects_unaligned_dim_instead_of_panicking(dim=3 →Err, no panic) andquantize_accepts_byte_aligned_dim(dim=2, since2*4=8is valid even though 2 isn't a multiple of 8 — the guard checksdim * nbits, notdim)./updatenow logsupdate.batch.failed … Codec error: unsupported embedding dimension 3 …with 0 panics, and the server stays healthy (and the failure surfaces in/healthvia Make index/centroids updates crash-safe and expose update progress in health #104's failed-status tracking).make ci-quickgreen (fmt, clippy-D warnings, tests).