Reject embedding dims where dim*nbits isn't a multiple of 8 (instead of panicking) #111

Closed
raphaelsty wants to merge 1 commit into main from fix/codec-reject-unaligned-dim

@raphaelsty
Collaborator

Problem

Found while testing #86: posting documents with embeddings whose dimension makes dim * nbits not a multiple of 8 (e.g. a 3-dim toy embedding with nbits=4) crashed the indexing worker with a panic:

thread '<unnamed>' panicked at next-plaid/src/codec.rs:393:
index out of bounds: the len is 1 but the index is 1

ResidualCodec::quantize_residuals computes packed_dim = dim * nbits / 8 (floor division), but the packing loop writes dim * nbits bits, which occupy ceil(dim * nbits / 8) bytes. When dim * nbits isn't a multiple of 8, packed_dim is one byte too small and the loop indexes past the end of the row. The bit-packing format stores exactly 8 / nbits dimensions per byte (and decompress reads whole bytes), so the codec fundamentally only supports dims where dim * nbits % 8 == 0. That always holds for real ColBERT dims (96, 128), but arbitrary inputs could trigger the panic.
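
To make the size mismatch concrete, here is a minimal arithmetic sketch. It is not the code from codec.rs; the helper names `floor_packed_bytes` and `bytes_touched` are hypothetical and only mirror the two quantities described above.

```rust
/// Row size in bytes when computed with floor division,
/// i.e. `packed_dim = dim * nbits / 8` as described above.
fn floor_packed_bytes(dim: usize, nbits: usize) -> usize {
    dim * nbits / 8
}

/// Number of bytes touched when `dim * nbits` bits are written
/// contiguously, i.e. the ceiling of `dim * nbits / 8`.
fn bytes_touched(dim: usize, nbits: usize) -> usize {
    (dim * nbits + 7) / 8
}
```

For dim=3 and nbits=4 the row is allocated with 1 byte while the loop needs 2, which matches the reported "the len is 1 but the index is 1". For dim=128 and nbits=4 both values are 64, so aligned dims are unaffected.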

The panic happened deep inside a rayon worker, taking down the indexing task. Via the API (post-#104) it showed up as a panicked update batch rather than a normal error.

Fix

Validate the constraint at the top of quantize_residuals and return a descriptive Error::Codec instead of writing out of bounds:

unsupported embedding dimension 3 for nbits=4: dim * nbits (12) must be a multiple of 8

This is a defensive hardening — it does not change the on-disk format or behavior for supported dims. (Supporting arbitrary dims would require a packing/unpacking format rewrite for partial trailing bytes, which isn't needed for real models.)
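
A guard of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the actual diff: the `Error` enum here is a stand-in with only the `Codec` variant named in this PR, and `check_packable` is a hypothetical helper (the real check lives at the top of quantize_residuals).

```rust
/// Stand-in for the crate's error type (assumption: a `Codec`
/// variant carrying a message string).
#[derive(Debug, PartialEq)]
enum Error {
    Codec(String),
}

/// Hypothetical helper: returns the packed row size in bytes, or a
/// descriptive error when `dim * nbits` is not a multiple of 8.
fn check_packable(dim: usize, nbits: usize) -> Result<usize, Error> {
    let total_bits = dim * nbits;
    if total_bits % 8 != 0 {
        return Err(Error::Codec(format!(
            "unsupported embedding dimension {dim} for nbits={nbits}: \
             dim * nbits ({total_bits}) must be a multiple of 8"
        )));
    }
    // Exact division: no partial trailing byte is possible here.
    Ok(total_bits / 8)
}
```

Checking `dim * nbits` rather than `dim` keeps small but aligned inputs valid: dim=2 with nbits=4 packs into exactly one byte, while dim=3 is rejected before any buffer is written.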

Verification

  • New unit tests: quantize_rejects_unaligned_dim_instead_of_panicking (dim=3 → Err, no panic) and quantize_accepts_byte_aligned_dim (dim=2, since 2*4=8 is valid even though 2 isn't a multiple of 8 — the guard checks dim * nbits, not dim).
  • End-to-end through the API: re-running the original 3-dim/2-token /update now logs update.batch.failed … Codec error: unsupported embedding dimension 3 … with 0 panics, and the server stays healthy (the failure also surfaces in /health via #104's failed-status tracking).
  • Full make ci-quick green (fmt, clippy -D warnings, tests).

`quantize_residuals` packs exactly `8 / nbits` dimensions per byte and the
decompress path reads whole bytes, so the codec only supports embedding
dimensions where `dim * nbits` is a multiple of 8 (always true for real
ColBERT dims like 96/128). For other dims, `packed_dim = dim * nbits / 8`
rounded down to fewer bytes than the packing loop actually wrote, so the
loop ran past the end of the row and panicked with an out-of-bounds index
deep inside a rayon worker, crashing the indexing task (observed via the
API as a panicked update batch on 3-dim toy embeddings).

Validate the constraint up front and return a descriptive `Error::Codec`
instead. The API now surfaces a clean error ("unsupported embedding
dimension 3 for nbits=4: dim * nbits (12) must be a multiple of 8") and
the worker stays healthy, rather than panicking.

Adds tests for the rejected (dim=3) and accepted byte-aligned (dim=2) cases.
@raphaelsty raphaelsty closed this May 28, 2026