Conversation

benwtrent
Member

This adds off-heap scoring for our scalar quantization.

Opening as DRAFT as I still haven't fully tested out the performance characteristics. Opening early for discussion.
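For context, here is a minimal sketch of what off-heap scoring means in this PR (names and layout are illustrative, not the actual Lucene implementation): instead of first copying each quantized vector into a heap `byte[]`, the scorer reads components directly from a `MemorySegment` backed by the index file.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical sketch of off-heap scoring: compare a heap-resident query
// against a quantized document vector stored off-heap, reading each byte
// straight from the MemorySegment rather than copying it on-heap first.
public class OffHeapDotProduct {
  static int dotProduct(byte[] query, MemorySegment vectors, long offset, int dims) {
    int sum = 0;
    for (int i = 0; i < dims; i++) {
      // read the i-th quantized component directly from the segment
      byte b = vectors.get(ValueLayout.JAVA_BYTE, offset + i);
      sum += query[i] * b;
    }
    return sum;
  }

  public static void main(String[] args) {
    try (Arena arena = Arena.ofConfined()) {
      // pretend this off-heap segment is the mmapped vector data
      MemorySegment seg = arena.allocate(4);
      seg.copyFrom(MemorySegment.ofArray(new byte[] {1, 2, 3, 4}));
      byte[] query = {4, 3, 2, 1};
      System.out.println(dotProduct(query, seg, 0, 4)); // 4+6+6+4 = 20
    }
  }
}
```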

@benwtrent benwtrent added this to the 9.12.0 milestone Jun 17, 2024
@benwtrent
Member Author

Half-byte is showing up as measurably slower with this change.

Candidate:

0.909	 0.54
0.911	 0.58
0.919	 0.88

baseline:

0.909	 0.30
0.911	 0.33
0.919	 0.47

Full-byte is slightly faster

candidate:

0.962	 0.41
0.966	 0.43
0.978	 0.66

baseline:

0.962	 0.47
0.966	 0.48
0.978	 0.73

@msokolov
Contributor

are you reporting indexing times? query times?

@benwtrent
Member Author

are you reporting indexing times? query times?

Query times, single segment, 10k docs of 1024 dims.

@benwtrent
Member Author

Ok, I double-checked, and indeed, half-byte is way slower when reading directly from memory segments instead of reading on heap.
memsegment_vs_baseline.zip

The flamegraphs are wildly different. Much more time is being spent reading from the memory segment and then comparing the vectors.

candidate (this PR):
[flamegraph: candidate]

baseline:

[flamegraph: baseline]
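One reason half-byte scoring is more sensitive to per-read cost: each stored byte packs two 4-bit components, so every off-heap read is followed by mask/shift unpacking before the multiply. A hypothetical sketch (Lucene's actual int4 packing may differ):

```java
// Hypothetical int4 layout: two 4-bit components per byte, low nibble
// first. Each byte read is amplified by two unpack steps, so making the
// read itself slower (e.g. straight from a MemorySegment) hurts more
// than in the full-byte case.
public class Int4Dot {
  static int int4DotProduct(byte[] query, byte[] packedDoc) {
    int sum = 0;
    for (int i = 0; i < packedDoc.length; i++) {
      int lo = packedDoc[i] & 0x0F;         // first component
      int hi = (packedDoc[i] >> 4) & 0x0F;  // second component
      sum += query[2 * i] * lo + query[2 * i + 1] * hi;
    }
    return sum;
  }

  public static void main(String[] args) {
    byte[] packed = {(byte) 0x21, (byte) 0x43}; // components 1, 2, 3, 4
    byte[] query = {1, 1, 1, 1};
    System.out.println(int4DotProduct(query, packed)); // 1+2+3+4 = 10
  }
}
```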

@benwtrent
Member Author

@ChrisHegarty have you seen a significant performance regression with MemorySegments on JDK 22?

Doing some testing, I updated my performance testing for this PR to use JDK 22, and now it is WAY slower (more than 2x), even for full-byte.

For int7, this branch is marginally faster (~20%) with JDK 21, but basically 2x slower on JDK 22.

I wonder if our off-heap scoring for byte vectors also suffers on JDK22. The quantized scorer for int7 is just using those same methods.

@benwtrent
Member Author

To verify it wasn't some weird artifact in my code, I changed it slightly so that my execution path always reads the vectors on-heap and then wraps them in a MemorySegment. Now JDK 22 performs the same as JDK 21 & the current baseline.
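The workaround described above, as a hedged sketch (the segment and sizes are illustrative, not the PR's actual code): bulk-copy the vector on-heap first, then wrap the heap array so downstream scoring code still consumes a `MemorySegment`.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Illustrative workaround: read the vector on-heap, then wrap the heap
// array in a MemorySegment so the rest of the scoring path is unchanged.
public class OnHeapWrap {
  public static void main(String[] args) {
    try (Arena arena = Arena.ofConfined()) {
      // pretend this is the index file's off-heap vector storage
      MemorySegment offHeap = arena.allocate(4);
      offHeap.copyFrom(MemorySegment.ofArray(new byte[] {10, 20, 30, 40}));

      // bulk-copy the vector onto the heap...
      byte[] onHeap = new byte[4];
      MemorySegment.copy(offHeap, ValueLayout.JAVA_BYTE, 0, onHeap, 0, 4);

      // ...then wrap it; reads now hit the heap array, not mapped memory
      MemorySegment wrapped = MemorySegment.ofArray(onHeap);
      System.out.println(wrapped.get(ValueLayout.JAVA_BYTE, 2)); // 30
    }
  }
}
```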

It's weird to me that reading from a memory segment into ByteVector objects would be 2x slower on JDK 22 than on 21.

Regardless, it's already much slower for the int4 case on both JDK 21 & 22.

@ChrisHegarty
Contributor

Regardless, it's already much slower for the int4 case on both JDK 21 & 22.

@benwtrent I was not aware, lemme take a look.

@kaivalnp
Contributor

+1 to this feature

I work on Amazon product search, and in one of our searchers we see a high proportion of HNSW-search CPU cycles being spent copying quantized vectors to heap:

[profiler screenshot]

Perhaps off-heap scoring could help us!

@benwtrent
Member Author

@kaivalnp feel free to take my initial work here and dig in deeper.

I haven't benchmarked it recently on later JVMs to figure out why I was experiencing such a weird slowdown when going off heap :/

@kaivalnp
Contributor

Thanks @benwtrent! I opened #14863

@benwtrent
Member Author

I am gonna close this as work is progressing elsewhere. Also, we should just move to off-heap bulk scoring ;)

@benwtrent benwtrent closed this Aug 14, 2025
