Closed
Description
Summary
Commit 15b8df3 (V2 segment format) introduced a significant query performance regression. Query execution time increased dramatically due to segments being opened/closed O(T × S) times instead of O(S) times (where T = query terms, S = segments).
Symptoms
- Query latency increased ~29x on MS-MARCO dataset (based on profiling sample counts)
- tp_segment_open consuming 24% of query CPU time
- Observed in benchmark dashboard as a spike in query latency metrics
Root Cause
The scoring loop was structured as:
Phase 1 - Get doc_freq:
  for each term:
    for each segment: open → read doc_freq → close
Phase 2 - Score:
  for each term:
    for each segment: open → iterate postings → close
Each tp_segment_open is expensive because it:
- Allocates reader structure
- Reads segment header from disk
- Reads entire page index (potentially multiple pages)
- For V2: potentially preloads CTID table
For a query with 5 terms and 10 segments, this resulted in 100 segment opens (2 phases × 5 terms × 10 segments) instead of 10.
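The count can be checked with a small simulation. This is a minimal Python sketch of the loop structure described above, not the extension's code: `count_opens_two_phase` is a hypothetical name, and each counter increment stands in for one tp_segment_open call.

```python
def count_opens_two_phase(num_terms: int, num_segments: int) -> int:
    """Count segment opens for the regressed two-phase loop structure."""
    opens = 0
    # Phase 1: read doc_freq, one open per (term, segment) pair.
    for _term in range(num_terms):
        for _segment in range(num_segments):
            opens += 1  # open -> read doc_freq -> close
    # Phase 2: score, again one open per (term, segment) pair.
    for _term in range(num_terms):
        for _segment in range(num_segments):
            opens += 1  # open -> iterate postings -> close
    return opens


print(count_opens_two_phase(5, 10))  # 100
```

The result grows as 2 × T × S, so adding either query terms or segments multiplies the number of opens.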
Fix
Restructure to open each segment once:
for each segment:
  open
  for each term: get doc_freq + score
  close
Fix is in PR #85.
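The segment-major loop can be sketched as follows. This is an illustrative Python model under stated assumptions, not the code from the PR: `Segment`, `open_count`, and `collect_segment_major` are hypothetical names, segments are plain in-memory dictionaries, and the final scoring formula is left out.

```python
from collections import defaultdict


class Segment:
    """Hypothetical in-memory segment: maps term -> list of (doc_id, term_freq)."""

    def __init__(self, postings):
        self.postings = postings
        self.open_count = 0

    def open(self):
        # Stand-in for the costly work done by tp_segment_open.
        self.open_count += 1
        return self

    def close(self):
        pass


def collect_segment_major(segments, terms):
    """Open each segment once and handle every query term while it is open."""
    doc_freq = defaultdict(int)
    matches = defaultdict(list)
    for segment in segments:
        reader = segment.open()
        try:
            for term in terms:
                plist = reader.postings.get(term, [])
                doc_freq[term] += len(plist)  # doc_freq contribution of this segment
                matches[term].extend(plist)   # postings to be scored later
        finally:
            reader.close()
    return dict(doc_freq), dict(matches)


segments = [
    Segment({"cat": [(1, 2)], "dog": [(1, 1), (2, 3)]}),
    Segment({"cat": [(7, 1)]}),
]
doc_freq, matches = collect_segment_major(segments, ["cat", "dog"])
print(doc_freq)                          # {'cat': 2, 'dog': 2}
print([s.open_count for s in segments])  # [1, 1]
```

Each segment is opened exactly once regardless of how many terms the query has, so the number of opens is S rather than 2 × T × S.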
Profiling Data
Baseline (d560f1e, before V2):
- 104 samples total
- Top function: kernel spin lock (9.6%)
- tp_segment_open not in top functions
Regressed (15b8df3, with V2):
- 3,022 samples total
- Top function: tp_segment_open (24%)
- ~29x more CPU time in queries