Query performance regression in V2 segment format #86

## Summary

Commit 15b8df3 (V2 segment format) introduced a significant query performance regression: roughly 29x more query CPU time, measured by profiling sample counts. The cause is that segments are opened and closed O(T × S) times instead of O(S) times, where T is the number of query terms and S is the number of segments.

## Symptoms

- Query latency increased ~29x on the MS-MARCO dataset (estimated from profiling sample counts)
- `tp_segment_open` consumes 24% of query CPU time
- Visible in the benchmark dashboard as a spike in query latency metrics

## Root Cause

The scoring loop was structured as:
```
Phase 1 - Get doc_freq:
  for each term:
    for each segment: open → read doc_freq → close

Phase 2 - Score:
  for each term:
    for each segment: open → iterate postings → close
```

Each `tp_segment_open` call is expensive because it:
1. Allocates a reader structure
2. Reads the segment header from disk
3. Reads the entire page index (potentially multiple pages)
4. For V2: potentially preloads the CTID table

For a query with 5 terms and 10 segments, each of the two phases opens every segment once per term, so the query performs 2 × 5 × 10 = 100 segment opens instead of 10.
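To make the open count concrete, here is a minimal Python model of the term-major loop described above. It is a sketch, not the extension's code: `SegmentReader`, `score_term_major`, and the dict-based segments are hypothetical stand-ins, and the per-document value is a placeholder hit count rather than a real score.

```python
from collections import Counter


class SegmentReader:
    """Hypothetical stand-in for the reader returned by tp_segment_open."""

    opens = 0  # class-wide count of open() calls

    def __init__(self, postings):
        self.postings = postings  # term -> list of doc ids

    @classmethod
    def open(cls, postings):
        cls.opens += 1  # the real open also reads the header and page index
        return cls(postings)

    def close(self):
        pass


def score_term_major(terms, segments):
    """Two-phase loop: every (term, segment) pair reopens the segment."""
    doc_freq = Counter()
    for term in terms:  # Phase 1: get doc_freq
        for seg in segments:
            reader = SegmentReader.open(seg)
            doc_freq[term] += len(reader.postings.get(term, []))
            reader.close()

    hits = Counter()
    for term in terms:  # Phase 2: iterate postings
        for seg in segments:
            reader = SegmentReader.open(seg)
            for doc_id in reader.postings.get(term, []):
                hits[doc_id] += 1  # placeholder for the real score
            reader.close()
    return doc_freq, hits


# 5 terms, 10 segments; each segment holds one document containing every term.
terms = [f"t{i}" for i in range(5)]
segments = [{t: [s] for t in terms} for s in range(10)]

SegmentReader.opens = 0
doc_freq, hits = score_term_major(terms, segments)
print(SegmentReader.opens)  # 2 phases * 5 terms * 10 segments = 100
```

The counter grows with the product of terms and segments, which is why the cost of a single open dominates once either number is large.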

## Fix

Restructure to open each segment once:
```
for each segment:
  open
  for each term: get doc_freq + score
  close
```
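A segment-major version of the same loop can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the code from the PR: `SegmentReader`, `score_segment_major`, and the dict-based segments are hypothetical, and the per-document value is a placeholder hit count rather than a real score.

```python
from collections import Counter


class SegmentReader:
    """Hypothetical stand-in for the reader returned by tp_segment_open."""

    opens = 0  # class-wide count of open() calls

    def __init__(self, postings):
        self.postings = postings  # term -> list of doc ids

    @classmethod
    def open(cls, postings):
        cls.opens += 1
        return cls(postings)

    def close(self):
        pass


def score_segment_major(terms, segments):
    """Open each segment once and handle every term while it is open."""
    doc_freq = Counter()
    hits = Counter()
    for seg in segments:
        reader = SegmentReader.open(seg)
        try:
            for term in terms:
                docs = reader.postings.get(term, [])
                doc_freq[term] += len(docs)  # doc_freq contribution
                for doc_id in docs:
                    hits[doc_id] += 1  # placeholder for the real score
        finally:
            reader.close()  # close even if a term lookup raises
    return doc_freq, hits


# Same shape as the example in the issue: 5 terms, 10 segments.
terms = [f"t{i}" for i in range(5)]
segments = [{t: [s] for t in terms} for s in range(10)]

SegmentReader.opens = 0
doc_freq, hits = score_segment_major(terms, segments)
print(SegmentReader.opens)  # one open per segment = 10
```

With the segment loop on the outside, the number of opens depends only on the segment count, regardless of how many terms the query has.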

Fix is in PR #85.

## Profiling Data

**Baseline (d560f1e, before V2):**
- 104 samples total
- Top function: kernel spin lock (9.6%)
- `tp_segment_open` not in top functions

**Regressed (15b8df3, with V2):**
- 3,022 samples total
- Top function: `tp_segment_open` (24%)
- ~29x more CPU time in queries (3,022 vs. 104 samples)
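The ~29x figure follows directly from the two sample counts. The short Python check below reproduces it, assuming both profiles used the same sampling rate and workload, and assuming the 24% share is a fraction of the 3,022 regressed samples (neither is stated explicitly above).

```python
baseline_samples = 104     # d560f1e, before V2
regressed_samples = 3022   # 15b8df3, with V2

# Ratio of CPU samples, used here as a proxy for query CPU time.
ratio = regressed_samples / baseline_samples

# Assumption: the 24% share applies to the regressed profile's total.
open_samples = regressed_samples * 0.24

print(f"{ratio:.1f}x more samples")
print(f"about {round(open_samples)} samples in tp_segment_open")
```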
