v0.33.2.0 feat(search-lite): token budget + semantic query cache + intent weighting#897
Open
garrytan-agents wants to merge 14 commits into
Conversation
…ting
Adds three additive features (inspired by brain-kit) to the hybrid search pipeline. All
backward-compatible: existing callers see identical behavior unless they
opt in to the new options.
## 1. Token Budget Enforcement (src/core/search/token-budget.ts)
Cap the cumulative token cost of returned results so search payloads
fit downstream context windows. Greedy top-down walk; preserves caller
ordering; no re-rank. Uses a char/4 heuristic for token counting (no
tokenizer dependency — keeps the bun --compile bundle small).
SearchOpts.tokenBudget — numeric cap. Default undefined = no-op.
HybridSearchMeta.token_budget = { budget, used, kept, dropped }
HTTP query op: pass `token_budget` param.
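The walk can be sketched as a pure function. A minimal sketch, assuming a simple result shape and skip-and-continue overflow handling; only `tokenBudget`, the char/4 heuristic, and the `{ budget, used, kept, dropped }` meta shape come from this PR:

```typescript
interface SearchHit { id: string; text: string }  // hypothetical row shape

// char/4 heuristic: no tokenizer dependency, keeps the compiled bundle small.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Greedy top-down walk: keep results in caller order until the budget is
// exhausted; never re-rank. budget === undefined is the documented no-op.
function applyTokenBudget(hits: SearchHit[], budget?: number) {
  if (budget === undefined) return { hits, meta: undefined };
  const kept: SearchHit[] = [];
  let used = 0;
  let dropped = 0;
  for (const hit of hits) {
    const cost = estimateTokens(hit.text);
    // assumption: an overflowing hit is skipped and the walk continues
    if (used + cost > budget) { dropped += 1; continue; }
    kept.push(hit);
    used += cost;
  }
  return { hits: kept, meta: { budget, used, kept: kept.length, dropped } };
}
```

The default-`undefined` branch is what keeps existing callers byte-identical: no budget, no filtering, no meta.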
## 2. Semantic Query Cache (src/core/search/query-cache.ts + migration v52)
Cache search results keyed by query embedding similarity. HNSW lookup:
`embedding <=> $1 < 0.08` (cosine similarity >= 0.92). Per-source
isolation so multi-source brains don't bleed. Per-row TTL (default 3600s).
Best-effort writes; all errors swallowed so the cache never breaks the
search hot path.
Migration v52 creates query_cache table with HALFVEC where pgvector >= 0.7;
falls back to VECTOR with the resolved config.embedding_dimensions dim.
New `gbrain cache` CLI: stats / clear --yes / prune.
Config keys: search.cache.enabled / similarity_threshold / ttl_seconds.
HybridSearchMeta.cache = { status, similarity?, age_seconds? }
Routed through new `hybridSearchCached(engine, query, opts)` wrapper;
the operations.ts query op now uses this wrapper so MCP/CLI calls
benefit automatically. Skipped for two-pass walks + non-default
embedding columns where cache semantics don't hold.
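The hit/miss decision can be sketched as a pure classifier over the nearest cached row. The `{ status, similarity?, age_seconds? }` shape and the 0.92/3600s defaults come from this PR; the status strings beyond "hit" are assumptions:

```typescript
// pgvector's `<=>` returns cosine distance; distance < 0.08 corresponds to
// similarity >= 0.92 for normalized embeddings (similarity = 1 - distance).
interface CacheCandidate { similarity: number; ageSeconds: number }

// Classify the nearest cached row into the HybridSearchMeta.cache shape.
function cacheMeta(
  row: CacheCandidate | null,
  ttlSeconds = 3600,
  minSimilarity = 0.92,
): { status: string; similarity?: number; age_seconds?: number } {
  if (row === null) return { status: "miss" };
  if (row.ageSeconds > ttlSeconds) return { status: "expired" };   // status string is an assumption
  if (row.similarity < minSimilarity) return { status: "miss" };
  return { status: "hit", similarity: row.similarity, age_seconds: row.ageSeconds };
}
```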
## 3. Zero-LLM Intent Weighting (src/core/search/intent-weights.ts)
Builds on the existing query-intent classifier (4 intents: entity /
temporal / event / general). New weight-adjustment layer applies subtle
per-intent nudges:
entity → boost keyword RRF + exact slug/title match
temporal → default recency=on when caller left it unset
event → boost keyword RRF (rare named entities) + soft recency
general → no-op (1.0 multipliers everywhere)
All adjustments are subtle (max 1.25x). Caller-explicit options always
win — intent weighting never silently overrides recency / salience.
Default ON; opt out via `opts.intentWeighting = false`. LLM query
expansion (expansion.ts) is still available and opt-in via
`opts.expansion = true` — it just isn't the default anymore.
HybridSearchMeta.intent now surfaces classifier output for debugging.
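A sketch of the weight table and the caller-wins rule — the 1.25x cap and the per-intent directions come from this PR, but the exact multiplier values here are illustrative:

```typescript
type Intent = "entity" | "temporal" | "event" | "general";

// Per-intent nudges, capped at 1.25x per the text above.
const MAX_BOOST = 1.25;

interface IntentWeights { keywordRrf: number; defaultRecency: boolean; exactMatchBoost: number }

// Illustrative values — not the shipped table.
const INTENT_WEIGHTS: Record<Intent, IntentWeights> = {
  entity:   { keywordRrf: MAX_BOOST, defaultRecency: false, exactMatchBoost: MAX_BOOST },
  temporal: { keywordRrf: 1.0,  defaultRecency: true,  exactMatchBoost: 1.0 },
  event:    { keywordRrf: 1.15, defaultRecency: true,  exactMatchBoost: 1.0 },
  general:  { keywordRrf: 1.0,  defaultRecency: false, exactMatchBoost: 1.0 },  // no-op
};

// Caller-explicit options always win: the intent default only fills a
// recency value the caller left unset.
function resolveRecency(callerRecency: boolean | undefined, intent: Intent): boolean {
  return callerRecency !== undefined ? callerRecency : INTENT_WEIGHTS[intent].defaultRecency;
}
```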
## Tests
test/token-budget.test.ts (10 tests, pure module)
test/intent-weights.test.ts (13 tests, pure module)
test/query-cache.test.ts (12 tests, PGLite)
test/hybrid-search-lite.serial.test.ts (9 tests, PGLite e2e)
Plus 105 pre-existing search tests still pass. `bun run verify` clean.
Co-authored-by: Wintermute <[email protected]>
Resolved single conflict in src/core/migrate.ts: master claimed v52/53/54 (eval_contradictions_cache + eval_contradictions_runs + cjk_wave). PR garrytan#897's v52 query_cache migration renumbered to v55 to sit after. Typecheck clean.
…ybridSearch

Three named modes (conservative / balanced / tokenmax) that bundle the
search-lite knobs from PR garrytan#897 into a single config key. Mode
resolution lives in bare hybridSearch (NOT just the cached wrapper) so
eval-replay and eval-longmemeval — which call bare hybridSearch — test
the same mode-affected behavior as production. See [CDX-5+6] in the plan.

The mode bundle supplies DEFAULTS for intentWeighting, tokenBudget,
expansion, and searchLimit when the caller leaves those undefined.
Per-call SearchOpts and per-key config overrides still win (matches the
v0.31.12 model-tier resolution chain at model-config.ts:resolveModel).

knobsHash() exposes a stable SHA-256 of the resolved knob set; the cache
contamination hotfix (next commit) consumes it to prevent a tokenmax
write from being served to a conservative read.

New on HybridSearchMeta:
- mode (resolved mode name)
- existing token_budget meta now fires from bare hybridSearch too

Bare hybridSearch now applies tokenBudget at all three return paths
(no-embedding-provider, keyword-only-fallback, main). Previously only
hybridSearchCached enforced budget; eval commands missed it.

Tests: 37 unit cases pin the 3x7 bundle table cell-by-cell, the
resolution chain semantics, knobs hash determinism + cross-mode
separation, and the config-table parser. All 72 search-lite tests pass.

Bisect-friendly: this commit ONLY adds mode resolution. The cache-key
contamination hotfix [CDX-4] is a separate atomic commit (next).
PR garrytan#897's query_cache keyed rows on sha256(source_id::query_text)
only. A tokenmax search (expansion=on, limit=50) populated a row that a
subsequent conservative call (no expansion, limit=10) read back, serving
the wrong-shape results. This is a real bug in PR garrytan#897 today,
regardless of the v0.32.3 mode picker work — Codex caught it in plan
review.

Fix:
- Migration v56 adds query_cache.knobs_hash TEXT column + composite
  (source_id, knobs_hash, created_at) index. Existing rows have NULL
  knobs_hash and are excluded from lookups (silently re-populated with
  the right hash on first hit — no orphan data, no destructive migration).
- cacheRowId(query, source, knobsHash) — knobsHash now part of the PK so
  a tokenmax write and a conservative write for the same (query, source)
  land in distinct rows.
- SemanticQueryCache.lookup({knobsHash}) filters WHERE knobs_hash = $.
- SemanticQueryCache.store({knobsHash}) writes the resolved hash.
- hybridSearchCached threads knobsHash from resolveSearchMode through
  every cache call.
- Cache config (enabled/threshold/TTL) now reads from the resolved mode
  bundle, not directly from the config table.

Tests (test/query-cache-knobs-hash.test.ts, 11 cases):
- cacheRowId bifurcates by knobsHash
- Tokenmax write does NOT contaminate conservative lookup
- Three modes coexist as distinct rows for same query
- Legacy NULL-knobs_hash rows are excluded from lookup
- Same-mode write updates in place (no duplicate rows)

All 58 cache + mode tests pass. Migration v56 applies cleanly on a fresh
PGLite brain. Bisect-friendly: this commit is the cache-key hotfix
alone. Mode resolution wiring lives in the previous commit.
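A sketch of the two hash helpers under stated assumptions — the canonicalization scheme and the field separator are not specified in the commit, only the signatures and the bifurcation property:

```typescript
import { createHash } from "node:crypto";

// knobsHash(): stable SHA-256 of the resolved knob set. Sorting entries
// before serializing makes the digest independent of insertion order.
// (The canonicalization scheme here is an assumption.)
function knobsHash(knobs: Record<string, unknown>): string {
  const sorted = Object.fromEntries(Object.entries(knobs).sort(([a], [b]) => a.localeCompare(b)));
  return createHash("sha256").update(JSON.stringify(sorted)).digest("hex");
}

// cacheRowId(): knobsHash folded into the row key, so a tokenmax write and
// a conservative write for the same (query, source) land in distinct rows.
// The NUL field separator is an assumption.
function cacheRowId(queryText: string, sourceId: string, hash: string): string {
  return createHash("sha256").update([sourceId, hash, queryText].join("\u0000")).digest("hex");
}
```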
…able
Migration v57 creates search_telemetry (date, mode, intent, count,
sum_results, sum_tokens, sum_budget_dropped, cache_hit, cache_miss,
first_seen, last_seen). PK (date, mode, intent) caps growth at ~4380
rows/year. Sums + counts only — averages derive at read time so
concurrent ON CONFLICT writes from multiple gbrain processes accumulate
correctly [CDX-17].
In-memory bucket flushed periodically (60s OR 100 calls) + on process
beforeExit/SIGINT/SIGTERM with a 2-second cap. The search hot path NEVER
waits on this write [D2, CDX-19].
Date-bucketed cache_hit / cache_miss columns make hit rate over --days N
derivable [CDX-18]. query_cache.hit_count is a lifetime counter and
can't be sliced by window.
Wired into bare hybridSearch via emitMeta: every search call sync-bumps
a bucket. flush() drains atomically by swapping the map before SQL writes
so a record() during flush lands in the new map.
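The swap-before-write drain can be sketched as follows. Bucket key encoding and field names are assumptions; the sums-not-averages discipline and the sync-bump hot path come from the commit:

```typescript
type BucketKey = string; // e.g. `${date}|${mode}|${intent}` — encoding is an assumption
interface Bucket { count: number; sumResults: number; sumTokens: number; cacheHit: number; cacheMiss: number }

class TelemetryBuffer {
  private buckets = new Map<BucketKey, Bucket>();

  // Sync bump on the search hot path — never awaits.
  record(key: BucketKey, results: number, tokens: number, cacheHit: boolean): void {
    const b = this.buckets.get(key) ?? { count: 0, sumResults: 0, sumTokens: 0, cacheHit: 0, cacheMiss: 0 };
    b.count += 1;
    b.sumResults += results; // raw sums only; averages derive at read time
    b.sumTokens += tokens;
    if (cacheHit) b.cacheHit += 1; else b.cacheMiss += 1;
    this.buckets.set(key, b);
  }

  // Swap the map BEFORE writing: a record() arriving mid-flush lands in the
  // fresh map, so nothing is lost or double-counted.
  async flush(write: (drained: Map<BucketKey, Bucket>) => Promise<void>): Promise<void> {
    if (this.buckets.size === 0) return;
    const drained = this.buckets;
    this.buckets = new Map();
    await write(drained); // DB side accumulates via ON CONFLICT adds
  }
}
```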
readSearchStats(engine, {days}) returns the StatsWindow shape that
gbrain search stats consumes (next commit).
Tests: 16 unit cases pin record/flush/read semantics including
ON-CONFLICT-adds-raw-values, concurrent-flush coalescing, cache hit-rate
math, missing-table graceful degradation, and window clamping.
53 migrations apply on a fresh PGLite brain.
…+8+9]
CDX-8: gbrain config has no unset path today. Required before
`gbrain search modes --reset` can clear search.* overrides.
- BrainEngine.unsetConfig(key) → returns rows deleted (0|1)
- BrainEngine.listConfigKeys(prefix) → exact-literal prefix match
with LIKE-escape on user-supplied % / _ / \ characters
- PGLiteEngine + PostgresEngine implementations
- `gbrain config unset <key>` and `gbrain config unset --pattern <prefix>`
sub-subcommands
CDX-9: readLine has no EOF detection or timeout. Mode-picker plan calls
out "TTY closes mid-prompt → defaults to balanced" but the raw helper
hangs forever. New readLineSafe(prompt, defaultValue, timeoutMs=60s):
- Returns defaultValue on stdin 'end' event
- Returns defaultValue on timeout
- Returns defaultValue on empty Enter
- Non-TTY stdin returns defaultValue immediately (e2e safe)
- Returns trimmed user input otherwise
Exported so install picker (next task) can use it.
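A minimal sketch of `readLineSafe` over Node's readline — the real helper may differ in detail, but this covers the four fallback paths listed above:

```typescript
import * as readline from "node:readline";

function readLineSafe(prompt: string, defaultValue: string, timeoutMs = 60_000): Promise<string> {
  // Non-TTY stdin (pipes, e2e runs): don't prompt at all.
  if (!process.stdin.isTTY) return Promise.resolve(defaultValue);

  return new Promise((resolve) => {
    const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
    let settled = false;
    const done = (value: string) => {
      if (settled) return; // first event wins: answer, timeout, or EOF
      settled = true;
      clearTimeout(timer);
      rl.close();
      resolve(value);
    };
    const timer = setTimeout(() => done(defaultValue), timeoutMs);
    rl.on("close", () => done(defaultValue)); // stdin 'end' / Ctrl-D mid-prompt
    rl.question(prompt, (answer) => done(answer.trim() === "" ? defaultValue : answer.trim()));
  });
}
```

The `settled` guard is what prevents a race between the timeout, the EOF handler, and a late answer from resolving the promise twice.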
Tests: 9 cases pin unset semantics + prefix matcher edge cases
(glob-wildcard escape, sort order, idempotent loop, search.* sweep).
All 53 migrations apply on a fresh PGLite brain.
Install picker (src/commands/init-mode-picker.ts):
- Runs as a phase inside `gbrain init` AFTER engine.initSchema() so DB
config writes work [CDX-7].
- Idempotent: skipped on re-init if search.mode is already set.
- Smart auto-suggestion via recommendModeFor() reads
models.tier.subagent / models.default / OPENAI_API_KEY:
* Opus default/subagent → tokenmax (quality ceiling)
* Haiku subagent → conservative (4K budget keeps cost down)
* No OpenAI key → conservative (no LLM expansion possible)
* Sonnet / unknown → balanced (safe default)
- TTY shows menu via readLineSafe (60s timeout, defaults on EOF/empty).
- Non-TTY auto-selects + emits operator hint:
[gbrain] search mode: X (auto-selected — reason)
[gbrain] To change: gbrain config set search.mode <...>
- --json mode emits structured `{phase: 'search_mode_picker', ...}` event.
- Wired into both initPGLite and initPostgres flows.
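The `recommendModeFor` heuristic can be sketched as an ordered rule list (substring model matching and the exact input shape are assumptions; the rule order mirrors the bullets above):

```typescript
type SearchMode = "conservative" | "balanced" | "tokenmax";

interface ModeSignals { defaultModel?: string; subagentModel?: string; hasOpenAiKey: boolean }

function recommendModeFor(s: ModeSignals): { mode: SearchMode; reason: string } {
  const models = `${s.defaultModel ?? ""} ${s.subagentModel ?? ""}`.toLowerCase();
  if (models.includes("opus"))
    return { mode: "tokenmax", reason: "Opus default/subagent — quality ceiling" };
  if ((s.subagentModel ?? "").toLowerCase().includes("haiku"))
    return { mode: "conservative", reason: "Haiku subagent — 4K budget keeps cost down" };
  if (!s.hasOpenAiKey)
    return { mode: "conservative", reason: "no OpenAI key — no LLM expansion possible" };
  return { mode: "balanced", reason: "safe default" };
}
```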
Upgrade banner (src/commands/upgrade.ts):
- One-shot stderr banner in runPostUpgrade.
- State persisted via config key `search.mode_upgrade_notice_shown=true`
— fires at most once per install.
- Copy corrected per [CDX-1+2+3]: production query op STILL defaults
expand=true and limit=20. The banner reframes from "behavior is
regressing" to "named modes available + here's how to preserve
exact current shape."
Tests (test/init-mode-picker.test.ts, 16 cases):
- recommendModeFor heuristic for all 4 input shapes
- parseModeInput accepts numeric/named/case-insensitive, rejects garbage
- runModePicker non-TTY auto-selects + writes config
- Idempotent + --force re-prompt + JSON output
- Opus → tokenmax, Haiku → conservative real wiring through engine
Three sub-subcommands mirroring the gbrain models (v0.31.12) shape:
gbrain search modes [--json]
Read-only routing dashboard. Shows the three mode bundles, the active
mode, and the source of every resolved knob:
cache_enabled = true [override: search.cache.enabled]
tokenBudget = 4000 [mode: conservative]
Plus knob descriptions for legibility.
gbrain search modes --reset [--source <mode>]
Clears every search.* override (NOT search.mode itself). Preserves
the upgrade-notice state key. --source <mode> is a dry-run that
lists what --reset would change without writing — the paved path
[CDX-8] flagged as missing.
gbrain search stats [--days N] [--json]
Observability. Reads the search_telemetry rollup over the window
(clamps to [1, 365]). Prints cache hit rate, mode mix, intent mix,
budget drops, avg results/tokens. JSON output includes
_meta.metric_glossary block per [CDX-25].
gbrain search tune [--apply] [--json]
Recommendation engine. 5 rules cover the bug class:
- Insufficient data → "no_recommendations" status
- Conservative + high budget-drop rate → suggest balanced
- High cache hit rate (>85%) → suggest similarity threshold bump
- Tokenmax + Haiku subagent → suggest balanced (cost mismatch)
- Cache disabled but stats show usage → suggest re-enabling
--apply mutates config via setConfig / unsetConfig with a paste-ready
revert command printed at the end.
Registered in src/cli.ts dispatch table. 17 unit cases pin:
- Dashboard report shape + per-knob source attribution
- --reset preserves search.mode + notice key
- --source dry-run never writes
- stats reads telemetry rollup; --days clamps
- tune recommendation rules fire on real telemetry data
- --apply mutates config
- --help + unknown subcommand exit codes
… guard
Single source of truth at src/core/eval/metric-glossary.ts. Every entry
carries 3 fields:
- industry_term (canonical IR/NLP literature name, preserved verbatim)
- eli10 (plain-English a 16-year-old can follow)
- range (numeric range + interpretation)
Covers 4 metric families:
- Retrieval: P@k, R@k, MRR, nDCG@k
- Stability: Jaccard@k, top-1 stability
- Statistical: p-value (paired bootstrap + Bonferroni), 95% CI
- Operational: cache hit rate, avg results/tokens, cost per query, p99 latency
Public surface:
- getMetricGloss(metric) → full entry or null
- eli10For(metric) → plain-English string or null
- buildMetricGlossaryMeta(metrics[]) → {metric → eli10} record for
JSON `_meta.metric_glossary` blocks per [CDX-25]. ONE block per
response, NOT sibling `_gloss` fields on every metric.
- renderMetricGlossaryMarkdown() → deterministic Markdown for the doc
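The glossary surface can be sketched with a two-entry table (entries here are illustrative wording, not the shipped copy; the real table covers all four families):

```typescript
interface MetricGloss { industry_term: string; eli10: string; range: string }

// Illustrative entries only.
const GLOSSARY: Record<string, MetricGloss> = {
  "recall@k": {
    industry_term: "Recall at k (R@k)",
    eli10: "Of all the right answers that exist, how many showed up in the top k results.",
    range: "0-1; higher is better",
  },
  "mrr": {
    industry_term: "Mean Reciprocal Rank (MRR)",
    eli10: "How close to the top the first right answer lands, averaged over queries.",
    range: "0-1; higher is better",
  },
};

// Accessors return null on unknown metrics — never throw.
function getMetricGloss(metric: string): MetricGloss | null {
  return GLOSSARY[metric.toLowerCase()] ?? null;
}

// ONE _meta.metric_glossary block per response: {metric -> eli10},
// silently dropping unknown metrics.
function buildMetricGlossaryMeta(metrics: string[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (const m of metrics) {
    const gloss = getMetricGloss(m);
    if (gloss) out[m] = gloss.eli10;
  }
  return out;
}
```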
Auto-generation:
scripts/generate-metric-glossary.ts emits docs/eval/METRIC_GLOSSARY.md.
Deterministic (same input → same bytes) so the CI guard can diff.
CI guard:
scripts/check-eval-glossary-fresh.sh regenerates into a temp file and
diffs against the committed doc. Out-of-date doc fails the build.
Wired into `bun run verify` (and therefore `bun run test:full`).
Tests (test/metric-glossary.test.ts, 18 cases):
- Every documented metric is present
- Every entry has all 3 required fields
- Accessors return null on unknown metrics (no throw)
- buildMetricGlossaryMeta silently drops unknown metrics
- renderer output is deterministic across calls
- Renderer groups metrics into 4 sections
docs/eval/METRIC_GLOSSARY.md: 5491 bytes, 124 lines, fresh.
src/core/eval/drift-watch.ts — curated retrieval watch-list [CDX-6].
Five patterns covering the surface that actually affects retrieval quality:
- src/core/search/ (search pipeline)
- src/core/embedding.ts (embedding shape)
- src/core/chunkers/ (chunk granularity)
- src/core/ai/recipes/anthropic.ts + openai.ts (expansion + embed routing)
- src/core/operations.ts (the query op definition)
Adding to the list is a deliberate act — requires a CHANGELOG line so
coverage grows on purpose, not by accident. Pure functions:
- matchesWatchPattern(path) — trailing-slash = prefix, bare = equality
- filesDriftedSince(repoRoot, sha?) — git diff --name-only wrapper
- watchedFilesDrifted(repoRoot, sha?) — composite
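The matcher semantics (trailing slash = prefix, bare path = equality) can be sketched as — the pattern list mirrors the bullets above, with the anthropic/openai recipe bullet expanded to its two files:

```typescript
const WATCH_PATTERNS = [
  "src/core/search/",                  // trailing slash: prefix match
  "src/core/embedding.ts",             // bare path: exact equality
  "src/core/chunkers/",
  "src/core/ai/recipes/anthropic.ts",
  "src/core/ai/recipes/openai.ts",
  "src/core/operations.ts",
];

function matchesWatchPattern(path: string, patterns: string[] = WATCH_PATTERNS): boolean {
  return patterns.some((p) => (p.endsWith("/") ? path.startsWith(p) : path === p));
}
```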
src/commands/doctor.ts — two new checks.
checkSearchMode [CDX-20]: status stays 'ok' (never warns, never docks
health score). Hint in message field. Three branches:
- unset → "search.mode is unset (using balanced fallback). Run
`gbrain search modes` to see what is running and pick a mode."
- mode + no overrides → "Mode: X (no per-key overrides — mode bundle
is canonical)."
- mode + overrides → "Mode: X with N per-key override(s) (k1, k2, …).
To consolidate to the pure mode bundle: gbrain search modes --reset"
Upgrade-notice state key (search.mode_upgrade_notice_shown) is excluded
from the override roster — it's not a knob.
checkEvalDrift [CDX-6]: surfaces uncommitted changes to retrieval-watched
files. Always 'ok'; operator-facing reminder. Names up to 3 drifted files
in the message + paste-ready re-eval command.
Both helpers exported (was: file-private) so tests can pin behavior
without walking the full runDoctor pipeline.
Tests: 12 drift-watch cases + 7 doctor-check cases. Pin watch-list shape,
prefix-vs-equality matcher semantics, missing-repo graceful failure, and
all three search_mode branches.
Per-mode --mode flag plumbed into:
- gbrain eval longmemeval --mode <conservative|balanced|tokenmax>
Sets search.mode in the benchmark brain's config table; config is
in PRESERVE_TABLES so resetTables doesn't wipe it between questions.
Mode surfaces in the per-question NDJSON row.
- gbrain eval replay --mode <m> + --compare-limit N
--compare-limit forces a constant K across modes [CDX-13]; without
it, Jaccard@k against the captured baseline measures K-drift, not
quality. Mode is set once before the replay loop.
- NOT cross-modal per [CDX-11]: cross-modal scores OUTPUT against
TASK; it doesn't retrieve. Adding --mode there is theater.
New: gbrain eval run-all orchestrator (src/commands/eval-run-all.ts):
- Sweeps every requested mode × suite combination
- Sequential default per D9; --parallel N opt-in (clamped to mode count)
- Cost guard with split caps [CDX-15+16]:
--budget-usd-retrieval N (default $5)
--budget-usd-answer N (default $20)
Non-TTY refuses with exit 2 unless --yes AND explicit --budget-usd-*
flags pass. TTY refuses without --yes (defense against agent loops).
- estimateRunCost computes per-(suite,mode) breakdown including the
expansion-Haiku surcharge for tokenmax.
- Audit trail: appends to <repo>/.gbrain-evals/eval-results.jsonl
[CDX-23]. Personal brain (~/.gbrain) NEVER touched.
- v0.32.3 ships orchestrator + argv + guard + persist hook.
In-process per-suite invocation is a v0.32.4 follow-up (operator
runs the per-suite CLIs with the documented --mode flag for now;
each completion calls persistRunRecord to log).
New: gbrain eval compare report (src/commands/eval-compare.ts):
- Reads eval-results.jsonl, groups by (suite, mode), renders MD or JSON
- Most-recent (suite, mode, commit) wins when duplicates exist
- JSON output has schema_version=2 + _meta.metric_glossary block per
[CDX-25] (ONE block per response, not sibling _gloss fields)
- _meta.methodology field names the paired-bootstrap + Bonferroni
discipline per [CDX-14] so haters can reproduce
- Missing file → friendly hint pointing at `gbrain eval run-all`
Wired into eval dispatch table in src/commands/eval.ts.
Metric glossary fuzzy fallback: `recall@10` → `recall@k` lookup
(the glossary documents the family; report rows carry specific K
values). Routes through getMetricGloss for every call site.
Tests (42 cases total — all green):
- eval-run-all.test.ts (19): argv parser, cost estimate, guard
semantics for all 4 (over/under × tty/non-tty) shapes, persist hook
NDJSON shape.
- eval-compare.test.ts (5): JSON + MD output shapes, glossary
integration, missing-file graceful, mode filter, most-recent-wins.
- metric-glossary.test.ts (18): unchanged but updated assertions to
cover the fuzzy `@N` → `@k` fallback.
Pre-existing eval-replay / eval-longmemeval / eval-export / eval-prune
tests (42 cases) still pass — --mode + --compare-limit are additive.
docs/eval/SEARCH_MODE_METHODOLOGY.md — haters-immune 8-section template.
Documents what the eval measures + does NOT measure, datasets + sizes
(LongMemEval n=500, Replay n=200, BrainBench n=1240 docs / 350 qrels),
random seed 42, run procedure verbatim, threats to validity (LongMemEval
English+technical skew, char/4 heuristic ~5-10% off, expansion ~97.6%
relative lift on this corpus), per-question raw outputs, pre-registered
expectations (tokenmax wins R@10 by 5-15pp, conservative wins cost by
5-15x, balanced lands within 3pp), and re-run cadence anchored to the
src/core/eval/drift-watch.ts watch-list. Statistical-significance
section pins paired bootstrap with 10,000 resamples + Bonferroni
correction across 3 modes × 4 metrics [CDX-14].

CLAUDE.md gets two new sections: ## Search Mode (3-mode table +
resolution chain + [CDX-4] cache contamination fix note + CLI commands)
and ## Eval discipline (single-source-of-truth glossary, methodology
doc, eval_results in repo NOT personal brain per [CDX-23]).

README.md Quick Start gets a paragraph naming the install picker, mode
heuristic, and the methodology link.

skills/conventions/search-modes.md NEW — convention file consumed by
brain-ops + query + signal-detector skills via the existing
`> **Convention:**` callout pattern. Routes "what mode" / "tune
retrieval" / "compare modes" queries to the right CLI surface.

skills/RESOLVER.md gets two new trigger rows pointing at gbrain search *
and gbrain eval compare.
bun run build:llms — picks up the new CLAUDE.md sections (Search Mode + Eval discipline) and the docs/eval/SEARCH_MODE_METHODOLOGY.md addition. build-llms.test.ts gate now passes.
… flow

The v0.32.3 search_mode + eval_drift helpers were inserted into the
DB-checks sub-helper at runDbChecks (line 345-355), but runDoctor itself
maintains its own check list and only calls the helpers' subset. Push
the two checks into the main runDoctor path (after the existing
sync_freshness check at line 2347) so they actually appear in
`gbrain doctor --json` output.

Both checks are gated on engine !== null. The progress reporter
heartbeat fires for each. Both still return status 'ok' per [CDX-20] so
the health score is preserved.

Verified end-to-end on a real Postgres brain: gbrain doctor --json now
includes 'search_mode' and 'eval_drift' in the checks array.
Test results
- test/token-budget.test.ts — 10 tests (pure module)
- test/intent-weights.test.ts — 13 tests (pure module)
- test/query-cache.test.ts — 12 tests (PGLite, including migration v52 schema verification + TTL expiration + source isolation)
- test/hybrid-search-lite.serial.test.ts — 9 tests (PGLite e2e through hybridSearchCached)
- Pre-existing search suites still pass (test/search.test.ts, test/query-intent*.test.ts, test/hybrid-meta.test.ts, test/search-limit.test.ts, test/search-lang-symbol-kind.test.ts, test/search-image-column.test.ts)
- bun run verify clean (privacy + jsonb + progress + test-isolation + wasm + admin-build + admin-scope-drift + cli-exec + system-of-record + tsc --noEmit)
- bun run src/cli.ts doctor --fast runs successfully — pre-existing resolver_health / minions_migration failures on master are unrelated to this PR.

Design notes
- The char/4 heuristic avoids a tokenizer dependency (keeps the bun --compile bundle small). The migration uses the same HALFVEC/VECTOR pgvector-version probe pattern as v45 (facts) and v40, so it works on both Postgres and PGLite without extra config.
- The query_cache table is a derived cache, not authoritative state. gbrain cache clear is the documented invalidation surface; the system-of-record check passes.

Need help on this PR? Tag @codesmith with what you need.