diff --git a/CLAUDE-original-read-only.md b/CLAUDE-original-read-only.md new file mode 100644 index 000000000..9b2a6df11 --- /dev/null +++ b/CLAUDE-original-read-only.md @@ -0,0 +1,1445 @@ +# CLAUDE.md + +GBrain is a personal knowledge brain and GStack mod for agent platforms. Pluggable +engines: PGLite (embedded Postgres via WASM, zero-config default) or Postgres + pgvector ++ hybrid search in a managed Supabase instance. `gbrain init` defaults to PGLite; +suggests Supabase for 1000+ files. GStack teaches agents how to code. GBrain teaches +agents everything else: brain ops, signal detection, content ingestion, enrichment, +cron scheduling, reports, identity, and access control. + +## Two organizational axes (read this first) + +GBrain knowledge is organized along two orthogonal axes. Users AND agents must +understand both, or queries misroute silently. + +- **Brain** — WHICH DATABASE. Your personal brain is `host`. You can mount + additional brains (team-published, each with their own DB and access policy) + via `gbrain mounts add` (v0.19+). Routing: `--brain`, `GBRAIN_BRAIN_ID`, + `.gbrain-mount` dotfile. +- **Source** — WHICH REPO INSIDE THE DATABASE. A brain can hold many sources + (wiki, gstack, openclaw, essays). Slugs scope per source. Routing: + `--source`, `GBRAIN_SOURCE`, `.gbrain-source` dotfile. + +Both axes follow the same 6-tier resolution pattern. Read +`docs/architecture/brains-and-sources.md` for topology diagrams (personal, team +mount, CEO-class with multiple team brains) and +`skills/conventions/brain-routing.md` for the agent-facing decision table. + +## Architecture + +Contract-first: `src/core/operations.ts` defines ~47 shared operations (v0.29 adds `get_recent_salience`, `find_anomalies`, `get_recent_transcripts`). CLI and MCP +server are both generated from this single source. Engine factory (`src/core/engine-factory.ts`) +dynamically imports the configured engine (`'pglite'` or `'postgres'`). Skills are fat +markdown files (tool-agnostic, work with both CLI and plugin contexts). + +**Trust boundary:** `OperationContext.remote` distinguishes trusted local CLI callers +(`remote: false` set by `src/cli.ts`) from untrusted agent-facing callers +(`remote: true` set by `src/mcp/server.ts`). Security-sensitive operations like +`file_upload` tighten filesystem confinement when `remote=true` and default to +strict behavior when unset. + +## Key files + +- `src/core/operations.ts` — Contract-first operation definitions (the foundation). Also exports upload validators: `validateUploadPath`, `validatePageSlug`, `validateFilename`, plus `matchesSlugAllowList(slug, prefixes)` (v0.23 glob matcher: `/*` matches recursive children; bare `` matches exact only). `OperationContext.remote` flags untrusted callers; `OperationContext.allowedSlugPrefixes` (v0.23) is the trusted-workspace allow-list set by the dream cycle. `put_page` enforces: when `viaSubagent` and `allowedSlugPrefixes` is set, slug must match the allow-list; else the legacy `wiki/agents//...` namespace check applies. Auto-link enabled for trusted-workspace writes (skipped only when `remote=true && !trustedWorkspace`). As of v0.26.0, every `Operation` also carries `scope?: 'read' | 'write' | 'admin'` + `localOnly?: boolean`. All ops are annotated; `sync_brain`, `file_upload`, `file_list`, and `file_url` are `admin + localOnly` (rejected over HTTP). `OperationContext.auth?: AuthInfo` is threaded through HTTP dispatch for scope enforcement in `serve-http.ts` before the op runs. **v0.26.9 (D12 + F7b):** `OperationContext.remote` is now a REQUIRED field in the TypeScript type — the compiler is the first defense against transports that forget to set it. Four trust-boundary call sites (`put_page` allowlist, file_upload trust-narrowing, submit_job protected-name guard, auto-link skip) flipped from falsy-default (`!ctx.remote`) to fail-closed semantics (`ctx.remote === false` for "trusted-only" sites and `ctx.remote !== false` for "untrust unless explicit-false"). Anything that isn't strictly `false` is now treated as remote. Closed an HTTP MCP shell-job RCE: a `read+write`-scoped OAuth token could submit `shell` jobs because the HTTP request handler's literal context skipped `remote: true` and `submit_job`'s protected-name guard saw a falsy undefined. Stdio MCP set the field correctly via dispatch.ts; HTTP inlined a parallel context-builder for several releases and lost it. +- `src/core/engine.ts` — Pluggable engine interface (BrainEngine). `clampSearchLimit(limit, default, cap)` takes an explicit cap so per-operation caps can be tighter than `MAX_SEARCH_LIMIT`. Exports `LinkBatchInput` / `TimelineBatchInput` for the v0.12.1 bulk-insert API (`addLinksBatch` / `addTimelineEntriesBatch`). As of v0.13.1, `BrainEngine` has a `readonly kind: 'postgres' | 'pglite'` discriminator so migrations (`src/core/migrate.ts`) and other consumers can branch on engine without `instanceof` + dynamic imports. **v0.29:** four new methods — `batchLoadEmotionalInputs(slugs?)` (CTE-shaped read with per-table aggregates so a page × N tags × M takes never produces N×M rows), `setEmotionalWeightBatch(rows)` (`UPDATE FROM unnest($1::text[], $2::text[], $3::real[])` composite-keyed on `(slug, source_id)` for multi-source safety), `getRecentSalience(opts)`, `findAnomalies(opts)`. `PageFilters` extended with `sort?: 'updated_desc' | 'updated_asc' | 'created_desc' | 'slug'` + `PAGE_SORT_SQL` whitelist consumed by both engines (was hardcoded `ORDER BY updated_at DESC`). **v0.32.8 (PR #860):** new `listAllPageRefs(): Promise>` ordered by `(source_id, slug)`. Cheap cross-source enumeration for hot loops on large brains — replaces the `getAllSlugs()→getPage(slug)` N+1 pattern in extract-takes, extract, integrity, which silently defaulted to `source_id='default'` for non-default-source pages. Implementation parity across postgres-engine.ts + pglite-engine.ts. Pinned by `test/e2e/multi-source-bug-class.test.ts`. +- `src/core/engine-factory.ts` — Engine factory with dynamic imports (`'pglite'` | `'postgres'`) +- `src/core/pglite-engine.ts` — PGLite (embedded Postgres 17.5 via WASM) implementation, all 40 BrainEngine methods. `addLinksBatch` / `addTimelineEntriesBatch` use multi-row `unnest()` with manual `$N` placeholders. As of v0.13.1, `connect()` wraps `PGlite.create()` in a try/catch that emits an actionable error naming the macOS 26.3 WASM bug (#223) and pointing at `gbrain doctor`; the lock is released on failure so the next process can retry cleanly. v0.22.0: `searchKeyword` and `searchKeywordChunks` multiply `ts_rank` by the source-factor CASE expression at the chunk-grain level; `searchVector` becomes a two-stage CTE — inner CTE keeps `ORDER BY cc.embedding <=> vec` so HNSW stays usable, outer SELECT re-ranks by `raw_score * source_factor`. Inner LIMIT scales with offset to preserve pagination contract. As of v0.22.6.1, `initSchema()` calls `applyForwardReferenceBootstrap()` BEFORE replaying SCHEMA_SQL — probes for the specific forward-referenced state the embedded schema blob needs (`pages.source_id`, `links.link_source`, `links.origin_page_id`, `content_chunks.symbol_name`, `content_chunks.language`, `sources` FK target table) and adds only what's missing. Closes the upgrade-wedge bug class that bit users 10+ times across 6 schema versions over 2 years (#239/#243/#266/#357/#366/#374/#375/#378/#395/#396). No-op on fresh installs and modern brains. +- `src/core/pglite-schema.ts` — PGLite-specific DDL (pgvector, pg_trgm, triggers) +- `src/core/postgres-engine.ts` — Postgres + pgvector implementation (Supabase / self-hosted). `addLinksBatch` / `addTimelineEntriesBatch` use `INSERT ... SELECT FROM unnest($1::text[], ...) JOIN pages ON CONFLICT DO NOTHING RETURNING 1` — 4-5 array params regardless of batch size, sidesteps the 65535-parameter cap. As of v0.12.3, `searchKeyword` / `searchVector` scope `statement_timeout` via `sql.begin` + `SET LOCAL` so the GUC dies with the transaction instead of leaking across the pooled postgres.js connection (contributed by @garagon). `getEmbeddingsByChunkIds` uses `tryParseEmbedding` so one corrupt row skips+warns instead of killing the query. v0.22.0: `searchKeyword`, `searchKeywordChunks`, and `searchVector` apply source-aware ranking by inlining the source-factor CASE and `NOT (col LIKE …)` hard-exclude clause from `src/core/search/sql-ranking.ts`. `searchVector` switches to a two-stage CTE (HNSW-safe inner ORDER BY, source-boost re-rank in the outer SELECT) and carries `p.source_id` through inner→outer for v0.18 multi-source callers. v0.22.1 (#406): `_savedConfig` retains the connect config; `reconnect()` tears down + recreates the pool from saved config (called by supervisor watchdog after 3 consecutive health-check failures). `executeRaw` is a single-statement passthrough — no per-call retry (D3 dropped that as unsound for non-idempotent statements; recovery is supervisor-driven). v0.22.1 (#363, contributed by @orendi84): `connect()` applies `resolveSessionTimeouts()` from `db.ts` as connection-time startup parameters (`statement_timeout`, `idle_in_transaction_session_timeout`) so orphan pgbouncer backends can't hold locks for hours. v0.22.1 (#409, contributed by @atrevino47): `countStaleChunks()` + `listStaleChunks()` server-side-filter on `embedding IS NULL` for `embed --stale`, eliminating ~76 MB/call client-side pull on a fully-embedded brain; `upsertChunks()` resets both `embedding` AND `embedded_at` to NULL when chunk_text changes without a new embedding (consistency). As of v0.22.6.1, `initSchema()` calls `applyForwardReferenceBootstrap()` BEFORE replaying SCHEMA_SQL on the same forward-reference probe set as the PGLite engine, so old Postgres brains pinned at v0.13/v0.18/v0.19 walk forward cleanly instead of wedging on `column "..." does not exist`. **v0.28.1:** `disconnect()` is now idempotent. New `_connectionStyle` instance field tracks whether the engine owns its pool (worker engines) or shares the module-level singleton; second call on an instance-pool engine is a no-op rather than falling through to `db.disconnect()` and clobbering the singleton. Pinned by `test/e2e/postgres-engine-disconnect-idempotency.test.ts` (2 cases). Closes the bug class where any test sharing an engine across multiple `worker.start()` / `worker.stop()` cycles silently broke its own DB connectivity. +- `src/core/cjk.ts` (v0.32.7 CJK wave) — Single source of truth for CJK detection across the codebase. Exports `CJK_RANGES_REGEX`, `CJK_SLUG_CHARS` (character-class fragment for embedding inside other regexes), `CJK_SENTENCE_DELIMITERS` (`。!?`), `CJK_CLAUSE_DELIMITERS` (`;:,、`), `CJK_DENSITY_THRESHOLD = 0.30`, `hasCJK(s)`, `countCJKAwareWords(s)` (30% density threshold — English docs with one Japanese term stay whitespace-tokenized; Chinese-dominant docs get char-counted), and `escapeLikePattern(s)` (escapes `%`, `_`, `\\` for `ILIKE ... ESCAPE '\\'`). Replaces the inline hasCJK regex previously duplicated at `expansion.ts:58`. BMP-only ranges (Han / Hiragana / Katakana / Hangul Syllables); widening to Unicode property escapes is a v0.33+ TODO. Consumers: `expansion.ts`, `sync.ts:slugifySegment`, `operations.ts:validatePageSlug + validateFilename`, `chunkers/recursive.ts:countWords + DELIMITERS`, `pglite-engine.ts:searchKeyword + searchKeywordChunks`. +- `src/core/audit-slug-fallback.ts` (v0.32.7 CJK wave) — Weekly ISO-week-rotated audit JSONL at `~/.gbrain/audit/slug-fallback-YYYY-Www.jsonl`. `logSlugFallback(slug, sourcePath)` fires when `importFromFile` falls back to a frontmatter slug because `slugifyPath` returned empty (emoji / Thai / Arabic / non-CJK exotic-script filenames). `readRecentSlugFallbacks(days)` reads the last N days for `gbrain doctor`'s `slug_fallback_audit` check. Honors `GBRAIN_AUDIT_DIR` via the shared `resolveAuditDir()` from shell-audit.ts. Separate surface from `sync-failures.jsonl` per codex outside-voice review — that file carries bookmark-gating semantics that info events shouldn't trigger. +- `src/core/embedding-pricing.ts` (v0.32.7 CJK wave) — `EMBEDDING_PRICING` map keyed `provider:model` for the post-upgrade reindex cost estimate. Sibling to `anthropic-pricing.ts`. Entries: OpenAI text-embedding-3-large ($0.13/1M), 3-small ($0.02/1M), ada-002 ($0.10/1M), Voyage 3-large ($0.18/1M), 3 ($0.06/1M). `lookupEmbeddingPrice(modelString)` returns a tagged union (`known` with price + `unknown` with provider name); `estimateCostFromChars(charCount, pricePerMTok)` uses 3.5 chars/token approximation. Unknown providers degrade gracefully to "estimate unavailable" instead of fabricating numbers. +- `src/core/post-upgrade-reembed.ts` (v0.32.7 CJK wave) — Pure functions backing the `gbrain upgrade` chunker-bump cost prompt. `computeReembedEstimate(engine, model)` queries real SQL (`COUNT(*)` + `COALESCE(SUM(LENGTH(compiled_truth)) + SUM(LENGTH(timeline)), 0)`) on `pages WHERE chunker_version < MARKDOWN_CHUNKER_VERSION`. `formatReembedPrompt(est, graceSeconds)` is the stderr-line formatter. `runPostUpgradeReembedPrompt(engine, model, opts)` orchestrates the 10-second Ctrl-C window; TTY-only wait (non-TTY auto-proceeds for CI / cron); `GBRAIN_NO_REEMBED=1` bails out with a doctor-warning marker; `GBRAIN_REEMBED_GRACE_SECONDS=0` skips the wait. +- `src/commands/reindex.ts` (v0.32.7 CJK wave) — `gbrain reindex --markdown [--limit N] [--dry-run] [--json] [--no-embed] [--repo PATH]`. Walks `pages WHERE page_kind = 'markdown' AND chunker_version < MARKDOWN_CHUNKER_VERSION` in 100-row batches, ordered by id. Rows with non-null `source_path` re-import via `importFromFile`; rows without fall back to `importFromContent` against the stored `compiled_truth`. **Both paths pass `forceRechunk: true`** to bypass `importFromContent`'s `content_hash` short-circuit — without that flag (codex post-merge F1), the chunker version bump never reaches pages whose source content hasn't changed since last sync, AND master's v0.32.2 stripFactsFence privacy strip never applies to pre-strip chunks. Idempotent — partial-completion re-runs pick up where they left off via id-ordered batches. Wired into `src/commands/upgrade.ts:runPostUpgrade` after `apply-migrations`. +- `src/commands/sync.ts:resolveSlugByPathOrSourcePath` (v0.32.7 CJK wave, codex post-merge F4) — Resolves a slug by `pages.source_path` first (returns the stored slug for frontmatter-fallback pages whose path doesn't derive a slug), then falls back to `resolveSlugForPath(path)`. Threaded into all 4 delete/rename call sites (`performSync`'s un-syncable cleanup at ~:531, deletes at ~:603, rename oldSlug at ~:622). Without this, emoji-only / Thai / Arabic filenames whose slug came from frontmatter would orphan on delete/rename (the delete path would compute the wrong path-derived slug). Best-effort query — pre-migration brains fall through to the legacy path. +- `src/core/utils.ts` — Shared SQL utilities extracted from postgres-engine.ts. Exports `parseEmbedding(value)` (throws on unknown input, used by migration + ingest paths where data integrity matters) and as of v0.12.3 `tryParseEmbedding(value)` (returns `null` + warns once per process, used by search/rescore paths where availability matters more than strictness). **v0.26.9 (D14):** adds `isUndefinedColumnError(err)` predicate — pattern-matches Postgres SQLSTATE 42703 / "column ... does not exist" with engine-driver shape variation tolerated. Replaces bare `catch {}` blocks in `oauth-provider.ts` so genuine errors (lock timeout, network blip, permission denied) propagate while column-missing falls through to the legacy fallback path. Reusable from any future code that needs the same column-existence probe semantics. **v0.32.8 (PR #860):** adds `validateSourceId(id)` that throws on anything outside `^[a-z0-9_-]+$`. Used by the per-source disk-layout fix in patterns.ts/synthesize.ts before any `join(brainDir, '.sources', source_id, slug+'.md')` call so source_id can't traverse out of brainDir. `rowToPage` updated to populate the now-required `Page.source_id` field from the SELECT projection (`scripts/check-source-id-projection.sh` enforces that every projection feeding `rowToPage` includes the column). +- `src/core/db.ts` — Connection management, schema initialization. v0.22.1 (#363, contributed by @orendi84): `resolveSessionTimeouts()` returns `statement_timeout` + `idle_in_transaction_session_timeout` (defaults: 5min each, env-overridable via `GBRAIN_STATEMENT_TIMEOUT` / `GBRAIN_IDLE_TX_TIMEOUT` / `GBRAIN_CLIENT_CHECK_INTERVAL`). Both `connect()` (module singleton) and `PostgresEngine.connect()` (worker pool) consume the result via postgres.js's `connection` option, sending GUCs as startup parameters that survive PgBouncer transaction mode (unlike the prior `setSessionDefaults` post-pool SET, kept as a back-compat no-op shim). +- `src/commands/migrate-engine.ts` — Bidirectional engine migration (`gbrain migrate --to supabase/pglite`) +- `src/core/import-file.ts` — importFromFile + importFromContent (chunk + embed + tags) +- `src/core/sync.ts` — Pure sync functions (manifest parsing, filtering, slug conversion). v0.22.12 (#500, foundation by @wintermute via #501): `classifyErrorCode(errorMsg)` regex-based classifier with 12 codes (`SLUG_MISMATCH`, `YAML_PARSE`, `YAML_DUPLICATE_KEY`, `MISSING_OPEN`, `MISSING_CLOSE`, `NESTED_QUOTES`, `EMPTY_FRONTMATTER`, `NULL_BYTES`, `INVALID_UTF8`, `STATEMENT_TIMEOUT`, `FILE_TOO_LARGE`, `SYMLINK_NOT_ALLOWED`) plus `UNKNOWN` fallback. `summarizeFailuresByCode(failures)` returns sorted `[{code, count}]`. `code?` optional field on `SyncFailure`; backfilled at ack time on pre-v0.22.12 entries. `acknowledgeSyncFailures()` returns `AcknowledgeResult { count, summary }`. Three regexes (`MISSING_OPEN`, `MISSING_CLOSE`, `EMPTY_FRONTMATTER`) broadened to match actual `markdown.ts:159-244` validator message strings, not just the literal code-name prefix. `FILE_TOO_LARGE` covers all three production size sites in `import-file.ts:199, 352, 401`; `SYMLINK_NOT_ALLOWED` covers the rejection at `:347`. Closes the silent-skip pattern that motivated #500. +- `src/core/storage.ts` — Pluggable storage interface (S3, Supabase Storage, local) +- `src/core/storage-config.ts` (v0.22.11) — Storage tiering: `loadStorageConfig` reads `gbrain.yml`, normalizes deprecated keys (`git_tracked` / `supabase_only`) to canonical (`db_tracked` / `db_only`) with once-per-process deprecation warning, and runs `normalizeAndValidateStorageConfig` (auto-fixes missing trailing `/`, throws `StorageConfigError` on tier overlap). Path-segment matcher: `media/x/` does NOT match `media/xerox/foo`. Replaces gray-matter (broken on delimiter-less YAML) with a dedicated parser for the `gbrain.yml` shape. +- `src/core/disk-walk.ts` (v0.22.11) — `walkBrainRepo(repoPath)` returns `Map` from one recursive `readdirSync`. Skips dot-dirs, `node_modules`, non-`.md` files. Used by `gbrain storage status` to replace per-page `existsSync + statSync` (~400K syscalls on 200K-page brains → tens). +- `src/commands/storage.ts` (v0.22.11) — `gbrain storage status [--repo P] [--json]`. Split into pure data (`getStorageStatus`) + JSON formatter + human formatter (ASCII-only per D10) matching the `orphans.ts` pattern. `PageCountsByTier` and `DiskUsageByTier` are distinct nominal types so swaps fail at compile time. +- `gbrain.yml` (brain repo root, v0.22.11) — Optional storage tiering config. Top-level `storage:` section with `db_tracked:` and `db_only:` array-valued keys. `gbrain sync` auto-manages `.gitignore` for `db_only` paths on successful sync (skips on dry-run, blocked-by-failures, submodule context, or `GBRAIN_NO_GITIGNORE=1`). `gbrain export --restore-only [--repo P] [--type T] [--slug-prefix S]` repopulates missing `db_only` files from the database. +- `src/core/supabase-admin.ts` — Supabase admin API (project discovery, pgvector check) +- `src/core/file-resolver.ts` — File resolution with fallback chain (local -> .redirect.yaml -> .redirect -> .supabase) +- `src/core/chunkers/` — 3-tier chunking (recursive, semantic, LLM-guided). v0.19.0 adds `code.ts` — tree-sitter-based semantic chunker for 29 languages with embedded-asset WASMs (`src/assets/wasm/`), `@dqbd/tiktoken` cl100k_base tokenizer, small-sibling merging. `CHUNKER_VERSION` constant folded into `importCodeFile`'s `content_hash` so chunker shape changes force clean re-chunks across releases. +- `src/core/errors.ts` (v0.19.0) — `StructuredAgentError` + `buildError` + `serializeError`. Every new v0.19.0 agent-facing surface (code-def, code-refs, usage errors) uses this envelope; matches v0.17.0 `CycleReport.PhaseResult.error` shape. +- `src/assets/wasm/` (v0.19.0) — 36 tree-sitter grammar WASMs + tree-sitter runtime. Committed to the repo so `bun --compile` embeds them deterministically via `import path from ... with { type: 'file' }`. The CI guard `scripts/check-wasm-embedded.sh` fails the build if the compiled binary ever silently falls through to recursive chunks. +- `src/commands/code-def.ts` + `src/commands/code-refs.ts` (v0.19.0) — symbol definition + references lookup. Query `content_chunks.symbol_name` or chunk_text ILIKE with `page_kind='code'` filter. Auto-JSON when stdout is not a TTY (gh-CLI convention). Bypass the standard `searchKeyword` `DISTINCT ON (slug)` collapse so multiple call-sites from the same file surface. +- `src/core/search/` — Hybrid search: vector + keyword + RRF + multi-query expansion + dedup. As of v0.22.0, `searchKeyword` / `searchKeywordChunks` / `searchVector` apply source-aware ranking at the SQL layer (curated content like `originals/`, `concepts/`, `writing/` outranks bulk content like `wintermute/chat/`, `daily/`, `media/x/`). `searchVector` uses a two-stage CTE so source-boost re-ranking doesn't kill the HNSW index. Hard-exclude prefixes (`test/`, `archive/`, `attachments/`, `.raw/` by default) filter at retrieval, not post-rank. Both gates honor `detail !== 'high'` so temporal queries surface chat pages normally. +- `src/core/search/intent.ts` — Query intent classifier (entity/temporal/event/general → auto-selects detail level) +- `src/core/search/eval.ts` — Retrieval eval harness: P@k, R@k, MRR, nDCG@k metrics + runEval() orchestrator +- `src/core/search/source-boost.ts` (v0.22.0) — Source-type boost map keyed by slug prefix. `DEFAULT_SOURCE_BOOSTS` (originals/ 1.5, concepts/ 1.3, writing/ 1.4, people/companies/deals/ 1.2, daily/ 0.8, media/x/ 0.7, wintermute/chat/ 0.5) and `DEFAULT_HARD_EXCLUDES` (test/, archive/, attachments/, .raw/). `parseSourceBoostEnv` / `parseHardExcludesEnv` parse comma-separated `prefix:factor` pairs from `GBRAIN_SOURCE_BOOST` / `GBRAIN_SEARCH_EXCLUDE` env vars. `resolveBoostMap` and `resolveHardExcludes` merge defaults + env + caller `SearchOpts.exclude_slug_prefixes`/`include_slug_prefixes`. +- `src/core/search/sql-ranking.ts` (v0.22.0) — Pure SQL string builders. `buildSourceFactorCase(slugColumn, boostMap, detail)` emits a CASE expression with longest-prefix-match wins (returns literal `'1.0'` when `detail === 'high'` for temporal-bypass parity with COMPILED_TRUTH_BOOST). `buildHardExcludeClause(slugColumn, prefixes)` emits `NOT (col LIKE 'p1%' OR col LIKE 'p2%')` — OR-chain wrapped in NOT, NOT `NOT LIKE ALL/ANY` (those quantifiers don't express set-exclusion). LIKE meta-character escape covers all three of `%`, `_`, AND `\` (backslash matters because it's Postgres LIKE's default escape char). Single-quote doubling on SQL string literals so injection-style inputs are inert text. +- `src/commands/eval.ts` — `gbrain eval` command: single-run table + A/B config comparison. v0.25.0 adds sub-subcommand dispatch on `args[0]` so `gbrain eval export` + `gbrain eval prune` + `gbrain eval replay` route into session-capture handlers; bare `gbrain eval --qrels …` fall-through preserves the legacy IR-metrics flow. v0.27.x adds `gbrain eval cross-modal` to the dispatch (the user-facing path is the cli.ts no-DB branch — `src/commands/eval.ts:cross-modal` only fires when callers re-enter with an existing engine). +- `src/commands/eval-cross-modal.ts` (v0.27.x) — multi-model quality gate. Three different-provider frontier models score the OUTPUT against the TASK on a 5-dim list. Verdict `pass` (exit 0) / `fail` (exit 1) / `inconclusive` (exit 2; <2/3 model successes per Q3=A in plans/radiant-napping-lerdorf.md). Reuses `src/core/ai/gateway.ts:chat()` so config/auth/aliasing comes from the gateway recipe registry — no parallel provider stack. Self-configures the gateway (`configureGateway(loadConfig() + process.env)`) since the cli.ts dispatch bypasses `connectEngine()`. Default cycles 3 in TTY, 1 in non-TTY (T11=B partial cost guardrail). Receipts land at `gbrainPath('eval-receipts')/-.json`. The full `--budget-usd` cap is a v0.27.x follow-up TODO. +- `src/core/cross-modal-eval/json-repair.ts` (v0.27.x) — `parseModelJSON(raw)` named export with a 4-strategy fallback chain (direct parse → fence-strip → trailing-comma + single-quote + embedded-newline repair → regex nuclear option). Adversarial input throws rather than fabricating scores — the aggregator treats a throw as "this model contributed nothing this cycle" so the gate stays correct at >=2/3 successes. +- `src/core/cross-modal-eval/aggregate.ts` (v0.27.x) — pure verdict logic. Pass criterion: `(successes >= 2) AND (every dim mean >= 7) AND (every dim min across models >= 5)` (Q2=A floor). Inconclusive when <2/3 models returned parseable scores (Q3=A regression guard for the v1 .mjs `Object.values({}).every(...) === true` empty-array PASS bug). +- `src/core/cross-modal-eval/runner.ts` (v0.27.x) — orchestrator. Each cycle runs `Promise.allSettled([gwChat(slotA), gwChat(slotB), gwChat(slotC)])` (T4=A — bare allSettled, no rate-leases for the CLI path; minion-integration TODO recovers cross-process concurrency). Stops early on PASS or INCONCLUSIVE; runs up to 3 cycles. Default slots: `openai:gpt-4o` / `anthropic:claude-opus-4-7` / `google:gemini-1.5-pro`. `estimateCost()` exports a small per-model pricing table (drifts; refresh alongside model-family bumps). +- `src/core/cross-modal-eval/receipt-name.ts` (v0.27.x) — receipt filename binds (slug, SKILL.md sha-8). `findReceiptForSkill(skillPath, receiptDir)` returns `'found' | 'stale' | 'missing'` (T10=A). Skillify-check item 11 surfaces the status as informational (T7=C); the audit does NOT fail on missing/stale receipts. +- `src/core/cross-modal-eval/receipt-write.ts` (v0.27.x) — wraps `fs.writeFileSync` with `mkdirSync({recursive:true})` ahead of every write (T5 correction; `gbrainPath()` does NOT auto-mkdir). +- `src/commands/eval-export.ts` (v0.25.0) — streams `eval_candidates` rows as NDJSON to stdout with `schema_version: 1` prefix on every line. EPIPE-safe, progress heartbeats on stderr, stable id-desc tiebreaker so `--since` windows never dupe/miss rows. +- `src/commands/eval-prune.ts` (v0.25.0) — explicit retention cleanup. Requires `--older-than DUR`. `--dry-run` reports would-delete count. +- `src/commands/eval-replay.ts` (v0.25.0) — contributor-facing replay tool. Reads NDJSON from `gbrain eval export`, re-runs each captured `query` / `search` op against the current brain, computes set-Jaccard@k between captured + current `retrieved_slugs`, top-1 stability rate, and latency Δ. Stable JSON shape (`schema_version: 1`) for CI gating; human mode prints a regression table. Pure Bun, zero new deps. The dev-loop half of BrainBench-Real that closes the gap between "data captured" and "data used to gate a PR." See `docs/eval-bench.md` for the workflow. +- `src/commands/eval-suspected-contradictions.ts` + `src/core/eval-contradictions/{judge,runner,types,date-filter,cost-tracker,cache,severity-classify,cross-source,trends,calibration,judge-errors,auto-supersession,fixture-redact}.ts` (v0.32.6) — `gbrain eval suspected-contradictions [run|trend|review]`. Probe samples top-K retrieval pairs per query (cross-slug + intra-page chunk-vs-take), date pre-filters (3-rule layered — same-paragraph-dual-date overrides separation rule), LLM judge (query-conditioned per Codex; UTF-8-safe truncation; C1 confidence-floor double-enforcement; resolution_kind output drives M7 paste-ready commands), persistent cache keyed on `(chunk_a_hash, chunk_b_hash, model_id, prompt_version, truncation_policy)` (Codex outside-voice fix — prompt edits cleanly invalidate prior verdicts), Wilson 95% CI calibration on the headline percentage with `small_sample_note` when n<30, judge_errors as first-class typed counters (parse_fail/refusal/timeout/http_5xx/unknown — Codex fix to bias from silent skip), M5 trend writes to `eval_contradictions_runs`, M6 source-tier breakdown reuses `DEFAULT_SOURCE_BOOSTS` prefix logic, deterministic sampling (combined_score DESC + lex tiebreaker — stable cache hit-rate across re-runs). Hermetic via `judgeFn` + `searchFn` DI in the runner; never touches the real gateway in tests. Engine surface: `BrainEngine.listActiveTakesForPages` (P1 batched), `writeContradictionsRun` + `loadContradictionsTrend` (M5), `getContradictionCacheEntry` + `putContradictionCacheEntry` + `sweepContradictionCache` (P2). Schema migrations v51 + v52. MCP op `find_contradictions` (read scope, NOT localOnly, NOT in subagent allowlist — user-initiated only). M1 doctor check surfaces high-severity findings with paste-ready resolution commands. M2 synthesize phase pre-fetches latest probe's top-5-by-severity findings and threads them into `buildSynthesisPrompt` as an informational block. 226 hermetic unit tests + 12 real-Postgres E2E. Plan: `~/.claude/plans/system-instruction-you-are-working-hashed-dewdrop.md`. Architecture doc: `docs/contradictions.md`. +- `src/commands/eval-longmemeval.ts` + `src/eval/longmemeval/{harness,adapter,sanitize}.ts` (v0.28.1) — `gbrain eval longmemeval ` runs the public [LongMemEval](https://huggingface.co/datasets/xiaowu0162/longmemeval) benchmark against gbrain's hybrid retrieval. Architecture: one in-memory PGLite per benchmark run created via `createBenchmarkBrain` + `withBenchmarkBrain` (NO `EphemeralBrain` class). Between questions, `TRUNCATE` over runtime-enumerated `pg_tables` so future schema migrations don't silently leak data across questions; infrastructure tables (`sources`, `config`, `gbrain_cycle_locks`, `subagent_rate_leases`) are preserved. `cli.ts` has a pre-dispatch bypass so `eval longmemeval` skips `connectEngine()` — the user's `~/.gbrain` brain is never opened. `--expansion` defaults to OFF (deterministic, no per-query Haiku call); pass `--expansion` to opt in. Default model resolves through `resolveModel()` 6-tier chain with `models.eval.longmemeval` as the new config key. Sanitization parity: `harness.ts` re-uses `INJECTION_PATTERNS` from `src/core/think/sanitize.ts` (now exported, line 22) so adding a pattern automatically covers takes AND benchmarks. Retrieved chat content is wrapped in `` framing; the answer-gen system prompt declares the content UNTRUSTED. LLM injection seam: `runEvalLongMemEval(args, {client?: ThinkLLMClient})` lets tests stub the client so the full pipeline runs without an Anthropic API key. p50 25.9ms / p99 30.3ms warm reset+import+search on Apple Silicon (per `test/eval-longmemeval.test.ts` perf gate). Hand the JSONL output to LongMemEval's `evaluate_qa.py` to score (their published evaluator, not bundled — needs OpenAI gpt-4o per their spec). +- `docs/eval-bench.md` (v0.25.0) — contributor guide for using captured data to benchmark retrieval changes before merging. Linked from CONTRIBUTING.md under "Running real-world eval benchmarks (touching retrieval code)". +- `src/core/eval-capture.ts` (v0.25.0) — op-layer capture wrapper called from `src/core/operations.ts` `query` + `search` handlers. Catches MCP + CLI + subagent tool-bridge from one site. Fire-and-forget; failures route to `engine.logEvalCaptureFailure` so `gbrain doctor` sees drops cross-process. **Capture is off by default** — `isEvalCaptureEnabled` resolution: explicit `config.eval.capture` (true/false) wins, else `process.env.GBRAIN_CONTRIBUTOR_MODE === '1'`, else off. Production users get a quiet brain; contributors set `export GBRAIN_CONTRIBUTOR_MODE=1` in `.zshrc` to enable the dev loop. PII scrubber gate is independent and defaults to true regardless of CONTRIBUTOR_MODE. +- `src/core/eval-capture-scrub.ts` (v0.25.0) — zero-deps PII scrubber: emails, phones, SSN, Luhn-verified credit cards, JWT-shaped tokens, bearer tokens. +- `src/core/search/hybrid.ts` — Cathedral II `Promise` return shape unchanged in v0.25.0. Adds `onMeta?: (m: HybridSearchMeta) => void` callback so op-layer capture can record what hybridSearch actually did. Existing callers leave it undefined. +- `docs/eval-capture.md` (v0.25.0) — stable NDJSON schema reference for gbrain-evals consumers. +- `test/public-exports.test.ts` (v0.25.0 / R2) — runtime contract test. Imports each of the 17 public subpaths via package name and pins a canary symbol per module. Paired with `scripts/check-exports-count.sh`. +- `src/core/embedding.ts` — OpenAI text-embedding-3-large, batch, retry, backoff. **v0.28.7:** `BATCH_SIZE` reverted 50→100 — the original Voyage safety guard halved OpenAI throughput on every page. Per-recipe pre-split + recursive halving + adaptive shrink-on-miss now live in the gateway, so the outer paginator goes back to its original purpose: progress-callback granularity, not batch protection. +- `src/core/ai/types.ts` — provider/recipe types. **v0.28.7 (#680):** `EmbeddingTouchpoint` extended with optional `chars_per_token` (default 4 chars/token, matching OpenAI tiktoken on English) and `safety_factor` (default 0.8, budget-utilization ceiling). Both consulted only when `max_batch_tokens` is also set. Voyage declares `chars_per_token=1` + `safety_factor=0.5` to handle dense payloads (CJK/JSON/base64) that overshoot tiktoken. The pre-split budget is `max_batch_tokens × safety_factor / chars_per_token`. **v0.28.11 (#719):** `EmbeddingTouchpoint.multimodal_models?: string[]` model-level allow-list for recipes that mix text-only + multimodal models under one touchpoint (Voyage's 12 models share `supports_multimodal: true` but only `voyage-multimodal-3` accepts `/multimodalembeddings`). When omitted, recipe-level `supports_multimodal` is sufficient. `AIGatewayConfig.embedding_multimodal_model?: string` lets `embedMultimodal()` route to a different model than `embedding_model` — brains using OpenAI for text can use Voyage for images without flipping the primary embedding pipeline. +- `src/core/ai/gateway.ts` — unified seam for every AI call. **v0.28.7 (#680):** module-scoped `_embedTransport` defaulting to AI SDK `embedMany`, with `__setEmbedTransportForTests(fn)` test seam so tests drive the public `embed()` function with a stubbed transport instead of probing private helpers. `splitByTokenBudget` and `isTokenLimitError` are now exported `@internal` — pure functions reused directly by the test file. Module-level `_shrinkState: Map` halves the recipe's effective `safety_factor` on token-limit miss (floor 0.05) and heals back ×1.5 toward the ceiling after `SHRINK_HEAL_AFTER=10` consecutive successes. `configureGateway()` walks every registered recipe at construction time and emits a once-per-process stderr warning for any embedding touchpoint missing `max_batch_tokens` (excluding the canonical OpenAI fast-path recipe). `resetGateway()` clears `_shrinkState`, the warned-set, and restores the real transport. ASCII flow diagram embedded in the `embed()` JSDoc covers the routing decision, recursion + halving, and shrinkState lifecycle. **v0.28.11 (#719):** `embedMultimodal()` reads `cfg.embedding_multimodal_model` first (falls back to `cfg.embedding_model` for single-model setups). After the existing recipe-level `supports_multimodal` fast-fail, validates the resolved model against `touchpoint.multimodal_models` when declared — closes the Voyage-text-only-model-into-multimodal-endpoint footgun before any HTTP call (Codex F1 from PR review). New `getMultimodalModel()` accessor mirrors `getEmbeddingModel` / `getChatModel` so doctor and integration tests can read the gateway state. +- `src/core/ai/recipes/voyage.ts` — Voyage AI openai-compatible recipe. **v0.28.7 (#680):** declares `chars_per_token=1` + `safety_factor=0.5` so the gateway pre-splits Voyage batches at a 60K-character budget (50% of 120K-token cap with the dense-tokenizer ratio). Closes the v0.27 backfill loop where ~26% of the corpus stayed un-embedded because tiktoken-grounded budgeting silently undercounted Voyage's actual token usage. **v0.28.11 (#719):** declares `multimodal_models: ['voyage-multimodal-3']` so the gateway rejects text-only Voyage models pointed at the multimodal endpoint with a clear `AIConfigError` instead of waiting for Voyage's HTTP 400. +- `src/core/ai/recipes/anthropic.ts` — Anthropic recipe (chat + expansion touchpoints). **v0.31.12:** chat and expansion `models:` lists drop the v0.31.6 phantom `claude-sonnet-4-6-20250929` date suffix — canonical id is `claude-sonnet-4-6`. The wrong-direction alias `claude-sonnet-4-6 → claude-sonnet-4-6-20250929` is removed; a reverse alias `claude-sonnet-4-6-20250929 → claude-sonnet-4-6` keeps stale user configs working (rescues `facts.extraction_model` and `models.dream.synthesize` set by v0.31.6 installs). Recipe-shape regression pinned by `test/anthropic-model-ids.test.ts` (6 cases, verbatim cherry-pick of PR #830 plus the reverse-alias rescue case). +- `src/core/anthropic-pricing.ts` — Single source of truth for Anthropic model pricing (per-MTok input/output). **v0.31.12:** Opus 4.7 corrected from `$15/$75` to `$5/$25` (the old number was from Opus 4 generation, never refreshed when 4.7 shipped); Opus 4.6 also corrected. Consumed by `src/core/budget-meter.ts` and `src/core/cross-modal-eval/runner.ts` — the cross-modal estimator now reads `ANTHROPIC_PRICING` for Anthropic models instead of duplicating the table, killing the v0.31.6 drift bug class. +- `src/core/model-config.ts` — Model-string resolution (the seam every internal LLM call walks through). **v0.31.12:** four-tier system (`ModelTier = 'utility' | 'reasoning' | 'deep' | 'subagent'`) with `TIER_DEFAULTS` (utility→haiku-4-5, reasoning→sonnet-4-6, deep→opus-4-7, subagent→sonnet-4-6) and `tier?: ModelTier` on `ResolveModelOpts`. Resolution chain is now 8 steps: cliFlag → deprecated key → config key → `models.default` → `models.tier.` → env var → `TIER_DEFAULTS[tier]` → caller fallback. Two new exports — `isAnthropicProvider(modelString)` checks `provider:model` prefix OR `claude-` bare-id pattern, and `enforceSubagentAnthropic()` is the layer-2 runtime guard: when `tier === 'subagent'` resolves to a non-Anthropic provider, it emits a once-per-`(source, model)` stderr warn AND falls back to `TIER_DEFAULTS.subagent` instead of letting the Anthropic Messages API tool-loop attempt to run on OpenAI/Gemini. `_resetDeprecationWarningsForTest()` now also clears `_subagentTierWarningsEmitted` so tests re-emit. +- `src/core/ai/model-resolver.ts` — Recipe-touchpoint validator. **v0.31.12:** `assertTouchpoint(recipe, touchpoint, modelId, extendedModels?)` gains an optional 4th `extendedModels: ReadonlySet` argument. When the modelId is in that set, the native-recipe allowlist throw is bypassed — the user explicitly opted into this model via config so we let provider rejection surface as `model_not_found` at HTTP call time (and `gbrain models doctor` catches it earlier). Default code paths with hardcoded model strings MUST NOT pass `extendedModels` — typos in source code still fail fast. Replaces the earlier plan to soften the validator wholesale (Codex F4/F5 in plan review flagged that as too broad — it would have removed the fail-fast contract for chat + expand + embed all three). +- `src/core/ai/gateway.ts` extension (v0.31.12) — new module-scoped `_extendedModels: Map>` registry feeds `assertTouchpoint`'s 4th-arg path. New `reconfigureGatewayWithEngine(engine)` async function is called from `cli.ts` after `engine.connect()` (and before every command except `CLI_ONLY` no-DB commands) — re-resolves expansion + chat defaults through `resolveModel()` so `models.tier.*` and `models.default` overrides apply to expansion + chat both. `DEFAULT_CHAT_MODEL` corrected to `anthropic:claude-sonnet-4-6` (was the v0.31.6 phantom `-20250929`). New `__setChatTransportForTests` seam mirrors `__setEmbedTransportForTests` so tests drive `chat()` with a stubbed transport. +- `src/core/minions/queue.ts` extension (v0.31.12) — `MinionQueue.add()` now rejects `subagent` jobs whose `data.model` resolves through `isAnthropicProvider()` to a non-Anthropic provider. Lazy-imports `model-config.ts` to avoid pulling engine types into queue's eager-load surface. Layer 1 of the three-layer subagent provider enforcement (Codex F1+F2 in plan review). Layers 2 + 3 live in `src/core/model-config.ts` (`enforceSubagentAnthropic` runtime fallback) and `src/commands/doctor.ts` (`subagent_provider` check). Pinned by 3 cases in `test/agent-cli.test.ts`. +- `src/commands/models.ts` (v0.31.12) — `gbrain models [--json]` read-only routing dashboard: prints tier defaults (`utility`/`reasoning`/`deep`/`subagent`), the resolved value for each (re-walking the resolution chain to attribute properly), every per-task override (11 `PER_TASK_KEYS` entries — `models.dream.synthesize`, `models.dream.patterns`, `models.drift`, `models.auto_think`, `models.think`, `models.subagent`, `facts.extraction_model`, `models.eval.longmemeval`, `models.expansion`, `models.chat`, `models.dream.synthesize_verdict`), the alias map (defaults + user overrides), and a source-of-truth column showing `default` / `config: ` / `env: `. `gbrain models doctor [--skip=] [--json]` fires a 1-token `gateway.chat()` probe against each configured chat + expansion model and classifies failures into `{model_not_found, auth, rate_limit, network, unknown}` — the structural fix for the v0.31.6 silent-no-op bug class. Wired into `cli.ts` dispatch table + `CLI_ONLY` set. +- `src/commands/doctor.ts` extension (v0.31.12) — new `subagent_provider` check (layer 3 of 3 — Codex F13). Warns when `models.tier.subagent` is explicitly set to a non-Anthropic provider (fail-loud since the user clearly meant it — message names the bad value and prints the paste-ready fix command `gbrain config set models.tier.subagent anthropic:claude-sonnet-4-6`); also warns when `models.default` would sneak `subagent` into a non-Anthropic provider via tier inheritance. OK status when subagent tier resolves to Anthropic. Tests cover all three paths in `test/doctor.test.ts`. +- `src/core/check-resolvable.ts` — Resolver validation: reachability, MECE overlap, DRY checks, structured fix objects. v0.14.1: `CROSS_CUTTING_PATTERNS.conventions` is an array (notability gate accepts both `conventions/quality.md` and `_brain-filing-rules.md`). New `extractDelegationTargets()` parses `> **Convention:**`, `> **Filing rule:**`, and inline backtick references. DRY suppression is proximity-based via `DRY_PROXIMITY_LINES = 40`. +- `src/core/repo-root.ts` — Shared `findRepoRoot(startDir?)` (v0.16.4): walks up from `startDir` (default `process.cwd()`) looking for `skills/RESOLVER.md`. Zero-dependency module imported by both `doctor.ts` and `check-resolvable.ts`. Parameterized `startDir` makes tests hermetic. **v0.31.7:** read-path / write-path split. `autoDetectSkillsDir` (shared, read+write-safe) gains tier-0 `$GBRAIN_SKILLS_DIR` explicit operator override (Docker mounts, CI, monorepo subdirs) ahead of the existing 4-tier chain. New `autoDetectSkillsDirReadOnly` wraps it with a tier-5 install-path fallback that walks up from `fileURLToPath(import.meta.url)` and gates on `isGbrainRepoRoot` so unrelated repos can't false-positive. Read-path callers (`doctor`, `check-resolvable`, `routing-eval`) use the read-only variant; write-path callers (`skillpack install`, `skillify scaffold`, `post-install-advisory`) deliberately stay on the shared function so `gbrain skillpack install` from `~` cannot silently retarget the bundled gbrain repo's `skills/` instead of the user's actual workspace. Two new `SkillsDirSource` variants: `'env_explicit'`, `'install_path'`. New `AUTO_DETECT_HINT_READ_ONLY` documents the extra tier. The D6 `--fix` safety gate in `doctor.ts` + `check-resolvable.ts` refuses auto-repair when `detected.source === 'install_path'` so `gbrain doctor --fix` from `~` cannot silently rewrite the bundled install tree. +- `src/commands/check-resolvable.ts` — Standalone CLI wrapper (v0.16.4) over `checkResolvable()`. Exports `parseFlags`, `resolveSkillsDir`, `DEFERRED`, `runCheckResolvable`. Exit rule: **1 on any issue (warnings OR errors)**, stricter than doctor's `ok` flag — honors README:259. Stable JSON envelope `{ok, skillsDir, report, autoFix, deferred, error, message}` — same shape on success and error paths. `--fix` path runs `autoFixDryViolations` BEFORE `checkResolvable` (same ordering as doctor). `scripts/skillify-check.ts` subprocess-calls `gbrain check-resolvable --json` (cached per process) and fails loud on binary-missing — no silent false-pass. **v0.19:** AGENTS.md workspaces now resolve natively (see `src/core/resolver-filenames.ts`) — gbrain inspects the 107-skill OpenClaw deployment whether the routing file is `RESOLVER.md` or `AGENTS.md`. `DEFERRED[]` is empty — Checks 5 + 6 shipped as real code, not issue URLs. **v0.31.7:** the resolver lookup switched from first-match-wins to the multi-file merge in `src/core/check-resolvable.ts` — entries collected from every `RESOLVER.md` / `AGENTS.md` across the skills dir AND its parent, deduped by `skillPath` (first occurrence wins). Lifted reachable skills on the reference OpenClaw layout from 37/224 to 200/224 — the deployment ships a thin `skills/RESOLVER.md` (~40 entries from skillpack) plus a fat `../AGENTS.md` (200+ entries, the real dispatcher), and the previous code only saw the first one. The CLI also switched to `autoDetectSkillsDirReadOnly` so `cd ~ && gbrain check-resolvable` finds the bundled skills via the install-path fallback. `--fix` carries the same D6 safety gate as `gbrain doctor --fix`: refuses to write when `detected.source === 'install_path'`. +- `src/core/resolver-filenames.ts` (v0.19) — central list of accepted routing filenames (`RESOLVER.md`, `AGENTS.md`). Shared by `findRepoRoot`, `check-resolvable`, and skillpack install so every code path walks the same fallback chain. +- `src/commands/skillify.ts` + `src/core/skillify/{generator,templates}.ts` (v0.19) — `gbrain skillify scaffold ` creates all stubs for a new skill in one command: SKILL.md, script, tests, routing-eval.jsonl, resolver entry, filing-rules pointer. `gbrain skillify check