diff --git a/docs/specs/claude-mem-harness-for-gbrain.md b/docs/specs/claude-mem-harness-for-gbrain.md new file mode 100644 index 000000000..754cd6f62 --- /dev/null +++ b/docs/specs/claude-mem-harness-for-gbrain.md @@ -0,0 +1,304 @@ +# Spec: Claude-Mem Harness for GBrain + +## Goal + +Use claude-mem's hook pipeline code AS-IS to give gbrain reliable event +capture, context injection, and session-end consolidation. The hook layer +guarantees the brain-agent loop fires under all execution modes (/goal, +conversational, skill) without modifying gbrain's skill files or MCP server. + +## Constraints + +1. **claude-mem code can be deleted but not modified.** If a file needs + changes, create a new file instead. +2. **New files are highly discouraged.** Only create when there is no + alternative. Prefer reusing existing files by composition. +3. **gbrain's skill files and MCP server are untouched.** The hook layer + is additive infrastructure, not a replacement for the brain-agent loop. +4. **gbrain's ethos is preserved.** Hooks handle deterministic capture + (thin harness). Skills handle semantic interpretation (fat skills). + +## Why This Works + +claude-mem's hook pipeline has two clean layers: + +1. **Generic layer** (orchestrator, adapters, types, constants): handles + stdin parsing, platform normalization, exit codes, and timeout. Zero + storage coupling. Reusable AS-IS. +2. **Handler layer** (context.ts, observation.ts, summarize.ts): calls + an HTTP API to store/retrieve observations. The API endpoints are + the only coupling point. + +We reuse layer 1 entirely. For layer 2, we replace the HTTP targets: +instead of calling claude-mem's worker (`/api/sessions/observations`), +the handler appends to a JSONL event log and shells out to gbrain CLI. 
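The layer split described above can be sketched as follows, assuming the orchestrator accepts any handler that satisfies a small interface. The names `NormalizedInput`, `Handler`, `runHook`, and `gbrainHandler` are illustrative stand-ins, not the actual definitions in claude-mem's `types.ts` or `hook-command.ts`, and the in-memory `captured` list stands in for the JSONL event log.

```typescript
// Hypothetical shapes; claude-mem's real types.ts may differ.
interface NormalizedInput {
  sessionId: string;
  cwd: string;
  toolName: string;
}

interface HookResult {
  continue?: boolean;
  additionalContext?: string;
}

type Handler = (input: NormalizedInput) => HookResult;

// Layer 1 (generic): parse stdin, normalize, delegate, serialize.
// Nothing in this function knows where events are stored.
function runHook(rawStdin: string, handler: Handler): string {
  const parsed = JSON.parse(rawStdin) as Record<string, unknown>;
  const input: NormalizedInput = {
    sessionId: String(parsed["session_id"] ?? ""),
    cwd: String(parsed["cwd"] ?? ""),
    toolName: String(parsed["tool_name"] ?? "unknown"),
  };
  return JSON.stringify(handler(input));
}

// Layer 2 (swappable): a gbrain-flavoured handler. It records to an
// in-memory list here; the spec's version appends to a JSONL file.
const captured: string[] = [];
const gbrainHandler: Handler = (input) => {
  captured.push(`${input.sessionId}:${input.toolName}`);
  return { continue: true };
};
```

Swapping storage backends then means passing a different `Handler` to `runHook`; the generic layer is unchanged, which is the property the reuse plan relies on.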
+ +## Architecture + +``` +Claude Code hook event (PostToolUse, SessionStart, Stop) + | + v +bun-runner.js (AS-IS from claude-mem) + | + v +hook-command.ts (AS-IS from claude-mem, generic orchestrator) + | + v +claude-code adapter (AS-IS from claude-mem, normalizes input) + | + v +gbrain handler (NEW - replaces claude-mem handlers) + | + +--> PostToolUse: append JSONL to ~/.gbrain/events/.jsonl + +--> SessionStart: gbrain search -> format as additionalContext + +--> Stop: gbrain consolidate (async, skill-driven) +``` + +## Files from claude-mem: Reuse AS-IS + +These files are copied from claude-mem's source tree without modification. +They form the generic hook pipeline. + +| File | Lines | Role | +|------|-------|------| +| `plugin/scripts/bun-runner.js` | ~180 | Node-to-Bun bridge, spawns hook process | +| `src/cli/hook-command.ts` | 116 | Stdin -> adapter -> handler -> stdout | +| `src/cli/types.ts` | 46 | NormalizedHookInput, HookResult, interfaces | +| `src/cli/adapters/index.ts` | 22 | Adapter registry | +| `src/cli/adapters/claude-code.ts` | 42 | Claude Code input normalizer | +| `src/cli/adapters/errors.ts` | 11 | AdapterRejectedInput error class | +| `src/shared/hook-constants.ts` | 26 | Timeouts, exit codes | + +**Total reused: 7 files, ~443 lines, zero modifications.** + +### Import closure verification (Codex finding #3) + +Before copying, verify the import graph of these 7 files resolves +completely within the set. Known risk: `hook-command.ts` imports a +handler index and a stdin reader. The handler index is replaced by the +new file. The stdin reader must be verified: if it exists as a separate +file, add it to the reuse list. If it is inlined in hook-command.ts, +no action needed. + +Run: `grep -n "from\|import" src/cli/hook-command.ts` in the claude-mem +repo and trace every local import. Any file outside the 7-file set that +is imported must be either (a) added to the reuse list or (b) inlined +into the new handler file. 
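The manual trace above could also be automated. The sketch below is one possible approach, not part of claude-mem or gbrain: given the source text of each file in the reuse set, it lists every relative import that resolves outside the set. The function names are hypothetical, and the regex only catches simple `import`/`from` forms with relative specifiers, so its output should be treated as a checklist rather than proof of closure.

```typescript
// Extract relative import specifiers ("./x", "../y") from TypeScript source.
function localImports(source: string): string[] {
  const pattern = /(?:from|import)\s*\(?\s*["'](\.{1,2}\/[^"']+)["']/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const spec = match[1];
    if (spec) found.push(spec);
  }
  return found;
}

// Drop a trailing .js/.ts so "./x.js" and "x.ts" compare equal.
function stripExt(path: string): string {
  return path.replace(/\.(js|ts)$/, "");
}

// Resolve a relative specifier against the importing file's directory.
function resolveImport(fromFile: string, spec: string): string {
  const parts = fromFile.split("/").slice(0, -1);
  for (const seg of spec.split("/")) {
    if (seg === "..") parts.pop();
    else if (seg !== ".") parts.push(seg);
  }
  return stripExt(parts.join("/"));
}

// Return every imported path (extension stripped) that is not in the set.
function escapingImports(sources: Record<string, string>): string[] {
  const files = Object.keys(sources);
  const known = new Set(files.map(stripExt));
  const missing = new Set<string>();
  for (const file of files) {
    for (const spec of localImports(sources[file] ?? "")) {
      const target = resolveImport(file, spec);
      // "./adapters" may refer to "./adapters/index".
      if (!known.has(target) && !known.has(`${target}/index`)) {
        missing.add(target);
      }
    }
  }
  return Array.from(missing).sort();
}
```

An empty result suggests the set is closed under local imports; each entry returned is a candidate for option (a) or (b) above.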
+ ## Files from claude-mem: Delete (not needed) + +Everything else. The worker service, SQLite storage, ChromaDB sync, SDK +agent, MCP server, viewer UI, mode system, prompt templates, and all +platform adapters except claude-code. gbrain has its own equivalents for +all of these. + +## New Files + +### File count (corrected per Codex finding #2) + +| File | Type | Why it cannot be avoided | +|------|------|--------------------------| +| `src/cli/handlers/gbrain-handler.ts` | Handler | Replaces claude-mem's handler layer with gbrain targets | +| `~/.gbrain/hooks/gbrain-hook.sh` | Shell wrapper | Entry point called by Claude Code hook registration. Pipes stdin to bun-runner.js, writes stdout. Cannot be inlined into an existing file because Claude Code hooks require a command path. | + +**Total new files: 2** (1 TypeScript handler + 1 shell wrapper for all 3 events). + +The 3 hook events (PostToolUse, SessionStart, Stop) are routed by the +single shell wrapper using an argument. No need for 3 separate scripts. + +The `gbrain consolidate` CLI command is added to gbrain's existing +`src/commands/` directory. This is a new command in an existing command +registry file, not a new standalone file (same pattern as `gbrain doctor`, +`gbrain sync`, etc). + +### `gbrain-handler.ts` (the handler) + +Implements the `EventHandler` interface from claude-mem's `types.ts`. +Handles three events: + +#### PostToolUse + +Deterministic. No LLM call. Appends one JSONL line to the session event log. + +``` +Input: NormalizedHookInput (toolName, toolInput, toolResponse, sessionId, cwd) +Output: HookResult { continue: true } +Side effect: append to ~/.gbrain/events/.jsonl +``` + +Fields: timestamp, tool_name, tool_input (truncated to 500 chars), +tool_exit_code, cwd, session_id. No judgment. No filtering. + +Concurrency (Codex finding #9): open the event log with `O_APPEND` and +emit each event as a single `write()` call. With `O_APPEND` the kernel +positions every write at the current end of file, so concurrent hook +processes on a local filesystem do not overwrite each other's lines. +`PIPE_BUF` (4KB on Linux) is an atomicity limit for pipes and FIFOs, not +for regular files, so it is not the guarantee relied on here. Event lines +stay short regardless, because tool_input is truncated to 500 chars. +Network filesystems such as NFS may not honor append semantics, so the +event directory should live on local disk. + +Cost: ~1ms per tool call (file append). Exits 0 always. + +#### SessionStart + +Deterministic. No LLM call. Queries gbrain for prior context. + +``` +Input: NormalizedHookInput (sessionId, cwd) +Output: HookResult { additionalContext: "" } +Side effect: none +``` + +Output key is `additionalContext` (Codex finding #4: Claude Code hook +protocol uses `additionalContext` for context injection, not +`systemMessage`). + +Implementation: shell out to `gbrain search` with the project name +derived from cwd. Format results as markdown. Return in the HookResult. + +If gbrain is unavailable (CLI missing, DB not initialized), return +empty `additionalContext`. Exits 0 always. + +#### Stop + +**Async fire-and-forget.** The handler spawns `gbrain consolidate` as +a detached background process and immediately returns `HookResult +{ continue: true }`. This way the hook exits in <100ms (Codex finding +#8: LLM consolidation under a 120s timeout is unreliable). + +The consolidation runs outside the hook timeout. If it fails, the event +log persists at `~/.gbrain/events/.jsonl` for manual or +next-session processing. + +``` +Input: NormalizedHookInput (sessionId) +Output: HookResult { continue: true } +Side effect: spawns background consolidation process +``` + +### `gbrain-hook.sh` (the shell wrapper) + +```bash +#!/bin/sh +exec bun ~/.gbrain/hooks/bun-runner.js hook claude-code "$1" +``` + +One wrapper, one argument (`post-tool-use`, `session-start`, or `stop`). +Claude Code hook registration points all 3 events at this script with +different args. + +## `gbrain consolidate` CLI command + +Added to gbrain's existing command registry (not a new file). This is the +bridge between the hook layer (Plane A) and the skill layer (Plane B). + +``` +gbrain consolidate --session [--events-dir ~/.gbrain/events] +``` + +Implementation: +1. Read `~/.gbrain/events/.jsonl` +2. 
Format events as a structured tool-call timeline +3. Feed the timeline to the signal-detector skill (via the existing + `gbrain serve` MCP interface or direct function call) which decides + what entities to extract +4. Call `put_page` to create/update a goal page with the skill's output +5. Optionally call `add_link` for cross-references +6. On success: delete the processed event log +7. On failure: leave the event log intact, exit non-zero + +Cleanup responsibility is solely in `consolidate` (Codex finding #7: +the Stop handler does NOT delete the event log; it only spawns +consolidate which owns the lifecycle). + +### How skills are invoked (Codex finding #5) + +The consolidate command is deterministic orchestration. The skill +invocation happens through gbrain's existing MCP tool interface: the +command calls `put_page` with the event timeline as content. The +signal-detector skill file (loaded into CLAUDE.md) determines page +structure when the agent reads the page later. For richer consolidation, +the command can spawn a short Claude Code session (`claude -p` with +the event log as input and signal-detector instructions as system +prompt) to produce the structured goal page. This is the same pattern +as the test-gbrain-agent.sh round-trip test in the devcontainer. + +### MCP usage clarification (Codex finding #6) + +Hooks do NOT use MCP. The PostToolUse handler writes JSONL (file I/O). +The SessionStart handler shells out to `gbrain search` (CLI). The Stop +handler spawns `gbrain consolidate` (CLI). + +The consolidate command MAY use gbrain's MCP tools internally (via +`gbrain serve` stdio interface or direct PGLite calls). This is gbrain +CLI-to-CLI, not hook-to-MCP. The hooks never touch MCP. 
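The PostToolUse record described earlier (timestamp, tool_name, truncated tool_input, tool_exit_code, cwd, session_id) can be sketched as a pure serialization step, kept separate from the file append so it is easy to test. This is a minimal sketch under stated assumptions: the `ToolEvent` shape and the `toEventLine` name are hypothetical, and how the real handler derives `toolExitCode` from the tool response is not specified in this document.

```typescript
// Hypothetical input shape modeled on the fields listed under PostToolUse.
interface ToolEvent {
  sessionId: string;
  cwd: string;
  toolName: string;
  toolInput: unknown;
  toolExitCode: number | null;
}

const MAX_INPUT_CHARS = 500; // truncation limit stated in the spec

// Serialize one event as a single newline-terminated JSONL record.
// JSON.stringify escapes embedded newlines, so each record stays on one line.
function toEventLine(event: ToolEvent, now: Date): string {
  const rawInput =
    typeof event.toolInput === "string"
      ? event.toolInput
      : JSON.stringify(event.toolInput ?? null);
  const record = {
    timestamp: now.toISOString(),
    tool_name: event.toolName,
    tool_input: rawInput.slice(0, MAX_INPUT_CHARS),
    tool_exit_code: event.toolExitCode,
    cwd: event.cwd,
    session_id: event.sessionId,
  };
  return JSON.stringify(record) + "\n";
}
```

Because the function returns a complete line, the handler can hand it to a single append call, which keeps the file I/O path as small as the spec intends.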
+ +## Hook Registration + +In the devcontainer entrypoint or Claude Code settings: + +```json +{ + "hooks": { + "PostToolUse": [{ + "command": "~/.gbrain/hooks/gbrain-hook.sh post-tool-use", + "timeout": 5000 + }], + "SessionStart": [{ + "command": "~/.gbrain/hooks/gbrain-hook.sh session-start", + "timeout": 10000 + }], + "Stop": [{ + "command": "~/.gbrain/hooks/gbrain-hook.sh stop", + "timeout": 120000 + }] + } +} +``` + +## Existing gbrain CLI commands used (Codex finding #11) + +| Command | Exists? | Used by | +|---------|---------|---------| +| `gbrain search` | Yes (MCP tool + CLI) | SessionStart handler | +| `gbrain serve` | Yes (MCP server) | consolidate (internal) | +| `put_page` | Yes (MCP tool) | consolidate | +| `add_link` | Yes (MCP tool) | consolidate | +| `gbrain consolidate` | **New** (added to existing command registry) | Stop handler | + +## What This Preserves + +| gbrain principle | How preserved | +|-----------------|---------------| +| Thin harness | Hooks are 7 reused files + 2 new files | +| Fat skills | Signal-detector and brain-ops still do all interpretation | +| Harness-agnostic | Hook layer is an optional Claude Code adapter; skills work without it | +| Markdown is code | Skill files unchanged; they drive consolidation | +| Self-rewriting skills | Skills can evolve based on consolidation patterns | +| FS-canonical | Pages go through put_page -> export -> markdown repo | + +## What This Fixes + +| Experiment finding | How fixed | +|-------------------|-----------| +| 0/85 gbrain calls under /goal | PostToolUse captures every tool call automatically | +| No context injection | SessionStart injects brain context before prompt | +| No session-end summary | Stop triggers async skill-driven consolidation | +| Deferred MCP tool friction | Hooks use CLI + JSONL, not MCP tools | +| Skill/goal pressure suppression | Hooks run outside agent control | + +## Scope + +- **Files copied from claude-mem:** 7 (unmodified) + import closure additions TBD +- 
**New files created:** 2 (`gbrain-handler.ts` + `gbrain-hook.sh`) +- **New gbrain CLI commands:** 1 (`consolidate`, added to existing command registry) +- **gbrain skill files changed:** 0 +- **gbrain MCP server changed:** 0 + +## Open Questions + +1. **Import closure:** Does the 7-file set resolve cleanly? Must verify + before implementation. +2. **Consolidation quality:** Does `claude -p` with the event log produce + good goal pages, or does the signal-detector skill need adaptation? +3. **Event log size:** A 30-minute /goal session with 85 tool calls + produces ~85 JSONL lines (~50KB). Is this within consolidation's + context budget? diff --git a/docs/specs/customized-domain.md b/docs/specs/customized-domain.md index 6a5d4fc25..d9d5ad875 100644 --- a/docs/specs/customized-domain.md +++ b/docs/specs/customized-domain.md @@ -863,3 +863,9 @@ execution order, risk). replacing CLAUDE.md and adding a 1.4k-line backup is noise. Cosmetic — does not affect the spec's correctness or implementation. +### Codex review 7: `/goal` integration gap + +See `docs/specs/customized-domain/experiment-log.md` Experiment 1 for the +full finding, cross-model analysis, and fix. Moved out of the spec because +it is an experimental observation, not a normative spec decision. + diff --git a/docs/specs/customized-domain/experiment-log.md b/docs/specs/customized-domain/experiment-log.md new file mode 100644 index 000000000..df596599d --- /dev/null +++ b/docs/specs/customized-domain/experiment-log.md @@ -0,0 +1,459 @@ +# Experiment Log: GBrain Customized Domain + +Experiments testing whether the customized-domain gbrain skill files +actually work in practice. Each experiment is reproducible from the +documented commit hash and devcontainer configuration. + +--- + +## Experiment 1: Does gbrain fire during a `/goal` session? 
+ +**Date:** 2026-05-15 +**Branch:** `test/gbrain-developer-persona` (practicespace-2) +**gbrain commit:** `86abd45ce63484d4cc62c14e7f8f32095c3f76a4` +**Dockerfile:** `practicespace-2/.devcontainer/Dockerfile` line 109, `ARG GBRAIN_COMMIT=86abd45...` +**Claude Code version:** 2.1.140 +**Model:** claude-opus-4-6[1m] +**Session ID:** `94bb5cda-e085-42c0-8217-df3e093bdd3b` +**Session log:** `/home/dev/.claude/projects/-workspaces-goal-email-thread-separator/94bb5cda-e085-42c0-8217-df3e093bdd3b.jsonl` + +### What we tried + +Ran `/goal` on a fresh project (`/workspaces/goal-email-thread-separator/`) +with the customized-domain gbrain skill files loaded. The goal spec asked the +agent to build an email chain splitter: parse `.eml` files, split chains into +individual emails, render Outlook-style screenshots, produce a manifest. + +The purpose was to test whether gbrain's signal-detector and brain-ops loop +would capture the development session as knowledge (goals, decisions, debug +trails) that could be retrieved later by a different agent session. + +### To reproduce + +1. Build the devcontainer with `GBRAIN_COMMIT=86abd45ce63484d4cc62c14e7f8f32095c3f76a4` +2. Start Claude Code in `/workspaces/goal-email-thread-separator/` +3. Run `/goal goal-email-thread-separator.md` +4. Let the agent complete autonomously +5. Check the session log for `mcp__gbrain__*` tool calls + +### What happened + +The agent successfully completed the goal: built a 758-line Python script, +split a 3-email chain, rendered Outlook-style screenshots, produced a manifest. +It ran 85 tool calls (45 Bash, 22 Read, 15 Edit, 3 Write) across 288 session +log lines. The user spoke only 3 times during the autonomous session. 
+ +Session tool histogram: + +``` + 85 user/tool_result + 45 assistant/tool_use/Bash + 44 assistant/text + 22 assistant/tool_use/Read + 19 assistant/thinking + 15 assistant/tool_use/Edit + 12 attachment/task_reminder + 3 assistant/tool_use/Write + 3 user/text + 2 attachment/goal_status + 1 attachment/deferred_tools_delta ← gbrain's 60+ tools registered here + 0 assistant/tool_use/mcp__gbrain__* ← ZERO brain interactions +``` + +**Result: gbrain was completely ignored.** Zero `mcp__gbrain__*` calls. The +MCP tools were registered (visible in `deferred_tools_delta`) but never +invoked. No goal page created. No decisions captured. No debug trails stored. +No knowledge persisted for future sessions. + +### What should have happened + +Per the gbrain skill files loaded into `~/.claude/CLAUDE.md`: + +1. **RESOLVER.md** routes intents to skills. The "always-on" table says + signal-detector fires on every inbound message and brain-ops handles + any brain read/write. + +2. **signal-detector/SKILL.md** should detect entities on every message: + goals being worked on, technical decisions, debug sessions, reusable + concepts. + +3. **brain-ops/SKILL.md** defines the 6-phase loop: + DETECT -> WRITE -> STORE -> RETRIEVE -> PRESENT -> ENRICH. + Phase 1 (READ): search brain for prior context before starting work. + Phase 2 (WRITE): create/update pages as entities are detected. + Phase 4 (ENRICH): update existing pages with new information. + +### What the ideal session would have looked like + +The 85 actual tool calls should have been interleaved with ~11 gbrain calls: + +``` +Session start: + 1. ToolSearch "gbrain" ← load MCP schemas + 2. mcp__gbrain__search "email chain splitter" ← prior context? + 3. mcp__gbrain__put_page goals/email-chain... ← create goal page (skeleton) + +During development (interleaved with the 85 code tool calls): + 4. mcp__gbrain__add_timeline_entry ← "chain detection working, 3 emails found" + 5. 
mcp__gbrain__add_timeline_entry ← debug: avatar initials 'I)' from paren in name + 6. mcp__gbrain__add_timeline_entry ← debug: RFC 5322 To field raw format + 7. mcp__gbrain__add_timeline_entry ← "screenshots working via gstack browse" + 8. mcp__gbrain__add_timeline_entry ← "single-email edge case verified" + +Session completion: + 9. mcp__gbrain__put_page goals/email-chain... ← final update with full execution arc + 10. mcp__gbrain__search "MIME parsing" ← check for related concepts + 11. mcp__gbrain__add_link ← link to related entities if found +``` + +That is ~13% overhead (11 extra calls on 85). The brain would then contain +a goal page with the full execution arc: approach, local decisions (Python +stdlib, gstack browse for screenshots), debug trails (avatar initials fix, +RFC 5322 encoding fix), and verification evidence. A future `/goal` session +on email parsing would find it via `mcp__gbrain__search "email"` at step 2. + +### Why it didn't happen + +**Root cause (observed, cross-model agreement between Claude and Codex):** + +The gbrain instructions are advisory, not enforced. Under `/goal` execution +pressure, the agent optimizes for the completion condition and skips all +optional side-paths. + +Four compounding factors: + +1. **"Always-on" is declarative text, not runtime enforcement.** The + signal-detector's "fire on every message" instruction is a sentence in a + skill file that the agent must choose to read. The RESOLVER says to read it. + But the RESOLVER itself is just text in `~/.claude/CLAUDE.md`. Under `/goal` + pressure, the agent never reads the signal-detector skill file because + doing so does not advance the goal completion condition. + +2. **`/goal` completion condition doesn't include brain I/O.** The goal + rewards code + tests + artifacts (the `.eml` files, `.png` screenshots, + `manifest.json`). There is zero incentive for gbrain actions because no + part of the completion condition checks for brain writes. + +3. 
**Deferred MCP tools add activation friction.** Claude Code defers MCP + tool schemas when there are too many tools (gbrain has 60+). The agent + must call `ToolSearch` to load schemas before it can invoke any + `mcp__gbrain__*` tool. Without a reason to reach for gbrain, `ToolSearch` + is never called, schemas never load, and the tools remain invisible. + +4. **No project-level CLAUDE.md** existed in `/workspaces/goal-email-thread-separator/`. + The gbrain instructions lived only in `~/.claude/CLAUDE.md` (global) and + `/workspaces/CLAUDE.md` (parent workspace). Claude Code loads CLAUDE.md + files hierarchically, but the closest-to-execution-site file (project-level) + has the strongest influence on agent behavior. Its absence meant no + reinforcement at the point where the agent was actually working. + +### Codex analysis (gpt-5.3-codex) + +**First consult:** Asked Codex to independently analyze the root cause. + +Codex agreed with the analysis and sharpened the framing: "policy was +advisory, not enforced." Codex also identified a fifth factor: **trigger +mismatch.** The RESOLVER's brain triggers are phrased as explicit knowledge +tasks ("what do we know...", "search for..."). A build task doesn't hit +those triggers unless you force a pre/post memory pattern. + +**Codex recommended fixes (priority order):** + +1. **Change the `/goal` contract (highest leverage).** Make brain steps part + of the completion criteria. On start: mandatory `ToolSearch` + search. On + finish: mandatory `put_page` with outcome. Goal incomplete without the + brain write. *Requires Claude Code platform changes or a `/goal` wrapper.* + +2. **Enforce always-on outside the model.** Implement signal-detector as a + host-side hook or completion validator that checks for `mcp__gbrain__*` + calls. *Requires platform changes.* + +3. 
**Reduce deferred-tool friction.** Preload gbrain schemas at session + start or register a slim MCP server with only the 5-6 critical tools + (search, put_page, get_page, add_link, query) to stay under the deferral + threshold. *Partially implementable without platform changes.* + +4. **Add workspace-level policy.** In `/workspaces/CLAUDE.md`: mandatory + gbrain read at goal start, write at goal end. *Immediate, no platform + changes needed.* + +**Second consult:** Asked Codex to review the spec update and CLAUDE.md +patch we wrote for fix #4. + +Codex found 4 issues: +- Spec section was misplaced in normative flow (should be in Reviews or + experiment log, not between "After implementation" and "Spec Update Rules") +- CLAUDE.md patch is still advisory text -- the same class of instruction the + root cause says gets ignored. Acknowledged limitation. +- No failure policy for when gbrain is unavailable. Fixed: added degraded + mode with `gbrain-deferred.md` fallback. +- Risk of knowledge spam from over-eager page creation. Fixed: restricted + mandatory writes to ONE goal page per session; separate pages only for + clearly durable cross-goal decisions. + +### What we changed + +1. **`/workspaces/CLAUDE.md`** (inside devcontainer): appended "GBrain Goal + Integration (mandatory)" section with: + - On goal start: ToolSearch + search for prior context + - On goal completion: ONE goal page via put_page + - Degraded mode: if gbrain unavailable, write `gbrain-deferred.md` + - Completion criteria: goal incomplete without brain write OR deferred file + +2. **Spec** (`docs/specs/customized-domain.md`): added pointer in Codex review + 7 to this experiment log. + +### Open gap + +Fix #4 (workspace-level CLAUDE.md policy) is a stopgap. It is the same class +of advisory text that the root cause says gets ignored under execution +pressure. 
The real fixes (#1-#3) require either: + +- Claude Code platform changes (goal hooks, tool preloading) +- A slim gbrain MCP server with <10 tools to avoid deferral +- A `/goal` wrapper that enforces brain I/O in the completion check + +### Next experiment + +Re-run the same `/goal` session with the CLAUDE.md patch in place. Check +whether the advisory text is sufficient to trigger gbrain calls, or whether +fix #3 (slim MCP server) is needed to overcome the deferred-tool friction. + +--- + +## Experiment 2: `/goal` fails on spec character limit + +**Date:** 2026-05-16 +**Session ID:** `7839c8d6-9ed8-41a6-94f9-510f71c2e47f` +**Project:** `/workspaces/goal-email-thread-count` +**Duration:** ~40 seconds (15 log lines) + +### What we tried + +Ran `/goal goal-email-thread-count.md` on a new project spec for an email +thread counter tool. + +### What happened + +The `/goal` command rejected the spec immediately: + +``` +Goal condition is limited to 4000 characters (got 4045) +``` + +The spec was 45 characters over the limit. The user ran `/exit` immediately +after. Claude never got a turn to respond. Zero tool calls. + +### Tool histogram + +``` + 0 assistant/tool_use/* ← no assistant messages at all + 3 user (command payloads: /goal, error output, /exit) + 3 system +``` + +### gbrain calls: ZERO + +No agent turn occurred, so gbrain could not have been invoked. + +### Observation + +The `/goal` command has a 4000-character limit on the spec file content. This +is a hard gate enforced by the Claude Code harness before the agent gets +control. Specs need to stay under this limit, or the goal file must reference +an external spec rather than embedding the full content. 
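A pre-flight length check would catch this before `/goal` is invoked. The sketch below assumes the limit counts JavaScript string length (UTF-16 code units); the error message only says "characters", so the harness may count differently for non-ASCII text. The `checkGoalSpec` name and result shape are hypothetical.

```typescript
// Limit taken from the error message observed in this experiment.
const GOAL_CHAR_LIMIT = 4000;

interface LengthCheck {
  ok: boolean;
  length: number;
  overBy: number;
}

// Report whether a goal spec fits, and by how much it overshoots.
// Assumption: the harness counts the same units as String.length.
function checkGoalSpec(text: string, limit: number = GOAL_CHAR_LIMIT): LengthCheck {
  const length = text.length;
  const overBy = Math.max(0, length - limit);
  return { ok: overBy === 0, length, overBy };
}
```

For the spec in this experiment, such a check would have reported an overage of 45 characters, matching the `got 4045` in the error.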
+ +--- + +## Experiment 3: Goal announcement stub (no `/goal` command) + +**Date:** 2026-05-16 +**Session ID:** `aa5faea5-661d-4776-a7b1-f928d7dcf9ab` +**Project:** `/workspaces/goal-email-thread-count` +**Duration:** ~2.7 seconds (15 log lines) + +### What we tried + +User sent: "heads up, we're going to work on a goal now. just informing." + +### What happened + +Agent replied: "Got it, standing by for the goal. Ready when you are." +One text response, zero tool calls. Session ended. + +### Tool histogram + +``` + 0 assistant/tool_use/* ← zero tool calls + 1 assistant/text + 1 user/text +``` + +### gbrain calls: ZERO + +The gbrain tools were registered in `deferred_tools_delta` but the agent +had no reason to invoke them on a single notification message. The signal- +detector's "fire on every message" instruction was not followed, but this +is a trivial case -- no substantive content to detect signals in. + +### Observation + +A "heads up" message with no actionable content does not trigger gbrain +interaction. This is arguably correct behavior -- there is nothing to +store. However, per the RESOLVER, signal-detector should still fire and +determine there is nothing to capture (rather than not firing at all). + +--- + +## Experiment 4: gbrain fires on conversational goal start + +**Date:** 2026-05-16 +**Session ID:** `9b03a158-10fa-473f-8355-ba3b857f49f9` +**Project:** `/workspaces/goal-email-thread-count` +**Duration:** ~69 seconds (46 log lines) + +### What we tried + +User said: "we're going to work on a goal now" (natural language, no `/goal` +command). The project had a spec at `goal-email-thread-count.md` and a sample +`.eml` file. + +### What happened + +The agent: +1. Called `ToolSearch` to load gbrain MCP tool schemas +2. Called `mcp__gbrain__search` with query `"email thread count"` +3. Read the goal spec file +4. Explored the sample `.eml` file with `Bash` (grepping for boundary markers) +5. 
Was interrupted by the user mid-investigation + +No code was written. The agent was still in the discovery/analysis phase when +interrupted. + +### Tool histogram + +``` + 4 assistant/tool_use/Bash + 2 assistant/tool_use/Read + 1 assistant/tool_use/ToolSearch ← loaded gbrain schemas + 1 assistant/tool_use/mcp__gbrain__search ← SEARCHED THE BRAIN +``` + +### gbrain calls: 1 + +| Tool | Input | +|------|-------| +| `mcp__gbrain__search` | `{"query": "email thread count"}` | + +### Why gbrain fired here but not in Experiment 1 + +This is the key finding. The difference between this session and Experiment 1: + +| Factor | Experiment 1 (`/goal`) | Experiment 4 (conversational) | +|--------|----------------------|-------------------------------| +| Entry mode | `/goal` command with completion condition | Natural language prompt | +| Execution pressure | High -- agent focused on completion gate | Low -- agent in exploration mode | +| First action | Read spec, start coding | ToolSearch + brain search | +| gbrain calls | 0 | 1 (search) | + +The `/goal` command creates a completion condition that the agent optimizes +for, suppressing optional side-paths. A conversational prompt ("we're going +to work on a goal now") does NOT create a completion gate, so the agent +follows the CLAUDE.md instructions (including gbrain search) because there +is no competing objective. + +This confirms the root cause from Experiment 1: **it is the `/goal` execution +pressure that suppresses gbrain, not the instructions themselves.** When the +same instructions are followed without `/goal` pressure, gbrain fires correctly. + +### Observation + +The `/workspaces/CLAUDE.md` patch (fix #4 from Experiment 1) may be +effective for conversational goal sessions but insufficient for `/goal` +sessions where the completion gate overrides advisory instructions. Fix #1 +(changing the `/goal` contract) remains necessary for `/goal`-mode sessions. 
+ +--- + +## Experiment 5: `/init` skill runs instead of `/goal` + +**Date:** 2026-05-16 +**Session ID:** `9cacd941-807c-4f7c-9bad-15fe9ad9d23d` +**Project:** `/workspaces/goal-email-thread-count` +**Duration:** ~2 minutes (37 log lines) + +### What we tried + +User mentioned they wanted to work on the goal and use the `/goal` skill. + +### What happened + +The agent ran the `/init` skill (via `Skill` tool) instead of `/goal`, then +dispatched an `Agent` subagent, read the spec and CLAUDE.md, and updated +CLAUDE.md with a project overview and spec summary. No implementation work. +No brain writes. + +### Tool histogram + +``` + 2 assistant/tool_use/Read + 1 assistant/tool_use/ToolSearch + 1 assistant/tool_use/Skill (/init) + 1 assistant/tool_use/Agent + 1 assistant/tool_use/Edit +``` + +### gbrain calls: ZERO + +`ToolSearch` was called (loading gbrain schemas), but no `mcp__gbrain__*` +tool was invoked. The agent loaded the schemas but did not use them. + +### Why gbrain didn't fire + +The `/init` skill took over the session. It has its own workflow (read the +repo, update CLAUDE.md with project info) that does not include brain +writes. The agent followed the `/init` skill instructions faithfully, +which is the correct behavior for skill execution, but `/init` does not +include any gbrain interaction. + +### Observation + +When a skill (like `/init`) takes control of the session, it supersedes +the gbrain instructions in CLAUDE.md. Skills define their own tool +sequences and do not check the brain. This is a third mode of failure +distinct from Experiments 1 and 4: + +1. **`/goal` mode** -- completion pressure suppresses brain (Experiment 1) +2. **Conversational mode** -- brain fires correctly (Experiment 4) +3. **Skill mode** -- skill workflow supersedes brain instructions (Experiment 5) + +For gbrain to fire during skill execution, individual skills would need +to include brain read/write steps in their own workflows. 
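The tool histograms and gbrain call counts reported for these experiments can be recomputed from a session log with a short script. The sketch below assumes each log line is a JSON object whose `message.content` is an array of blocks, with tool invocations appearing as `{ "type": "tool_use", "name": ... }`. That layout is an assumption about the session `.jsonl` format, not something this log documents, and the function names are hypothetical.

```typescript
// Count tool_use blocks per tool name across a JSONL session log.
// Lines that are blank, unparsable, or lack a content array are skipped.
function toolHistogram(jsonl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let entry: any;
    try {
      entry = JSON.parse(line);
    } catch {
      continue;
    }
    const content = entry?.message?.content;
    if (!Array.isArray(content)) continue;
    for (const block of content) {
      if (block?.type === "tool_use" && typeof block.name === "string") {
        counts[block.name] = (counts[block.name] ?? 0) + 1;
      }
    }
  }
  return counts;
}

// Sum the calls to any tool whose name starts with the gbrain MCP prefix.
function gbrainCalls(counts: Record<string, number>): number {
  return Object.keys(counts)
    .filter((name) => name.indexOf("mcp__gbrain__") === 0)
    .reduce((sum, name) => sum + (counts[name] ?? 0), 0);
}
```

A `gbrainCalls` result of 0 on a non-trivial session corresponds to the "Brain fired? No" rows in the summary that follows.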
+ +--- + +## Cross-experiment summary + +| # | Session | Mode | Duration | Tool calls | gbrain calls | Brain fired? | +|---|---------|------|----------|------------|-------------|-------------| +| 1 | `94bb5cda` | `/goal` | ~30min | 85 | 0 | No | +| 2 | `7839c8d6` | `/goal` (failed) | 40s | 0 | 0 | N/A (no agent turn) | +| 3 | `aa5faea5` | Conversational (stub) | 2.7s | 0 | 0 | No (nothing to detect) | +| 4 | `9b03a158` | Conversational | 69s | 8 | 1 (search) | **Yes** | +| 5 | `9cacd941` | Skill (`/init`) | 2min | 6 | 0 | No | + +### Key finding + +gbrain fires in exactly ONE mode: **conversational prompts without a +competing execution framework** (no `/goal` gate, no skill workflow). + +Three distinct suppression mechanisms observed: +1. `/goal` completion pressure overrides advisory brain instructions +2. Skill workflows (e.g. `/init`) supersede CLAUDE.md brain instructions +3. Deferred MCP tools add friction (but ToolSearch is called when there is no competing pressure) + +### Implication for fixes + +Fix #4 (CLAUDE.md policy) is likely effective only for conversational +mode. For `/goal` and skill modes, enforcement must be built into the +execution framework itself (fix #1) or into individual skill workflows. diff --git a/docs/specs/customized-domain/superpowers-session-context.md b/docs/specs/customized-domain/superpowers-session-context.md new file mode 100644 index 000000000..cfd91e7a1 --- /dev/null +++ b/docs/specs/customized-domain/superpowers-session-context.md @@ -0,0 +1,101 @@ +--- +status: complete +branch: customized-domain-superpowers +base_branch: chapter37haptics/customized-domain +timestamp: 2026-05-15T21:58:00Z +pr: https://github.com/chapter37haptics/gbrain/pull/5 +approach: superpowers (writing-plans + subagent-driven-development) +files_modified: [] +--- + +## Working on: Customized Domain (VC to Developer) via Superpowers + +### Summary + +Implemented the customized-domain spec using the superpowers skill chain. 
All code shipped, reviewed by Codex (GATE: PASS), tests improved with realistic examples, PR #5 open with 4 review comments posted. Clean working tree, 16 commits total. + +### Approach Used + +**Superpowers workflow:** +1. `/writing-plans` read the spec, read all 12 target files, produced a detailed 11-task plan with exact code, file paths, and test steps at `docs/superpowers/plans/2026-05-15-customized-domain.md` +2. `/subagent-driven-development` executed the plan: + - Tasks 1-3 (code: types.ts, markdown.ts, link-extraction.ts): dispatched as sonnet subagents with TDD (write test, verify fail, implement, verify pass, commit) + - Tasks 4-9 (skill files: quality.md, brain-ops, signal-detector, filing-rules, RESOLVER, brain-first): executed inline as direct edits since content was fully specified in the plan + - Task 10 (verification): ran typecheck + 485 targeted tests (0 failures) + PageType consumer audit + - Task 11 (doctor.ts): one-line SQL clause expansion +3. `/verification-before-completion` ran fresh typecheck and tests before claiming done +4. `/gstack-codex review` got an independent Codex review (GATE: PASS, 4 findings) +5. `/receiving-code-review` evaluated each finding against the codebase +6. Second `/gstack-codex` consult asked Codex to grade Claude's dismissals (1 VALID, 3 PARTIAL) +7. Fixed finding #2 (added `concept` to doctor.ts graph_coverage) based on Codex feedback +8. Third `/gstack-codex` consult on test plan inspired by real email-parsing spec +9. 
Fourth `/gstack-codex` audit: spec-vs-tests gap analysis, pinned ALL_PAGE_TYPES count at 25 + +### Decisions Made + +- **Code-first execution order**: types.ts before markdown.ts before link-extraction.ts, so skill file changes could be tested immediately without transient type errors +- **Conservative RESOLVER.md rewrite**: kept all quoted trigger phrases unchanged because the resolver test (D5/C) fuzzy-matches them against skill frontmatter, and the underlying skills (query, enrich, data-research) were NOT modified +- **Singular/plural DIR_PATTERN**: only added plural forms (goals, decisions, processes) matching the existing codebase pattern where inferType accepts singulars as fallbacks but DIR_PATTERN only has plurals +- **llms-full.txt rebuild**: the `build:llms` CI check caught stale generated output after skill file changes; rebuilt and committed +- **gh repo set-default**: set after accidentally posting a PR comment to the upstream repo (garrytan/gbrain) instead of the fork (chapter37haptics/gbrain) due to `gh` CLI ambiguous remote resolution +- **Filing rules JSON is additive-only**: the JSON is a registry of valid directories (infrastructure), the MD is the behavioral layer (replaceable). Never delete from JSON. 
+- **Added concept to doctor.ts**: Codex cross-model review correctly flagged that touching the query and leaving concept out was a missed opportunity +- **Pinned ALL_PAGE_TYPES at 25**: Codex spec audit flagged the contract test didn't pin the count + +### What Shipped (16 commits) + +| Commit | Change | +|--------|--------| +| 91af5a7 | types.ts: goal, decision, process added to PageType | +| bd27f60 | markdown.ts: inferType() directory mappings | +| bc11839 | link-extraction.ts: DIR_PATTERN expansion | +| ba00bdf | quality.md: generalized Iron Law + notability | +| 661ed79 | brain-ops: 8 hard-gate sites patched | +| cc163ff | signal-detector: full rewrite | +| 3b7fb1c | filing rules: developer taxonomy | +| 3fff873 | RESOLVER.md: disambiguation updated | +| ba75466 | brain-first.md: entity conventions | +| 8a69e92 | llms-full.txt: rebuilt | +| d490a6f | doctor.ts: developer types in graph_coverage | +| 5856aa2 | doctor.ts: concept added (Codex finding) | +| b0894ab | superpowers plan file | +| ab93d6c | session context saved | +| 7c700c5 | realistic developer-entity tests | +| 05c9a66 | ALL_PAGE_TYPES count pinned at 25 | + +### PR Review Comments Posted + +1. [Codex review](https://github.com/chapter37haptics/gbrain/pull/5#issuecomment-4463344835): GATE PASS, 4 findings +2. [Claude's evaluation](https://github.com/chapter37haptics/gbrain/pull/5#issuecomment-4463503409): dismissed all 4 +3. [Codex second opinion](https://github.com/chapter37haptics/gbrain/pull/5#issuecomment-4463505987): graded dismissals, finding #2 fixed +4. 
[Codex spec audit](https://github.com/chapter37haptics/gbrain/pull/5#issuecomment-4463798644): spec-vs-tests gap analysis + +### Verification Evidence + +- `bun run typecheck`: exit 0 +- 497 tests across 8 affected test files: 0 failures (485 original + 10 new + 2 new assertions) +- `bun run verify` (CI pre-test gate): exit 0 +- Full `bun run test` (8-shard parallel): exit 0 +- Codex review: GATE PASS, 4 findings (1 fixed, 3 dismissed with justification) + +### Known Limitations + +1. **Singular/plural DIR_PATTERN**: inferType accepts `/goal/` but DIR_PATTERN only has `goals`. Pre-existing pattern (same as person/people). Low risk. +2. **inferLinkType**: classifies developer entity relationships as `mentions` (default fallback). Tier 2 follow-up. + +### Remaining Work + +1. **Merge PR #5** into `chapter37haptics/customized-domain` +2. **Compare approaches**: this was the superpowers approach. User plans to try GSD and other approaches on separate branches for comparison +3. **Follow-up (Tier 2)**: developer-specific inferLinkType heuristics +4. **Follow-up (devcontainer)**: update entrypoint.sh in practicespace-2 to load the rewritten skill files (out of scope for this spec, separate repo) + +### Lessons Learned + +- `gh` CLI in a fork with both `origin` and `upstream` remotes will ambiguously resolve. Always run `gh repo set-default ` after cloning a fork. Detect forks with `gh repo view --json isFork,parent`. +- `bun run test` (npm script) includes pre-checks like `build:llms` that catch stale generated files. Changing CLAUDE.md or skill files requires `bun run build:llms` before the test suite passes. +- The superpowers subagent-driven-development skill worked well for the 3 code tasks but was overhead for the 6 skill file tasks where content was fully specified. Inline execution was faster for those. +- Codex CLI `codex review` doesn't support `--base` with `[PROMPT]` simultaneously. Use `codex exec` with embedded diff for custom-prompt reviews. 
+- Codex sandbox (bwrap) doesn't work in this devcontainer. Embedding the diff in the prompt works around it. +- The large llms-full.txt deletion came from the CLAUDE.md key-files section being trimmed in an earlier commit; it was not caused by this PR. +- Filing rules JSON is additive-only (infrastructure registry); filing rules MD is replaceable (agent behavior). Different contracts. diff --git a/docs/superpowers/plans/2026-05-15-customized-domain.md b/docs/superpowers/plans/2026-05-15-customized-domain.md new file mode 100644 index 000000000..70f4f896a --- /dev/null +++ b/docs/superpowers/plans/2026-05-15-customized-domain.md @@ -0,0 +1,1102 @@ +# Customized Domain (VC to Developer) Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Adapt gbrain's skill layer from a VC/executive knowledge domain to a developer knowledge domain, replacing people/companies/deals entity detection with goals/decisions/processes/concepts. + +**Architecture:** Code-first, then skills. Three narrow code patches extend the `PageType` union, `inferType()` directory mapper, and `DIR_PATTERN` auto-link regex to recognize new entity types. Six skill files are then rewritten or patched to redirect the agent's detection, filing, retrieval, and quality gates from VC entities to developer entities. No schema, pipeline, or MCP changes.
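
As a rough illustration of the directory-mapping and auto-link patches named in the architecture paragraph, the sketch below uses simplified stand-ins. `inferTypeSketch`, `SketchPageType`, and `DIR_PATTERN_SKETCH` are hypothetical names; the real `inferType()` and `DIR_PATTERN` have more branches, and the leading-slash normalization, the `concept` fallback, and the link regex are assumptions made for this example only.

```typescript
// Simplified stand-in for the directory mapper (not the real inferType()).
type SketchPageType = "goal" | "decision" | "process" | "project" | "concept";

function inferTypeSketch(filePath: string): SketchPageType {
  // Assumption: prepend "/" so top-level directories match the "/dir/" checks.
  const lower = "/" + filePath.toLowerCase();
  // Checks run top to bottom, so the developer directories must be tested
  // before projects/ for a nested path to resolve to the inner type.
  if (lower.includes("/goals/")) return "goal";
  if (lower.includes("/decisions/")) return "decision";
  if (lower.includes("/processes/")) return "process";
  if (lower.includes("/projects/")) return "project";
  return "concept"; // assumed fallback for this sketch
}

// Trimmed copy of the alternation, only to show the new directories matching.
const DIR_PATTERN_SKETCH = "(?:goals|decisions|processes|people|companies|projects)";
// Assumed markdown-link shape: [label](dir/slug)
const linkRe = new RegExp("\\]\\((" + DIR_PATTERN_SKETCH + "/[a-z0-9-]+)\\)");

const nested = inferTypeSketch("projects/my-app/decisions/use-redis.md");
const linkMatch = linkRe.exec("[Chose Postgres](decisions/chose-postgres)");
const slug = linkMatch ? linkMatch[1] : null;
console.log(nested, slug);
```

Moving the `/projects/` check above the `/decisions/` check in this sketch would make the nested path resolve to `project`, which is the ordering hazard the plan guards against.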
+ +**Tech Stack:** TypeScript (Bun runtime), Markdown skill files, JSON config + +--- + +## File Map + +| Action | File | Responsibility | +|--------|------|----------------| +| Patch | `src/core/types.ts:13,22-27` | Add `goal`, `decision`, `process` to `PageType` union + `ALL_PAGE_TYPES` array | +| Patch | `test/page-type-exhaustive.test.ts:63-89` | Add `goal`, `decision`, `process` cases to exhaustive switch | +| Patch | `src/core/markdown.ts:344-375` | Add `goals/`, `decisions/`, `processes/` to `inferType()` | +| Patch | `src/core/link-extraction.ts:46` | Add `goals`, `decisions`, `processes` to `DIR_PATTERN` regex | +| Rewrite | `skills/conventions/quality.md` | Generalize Iron Law, add developer notability criteria | +| Patch | `skills/brain-ops/SKILL.md` | Replace 8 VC hard-gates with developer entity references | +| Rewrite | `skills/signal-detector/SKILL.md` | Replace VC detection with developer signal detection | +| Rewrite | `skills/_brain-filing-rules.md` | Replace VC taxonomy with developer filing taxonomy | +| Patch | `skills/_brain-filing-rules.json` | Add `goal`, `decision`, `process` kinds + dream paths | +| Rewrite | `skills/RESOLVER.md` | Replace VC triggers with developer triggers | +| Patch | `skills/conventions/brain-first.md` | Replace VC entity conventions with developer entity table | + +--- + +## Task 1: Add developer PageTypes to type system + +**Files:** +- Modify: `src/core/types.ts:13` (PageType union) +- Modify: `src/core/types.ts:22-27` (ALL_PAGE_TYPES array) +- Modify: `test/page-type-exhaustive.test.ts:63-89` (exhaustive switch) + +- [ ] **Step 1: Add `goal`, `decision`, `process` to the `PageType` union** + +In `src/core/types.ts` line 13, append the three new types before the closing semicolon: + +```typescript +export type PageType = 'person' | 'company' | 'deal' | 'yc' | 'civic' | 'project' | 'concept' | 'source' | 'media' | 'writing' | 'analysis' | 'guide' | 'hardware' | 'architecture' | 'meeting' | 'note' | 'email' | 'slack' 
| 'calendar-event' | 'code' | 'image' | 'synthesis' | 'goal' | 'decision' | 'process'; +``` + +- [ ] **Step 2: Add the same three types to `ALL_PAGE_TYPES`** + +In `src/core/types.ts` lines 22-27, add the three new types to the array: + +```typescript +export const ALL_PAGE_TYPES: readonly PageType[] = [ + 'person', 'company', 'deal', 'yc', 'civic', 'project', 'concept', + 'source', 'media', 'writing', 'analysis', 'guide', 'hardware', + 'architecture', 'meeting', 'note', 'email', 'slack', 'calendar-event', + 'code', 'image', 'synthesis', 'goal', 'decision', 'process', +] as const; +``` + +- [ ] **Step 3: Add cases to the exhaustive switch in the contract test** + +In `test/page-type-exhaustive.test.ts`, add three cases to the `classify` function (lines 63-89) before the `default`: + +```typescript + case 'synthesis': return 'doc'; + case 'goal': return 'work'; + case 'decision': return 'doc'; + case 'process': return 'doc'; + default: return assertNever(t); +``` + +- [ ] **Step 4: Run typecheck to verify the union is consistent** + +Run: `bun run typecheck` +Expected: PASS (no type errors). If any switch/assertNever consumer fails, it means there's an exhaustive switch elsewhere that needs new cases — fix those before proceeding. + +- [ ] **Step 5: Run unit tests to verify contract test passes** + +Run: `bun test test/page-type-exhaustive.test.ts` +Expected: All 4 tests pass, including the round-trip and exhaustive switch tests. + +- [ ] **Step 6: Commit** + +```bash +git add src/core/types.ts test/page-type-exhaustive.test.ts +git commit -m "feat: add goal, decision, process to PageType union" +``` + +--- + +## Task 2: Add developer directory mappings to `inferType()` + +**Files:** +- Modify: `src/core/markdown.ts:344-375` (inferType function) +- Test: `test/markdown.test.ts` + +- [ ] **Step 1: Write failing tests for the three new directory mappings** + +Add tests to `test/markdown.test.ts` inside the existing `inferType` / `parseMarkdown` describe block. 
Find the section that tests type inference from file paths (look for the `people/someone.md` test around lines 70-89) and add after it: + +```typescript + test('inferType: goals/ → goal', () => { + const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'goals/setup-jwt-auth.md'); + expect(result.type).toBe('goal'); + }); + + test('inferType: decisions/ → decision', () => { + const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'decisions/chose-postgres.md'); + expect(result.type).toBe('decision'); + }); + + test('inferType: processes/ → process', () => { + const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'processes/deploy-to-prod.md'); + expect(result.type).toBe('process'); + }); + + test('inferType: decisions/ under projects/ → decision (longest prefix)', () => { + const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'projects/my-app/decisions/use-redis.md'); + expect(result.type).toBe('decision'); + }); +``` + +- [ ] **Step 2: Run the tests to confirm they fail** + +Run: `bun test test/markdown.test.ts` +Expected: The four new tests FAIL (goals/ returns `concept`, decisions/ returns `concept`, processes/ returns `concept`, nested decisions/ returns `project`). + +- [ ] **Step 3: Add the directory mappings to `inferType()`** + +In `src/core/markdown.ts`, add three lines inside `inferType()`. Place them BEFORE the `/projects/` check (line 364) so `decisions/` under `projects/` matches `decision` first (the checks run in order, so the first match wins): + +```typescript + if (lower.includes('/goals/') || lower.includes('/goal/')) return 'goal'; + if (lower.includes('/decisions/') || lower.includes('/decision/')) return 'decision'; + if (lower.includes('/processes/') || lower.includes('/process/')) return 'process'; + if (lower.includes('/projects/') || lower.includes('/project/')) return 'project'; +``` + +The three new lines go right before the existing `projects/` line.
Do NOT remove or change any existing lines — the VC directory mappings stay for backward compatibility. + +- [ ] **Step 4: Run the tests to confirm they pass** + +Run: `bun test test/markdown.test.ts` +Expected: All tests pass, including the four new ones. + +- [ ] **Step 5: Run typecheck** + +Run: `bun run typecheck` +Expected: PASS + +- [ ] **Step 6: Commit** + +```bash +git add src/core/markdown.ts test/markdown.test.ts +git commit -m "feat: add goals/decisions/processes directory mappings to inferType" +``` + +--- + +## Task 3: Add developer directories to `DIR_PATTERN` auto-link regex + +**Files:** +- Modify: `src/core/link-extraction.ts:46` (DIR_PATTERN) +- Test: `test/link-extraction.test.ts` + +- [ ] **Step 1: Write failing tests for entity ref extraction from developer directories** + +Add tests to `test/link-extraction.test.ts` inside the existing `extractEntityRefs` describe block: + +```typescript + test('extractEntityRefs: goals/ directory link', () => { + const refs = extractEntityRefs('[Setup JWT](goals/setup-jwt-auth)'); + expect(refs).toEqual([{ name: 'Setup JWT', slug: 'goals/setup-jwt-auth' }]); + }); + + test('extractEntityRefs: decisions/ directory link', () => { + const refs = extractEntityRefs('[Chose Postgres](decisions/chose-postgres)'); + expect(refs).toEqual([{ name: 'Chose Postgres', slug: 'decisions/chose-postgres' }]); + }); + + test('extractEntityRefs: processes/ directory link', () => { + const refs = extractEntityRefs('[Deploy Flow](processes/deploy-to-prod)'); + expect(refs).toEqual([{ name: 'Deploy Flow', slug: 'processes/deploy-to-prod' }]); + }); + + test('extractEntityRefs: goals/ wikilink', () => { + const refs = extractEntityRefs('[[goals/setup-jwt-auth|Setup JWT]]'); + expect(refs).toEqual([{ name: 'Setup JWT', slug: 'goals/setup-jwt-auth' }]); + }); +``` + +- [ ] **Step 2: Run tests to confirm they fail** + +Run: `bun test test/link-extraction.test.ts` +Expected: The four new tests FAIL (DIR_PATTERN doesn't match 
goals/decisions/processes). + +- [ ] **Step 3: Add `goals`, `decisions`, `processes` to `DIR_PATTERN`** + +In `src/core/link-extraction.ts` line 46, add the three new directories to the regex alternation. Place them at the beginning of the alternation. None of the new names is a prefix of an existing alternative (or vice versa), so their position does not change what matches: + +```typescript +const DIR_PATTERN = '(?:goals|decisions|processes|people|companies|meetings|concepts|deal|civic|project|projects|source|media|yc|tech|finance|personal|openclaw|entities)'; +``` + +- [ ] **Step 4: Run tests to confirm they pass** + +Run: `bun test test/link-extraction.test.ts` +Expected: All tests pass, including the four new ones. + +- [ ] **Step 5: Run typecheck** + +Run: `bun run typecheck` +Expected: PASS + +- [ ] **Step 6: Commit** + +```bash +git add src/core/link-extraction.ts test/link-extraction.test.ts +git commit -m "feat: add goals/decisions/processes to DIR_PATTERN auto-link" +``` + +--- + +## Task 4: Rewrite `quality.md` — root of the delegation chain + +**Files:** +- Rewrite: `skills/conventions/quality.md` + +This is the most important skill file change. Every other file's Iron Law and notability gate delegates here. The VC scoping ("person or company") must become entity-generic. + +- [ ] **Step 1: Rewrite `quality.md`** + +Replace the entire contents of `skills/conventions/quality.md` with: + +```markdown +# Quality Convention + +Cross-cutting quality rules for all brain-writing skills. + +## Citations (MANDATORY) + +Every fact written to a brain page must carry an inline `[Source: ...]` citation. + +- **User's statements:** `[Source: User, {context}, YYYY-MM-DD]` +- **Meeting data:** `[Source: Meeting "{title}", YYYY-MM-DD]` +- **Email/message:** `[Source: email from {name} re: {subject}, YYYY-MM-DD]` +- **Web content:** `[Source: {publication}, {URL}, YYYY-MM-DD]` +- **Social media:** `[Source: X/@handle, YYYY-MM-DD](URL)` +- **Synthesis:** `[Source: compiled from {sources}]` + +### Source precedence (highest to lowest) + +1.
User's direct statements (highest authority) +2. Compiled truth (brain's synthesized understanding) +3. Timeline entries (raw evidence) +4. External sources (API enrichment, web search) + +## Back-Linking (MANDATORY) + +Every mention of an entity WITH a brain page MUST create a back-link +FROM that entity's page TO the page mentioning it. + +Entities: goals, decisions, processes, concepts — any page in a recognized +entity directory. + +Format: `- **YYYY-MM-DD** | Referenced in [page title](path) -- context` + +An unlinked mention is a broken brain. + +## Notability Gate + +Before creating a new brain page, check notability: + +- **Goals:** Is this a distinct execution arc worth documenting? (Not a sub-step of an existing goal) +- **Decisions:** Does this choice govern future work beyond the current goal? +- **Processes:** Is this repeatable and handoff-worthy? (Not a one-off sequence) +- **Concepts:** Reusable across goals? Stable? Non-procedural? (If it's steps, it's a process) + +When in doubt, capture in the current goal page first. Promote to its own page +only when reuse is clear. A missing page can be created later. A junk page +wastes attention and degrades search quality. +``` + +- [ ] **Step 2: Verify the file reads correctly** + +Run: `cat skills/conventions/quality.md` +Expected: The full new content with developer-domain notability criteria. + +- [ ] **Step 3: Commit** + +```bash +git add skills/conventions/quality.md +git commit -m "feat: generalize quality.md Iron Law and notability gate for developer domain" +``` + +--- + +## Task 5: Patch `brain-ops/SKILL.md` — the loop engine (8 sites) + +**Files:** +- Modify: `skills/brain-ops/SKILL.md` + +Eight hard-gate sites say "person or company" and must be changed to developer entity references. The `writes_to` frontmatter also needs updating. 
+ +- [ ] **Step 1: Update `writes_to` frontmatter (lines 22-26)** + +Replace: +```yaml +writes_to: + - people/ + - companies/ + - deals/ + - concepts/ + - meetings/ +``` + +With: +```yaml +writes_to: + - goals/ + - decisions/ + - processes/ + - concepts/ +``` + +- [ ] **Step 2: Update Iron Law scope (line 49)** + +Replace: +``` +Every mention of a person or company with a brain page MUST create a back-link +``` + +With: +``` +Every mention of an entity with a brain page MUST create a back-link +``` + +- [ ] **Step 3: Update Phase 1 description (line 57)** + +Replace: +``` +Before using ANY external API to research a person, company, or topic: +``` + +With: +``` +Before using ANY external API to research a goal, decision, process, or concept: +``` + +- [ ] **Step 4: Update Phase 2 trigger (lines 69-71)** + +Replace: +``` +Every message, meeting, email, or conversation that references a person or company: + +1. **Detect entities** — people, companies, deals mentioned +``` + +With: +``` +Every message or conversation that references a goal, decision, process, or concept: + +1. **Detect entities** — goals, decisions, processes, concepts mentioned +``` + +- [ ] **Step 5: Update Phase 2.5 link types (lines 88-89)** + +Replace: +``` +- Inferred link types: `attended` (meeting -> person), `works_at`, `invested_in`, + `founded`, `advises`, `source` (frontmatter), `mentions` (default). +``` + +With: +``` +- Inferred link types: `uses` (goal -> concept), `decided_in` (decision -> goal), + `depends_on` (process -> concept), `source` (frontmatter), `mentions` (default). 
+``` + +- [ ] **Step 6: Update Phase 3 description (line 98)** + +Replace: +``` +Before answering any question about a person, company, or topic: +``` + +With: +``` +Before answering any question about a goal, decision, process, or concept: +``` + +- [ ] **Step 7: Update Phase 4 ambient enrichment triggers (lines 111-112)** + +Replace: +``` +- Person mentioned → check brain, create/enrich if needed (spawn background) +- Company mentioned → same +``` + +With: +``` +- Goal mentioned → check brain, create/update if needed (spawn background) +- Decision/process/concept mentioned → same +``` + +- [ ] **Step 8: Update anti-patterns (line 147)** + +Replace: +``` +- Answering questions about people/companies without checking the brain first +``` + +With: +``` +- Answering questions about goals/decisions/processes/concepts without checking the brain first +``` + +- [ ] **Step 9: Verify the file reads correctly** + +Run: `cat skills/brain-ops/SKILL.md | head -60` +Expected: Updated frontmatter with developer directories and generalized Iron Law. + +- [ ] **Step 10: Commit** + +```bash +git add skills/brain-ops/SKILL.md +git commit -m "feat: patch brain-ops 8 hard-gate sites for developer domain" +``` + +--- + +## Task 6: Rewrite `signal-detector/SKILL.md` + +**Files:** +- Rewrite: `skills/signal-detector/SKILL.md` + +Replace the VC-oriented entity detection with developer-domain signal detection. The signal detector fires on every message and is the entry point for knowledge capture. + +- [ ] **Step 1: Rewrite the entire file** + +Replace the entire contents of `skills/signal-detector/SKILL.md` with: + +```markdown +--- +name: signal-detector +version: 2.0.0 +description: | + Always-on ambient signal capture for developer knowledge. Fires on every + inbound message to detect goals, decisions, processes, concepts, and + original thinking. Spawn as a cheap sub-agent in parallel, never block + the main response. 
+triggers: + - every inbound message (always-on) +tools: + - search + - query + - get_page + - put_page + - add_link + - add_timeline_entry +mutating: true +writes_pages: true +writes_to: + - goals/ + - decisions/ + - processes/ + - concepts/ +--- + +# Signal Detector — Developer Knowledge Capture + +Lightweight sub-agent that fires on every inbound message to capture TWO things +with EQUAL priority: + +1. **Original thinking** — the user's ideas, observations, frameworks +2. **Developer knowledge signals** — goals, decisions, processes, concepts + +Original thinking is AT LEAST as valuable as entity extraction. Ideas are the +intellectual capital. Entities are bookkeeping. Both compound over time. + +## Contract + +This skill guarantees: +- Fires on every message (no exceptions unless purely operational) +- Runs in parallel (spawned, never blocks main response) +- Captures ideas with the user's EXACT phrasing (no paraphrasing) +- Detects developer knowledge signals and creates/enriches brain pages +- Logs a one-line summary of what was captured +- Back-links all entity mentions (Iron Law) +- Citations on every fact written + +> **Convention:** See `skills/conventions/quality.md` for Iron Law back-linking. + +Every time this skill creates or updates a brain page that mentions another entity: +1. Check if that entity has a brain page +2. If yes → add a back-link FROM their page TO the page you just created/updated +3. Format: `- **YYYY-MM-DD** | Referenced in [page title](path) — brief context` +4. An unlinked mention is a broken brain. + +## Phases + +### Phase 1: Idea/Observation Detection (PRIMARY) + +When the user expresses a novel thought, observation, thesis, or framework: +- If it's the user's **original thinking** (they generated it) → create/update `concepts/{slug}` +- If it's a **reusable pattern or mental model** → create/update `concepts/{slug}` + +**Capture exact phrasing.** The user's language IS the insight. Don't paraphrase. 
+ +**Cross-linking (MANDATORY):** Every concept MUST link to related goals, decisions, +and processes. A concept without cross-links is a dead concept. + +### Phase 2: Developer Knowledge Detection (SECONDARY) + +Scan every message for these signals: + +1. **Goal signals** — "set up JWT auth", "migrate to Postgres", "fix the deploy", + any /goal invocation or development task being worked on + - Check brain: `gbrain search "goal name"` + - If no page → create `goals/{slug}` with approach, environment, initial state + - If page exists → update with new progress, debug trails, decisions made + +2. **Decision signals** — "we chose X because Y", "decided to", "tradeoff", + "going with", "ruling out" + - If the decision governs future work beyond this goal → create `decisions/{slug}` + - If the decision is local to the current goal → log on the goal page + - Always record: what was decided, why, what alternatives were considered + +3. **Process signals** — "to deploy, you need to", "the workflow is", "steps to", + "how to set up", repeatable sequences + - Create `processes/{slug}` with preconditions, steps, verification + - Only if the process is reproducible and handoff-worthy + +4. **Concept signals** — "event sourcing works by", "the repository pattern", + "Docker needs this flag because", tool knowledge, pattern explanations + - Create/update `concepts/{slug}` with context-free reusable understanding + - Must be: reusable, cross-goal, stable, non-procedural + +5. **Debug signals** — "the bug was caused by", "root cause was", "fixed by" + - Add structured timeline entry to the active goal page (NOT a separate page) + - Format: `- **YYYY-MM-DD** | Debug — **Symptom:** X. **Root cause:** Y. **Fix:** Z.` + +For each entity: +- `gbrain search "name"` — does a page exist? +- If NO page → check notability (see quality.md). If notable, create with enrichment. 
+- If page exists but THIN → enrich with new information +- If page exists and RICH → add timeline entry if there's new dated information + +**Auto-link (v0.10.1):** When you write/update a page that references another +entity, the auto-link post-hook on `put_page` automatically creates the graph +edge. You don't need to call `gbrain link` manually. Timeline entries still +need explicit calls. + +### Phase 3: Signal Logging + +Always log a one-line summary: +- `Signals: 0 ideas, 0 entities, 0 facts (skipped: operational)` +- `Signals: 1 concept (captured → concepts/x), 1 goal (updated → goals/y), 1 decision (created → decisions/z)` + +This makes the ambient capture loop debuggable. + +## Output Format + +No visible output to the user. This skill runs silently in the background. +The output is brain pages created/updated and the signal log line. + +## Anti-Patterns + +- Blocking the main response to wait for signal detection to complete +- Paraphrasing the user's original thinking instead of capturing exact phrasing +- Creating pages for non-notable entities (one-off mentions, sub-steps) +- Skipping back-links after creating/updating pages +- Running on purely operational messages ("ok", "thanks", "do it") +- Creating a separate page for debug trails (they go on the goal page) +- Filing a concept that's really a process (if it has steps, it's a process) + +## Tools Used + +- `search` — check if entity page exists +- `query` — semantic search for related context +- `get_page` — load existing entity pages +- `put_page` — create/update brain pages +- `add_link` — cross-reference entities +- `add_timeline_entry` — record events on entity timelines +``` + +- [ ] **Step 2: Verify the file reads correctly** + +Run: `head -30 skills/signal-detector/SKILL.md` +Expected: Updated frontmatter with `writes_to: goals/, decisions/, processes/, concepts/` and version 2.0.0. 
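
The Phase 2 routing rules in the rewritten skill (where each signal type ends up) can be summarized in a small sketch. The skill itself is prose instructions for an agent, so every name here (`DetectedSignal`, `FilingTarget`, `routeSignal`, and the `governsFutureWork` and `repeatable` flags) is hypothetical and exists only to make the decision logic explicit.

```typescript
// Hypothetical signal shapes; the boolean flags stand in for the agent's
// judgment calls described in the skill text.
type DetectedSignal =
  | { kind: "goal"; slug: string }
  | { kind: "decision"; slug: string; governsFutureWork: boolean }
  | { kind: "process"; slug: string; repeatable: boolean }
  | { kind: "concept"; slug: string }
  | { kind: "debug" };

type FilingTarget =
  | { action: "page"; path: string }
  | { action: "goal-page-note"; path: string }
  | { action: "goal-timeline-entry"; path: string };

function routeSignal(signal: DetectedSignal, activeGoalSlug: string): FilingTarget {
  const goalPath = "goals/" + activeGoalSlug;
  switch (signal.kind) {
    case "goal":
      return { action: "page", path: "goals/" + signal.slug };
    case "decision":
      // Durable, cross-goal decisions get their own page; local ones stay on the goal.
      return signal.governsFutureWork
        ? { action: "page", path: "decisions/" + signal.slug }
        : { action: "goal-page-note", path: goalPath };
    case "process":
      // Only reproducible, handoff-worthy procedures are promoted to processes/.
      return signal.repeatable
        ? { action: "page", path: "processes/" + signal.slug }
        : { action: "goal-page-note", path: goalPath };
    case "concept":
      return { action: "page", path: "concepts/" + signal.slug };
    case "debug":
      // Debug trails are timeline entries on the active goal page, never separate pages.
      return { action: "goal-timeline-entry", path: goalPath };
  }
}

const localDecision = routeSignal(
  { kind: "decision", slug: "use-redis", governsFutureWork: false },
  "setup-jwt-auth"
);
const durableDecision = routeSignal(
  { kind: "decision", slug: "chose-postgres", governsFutureWork: true },
  "setup-jwt-auth"
);
console.log(localDecision, durableDecision);
```

Modeling the signal as a discriminated union means the compiler flags any new signal kind that lacks a routing rule, which mirrors the exhaustive-switch contract test used for `PageType` in Task 1.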
+ +- [ ] **Step 3: Commit** + +```bash +git add skills/signal-detector/SKILL.md +git commit -m "feat: rewrite signal-detector for developer domain knowledge capture" +``` + +--- + +## Task 7: Rewrite `_brain-filing-rules.md` and patch `_brain-filing-rules.json` + +**Files:** +- Rewrite: `skills/_brain-filing-rules.md` +- Modify: `skills/_brain-filing-rules.json` + +- [ ] **Step 1: Rewrite `_brain-filing-rules.md`** + +Replace the entire contents of `skills/_brain-filing-rules.md` with: + +```markdown +# Brain Filing Rules -- MANDATORY for all skills that write to the brain + +## The Rule + +The PRIMARY SUBJECT of the content determines where it goes. Not the format, +not the source, not the skill that's running. + +## Decision Protocol + +1. Identify the primary subject (a goal? decision? process? concept?) +2. File in the directory that matches the subject +3. Cross-link from related directories +4. When in doubt: what would you search for to find this page again? + +## Operational Rule + +Capture everything in `goals/` first. 
Promote out only when reusable: +- `decision` — if the choice should constrain other goals +- `process` — if it's reproducible and handoff-worthy +- `concept` — if it generalizes beyond the specific case + +## Common Misfiling Patterns -- DO NOT DO THESE + +| Wrong | Right | Why | +|-------|-------|-----| +| Local decision on goal page → `decisions/` | Keep on `goals/` page | Only durable cross-goal choices go to decisions/ | +| One-off command sequence → `processes/` | Keep on `goals/` page | processes/ is for repeatable, handoff-worthy workflows | +| Project-specific config note → `concepts/` | Keep on `goals/` page | concepts/ is for context-free reusable knowledge | +| Reusable pattern buried in goal page | → `concepts/` | If it applies to more than one goal, promote it | +| Debug trail → separate page | → timeline entry on `goals/` page | Debug trails are structured timeline entries, not pages | +| A series of steps → `concepts/` | → `processes/` | If it has steps, it's a process | + +## MECE Boundaries (hard rules) + +| Pair | Boundary | +|------|----------| +| goals/ vs decisions/ | goals: what happened in one execution run. decisions: durable choice meant to govern future goals | +| goals/ vs processes/ | goals: narrative + debug trail. processes: canonical reproducible procedure (no session story) | +| goals/ vs concepts/ | goals: applied, context-bound. concepts: context-free reusable understanding | +| decisions/ vs processes/ | decisions: what/why we chose. processes: how to execute | +| decisions/ vs concepts/ | decisions: committed policy for a scope. concepts: explanatory model, no commitment | +| processes/ vs concepts/ | processes: stepwise action. concepts: theory/pattern vocabulary | + +## Sanctioned exception: synthesis output is sui generis + +The "file by primary subject" rule is for raw ingest. Synthesized output that +is one-of-one to a single source AND a specific reader does not fit any +subject directory cleanly. 
+ +Format-prefixed paths under `media//` are the sanctioned +exception: + +- `media/books/-personalized.md` (book-mirror output) +- `media/articles/-personalized.md` (long-form article personalization) + +## What `sources/` Is Actually For + +`sources/` is ONLY for: +- Bulk data imports (API dumps, CSV exports, snapshots) +- Raw data that feeds multiple brain pages +- Periodic captures (quarterly snapshots, sync exports) + +If the content has a clear primary subject (a goal, decision, process, concept), +it does NOT go in sources/. Period. + +## Notability Gate + +Not everything deserves a brain page. Before creating a new entity page: +- **Goals:** Is this a distinct execution arc? (Not a sub-step of an existing goal) +- **Decisions:** Does this choice govern future work beyond the current goal? +- **Processes:** Is this repeatable and handoff-worthy? (Not a one-off sequence) +- **Concepts:** Reusable across goals? Stable? Non-procedural? +- **When in doubt, DON'T create.** Capture on the goal page first. Promote later. + +## Iron Law: Back-Linking (MANDATORY) + +Every mention of an entity with a brain page MUST create a back-link +FROM that entity's page TO the page mentioning it. This is bidirectional: +the new page links to the entity, AND the entity's page links back. + +Format for back-links (append to Timeline or See Also): +``` +- **YYYY-MM-DD** | Referenced in [page title](path/to/page.md) -- brief context +``` + +An unlinked mention is a broken brain. The graph is the intelligence. + +## Citation Requirements (MANDATORY) + +Every fact written to a brain page must carry an inline `[Source: ...]` citation. + +Three formats: +- **Direct attribution:** `[Source: User, {context}, YYYY-MM-DD]` +- **API/external:** `[Source: {provider}, YYYY-MM-DD]` or `[Source: {publication}, {URL}]` +- **Synthesis:** `[Source: compiled from {list of sources}]` + +Source precedence (highest to lowest): +1. User's direct statements (highest authority) +2. 
Compiled truth (pre-existing brain synthesis) +3. Timeline entries (raw evidence) +4. External sources (API enrichment, web search -- lowest) + +When sources conflict, note the contradiction with both citations. Don't +silently pick one. + +## Raw Source Preservation + +Every ingested item should have its raw source preserved for provenance. + +**Size routing (automatic via `gbrain files upload-raw`):** +- **< 100 MB text/PDF**: stays in the brain repo (git-tracked) in a `.raw/` + sidecar directory alongside the brain page +- **>= 100 MB OR media files** (video, audio, images): uploaded to cloud + storage with a `.redirect.yaml` pointer left in the brain repo. + +## Dream-cycle synthesize / patterns directories (v0.23) + +The `synthesize` and `patterns` phases of `gbrain dream` write to a +**fixed allow-list** of paths sourced from `_brain-filing-rules.json`'s +`dream_synthesize_paths.globs` array. Editing that JSON is the ONLY way +to add a new directory the synthesis subagent may write to. + +## Brain-to-skill promotion pipeline + +When a process proves repeatable (2-3 times with only argument changes), +it graduates from a `processes/` brain page to an actual skill file: + +- Brain stores: context, evidence, tradeoffs, project-specific constraints, debug history +- Skill files store: stable, parameterized procedures with deterministic steps +- Promotion rule: if reused successfully 2-3 times with only argument changes, graduate to a skill +- Bidirectional links: process page links to skill file path, skill references source brain pages + +## Takes attribution (v0.32+) + +When writing a `` fence, the **holder** column says +WHO BELIEVES the claim, not who it's ABOUT. + +1. **Holder ≠ subject.** The test: did this person SAY or CLEARLY IMPLY this? +2. **Atomic claims.** Split compound rows into separate rows. One claim per row. +3. **Amplification ≠ endorsement.** A retweet-only signal caps at `weight 0.55`. +4. 
**Self-reported ≠ verified.** Self-report → `weight=0.75`, not `holder=world/1.0`. +5. **No false precision.** Use 0.05 increments only. +6. **"So what" test.** Skip metadata-style trivia. +``` + +- [ ] **Step 2: Add `goal`, `decision`, `process` kinds to `_brain-filing-rules.json`** + +In `skills/_brain-filing-rules.json`, add three new rule objects to the `rules` array. Insert them after the existing `concept` rule (after line 36): + +```json + { + "kind": "goal", + "directory": "goals/", + "examples": ["development tasks", "/goal executions", "debug sessions"], + "description": "One /goal execution arc: what was attempted, what happened, decisions made, debug trails, what was learned. The primary authoring unit — capture here first, promote out when reusable." + }, + { + "kind": "decision", + "directory": "decisions/", + "examples": ["architecture choices", "tool selections", "tradeoff resolutions"], + "description": "A durable technical choice that governs future work beyond one goal. ADR-style: context, options considered, decision, consequences." + }, + { + "kind": "process", + "directory": "processes/", + "examples": ["deploy workflows", "setup procedures", "migration runbooks"], + "description": "A canonical reproducible procedure that is handoff-worthy. Graduates to a skill file after 2-3 successful reuses with only argument changes." + }, +``` + +- [ ] **Step 3: Add developer directories to `dream_synthesize_paths.globs`** + +In `skills/_brain-filing-rules.json`, add three new globs to the `dream_synthesize_paths.globs` array (around line 157-163): + +```json + "globs": [ + "wiki/personal/reflections/*", + "wiki/originals/*", + "wiki/personal/patterns/*", + "wiki/people/*", + "dream-cycle-summaries/*", + "goals/*", + "decisions/*", + "processes/*" + ] +``` + +- [ ] **Step 4: Run the filing-audit test to verify the new kinds are accepted** + +Run: `bun test test/filing-audit.test.ts` +Expected: All tests pass. 
The filing audit reads `_brain-filing-rules.json` for valid directories, so adding the new kinds makes `goals/`, `decisions/`, `processes/` valid `writes_to` targets. + +- [ ] **Step 5: Run the skills-conformance test** + +Run: `bun test test/skills-conformance.test.ts` +Expected: All tests pass. The signal-detector and brain-ops skills now declare `writes_to` directories that exist in the filing rules JSON. + +- [ ] **Step 6: Commit** + +```bash +git add skills/_brain-filing-rules.md skills/_brain-filing-rules.json +git commit -m "feat: rewrite filing rules for developer domain taxonomy" +``` + +--- + +## Task 8: Rewrite `RESOLVER.md` — routing table + +**Files:** +- Rewrite: `skills/RESOLVER.md` + +Replace VC-oriented triggers with developer-oriented triggers. Keep the table structure and all non-VC skills (thinking skills, operational, setup, identity). + +- [ ] **Step 1: Rewrite `RESOLVER.md`** + +Replace the entire contents of `skills/RESOLVER.md`. **IMPORTANT:** All quoted trigger phrases in table rows must remain unchanged — the resolver test (D5/C) fuzzy-matches quoted phrases against each skill's frontmatter triggers. Since we are NOT modifying the underlying skills (query, enrich, data-research, etc.), their trigger phrases must stay the same. Only change unquoted descriptive text and the disambiguation rules. + +```markdown +# GBrain Skill Resolver + +This is the dispatcher. Skills are the implementation. **Read the skill file before acting.** If two skills could match, read both. They are designed to chain (e.g., ingest then enrich for each entity). 
+ +## Always-on (every message) + +| Trigger | Skill | +|---------|-------| +| Every inbound message (spawn parallel, don't block) | `skills/signal-detector/SKILL.md` | +| Any brain read/write/lookup/citation | `skills/brain-ops/SKILL.md` | + +## Brain operations + +| Trigger | Skill | +|---------|-------| +| "What do we know about", "tell me about", "search for", "who is", "background on", "notes on" | `skills/query/SKILL.md` | +| "Who knows who", "relationship between", "connections", "graph query" | `skills/query/SKILL.md` (use graph-query) | +| Creating/enriching a goal, decision, process, or concept page | `skills/enrich/SKILL.md` | +| Where does a new file go? Filing rules | `skills/repo-architecture/SKILL.md` | +| Fix broken citations in brain pages | `skills/citation-fixer/SKILL.md` | +| "citation audit", "check citations", "fix citations" | `skills/citation-fixer/SKILL.md` (focused fix). For broader brain health, chain into `skills/maintain/SKILL.md` | +| "Research", "track", "extract from email", "investor updates", "donations" | `skills/data-research/SKILL.md` | +| Share a brain page as a link | `skills/publish/SKILL.md` | +| "validate frontmatter", "check frontmatter", "fix frontmatter", "frontmatter audit", "brain lint" | `skills/frontmatter-guard/SKILL.md` | + +## Content & media ingestion + +| Trigger | Skill | +|---------|-------| +| User shares a link, article, or idea | `skills/idea-ingest/SKILL.md` | +| "watch this video", "process this YouTube link", "ingest this PDF", "save this podcast", "process this book", "summarize this book", "PDF book", "ingest it into my brain", "what's in this screenshot", "check out this repo" | `skills/media-ingest/SKILL.md` | +| Meeting transcript received | `skills/meeting-ingestion/SKILL.md` | +| Generic "ingest this" (auto-routes to above) | `skills/ingest/SKILL.md` | + +## Thinking skills (from GStack) + +| Trigger | Skill | +|---------|-------| +| "Brainstorm", "I have an idea", "office hours" | GStack: 
office-hours | +| "Review this plan", "CEO review", "poke holes" | GStack: ceo-review | +| "Debug", "fix", "broken", "investigate" | GStack: investigate | +| "Retro", "what shipped", "retrospective" | GStack: retro | + +> These skills come from GStack. If GStack is installed, the agent reads them directly. +> If not, brain-only mode still works (brain skills function without thinking skills). + +## Operational + +| Trigger | Skill | +|---------|-------| +| Task add/remove/complete/defer/review | `skills/daily-task-manager/SKILL.md` | +| Morning prep, meeting context, day planning | `skills/daily-task-prep/SKILL.md` | +| Daily briefing, "what's happening today" | `skills/briefing/SKILL.md` | +| Cron scheduling, quiet hours, job staggering | `skills/cron-scheduler/SKILL.md` | +| Save or load reports | `skills/reports/SKILL.md` | +| "Create a skill", "improve this skill" | `skills/skill-creator/SKILL.md` | +| "Skillify this", "is this a skill?", "make this proper" | `skills/skillify/SKILL.md` | +| "Compress my resolver", "AGENTS.md too large", "RESOLVER.md too big", "functional area dispatcher", "shrink routing table" | `skills/functional-area-resolver/SKILL.md` | +| "Is gbrain healthy?", morning health check, skillpack-check | `skills/skillpack-check/SKILL.md` | +| Post-restart health + auto-fix, smoke test | `skills/smoke-test/SKILL.md` | +| Cross-modal review, second opinion | `skills/cross-modal-review/SKILL.md` | +| "Validate skills", skill health check | `skills/testing/SKILL.md` | +| Webhook setup, external event processing | `skills/webhook-transforms/SKILL.md` | +| "Spawn agent", "background task", "parallel tasks", "steer agent", "pause/resume agent", "gbrain jobs submit", "submit a gbrain job", "submit a shell job", "shell job" | `skills/minion-orchestrator/SKILL.md` | +| "present options", "ask before proceeding", "choice gate", "user decision" | `skills/ask-user/SKILL.md` | + +## Setup & migration + +| Trigger | Skill | +|---------|-------| +| "Set up 
GBrain", first boot | `skills/setup/SKILL.md` | +| "Now what?", "fill my brain", "cold start", "bootstrap", "import my data", "what should I import first" | `skills/cold-start/SKILL.md` | +| "Migrate from Obsidian/Notion/Logseq" | `skills/migrate/SKILL.md` | +| Brain health check, maintenance run | `skills/maintain/SKILL.md` | +| "Extract links", "build link graph", "populate timeline" | `skills/maintain/SKILL.md` (extraction sections) | +| "Run dream", "process today's session", "synthesize my conversations", "consolidate yesterday's conversations", "what patterns did you see", "did the dream cycle run" | `skills/maintain/SKILL.md` (dream cycle section) | +| "Brain health", "what features am I missing", "brain score" | Run `gbrain features --json` | +| "Set up autopilot", "run brain maintenance", "keep brain updated" | Run `gbrain autopilot --install --repo ~/brain` | +| Agent identity, "who am I", customize agent | `skills/soul-audit/SKILL.md` | +| "Populate links", "extract links", "backfill graph" | `skills/maintain/SKILL.md` (graph population phase) | +| "Populate timeline", "extract timeline entries" | `skills/maintain/SKILL.md` (graph population phase) | + +## Identity & access (always-on) + +| Trigger | Skill | +|---------|-------| +| Non-owner sends a message | Check `ACCESS_POLICY.md` before responding | +| Agent needs to know its identity/vibe | Read `SOUL.md` | +| Agent needs user context | Read `USER.md` | +| Operational cadence (what to check and when) | Read `HEARTBEAT.md` | + +## Disambiguation rules + +When multiple skills could match: +1. Prefer the most specific skill (meeting-ingestion over ingest) +2. If the user mentions a URL, route by content type (link → idea-ingest, video → media-ingest) +3. If the user mentions a goal/decision/process/concept, check if enrich or query fits better +4. Chaining is explicit in each skill's Phases section +5. 
When in doubt, ask the user (see `skills/ask-user/SKILL.md` for the choice-gate pattern) + +## Conventions (cross-cutting) + +These apply to ALL brain-writing skills: +- `skills/conventions/quality.md` — citations, back-links, notability gate +- `skills/conventions/brain-first.md` — check brain before external APIs +- `skills/conventions/brain-routing.md` — which brain (DB) and which source (repo) to target; cross-brain federation is latent-space only +- `skills/conventions/subagent-routing.md` — when to use Minions vs inline work +- `skills/ask-user/SKILL.md` — choice-gate pattern for human input at decision points +- `skills/_brain-filing-rules.md` — where files go +- `skills/_output-rules.md` — output quality standards + +## Uncategorized + +| Trigger | Skill | +|---------|-------| +| "personalized version of this book", "mirror this book", "two-column book analysis", "apply this book to my life", "how does this book apply to me" | `skills/book-mirror/SKILL.md` | +| "enrich this article", "enrich brain pages", "batch enrich", "make brain pages useful" | `skills/article-enrichment/SKILL.md` | +| "strategic reading", "read this through the lens of", "apply this to my problem", "what can I learn from this about", "extract a playbook from" | `skills/strategic-reading/SKILL.md` | +| "concept synthesis", "synthesize my concepts", "find patterns across my notes", "build my intellectual map", "trace idea evolution" | `skills/concept-synthesis/SKILL.md` | +| "perplexity research", "what's new about", "current state of", "web research", "what changed about" | `skills/perplexity-research/SKILL.md` | +| "crawl my archive", "find gold in my archive", "archive crawler", "scan my dropbox for", "mine my old files for" | `skills/archive-crawler/SKILL.md` | +| "verify this academic claim", "check this study", "academic verify", "validate citation", "is this study real" | `skills/academic-verify/SKILL.md` | +| "make pdf from brain", "brain pdf", "convert brain page to pdf", 
"publish this page as pdf", "export brain page" | `skills/brain-pdf/SKILL.md` | +| "voice note", "ingest this voice memo", "transcribe and file", "voice note ingest", "save this audio note" | `skills/voice-note-ingest/SKILL.md` | +``` + +- [ ] **Step 2: Run resolver test** + +Run: `bun test test/resolver.test.ts` +Expected: All tests pass. The resolver test checks that every trigger in RESOLVER.md matches a skill's frontmatter `triggers:` entry. + +- [ ] **Step 3: Commit** + +```bash +git add skills/RESOLVER.md +git commit -m "feat: rewrite RESOLVER.md routing table for developer domain" +``` + +--- + +## Task 9: Patch `brain-first.md` — retrieval conventions + +**Files:** +- Modify: `skills/conventions/brain-first.md` + +- [ ] **Step 1: Update the header (line 3)** + +Replace: +``` +**Read this before doing ANY entity/person/company/fact lookup.** +``` + +With: +``` +**Read this before doing ANY entity/goal/decision/process/concept lookup.** +``` + +- [ ] **Step 2: Replace the entity page conventions table (lines 53-67)** + +Replace the entire "Entity Page Conventions" section: + +```markdown +## Entity Page Conventions + +Standard directory structure: + +| Directory | Type | Example | +|-----------|------|---------| +| `goals/` | goal | `goals/setup-jwt-auth.md` | +| `decisions/` | decision | `decisions/chose-postgres-over-sqlite.md` | +| `processes/` | process | `processes/deploy-to-production.md` | +| `concepts/` | concept | `concepts/event-sourcing.md` | + +When creating new pages, include proper frontmatter with `type`, `title`, +and `tags` fields. See `skills/_brain-filing-rules.md` for page templates. +``` + +- [ ] **Step 3: Verify the file reads correctly** + +Run: `cat skills/conventions/brain-first.md` +Expected: Developer entity table with goals/decisions/processes/concepts rows. 
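The manual `cat` check in Step 3 could also be scripted. The sketch below is a hypothetical helper (neither `missingEntityRows` nor `REQUIRED_DIRECTORIES` exists in gbrain) that reports which of the four developer directories are absent from the table. It assumes the table keeps the three-column layout from Step 2, with the directory as a backticked first cell.

```typescript
// Hypothetical helper, not part of gbrain: reports which developer entity
// directories are missing from the "Entity Page Conventions" table.
const REQUIRED_DIRECTORIES = ["goals/", "decisions/", "processes/", "concepts/"] as const;

function missingEntityRows(markdown: string): string[] {
  // Collect the first cell of every table row that is a backticked directory.
  const found = new Set<string>();
  for (const line of markdown.split("\n")) {
    const match = line.match(/^\|\s*`([^`]+\/)`\s*\|/);
    if (match) found.add(match[1]);
  }
  // Keep only the required directories that never appeared as a first cell.
  return REQUIRED_DIRECTORIES.filter((dir) => !found.has(dir));
}

// Sample input copied from the table in Step 2.
const sampleTable = [
  "| Directory | Type | Example |",
  "|-----------|------|---------|",
  "| `goals/` | goal | `goals/setup-jwt-auth.md` |",
  "| `decisions/` | decision | `decisions/chose-postgres-over-sqlite.md` |",
  "| `processes/` | process | `processes/deploy-to-production.md` |",
  "| `concepts/` | concept | `concepts/event-sourcing.md` |",
].join("\n");

console.log(missingEntityRows(sampleTable));
```

To use it against the real file, read `skills/conventions/brain-first.md` and pass its contents in. An empty result means all four rows are present.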
+ +- [ ] **Step 4: Commit** + +```bash +git add skills/conventions/brain-first.md +git commit -m "feat: update brain-first.md entity conventions for developer domain" +``` + +--- + +## Task 10: Full verification pass + +**Files:** +- None modified — verification only + +- [ ] **Step 1: Run typecheck** + +Run: `bun run typecheck` +Expected: PASS + +- [ ] **Step 2: Run full unit test suite** + +Run: `bun run test > /tmp/customized_domain_tests.txt 2>&1; echo "EXIT=$?"; tail -50 /tmp/customized_domain_tests.txt` +Expected: All tests pass. Zero failures. + +- [ ] **Step 3: Run the PageType consumer audit** + +Run: `grep -rn 'PageType\|ALL_PAGE_TYPES' src/ --include='*.ts' | grep -v node_modules | grep -v 'import.*PageType'` + +Review the output for any switch statements, whitelist arrays, or filter expressions that enumerate page types. The new types (`goal`, `decision`, `process`) must not be silently excluded by any existing filter. Key files to check: +- `src/core/facts/eligibility.ts` — `ELIGIBLE_TYPES` array. This is intentionally narrow (note/meeting/slack/email/calendar-event/source/writing). Developer types are NOT eligible for facts backstop, which is correct (goals/decisions/processes are structured pages, not conversation-shaped). +- `src/commands/doctor.ts` — `graph_coverage` check uses `type IN ('entity', 'person', 'company', 'organization')`. This is a Tier 2 change (not loop-breaking). Note it but don't block on it. + +- [ ] **Step 4: Run skills conformance test** + +Run: `bun test test/skills-conformance.test.ts` +Expected: All tests pass. + +- [ ] **Step 5: Run filing-audit test** + +Run: `bun test test/filing-audit.test.ts` +Expected: All tests pass. + +- [ ] **Step 6: Run check-resolvable test** + +Run: `bun test test/check-resolvable.test.ts` +Expected: All tests pass. + +- [ ] **Step 7: Run resolver test** + +Run: `bun test test/resolver.test.ts` +Expected: All tests pass. 
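The consumer audit in Step 3 looks for whitelists that silently drop new page types. One way to turn that kind of omission into a compile-time error is to key the decision on `Record<PageType, boolean>`. The sketch below assumes a simplified `PageType` union; the real type in `src/` has more members and may be declared differently. The names `FACTS_ELIGIBLE` and `isFactsEligible` are illustrative, not gbrain APIs.

```typescript
// Hypothetical, trimmed-down page-type union for illustration only.
type PageType = "note" | "meeting" | "concept" | "goal" | "decision" | "process";

// Because the map is typed Record<PageType, boolean>, adding a member to
// PageType without adding a key here fails typechecking, so a new type can
// never be excluded by accident.
const FACTS_ELIGIBLE: Record<PageType, boolean> = {
  note: true,
  meeting: true,
  concept: false,
  // Developer types are structured pages, so they stay out of the facts backstop.
  goal: false,
  decision: false,
  process: false,
};

function isFactsEligible(type: PageType): boolean {
  return FACTS_ELIGIBLE[type];
}

console.log(isFactsEligible("goal"), isFactsEligible("note"));
```

A plain array such as `ELIGIBLE_TYPES` gives no such signal, which is why the grep-based review is still needed for existing code.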
+ +- [ ] **Step 8: Spot-check the inferLinkType limitation** + +Run: `grep -n 'inferLinkType' src/core/link-extraction.ts | head -5` + +Note: `inferLinkType()` classifies developer entity relationships as `mentions` (the default fallback). This is a known v1 limitation per the spec. The function uses regex heuristics tuned for VC relationships (founded, invested_in, works_at, attended). Adding developer-specific heuristics (uses, decided_in, depends_on) is a Tier 2 follow-up. + +--- + +## Task 11 (Tier 2, optional): Update doctor.ts graph_coverage check + +**Files:** +- Modify: `src/commands/doctor.ts:1378` + +This is a Tier 2 change — nice to have but not loop-breaking. + +- [ ] **Step 1: Update the type filter in graph_coverage check** + +In `src/commands/doctor.ts` line 1378, expand the SQL `type IN (...)` clause: + +Replace: +```sql +SELECT COUNT(*)::int AS count FROM pages WHERE type IN ('entity', 'person', 'company', 'organization') +``` + +With: +```sql +SELECT COUNT(*)::int AS count FROM pages WHERE type IN ('entity', 'person', 'company', 'organization', 'goal', 'decision', 'process') +``` + +- [ ] **Step 2: Run doctor test if one exists** + +Run: `if [ -f test/doctor.test.ts ]; then bun test test/doctor.test.ts; else echo "No doctor test file"; fi` +Expected: Either passes or no test file exists. + +- [ ] **Step 3: Commit** + +```bash +git add src/commands/doctor.ts +git commit -m "feat: include developer types in doctor graph_coverage check" +``` diff --git a/llms-full.txt b/llms-full.txt index 36979515d..c76ce072d 100644 --- a/llms-full.txt +++ b/llms-full.txt @@ -140,393 +140,37 @@ strict behavior when unset. ## Key files -- `src/core/operations.ts` — Contract-first operation definitions (the foundation). Also exports upload validators: `validateUploadPath`, `validatePageSlug`, `validateFilename`, plus `matchesSlugAllowList(slug, prefixes)` (v0.23 glob matcher: `/*` matches recursive children; bare `` matches exact only). 
`OperationContext.remote` flags untrusted callers; `OperationContext.allowedSlugPrefixes` (v0.23) is the trusted-workspace allow-list set by the dream cycle. `put_page` enforces: when `viaSubagent` and `allowedSlugPrefixes` is set, slug must match the allow-list; else the legacy `wiki/agents//...` namespace check applies. Auto-link enabled for trusted-workspace writes (skipped only when `remote=true && !trustedWorkspace`). As of v0.26.0, every `Operation` also carries `scope?: 'read' | 'write' | 'admin'` + `localOnly?: boolean`. All ops are annotated; `sync_brain`, `file_upload`, `file_list`, and `file_url` are `admin + localOnly` (rejected over HTTP). `OperationContext.auth?: AuthInfo` is threaded through HTTP dispatch for scope enforcement in `serve-http.ts` before the op runs. **v0.26.9 (D12 + F7b):** `OperationContext.remote` is now a REQUIRED field in the TypeScript type — the compiler is the first defense against transports that forget to set it. Four trust-boundary call sites (`put_page` allowlist, file_upload trust-narrowing, submit_job protected-name guard, auto-link skip) flipped from falsy-default (`!ctx.remote`) to fail-closed semantics (`ctx.remote === false` for "trusted-only" sites and `ctx.remote !== false` for "untrust unless explicit-false"). Anything that isn't strictly `false` is now treated as remote. Closed an HTTP MCP shell-job RCE: a `read+write`-scoped OAuth token could submit `shell` jobs because the HTTP request handler's literal context skipped `remote: true` and `submit_job`'s protected-name guard saw a falsy undefined. Stdio MCP set the field correctly via dispatch.ts; HTTP inlined a parallel context-builder for several releases and lost it. -- `src/core/engine.ts` — Pluggable engine interface (BrainEngine). `clampSearchLimit(limit, default, cap)` takes an explicit cap so per-operation caps can be tighter than `MAX_SEARCH_LIMIT`. 
Exports `LinkBatchInput` / `TimelineBatchInput` for the v0.12.1 bulk-insert API (`addLinksBatch` / `addTimelineEntriesBatch`). As of v0.13.1, `BrainEngine` has a `readonly kind: 'postgres' | 'pglite'` discriminator so migrations (`src/core/migrate.ts`) and other consumers can branch on engine without `instanceof` + dynamic imports. **v0.29:** four new methods — `batchLoadEmotionalInputs(slugs?)` (CTE-shaped read with per-table aggregates so a page × N tags × M takes never produces N×M rows), `setEmotionalWeightBatch(rows)` (`UPDATE FROM unnest($1::text[], $2::text[], $3::real[])` composite-keyed on `(slug, source_id)` for multi-source safety), `getRecentSalience(opts)`, `findAnomalies(opts)`. `PageFilters` extended with `sort?: 'updated_desc' | 'updated_asc' | 'created_desc' | 'slug'` + `PAGE_SORT_SQL` whitelist consumed by both engines (was hardcoded `ORDER BY updated_at DESC`). **v0.32.8 (PR #860):** new `listAllPageRefs(): Promise>` ordered by `(source_id, slug)`. Cheap cross-source enumeration for hot loops on large brains — replaces the `getAllSlugs()→getPage(slug)` N+1 pattern in extract-takes, extract, integrity, which silently defaulted to `source_id='default'` for non-default-source pages. Implementation parity across postgres-engine.ts + pglite-engine.ts. Pinned by `test/e2e/multi-source-bug-class.test.ts`. +> Only files relevant to the customized-domain spec are listed here. +> For the full key-files catalog, see the main CLAUDE.md. + +- `src/core/operations.ts` — Contract-first operation definitions (the foundation). `put_page` enforces slug allow-lists when `viaSubagent` and `allowedSlugPrefixes` is set. Auto-link enabled for trusted-workspace writes (skipped only when `remote=true && !trustedWorkspace`). +- `src/core/engine.ts` — Pluggable engine interface (BrainEngine). `clampSearchLimit(limit, default, cap)` takes an explicit cap so per-operation caps can be tighter than `MAX_SEARCH_LIMIT`. 
Exports `LinkBatchInput` / `TimelineBatchInput` for the bulk-insert API (`addLinksBatch` / `addTimelineEntriesBatch`). `BrainEngine` has a `readonly kind: 'postgres' | 'pglite'` discriminator. - `src/core/engine-factory.ts` — Engine factory with dynamic imports (`'pglite'` | `'postgres'`) -- `src/core/pglite-engine.ts` — PGLite (embedded Postgres 17.5 via WASM) implementation, all 40 BrainEngine methods. `addLinksBatch` / `addTimelineEntriesBatch` use multi-row `unnest()` with manual `$N` placeholders. As of v0.13.1, `connect()` wraps `PGlite.create()` in a try/catch that emits an actionable error naming the macOS 26.3 WASM bug (#223) and pointing at `gbrain doctor`; the lock is released on failure so the next process can retry cleanly. v0.22.0: `searchKeyword` and `searchKeywordChunks` multiply `ts_rank` by the source-factor CASE expression at the chunk-grain level; `searchVector` becomes a two-stage CTE — inner CTE keeps `ORDER BY cc.embedding <=> vec` so HNSW stays usable, outer SELECT re-ranks by `raw_score * source_factor`. Inner LIMIT scales with offset to preserve pagination contract. As of v0.22.6.1, `initSchema()` calls `applyForwardReferenceBootstrap()` BEFORE replaying SCHEMA_SQL — probes for the specific forward-referenced state the embedded schema blob needs (`pages.source_id`, `links.link_source`, `links.origin_page_id`, `content_chunks.symbol_name`, `content_chunks.language`, `sources` FK target table) and adds only what's missing. Closes the upgrade-wedge bug class that bit users 10+ times across 6 schema versions over 2 years (#239/#243/#266/#357/#366/#374/#375/#378/#395/#396). No-op on fresh installs and modern brains. -- `src/core/pglite-schema.ts` — PGLite-specific DDL (pgvector, pg_trgm, triggers) -- `src/core/postgres-engine.ts` — Postgres + pgvector implementation (Supabase / self-hosted). `addLinksBatch` / `addTimelineEntriesBatch` use `INSERT ... SELECT FROM unnest($1::text[], ...) 
JOIN pages ON CONFLICT DO NOTHING RETURNING 1` — 4-5 array params regardless of batch size, sidesteps the 65535-parameter cap. As of v0.12.3, `searchKeyword` / `searchVector` scope `statement_timeout` via `sql.begin` + `SET LOCAL` so the GUC dies with the transaction instead of leaking across the pooled postgres.js connection (contributed by @garagon). `getEmbeddingsByChunkIds` uses `tryParseEmbedding` so one corrupt row skips+warns instead of killing the query. v0.22.0: `searchKeyword`, `searchKeywordChunks`, and `searchVector` apply source-aware ranking by inlining the source-factor CASE and `NOT (col LIKE …)` hard-exclude clause from `src/core/search/sql-ranking.ts`. `searchVector` switches to a two-stage CTE (HNSW-safe inner ORDER BY, source-boost re-rank in the outer SELECT) and carries `p.source_id` through inner→outer for v0.18 multi-source callers. v0.22.1 (#406): `_savedConfig` retains the connect config; `reconnect()` tears down + recreates the pool from saved config (called by supervisor watchdog after 3 consecutive health-check failures). `executeRaw` is a single-statement passthrough — no per-call retry (D3 dropped that as unsound for non-idempotent statements; recovery is supervisor-driven). v0.22.1 (#363, contributed by @orendi84): `connect()` applies `resolveSessionTimeouts()` from `db.ts` as connection-time startup parameters (`statement_timeout`, `idle_in_transaction_session_timeout`) so orphan pgbouncer backends can't hold locks for hours. v0.22.1 (#409, contributed by @atrevino47): `countStaleChunks()` + `listStaleChunks()` server-side-filter on `embedding IS NULL` for `embed --stale`, eliminating ~76 MB/call client-side pull on a fully-embedded brain; `upsertChunks()` resets both `embedding` AND `embedded_at` to NULL when chunk_text changes without a new embedding (consistency). 
As of v0.22.6.1, `initSchema()` calls `applyForwardReferenceBootstrap()` BEFORE replaying SCHEMA_SQL on the same forward-reference probe set as the PGLite engine, so old Postgres brains pinned at v0.13/v0.18/v0.19 walk forward cleanly instead of wedging on `column "..." does not exist`. **v0.28.1:** `disconnect()` is now idempotent. New `_connectionStyle` instance field tracks whether the engine owns its pool (worker engines) or shares the module-level singleton; second call on an instance-pool engine is a no-op rather than falling through to `db.disconnect()` and clobbering the singleton. Pinned by `test/e2e/postgres-engine-disconnect-idempotency.test.ts` (2 cases). Closes the bug class where any test sharing an engine across multiple `worker.start()` / `worker.stop()` cycles silently broke its own DB connectivity. -- `src/core/cjk.ts` (v0.32.7 CJK wave) — Single source of truth for CJK detection across the codebase. Exports `CJK_RANGES_REGEX`, `CJK_SLUG_CHARS` (character-class fragment for embedding inside other regexes), `CJK_SENTENCE_DELIMITERS` (`。!?`), `CJK_CLAUSE_DELIMITERS` (`;:,、`), `CJK_DENSITY_THRESHOLD = 0.30`, `hasCJK(s)`, `countCJKAwareWords(s)` (30% density threshold — English docs with one Japanese term stay whitespace-tokenized; Chinese-dominant docs get char-counted), and `escapeLikePattern(s)` (escapes `%`, `_`, `\\` for `ILIKE ... ESCAPE '\\'`). Replaces the inline hasCJK regex previously duplicated at `expansion.ts:58`. BMP-only ranges (Han / Hiragana / Katakana / Hangul Syllables); widening to Unicode property escapes is a v0.33+ TODO. Consumers: `expansion.ts`, `sync.ts:slugifySegment`, `operations.ts:validatePageSlug + validateFilename`, `chunkers/recursive.ts:countWords + DELIMITERS`, `pglite-engine.ts:searchKeyword + searchKeywordChunks`. -- `src/core/audit-slug-fallback.ts` (v0.32.7 CJK wave) — Weekly ISO-week-rotated audit JSONL at `~/.gbrain/audit/slug-fallback-YYYY-Www.jsonl`. 
`logSlugFallback(slug, sourcePath)` fires when `importFromFile` falls back to a frontmatter slug because `slugifyPath` returned empty (emoji / Thai / Arabic / non-CJK exotic-script filenames). `readRecentSlugFallbacks(days)` reads the last N days for `gbrain doctor`'s `slug_fallback_audit` check. Honors `GBRAIN_AUDIT_DIR` via the shared `resolveAuditDir()` from shell-audit.ts. Separate surface from `sync-failures.jsonl` per codex outside-voice review — that file carries bookmark-gating semantics that info events shouldn't trigger. -- `src/core/embedding-pricing.ts` (v0.32.7 CJK wave) — `EMBEDDING_PRICING` map keyed `provider:model` for the post-upgrade reindex cost estimate. Sibling to `anthropic-pricing.ts`. Entries: OpenAI text-embedding-3-large ($0.13/1M), 3-small ($0.02/1M), ada-002 ($0.10/1M), Voyage 3-large ($0.18/1M), 3 ($0.06/1M). `lookupEmbeddingPrice(modelString)` returns a tagged union (`known` with price + `unknown` with provider name); `estimateCostFromChars(charCount, pricePerMTok)` uses 3.5 chars/token approximation. Unknown providers degrade gracefully to "estimate unavailable" instead of fabricating numbers. -- `src/core/post-upgrade-reembed.ts` (v0.32.7 CJK wave) — Pure functions backing the `gbrain upgrade` chunker-bump cost prompt. `computeReembedEstimate(engine, model)` queries real SQL (`COUNT(*)` + `COALESCE(SUM(LENGTH(compiled_truth)) + SUM(LENGTH(timeline)), 0)`) on `pages WHERE chunker_version < MARKDOWN_CHUNKER_VERSION`. `formatReembedPrompt(est, graceSeconds)` is the stderr-line formatter. `runPostUpgradeReembedPrompt(engine, model, opts)` orchestrates the 10-second Ctrl-C window; TTY-only wait (non-TTY auto-proceeds for CI / cron); `GBRAIN_NO_REEMBED=1` bails out with a doctor-warning marker; `GBRAIN_REEMBED_GRACE_SECONDS=0` skips the wait. -- `src/commands/reindex.ts` (v0.32.7 CJK wave) — `gbrain reindex --markdown [--limit N] [--dry-run] [--json] [--no-embed] [--repo PATH]`. 
Walks `pages WHERE page_kind = 'markdown' AND chunker_version < MARKDOWN_CHUNKER_VERSION` in 100-row batches, ordered by id. Rows with non-null `source_path` re-import via `importFromFile`; rows without fall back to `importFromContent` against the stored `compiled_truth`. **Both paths pass `forceRechunk: true`** to bypass `importFromContent`'s `content_hash` short-circuit — without that flag (codex post-merge F1), the chunker version bump never reaches pages whose source content hasn't changed since last sync, AND master's v0.32.2 stripFactsFence privacy strip never applies to pre-strip chunks. Idempotent — partial-completion re-runs pick up where they left off via id-ordered batches. Wired into `src/commands/upgrade.ts:runPostUpgrade` after `apply-migrations`. -- `src/commands/sync.ts:resolveSlugByPathOrSourcePath` (v0.32.7 CJK wave, codex post-merge F4) — Resolves a slug by `pages.source_path` first (returns the stored slug for frontmatter-fallback pages whose path doesn't derive a slug), then falls back to `resolveSlugForPath(path)`. Threaded into all 4 delete/rename call sites (`performSync`'s un-syncable cleanup at ~:531, deletes at ~:603, rename oldSlug at ~:622). Without this, emoji-only / Thai / Arabic filenames whose slug came from frontmatter would orphan on delete/rename (the delete path would compute the wrong path-derived slug). Best-effort query — pre-migration brains fall through to the legacy path. -- `src/core/utils.ts` — Shared SQL utilities extracted from postgres-engine.ts. Exports `parseEmbedding(value)` (throws on unknown input, used by migration + ingest paths where data integrity matters) and as of v0.12.3 `tryParseEmbedding(value)` (returns `null` + warns once per process, used by search/rescore paths where availability matters more than strictness). **v0.26.9 (D14):** adds `isUndefinedColumnError(err)` predicate — pattern-matches Postgres SQLSTATE 42703 / "column ... does not exist" with engine-driver shape variation tolerated. 
Replaces bare `catch {}` blocks in `oauth-provider.ts` so genuine errors (lock timeout, network blip, permission denied) propagate while column-missing falls through to the legacy fallback path. Reusable from any future code that needs the same column-existence probe semantics. **v0.32.8 (PR #860):** adds `validateSourceId(id)` that throws on anything outside `^[a-z0-9_-]+$`. Used by the per-source disk-layout fix in patterns.ts/synthesize.ts before any `join(brainDir, '.sources', source_id, slug+'.md')` call so source_id can't traverse out of brainDir. `rowToPage` updated to populate the now-required `Page.source_id` field from the SELECT projection (`scripts/check-source-id-projection.sh` enforces that every projection feeding `rowToPage` includes the column). -- `src/core/db.ts` — Connection management, schema initialization. v0.22.1 (#363, contributed by @orendi84): `resolveSessionTimeouts()` returns `statement_timeout` + `idle_in_transaction_session_timeout` (defaults: 5min each, env-overridable via `GBRAIN_STATEMENT_TIMEOUT` / `GBRAIN_IDLE_TX_TIMEOUT` / `GBRAIN_CLIENT_CHECK_INTERVAL`). Both `connect()` (module singleton) and `PostgresEngine.connect()` (worker pool) consume the result via postgres.js's `connection` option, sending GUCs as startup parameters that survive PgBouncer transaction mode (unlike the prior `setSessionDefaults` post-pool SET, kept as a back-compat no-op shim). -- `src/commands/migrate-engine.ts` — Bidirectional engine migration (`gbrain migrate --to supabase/pglite`) - `src/core/import-file.ts` — importFromFile + importFromContent (chunk + embed + tags) -- `src/core/sync.ts` — Pure sync functions (manifest parsing, filtering, slug conversion). 
v0.22.12 (#500, foundation by @wintermute via #501): `classifyErrorCode(errorMsg)` regex-based classifier with 12 codes (`SLUG_MISMATCH`, `YAML_PARSE`, `YAML_DUPLICATE_KEY`, `MISSING_OPEN`, `MISSING_CLOSE`, `NESTED_QUOTES`, `EMPTY_FRONTMATTER`, `NULL_BYTES`, `INVALID_UTF8`, `STATEMENT_TIMEOUT`, `FILE_TOO_LARGE`, `SYMLINK_NOT_ALLOWED`) plus `UNKNOWN` fallback. `summarizeFailuresByCode(failures)` returns sorted `[{code, count}]`. `code?` optional field on `SyncFailure`; backfilled at ack time on pre-v0.22.12 entries. `acknowledgeSyncFailures()` returns `AcknowledgeResult { count, summary }`. Three regexes (`MISSING_OPEN`, `MISSING_CLOSE`, `EMPTY_FRONTMATTER`) broadened to match actual `markdown.ts:159-244` validator message strings, not just the literal code-name prefix. `FILE_TOO_LARGE` covers all three production size sites in `import-file.ts:199, 352, 401`; `SYMLINK_NOT_ALLOWED` covers the rejection at `:347`. Closes the silent-skip pattern that motivated #500. -- `src/core/storage.ts` — Pluggable storage interface (S3, Supabase Storage, local) -- `src/core/storage-config.ts` (v0.22.11) — Storage tiering: `loadStorageConfig` reads `gbrain.yml`, normalizes deprecated keys (`git_tracked` / `supabase_only`) to canonical (`db_tracked` / `db_only`) with once-per-process deprecation warning, and runs `normalizeAndValidateStorageConfig` (auto-fixes missing trailing `/`, throws `StorageConfigError` on tier overlap). Path-segment matcher: `media/x/` does NOT match `media/xerox/foo`. Replaces gray-matter (broken on delimiter-less YAML) with a dedicated parser for the `gbrain.yml` shape. -- `src/core/disk-walk.ts` (v0.22.11) — `walkBrainRepo(repoPath)` returns `Map` from one recursive `readdirSync`. Skips dot-dirs, `node_modules`, non-`.md` files. Used by `gbrain storage status` to replace per-page `existsSync + statSync` (~400K syscalls on 200K-page brains → tens). -- `src/commands/storage.ts` (v0.22.11) — `gbrain storage status [--repo P] [--json]`. 
Split into pure data (`getStorageStatus`) + JSON formatter + human formatter (ASCII-only per D10) matching the `orphans.ts` pattern. `PageCountsByTier` and `DiskUsageByTier` are distinct nominal types so swaps fail at compile time. -- `gbrain.yml` (brain repo root, v0.22.11) — Optional storage tiering config. Top-level `storage:` section with `db_tracked:` and `db_only:` array-valued keys. `gbrain sync` auto-manages `.gitignore` for `db_only` paths on successful sync (skips on dry-run, blocked-by-failures, submodule context, or `GBRAIN_NO_GITIGNORE=1`). `gbrain export --restore-only [--repo P] [--type T] [--slug-prefix S]` repopulates missing `db_only` files from the database. -- `src/core/supabase-admin.ts` — Supabase admin API (project discovery, pgvector check) -- `src/core/file-resolver.ts` — File resolution with fallback chain (local -> .redirect.yaml -> .redirect -> .supabase) -- `src/core/chunkers/` — 3-tier chunking (recursive, semantic, LLM-guided). v0.19.0 adds `code.ts` — tree-sitter-based semantic chunker for 29 languages with embedded-asset WASMs (`src/assets/wasm/`), `@dqbd/tiktoken` cl100k_base tokenizer, small-sibling merging. `CHUNKER_VERSION` constant folded into `importCodeFile`'s `content_hash` so chunker shape changes force clean re-chunks across releases. -- `src/core/errors.ts` (v0.19.0) — `StructuredAgentError` + `buildError` + `serializeError`. Every new v0.19.0 agent-facing surface (code-def, code-refs, usage errors) uses this envelope; matches v0.17.0 `CycleReport.PhaseResult.error` shape. -- `src/assets/wasm/` (v0.19.0) — 36 tree-sitter grammar WASMs + tree-sitter runtime. Committed to the repo so `bun --compile` embeds them deterministically via `import path from ... with { type: 'file' }`. The CI guard `scripts/check-wasm-embedded.sh` fails the build if the compiled binary ever silently falls through to recursive chunks. -- `src/commands/code-def.ts` + `src/commands/code-refs.ts` (v0.19.0) — symbol definition + references lookup. 
Query `content_chunks.symbol_name` or chunk_text ILIKE with `page_kind='code'` filter. Auto-JSON when stdout is not a TTY (gh-CLI convention). Bypass the standard `searchKeyword` `DISTINCT ON (slug)` collapse so multiple call-sites from the same file surface. -- `src/core/search/` — Hybrid search: vector + keyword + RRF + multi-query expansion + dedup. As of v0.22.0, `searchKeyword` / `searchKeywordChunks` / `searchVector` apply source-aware ranking at the SQL layer (curated content like `originals/`, `concepts/`, `writing/` outranks bulk content like `wintermute/chat/`, `daily/`, `media/x/`). `searchVector` uses a two-stage CTE so source-boost re-ranking doesn't kill the HNSW index. Hard-exclude prefixes (`test/`, `archive/`, `attachments/`, `.raw/` by default) filter at retrieval, not post-rank. Both gates honor `detail !== 'high'` so temporal queries surface chat pages normally. -- `src/core/search/intent.ts` — Query intent classifier (entity/temporal/event/general → auto-selects detail level) -- `src/core/search/eval.ts` — Retrieval eval harness: P@k, R@k, MRR, nDCG@k metrics + runEval() orchestrator -- `src/core/search/source-boost.ts` (v0.22.0) — Source-type boost map keyed by slug prefix. `DEFAULT_SOURCE_BOOSTS` (originals/ 1.5, concepts/ 1.3, writing/ 1.4, people/companies/deals/ 1.2, daily/ 0.8, media/x/ 0.7, wintermute/chat/ 0.5) and `DEFAULT_HARD_EXCLUDES` (test/, archive/, attachments/, .raw/). `parseSourceBoostEnv` / `parseHardExcludesEnv` parse comma-separated `prefix:factor` pairs from `GBRAIN_SOURCE_BOOST` / `GBRAIN_SEARCH_EXCLUDE` env vars. `resolveBoostMap` and `resolveHardExcludes` merge defaults + env + caller `SearchOpts.exclude_slug_prefixes`/`include_slug_prefixes`. -- `src/core/search/sql-ranking.ts` (v0.22.0) — Pure SQL string builders. 
`buildSourceFactorCase(slugColumn, boostMap, detail)` emits a CASE expression with longest-prefix-match wins (returns literal `'1.0'` when `detail === 'high'` for temporal-bypass parity with COMPILED_TRUTH_BOOST). `buildHardExcludeClause(slugColumn, prefixes)` emits `NOT (col LIKE 'p1%' OR col LIKE 'p2%')` — OR-chain wrapped in NOT, NOT `NOT LIKE ALL/ANY` (those quantifiers don't express set-exclusion). LIKE meta-character escape covers all three of `%`, `_`, AND `\` (backslash matters because it's Postgres LIKE's default escape char). Single-quote doubling on SQL string literals so injection-style inputs are inert text. -- `src/commands/eval.ts` — `gbrain eval` command: single-run table + A/B config comparison. v0.25.0 adds sub-subcommand dispatch on `args[0]` so `gbrain eval export` + `gbrain eval prune` + `gbrain eval replay` route into session-capture handlers; bare `gbrain eval --qrels …` fall-through preserves the legacy IR-metrics flow. v0.27.x adds `gbrain eval cross-modal` to the dispatch (the user-facing path is the cli.ts no-DB branch — `src/commands/eval.ts:cross-modal` only fires when callers re-enter with an existing engine). -- `src/commands/eval-cross-modal.ts` (v0.27.x) — multi-model quality gate. Three different-provider frontier models score the OUTPUT against the TASK on a 5-dim list. Verdict `pass` (exit 0) / `fail` (exit 1) / `inconclusive` (exit 2; <2/3 model successes per Q3=A in plans/radiant-napping-lerdorf.md). Reuses `src/core/ai/gateway.ts:chat()` so config/auth/aliasing comes from the gateway recipe registry — no parallel provider stack. Self-configures the gateway (`configureGateway(loadConfig() + process.env)`) since the cli.ts dispatch bypasses `connectEngine()`. Default cycles 3 in TTY, 1 in non-TTY (T11=B partial cost guardrail). Receipts land at `gbrainPath('eval-receipts')/-.json`. The full `--budget-usd` cap is a v0.27.x follow-up TODO. 
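The three-way verdict and exit-code mapping described for `gbrain eval cross-modal` can be sketched roughly as follows. This is a minimal illustration, not gbrain's code: it uses the thresholds documented for `src/core/cross-modal-eval/aggregate.ts` (at least 2 successes, every dimension mean >= 7, every dimension minimum >= 5), and the names `ModelScores`, `aggregateVerdict`, and `exitCodeFor` are hypothetical. It also assumes every successful model reports the same set of dimensions.

```typescript
// Hypothetical shapes for illustration; not gbrain's real types.
type ModelScores = Record<string, number>; // dimension name -> score
type Verdict = "pass" | "fail" | "inconclusive";

// `results` holds one entry per model; `null` means the model's output
// could not be parsed, so it contributes nothing this cycle.
function aggregateVerdict(results: Array<ModelScores | null>): Verdict {
  const ok = results.filter((r): r is ModelScores => r !== null);

  // Guard first: with fewer than 2 successes there is nothing to average,
  // and `[].every(...)` would otherwise be vacuously true.
  if (ok.length < 2) return "inconclusive";

  // Assumption: all successful models report the same dimensions.
  const dims = Object.keys(ok[0]);
  const passes =
    dims.length > 0 &&
    dims.every((dim) => {
      const scores = ok.map((r) => r[dim]);
      const mean = scores.reduce((a, b) => a + b, 0) / scores.length;
      const min = Math.min(...scores);
      return mean >= 7 && min >= 5;
    });
  return passes ? "pass" : "fail";
}

// Exit codes as listed above: pass 0, fail 1, inconclusive 2.
function exitCodeFor(v: Verdict): number {
  return v === "pass" ? 0 : v === "fail" ? 1 : 2;
}
```

Checking the success count before any `every(...)` call is the design point: it keeps an empty result set from being read as a pass.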
-- `src/core/cross-modal-eval/json-repair.ts` (v0.27.x) — `parseModelJSON(raw)` named export with a 4-strategy fallback chain (direct parse → fence-strip → trailing-comma + single-quote + embedded-newline repair → regex nuclear option). Adversarial input throws rather than fabricating scores — the aggregator treats a throw as "this model contributed nothing this cycle" so the gate stays correct at >=2/3 successes. -- `src/core/cross-modal-eval/aggregate.ts` (v0.27.x) — pure verdict logic. Pass criterion: `(successes >= 2) AND (every dim mean >= 7) AND (every dim min across models >= 5)` (Q2=A floor). Inconclusive when <2/3 models returned parseable scores (Q3=A regression guard for the v1 .mjs `Object.values({}).every(...) === true` empty-array PASS bug). -- `src/core/cross-modal-eval/runner.ts` (v0.27.x) — orchestrator. Each cycle runs `Promise.allSettled([gwChat(slotA), gwChat(slotB), gwChat(slotC)])` (T4=A — bare allSettled, no rate-leases for the CLI path; minion-integration TODO recovers cross-process concurrency). Stops early on PASS or INCONCLUSIVE; runs up to 3 cycles. Default slots: `openai:gpt-4o` / `anthropic:claude-opus-4-7` / `google:gemini-1.5-pro`. `estimateCost()` exports a small per-model pricing table (drifts; refresh alongside model-family bumps). -- `src/core/cross-modal-eval/receipt-name.ts` (v0.27.x) — receipt filename binds (slug, SKILL.md sha-8). `findReceiptForSkill(skillPath, receiptDir)` returns `'found' | 'stale' | 'missing'` (T10=A). Skillify-check item 11 surfaces the status as informational (T7=C); the audit does NOT fail on missing/stale receipts. -- `src/core/cross-modal-eval/receipt-write.ts` (v0.27.x) — wraps `fs.writeFileSync` with `mkdirSync({recursive:true})` ahead of every write (T5 correction; `gbrainPath()` does NOT auto-mkdir). -- `src/commands/eval-export.ts` (v0.25.0) — streams `eval_candidates` rows as NDJSON to stdout with `schema_version: 1` prefix on every line. 
EPIPE-safe, progress heartbeats on stderr, stable id-desc tiebreaker so `--since` windows never dupe/miss rows. -- `src/commands/eval-prune.ts` (v0.25.0) — explicit retention cleanup. Requires `--older-than DUR`. `--dry-run` reports would-delete count. -- `src/commands/eval-replay.ts` (v0.25.0) — contributor-facing replay tool. Reads NDJSON from `gbrain eval export`, re-runs each captured `query` / `search` op against the current brain, computes set-Jaccard@k between captured + current `retrieved_slugs`, top-1 stability rate, and latency Δ. Stable JSON shape (`schema_version: 1`) for CI gating; human mode prints a regression table. Pure Bun, zero new deps. The dev-loop half of BrainBench-Real that closes the gap between "data captured" and "data used to gate a PR." See `docs/eval-bench.md` for the workflow. -- `src/commands/eval-suspected-contradictions.ts` + `src/core/eval-contradictions/{judge,runner,types,date-filter,cost-tracker,cache,severity-classify,cross-source,trends,calibration,judge-errors,auto-supersession,fixture-redact}.ts` (v0.32.6) — `gbrain eval suspected-contradictions [run|trend|review]`. 
Probe samples top-K retrieval pairs per query (cross-slug + intra-page chunk-vs-take), date pre-filters (3-rule layered — same-paragraph-dual-date overrides separation rule), LLM judge (query-conditioned per Codex; UTF-8-safe truncation; C1 confidence-floor double-enforcement; resolution_kind output drives M7 paste-ready commands), persistent cache keyed on `(chunk_a_hash, chunk_b_hash, model_id, prompt_version, truncation_policy)` (Codex outside-voice fix — prompt edits cleanly invalidate prior verdicts), Wilson 95% CI calibration on the headline percentage with `small_sample_note` when n<30, judge_errors as first-class typed counters (parse_fail/refusal/timeout/http_5xx/unknown — Codex fix to bias from silent skip), M5 trend writes to `eval_contradictions_runs`, M6 source-tier breakdown reuses `DEFAULT_SOURCE_BOOSTS` prefix logic, deterministic sampling (combined_score DESC + lex tiebreaker — stable cache hit-rate across re-runs). Hermetic via `judgeFn` + `searchFn` DI in the runner; never touches the real gateway in tests. Engine surface: `BrainEngine.listActiveTakesForPages` (P1 batched), `writeContradictionsRun` + `loadContradictionsTrend` (M5), `getContradictionCacheEntry` + `putContradictionCacheEntry` + `sweepContradictionCache` (P2). Schema migrations v51 + v52. MCP op `find_contradictions` (read scope, NOT localOnly, NOT in subagent allowlist — user-initiated only). M1 doctor check surfaces high-severity findings with paste-ready resolution commands. M2 synthesize phase pre-fetches latest probe's top-5-by-severity findings and threads them into `buildSynthesisPrompt` as an informational block. 226 hermetic unit tests + 12 real-Postgres E2E. Plan: `~/.claude/plans/system-instruction-you-are-working-hashed-dewdrop.md`. Architecture doc: `docs/contradictions.md`. 
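For readers unfamiliar with the Wilson 95% interval mentioned for the headline percentage, the sketch below computes it with the standard Wilson score formula and z = 1.96. Only the `small_sample_note` field name and the n<30 cutoff come from the description above; the function name `wilson95`, the return shape, and the note wording are assumptions for illustration.

```typescript
interface WilsonResult {
  point: number; // observed proportion k / n
  lower: number; // lower bound of the 95% interval
  upper: number; // upper bound of the 95% interval
  small_sample_note?: string;
}

// k = number of flagged pairs, n = number of judged pairs (hypothetical inputs).
function wilson95(k: number, n: number): WilsonResult {
  if (!Number.isInteger(k) || !Number.isInteger(n) || n <= 0 || k < 0 || k > n) {
    throw new RangeError("need integers with 0 <= k <= n and n > 0");
  }
  const z = 1.96;
  const z2 = z * z;
  const p = k / n;
  const denom = 1 + z2 / n;
  // Wilson center and half-width.
  const center = (p + z2 / (2 * n)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;

  const result: WilsonResult = {
    point: p,
    lower: Math.max(0, center - half),
    upper: Math.min(1, center + half),
  };
  if (n < 30) {
    result.small_sample_note = `n=${n} is below 30; treat the interval as indicative only`;
  }
  return result;
}
```

Unlike the naive normal approximation, the Wilson interval stays inside [0, 1] and remains usable when the observed proportion is 0 or 1, which is common on small probe runs.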
-- `src/commands/eval-longmemeval.ts` + `src/eval/longmemeval/{harness,adapter,sanitize}.ts` (v0.28.1) — `gbrain eval longmemeval ` runs the public [LongMemEval](https://huggingface.co/datasets/xiaowu0162/longmemeval) benchmark against gbrain's hybrid retrieval. Architecture: one in-memory PGLite per benchmark run created via `createBenchmarkBrain` + `withBenchmarkBrain` (NO `EphemeralBrain` class). Between questions, `TRUNCATE` over runtime-enumerated `pg_tables` so future schema migrations don't silently leak data across questions; infrastructure tables (`sources`, `config`, `gbrain_cycle_locks`, `subagent_rate_leases`) are preserved. `cli.ts` has a pre-dispatch bypass so `eval longmemeval` skips `connectEngine()` — the user's `~/.gbrain` brain is never opened. `--expansion` defaults to OFF (deterministic, no per-query Haiku call); pass `--expansion` to opt in. Default model resolves through `resolveModel()` 6-tier chain with `models.eval.longmemeval` as the new config key. Sanitization parity: `harness.ts` re-uses `INJECTION_PATTERNS` from `src/core/think/sanitize.ts` (now exported, line 22) so adding a pattern automatically covers takes AND benchmarks. Retrieved chat content is wrapped in `` framing; the answer-gen system prompt declares the content UNTRUSTED. LLM injection seam: `runEvalLongMemEval(args, {client?: ThinkLLMClient})` lets tests stub the client so the full pipeline runs without an Anthropic API key. p50 25.9ms / p99 30.3ms warm reset+import+search on Apple Silicon (per `test/eval-longmemeval.test.ts` perf gate). Hand the JSONL output to LongMemEval's `evaluate_qa.py` to score (their published evaluator, not bundled — needs OpenAI gpt-4o per their spec). -- `docs/eval-bench.md` (v0.25.0) — contributor guide for using captured data to benchmark retrieval changes before merging. Linked from CONTRIBUTING.md under "Running real-world eval benchmarks (touching retrieval code)". 
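The between-question reset described for the LongMemEval harness (truncate every table found at runtime, except the infrastructure tables) can be sketched as a small pure function. This is an illustration under stated assumptions, not the harness's code: `buildResetSql` and `quoteIdent` are hypothetical names, and the table list is assumed to come from a query such as `SELECT tablename FROM pg_tables WHERE schemaname = 'public'`.

```typescript
// Tables the description lists as preserved between questions.
const PRESERVED = new Set([
  "sources",
  "config",
  "gbrain_cycle_locks",
  "subagent_rate_leases",
]);

// Quote an identifier for Postgres by doubling embedded double quotes.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// `allTables` is enumerated at runtime, so tables added by later schema
// migrations are cleared without anyone updating a hardcoded list.
function buildResetSql(
  allTables: string[],
  preserved: Set<string> = PRESERVED,
): string | null {
  const targets = allTables.filter((t) => !preserved.has(t)).sort();
  if (targets.length === 0) return null; // nothing to clear
  return `TRUNCATE TABLE ${targets.map(quoteIdent).join(", ")}`;
}
```

Listing what to keep rather than what to clear is the point of the design: a newly migrated table is truncated by default, so stale rows cannot leak from one question into the next.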
-- `src/core/eval-capture.ts` (v0.25.0) — op-layer capture wrapper called from `src/core/operations.ts` `query` + `search` handlers. Catches MCP + CLI + subagent tool-bridge from one site. Fire-and-forget; failures route to `engine.logEvalCaptureFailure` so `gbrain doctor` sees drops cross-process. **Capture is off by default** — `isEvalCaptureEnabled` resolution: explicit `config.eval.capture` (true/false) wins, else `process.env.GBRAIN_CONTRIBUTOR_MODE === '1'`, else off. Production users get a quiet brain; contributors set `export GBRAIN_CONTRIBUTOR_MODE=1` in `.zshrc` to enable the dev loop. PII scrubber gate is independent and defaults to true regardless of CONTRIBUTOR_MODE. -- `src/core/eval-capture-scrub.ts` (v0.25.0) — zero-deps PII scrubber: emails, phones, SSN, Luhn-verified credit cards, JWT-shaped tokens, bearer tokens. -- `src/core/search/hybrid.ts` — Cathedral II `Promise` return shape unchanged in v0.25.0. Adds `onMeta?: (m: HybridSearchMeta) => void` callback so op-layer capture can record what hybridSearch actually did. Existing callers leave it undefined. -- `docs/eval-capture.md` (v0.25.0) — stable NDJSON schema reference for gbrain-evals consumers. -- `test/public-exports.test.ts` (v0.25.0 / R2) — runtime contract test. Imports each of the 17 public subpaths via package name and pins a canary symbol per module. Paired with `scripts/check-exports-count.sh`. -- `src/core/embedding.ts` — OpenAI text-embedding-3-large, batch, retry, backoff. **v0.28.7:** `BATCH_SIZE` reverted 50→100 — the original Voyage safety guard halved OpenAI throughput on every page. Per-recipe pre-split + recursive halving + adaptive shrink-on-miss now live in the gateway, so the outer paginator goes back to its original purpose: progress-callback granularity, not batch protection. -- `src/core/ai/types.ts` — provider/recipe types. 
**v0.28.7 (#680):** `EmbeddingTouchpoint` extended with optional `chars_per_token` (default 4 chars/token, matching OpenAI tiktoken on English) and `safety_factor` (default 0.8, budget-utilization ceiling). Both consulted only when `max_batch_tokens` is also set. Voyage declares `chars_per_token=1` + `safety_factor=0.5` to handle dense payloads (CJK/JSON/base64) that overshoot tiktoken. The pre-split budget is `max_batch_tokens × safety_factor / chars_per_token`. **v0.28.11 (#719):** `EmbeddingTouchpoint.multimodal_models?: string[]` model-level allow-list for recipes that mix text-only + multimodal models under one touchpoint (Voyage's 12 models share `supports_multimodal: true` but only `voyage-multimodal-3` accepts `/multimodalembeddings`). When omitted, recipe-level `supports_multimodal` is sufficient. `AIGatewayConfig.embedding_multimodal_model?: string` lets `embedMultimodal()` route to a different model than `embedding_model` — brains using OpenAI for text can use Voyage for images without flipping the primary embedding pipeline. -- `src/core/ai/gateway.ts` — unified seam for every AI call. **v0.28.7 (#680):** module-scoped `_embedTransport` defaulting to AI SDK `embedMany`, with `__setEmbedTransportForTests(fn)` test seam so tests drive the public `embed()` function with a stubbed transport instead of probing private helpers. `splitByTokenBudget` and `isTokenLimitError` are now exported `@internal` — pure functions reused directly by the test file. Module-level `_shrinkState: Map` halves the recipe's effective `safety_factor` on token-limit miss (floor 0.05) and heals back ×1.5 toward the ceiling after `SHRINK_HEAL_AFTER=10` consecutive successes. `configureGateway()` walks every registered recipe at construction time and emits a once-per-process stderr warning for any embedding touchpoint missing `max_batch_tokens` (excluding the canonical OpenAI fast-path recipe). `resetGateway()` clears `_shrinkState`, the warned-set, and restores the real transport. 
ASCII flow diagram embedded in the `embed()` JSDoc covers the routing decision, recursion + halving, and shrinkState lifecycle. **v0.28.11 (#719):** `embedMultimodal()` reads `cfg.embedding_multimodal_model` first (falls back to `cfg.embedding_model` for single-model setups). After the existing recipe-level `supports_multimodal` fast-fail, validates the resolved model against `touchpoint.multimodal_models` when declared — closes the Voyage-text-only-model-into-multimodal-endpoint footgun before any HTTP call (Codex F1 from PR review). New `getMultimodalModel()` accessor mirrors `getEmbeddingModel` / `getChatModel` so doctor and integration tests can read the gateway state. -- `src/core/ai/recipes/voyage.ts` — Voyage AI openai-compatible recipe. **v0.28.7 (#680):** declares `chars_per_token=1` + `safety_factor=0.5` so the gateway pre-splits Voyage batches at a 60K-character budget (50% of 120K-token cap with the dense-tokenizer ratio). Closes the v0.27 backfill loop where ~26% of the corpus stayed un-embedded because tiktoken-grounded budgeting silently undercounted Voyage's actual token usage. **v0.28.11 (#719):** declares `multimodal_models: ['voyage-multimodal-3']` so the gateway rejects text-only Voyage models pointed at the multimodal endpoint with a clear `AIConfigError` instead of waiting for Voyage's HTTP 400. -- `src/core/ai/recipes/anthropic.ts` — Anthropic recipe (chat + expansion touchpoints). **v0.31.12:** chat and expansion `models:` lists drop the v0.31.6 phantom `claude-sonnet-4-6-20250929` date suffix — canonical id is `claude-sonnet-4-6`. The wrong-direction alias `claude-sonnet-4-6 → claude-sonnet-4-6-20250929` is removed; a reverse alias `claude-sonnet-4-6-20250929 → claude-sonnet-4-6` keeps stale user configs working (rescues `facts.extraction_model` and `models.dream.synthesize` set by v0.31.6 installs). 
Recipe-shape regression pinned by `test/anthropic-model-ids.test.ts` (6 cases, verbatim cherry-pick of PR #830 plus the reverse-alias rescue case). -- `src/core/anthropic-pricing.ts` — Single source of truth for Anthropic model pricing (per-MTok input/output). **v0.31.12:** Opus 4.7 corrected from `$15/$75` to `$5/$25` (the old number was from Opus 4 generation, never refreshed when 4.7 shipped); Opus 4.6 also corrected. Consumed by `src/core/budget-meter.ts` and `src/core/cross-modal-eval/runner.ts` — the cross-modal estimator now reads `ANTHROPIC_PRICING` for Anthropic models instead of duplicating the table, killing the v0.31.6 drift bug class. -- `src/core/model-config.ts` — Model-string resolution (the seam every internal LLM call walks through). **v0.31.12:** four-tier system (`ModelTier = 'utility' | 'reasoning' | 'deep' | 'subagent'`) with `TIER_DEFAULTS` (utility→haiku-4-5, reasoning→sonnet-4-6, deep→opus-4-7, subagent→sonnet-4-6) and `tier?: ModelTier` on `ResolveModelOpts`. Resolution chain is now 8 steps: cliFlag → deprecated key → config key → `models.default` → `models.tier.` → env var → `TIER_DEFAULTS[tier]` → caller fallback. Two new exports — `isAnthropicProvider(modelString)` checks `provider:model` prefix OR `claude-` bare-id pattern, and `enforceSubagentAnthropic()` is the layer-2 runtime guard: when `tier === 'subagent'` resolves to a non-Anthropic provider, it emits a once-per-`(source, model)` stderr warn AND falls back to `TIER_DEFAULTS.subagent` instead of letting the Anthropic Messages API tool-loop attempt to run on OpenAI/Gemini. `_resetDeprecationWarningsForTest()` now also clears `_subagentTierWarningsEmitted` so tests re-emit. -- `src/core/ai/model-resolver.ts` — Recipe-touchpoint validator. **v0.31.12:** `assertTouchpoint(recipe, touchpoint, modelId, extendedModels?)` gains an optional 4th `extendedModels: ReadonlySet` argument. 
When the modelId is in that set, the native-recipe allowlist throw is bypassed — the user explicitly opted into this model via config so we let provider rejection surface as `model_not_found` at HTTP call time (and `gbrain models doctor` catches it earlier). Default code paths with hardcoded model strings MUST NOT pass `extendedModels` — typos in source code still fail fast. Replaces the earlier plan to soften the validator wholesale (Codex F4/F5 in plan review flagged that as too broad — it would have removed the fail-fast contract for chat + expand + embed all three). -- `src/core/ai/gateway.ts` extension (v0.31.12) — new module-scoped `_extendedModels: Map>` registry feeds `assertTouchpoint`'s 4th-arg path. New `reconfigureGatewayWithEngine(engine)` async function is called from `cli.ts` after `engine.connect()` (and before every command except `CLI_ONLY` no-DB commands) — re-resolves expansion + chat defaults through `resolveModel()` so `models.tier.*` and `models.default` overrides apply to expansion + chat both. `DEFAULT_CHAT_MODEL` corrected to `anthropic:claude-sonnet-4-6` (was the v0.31.6 phantom `-20250929`). New `__setChatTransportForTests` seam mirrors `__setEmbedTransportForTests` so tests drive `chat()` with a stubbed transport. -- `src/core/minions/queue.ts` extension (v0.31.12) — `MinionQueue.add()` now rejects `subagent` jobs whose `data.model` resolves through `isAnthropicProvider()` to a non-Anthropic provider. Lazy-imports `model-config.ts` to avoid pulling engine types into queue's eager-load surface. Layer 1 of the three-layer subagent provider enforcement (Codex F1+F2 in plan review). Layers 2 + 3 live in `src/core/model-config.ts` (`enforceSubagentAnthropic` runtime fallback) and `src/commands/doctor.ts` (`subagent_provider` check). Pinned by 3 cases in `test/agent-cli.test.ts`. 
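A submit-time check of the kind described for layer 1 might look like the sketch below. It is a stand-in written from the description only (a `provider:model` prefix or a bare `claude-` id counts as Anthropic); `looksLikeAnthropic`, `JobData`, and `assertSubagentModel` are hypothetical names, the case-insensitive comparison is an assumption, and the real `isAnthropicProvider()` and `MinionQueue.add()` may behave differently.

```typescript
// Stand-in for the predicate described above; the real one may differ.
function looksLikeAnthropic(modelString: string): boolean {
  const s = modelString.trim().toLowerCase();
  const colon = s.indexOf(":");
  // `provider:model` form: the provider prefix decides.
  if (colon !== -1) return s.slice(0, colon) === "anthropic";
  // Bare model id: accept the `claude-` family.
  return s.startsWith("claude-");
}

interface JobData {
  model?: string;
}

// Hypothetical guard: only `subagent` jobs are checked, and a job without
// an explicit model is left for the later layers to resolve.
function assertSubagentModel(jobName: string, data: JobData): void {
  if (jobName !== "subagent" || data.model === undefined) return;
  if (!looksLikeAnthropic(data.model)) {
    throw new Error(
      `subagent jobs require an Anthropic model, got "${data.model}"`,
    );
  }
}
```

Rejecting at enqueue time surfaces the misconfiguration to the caller immediately, while the runtime fallback and the doctor check cover values that only resolve later through config or tier inheritance.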
-- `src/commands/models.ts` (v0.31.12) — `gbrain models [--json]` read-only routing dashboard: prints tier defaults (`utility`/`reasoning`/`deep`/`subagent`), the resolved value for each (re-walking the resolution chain to attribute properly), every per-task override (11 `PER_TASK_KEYS` entries — `models.dream.synthesize`, `models.dream.patterns`, `models.drift`, `models.auto_think`, `models.think`, `models.subagent`, `facts.extraction_model`, `models.eval.longmemeval`, `models.expansion`, `models.chat`, `models.dream.synthesize_verdict`), the alias map (defaults + user overrides), and a source-of-truth column showing `default` / `config: ` / `env: `. `gbrain models doctor [--skip=] [--json]` fires a 1-token `gateway.chat()` probe against each configured chat + expansion model and classifies failures into `{model_not_found, auth, rate_limit, network, unknown}` — the structural fix for the v0.31.6 silent-no-op bug class. Wired into `cli.ts` dispatch table + `CLI_ONLY` set. -- `src/commands/doctor.ts` extension (v0.31.12) — new `subagent_provider` check (layer 3 of 3 — Codex F13). Warns when `models.tier.subagent` is explicitly set to a non-Anthropic provider (fail-loud since the user clearly meant it — message names the bad value and prints the paste-ready fix command `gbrain config set models.tier.subagent anthropic:claude-sonnet-4-6`); also warns when `models.default` would sneak `subagent` into a non-Anthropic provider via tier inheritance. OK status when subagent tier resolves to Anthropic. Tests cover all three paths in `test/doctor.test.ts`. +- `src/core/sync.ts` — Pure sync functions (manifest parsing, filtering, slug conversion). +- `src/core/markdown.ts` — Frontmatter parsing + body splitter. `splitBody` requires an explicit timeline sentinel (``, `--- timeline ---`, or `---` immediately before `## Timeline`/`## History`). Plain `---` in body text is a markdown horizontal rule, not a separator. 
`inferType` auto-types `/wiki/analysis/` → analysis, `/wiki/guides/` → guide, `/wiki/hardware/` → hardware, `/wiki/architecture/` → architecture, `/writing/` → writing (plus the existing people/companies/deals/etc heuristics). +- `src/core/link-extraction.ts` — shared library for the v0.12.0 graph layer. extractEntityRefs (canonical, replaces backlinks.ts duplicate) matches both `[Name](people/slug)` markdown links and Obsidian `[[people/slug|Name]]` wikilinks as of v0.12.3. extractPageLinks, inferLinkType heuristics (attended/works_at/invested_in/founded/advises/source/mentions), parseTimelineEntries, isAutoLinkEnabled config helper. `DIR_PATTERN` covers `people`, `companies`, `deals`, `topics`, `concepts`, `projects`, `entities`, `tech`, `finance`, `personal`, `openclaw`. Used by extract.ts, operations.ts auto-link post-hook, and backlinks.ts. - `src/core/check-resolvable.ts` — Resolver validation: reachability, MECE overlap, DRY checks, structured fix objects. v0.14.1: `CROSS_CUTTING_PATTERNS.conventions` is an array (notability gate accepts both `conventions/quality.md` and `_brain-filing-rules.md`). New `extractDelegationTargets()` parses `> **Convention:**`, `> **Filing rule:**`, and inline backtick references. DRY suppression is proximity-based via `DRY_PROXIMITY_LINES = 40`. -- `src/core/repo-root.ts` — Shared `findRepoRoot(startDir?)` (v0.16.4): walks up from `startDir` (default `process.cwd()`) looking for `skills/RESOLVER.md`. Zero-dependency module imported by both `doctor.ts` and `check-resolvable.ts`. Parameterized `startDir` makes tests hermetic. **v0.31.7:** read-path / write-path split. `autoDetectSkillsDir` (shared, read+write-safe) gains tier-0 `$GBRAIN_SKILLS_DIR` explicit operator override (Docker mounts, CI, monorepo subdirs) ahead of the existing 4-tier chain. 
New `autoDetectSkillsDirReadOnly` wraps it with a tier-5 install-path fallback that walks up from `fileURLToPath(import.meta.url)` and gates on `isGbrainRepoRoot` so unrelated repos can't false-positive. Read-path callers (`doctor`, `check-resolvable`, `routing-eval`) use the read-only variant; write-path callers (`skillpack install`, `skillify scaffold`, `post-install-advisory`) deliberately stay on the shared function so `gbrain skillpack install` from `~` cannot silently retarget the bundled gbrain repo's `skills/` instead of the user's actual workspace. Two new `SkillsDirSource` variants: `'env_explicit'`, `'install_path'`. New `AUTO_DETECT_HINT_READ_ONLY` documents the extra tier. The D6 `--fix` safety gate in `doctor.ts` + `check-resolvable.ts` refuses auto-repair when `detected.source === 'install_path'` so `gbrain doctor --fix` from `~` cannot silently rewrite the bundled install tree. -- `src/commands/check-resolvable.ts` — Standalone CLI wrapper (v0.16.4) over `checkResolvable()`. Exports `parseFlags`, `resolveSkillsDir`, `DEFERRED`, `runCheckResolvable`. Exit rule: **1 on any issue (warnings OR errors)**, stricter than doctor's `ok` flag — honors README:259. Stable JSON envelope `{ok, skillsDir, report, autoFix, deferred, error, message}` — same shape on success and error paths. `--fix` path runs `autoFixDryViolations` BEFORE `checkResolvable` (same ordering as doctor). `scripts/skillify-check.ts` subprocess-calls `gbrain check-resolvable --json` (cached per process) and fails loud on binary-missing — no silent false-pass. **v0.19:** AGENTS.md workspaces now resolve natively (see `src/core/resolver-filenames.ts`) — gbrain inspects the 107-skill OpenClaw deployment whether the routing file is `RESOLVER.md` or `AGENTS.md`. `DEFERRED[]` is empty — Checks 5 + 6 shipped as real code, not issue URLs. 
**v0.31.7:** the resolver lookup switched from first-match-wins to the multi-file merge in `src/core/check-resolvable.ts` — entries collected from every `RESOLVER.md` / `AGENTS.md` across the skills dir AND its parent, deduped by `skillPath` (first occurrence wins). Lifted reachable skills on the reference OpenClaw layout from 37/224 to 200/224 — the deployment ships a thin `skills/RESOLVER.md` (~40 entries from skillpack) plus a fat `../AGENTS.md` (200+ entries, the real dispatcher), and the previous code only saw the first one. The CLI also switched to `autoDetectSkillsDirReadOnly` so `cd ~ && gbrain check-resolvable` finds the bundled skills via the install-path fallback. `--fix` carries the same D6 safety gate as `gbrain doctor --fix`: refuses to write when `detected.source === 'install_path'`. -- `src/core/resolver-filenames.ts` (v0.19) — central list of accepted routing filenames (`RESOLVER.md`, `AGENTS.md`). Shared by `findRepoRoot`, `check-resolvable`, and skillpack install so every code path walks the same fallback chain. -- `src/commands/skillify.ts` + `src/core/skillify/{generator,templates}.ts` (v0.19) — `gbrain skillify scaffold ` creates all stubs for a new skill in one command: SKILL.md, script, tests, routing-eval.jsonl, resolver entry, filing-rules pointer. `gbrain skillify check