From b0894ab8945271374b79ffd80c3cbaa1105c8a17 Mon Sep 17 00:00:00 2001
From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com>
Date: Fri, 15 May 2026 02:54:54 +0000
Subject: [PATCH 01/19] feat: add superpowers implementation plan for customized-domain spec

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 .../plans/2026-05-15-customized-domain.md | 1102 +++++++++++++++++
 1 file changed, 1102 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-05-15-customized-domain.md

diff --git a/docs/superpowers/plans/2026-05-15-customized-domain.md b/docs/superpowers/plans/2026-05-15-customized-domain.md
new file mode 100644
index 000000000..70f4f896a
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-15-customized-domain.md
@@ -0,0 +1,1102 @@
+# Customized Domain (VC to Developer) Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Adapt gbrain's skill layer from a VC/executive knowledge domain to a developer knowledge domain, replacing people/companies/deals entity detection with goals/decisions/processes/concepts.
+
+**Architecture:** Code-first, then skills. Three narrow code patches extend the `PageType` union, `inferType()` directory mapper, and `DIR_PATTERN` auto-link regex to recognize new entity types. Six skill files are then rewritten or patched to redirect the agent's detection, filing, retrieval, and quality gates from VC entities to developer entities. No schema, pipeline, or MCP changes.
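
To make the three code patches concrete, here is a minimal sketch of how the type union, the directory mapper, and the auto-link pattern relate. It is not the gbrain implementation: `inferTypeSketch`, `DIR_PATTERN_SKETCH`, and `extractRefsSketch` are hypothetical names, the type list is trimmed to five entries, and the leading-slash normalization is an assumption.

```typescript
// Hypothetical, trimmed stand-in for the real PageType union.
type PageType = 'project' | 'concept' | 'goal' | 'decision' | 'process';

// Checks run top to bottom and the first match wins, so the new
// directories are tested before the broader projects/ check.
function inferTypeSketch(filePath: string): PageType {
  // Assumption: prepend a slash so top-level directories also match.
  const lower = ('/' + filePath).toLowerCase();
  if (lower.includes('/goals/')) return 'goal';
  if (lower.includes('/decisions/')) return 'decision';
  if (lower.includes('/processes/')) return 'process';
  if (lower.includes('/projects/')) return 'project';
  return 'concept';
}

// Simplified auto-link pattern: a markdown link whose target starts with
// one of the recognized entity directories.
const DIR_PATTERN_SKETCH = '(?:goals|decisions|processes|projects|concepts)';

function extractRefsSketch(text: string): { name: string; slug: string }[] {
  const linkRe = new RegExp(
    `\\[([^\\]]+)\\]\\((${DIR_PATTERN_SKETCH}/[^)\\s]+)\\)`,
    'g',
  );
  const refs: { name: string; slug: string }[] = [];
  let match: RegExpExecArray | null;
  while ((match = linkRe.exec(text)) !== null) {
    refs.push({ name: match[1], slug: match[2] });
  }
  return refs;
}
```

In this sketch a path such as `projects/my-app/decisions/use-redis.md` resolves to `decision` only because the `decisions/` check comes first; swapping the order would return `project`.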
+
+**Tech Stack:** TypeScript (Bun runtime), Markdown skill files, JSON config
+
+---
+
+## File Map
+
+| Action | File | Responsibility |
+|--------|------|----------------|
+| Patch | `src/core/types.ts:13,22-27` | Add `goal`, `decision`, `process` to `PageType` union + `ALL_PAGE_TYPES` array |
+| Patch | `test/page-type-exhaustive.test.ts:63-89` | Add `goal`, `decision`, `process` cases to exhaustive switch |
+| Patch | `src/core/markdown.ts:344-375` | Add `goals/`, `decisions/`, `processes/` to `inferType()` |
+| Patch | `src/core/link-extraction.ts:46` | Add `goals`, `decisions`, `processes` to `DIR_PATTERN` regex |
+| Rewrite | `skills/conventions/quality.md` | Generalize Iron Law, add developer notability criteria |
+| Patch | `skills/brain-ops/SKILL.md` | Replace 8 VC hard-gates with developer entity references |
+| Rewrite | `skills/signal-detector/SKILL.md` | Replace VC detection with developer signal detection |
+| Rewrite | `skills/_brain-filing-rules.md` | Replace VC taxonomy with developer filing taxonomy |
+| Patch | `skills/_brain-filing-rules.json` | Add `goal`, `decision`, `process` kinds + dream paths |
+| Rewrite | `skills/RESOLVER.md` | Replace VC triggers with developer triggers |
+| Patch | `skills/conventions/brain-first.md` | Replace VC entity conventions with developer entity table |
+
+---
+
+## Task 1: Add developer PageTypes to type system
+
+**Files:**
+- Modify: `src/core/types.ts:13` (PageType union)
+- Modify: `src/core/types.ts:22-27` (ALL_PAGE_TYPES array)
+- Modify: `test/page-type-exhaustive.test.ts:63-89` (exhaustive switch)
+
+- [ ] **Step 1: Add `goal`, `decision`, `process` to the `PageType` union**
+
+In `src/core/types.ts` line 13, append the three new types before the closing semicolon:
+
+```typescript
+export type PageType = 'person' | 'company' | 'deal' | 'yc' | 'civic' | 'project' | 'concept' | 'source' | 'media' | 'writing' | 'analysis' | 'guide' | 'hardware' | 'architecture' | 'meeting' | 'note' | 'email' | 'slack' | 'calendar-event' | 'code' | 'image' | 'synthesis' | 'goal' | 'decision' | 'process';
+```
+
+- [ ] **Step 2: Add the same three types to `ALL_PAGE_TYPES`**
+
+In `src/core/types.ts` lines 22-27, add the three new types to the array:
+
+```typescript
+export const ALL_PAGE_TYPES: readonly PageType[] = [
+  'person', 'company', 'deal', 'yc', 'civic', 'project', 'concept',
+  'source', 'media', 'writing', 'analysis', 'guide', 'hardware',
+  'architecture', 'meeting', 'note', 'email', 'slack', 'calendar-event',
+  'code', 'image', 'synthesis', 'goal', 'decision', 'process',
+] as const;
+```
+
+- [ ] **Step 3: Add cases to the exhaustive switch in the contract test**
+
+In `test/page-type-exhaustive.test.ts`, add three cases to the `classify` function (lines 63-89) before the `default`:
+
+```typescript
+    case 'synthesis': return 'doc';
+    case 'goal': return 'work';
+    case 'decision': return 'doc';
+    case 'process': return 'doc';
+    default: return assertNever(t);
+```
+
+- [ ] **Step 4: Run typecheck to verify the union is consistent**
+
+Run: `bun run typecheck`
+Expected: PASS (no type errors). If any switch/assertNever consumer fails, it means there's an exhaustive switch elsewhere that needs new cases — fix those before proceeding.
+
+- [ ] **Step 5: Run unit tests to verify contract test passes**
+
+Run: `bun test test/page-type-exhaustive.test.ts`
+Expected: All 4 tests pass, including the round-trip and exhaustive switch tests.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add src/core/types.ts test/page-type-exhaustive.test.ts
+git commit -m "feat: add goal, decision, process to PageType union"
+```
+
+---
+
+## Task 2: Add developer directory mappings to `inferType()`
+
+**Files:**
+- Modify: `src/core/markdown.ts:344-375` (inferType function)
+- Test: `test/markdown.test.ts`
+
+- [ ] **Step 1: Write failing tests for the three new directory mappings**
+
+Add tests to `test/markdown.test.ts` inside the existing `inferType` / `parseMarkdown` describe block. Find the section that tests type inference from file paths (look for the `people/someone.md` test around lines 70-89) and add after it:
+
+```typescript
+  test('inferType: goals/ → goal', () => {
+    const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'goals/setup-jwt-auth.md');
+    expect(result.type).toBe('goal');
+  });
+
+  test('inferType: decisions/ → decision', () => {
+    const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'decisions/chose-postgres.md');
+    expect(result.type).toBe('decision');
+  });
+
+  test('inferType: processes/ → process', () => {
+    const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'processes/deploy-to-prod.md');
+    expect(result.type).toBe('process');
+  });
+
+  test('inferType: decisions/ under projects/ → decision (longest prefix)', () => {
+    const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'projects/my-app/decisions/use-redis.md');
+    expect(result.type).toBe('decision');
+  });
+```
+
+- [ ] **Step 2: Run the tests to confirm they fail**
+
+Run: `bun test test/markdown.test.ts`
+Expected: The four new tests FAIL (goals/ returns `concept`, decisions/ returns `concept`, processes/ returns `concept`, nested decisions/ returns `project`).
+
+- [ ] **Step 3: Add the directory mappings to `inferType()`**
+
+In `src/core/markdown.ts`, add three lines inside `inferType()`. Place them BEFORE the `/projects/` check (line 364) so `decisions/` under `projects/` matches `decision` first (the checks run in order, so the first match wins):
+
+```typescript
+  if (lower.includes('/goals/') || lower.includes('/goal/')) return 'goal';
+  if (lower.includes('/decisions/') || lower.includes('/decision/')) return 'decision';
+  if (lower.includes('/processes/') || lower.includes('/process/')) return 'process';
+  if (lower.includes('/projects/') || lower.includes('/project/')) return 'project';
+```
+
+The three new lines go right before the existing `projects/` line. Do NOT remove or change any existing lines — the VC directory mappings stay for backward compatibility.
+
+- [ ] **Step 4: Run the tests to confirm they pass**
+
+Run: `bun test test/markdown.test.ts`
+Expected: All tests pass, including the four new ones.
+
+- [ ] **Step 5: Run typecheck**
+
+Run: `bun run typecheck`
+Expected: PASS
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add src/core/markdown.ts test/markdown.test.ts
+git commit -m "feat: add goals/decisions/processes directory mappings to inferType"
+```
+
+---
+
+## Task 3: Add developer directories to `DIR_PATTERN` auto-link regex
+
+**Files:**
+- Modify: `src/core/link-extraction.ts:46` (DIR_PATTERN)
+- Test: `test/link-extraction.test.ts`
+
+- [ ] **Step 1: Write failing tests for entity ref extraction from developer directories**
+
+Add tests to `test/link-extraction.test.ts` inside the existing `extractEntityRefs` describe block:
+
+```typescript
+  test('extractEntityRefs: goals/ directory link', () => {
+    const refs = extractEntityRefs('[Setup JWT](goals/setup-jwt-auth)');
+    expect(refs).toEqual([{ name: 'Setup JWT', slug: 'goals/setup-jwt-auth' }]);
+  });
+
+  test('extractEntityRefs: decisions/ directory link', () => {
+    const refs = extractEntityRefs('[Chose Postgres](decisions/chose-postgres)');
+    expect(refs).toEqual([{ name: 'Chose Postgres', slug: 'decisions/chose-postgres' }]);
+  });
+
+  test('extractEntityRefs: processes/ directory link', () => {
+    const refs = extractEntityRefs('[Deploy Flow](processes/deploy-to-prod)');
+    expect(refs).toEqual([{ name: 'Deploy Flow', slug: 'processes/deploy-to-prod' }]);
+  });
+
+  test('extractEntityRefs: goals/ wikilink', () => {
+    const refs = extractEntityRefs('[[goals/setup-jwt-auth|Setup JWT]]');
+    expect(refs).toEqual([{ name: 'Setup JWT', slug: 'goals/setup-jwt-auth' }]);
+  });
+```
+
+- [ ] **Step 2: Run tests to confirm they fail**
+
+Run: `bun test test/link-extraction.test.ts`
+Expected: The four new tests FAIL (DIR_PATTERN doesn't match goals/decisions/processes).
+
+- [ ] **Step 3: Add `goals`, `decisions`, `processes` to `DIR_PATTERN`**
+
+In `src/core/link-extraction.ts` line 46, add the three new directories to the regex alternation. Place them at the beginning; none of the three is a prefix of an existing entry (or the reverse), so their position does not change what the pattern matches:
+
+```typescript
+const DIR_PATTERN = '(?:goals|decisions|processes|people|companies|meetings|concepts|deal|civic|project|projects|source|media|yc|tech|finance|personal|openclaw|entities)';
+```
+
+- [ ] **Step 4: Run tests to confirm they pass**
+
+Run: `bun test test/link-extraction.test.ts`
+Expected: All tests pass, including the four new ones.
+
+- [ ] **Step 5: Run typecheck**
+
+Run: `bun run typecheck`
+Expected: PASS
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add src/core/link-extraction.ts test/link-extraction.test.ts
+git commit -m "feat: add goals/decisions/processes to DIR_PATTERN auto-link"
+```
+
+---
+
+## Task 4: Rewrite `quality.md` — root of the delegation chain
+
+**Files:**
+- Rewrite: `skills/conventions/quality.md`
+
+This is the most important skill file change. Every other file's Iron Law and notability gate delegates here. The VC scoping ("person or company") must become entity-generic.
+
+- [ ] **Step 1: Rewrite `quality.md`**
+
+Replace the entire contents of `skills/conventions/quality.md` with:
+
+```markdown
+# Quality Convention
+
+Cross-cutting quality rules for all brain-writing skills.
+
+## Citations (MANDATORY)
+
+Every fact written to a brain page must carry an inline `[Source: ...]` citation.
+
+- **User's statements:** `[Source: User, {context}, YYYY-MM-DD]`
+- **Meeting data:** `[Source: Meeting "{title}", YYYY-MM-DD]`
+- **Email/message:** `[Source: email from {name} re: {subject}, YYYY-MM-DD]`
+- **Web content:** `[Source: {publication}, {URL}, YYYY-MM-DD]`
+- **Social media:** `[Source: X/@handle, YYYY-MM-DD](URL)`
+- **Synthesis:** `[Source: compiled from {sources}]`
+
+### Source precedence (highest to lowest)
+
+1. User's direct statements (highest authority)
+2. Compiled truth (brain's synthesized understanding)
+3. Timeline entries (raw evidence)
+4. External sources (API enrichment, web search)
+
+## Back-Linking (MANDATORY)
+
+Every mention of an entity WITH a brain page MUST create a back-link
+FROM that entity's page TO the page mentioning it.
+
+Entities: goals, decisions, processes, concepts — any page in a recognized
+entity directory.
+
+Format: `- **YYYY-MM-DD** | Referenced in [page title](path) -- context`
+
+An unlinked mention is a broken brain.
+
+## Notability Gate
+
+Before creating a new brain page, check notability:
+
+- **Goals:** Is this a distinct execution arc worth documenting? (Not a sub-step of an existing goal)
+- **Decisions:** Does this choice govern future work beyond the current goal?
+- **Processes:** Is this repeatable and handoff-worthy? (Not a one-off sequence)
+- **Concepts:** Reusable across goals? Stable? Non-procedural? (If it's steps, it's a process)
+
+When in doubt, capture in the current goal page first. Promote to its own page
+only when reuse is clear. A missing page can be created later. A junk page
+wastes attention and degrades search quality.
+```
+
+- [ ] **Step 2: Verify the file reads correctly**
+
+Run: `cat skills/conventions/quality.md`
+Expected: The full new content with developer-domain notability criteria.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add skills/conventions/quality.md
+git commit -m "feat: generalize quality.md Iron Law and notability gate for developer domain"
+```
+
+---
+
+## Task 5: Patch `brain-ops/SKILL.md` — the loop engine (8 sites)
+
+**Files:**
+- Modify: `skills/brain-ops/SKILL.md`
+
+Eight hard-gate sites say "person or company" and must be changed to developer entity references. The `writes_to` frontmatter also needs updating.
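
One way to confirm that no hard-gate site was missed is to scan the patched text for leftover VC wording. The helper below is a hypothetical sketch, not part of gbrain: `findLeftoverVcPhrases` and the phrase list are assumptions drawn from the "Replace:" blocks in this task, and the list may not cover every VC-specific phrase in the real file.

```typescript
// Phrases taken from the "Replace:" blocks in this task. The list is an
// assumption and may not cover every VC-specific wording in the file.
const VC_PHRASES: readonly string[] = [
  'person or company',
  'person, company, or topic',
  'people/companies',
  'people, companies, deals',
];

interface Leftover {
  line: number; // 1-based line number in the scanned text
  phrase: string;
  text: string;
}

// Hypothetical helper: report every line that still contains a VC phrase.
function findLeftoverVcPhrases(markdown: string): Leftover[] {
  const hits: Leftover[] = [];
  markdown.split('\n').forEach((text, index) => {
    const lower = text.toLowerCase();
    for (const phrase of VC_PHRASES) {
      if (lower.includes(phrase)) {
        hits.push({ line: index + 1, phrase, text: text.trim() });
      }
    }
  });
  return hits;
}
```

Passing the contents of `skills/brain-ops/SKILL.md` after Step 8 should yield an empty array if every listed site was patched; any remaining hit points at the line to revisit.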
+
+- [ ] **Step 1: Update `writes_to` frontmatter (lines 22-26)**
+
+Replace:
+```yaml
+writes_to:
+  - people/
+  - companies/
+  - deals/
+  - concepts/
+  - meetings/
+```
+
+With:
+```yaml
+writes_to:
+  - goals/
+  - decisions/
+  - processes/
+  - concepts/
+```
+
+- [ ] **Step 2: Update Iron Law scope (line 49)**
+
+Replace:
+```
+Every mention of a person or company with a brain page MUST create a back-link
+```
+
+With:
+```
+Every mention of an entity with a brain page MUST create a back-link
+```
+
+- [ ] **Step 3: Update Phase 1 description (line 57)**
+
+Replace:
+```
+Before using ANY external API to research a person, company, or topic:
+```
+
+With:
+```
+Before using ANY external API to research a goal, decision, process, or concept:
+```
+
+- [ ] **Step 4: Update Phase 2 trigger (lines 69-71)**
+
+Replace:
+```
+Every message, meeting, email, or conversation that references a person or company:
+
+1. **Detect entities** — people, companies, deals mentioned
+```
+
+With:
+```
+Every message or conversation that references a goal, decision, process, or concept:
+
+1. **Detect entities** — goals, decisions, processes, concepts mentioned
+```
+
+- [ ] **Step 5: Update Phase 2.5 link types (lines 88-89)**
+
+Replace:
+```
+- Inferred link types: `attended` (meeting -> person), `works_at`, `invested_in`,
+  `founded`, `advises`, `source` (frontmatter), `mentions` (default).
+```
+
+With:
+```
+- Inferred link types: `uses` (goal -> concept), `decided_in` (decision -> goal),
+  `depends_on` (process -> concept), `source` (frontmatter), `mentions` (default).
+```
+
+- [ ] **Step 6: Update Phase 3 description (line 98)**
+
+Replace:
+```
+Before answering any question about a person, company, or topic:
+```
+
+With:
+```
+Before answering any question about a goal, decision, process, or concept:
+```
+
+- [ ] **Step 7: Update Phase 4 ambient enrichment triggers (lines 111-112)**
+
+Replace:
+```
+- Person mentioned → check brain, create/enrich if needed (spawn background)
+- Company mentioned → same
+```
+
+With:
+```
+- Goal mentioned → check brain, create/update if needed (spawn background)
+- Decision/process/concept mentioned → same
+```
+
+- [ ] **Step 8: Update anti-patterns (line 147)**
+
+Replace:
+```
+- Answering questions about people/companies without checking the brain first
+```
+
+With:
+```
+- Answering questions about goals/decisions/processes/concepts without checking the brain first
+```
+
+- [ ] **Step 9: Verify the file reads correctly**
+
+Run: `cat skills/brain-ops/SKILL.md | head -60`
+Expected: Updated frontmatter with developer directories and generalized Iron Law.
+
+- [ ] **Step 10: Commit**
+
+```bash
+git add skills/brain-ops/SKILL.md
+git commit -m "feat: patch brain-ops 8 hard-gate sites for developer domain"
+```
+
+---
+
+## Task 6: Rewrite `signal-detector/SKILL.md`
+
+**Files:**
+- Rewrite: `skills/signal-detector/SKILL.md`
+
+Replace the VC-oriented entity detection with developer-domain signal detection. The signal detector fires on every message and is the entry point for knowledge capture.
+
+- [ ] **Step 1: Rewrite the entire file**
+
+Replace the entire contents of `skills/signal-detector/SKILL.md` with:
+
+```markdown
+---
+name: signal-detector
+version: 2.0.0
+description: |
+  Always-on ambient signal capture for developer knowledge. Fires on every
+  inbound message to detect goals, decisions, processes, concepts, and
+  original thinking. Spawn as a cheap sub-agent in parallel, never block
+  the main response.
+triggers:
+  - every inbound message (always-on)
+tools:
+  - search
+  - query
+  - get_page
+  - put_page
+  - add_link
+  - add_timeline_entry
+mutating: true
+writes_pages: true
+writes_to:
+  - goals/
+  - decisions/
+  - processes/
+  - concepts/
+---
+
+# Signal Detector — Developer Knowledge Capture
+
+Lightweight sub-agent that fires on every inbound message to capture TWO things
+with EQUAL priority:
+
+1. **Original thinking** — the user's ideas, observations, frameworks
+2. **Developer knowledge signals** — goals, decisions, processes, concepts
+
+Original thinking is AT LEAST as valuable as entity extraction. Ideas are the
+intellectual capital. Entities are bookkeeping. Both compound over time.
+
+## Contract
+
+This skill guarantees:
+- Fires on every message (no exceptions unless purely operational)
+- Runs in parallel (spawned, never blocks main response)
+- Captures ideas with the user's EXACT phrasing (no paraphrasing)
+- Detects developer knowledge signals and creates/enriches brain pages
+- Logs a one-line summary of what was captured
+- Back-links all entity mentions (Iron Law)
+- Citations on every fact written
+
+> **Convention:** See `skills/conventions/quality.md` for Iron Law back-linking.
+
+Every time this skill creates or updates a brain page that mentions another entity:
+1. Check if that entity has a brain page
+2. If yes → add a back-link FROM their page TO the page you just created/updated
+3. Format: `- **YYYY-MM-DD** | Referenced in [page title](path) — brief context`
+4. An unlinked mention is a broken brain.
+
+## Phases
+
+### Phase 1: Idea/Observation Detection (PRIMARY)
+
+When the user expresses a novel thought, observation, thesis, or framework:
+- If it's the user's **original thinking** (they generated it) → create/update `concepts/{slug}`
+- If it's a **reusable pattern or mental model** → create/update `concepts/{slug}`
+
+**Capture exact phrasing.** The user's language IS the insight. Don't paraphrase.
+
+**Cross-linking (MANDATORY):** Every concept MUST link to related goals, decisions,
+and processes. A concept without cross-links is a dead concept.
+
+### Phase 2: Developer Knowledge Detection (SECONDARY)
+
+Scan every message for these signals:
+
+1. **Goal signals** — "set up JWT auth", "migrate to Postgres", "fix the deploy",
+   any /goal invocation or development task being worked on
+   - Check brain: `gbrain search "goal name"`
+   - If no page → create `goals/{slug}` with approach, environment, initial state
+   - If page exists → update with new progress, debug trails, decisions made
+
+2. **Decision signals** — "we chose X because Y", "decided to", "tradeoff",
+   "going with", "ruling out"
+   - If the decision governs future work beyond this goal → create `decisions/{slug}`
+   - If the decision is local to the current goal → log on the goal page
+   - Always record: what was decided, why, what alternatives were considered
+
+3. **Process signals** — "to deploy, you need to", "the workflow is", "steps to",
+   "how to set up", repeatable sequences
+   - Create `processes/{slug}` with preconditions, steps, verification
+   - Only if the process is reproducible and handoff-worthy
+
+4. **Concept signals** — "event sourcing works by", "the repository pattern",
+   "Docker needs this flag because", tool knowledge, pattern explanations
+   - Create/update `concepts/{slug}` with context-free reusable understanding
+   - Must be: reusable, cross-goal, stable, non-procedural
+
+5. **Debug signals** — "the bug was caused by", "root cause was", "fixed by"
+   - Add structured timeline entry to the active goal page (NOT a separate page)
+   - Format: `- **YYYY-MM-DD** | Debug — **Symptom:** X. **Root cause:** Y. **Fix:** Z.`
+
+For each entity:
+- `gbrain search "name"` — does a page exist?
+- If NO page → check notability (see quality.md). If notable, create with enrichment.
+- If page exists but THIN → enrich with new information
+- If page exists and RICH → add timeline entry if there's new dated information
+
+**Auto-link (v0.10.1):** When you write/update a page that references another
+entity, the auto-link post-hook on `put_page` automatically creates the graph
+edge. You don't need to call `gbrain link` manually. Timeline entries still
+need explicit calls.
+
+### Phase 3: Signal Logging
+
+Always log a one-line summary:
+- `Signals: 0 ideas, 0 entities, 0 facts (skipped: operational)`
+- `Signals: 1 concept (captured → concepts/x), 1 goal (updated → goals/y), 1 decision (created → decisions/z)`
+
+This makes the ambient capture loop debuggable.
+
+## Output Format
+
+No visible output to the user. This skill runs silently in the background.
+The output is brain pages created/updated and the signal log line.
+
+## Anti-Patterns
+
+- Blocking the main response to wait for signal detection to complete
+- Paraphrasing the user's original thinking instead of capturing exact phrasing
+- Creating pages for non-notable entities (one-off mentions, sub-steps)
+- Skipping back-links after creating/updating pages
+- Running on purely operational messages ("ok", "thanks", "do it")
+- Creating a separate page for debug trails (they go on the goal page)
+- Filing a concept that's really a process (if it has steps, it's a process)
+
+## Tools Used
+
+- `search` — check if entity page exists
+- `query` — semantic search for related context
+- `get_page` — load existing entity pages
+- `put_page` — create/update brain pages
+- `add_link` — cross-reference entities
+- `add_timeline_entry` — record events on entity timelines
+```
+
+- [ ] **Step 2: Verify the file reads correctly**
+
+Run: `head -30 skills/signal-detector/SKILL.md`
+Expected: Updated frontmatter with `writes_to: goals/, decisions/, processes/, concepts/` and version 2.0.0.
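
The Iron Law back-link line that the rewritten skill must append can be produced by a small formatter. This is a sketch under stated assumptions: `BackLink` and `formatBackLink` are hypothetical names, and the output follows the `--` separator form given in `quality.md`.

```typescript
// Hypothetical shape for one back-link entry; field names are assumptions.
interface BackLink {
  date: string; // must already be in YYYY-MM-DD form
  pageTitle: string;
  pagePath: string;
  context: string;
}

// Builds the timeline line in the form used by quality.md:
// - **YYYY-MM-DD** | Referenced in [page title](path) -- context
function formatBackLink(link: BackLink): string {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(link.date)) {
    throw new Error(`expected YYYY-MM-DD, got "${link.date}"`);
  }
  return `- **${link.date}** | Referenced in [${link.pageTitle}](${link.pagePath}) -- ${link.context}`;
}
```

Rejecting malformed dates up front keeps timeline entries sortable as plain strings, which is the main reason the sketch validates rather than reformats.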
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add skills/signal-detector/SKILL.md
+git commit -m "feat: rewrite signal-detector for developer domain knowledge capture"
+```
+
+---
+
+## Task 7: Rewrite `_brain-filing-rules.md` and patch `_brain-filing-rules.json`
+
+**Files:**
+- Rewrite: `skills/_brain-filing-rules.md`
+- Modify: `skills/_brain-filing-rules.json`
+
+- [ ] **Step 1: Rewrite `_brain-filing-rules.md`**
+
+Replace the entire contents of `skills/_brain-filing-rules.md` with:
+
+```markdown
+# Brain Filing Rules -- MANDATORY for all skills that write to the brain
+
+## The Rule
+
+The PRIMARY SUBJECT of the content determines where it goes. Not the format,
+not the source, not the skill that's running.
+
+## Decision Protocol
+
+1. Identify the primary subject (a goal? decision? process? concept?)
+2. File in the directory that matches the subject
+3. Cross-link from related directories
+4. When in doubt: what would you search for to find this page again?
+
+## Operational Rule
+
+Capture everything in `goals/` first. Promote out only when reusable:
+- `decision` — if the choice should constrain other goals
+- `process` — if it's reproducible and handoff-worthy
+- `concept` — if it generalizes beyond the specific case
+
+## Common Misfiling Patterns -- DO NOT DO THESE
+
+| Wrong | Right | Why |
+|-------|-------|-----|
+| Local decision on goal page → `decisions/` | Keep on `goals/` page | Only durable cross-goal choices go to decisions/ |
+| One-off command sequence → `processes/` | Keep on `goals/` page | processes/ is for repeatable, handoff-worthy workflows |
+| Project-specific config note → `concepts/` | Keep on `goals/` page | concepts/ is for context-free reusable knowledge |
+| Reusable pattern buried in goal page | → `concepts/` | If it applies to more than one goal, promote it |
+| Debug trail → separate page | → timeline entry on `goals/` page | Debug trails are structured timeline entries, not pages |
+| A series of steps → `concepts/` | → `processes/` | If it has steps, it's a process |
+
+## MECE Boundaries (hard rules)
+
+| Pair | Boundary |
+|------|----------|
+| goals/ vs decisions/ | goals: what happened in one execution run. decisions: durable choice meant to govern future goals |
+| goals/ vs processes/ | goals: narrative + debug trail. processes: canonical reproducible procedure (no session story) |
+| goals/ vs concepts/ | goals: applied, context-bound. concepts: context-free reusable understanding |
+| decisions/ vs processes/ | decisions: what/why we chose. processes: how to execute |
+| decisions/ vs concepts/ | decisions: committed policy for a scope. concepts: explanatory model, no commitment |
+| processes/ vs concepts/ | processes: stepwise action. concepts: theory/pattern vocabulary |
+
+## Sanctioned exception: synthesis output is sui generis
+
+The "file by primary subject" rule is for raw ingest. Synthesized output that
+is one-of-one to a single source AND a specific reader does not fit any
+subject directory cleanly.
+
+Format-prefixed paths under `media//` are the sanctioned
+exception:
+
+- `media/books/-personalized.md` (book-mirror output)
+- `media/articles/-personalized.md` (long-form article personalization)
+
+## What `sources/` Is Actually For
+
+`sources/` is ONLY for:
+- Bulk data imports (API dumps, CSV exports, snapshots)
+- Raw data that feeds multiple brain pages
+- Periodic captures (quarterly snapshots, sync exports)
+
+If the content has a clear primary subject (a goal, decision, process, concept),
+it does NOT go in sources/. Period.
+
+## Notability Gate
+
+Not everything deserves a brain page. Before creating a new entity page:
+- **Goals:** Is this a distinct execution arc? (Not a sub-step of an existing goal)
+- **Decisions:** Does this choice govern future work beyond the current goal?
+- **Processes:** Is this repeatable and handoff-worthy? (Not a one-off sequence)
+- **Concepts:** Reusable across goals? Stable? Non-procedural?
+- **When in doubt, DON'T create.** Capture on the goal page first. Promote later.
+
+## Iron Law: Back-Linking (MANDATORY)
+
+Every mention of an entity with a brain page MUST create a back-link
+FROM that entity's page TO the page mentioning it. This is bidirectional:
+the new page links to the entity, AND the entity's page links back.
+
+Format for back-links (append to Timeline or See Also):
+```
+- **YYYY-MM-DD** | Referenced in [page title](path/to/page.md) -- brief context
+```
+
+An unlinked mention is a broken brain. The graph is the intelligence.
+
+## Citation Requirements (MANDATORY)
+
+Every fact written to a brain page must carry an inline `[Source: ...]` citation.
+
+Three formats:
+- **Direct attribution:** `[Source: User, {context}, YYYY-MM-DD]`
+- **API/external:** `[Source: {provider}, YYYY-MM-DD]` or `[Source: {publication}, {URL}]`
+- **Synthesis:** `[Source: compiled from {list of sources}]`
+
+Source precedence (highest to lowest):
+1. User's direct statements (highest authority)
+2. Compiled truth (pre-existing brain synthesis)
+3. Timeline entries (raw evidence)
+4. External sources (API enrichment, web search -- lowest)
+
+When sources conflict, note the contradiction with both citations. Don't
+silently pick one.
+
+## Raw Source Preservation
+
+Every ingested item should have its raw source preserved for provenance.
+
+**Size routing (automatic via `gbrain files upload-raw`):**
+- **< 100 MB text/PDF**: stays in the brain repo (git-tracked) in a `.raw/`
+  sidecar directory alongside the brain page
+- **>= 100 MB OR media files** (video, audio, images): uploaded to cloud
+  storage with a `.redirect.yaml` pointer left in the brain repo.
+
+## Dream-cycle synthesize / patterns directories (v0.23)
+
+The `synthesize` and `patterns` phases of `gbrain dream` write to a
+**fixed allow-list** of paths sourced from `_brain-filing-rules.json`'s
+`dream_synthesize_paths.globs` array. Editing that JSON is the ONLY way
+to add a new directory the synthesis subagent may write to.
+
+## Brain-to-skill promotion pipeline
+
+When a process proves repeatable (2-3 times with only argument changes),
+it graduates from a `processes/` brain page to an actual skill file:
+
+- Brain stores: context, evidence, tradeoffs, project-specific constraints, debug history
+- Skill files store: stable, parameterized procedures with deterministic steps
+- Promotion rule: if reused successfully 2-3 times with only argument changes, graduate to a skill
+- Bidirectional links: process page links to skill file path, skill references source brain pages
+
+## Takes attribution (v0.32+)
+
+When writing a `` fence, the **holder** column says
+WHO BELIEVES the claim, not who it's ABOUT.
+
+1. **Holder ≠ subject.** The test: did this person SAY or CLEARLY IMPLY this?
+2. **Atomic claims.** Split compound rows into separate rows. One claim per row.
+3. **Amplification ≠ endorsement.** A retweet-only signal caps at `weight 0.55`.
+4. **Self-reported ≠ verified.** Self-report → `weight=0.75`, not `holder=world/1.0`.
+5. **No false precision.** Use 0.05 increments only.
+6. **"So what" test.** Skip metadata-style trivia.
+```
+
+- [ ] **Step 2: Add `goal`, `decision`, `process` kinds to `_brain-filing-rules.json`**
+
+In `skills/_brain-filing-rules.json`, add three new rule objects to the `rules` array. Insert them after the existing `concept` rule (after line 36):
+
+```json
+    {
+      "kind": "goal",
+      "directory": "goals/",
+      "examples": ["development tasks", "/goal executions", "debug sessions"],
+      "description": "One /goal execution arc: what was attempted, what happened, decisions made, debug trails, what was learned. The primary authoring unit — capture here first, promote out when reusable."
+    },
+    {
+      "kind": "decision",
+      "directory": "decisions/",
+      "examples": ["architecture choices", "tool selections", "tradeoff resolutions"],
+      "description": "A durable technical choice that governs future work beyond one goal. ADR-style: context, options considered, decision, consequences."
+    },
+    {
+      "kind": "process",
+      "directory": "processes/",
+      "examples": ["deploy workflows", "setup procedures", "migration runbooks"],
+      "description": "A canonical reproducible procedure that is handoff-worthy. Graduates to a skill file after 2-3 successful reuses with only argument changes."
+    },
+```
+
+- [ ] **Step 3: Add developer directories to `dream_synthesize_paths.globs`**
+
+In `skills/_brain-filing-rules.json`, add three new globs to the `dream_synthesize_paths.globs` array (around line 157-163):
+
+```json
+    "globs": [
+      "wiki/personal/reflections/*",
+      "wiki/originals/*",
+      "wiki/personal/patterns/*",
+      "wiki/people/*",
+      "dream-cycle-summaries/*",
+      "goals/*",
+      "decisions/*",
+      "processes/*"
+    ]
+```
+
+- [ ] **Step 4: Run the filing-audit test to verify the new kinds are accepted**
+
+Run: `bun test test/filing-audit.test.ts`
+Expected: All tests pass. The filing audit reads `_brain-filing-rules.json` for valid directories, so adding the new kinds makes `goals/`, `decisions/`, `processes/` valid `writes_to` targets.
+
+- [ ] **Step 5: Run the skills-conformance test**
+
+Run: `bun test test/skills-conformance.test.ts`
+Expected: All tests pass. The signal-detector and brain-ops skills now declare `writes_to` directories that exist in the filing rules JSON.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add skills/_brain-filing-rules.md skills/_brain-filing-rules.json
+git commit -m "feat: rewrite filing rules for developer domain taxonomy"
+```
+
+---
+
+## Task 8: Rewrite `RESOLVER.md` — routing table
+
+**Files:**
+- Rewrite: `skills/RESOLVER.md`
+
+Replace VC-oriented triggers with developer-oriented triggers. Keep the table structure and all non-VC skills (thinking skills, operational, setup, identity).
+
+- [ ] **Step 1: Rewrite `RESOLVER.md`**
+
+Replace the entire contents of `skills/RESOLVER.md`. **IMPORTANT:** All quoted trigger phrases in table rows must remain unchanged — the resolver test (D5/C) fuzzy-matches quoted phrases against each skill's frontmatter triggers. Since we are NOT modifying the underlying skills (query, enrich, data-research, etc.), their trigger phrases must stay the same. Only change unquoted descriptive text and the disambiguation rules.
+
+```markdown
+# GBrain Skill Resolver
+
+This is the dispatcher. Skills are the implementation. **Read the skill file before acting.** If two skills could match, read both. They are designed to chain (e.g., ingest then enrich for each entity).
+ +## Always-on (every message) + +| Trigger | Skill | +|---------|-------| +| Every inbound message (spawn parallel, don't block) | `skills/signal-detector/SKILL.md` | +| Any brain read/write/lookup/citation | `skills/brain-ops/SKILL.md` | + +## Brain operations + +| Trigger | Skill | +|---------|-------| +| "What do we know about", "tell me about", "search for", "who is", "background on", "notes on" | `skills/query/SKILL.md` | +| "Who knows who", "relationship between", "connections", "graph query" | `skills/query/SKILL.md` (use graph-query) | +| Creating/enriching a goal, decision, process, or concept page | `skills/enrich/SKILL.md` | +| Where does a new file go? Filing rules | `skills/repo-architecture/SKILL.md` | +| Fix broken citations in brain pages | `skills/citation-fixer/SKILL.md` | +| "citation audit", "check citations", "fix citations" | `skills/citation-fixer/SKILL.md` (focused fix). For broader brain health, chain into `skills/maintain/SKILL.md` | +| "Research", "track", "extract from email", "investor updates", "donations" | `skills/data-research/SKILL.md` | +| Share a brain page as a link | `skills/publish/SKILL.md` | +| "validate frontmatter", "check frontmatter", "fix frontmatter", "frontmatter audit", "brain lint" | `skills/frontmatter-guard/SKILL.md` | + +## Content & media ingestion + +| Trigger | Skill | +|---------|-------| +| User shares a link, article, or idea | `skills/idea-ingest/SKILL.md` | +| "watch this video", "process this YouTube link", "ingest this PDF", "save this podcast", "process this book", "summarize this book", "PDF book", "ingest it into my brain", "what's in this screenshot", "check out this repo" | `skills/media-ingest/SKILL.md` | +| Meeting transcript received | `skills/meeting-ingestion/SKILL.md` | +| Generic "ingest this" (auto-routes to above) | `skills/ingest/SKILL.md` | + +## Thinking skills (from GStack) + +| Trigger | Skill | +|---------|-------| +| "Brainstorm", "I have an idea", "office hours" | GStack: 
office-hours | +| "Review this plan", "CEO review", "poke holes" | GStack: ceo-review | +| "Debug", "fix", "broken", "investigate" | GStack: investigate | +| "Retro", "what shipped", "retrospective" | GStack: retro | + +> These skills come from GStack. If GStack is installed, the agent reads them directly. +> If not, brain-only mode still works (brain skills function without thinking skills). + +## Operational + +| Trigger | Skill | +|---------|-------| +| Task add/remove/complete/defer/review | `skills/daily-task-manager/SKILL.md` | +| Morning prep, meeting context, day planning | `skills/daily-task-prep/SKILL.md` | +| Daily briefing, "what's happening today" | `skills/briefing/SKILL.md` | +| Cron scheduling, quiet hours, job staggering | `skills/cron-scheduler/SKILL.md` | +| Save or load reports | `skills/reports/SKILL.md` | +| "Create a skill", "improve this skill" | `skills/skill-creator/SKILL.md` | +| "Skillify this", "is this a skill?", "make this proper" | `skills/skillify/SKILL.md` | +| "Compress my resolver", "AGENTS.md too large", "RESOLVER.md too big", "functional area dispatcher", "shrink routing table" | `skills/functional-area-resolver/SKILL.md` | +| "Is gbrain healthy?", morning health check, skillpack-check | `skills/skillpack-check/SKILL.md` | +| Post-restart health + auto-fix, smoke test | `skills/smoke-test/SKILL.md` | +| Cross-modal review, second opinion | `skills/cross-modal-review/SKILL.md` | +| "Validate skills", skill health check | `skills/testing/SKILL.md` | +| Webhook setup, external event processing | `skills/webhook-transforms/SKILL.md` | +| "Spawn agent", "background task", "parallel tasks", "steer agent", "pause/resume agent", "gbrain jobs submit", "submit a gbrain job", "submit a shell job", "shell job" | `skills/minion-orchestrator/SKILL.md` | +| "present options", "ask before proceeding", "choice gate", "user decision" | `skills/ask-user/SKILL.md` | + +## Setup & migration + +| Trigger | Skill | +|---------|-------| +| "Set up 
GBrain", first boot | `skills/setup/SKILL.md` | +| "Now what?", "fill my brain", "cold start", "bootstrap", "import my data", "what should I import first" | `skills/cold-start/SKILL.md` | +| "Migrate from Obsidian/Notion/Logseq" | `skills/migrate/SKILL.md` | +| Brain health check, maintenance run | `skills/maintain/SKILL.md` | +| "Extract links", "build link graph", "populate timeline" | `skills/maintain/SKILL.md` (extraction sections) | +| "Run dream", "process today's session", "synthesize my conversations", "consolidate yesterday's conversations", "what patterns did you see", "did the dream cycle run" | `skills/maintain/SKILL.md` (dream cycle section) | +| "Brain health", "what features am I missing", "brain score" | Run `gbrain features --json` | +| "Set up autopilot", "run brain maintenance", "keep brain updated" | Run `gbrain autopilot --install --repo ~/brain` | +| Agent identity, "who am I", customize agent | `skills/soul-audit/SKILL.md` | +| "Populate links", "extract links", "backfill graph" | `skills/maintain/SKILL.md` (graph population phase) | +| "Populate timeline", "extract timeline entries" | `skills/maintain/SKILL.md` (graph population phase) | + +## Identity & access (always-on) + +| Trigger | Skill | +|---------|-------| +| Non-owner sends a message | Check `ACCESS_POLICY.md` before responding | +| Agent needs to know its identity/vibe | Read `SOUL.md` | +| Agent needs user context | Read `USER.md` | +| Operational cadence (what to check and when) | Read `HEARTBEAT.md` | + +## Disambiguation rules + +When multiple skills could match: +1. Prefer the most specific skill (meeting-ingestion over ingest) +2. If the user mentions a URL, route by content type (link → idea-ingest, video → media-ingest) +3. If the user mentions a goal/decision/process/concept, check if enrich or query fits better +4. Chaining is explicit in each skill's Phases section +5. 
When in doubt, ask the user (see `skills/ask-user/SKILL.md` for the choice-gate pattern) + +## Conventions (cross-cutting) + +These apply to ALL brain-writing skills: +- `skills/conventions/quality.md` — citations, back-links, notability gate +- `skills/conventions/brain-first.md` — check brain before external APIs +- `skills/conventions/brain-routing.md` — which brain (DB) and which source (repo) to target; cross-brain federation is latent-space only +- `skills/conventions/subagent-routing.md` — when to use Minions vs inline work +- `skills/ask-user/SKILL.md` — choice-gate pattern for human input at decision points +- `skills/_brain-filing-rules.md` — where files go +- `skills/_output-rules.md` — output quality standards + +## Uncategorized + +| Trigger | Skill | +|---------|-------| +| "personalized version of this book", "mirror this book", "two-column book analysis", "apply this book to my life", "how does this book apply to me" | `skills/book-mirror/SKILL.md` | +| "enrich this article", "enrich brain pages", "batch enrich", "make brain pages useful" | `skills/article-enrichment/SKILL.md` | +| "strategic reading", "read this through the lens of", "apply this to my problem", "what can I learn from this about", "extract a playbook from" | `skills/strategic-reading/SKILL.md` | +| "concept synthesis", "synthesize my concepts", "find patterns across my notes", "build my intellectual map", "trace idea evolution" | `skills/concept-synthesis/SKILL.md` | +| "perplexity research", "what's new about", "current state of", "web research", "what changed about" | `skills/perplexity-research/SKILL.md` | +| "crawl my archive", "find gold in my archive", "archive crawler", "scan my dropbox for", "mine my old files for" | `skills/archive-crawler/SKILL.md` | +| "verify this academic claim", "check this study", "academic verify", "validate citation", "is this study real" | `skills/academic-verify/SKILL.md` | +| "make pdf from brain", "brain pdf", "convert brain page to pdf", 
"publish this page as pdf", "export brain page" | `skills/brain-pdf/SKILL.md` | +| "voice note", "ingest this voice memo", "transcribe and file", "voice note ingest", "save this audio note" | `skills/voice-note-ingest/SKILL.md` | +``` + +- [ ] **Step 2: Run resolver test** + +Run: `bun test test/resolver.test.ts` +Expected: All tests pass. The resolver test checks that every trigger in RESOLVER.md matches a skill's frontmatter `triggers:` entry. + +- [ ] **Step 3: Commit** + +```bash +git add skills/RESOLVER.md +git commit -m "feat: rewrite RESOLVER.md routing table for developer domain" +``` + +--- + +## Task 9: Patch `brain-first.md` — retrieval conventions + +**Files:** +- Modify: `skills/conventions/brain-first.md` + +- [ ] **Step 1: Update the header (line 3)** + +Replace: +``` +**Read this before doing ANY entity/person/company/fact lookup.** +``` + +With: +``` +**Read this before doing ANY entity/goal/decision/process/concept lookup.** +``` + +- [ ] **Step 2: Replace the entity page conventions table (lines 53-67)** + +Replace the entire "Entity Page Conventions" section: + +```markdown +## Entity Page Conventions + +Standard directory structure: + +| Directory | Type | Example | +|-----------|------|---------| +| `goals/` | goal | `goals/setup-jwt-auth.md` | +| `decisions/` | decision | `decisions/chose-postgres-over-sqlite.md` | +| `processes/` | process | `processes/deploy-to-production.md` | +| `concepts/` | concept | `concepts/event-sourcing.md` | + +When creating new pages, include proper frontmatter with `type`, `title`, +and `tags` fields. See `skills/_brain-filing-rules.md` for page templates. +``` + +- [ ] **Step 3: Verify the file reads correctly** + +Run: `cat skills/conventions/brain-first.md` +Expected: Developer entity table with goals/decisions/processes/concepts rows. 
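
The directory-to-type convention in the table above can be sketched as a small lookup. This is a minimal illustration only: `developerTypeForPath` and `DEVELOPER_DIRS` are hypothetical names, and the repository's actual mapping lives in `inferType()` in `src/core/markdown.ts`, which may normalize paths differently.

```typescript
// Hypothetical helper for illustration; not the repository's inferType().
type DeveloperPageType = 'goal' | 'decision' | 'process' | 'concept';

// Directory name -> page type, mirroring the Entity Page Conventions table.
const DEVELOPER_DIRS: ReadonlyArray<readonly [string, DeveloperPageType]> = [
  ['goals', 'goal'],
  ['decisions', 'decision'],
  ['processes', 'process'],
  ['concepts', 'concept'],
];

function developerTypeForPath(path: string): DeveloperPageType | undefined {
  // Split into segments and drop the last one (the file name), so only
  // directory names are compared.
  const dirs = path.toLowerCase().split('/').slice(0, -1);
  for (const [dir, type] of DEVELOPER_DIRS) {
    if (dirs.includes(dir)) return type;
  }
  // Paths outside the developer directories are left unclassified here.
  return undefined;
}

console.log(developerTypeForPath('goals/setup-jwt-auth.md')); // goal
console.log(developerTypeForPath('people/example-person.md')); // undefined
```

A nested path such as `projects/my-app/decisions/use-redis.md` resolves to `decision` in this sketch because any directory segment may match, not only the first one.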
+ +- [ ] **Step 4: Commit** + +```bash +git add skills/conventions/brain-first.md +git commit -m "feat: update brain-first.md entity conventions for developer domain" +``` + +--- + +## Task 10: Full verification pass + +**Files:** +- None modified — verification only + +- [ ] **Step 1: Run typecheck** + +Run: `bun run typecheck` +Expected: PASS + +- [ ] **Step 2: Run full unit test suite** + +Run: `bun run test > /tmp/customized_domain_tests.txt 2>&1; echo "EXIT=$?"; tail -50 /tmp/customized_domain_tests.txt` +Expected: All tests pass. Zero failures. + +- [ ] **Step 3: Run the PageType consumer audit** + +Run: `grep -rn 'PageType\|ALL_PAGE_TYPES' src/ --include='*.ts' | grep -v node_modules | grep -v 'import.*PageType'` + +Review the output for any switch statements, whitelist arrays, or filter expressions that enumerate page types. The new types (`goal`, `decision`, `process`) must not be silently excluded by any existing filter. Key files to check: +- `src/core/facts/eligibility.ts` — `ELIGIBLE_TYPES` array. This is intentionally narrow (note/meeting/slack/email/calendar-event/source/writing). Developer types are NOT eligible for facts backstop, which is correct (goals/decisions/processes are structured pages, not conversation-shaped). +- `src/commands/doctor.ts` — `graph_coverage` check uses `type IN ('entity', 'person', 'company', 'organization')`. This is a Tier 2 change (not loop-breaking). Note it but don't block on it. + +- [ ] **Step 4: Run skills conformance test** + +Run: `bun test test/skills-conformance.test.ts` +Expected: All tests pass. + +- [ ] **Step 5: Run filing-audit test** + +Run: `bun test test/filing-audit.test.ts` +Expected: All tests pass. + +- [ ] **Step 6: Run check-resolvable test** + +Run: `bun test test/check-resolvable.test.ts` +Expected: All tests pass. + +- [ ] **Step 7: Run resolver test** + +Run: `bun test test/resolver.test.ts` +Expected: All tests pass. 
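
For the consumer audit in Step 3, the question for each allow-list is which page types it leaves out. A rough sketch of that check follows. It assumes a shortened, illustrative type list (`SKETCH_TYPES`) rather than the real `ALL_PAGE_TYPES`, and `excludedTypes` is a hypothetical helper, not something that exists in the repository.

```typescript
// Illustrative subset only; the real list is ALL_PAGE_TYPES in src/core/types.ts.
const SKETCH_TYPES = ['note', 'meeting', 'source', 'goal', 'decision', 'process'] as const;
type SketchPageType = (typeof SKETCH_TYPES)[number];

// Return every known type that an allow-list does NOT include, so that an
// exclusion is a visible, reviewed choice rather than a silent omission.
function excludedTypes(allowList: readonly SketchPageType[]): SketchPageType[] {
  const allowed = new Set<SketchPageType>(allowList);
  return SKETCH_TYPES.filter((t) => !allowed.has(t));
}

// Stand-in for a narrow allow-list such as the facts eligibility filter.
const eligibleForFacts: SketchPageType[] = ['note', 'meeting', 'source'];

console.log(excludedTypes(eligibleForFacts)); // [ 'goal', 'decision', 'process' ]
```

Here the exclusion of `goal`, `decision`, and `process` is intentional, which matches the note on `ELIGIBLE_TYPES` in Step 3. The same output for a filter that was meant to cover all entity pages would be a finding.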
+ +- [ ] **Step 8: Spot-check the inferLinkType limitation** + +Run: `grep -n 'inferLinkType' src/core/link-extraction.ts | head -5` + +Note: `inferLinkType()` classifies developer entity relationships as `mentions` (the default fallback). This is a known v1 limitation per the spec. The function uses regex heuristics tuned for VC relationships (founded, invested_in, works_at, attended). Adding developer-specific heuristics (uses, decided_in, depends_on) is a Tier 2 follow-up. + +--- + +## Task 11 (Tier 2, optional): Update doctor.ts graph_coverage check + +**Files:** +- Modify: `src/commands/doctor.ts:1378` + +This is a Tier 2 change — nice to have but not loop-breaking. + +- [ ] **Step 1: Update the type filter in graph_coverage check** + +In `src/commands/doctor.ts` line 1378, expand the SQL `type IN (...)` clause: + +Replace: +```sql +SELECT COUNT(*)::int AS count FROM pages WHERE type IN ('entity', 'person', 'company', 'organization') +``` + +With: +```sql +SELECT COUNT(*)::int AS count FROM pages WHERE type IN ('entity', 'person', 'company', 'organization', 'goal', 'decision', 'process') +``` + +- [ ] **Step 2: Run doctor test if one exists** + +Run: `bun test test/doctor.test.ts 2>/dev/null || echo "No doctor test file"` +Expected: Either passes or no test file exists. 
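
The effect of widening the `type IN (...)` filter in Step 1 can be illustrated in memory. This is a sketch under assumptions: `PageRow`, `countPagesOfTypes`, and the sample rows are hypothetical, and the real check runs the SQL above against the `pages` table.

```typescript
// Hypothetical row shape; only the `type` column matters for this count.
interface PageRow {
  slug: string;
  type: string;
}

const LEGACY_ENTITY_TYPES = ['entity', 'person', 'company', 'organization'] as const;
const DEVELOPER_ENTITY_TYPES = ['goal', 'decision', 'process'] as const;

// In-memory equivalent of: SELECT COUNT(*) FROM pages WHERE type IN (...types)
function countPagesOfTypes(pages: readonly PageRow[], types: readonly string[]): number {
  const wanted = new Set(types);
  return pages.filter((p) => wanted.has(p.type)).length;
}

// Made-up sample data for illustration.
const samplePages: PageRow[] = [
  { slug: 'people/example-person', type: 'person' },
  { slug: 'goals/setup-jwt-auth', type: 'goal' },
  { slug: 'decisions/chose-postgres-over-sqlite', type: 'decision' },
  { slug: 'concepts/event-sourcing', type: 'concept' },
];

const before = countPagesOfTypes(samplePages, LEGACY_ENTITY_TYPES);
const after = countPagesOfTypes(samplePages, [...LEGACY_ENTITY_TYPES, ...DEVELOPER_ENTITY_TYPES]);

console.log(before, after); // 1 3
```

The `concept` row is counted in neither case, because `concept` appears in neither the original nor the expanded SQL filter.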
+ +- [ ] **Step 3: Commit** + +```bash +git add src/commands/doctor.ts +git commit -m "feat: include developer types in doctor graph_coverage check" +``` From 91af5a70f6829919701e2a50dc7d76f5c773d746 Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:00:47 +0000 Subject: [PATCH 02/19] feat: add goal, decision, process to PageType union Co-Authored-By: Claude Sonnet 4.6 --- src/core/types.ts | 4 ++-- test/page-type-exhaustive.test.ts | 3 +++ 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/src/core/types.ts b/src/core/types.ts index 190459b51..0964e78b4 100644 --- a/src/core/types.ts +++ b/src/core/types.ts @@ -10,7 +10,7 @@ // embedding_image vector(1024). Bytes never enter the DB; the brain repo // holds the file and `files.storage_path` references it. // `synthesis` (v0.28): think-generated provenance pages. -export type PageType = 'person' | 'company' | 'deal' | 'yc' | 'civic' | 'project' | 'concept' | 'source' | 'media' | 'writing' | 'analysis' | 'guide' | 'hardware' | 'architecture' | 'meeting' | 'note' | 'email' | 'slack' | 'calendar-event' | 'code' | 'image' | 'synthesis'; +export type PageType = 'person' | 'company' | 'deal' | 'yc' | 'civic' | 'project' | 'concept' | 'source' | 'media' | 'writing' | 'analysis' | 'guide' | 'hardware' | 'architecture' | 'meeting' | 'note' | 'email' | 'slack' | 'calendar-event' | 'code' | 'image' | 'synthesis' | 'goal' | 'decision' | 'process'; /** * Canonical list of every PageType value. Kept in sync with the union above. 
@@ -23,7 +23,7 @@ export const ALL_PAGE_TYPES: readonly PageType[] = [ 'person', 'company', 'deal', 'yc', 'civic', 'project', 'concept', 'source', 'media', 'writing', 'analysis', 'guide', 'hardware', 'architecture', 'meeting', 'note', 'email', 'slack', 'calendar-event', - 'code', 'image', 'synthesis', + 'code', 'image', 'synthesis', 'goal', 'decision', 'process', ] as const; /** diff --git a/test/page-type-exhaustive.test.ts b/test/page-type-exhaustive.test.ts index a62a73226..075140100 100644 --- a/test/page-type-exhaustive.test.ts +++ b/test/page-type-exhaustive.test.ts @@ -84,6 +84,9 @@ describe('PageType exhaustiveness contract', () => { case 'code': return 'code'; case 'image': return 'asset'; case 'synthesis': return 'doc'; + case 'goal': return 'work'; + case 'decision': return 'doc'; + case 'process': return 'doc'; default: return assertNever(t); } } From bd27f6002c2279b89133836a8632a5d60f960231 Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:02:43 +0000 Subject: [PATCH 03/19] feat: add goals/decisions/processes directory mappings to inferType --- src/core/markdown.ts | 3 +++ test/markdown.test.ts | 20 ++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/src/core/markdown.ts b/src/core/markdown.ts index 1d697f26d..9f3833ccd 100644 --- a/src/core/markdown.ts +++ b/src/core/markdown.ts @@ -361,6 +361,9 @@ function inferType(filePath?: string): PageType { if (lower.includes('/deals/') || lower.includes('/deal/')) return 'deal'; if (lower.includes('/yc/')) return 'yc'; if (lower.includes('/civic/')) return 'civic'; + if (lower.includes('/goals/') || lower.includes('/goal/')) return 'goal'; + if (lower.includes('/decisions/') || lower.includes('/decision/')) return 'decision'; + if (lower.includes('/processes/') || lower.includes('/process/')) return 'process'; if (lower.includes('/projects/') || lower.includes('/project/')) return 'project'; if 
(lower.includes('/sources/') || lower.includes('/source/')) return 'source'; if (lower.includes('/media/')) return 'media'; diff --git a/test/markdown.test.ts b/test/markdown.test.ts index 2d6f165ba..75bfcf11c 100644 --- a/test/markdown.test.ts +++ b/test/markdown.test.ts @@ -301,4 +301,24 @@ Some content.`; expect(parseMarkdown('', 'writing/post.md').type).toBe('writing'); expect(parseMarkdown('', 'projects/blog/writing/essay.md').type).toBe('writing'); }); + + test('inferType: goals/ → goal', () => { + const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'goals/setup-jwt-auth.md'); + expect(result.type).toBe('goal'); + }); + + test('inferType: decisions/ → decision', () => { + const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'decisions/chose-postgres.md'); + expect(result.type).toBe('decision'); + }); + + test('inferType: processes/ → process', () => { + const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'processes/deploy-to-prod.md'); + expect(result.type).toBe('process'); + }); + + test('inferType: decisions/ under projects/ → decision (longest prefix)', () => { + const result = parseMarkdown('---\ntitle: Test\n---\nBody', 'projects/my-app/decisions/use-redis.md'); + expect(result.type).toBe('decision'); + }); }); From bc11839b2fb0d3c926161c5a828bcca3a7bd5d71 Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:04:29 +0000 Subject: [PATCH 04/19] feat: add goals/decisions/processes to DIR_PATTERN auto-link Extends the entity reference regex so markdown links like [text](goals/slug) and wikilinks like [[goals/slug|text]] are recognized as graph edges. 
Co-Authored-By: Claude Sonnet 4.6 --- src/core/link-extraction.ts | 2 +- test/link-extraction.test.ts | 20 ++++++++++++++++++++ 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/src/core/link-extraction.ts b/src/core/link-extraction.ts index bcd9d4430..dc9483e35 100644 --- a/src/core/link-extraction.ts +++ b/src/core/link-extraction.ts @@ -43,7 +43,7 @@ export type LinkResolutionType = 'qualified' | 'unqualified'; * - Our domain extensions: tech, finance, personal, openclaw (domain-organized wikis) * - Our entity prefix: entities (we kept some legacy entities/projects/ pages) */ -const DIR_PATTERN = '(?:people|companies|meetings|concepts|deal|civic|project|projects|source|media|yc|tech|finance|personal|openclaw|entities)'; +const DIR_PATTERN = '(?:goals|decisions|processes|people|companies|meetings|concepts|deal|civic|project|projects|source|media|yc|tech|finance|personal|openclaw|entities)'; /** * Match `[Name](path)` markdown links pointing to entity directories. diff --git a/test/link-extraction.test.ts b/test/link-extraction.test.ts index 6829ffeca..76ac47148 100644 --- a/test/link-extraction.test.ts +++ b/test/link-extraction.test.ts @@ -109,6 +109,26 @@ describe('extractEntityRefs', () => { expect(refs.length).toBe(1); expect(refs[0].dir).toBe('meetings'); }); + + test('extractEntityRefs: goals/ directory link', () => { + const refs = extractEntityRefs('[Setup JWT](goals/setup-jwt-auth)'); + expect(refs).toEqual([{ name: 'Setup JWT', slug: 'goals/setup-jwt-auth', dir: 'goals' }]); + }); + + test('extractEntityRefs: decisions/ directory link', () => { + const refs = extractEntityRefs('[Chose Postgres](decisions/chose-postgres)'); + expect(refs).toEqual([{ name: 'Chose Postgres', slug: 'decisions/chose-postgres', dir: 'decisions' }]); + }); + + test('extractEntityRefs: processes/ directory link', () => { + const refs = extractEntityRefs('[Deploy Flow](processes/deploy-to-prod)'); + expect(refs).toEqual([{ name: 'Deploy Flow', slug: 
'processes/deploy-to-prod', dir: 'processes' }]); + }); + + test('extractEntityRefs: goals/ wikilink', () => { + const refs = extractEntityRefs('[[goals/setup-jwt-auth|Setup JWT]]'); + expect(refs).toEqual([{ name: 'Setup JWT', slug: 'goals/setup-jwt-auth', dir: 'goals' }]); + }); }); // ─── extractPageLinks ────────────────────────────────────────── From ba00bdf949849f53849c6112f7e779ff3439dc5d Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:05:14 +0000 Subject: [PATCH 05/19] feat: generalize quality.md Iron Law and notability gate for developer domain Co-Authored-By: Claude Opus 4.6 (1M context) --- skills/conventions/quality.md | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/skills/conventions/quality.md b/skills/conventions/quality.md index 5845cb4c9..3cfaa7862 100644 --- a/skills/conventions/quality.md +++ b/skills/conventions/quality.md @@ -22,8 +22,11 @@ Every fact written to a brain page must carry an inline `[Source: ...]` citation ## Back-Linking (MANDATORY) -Every mention of a person or company WITH a brain page MUST create a back-link -FROM that entity's page TO the page mentioning them. +Every mention of an entity WITH a brain page MUST create a back-link +FROM that entity's page TO the page mentioning it. + +Entities: goals, decisions, processes, concepts — any page in a recognized +entity directory. Format: `- **YYYY-MM-DD** | Referenced in [page title](path) -- context` @@ -33,8 +36,11 @@ An unlinked mention is a broken brain. Before creating a new brain page, check notability: -- **People:** Will you interact again? Relevant to work/interests? -- **Companies:** Relevant to work/investments/interests? -- **Concepts:** Reusable mental model? Worth referencing again? +- **Goals:** Is this a distinct execution arc worth documenting? 
(Not a sub-step of an existing goal) +- **Decisions:** Does this choice govern future work beyond the current goal? +- **Processes:** Is this repeatable and handoff-worthy? (Not a one-off sequence) +- **Concepts:** Reusable across goals? Stable? Non-procedural? (If it's steps, it's a process) -When in doubt, DON'T create. A 400-follower person who tweeted once is not notable. +When in doubt, capture in the current goal page first. Promote to its own page +only when reuse is clear. A missing page can be created later. A junk page +wastes attention and degrades search quality. From 661ed799fd94e5e7e2d8808466464890c8a397a7 Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:05:59 +0000 Subject: [PATCH 06/19] feat: patch brain-ops 8 hard-gate sites for developer domain Co-Authored-By: Claude Opus 4.6 (1M context) --- skills/brain-ops/SKILL.md | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/skills/brain-ops/SKILL.md b/skills/brain-ops/SKILL.md index 1cc478f77..2194ddae9 100644 --- a/skills/brain-ops/SKILL.md +++ b/skills/brain-ops/SKILL.md @@ -19,11 +19,10 @@ tools: mutating: true writes_pages: true writes_to: - - people/ - - companies/ - - deals/ + - goals/ + - decisions/ + - processes/ - concepts/ - - meetings/ --- # Brain Operations — The Ambient Context Layer @@ -46,7 +45,7 @@ This skill guarantees: ## Iron Law: Back-Linking (MANDATORY) -Every mention of a person or company with a brain page MUST create a back-link +Every mention of an entity with a brain page MUST create a back-link FROM that entity's page TO the page mentioning them. An unlinked mention is a broken brain. See `skills/conventions/quality.md` for format. @@ -54,7 +53,7 @@ broken brain. See `skills/conventions/quality.md` for format. 
### Phase 1: Brain-First Lookup (MANDATORY) -Before using ANY external API to research a person, company, or topic: +Before using ANY external API to research a goal, decision, process, or concept: 1. `gbrain search "name"` — keyword search for existing pages 2. `gbrain query "natural question about name"` — hybrid search for context @@ -66,9 +65,9 @@ The brain almost always has something. External APIs fill gaps, not start from s ### Phase 2: On Every Inbound Signal (READ → ENRICH → WRITE) -Every message, meeting, email, or conversation that references a person or company: +Every message or conversation that references a goal, decision, process, or concept: -1. **Detect entities** — people, companies, deals mentioned +1. **Detect entities** — goals, decisions, processes, concepts mentioned 2. **Load brain pages** — read existing pages for context before responding 3. **Identify new information** — what does this signal tell us that the page doesn't know? 4. **Write it back** — update the brain page with new info + timeline entry + source citation @@ -85,8 +84,8 @@ to the graph (`links` table) with inferred relationship types. Stale links "auto-link" reconciliation. - No manual `add_link` calls needed for ordinary page writes. -- Inferred link types: `attended` (meeting -> person), `works_at`, `invested_in`, - `founded`, `advises`, `source` (frontmatter), `mentions` (default). +- Inferred link types: `uses` (goal -> concept), `decided_in` (decision -> goal), + `depends_on` (process -> concept), `source` (frontmatter), `mentions` (default). - The `put_page` MCP response includes `auto_links: { created, removed, errors }` so the agent can verify outcomes. - To disable: `gbrain config set auto_link false`. Default is on. @@ -95,7 +94,7 @@ to the graph (`links` table) with inferred relationship types. 
Stale links ### Phase 3: On Every Outbound Response (READ → PULL → RESPOND) -Before answering any question about a person, company, or topic: +Before answering any question about a goal, decision, process, or concept: 1. **Check the brain** — read relevant pages 2. **Pull context** — use compiled truth + recent timeline @@ -108,8 +107,8 @@ Don't answer from general knowledge when a brain page exists. This is not a special mode. This is the default. Everything the user says is an ingest event. -- Person mentioned → check brain, create/enrich if needed (spawn background) -- Company mentioned → same +- Goal mentioned → check brain, create/update if needed (spawn background) +- Decision/process/concept mentioned → same - Link shared → ingest it (delegate to idea-ingest) - Data shared → delegate to appropriate skill @@ -144,7 +143,7 @@ the citation is `[gstack:plans/foo]`. That's the whole rule. ## Anti-Patterns -- Answering questions about people/companies without checking the brain first +- Answering questions about goals/decisions/processes/concepts without checking the brain first - Using external APIs before checking the brain - Writing facts without inline `[Source: ...]` citations - Blocking the response to do enrichment From cc163fff63dae68d7657f72da27cabeb665b27d5 Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:07:05 +0000 Subject: [PATCH 07/19] feat: rewrite signal-detector for developer domain knowledge capture Co-Authored-By: Claude Opus 4.6 (1M context) --- skills/signal-detector/SKILL.md | 91 ++++++++++++++++++++++----------- 1 file changed, 60 insertions(+), 31 deletions(-) diff --git a/skills/signal-detector/SKILL.md b/skills/signal-detector/SKILL.md index 5752324c2..8a47a3974 100644 --- a/skills/signal-detector/SKILL.md +++ b/skills/signal-detector/SKILL.md @@ -1,10 +1,11 @@ --- name: signal-detector -version: 1.0.0 +version: 2.0.0 description: | - Always-on ambient signal 
capture. Fires on every inbound message to detect - original thinking and entity mentions. Spawn as a cheap sub-agent in parallel, - never block the main response. + Always-on ambient signal capture for developer knowledge. Fires on every + inbound message to detect goals, decisions, processes, concepts, and + original thinking. Spawn as a cheap sub-agent in parallel, never block + the main response. triggers: - every inbound message (always-on) tools: @@ -17,18 +18,19 @@ tools: mutating: true writes_pages: true writes_to: - - people/ - - companies/ + - goals/ + - decisions/ + - processes/ - concepts/ --- -# Signal Detector — Ambient Brain Capture +# Signal Detector — Developer Knowledge Capture Lightweight sub-agent that fires on every inbound message to capture TWO things with EQUAL priority: -1. **Original thinking** — the user's ideas, observations, theses, frameworks -2. **Entity mentions** — people, companies, media references +1. **Original thinking** — the user's ideas, observations, frameworks +2. **Developer knowledge signals** — goals, decisions, processes, concepts Original thinking is AT LEAST as valuable as entity extraction. Ideas are the intellectual capital. Entities are bookkeeping. Both compound over time. @@ -39,15 +41,15 @@ This skill guarantees: - Fires on every message (no exceptions unless purely operational) - Runs in parallel (spawned, never blocks main response) - Captures ideas with the user's EXACT phrasing (no paraphrasing) -- Detects entity mentions and creates/enriches brain pages +- Detects developer knowledge signals and creates/enriches brain pages - Logs a one-line summary of what was captured - Back-links all entity mentions (Iron Law) - Citations on every fact written > **Convention:** See `skills/conventions/quality.md` for Iron Law back-linking. -Every time this skill creates or updates a brain page that mentions a person or company: -1. 
Check if that person/company has a brain page +Every time this skill creates or updates a brain page that mentions another entity: +1. Check if that entity has a brain page 2. If yes → add a back-link FROM their page TO the page you just created/updated 3. Format: `- **YYYY-MM-DD** | Referenced in [page title](path) — brief context` 4. An unlinked mention is a broken brain. @@ -57,35 +59,60 @@ Every time this skill creates or updates a brain page that mentions a person or ### Phase 1: Idea/Observation Detection (PRIMARY) When the user expresses a novel thought, observation, thesis, or framework: -- If it's the user's **original thinking** (they generated it) → create/update `originals/{slug}` -- If it's a **world concept** they're referencing → create/update `concepts/{slug}` -- If it's a **product or business idea** → create/update `ideas/{slug}` +- If it's the user's **original thinking** (they generated it) → create/update `concepts/{slug}` +- If it's a **reusable pattern or mental model** → create/update `concepts/{slug}` **Capture exact phrasing.** The user's language IS the insight. Don't paraphrase. -**Cross-linking (MANDATORY):** Every original MUST link to related people, companies, -meetings, and concepts. An original without cross-links is a dead original. +**Cross-linking (MANDATORY):** Every concept MUST link to related goals, decisions, +and processes. A concept without cross-links is a dead concept. -### Phase 2: Entity Detection (SECONDARY) +### Phase 2: Developer Knowledge Detection (SECONDARY) -1. Extract entity mentions (people, companies, media titles) -2. For each entity: - - `gbrain search "name"` — does a page exist? - - If NO page → check notability. If notable, create page with enrichment. - - If page exists but THIN → trigger enrich - - If page exists and RICH → no action -3. 
For new FACTS with specific dates → call `gbrain timeline-add ""` +Scan every message for these signals: -**Auto-link (v0.10.1):** When you write/update an originals or ideas page that -references a person or company, the auto-link post-hook on `put_page` -automatically creates the link from the new page to that entity. You don't -need to call `gbrain link` manually. Timeline entries still need explicit calls. +1. **Goal signals** — "set up JWT auth", "migrate to Postgres", "fix the deploy", + any /goal invocation or development task being worked on + - Check brain: `gbrain search "goal name"` + - If no page → create `goals/{slug}` with approach, environment, initial state + - If page exists → update with new progress, debug trails, decisions made + +2. **Decision signals** — "we chose X because Y", "decided to", "tradeoff", + "going with", "ruling out" + - If the decision governs future work beyond this goal → create `decisions/{slug}` + - If the decision is local to the current goal → log on the goal page + - Always record: what was decided, why, what alternatives were considered + +3. **Process signals** — "to deploy, you need to", "the workflow is", "steps to", + "how to set up", repeatable sequences + - Create `processes/{slug}` with preconditions, steps, verification + - Only if the process is reproducible and handoff-worthy + +4. **Concept signals** — "event sourcing works by", "the repository pattern", + "Docker needs this flag because", tool knowledge, pattern explanations + - Create/update `concepts/{slug}` with context-free reusable understanding + - Must be: reusable, cross-goal, stable, non-procedural + +5. **Debug signals** — "the bug was caused by", "root cause was", "fixed by" + - Add structured timeline entry to the active goal page (NOT a separate page) + - Format: `- **YYYY-MM-DD** | Debug — **Symptom:** X. **Root cause:** Y. **Fix:** Z.` + +For each entity: +- `gbrain search "name"` — does a page exist? 
+- If NO page → check notability (see quality.md). If notable, create with enrichment. +- If page exists but THIN → enrich with new information +- If page exists and RICH → add timeline entry if there's new dated information + +**Auto-link (v0.10.1):** When you write/update a page that references another +entity, the auto-link post-hook on `put_page` automatically creates the graph +edge. You don't need to call `gbrain link` manually. Timeline entries still +need explicit calls. ### Phase 3: Signal Logging Always log a one-line summary: - `Signals: 0 ideas, 0 entities, 0 facts (skipped: operational)` -- `Signals: 1 idea (captured → originals/x), 2 entities (enriched → people/y, companies/z)` +- `Signals: 1 concept (captured → concepts/x), 1 goal (updated → goals/y), 1 decision (created → decisions/z)` This makes the ambient capture loop debuggable. @@ -98,9 +125,11 @@ The output is brain pages created/updated and the signal log line. - Blocking the main response to wait for signal detection to complete - Paraphrasing the user's original thinking instead of capturing exact phrasing -- Creating pages for non-notable entities (one-off mentions) +- Creating pages for non-notable entities (one-off mentions, sub-steps) - Skipping back-links after creating/updating pages - Running on purely operational messages ("ok", "thanks", "do it") +- Creating a separate page for debug trails (they go on the goal page) +- Filing a concept that's really a process (if it has steps, it's a process) ## Tools Used From 3b7fb1c8e8c6c99967ad601c3c64ba9953f07752 Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:08:28 +0000 Subject: [PATCH 08/19] feat: rewrite filing rules for developer domain taxonomy Co-Authored-By: Claude Opus 4.6 (1M context) --- skills/_brain-filing-rules.json | 23 ++++- skills/_brain-filing-rules.md | 155 +++++++++++--------------------- 2 files changed, 75 insertions(+), 103 deletions(-) diff 
--git a/skills/_brain-filing-rules.json b/skills/_brain-filing-rules.json index f266d421c..989c26468 100644 --- a/skills/_brain-filing-rules.json +++ b/skills/_brain-filing-rules.json @@ -33,6 +33,24 @@ "examples": ["mental models", "theses", "frameworks"], "description": "A reusable idea, framework, or mental model not tied to a specific person/company." }, + { + "kind": "goal", + "directory": "goals/", + "examples": ["development tasks", "/goal executions", "debug sessions"], + "description": "One /goal execution arc: what was attempted, what happened, decisions made, debug trails, what was learned. The primary authoring unit — capture here first, promote out when reusable." + }, + { + "kind": "decision", + "directory": "decisions/", + "examples": ["architecture choices", "tool selections", "tradeoff resolutions"], + "description": "A durable technical choice that governs future work beyond one goal. ADR-style: context, options considered, decision, consequences." + }, + { + "kind": "process", + "directory": "processes/", + "examples": ["deploy workflows", "setup procedures", "migration runbooks"], + "description": "A canonical reproducible procedure that is handoff-worthy. Graduates to a skill file after 2-3 successful reuses with only argument changes." + }, { "kind": "project", "directory": "projects/", @@ -159,7 +177,10 @@ "wiki/originals/*", "wiki/personal/patterns/*", "wiki/people/*", - "dream-cycle-summaries/*" + "dream-cycle-summaries/*", + "goals/*", + "decisions/*", + "processes/*" ] } } diff --git a/skills/_brain-filing-rules.md b/skills/_brain-filing-rules.md index bebee8d14..694950223 100644 --- a/skills/_brain-filing-rules.md +++ b/skills/_brain-filing-rules.md @@ -7,29 +7,45 @@ not the source, not the skill that's running. ## Decision Protocol -1. Identify the primary subject (a person? company? concept? policy issue?) +1. Identify the primary subject (a goal? decision? process? concept?) 2. File in the directory that matches the subject 3. 
Cross-link from related directories 4. When in doubt: what would you search for to find this page again? +## Operational Rule + +Capture everything in `goals/` first. Promote out only when reusable: +- `decision` — if the choice should constrain other goals +- `process` — if it's reproducible and handoff-worthy +- `concept` — if it generalizes beyond the specific case + ## Common Misfiling Patterns -- DO NOT DO THESE | Wrong | Right | Why | |-------|-------|-----| -| Analysis of a topic -> `sources/` | -> appropriate subject directory | sources/ is for raw data only | -| Article about a person -> `sources/` | -> `people/` | Primary subject is a person | -| Meeting-derived company info -> `meetings/` only | -> ALSO update `companies/` | Entity propagation is mandatory | -| Research about a company -> `sources/` | -> `companies/` | Primary subject is a company | -| Reusable framework/thesis -> `sources/` | -> `concepts/` | It's a mental model | -| Tweet thread about policy -> `media/` | -> `civic/` or `concepts/` | media/ is for content ops | +| Local decision on goal page → `decisions/` | Keep on `goals/` page | Only durable cross-goal choices go to decisions/ | +| One-off command sequence → `processes/` | Keep on `goals/` page | processes/ is for repeatable, handoff-worthy workflows | +| Project-specific config note → `concepts/` | Keep on `goals/` page | concepts/ is for context-free reusable knowledge | +| Reusable pattern buried in goal page | → `concepts/` | If it applies to more than one goal, promote it | +| Debug trail → separate page | → timeline entry on `goals/` page | Debug trails are structured timeline entries, not pages | +| A series of steps → `concepts/` | → `processes/` | If it has steps, it's a process | + +## MECE Boundaries (hard rules) + +| Pair | Boundary | +|------|----------| +| goals/ vs decisions/ | goals: what happened in one execution run. 
decisions: durable choice meant to govern future goals | +| goals/ vs processes/ | goals: narrative + debug trail. processes: canonical reproducible procedure (no session story) | +| goals/ vs concepts/ | goals: applied, context-bound. concepts: context-free reusable understanding | +| decisions/ vs processes/ | decisions: what/why we chose. processes: how to execute | +| decisions/ vs concepts/ | decisions: committed policy for a scope. concepts: explanatory model, no commitment | +| processes/ vs concepts/ | processes: stepwise action. concepts: theory/pattern vocabulary | ## Sanctioned exception: synthesis output is sui generis The "file by primary subject" rule is for raw ingest. Synthesized output that -is one-of-one to a single source AND a specific reader (a personalized book -mirror, a strategic-reading playbook tied to one problem) does not fit any -subject directory cleanly: filing by topic loses the "this is the book" -dimension; filing by author muddles authorship pages with synthesis pages. +is one-of-one to a single source AND a specific reader does not fit any +subject directory cleanly. Format-prefixed paths under `media//` are the sanctioned exception: @@ -37,33 +53,29 @@ exception: - `media/books/-personalized.md` (book-mirror output) - `media/articles/-personalized.md` (long-form article personalization) -If you find yourself wanting `media//` for raw ingest, that is still -the anti-pattern in the table above. The exception is narrow: synthesized, -one-of-one, sui generis to a single source. - ## What `sources/` Is Actually For `sources/` is ONLY for: - Bulk data imports (API dumps, CSV exports, snapshots) -- Raw data that feeds multiple brain pages (e.g., a guest export, contact sync) +- Raw data that feeds multiple brain pages - Periodic captures (quarterly snapshots, sync exports) -If the content has a clear primary subject (a person, company, concept, policy -issue), it does NOT go in sources/. Period. 
+If the content has a clear primary subject (a goal, decision, process, concept), +it does NOT go in sources/. Period. ## Notability Gate Not everything deserves a brain page. Before creating a new entity page: -- **People:** Will you interact with them again? Are they relevant to your work? -- **Companies:** Are they relevant to your work or interests? -- **Concepts:** Is this a reusable mental model worth referencing later? -- **When in doubt, DON'T create.** A missing page can be created later. - A junk page wastes attention and degrades search quality. +- **Goals:** Is this a distinct execution arc? (Not a sub-step of an existing goal) +- **Decisions:** Does this choice govern future work beyond the current goal? +- **Processes:** Is this repeatable and handoff-worthy? (Not a one-off sequence) +- **Concepts:** Reusable across goals? Stable? Non-procedural? +- **When in doubt, DON'T create.** Capture on the goal page first. Promote later. ## Iron Law: Back-Linking (MANDATORY) -Every mention of a person or company with a brain page MUST create a back-link -FROM that entity's page TO the page mentioning them. This is bidirectional: +Every mention of an entity with a brain page MUST create a back-link +FROM that entity's page TO the page mentioning it. This is bidirectional: the new page links to the entity, AND the entity's page links back. Format for back-links (append to Timeline or See Also): @@ -99,94 +111,33 @@ Every ingested item should have its raw source preserved for provenance. - **< 100 MB text/PDF**: stays in the brain repo (git-tracked) in a `.raw/` sidecar directory alongside the brain page - **>= 100 MB OR media files** (video, audio, images): uploaded to cloud - storage (Supabase Storage, S3, etc.) with a `.redirect.yaml` pointer left - in the brain repo. Files >= 100 MB use TUS resumable upload (6 MB chunks - with retry) for reliability. 
- -**Upload command:** -```bash -gbrain files upload-raw --page --type -``` -Returns JSON: `{storage: "git"}` for small files, `{storage: "supabase", storagePath, reference}` for cloud. - -**The `.redirect.yaml` pointer format:** -```yaml -target: supabase://brain-files/page-slug/filename.mp4 -bucket: brain-files -storage_path: page-slug/filename.mp4 -size: 524288000 -size_human: 500 MB -hash: sha256:abc123... -mime: video/mp4 -uploaded: 2026-04-11T... -type: transcript -``` - -**Accessing stored files:** -```bash -gbrain files signed-url # Generate 1-hour signed URL -gbrain files restore # Download back to local -``` - -This ensures any derived brain page can be traced back to its original source, -and large files don't bloat the git repo. + storage with a `.redirect.yaml` pointer left in the brain repo. ## Dream-cycle synthesize / patterns directories (v0.23) The `synthesize` and `patterns` phases of `gbrain dream` write to a **fixed allow-list** of paths sourced from `_brain-filing-rules.json`'s `dream_synthesize_paths.globs` array. Editing that JSON is the ONLY way -to add a new directory the synthesis subagent may write to: - -| Output type | Slug pattern | What goes here | -|-------------|--------------|----------------| -| Reflection | `wiki/personal/reflections/YYYY-MM-DD--` | Self-knowledge, emotional processing, pattern recognition. Verbatim quotes from the user, with analysis. | -| Original idea | `wiki/originals/ideas/YYYY-MM-DD--` | New frames, theses, mental models, "conceptive ideologist" outputs. Capture the user's exact phrasing — that's the artifact. | -| People enrichment | `wiki/people/` | Timeline entries appended to existing people pages from session mentions. Stub pages for new substantive people. | -| Pattern | `wiki/personal/patterns/` | Cross-session theme detected across ≥3 reflections. Highest-leverage output: a pattern can span 25 years if reflections reference dated content. 
| -| Cycle summary | `dream-cycle-summaries/YYYY-MM-DD` | Index of every page produced by one dream cycle. Auto-written deterministically by the orchestrator. | - -**Iron Law for synthesize output:** -1. Quote the user verbatim. Do not paraphrase memorable phrasings. -2. Cross-reference compulsively: every new page MUST link to existing brain content. -3. Slug discipline: lowercase alphanumeric and hyphens only, slash-separated. NO underscores, NO file extensions. -4. Edited transcripts produce NEW slugs (content-hash suffix changes) — never silently overwrite a prior reflection. +to add a new directory the synthesis subagent may write to. + +## Brain-to-skill promotion pipeline + +When a process proves repeatable (2-3 times with only argument changes), +it graduates from a `processes/` brain page to an actual skill file: + +- Brain stores: context, evidence, tradeoffs, project-specific constraints, debug history +- Skill files store: stable, parameterized procedures with deterministic steps +- Promotion rule: if reused successfully 2-3 times with only argument changes, graduate to a skill +- Bidirectional links: process page links to skill file path, skill references source brain pages ## Takes attribution (v0.32+) When writing a `` fence, the **holder** column says -WHO BELIEVES the claim, not who it's ABOUT. Cross-modal eval over 100K -production takes scored attribution at 6.5/10 — holder/subject confusion was -the #1 error. These six rules are the contract. Long form with worked -examples lives in `docs/takes-vs-facts.md`. +WHO BELIEVES the claim, not who it's ABOUT. 1. **Holder ≠ subject.** The test: did this person SAY or CLEARLY IMPLY this? - - YES → `holder = people/` - - NO, it's your analysis OF them → `holder = brain` - - Example: "Garry has a hero/rescuer pattern" → `holder=brain` (analysis ABOUT Garry, not stated BY Garry) 2. **Atomic claims.** Split compound rows into separate rows. One claim per row. 3. 
**Amplification ≠ endorsement.** A retweet-only signal caps at `weight 0.55`. - The user shared something; they didn't necessarily endorse every clause. -4. **Self-reported ≠ verified.** "Saif reports 7 figures" → `holder=people/saif`, - `weight=0.75`, NOT `holder=world/1.0`. Self-report is a strong individual - signal, not consensus fact. -5. **No false precision.** Use 0.05 increments only (`0.35`, `0.55`, `0.75`). - `0.74` and `0.82` imply calibration accuracy that doesn't exist. The engine - layer rounds on insert — match the grid in your fence and avoid the warning. -6. **"So what" test.** Skip metadata-style trivia (Twitter handles, follower - counts, obvious bio fields). A take has to be load-bearing for some future - query. - -**Holder format (enforced as a parser warning in v0.32, error in v0.33+):** -- `world` (consensus fact, no individual claimant) -- `brain` (AI-inferred, holder genuinely ambiguous) -- `people/` (individual's stated belief) -- `companies/` (institutional fact, no individual claimant) - -Slugs use the standard grammar (`[a-z0-9._-]+`). `Garry`, `people/Garry-Tan`, -and `world/garry-tan` all fail validation. - -**Founder-describing-own-company rule.** When a founder describes their own -company, the holder is the FOUNDER, not the company. "We can hit $10M ARR" -said by Bo Lu → `holder=people/bo-lu`, NOT `holder=companies/clipboard-health`. -Companies don't speak; their employees do. +4. **Self-reported ≠ verified.** Self-report → `weight=0.75`, not `holder=world/1.0`. +5. **No false precision.** Use 0.05 increments only. +6. **"So what" test.** Skip metadata-style trivia. 
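Reviewer note: rules 3 and 5 of the takes attribution list above are plain arithmetic, so a short sketch may help when checking fence rows by hand. The helper below is hypothetical and not part of gbrain; the name `normalizeTakeWeight`, the `amplificationOnly` flag, and the clamp to the range 0 to 1 are all assumptions made for illustration. It snaps a weight to the 0.05 grid and caps amplification-only signals at 0.55.

```typescript
// Hypothetical helper, NOT a gbrain API: illustrates takes rules 3 and 5 only.
const GRID_STEPS_PER_UNIT = 20; // 1 / 0.05, i.e. the 0.05 grid from rule 5
const AMPLIFICATION_CAP = 0.55; // retweet-only ceiling from rule 3

function normalizeTakeWeight(weight: number, amplificationOnly = false): number {
  if (!Number.isFinite(weight)) {
    throw new RangeError(`weight must be a finite number, got ${weight}`);
  }
  // Assumption: weights live in [0, 1]; out-of-range input is clamped.
  const clamped = Math.min(1, Math.max(0, weight));
  // Multiply, round, then divide; multiplying a step count by 0.05 directly
  // can leave floating-point residue such as 0.7500000000000001.
  const snapped = Math.round(clamped * GRID_STEPS_PER_UNIT) / GRID_STEPS_PER_UNIT;
  // Amplification-only signals never exceed the cap, whatever the input was.
  return amplificationOnly ? Math.min(snapped, AMPLIFICATION_CAP) : snapped;
}
```

Under these assumptions, an off-grid weight such as `0.74` lands on `0.75`, and a retweet-only take submitted at `0.9` comes back as `0.55`. Holder selection (rules 1 and 4) is a judgment call about who stated the claim and is deliberately left out of the sketch.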
From 3fff8733897725b16e11412defca2c034a5ebbd5 Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:09:37 +0000 Subject: [PATCH 09/19] feat: update RESOLVER.md disambiguation for developer domain Co-Authored-By: Claude Opus 4.6 (1M context) --- skills/RESOLVER.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/skills/RESOLVER.md b/skills/RESOLVER.md index ffeea3d70..def01b6aa 100644 --- a/skills/RESOLVER.md +++ b/skills/RESOLVER.md @@ -15,7 +15,7 @@ This is the dispatcher. Skills are the implementation. **Read the skill file bef |---------|-------| | "What do we know about", "tell me about", "search for", "who is", "background on", "notes on" | `skills/query/SKILL.md` | | "Who knows who", "relationship between", "connections", "graph query" | `skills/query/SKILL.md` (use graph-query) | -| Creating/enriching a person or company page | `skills/enrich/SKILL.md` | +| Creating/enriching a goal, decision, process, or concept page | `skills/enrich/SKILL.md` | | Where does a new file go? Filing rules | `skills/repo-architecture/SKILL.md` | | Fix broken citations in brain pages | `skills/citation-fixer/SKILL.md` | | "citation audit", "check citations", "fix citations" | `skills/citation-fixer/SKILL.md` (focused fix). For broader brain health, chain into `skills/maintain/SKILL.md` | @@ -27,7 +27,7 @@ This is the dispatcher. Skills are the implementation. 
**Read the skill file bef | Trigger | Skill | |---------|-------| -| User shares a link, article, tweet, or idea | `skills/idea-ingest/SKILL.md` | +| User shares a link, article, or idea | `skills/idea-ingest/SKILL.md` | | "watch this video", "process this YouTube link", "ingest this PDF", "save this podcast", "process this book", "summarize this book", "PDF book", "ingest it into my brain", "what's in this screenshot", "check out this repo" | `skills/media-ingest/SKILL.md` | | Meeting transcript received | `skills/meeting-ingestion/SKILL.md` | | Generic "ingest this" (auto-routes to above) | `skills/ingest/SKILL.md` | @@ -57,7 +57,7 @@ This is the dispatcher. Skills are the implementation. **Read the skill file bef | "Skillify this", "is this a skill?", "make this proper" | `skills/skillify/SKILL.md` | | "Compress my resolver", "AGENTS.md too large", "RESOLVER.md too big", "functional area dispatcher", "shrink routing table" | `skills/functional-area-resolver/SKILL.md` | | "Is gbrain healthy?", morning health check, skillpack-check | `skills/skillpack-check/SKILL.md` | -| Post-restart health + auto-fix, "did the container restart break anything", smoke test | `skills/smoke-test/SKILL.md` | +| Post-restart health + auto-fix, smoke test | `skills/smoke-test/SKILL.md` | | Cross-modal review, second opinion | `skills/cross-modal-review/SKILL.md` | | "Validate skills", skill health check | `skills/testing/SKILL.md` | | Webhook setup, external event processing | `skills/webhook-transforms/SKILL.md` | @@ -94,7 +94,7 @@ This is the dispatcher. Skills are the implementation. **Read the skill file bef When multiple skills could match: 1. Prefer the most specific skill (meeting-ingestion over ingest) 2. If the user mentions a URL, route by content type (link → idea-ingest, video → media-ingest) -3. If the user mentions a person/company, check if enrich or query fits better +3. If the user mentions a goal/decision/process/concept, check if enrich or query fits better 4. 
Chaining is explicit in each skill's Phases section 5. When in doubt, ask the user (see `skills/ask-user/SKILL.md` for the choice-gate pattern) @@ -122,4 +122,3 @@ These apply to ALL brain-writing skills: | "verify this academic claim", "check this study", "academic verify", "validate citation", "is this study real" | `skills/academic-verify/SKILL.md` | | "make pdf from brain", "brain pdf", "convert brain page to pdf", "publish this page as pdf", "export brain page" | `skills/brain-pdf/SKILL.md` | | "voice note", "ingest this voice memo", "transcribe and file", "voice note ingest", "save this audio note" | `skills/voice-note-ingest/SKILL.md` | - From ba754660e74b4a854fa8564e3c9a47b69407dcd9 Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:09:59 +0000 Subject: [PATCH 10/19] feat: update brain-first.md entity conventions for developer domain Co-Authored-By: Claude Opus 4.6 (1M context) --- skills/conventions/brain-first.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/skills/conventions/brain-first.md b/skills/conventions/brain-first.md index a196cc4b0..90e4cb02f 100644 --- a/skills/conventions/brain-first.md +++ b/skills/conventions/brain-first.md @@ -1,6 +1,6 @@ # Brain-First Lookup Convention -**Read this before doing ANY entity/person/company/fact lookup.** +**Read this before doing ANY entity/goal/decision/process/concept lookup.** Sub-agents and fresh sessions inherit gbrain tools but not the knowledge of when and how to use them. This file is that knowledge. 
@@ -56,15 +56,13 @@ Standard directory structure: | Directory | Type | Example | |-----------|------|---------| -| `people/` | person | `people/paul-graham.md` | -| `companies/` | company | `companies/stripe.md` | -| `deals/` | deal | `deals/stripe-series-c.md` | -| `meetings/` | meeting | `meetings/2026-04-23-weekly-sync.md` | -| `projects/` | project | `projects/gbrain.md` | -| `yc/` | yc | `yc/batch-w26.md` | +| `goals/` | goal | `goals/setup-jwt-auth.md` | +| `decisions/` | decision | `decisions/chose-postgres-over-sqlite.md` | +| `processes/` | process | `processes/deploy-to-production.md` | +| `concepts/` | concept | `concepts/event-sourcing.md` | When creating new pages, include proper frontmatter with `type`, `title`, -and `tags` fields. +and `tags` fields. See `skills/_brain-filing-rules.md` for page templates. ## When Spawning Further Sub-agents From 8a69e923265216cd21db5a69364d777ec059f342 Mon Sep 17 00:00:00 2001 From: chapter37haptics <249148637+chapter37haptics@users.noreply.github.com> Date: Fri, 15 May 2026 03:18:16 +0000 Subject: [PATCH 11/19] chore: rebuild llms.txt after domain customization changes Co-Authored-By: Claude Opus 4.6 (1M context) --- llms-full.txt | 1379 +++---------------------------------------------- 1 file changed, 63 insertions(+), 1316 deletions(-) diff --git a/llms-full.txt b/llms-full.txt index 36979515d..c76ce072d 100644 --- a/llms-full.txt +++ b/llms-full.txt @@ -140,393 +140,37 @@ strict behavior when unset. ## Key files -- `src/core/operations.ts` — Contract-first operation definitions (the foundation). Also exports upload validators: `validateUploadPath`, `validatePageSlug`, `validateFilename`, plus `matchesSlugAllowList(slug, prefixes)` (v0.23 glob matcher: `/*` matches recursive children; bare `` matches exact only). `OperationContext.remote` flags untrusted callers; `OperationContext.allowedSlugPrefixes` (v0.23) is the trusted-workspace allow-list set by the dream cycle. 
`put_page` enforces: when `viaSubagent` and `allowedSlugPrefixes` is set, slug must match the allow-list; else the legacy `wiki/agents//...` namespace check applies. Auto-link enabled for trusted-workspace writes (skipped only when `remote=true && !trustedWorkspace`). As of v0.26.0, every `Operation` also carries `scope?: 'read' | 'write' | 'admin'` + `localOnly?: boolean`. All ops are annotated; `sync_brain`, `file_upload`, `file_list`, and `file_url` are `admin + localOnly` (rejected over HTTP). `OperationContext.auth?: AuthInfo` is threaded through HTTP dispatch for scope enforcement in `serve-http.ts` before the op runs. **v0.26.9 (D12 + F7b):** `OperationContext.remote` is now a REQUIRED field in the TypeScript type — the compiler is the first defense against transports that forget to set it. Four trust-boundary call sites (`put_page` allowlist, file_upload trust-narrowing, submit_job protected-name guard, auto-link skip) flipped from falsy-default (`!ctx.remote`) to fail-closed semantics (`ctx.remote === false` for "trusted-only" sites and `ctx.remote !== false` for "untrust unless explicit-false"). Anything that isn't strictly `false` is now treated as remote. Closed an HTTP MCP shell-job RCE: a `read+write`-scoped OAuth token could submit `shell` jobs because the HTTP request handler's literal context skipped `remote: true` and `submit_job`'s protected-name guard saw a falsy undefined. Stdio MCP set the field correctly via dispatch.ts; HTTP inlined a parallel context-builder for several releases and lost it. -- `src/core/engine.ts` — Pluggable engine interface (BrainEngine). `clampSearchLimit(limit, default, cap)` takes an explicit cap so per-operation caps can be tighter than `MAX_SEARCH_LIMIT`. Exports `LinkBatchInput` / `TimelineBatchInput` for the v0.12.1 bulk-insert API (`addLinksBatch` / `addTimelineEntriesBatch`). 
As of v0.13.1, `BrainEngine` has a `readonly kind: 'postgres' | 'pglite'` discriminator so migrations (`src/core/migrate.ts`) and other consumers can branch on engine without `instanceof` + dynamic imports. **v0.29:** four new methods — `batchLoadEmotionalInputs(slugs?)` (CTE-shaped read with per-table aggregates so a page × N tags × M takes never produces N×M rows), `setEmotionalWeightBatch(rows)` (`UPDATE FROM unnest($1::text[], $2::text[], $3::real[])` composite-keyed on `(slug, source_id)` for multi-source safety), `getRecentSalience(opts)`, `findAnomalies(opts)`. `PageFilters` extended with `sort?: 'updated_desc' | 'updated_asc' | 'created_desc' | 'slug'` + `PAGE_SORT_SQL` whitelist consumed by both engines (was hardcoded `ORDER BY updated_at DESC`). **v0.32.8 (PR #860):** new `listAllPageRefs(): Promise>` ordered by `(source_id, slug)`. Cheap cross-source enumeration for hot loops on large brains — replaces the `getAllSlugs()→getPage(slug)` N+1 pattern in extract-takes, extract, integrity, which silently defaulted to `source_id='default'` for non-default-source pages. Implementation parity across postgres-engine.ts + pglite-engine.ts. Pinned by `test/e2e/multi-source-bug-class.test.ts`. +> Only files relevant to the customized-domain spec are listed here. +> For the full key-files catalog, see the main CLAUDE.md. + +- `src/core/operations.ts` — Contract-first operation definitions (the foundation). `put_page` enforces slug allow-lists when `viaSubagent` and `allowedSlugPrefixes` is set. Auto-link enabled for trusted-workspace writes (skipped only when `remote=true && !trustedWorkspace`). +- `src/core/engine.ts` — Pluggable engine interface (BrainEngine). `clampSearchLimit(limit, default, cap)` takes an explicit cap so per-operation caps can be tighter than `MAX_SEARCH_LIMIT`. Exports `LinkBatchInput` / `TimelineBatchInput` for the bulk-insert API (`addLinksBatch` / `addTimelineEntriesBatch`). 
`BrainEngine` has a `readonly kind: 'postgres' | 'pglite'` discriminator. - `src/core/engine-factory.ts` — Engine factory with dynamic imports (`'pglite'` | `'postgres'`) -- `src/core/pglite-engine.ts` — PGLite (embedded Postgres 17.5 via WASM) implementation, all 40 BrainEngine methods. `addLinksBatch` / `addTimelineEntriesBatch` use multi-row `unnest()` with manual `$N` placeholders. As of v0.13.1, `connect()` wraps `PGlite.create()` in a try/catch that emits an actionable error naming the macOS 26.3 WASM bug (#223) and pointing at `gbrain doctor`; the lock is released on failure so the next process can retry cleanly. v0.22.0: `searchKeyword` and `searchKeywordChunks` multiply `ts_rank` by the source-factor CASE expression at the chunk-grain level; `searchVector` becomes a two-stage CTE — inner CTE keeps `ORDER BY cc.embedding <=> vec` so HNSW stays usable, outer SELECT re-ranks by `raw_score * source_factor`. Inner LIMIT scales with offset to preserve pagination contract. As of v0.22.6.1, `initSchema()` calls `applyForwardReferenceBootstrap()` BEFORE replaying SCHEMA_SQL — probes for the specific forward-referenced state the embedded schema blob needs (`pages.source_id`, `links.link_source`, `links.origin_page_id`, `content_chunks.symbol_name`, `content_chunks.language`, `sources` FK target table) and adds only what's missing. Closes the upgrade-wedge bug class that bit users 10+ times across 6 schema versions over 2 years (#239/#243/#266/#357/#366/#374/#375/#378/#395/#396). No-op on fresh installs and modern brains. -- `src/core/pglite-schema.ts` — PGLite-specific DDL (pgvector, pg_trgm, triggers) -- `src/core/postgres-engine.ts` — Postgres + pgvector implementation (Supabase / self-hosted). `addLinksBatch` / `addTimelineEntriesBatch` use `INSERT ... SELECT FROM unnest($1::text[], ...) JOIN pages ON CONFLICT DO NOTHING RETURNING 1` — 4-5 array params regardless of batch size, sidesteps the 65535-parameter cap. 
As of v0.12.3, `searchKeyword` / `searchVector` scope `statement_timeout` via `sql.begin` + `SET LOCAL` so the GUC dies with the transaction instead of leaking across the pooled postgres.js connection (contributed by @garagon). `getEmbeddingsByChunkIds` uses `tryParseEmbedding` so one corrupt row skips+warns instead of killing the query. v0.22.0: `searchKeyword`, `searchKeywordChunks`, and `searchVector` apply source-aware ranking by inlining the source-factor CASE and `NOT (col LIKE …)` hard-exclude clause from `src/core/search/sql-ranking.ts`. `searchVector` switches to a two-stage CTE (HNSW-safe inner ORDER BY, source-boost re-rank in the outer SELECT) and carries `p.source_id` through inner→outer for v0.18 multi-source callers. v0.22.1 (#406): `_savedConfig` retains the connect config; `reconnect()` tears down + recreates the pool from saved config (called by supervisor watchdog after 3 consecutive health-check failures). `executeRaw` is a single-statement passthrough — no per-call retry (D3 dropped that as unsound for non-idempotent statements; recovery is supervisor-driven). v0.22.1 (#363, contributed by @orendi84): `connect()` applies `resolveSessionTimeouts()` from `db.ts` as connection-time startup parameters (`statement_timeout`, `idle_in_transaction_session_timeout`) so orphan pgbouncer backends can't hold locks for hours. v0.22.1 (#409, contributed by @atrevino47): `countStaleChunks()` + `listStaleChunks()` server-side-filter on `embedding IS NULL` for `embed --stale`, eliminating ~76 MB/call client-side pull on a fully-embedded brain; `upsertChunks()` resets both `embedding` AND `embedded_at` to NULL when chunk_text changes without a new embedding (consistency). As of v0.22.6.1, `initSchema()` calls `applyForwardReferenceBootstrap()` BEFORE replaying SCHEMA_SQL on the same forward-reference probe set as the PGLite engine, so old Postgres brains pinned at v0.13/v0.18/v0.19 walk forward cleanly instead of wedging on `column "..." does not exist`. 
**v0.28.1:** `disconnect()` is now idempotent. New `_connectionStyle` instance field tracks whether the engine owns its pool (worker engines) or shares the module-level singleton; second call on an instance-pool engine is a no-op rather than falling through to `db.disconnect()` and clobbering the singleton. Pinned by `test/e2e/postgres-engine-disconnect-idempotency.test.ts` (2 cases). Closes the bug class where any test sharing an engine across multiple `worker.start()` / `worker.stop()` cycles silently broke its own DB connectivity. -- `src/core/cjk.ts` (v0.32.7 CJK wave) — Single source of truth for CJK detection across the codebase. Exports `CJK_RANGES_REGEX`, `CJK_SLUG_CHARS` (character-class fragment for embedding inside other regexes), `CJK_SENTENCE_DELIMITERS` (`。!?`), `CJK_CLAUSE_DELIMITERS` (`;:,、`), `CJK_DENSITY_THRESHOLD = 0.30`, `hasCJK(s)`, `countCJKAwareWords(s)` (30% density threshold — English docs with one Japanese term stay whitespace-tokenized; Chinese-dominant docs get char-counted), and `escapeLikePattern(s)` (escapes `%`, `_`, `\\` for `ILIKE ... ESCAPE '\\'`). Replaces the inline hasCJK regex previously duplicated at `expansion.ts:58`. BMP-only ranges (Han / Hiragana / Katakana / Hangul Syllables); widening to Unicode property escapes is a v0.33+ TODO. Consumers: `expansion.ts`, `sync.ts:slugifySegment`, `operations.ts:validatePageSlug + validateFilename`, `chunkers/recursive.ts:countWords + DELIMITERS`, `pglite-engine.ts:searchKeyword + searchKeywordChunks`. -- `src/core/audit-slug-fallback.ts` (v0.32.7 CJK wave) — Weekly ISO-week-rotated audit JSONL at `~/.gbrain/audit/slug-fallback-YYYY-Www.jsonl`. `logSlugFallback(slug, sourcePath)` fires when `importFromFile` falls back to a frontmatter slug because `slugifyPath` returned empty (emoji / Thai / Arabic / non-CJK exotic-script filenames). `readRecentSlugFallbacks(days)` reads the last N days for `gbrain doctor`'s `slug_fallback_audit` check. 
Honors `GBRAIN_AUDIT_DIR` via the shared `resolveAuditDir()` from shell-audit.ts. Separate surface from `sync-failures.jsonl` per codex outside-voice review — that file carries bookmark-gating semantics that info events shouldn't trigger. -- `src/core/embedding-pricing.ts` (v0.32.7 CJK wave) — `EMBEDDING_PRICING` map keyed `provider:model` for the post-upgrade reindex cost estimate. Sibling to `anthropic-pricing.ts`. Entries: OpenAI text-embedding-3-large ($0.13/1M), 3-small ($0.02/1M), ada-002 ($0.10/1M), Voyage 3-large ($0.18/1M), 3 ($0.06/1M). `lookupEmbeddingPrice(modelString)` returns a tagged union (`known` with price + `unknown` with provider name); `estimateCostFromChars(charCount, pricePerMTok)` uses 3.5 chars/token approximation. Unknown providers degrade gracefully to "estimate unavailable" instead of fabricating numbers. -- `src/core/post-upgrade-reembed.ts` (v0.32.7 CJK wave) — Pure functions backing the `gbrain upgrade` chunker-bump cost prompt. `computeReembedEstimate(engine, model)` queries real SQL (`COUNT(*)` + `COALESCE(SUM(LENGTH(compiled_truth)) + SUM(LENGTH(timeline)), 0)`) on `pages WHERE chunker_version < MARKDOWN_CHUNKER_VERSION`. `formatReembedPrompt(est, graceSeconds)` is the stderr-line formatter. `runPostUpgradeReembedPrompt(engine, model, opts)` orchestrates the 10-second Ctrl-C window; TTY-only wait (non-TTY auto-proceeds for CI / cron); `GBRAIN_NO_REEMBED=1` bails out with a doctor-warning marker; `GBRAIN_REEMBED_GRACE_SECONDS=0` skips the wait. -- `src/commands/reindex.ts` (v0.32.7 CJK wave) — `gbrain reindex --markdown [--limit N] [--dry-run] [--json] [--no-embed] [--repo PATH]`. Walks `pages WHERE page_kind = 'markdown' AND chunker_version < MARKDOWN_CHUNKER_VERSION` in 100-row batches, ordered by id. Rows with non-null `source_path` re-import via `importFromFile`; rows without fall back to `importFromContent` against the stored `compiled_truth`. 
**Both paths pass `forceRechunk: true`** to bypass `importFromContent`'s `content_hash` short-circuit — without that flag (codex post-merge F1), the chunker version bump never reaches pages whose source content hasn't changed since last sync, AND master's v0.32.2 stripFactsFence privacy strip never applies to pre-strip chunks. Idempotent — partial-completion re-runs pick up where they left off via id-ordered batches. Wired into `src/commands/upgrade.ts:runPostUpgrade` after `apply-migrations`. -- `src/commands/sync.ts:resolveSlugByPathOrSourcePath` (v0.32.7 CJK wave, codex post-merge F4) — Resolves a slug by `pages.source_path` first (returns the stored slug for frontmatter-fallback pages whose path doesn't derive a slug), then falls back to `resolveSlugForPath(path)`. Threaded into all 4 delete/rename call sites (`performSync`'s un-syncable cleanup at ~:531, deletes at ~:603, rename oldSlug at ~:622). Without this, emoji-only / Thai / Arabic filenames whose slug came from frontmatter would orphan on delete/rename (the delete path would compute the wrong path-derived slug). Best-effort query — pre-migration brains fall through to the legacy path. -- `src/core/utils.ts` — Shared SQL utilities extracted from postgres-engine.ts. Exports `parseEmbedding(value)` (throws on unknown input, used by migration + ingest paths where data integrity matters) and as of v0.12.3 `tryParseEmbedding(value)` (returns `null` + warns once per process, used by search/rescore paths where availability matters more than strictness). **v0.26.9 (D14):** adds `isUndefinedColumnError(err)` predicate — pattern-matches Postgres SQLSTATE 42703 / "column ... does not exist" with engine-driver shape variation tolerated. Replaces bare `catch {}` blocks in `oauth-provider.ts` so genuine errors (lock timeout, network blip, permission denied) propagate while column-missing falls through to the legacy fallback path. Reusable from any future code that needs the same column-existence probe semantics. 
**v0.32.8 (PR #860):** adds `validateSourceId(id)` that throws on anything outside `^[a-z0-9_-]+$`. Used by the per-source disk-layout fix in patterns.ts/synthesize.ts before any `join(brainDir, '.sources', source_id, slug+'.md')` call so source_id can't traverse out of brainDir. `rowToPage` updated to populate the now-required `Page.source_id` field from the SELECT projection (`scripts/check-source-id-projection.sh` enforces that every projection feeding `rowToPage` includes the column). -- `src/core/db.ts` — Connection management, schema initialization. v0.22.1 (#363, contributed by @orendi84): `resolveSessionTimeouts()` returns `statement_timeout` + `idle_in_transaction_session_timeout` (defaults: 5min each, env-overridable via `GBRAIN_STATEMENT_TIMEOUT` / `GBRAIN_IDLE_TX_TIMEOUT` / `GBRAIN_CLIENT_CHECK_INTERVAL`). Both `connect()` (module singleton) and `PostgresEngine.connect()` (worker pool) consume the result via postgres.js's `connection` option, sending GUCs as startup parameters that survive PgBouncer transaction mode (unlike the prior `setSessionDefaults` post-pool SET, kept as a back-compat no-op shim). -- `src/commands/migrate-engine.ts` — Bidirectional engine migration (`gbrain migrate --to supabase/pglite`) - `src/core/import-file.ts` — importFromFile + importFromContent (chunk + embed + tags) -- `src/core/sync.ts` — Pure sync functions (manifest parsing, filtering, slug conversion). v0.22.12 (#500, foundation by @wintermute via #501): `classifyErrorCode(errorMsg)` regex-based classifier with 12 codes (`SLUG_MISMATCH`, `YAML_PARSE`, `YAML_DUPLICATE_KEY`, `MISSING_OPEN`, `MISSING_CLOSE`, `NESTED_QUOTES`, `EMPTY_FRONTMATTER`, `NULL_BYTES`, `INVALID_UTF8`, `STATEMENT_TIMEOUT`, `FILE_TOO_LARGE`, `SYMLINK_NOT_ALLOWED`) plus `UNKNOWN` fallback. `summarizeFailuresByCode(failures)` returns sorted `[{code, count}]`. `code?` optional field on `SyncFailure`; backfilled at ack time on pre-v0.22.12 entries. 
`acknowledgeSyncFailures()` returns `AcknowledgeResult { count, summary }`. Three regexes (`MISSING_OPEN`, `MISSING_CLOSE`, `EMPTY_FRONTMATTER`) broadened to match actual `markdown.ts:159-244` validator message strings, not just the literal code-name prefix. `FILE_TOO_LARGE` covers all three production size sites in `import-file.ts:199, 352, 401`; `SYMLINK_NOT_ALLOWED` covers the rejection at `:347`. Closes the silent-skip pattern that motivated #500. -- `src/core/storage.ts` — Pluggable storage interface (S3, Supabase Storage, local) -- `src/core/storage-config.ts` (v0.22.11) — Storage tiering: `loadStorageConfig` reads `gbrain.yml`, normalizes deprecated keys (`git_tracked` / `supabase_only`) to canonical (`db_tracked` / `db_only`) with once-per-process deprecation warning, and runs `normalizeAndValidateStorageConfig` (auto-fixes missing trailing `/`, throws `StorageConfigError` on tier overlap). Path-segment matcher: `media/x/` does NOT match `media/xerox/foo`. Replaces gray-matter (broken on delimiter-less YAML) with a dedicated parser for the `gbrain.yml` shape. -- `src/core/disk-walk.ts` (v0.22.11) — `walkBrainRepo(repoPath)` returns `Map` from one recursive `readdirSync`. Skips dot-dirs, `node_modules`, non-`.md` files. Used by `gbrain storage status` to replace per-page `existsSync + statSync` (~400K syscalls on 200K-page brains → tens). -- `src/commands/storage.ts` (v0.22.11) — `gbrain storage status [--repo P] [--json]`. Split into pure data (`getStorageStatus`) + JSON formatter + human formatter (ASCII-only per D10) matching the `orphans.ts` pattern. `PageCountsByTier` and `DiskUsageByTier` are distinct nominal types so swaps fail at compile time. -- `gbrain.yml` (brain repo root, v0.22.11) — Optional storage tiering config. Top-level `storage:` section with `db_tracked:` and `db_only:` array-valued keys. 
`gbrain sync` auto-manages `.gitignore` for `db_only` paths on successful sync (skips on dry-run, blocked-by-failures, submodule context, or `GBRAIN_NO_GITIGNORE=1`). `gbrain export --restore-only [--repo P] [--type T] [--slug-prefix S]` repopulates missing `db_only` files from the database. -- `src/core/supabase-admin.ts` — Supabase admin API (project discovery, pgvector check) -- `src/core/file-resolver.ts` — File resolution with fallback chain (local -> .redirect.yaml -> .redirect -> .supabase) -- `src/core/chunkers/` — 3-tier chunking (recursive, semantic, LLM-guided). v0.19.0 adds `code.ts` — tree-sitter-based semantic chunker for 29 languages with embedded-asset WASMs (`src/assets/wasm/`), `@dqbd/tiktoken` cl100k_base tokenizer, small-sibling merging. `CHUNKER_VERSION` constant folded into `importCodeFile`'s `content_hash` so chunker shape changes force clean re-chunks across releases. -- `src/core/errors.ts` (v0.19.0) — `StructuredAgentError` + `buildError` + `serializeError`. Every new v0.19.0 agent-facing surface (code-def, code-refs, usage errors) uses this envelope; matches v0.17.0 `CycleReport.PhaseResult.error` shape. -- `src/assets/wasm/` (v0.19.0) — 36 tree-sitter grammar WASMs + tree-sitter runtime. Committed to the repo so `bun --compile` embeds them deterministically via `import path from ... with { type: 'file' }`. The CI guard `scripts/check-wasm-embedded.sh` fails the build if the compiled binary ever silently falls through to recursive chunks. -- `src/commands/code-def.ts` + `src/commands/code-refs.ts` (v0.19.0) — symbol definition + references lookup. Query `content_chunks.symbol_name` or chunk_text ILIKE with `page_kind='code'` filter. Auto-JSON when stdout is not a TTY (gh-CLI convention). Bypass the standard `searchKeyword` `DISTINCT ON (slug)` collapse so multiple call-sites from the same file surface. -- `src/core/search/` — Hybrid search: vector + keyword + RRF + multi-query expansion + dedup. 
As of v0.22.0, `searchKeyword` / `searchKeywordChunks` / `searchVector` apply source-aware ranking at the SQL layer (curated content like `originals/`, `concepts/`, `writing/` outranks bulk content like `wintermute/chat/`, `daily/`, `media/x/`). `searchVector` uses a two-stage CTE so source-boost re-ranking doesn't kill the HNSW index. Hard-exclude prefixes (`test/`, `archive/`, `attachments/`, `.raw/` by default) filter at retrieval, not post-rank. Both gates honor `detail !== 'high'` so temporal queries surface chat pages normally. -- `src/core/search/intent.ts` — Query intent classifier (entity/temporal/event/general → auto-selects detail level) -- `src/core/search/eval.ts` — Retrieval eval harness: P@k, R@k, MRR, nDCG@k metrics + runEval() orchestrator -- `src/core/search/source-boost.ts` (v0.22.0) — Source-type boost map keyed by slug prefix. `DEFAULT_SOURCE_BOOSTS` (originals/ 1.5, concepts/ 1.3, writing/ 1.4, people/companies/deals/ 1.2, daily/ 0.8, media/x/ 0.7, wintermute/chat/ 0.5) and `DEFAULT_HARD_EXCLUDES` (test/, archive/, attachments/, .raw/). `parseSourceBoostEnv` / `parseHardExcludesEnv` parse comma-separated `prefix:factor` pairs from `GBRAIN_SOURCE_BOOST` / `GBRAIN_SEARCH_EXCLUDE` env vars. `resolveBoostMap` and `resolveHardExcludes` merge defaults + env + caller `SearchOpts.exclude_slug_prefixes`/`include_slug_prefixes`. -- `src/core/search/sql-ranking.ts` (v0.22.0) — Pure SQL string builders. `buildSourceFactorCase(slugColumn, boostMap, detail)` emits a CASE expression with longest-prefix-match wins (returns literal `'1.0'` when `detail === 'high'` for temporal-bypass parity with COMPILED_TRUTH_BOOST). `buildHardExcludeClause(slugColumn, prefixes)` emits `NOT (col LIKE 'p1%' OR col LIKE 'p2%')` — OR-chain wrapped in NOT, NOT `NOT LIKE ALL/ANY` (those quantifiers don't express set-exclusion). LIKE meta-character escape covers all three of `%`, `_`, AND `\` (backslash matters because it's Postgres LIKE's default escape char). 
Single-quote doubling on SQL string literals so injection-style inputs are inert text. -- `src/commands/eval.ts` — `gbrain eval` command: single-run table + A/B config comparison. v0.25.0 adds sub-subcommand dispatch on `args[0]` so `gbrain eval export` + `gbrain eval prune` + `gbrain eval replay` route into session-capture handlers; bare `gbrain eval --qrels …` fall-through preserves the legacy IR-metrics flow. v0.27.x adds `gbrain eval cross-modal` to the dispatch (the user-facing path is the cli.ts no-DB branch — `src/commands/eval.ts:cross-modal` only fires when callers re-enter with an existing engine). -- `src/commands/eval-cross-modal.ts` (v0.27.x) — multi-model quality gate. Three different-provider frontier models score the OUTPUT against the TASK on a 5-dim list. Verdict `pass` (exit 0) / `fail` (exit 1) / `inconclusive` (exit 2; <2/3 model successes per Q3=A in plans/radiant-napping-lerdorf.md). Reuses `src/core/ai/gateway.ts:chat()` so config/auth/aliasing comes from the gateway recipe registry — no parallel provider stack. Self-configures the gateway (`configureGateway(loadConfig() + process.env)`) since the cli.ts dispatch bypasses `connectEngine()`. Default cycles 3 in TTY, 1 in non-TTY (T11=B partial cost guardrail). Receipts land at `gbrainPath('eval-receipts')/-.json`. The full `--budget-usd` cap is a v0.27.x follow-up TODO. -- `src/core/cross-modal-eval/json-repair.ts` (v0.27.x) — `parseModelJSON(raw)` named export with a 4-strategy fallback chain (direct parse → fence-strip → trailing-comma + single-quote + embedded-newline repair → regex nuclear option). Adversarial input throws rather than fabricating scores — the aggregator treats a throw as "this model contributed nothing this cycle" so the gate stays correct at >=2/3 successes. -- `src/core/cross-modal-eval/aggregate.ts` (v0.27.x) — pure verdict logic. Pass criterion: `(successes >= 2) AND (every dim mean >= 7) AND (every dim min across models >= 5)` (Q2=A floor). 
Inconclusive when <2/3 models returned parseable scores (Q3=A regression guard for the v1 .mjs `Object.values({}).every(...) === true` empty-array PASS bug). -- `src/core/cross-modal-eval/runner.ts` (v0.27.x) — orchestrator. Each cycle runs `Promise.allSettled([gwChat(slotA), gwChat(slotB), gwChat(slotC)])` (T4=A — bare allSettled, no rate-leases for the CLI path; minion-integration TODO recovers cross-process concurrency). Stops early on PASS or INCONCLUSIVE; runs up to 3 cycles. Default slots: `openai:gpt-4o` / `anthropic:claude-opus-4-7` / `google:gemini-1.5-pro`. `estimateCost()` exports a small per-model pricing table (drifts; refresh alongside model-family bumps). -- `src/core/cross-modal-eval/receipt-name.ts` (v0.27.x) — receipt filename binds (slug, SKILL.md sha-8). `findReceiptForSkill(skillPath, receiptDir)` returns `'found' | 'stale' | 'missing'` (T10=A). Skillify-check item 11 surfaces the status as informational (T7=C); the audit does NOT fail on missing/stale receipts. -- `src/core/cross-modal-eval/receipt-write.ts` (v0.27.x) — wraps `fs.writeFileSync` with `mkdirSync({recursive:true})` ahead of every write (T5 correction; `gbrainPath()` does NOT auto-mkdir). -- `src/commands/eval-export.ts` (v0.25.0) — streams `eval_candidates` rows as NDJSON to stdout with `schema_version: 1` prefix on every line. EPIPE-safe, progress heartbeats on stderr, stable id-desc tiebreaker so `--since` windows never dupe/miss rows. -- `src/commands/eval-prune.ts` (v0.25.0) — explicit retention cleanup. Requires `--older-than DUR`. `--dry-run` reports would-delete count. -- `src/commands/eval-replay.ts` (v0.25.0) — contributor-facing replay tool. Reads NDJSON from `gbrain eval export`, re-runs each captured `query` / `search` op against the current brain, computes set-Jaccard@k between captured + current `retrieved_slugs`, top-1 stability rate, and latency Δ. Stable JSON shape (`schema_version: 1`) for CI gating; human mode prints a regression table. Pure Bun, zero new deps. 
The dev-loop half of BrainBench-Real that closes the gap between "data captured" and "data used to gate a PR." See `docs/eval-bench.md` for the workflow. -- `src/commands/eval-suspected-contradictions.ts` + `src/core/eval-contradictions/{judge,runner,types,date-filter,cost-tracker,cache,severity-classify,cross-source,trends,calibration,judge-errors,auto-supersession,fixture-redact}.ts` (v0.32.6) — `gbrain eval suspected-contradictions [run|trend|review]`. Probe samples top-K retrieval pairs per query (cross-slug + intra-page chunk-vs-take), date pre-filters (3-rule layered — same-paragraph-dual-date overrides separation rule), LLM judge (query-conditioned per Codex; UTF-8-safe truncation; C1 confidence-floor double-enforcement; resolution_kind output drives M7 paste-ready commands), persistent cache keyed on `(chunk_a_hash, chunk_b_hash, model_id, prompt_version, truncation_policy)` (Codex outside-voice fix — prompt edits cleanly invalidate prior verdicts), Wilson 95% CI calibration on the headline percentage with `small_sample_note` when n<30, judge_errors as first-class typed counters (parse_fail/refusal/timeout/http_5xx/unknown — Codex fix to bias from silent skip), M5 trend writes to `eval_contradictions_runs`, M6 source-tier breakdown reuses `DEFAULT_SOURCE_BOOSTS` prefix logic, deterministic sampling (combined_score DESC + lex tiebreaker — stable cache hit-rate across re-runs). Hermetic via `judgeFn` + `searchFn` DI in the runner; never touches the real gateway in tests. Engine surface: `BrainEngine.listActiveTakesForPages` (P1 batched), `writeContradictionsRun` + `loadContradictionsTrend` (M5), `getContradictionCacheEntry` + `putContradictionCacheEntry` + `sweepContradictionCache` (P2). Schema migrations v51 + v52. MCP op `find_contradictions` (read scope, NOT localOnly, NOT in subagent allowlist — user-initiated only). M1 doctor check surfaces high-severity findings with paste-ready resolution commands. 
M2 synthesize phase pre-fetches latest probe's top-5-by-severity findings and threads them into `buildSynthesisPrompt` as an informational block. 226 hermetic unit tests + 12 real-Postgres E2E. Plan: `~/.claude/plans/system-instruction-you-are-working-hashed-dewdrop.md`. Architecture doc: `docs/contradictions.md`. -- `src/commands/eval-longmemeval.ts` + `src/eval/longmemeval/{harness,adapter,sanitize}.ts` (v0.28.1) — `gbrain eval longmemeval ` runs the public [LongMemEval](https://huggingface.co/datasets/xiaowu0162/longmemeval) benchmark against gbrain's hybrid retrieval. Architecture: one in-memory PGLite per benchmark run created via `createBenchmarkBrain` + `withBenchmarkBrain` (NO `EphemeralBrain` class). Between questions, `TRUNCATE` over runtime-enumerated `pg_tables` so future schema migrations don't silently leak data across questions; infrastructure tables (`sources`, `config`, `gbrain_cycle_locks`, `subagent_rate_leases`) are preserved. `cli.ts` has a pre-dispatch bypass so `eval longmemeval` skips `connectEngine()` — the user's `~/.gbrain` brain is never opened. `--expansion` defaults to OFF (deterministic, no per-query Haiku call); pass `--expansion` to opt in. Default model resolves through `resolveModel()` 6-tier chain with `models.eval.longmemeval` as the new config key. Sanitization parity: `harness.ts` re-uses `INJECTION_PATTERNS` from `src/core/think/sanitize.ts` (now exported, line 22) so adding a pattern automatically covers takes AND benchmarks. Retrieved chat content is wrapped in `` framing; the answer-gen system prompt declares the content UNTRUSTED. LLM injection seam: `runEvalLongMemEval(args, {client?: ThinkLLMClient})` lets tests stub the client so the full pipeline runs without an Anthropic API key. p50 25.9ms / p99 30.3ms warm reset+import+search on Apple Silicon (per `test/eval-longmemeval.test.ts` perf gate). 
Hand the JSONL output to LongMemEval's `evaluate_qa.py` to score (their published evaluator, not bundled — needs OpenAI gpt-4o per their spec). -- `docs/eval-bench.md` (v0.25.0) — contributor guide for using captured data to benchmark retrieval changes before merging. Linked from CONTRIBUTING.md under "Running real-world eval benchmarks (touching retrieval code)". -- `src/core/eval-capture.ts` (v0.25.0) — op-layer capture wrapper called from `src/core/operations.ts` `query` + `search` handlers. Catches MCP + CLI + subagent tool-bridge from one site. Fire-and-forget; failures route to `engine.logEvalCaptureFailure` so `gbrain doctor` sees drops cross-process. **Capture is off by default** — `isEvalCaptureEnabled` resolution: explicit `config.eval.capture` (true/false) wins, else `process.env.GBRAIN_CONTRIBUTOR_MODE === '1'`, else off. Production users get a quiet brain; contributors set `export GBRAIN_CONTRIBUTOR_MODE=1` in `.zshrc` to enable the dev loop. PII scrubber gate is independent and defaults to true regardless of CONTRIBUTOR_MODE. -- `src/core/eval-capture-scrub.ts` (v0.25.0) — zero-deps PII scrubber: emails, phones, SSN, Luhn-verified credit cards, JWT-shaped tokens, bearer tokens. -- `src/core/search/hybrid.ts` — Cathedral II `Promise` return shape unchanged in v0.25.0. Adds `onMeta?: (m: HybridSearchMeta) => void` callback so op-layer capture can record what hybridSearch actually did. Existing callers leave it undefined. -- `docs/eval-capture.md` (v0.25.0) — stable NDJSON schema reference for gbrain-evals consumers. -- `test/public-exports.test.ts` (v0.25.0 / R2) — runtime contract test. Imports each of the 17 public subpaths via package name and pins a canary symbol per module. Paired with `scripts/check-exports-count.sh`. -- `src/core/embedding.ts` — OpenAI text-embedding-3-large, batch, retry, backoff. **v0.28.7:** `BATCH_SIZE` reverted 50→100 — the original Voyage safety guard halved OpenAI throughput on every page. 
Per-recipe pre-split + recursive halving + adaptive shrink-on-miss now live in the gateway, so the outer paginator goes back to its original purpose: progress-callback granularity, not batch protection. -- `src/core/ai/types.ts` — provider/recipe types. **v0.28.7 (#680):** `EmbeddingTouchpoint` extended with optional `chars_per_token` (default 4 chars/token, matching OpenAI tiktoken on English) and `safety_factor` (default 0.8, budget-utilization ceiling). Both consulted only when `max_batch_tokens` is also set. Voyage declares `chars_per_token=1` + `safety_factor=0.5` to handle dense payloads (CJK/JSON/base64) that overshoot tiktoken. The pre-split budget is `max_batch_tokens × safety_factor / chars_per_token`. **v0.28.11 (#719):** `EmbeddingTouchpoint.multimodal_models?: string[]` model-level allow-list for recipes that mix text-only + multimodal models under one touchpoint (Voyage's 12 models share `supports_multimodal: true` but only `voyage-multimodal-3` accepts `/multimodalembeddings`). When omitted, recipe-level `supports_multimodal` is sufficient. `AIGatewayConfig.embedding_multimodal_model?: string` lets `embedMultimodal()` route to a different model than `embedding_model` — brains using OpenAI for text can use Voyage for images without flipping the primary embedding pipeline. -- `src/core/ai/gateway.ts` — unified seam for every AI call. **v0.28.7 (#680):** module-scoped `_embedTransport` defaulting to AI SDK `embedMany`, with `__setEmbedTransportForTests(fn)` test seam so tests drive the public `embed()` function with a stubbed transport instead of probing private helpers. `splitByTokenBudget` and `isTokenLimitError` are now exported `@internal` — pure functions reused directly by the test file. Module-level `_shrinkState: Map` halves the recipe's effective `safety_factor` on token-limit miss (floor 0.05) and heals back ×1.5 toward the ceiling after `SHRINK_HEAL_AFTER=10` consecutive successes. 
`configureGateway()` walks every registered recipe at construction time and emits a once-per-process stderr warning for any embedding touchpoint missing `max_batch_tokens` (excluding the canonical OpenAI fast-path recipe). `resetGateway()` clears `_shrinkState`, the warned-set, and restores the real transport. ASCII flow diagram embedded in the `embed()` JSDoc covers the routing decision, recursion + halving, and shrinkState lifecycle. **v0.28.11 (#719):** `embedMultimodal()` reads `cfg.embedding_multimodal_model` first (falls back to `cfg.embedding_model` for single-model setups). After the existing recipe-level `supports_multimodal` fast-fail, validates the resolved model against `touchpoint.multimodal_models` when declared — closes the Voyage-text-only-model-into-multimodal-endpoint footgun before any HTTP call (Codex F1 from PR review). New `getMultimodalModel()` accessor mirrors `getEmbeddingModel` / `getChatModel` so doctor and integration tests can read the gateway state. -- `src/core/ai/recipes/voyage.ts` — Voyage AI openai-compatible recipe. **v0.28.7 (#680):** declares `chars_per_token=1` + `safety_factor=0.5` so the gateway pre-splits Voyage batches at a 60K-character budget (50% of 120K-token cap with the dense-tokenizer ratio). Closes the v0.27 backfill loop where ~26% of the corpus stayed un-embedded because tiktoken-grounded budgeting silently undercounted Voyage's actual token usage. **v0.28.11 (#719):** declares `multimodal_models: ['voyage-multimodal-3']` so the gateway rejects text-only Voyage models pointed at the multimodal endpoint with a clear `AIConfigError` instead of waiting for Voyage's HTTP 400. -- `src/core/ai/recipes/anthropic.ts` — Anthropic recipe (chat + expansion touchpoints). **v0.31.12:** chat and expansion `models:` lists drop the v0.31.6 phantom `claude-sonnet-4-6-20250929` date suffix — canonical id is `claude-sonnet-4-6`. 
The wrong-direction alias `claude-sonnet-4-6 → claude-sonnet-4-6-20250929` is removed; a reverse alias `claude-sonnet-4-6-20250929 → claude-sonnet-4-6` keeps stale user configs working (rescues `facts.extraction_model` and `models.dream.synthesize` set by v0.31.6 installs). Recipe-shape regression pinned by `test/anthropic-model-ids.test.ts` (6 cases, verbatim cherry-pick of PR #830 plus the reverse-alias rescue case). -- `src/core/anthropic-pricing.ts` — Single source of truth for Anthropic model pricing (per-MTok input/output). **v0.31.12:** Opus 4.7 corrected from `$15/$75` to `$5/$25` (the old number was from Opus 4 generation, never refreshed when 4.7 shipped); Opus 4.6 also corrected. Consumed by `src/core/budget-meter.ts` and `src/core/cross-modal-eval/runner.ts` — the cross-modal estimator now reads `ANTHROPIC_PRICING` for Anthropic models instead of duplicating the table, killing the v0.31.6 drift bug class. -- `src/core/model-config.ts` — Model-string resolution (the seam every internal LLM call walks through). **v0.31.12:** four-tier system (`ModelTier = 'utility' | 'reasoning' | 'deep' | 'subagent'`) with `TIER_DEFAULTS` (utility→haiku-4-5, reasoning→sonnet-4-6, deep→opus-4-7, subagent→sonnet-4-6) and `tier?: ModelTier` on `ResolveModelOpts`. Resolution chain is now 8 steps: cliFlag → deprecated key → config key → `models.default` → `models.tier.` → env var → `TIER_DEFAULTS[tier]` → caller fallback. Two new exports — `isAnthropicProvider(modelString)` checks `provider:model` prefix OR `claude-` bare-id pattern, and `enforceSubagentAnthropic()` is the layer-2 runtime guard: when `tier === 'subagent'` resolves to a non-Anthropic provider, it emits a once-per-`(source, model)` stderr warn AND falls back to `TIER_DEFAULTS.subagent` instead of letting the Anthropic Messages API tool-loop attempt to run on OpenAI/Gemini. `_resetDeprecationWarningsForTest()` now also clears `_subagentTierWarningsEmitted` so tests re-emit. 
-- `src/core/ai/model-resolver.ts` — Recipe-touchpoint validator. **v0.31.12:** `assertTouchpoint(recipe, touchpoint, modelId, extendedModels?)` gains an optional 4th `extendedModels: ReadonlySet` argument. When the modelId is in that set, the native-recipe allowlist throw is bypassed — the user explicitly opted into this model via config so we let provider rejection surface as `model_not_found` at HTTP call time (and `gbrain models doctor` catches it earlier). Default code paths with hardcoded model strings MUST NOT pass `extendedModels` — typos in source code still fail fast. Replaces the earlier plan to soften the validator wholesale (Codex F4/F5 in plan review flagged that as too broad — it would have removed the fail-fast contract for chat + expand + embed all three). -- `src/core/ai/gateway.ts` extension (v0.31.12) — new module-scoped `_extendedModels: Map>` registry feeds `assertTouchpoint`'s 4th-arg path. New `reconfigureGatewayWithEngine(engine)` async function is called from `cli.ts` after `engine.connect()` (and before every command except `CLI_ONLY` no-DB commands) — re-resolves expansion + chat defaults through `resolveModel()` so `models.tier.*` and `models.default` overrides apply to expansion + chat both. `DEFAULT_CHAT_MODEL` corrected to `anthropic:claude-sonnet-4-6` (was the v0.31.6 phantom `-20250929`). New `__setChatTransportForTests` seam mirrors `__setEmbedTransportForTests` so tests drive `chat()` with a stubbed transport. -- `src/core/minions/queue.ts` extension (v0.31.12) — `MinionQueue.add()` now rejects `subagent` jobs whose `data.model` resolves through `isAnthropicProvider()` to a non-Anthropic provider. Lazy-imports `model-config.ts` to avoid pulling engine types into queue's eager-load surface. Layer 1 of the three-layer subagent provider enforcement (Codex F1+F2 in plan review). Layers 2 + 3 live in `src/core/model-config.ts` (`enforceSubagentAnthropic` runtime fallback) and `src/commands/doctor.ts` (`subagent_provider` check). 
Pinned by 3 cases in `test/agent-cli.test.ts`. -- `src/commands/models.ts` (v0.31.12) — `gbrain models [--json]` read-only routing dashboard: prints tier defaults (`utility`/`reasoning`/`deep`/`subagent`), the resolved value for each (re-walking the resolution chain to attribute properly), every per-task override (11 `PER_TASK_KEYS` entries — `models.dream.synthesize`, `models.dream.patterns`, `models.drift`, `models.auto_think`, `models.think`, `models.subagent`, `facts.extraction_model`, `models.eval.longmemeval`, `models.expansion`, `models.chat`, `models.dream.synthesize_verdict`), the alias map (defaults + user overrides), and a source-of-truth column showing `default` / `config: ` / `env: `. `gbrain models doctor [--skip=] [--json]` fires a 1-token `gateway.chat()` probe against each configured chat + expansion model and classifies failures into `{model_not_found, auth, rate_limit, network, unknown}` — the structural fix for the v0.31.6 silent-no-op bug class. Wired into `cli.ts` dispatch table + `CLI_ONLY` set. -- `src/commands/doctor.ts` extension (v0.31.12) — new `subagent_provider` check (layer 3 of 3 — Codex F13). Warns when `models.tier.subagent` is explicitly set to a non-Anthropic provider (fail-loud since the user clearly meant it — message names the bad value and prints the paste-ready fix command `gbrain config set models.tier.subagent anthropic:claude-sonnet-4-6`); also warns when `models.default` would sneak `subagent` into a non-Anthropic provider via tier inheritance. OK status when subagent tier resolves to Anthropic. Tests cover all three paths in `test/doctor.test.ts`. +- `src/core/sync.ts` — Pure sync functions (manifest parsing, filtering, slug conversion). +- `src/core/markdown.ts` — Frontmatter parsing + body splitter. `splitBody` requires an explicit timeline sentinel (``, `--- timeline ---`, or `---` immediately before `## Timeline`/`## History`). Plain `---` in body text is a markdown horizontal rule, not a separator. 
`inferType` auto-types `/wiki/analysis/` → analysis, `/wiki/guides/` → guide, `/wiki/hardware/` → hardware, `/wiki/architecture/` → architecture, `/writing/` → writing (plus the existing people/companies/deals/etc heuristics). +- `src/core/link-extraction.ts` — shared library for the v0.12.0 graph layer. extractEntityRefs (canonical, replaces backlinks.ts duplicate) matches both `[Name](people/slug)` markdown links and Obsidian `[[people/slug|Name]]` wikilinks as of v0.12.3. extractPageLinks, inferLinkType heuristics (attended/works_at/invested_in/founded/advises/source/mentions), parseTimelineEntries, isAutoLinkEnabled config helper. `DIR_PATTERN` covers `people`, `companies`, `deals`, `topics`, `concepts`, `projects`, `entities`, `tech`, `finance`, `personal`, `openclaw`. Used by extract.ts, operations.ts auto-link post-hook, and backlinks.ts. - `src/core/check-resolvable.ts` — Resolver validation: reachability, MECE overlap, DRY checks, structured fix objects. v0.14.1: `CROSS_CUTTING_PATTERNS.conventions` is an array (notability gate accepts both `conventions/quality.md` and `_brain-filing-rules.md`). New `extractDelegationTargets()` parses `> **Convention:**`, `> **Filing rule:**`, and inline backtick references. DRY suppression is proximity-based via `DRY_PROXIMITY_LINES = 40`. -- `src/core/repo-root.ts` — Shared `findRepoRoot(startDir?)` (v0.16.4): walks up from `startDir` (default `process.cwd()`) looking for `skills/RESOLVER.md`. Zero-dependency module imported by both `doctor.ts` and `check-resolvable.ts`. Parameterized `startDir` makes tests hermetic. **v0.31.7:** read-path / write-path split. `autoDetectSkillsDir` (shared, read+write-safe) gains tier-0 `$GBRAIN_SKILLS_DIR` explicit operator override (Docker mounts, CI, monorepo subdirs) ahead of the existing 4-tier chain. 
New `autoDetectSkillsDirReadOnly` wraps it with a tier-5 install-path fallback that walks up from `fileURLToPath(import.meta.url)` and gates on `isGbrainRepoRoot` so unrelated repos can't false-positive. Read-path callers (`doctor`, `check-resolvable`, `routing-eval`) use the read-only variant; write-path callers (`skillpack install`, `skillify scaffold`, `post-install-advisory`) deliberately stay on the shared function so `gbrain skillpack install` from `~` cannot silently retarget the bundled gbrain repo's `skills/` instead of the user's actual workspace. Two new `SkillsDirSource` variants: `'env_explicit'`, `'install_path'`. New `AUTO_DETECT_HINT_READ_ONLY` documents the extra tier. The D6 `--fix` safety gate in `doctor.ts` + `check-resolvable.ts` refuses auto-repair when `detected.source === 'install_path'` so `gbrain doctor --fix` from `~` cannot silently rewrite the bundled install tree. -- `src/commands/check-resolvable.ts` — Standalone CLI wrapper (v0.16.4) over `checkResolvable()`. Exports `parseFlags`, `resolveSkillsDir`, `DEFERRED`, `runCheckResolvable`. Exit rule: **1 on any issue (warnings OR errors)**, stricter than doctor's `ok` flag — honors README:259. Stable JSON envelope `{ok, skillsDir, report, autoFix, deferred, error, message}` — same shape on success and error paths. `--fix` path runs `autoFixDryViolations` BEFORE `checkResolvable` (same ordering as doctor). `scripts/skillify-check.ts` subprocess-calls `gbrain check-resolvable --json` (cached per process) and fails loud on binary-missing — no silent false-pass. **v0.19:** AGENTS.md workspaces now resolve natively (see `src/core/resolver-filenames.ts`) — gbrain inspects the 107-skill OpenClaw deployment whether the routing file is `RESOLVER.md` or `AGENTS.md`. `DEFERRED[]` is empty — Checks 5 + 6 shipped as real code, not issue URLs. 
**v0.31.7:** the resolver lookup switched from first-match-wins to the multi-file merge in `src/core/check-resolvable.ts` — entries collected from every `RESOLVER.md` / `AGENTS.md` across the skills dir AND its parent, deduped by `skillPath` (first occurrence wins). Lifted reachable skills on the reference OpenClaw layout from 37/224 to 200/224 — the deployment ships a thin `skills/RESOLVER.md` (~40 entries from skillpack) plus a fat `../AGENTS.md` (200+ entries, the real dispatcher), and the previous code only saw the first one. The CLI also switched to `autoDetectSkillsDirReadOnly` so `cd ~ && gbrain check-resolvable` finds the bundled skills via the install-path fallback. `--fix` carries the same D6 safety gate as `gbrain doctor --fix`: refuses to write when `detected.source === 'install_path'`. -- `src/core/resolver-filenames.ts` (v0.19) — central list of accepted routing filenames (`RESOLVER.md`, `AGENTS.md`). Shared by `findRepoRoot`, `check-resolvable`, and skillpack install so every code path walks the same fallback chain. -- `src/commands/skillify.ts` + `src/core/skillify/{generator,templates}.ts` (v0.19) — `gbrain skillify scaffold ` creates all stubs for a new skill in one command: SKILL.md, script, tests, routing-eval.jsonl, resolver entry, filing-rules pointer. `gbrain skillify check