fix: code-source first-sync (#744) + class-method code-def #749
Open
lanceretter wants to merge 2 commits into garrytan:master
…#744)

`gbrain sync --strategy code --source <id>` reports "X pages imported" on first sync but persists ZERO code files. The full-sync path (`performFullSync` at commands/sync.ts:847) called `runImport(repoPath)` without the strategy flag, and `runImport` walked markdown only via `collectMarkdownFiles`. Result: code files never reached the importer on first sync; `last_commit` got bumped as if synced; subsequent incremental runs found no diff and did nothing. Reproduced across three different repos on the same brain.

Three small changes:

1. commands/import.ts: extract a shared `collectFiles(dir, strategy)` walker that branches on strategy. Add `collectCodeFiles` and `collectFilesAuto` wrappers. `collectMarkdownFiles` keeps its current behavior + signature for backward compat.
2. commands/import.ts:runImport: accept `strategy` in opts and pick the right walker. Default 'markdown' preserves pre-strategy callers.
3. commands/sync.ts: thread `opts.strategy` through to `runImport` on the full-sync path; also fix the dry-run filter at commands/sync.ts:861, which dropped strategy too.

Verified locally on a real PlanetScale Postgres brain across three repos:

- conquest-lpr (858 code files imported)
- rv-helper (360)
- trashtastic-helix (partial, with 2 legitimate failures: a 5.6MB minified JS in dist/ and a Supabase types.ts with NUL bytes)

Code files now populate `pages WHERE page_kind='code'`; `gbrain code-def` returns real hits (modulo a separately tracked issue with class-method symbol types; see follow-up commit).
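The strategy branching described in change 1 can be sketched as a pure filter over a list of paths. This is only an illustration: `selectFiles`, `matchesStrategy`, and the extension sets are hypothetical names and values, not gbrain's actual `collectFiles` implementation, and the real walker reads the directory tree rather than taking a list.

```typescript
// Hypothetical sketch. Names and extension lists are illustrative assumptions,
// not taken from gbrain's source.
type Strategy = "markdown" | "code" | "auto";

const MARKDOWN_EXTS = new Set([".md", ".mdx"]);
const CODE_EXTS = new Set([".ts", ".tsx", ".js", ".py", ".go", ".rs"]);

// Return the lowercase extension of the last path segment, or "" if none.
function extensionOf(path: string): string {
  const base = path.slice(path.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  return dot <= 0 ? "" : base.slice(dot).toLowerCase();
}

// One predicate that branches on strategy, so every caller shares the same rules.
function matchesStrategy(path: string, strategy: Strategy): boolean {
  const ext = extensionOf(path);
  switch (strategy) {
    case "markdown":
      return MARKDOWN_EXTS.has(ext);
    case "code":
      return CODE_EXTS.has(ext);
    case "auto":
      return MARKDOWN_EXTS.has(ext) || CODE_EXTS.has(ext);
  }
}

// Defaulting to "markdown" mirrors the backward-compatible default for
// callers that predate the strategy option.
function selectFiles(paths: string[], strategy: Strategy = "markdown"): string[] {
  return paths.filter((p) => matchesStrategy(p, strategy));
}

const repo = ["README.md", "src/db.ts", "src/sync.py", "assets/logo.png"];
console.log(selectFiles(repo));         // [ 'README.md' ]
console.log(selectFiles(repo, "code")); // [ 'src/db.ts', 'src/sync.py' ]
```

The first call shows the failure mode in miniature: when the caller omits the strategy, the default silently selects markdown only, so code files never reach the importer.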
`gbrain code-def <symbol>` returned 0 results for any class method
even when chunks existed with the right `symbol_name`, because the
WHERE filter (`commands/code-def.ts:56`) only allowed
`['function', 'class', 'interface', 'type', 'enum', 'struct',
'trait', 'module', 'contract', 'export statement']`.
The tree-sitter chunker emits more granular symbol_type values than
that list covers. On a real Postgres brain after a full code sync:
function | 10897
declaration | 5926
method definition | 3201 ← previously filtered out
export statement | 818
import | 375
class | 343
interface | 215
field definition | 124 ← previously filtered out
variable assignment | 108
struct specifier | 60
public field definition | 57 ← previously filtered out
type | 40
method signature | 14 ← previously filtered out
Concretely: `Database.calculateDeviceStatuses` (a class method)
returned `{count: 0}` even though the chunk existed at
`apps/worker/src/db.ts` with symbol_name='calculateDeviceStatuses'
and symbol_type='method definition'.
Fix: extend DEF_TYPES with the four multi-word symbol_types that
represent class members:
- method definition (TS/JS class methods, Python methods)
- method signature (TS interface methods)
- field definition (class fields)
- public field definition (TS public fields)
This is additive only. Existing callers continue to work; class
members now resolve. `declaration` stays out (too generic — covers
imports, type aliases, and bindings).
Verified locally:
$ gbrain code-def calculateDeviceStatuses
→ 3 hits (real source `apps-worker-src-db-ts` + 2 build artifacts)
$ gbrain code-def runFifteenMinuteTick
→ still works (export statement, was already covered)
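The effect of the filter change can be illustrated with an in-memory stand-in for the SQL WHERE clause. The `Chunk` shape, the `codeDef` helper, and the second sample row are assumptions made for this sketch; only the type lists and the `calculateDeviceStatuses` row come from the description above.

```typescript
// Hypothetical stand-in for the code-def query; not the real implementation.
interface Chunk {
  symbolName: string;
  symbolType: string;
  file: string;
}

// The filter list as quoted in this commit message, before the fix.
const OLD_DEF_TYPES = [
  "function", "class", "interface", "type", "enum", "struct",
  "trait", "module", "contract", "export statement",
];

// The four class-member types added by the fix.
const CLASS_MEMBER_TYPES = [
  "method definition", "method signature",
  "field definition", "public field definition",
];

const NEW_DEF_TYPES = [...OLD_DEF_TYPES, ...CLASS_MEMBER_TYPES];

// Equivalent in spirit to: WHERE symbol_name = $1 AND symbol_type IN (...)
function codeDef(chunks: Chunk[], symbol: string, defTypes: readonly string[]): Chunk[] {
  const allowed = new Set(defTypes);
  return chunks.filter((c) => c.symbolName === symbol && allowed.has(c.symbolType));
}

const sampleChunks: Chunk[] = [
  // Row reported in the commit message.
  { symbolName: "calculateDeviceStatuses", symbolType: "method definition", file: "apps/worker/src/db.ts" },
  // Invented row: a generic 'declaration' that should stay excluded.
  { symbolName: "calculateDeviceStatuses", symbolType: "declaration", file: "example/other.ts" },
];

console.log(codeDef(sampleChunks, "calculateDeviceStatuses", OLD_DEF_TYPES).length); // 0
console.log(codeDef(sampleChunks, "calculateDeviceStatuses", NEW_DEF_TYPES).length); // 1
```

Because the new list is a superset of the old one, any symbol that resolved before still resolves; only class members are newly visible, and `declaration` rows remain filtered out.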
Summary
Two atomic commits that together make `gbrain sync --strategy code` actually work end-to-end on first sync, and make `gbrain code-def` resolve class methods that the existing tree-sitter chunker already extracts.

Discovered while upgrading a real PlanetScale Postgres brain from `v0.26.6` → `v0.30.0` (companion to #740, #741, #744).

1. `commands/sync.ts` + `commands/import.ts`: strategy threading (closes #744)

Reproduces issue #744 directly. `performFullSync` called `runImport(repoPath)` without the strategy flag; `runImport` then walked markdown only via `collectMarkdownFiles`. First sync of a fresh `gstack-code-*` source reported "X pages imported" but persisted zero code files. Three different repos on the same brain showed the identical broken state.

Three small changes, all additive:

- Extract a shared `collectFiles(dir, strategy)` walker in `commands/import.ts` that branches on strategy. Add `collectCodeFiles` and `collectFilesAuto` wrappers. `collectMarkdownFiles` keeps its current behavior + signature for backward compat (no caller changes needed).
- `runImport` accepts `strategy` in opts and picks the right walker. Default `'markdown'` preserves pre-strategy callers.
- `performFullSync` threads `opts.strategy` through to `runImport` (and into the dry-run filter at line 861, which dropped strategy too).

Verified end-to-end across three repos:

- conquest-lpr: 858 code files imported
- rv-helper: 360
- trashtastic-helix: partial (2 legitimate failures: a 5.6MB minified `dist/` JS; Supabase types.ts with NUL bytes. Both correctly refused, not bugs)

2. `commands/code-def.ts`: extend `DEF_TYPES` for class members

`gbrain code-def <symbol>` returned 0 results for any class method even when chunks existed with the correct `symbol_name`, because the WHERE filter only allowed the symbol types in `DEF_TYPES`.

The tree-sitter chunker emits more granular `symbol_type` values than that. On my brain after a full sync, the previously filtered-out types were: method definition (3201), field definition (124), public field definition (57), and method signature (14).

Concretely: `Database.calculateDeviceStatuses` (a TypeScript class method) returned `{count: 0}` from `code-def` even though the chunk existed at `apps/worker/src/db.ts` with the right `symbol_name` and `symbol_type='method definition'`.

Fix: add four multi-word symbol_types representing class members. Additive only:

      const DEF_TYPES = [
        'function', 'class', 'interface', 'type', 'enum', 'struct',
        'trait', 'module', 'contract',
    +   'method definition', 'method signature', 'field definition', 'public field definition',
      ];

`'declaration'` is left out (too generic: it covers imports, type aliases, plain bindings).

After: `gbrain code-def calculateDeviceStatuses` returns 3 hits (the real source plus a couple of compiled-JS build artifacts that are themselves a separate walker concern).

Test plan

- `bun install && bun run build`: passes.
- `bun run typecheck`: passes.
- `bun test test/schema-bootstrap-coverage.test.ts`: 5/5 pass (unchanged).
- `gstack-code-*` source on Postgres → `gbrain sync --strategy code --source <id>` actually imports code files; `gbrain code-def <classMethod>` returns hits.

Notes on follow-up bugs surfaced during this dig

Not in this PR; flagging for future:

- The code walker does not honor `.gitignore`: `tmp/`, `dist/`, and build artifacts get indexed alongside source. Worth a `.gitignore`-aware skip-list or an explicit `--ignore` glob.
- `--source <id>` is recorded on `sources.last_commit` / `last_sync_at`, but pages still land under `source_id='default'` rather than the named source. That's why `gbrain sources list` shows `page_count=0` on the new sources even after a successful sync. Reasonable next dig; happy to PR if you want.

🤖 Generated with Claude Code