fix(pipeline): handle ALTER migrations and auto-init fingerprint registry #156

Open
tipich wants to merge 1 commit into Lum1104:main from tipich:fix/alter-migrations-and-fingerprint-autoinit


tipich commented on May 17, 2026

Summary

Two small, unrelated gaps in /understand that both surface on Laravel projects. Bundled into one PR because the diff is tiny and the two fixes are independent, so either one can be reverted without touching the other.

Gap 1 — ALTER migrations produce orphan table: nodes

agents/file-analyzer.md only described how to handle CREATE-shape SQL migrations. A Laravel Schema::table('users', ...) migration (and the equivalent Rails change_table / Django AddField) emitted a bare file: node with no migrates edge to the existing table: node it was modifying. Each ALTER added an orphan sibling table node to the graph.

Measured impact on one Aeropax (Laravel 13) regen: 8 orphan ALTER-migration table: nodes after a full incremental pipeline pass.

Fix: added an explicit ORM migration files section to the Non-code edge creation guidance, distinguishing CREATE shape (Schema::create, create_table, CreateModel) from ALTER shape (Schema::table, change_table, AddField/RemoveField/AlterField). The ALTER case emits only the migrates edge to the existing table node — no duplicate table: node. Also added two rows to the Edge Signal Quick Reference table for fast pattern matching.
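The CREATE vs ALTER distinction can be illustrated with a minimal sketch. This is not the agent prompt or any code from this PR: the function names, the regular expressions, the `file:<path>` / `table:<name>` id format, and the assumption that the CREATE shape emits both a table node and a migrates edge are all hypothetical. Django is left out because its table names are derived from model names rather than written literally.

```typescript
// Hypothetical sketch: types, ids, and patterns are illustrative assumptions.
type MigrationShape = "create" | "alter" | "unknown";

interface GraphNode { id: string; type: "file" | "table"; }
interface GraphEdge { source: string; target: string; type: "migrates"; }

// Group 1 of each pattern captures the literal table name.
const CREATE_PATTERNS: RegExp[] = [
  /Schema::create\(\s*['"](\w+)['"]/, // Laravel
  /create_table[\s(]+[:'"](\w+)/,     // Rails
];
const ALTER_PATTERNS: RegExp[] = [
  /Schema::table\(\s*['"](\w+)['"]/,  // Laravel
  /change_table[\s(]+[:'"](\w+)/,     // Rails
];

function classifyMigration(source: string): { shape: MigrationShape; table?: string } {
  for (const re of CREATE_PATTERNS) {
    const m = re.exec(source);
    if (m) return { shape: "create", table: m[1] };
  }
  for (const re of ALTER_PATTERNS) {
    const m = re.exec(source);
    if (m) return { shape: "alter", table: m[1] };
  }
  return { shape: "unknown" };
}

function emitForMigration(
  filePath: string,
  source: string,
): { nodes: GraphNode[]; edges: GraphEdge[] } {
  const fileNode: GraphNode = { id: `file:${filePath}`, type: "file" };
  const { shape, table } = classifyMigration(source);
  if (shape === "unknown" || !table) return { nodes: [fileNode], edges: [] };

  const tableId = `table:${table}`;
  const edge: GraphEdge = { source: fileNode.id, target: tableId, type: "migrates" };
  // CREATE introduces the table node; ALTER only links to the node that
  // already exists, so no duplicate table node is emitted.
  const nodes: GraphNode[] =
    shape === "create" ? [fileNode, { id: tableId, type: "table" }] : [fileNode];
  return { nodes, edges: [edge] };
}
```

In this sketch an ALTER migration contributes one file node and one migrates edge, so repeated ALTERs against `users` all point at the same `table:users` id instead of adding siblings.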

Gap 2 — buildFingerprintStore required hidden setup the skill doc didn't mention

skills/understand/SKILL.md Phase 7 step 2.5 documents:

import { buildFingerprintStore, saveFingerprints } from '@understand-anything/core';

const store = await buildFingerprintStore('<PROJECT_ROOT>', sourceFilePaths);
saveFingerprints('<PROJECT_ROOT>', store);

…but the actual signature was (projectDir, filePaths, registry, gitCommitHash) — sync, with both extra args required. Following the doc gave:

TypeError: Cannot read properties of undefined (reading 'analyzeFile')
    at packages/core/dist/fingerprint.js:168:35

…because registry was undefined. Result: fingerprints.json silently failed to regen on Phase 7, leaving the baseline stale.

Fix: option (A) from the proposal — auto-init inside buildFingerprintStore.

  • registry and gitCommitHash are now optional.
  • When registry is missing OR empty, the function constructs one and registers all built-in non-code parsers (registerAllParsers) plus a TreeSitterPlugin with builtinLanguageConfigs (the 10 supported code languages, WASM grammars awaited).
  • When gitCommitHash is missing, the function shells out to git rev-parse HEAD in projectDir; falls back to "unknown" if not a git repo or git is unavailable.
  • The function is now async to match the documented await and to allow the tree-sitter WASM init. Callers that already pass a populated registry are untouched — auto-registration only fires on an empty registry.

Updated SKILL.md to call out that the 2-argument form is the supported standalone path and that 3rd/4th args remain available for explicit-control / CI scenarios.
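The defaulting behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in fingerprint.ts: `SketchRegistry`, `prepareFingerprintInputs`, and the placeholder plugin names are hypothetical stand-ins for the real PluginRegistry, registerAllParsers, and TreeSitterPlugin, which are not reproduced here. Only the `git rev-parse HEAD` call with an "unknown" fallback and the "auto-register only when empty" rule come from the description.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical stand-in for the real registry; not the core package's API.
interface Plugin { name: string; }
class SketchRegistry {
  private plugins: Plugin[] = [];
  register(p: Plugin): void { this.plugins.push(p); }
  get size(): number { return this.plugins.length; }
  names(): string[] { return this.plugins.map((p) => p.name); }
}

// Resolve HEAD in projectDir; any failure (not a repo, git missing,
// directory missing) yields "unknown".
function resolveGitCommitHash(projectDir: string): string {
  try {
    return execFileSync("git", ["rev-parse", "HEAD"], {
      cwd: projectDir,
      encoding: "utf8",
      stdio: ["ignore", "pipe", "ignore"],
    }).trim();
  } catch {
    return "unknown";
  }
}

async function prepareFingerprintInputs(
  projectDir: string,
  registry?: SketchRegistry,
  gitCommitHash?: string,
): Promise<{ registry: SketchRegistry; gitCommitHash: string }> {
  const effective = registry ?? new SketchRegistry();
  if (effective.size === 0) {
    // Placeholder for the real auto-registration; the await models the
    // asynchronous WASM grammar initialization.
    await Promise.resolve();
    effective.register({ name: "builtin-parsers" });
    effective.register({ name: "tree-sitter" });
  }
  return {
    registry: effective,
    gitCommitHash: gitCommitHash ?? resolveGitCommitHash(projectDir),
  };
}
```

A caller that passes a populated registry and an explicit hash gets both back unchanged, which mirrors the "existing callers are untouched" guarantee; the two-argument path fills in both defaults.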

Test plan

  • pnpm --filter @understand-anything/core build clean (no TS errors).
  • pnpm --filter @understand-anything/core test: 673 tests, all green (previously 670; added 3 new in fingerprint-autoinit.test.ts).
  • New test file uses the real filesystem (the existing fingerprint.test.ts mocks node:fs, so the smoke test had to live separately). It covers:
    • Zero-config (two-argument) call returns a populated store ({ version: "1.0.0", files: { ... }, gitCommitHash: "unknown" } outside a git repo).
    • Tree-sitter extracts function/class signatures for both TypeScript and Python sample files (this is the direct regression test for the analyzeFile undefined crash).
    • Caller-provided non-empty registry stays untouched — auto-registration is skipped, sentinel plugin output wins, explicit commit hash flows through.
  • Aeropax regen (real-project smoke) — after this lands, re-running /understand against C:\dev\aeropax should drop orphan ALTER-migration count from 8 → 0 and produce a fresh fingerprints.json.
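For the Aeropax check, the orphan count could be taken with something like the sketch below. It assumes a graph with `nodes` carrying an `id` and `edges` carrying `source` and `target`, and it treats "orphan" as "a node that no edge touches"; both the shape and that definition are assumptions made for illustration, not the project's schema.

```typescript
// Hypothetical graph shape; the real knowledge-graph schema may differ.
interface KGNode { id: string; }
interface KGEdge { source: string; target: string; }

// Return ids with the given prefix that appear in no edge at all.
function findOrphans(nodes: KGNode[], edges: KGEdge[], prefix = "table:"): string[] {
  const connected = new Set<string>();
  for (const e of edges) {
    connected.add(e.source);
    connected.add(e.target);
  }
  return nodes
    .filter((n) => n.id.startsWith(prefix) && !connected.has(n.id))
    .map((n) => n.id);
}
```

Running it before and after the regen and comparing the lengths of the two results would give the before and after counts.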

Pre-existing failure

pnpm --filter @understand-anything/skill test has one pre-existing failure in extract-structure.test.mjs (1 fail / 43 pass on main; 1 fail / 44 pass on this branch). Verified on main — unrelated to these changes.

Out of scope

  • Other orphan classes from the same Aeropax regen (36 config() runtime-string orphans, 24 markdown→code document orphans, 268 legacy file orphans) — bigger investigations, separate tasks.
  • Adding a Laravel framework prompt to skills/understand/frameworks/ — the new guidance in file-analyzer.md is framework-agnostic and covers Rails / Django / Laravel uniformly. A dedicated laravel.md can come later if it justifies its own prompt.

…stry

Two unrelated but small gaps in /understand that show up on Laravel
projects, bundled to keep the noise down:

Gap 1 — ALTER migrations produce orphan nodes
  file-analyzer.md only knew about CREATE-shape SQL migrations. A
  Laravel `Schema::table('users', ...)` (and the equivalent Rails
  `change_table` / Django `AddField`) emitted a bare file: node with
  no migrates edge to the existing table: node it was modifying, so
  the table picked up an orphan sibling on every ALTER. Added an
  explicit ORM migrations section with CREATE vs ALTER guidance plus
  two new rows in the Edge Signal Quick Reference table.

Gap 2 — buildFingerprintStore required hidden setup
  SKILL.md Phase 7 step 2.5 documents a two-arg call:
      const store = await buildFingerprintStore(root, paths);
  …but the actual signature required a fully-initialized PluginRegistry
  and a git commit hash, so callers crashed with
      TypeError: Cannot read properties of undefined (reading 'analyzeFile')
  at fingerprint.js:168. Made `registry` and `gitCommitHash` optional
  and lazy: an empty/missing registry gets all built-in parsers plus
  the tree-sitter plugin auto-registered (10 code languages, WASM
  init awaited), and a missing commit hash is resolved via
  `git rev-parse HEAD` in the project root (falls back to "unknown"
  if not a git repo). Existing 4-arg callers are unaffected; the
  function is now async to match the documented `await`.

Tests
  - 3 new tests in fingerprint-autoinit.test.ts use the real
    filesystem (no fs mock) to cover the zero-config contract, real
    TS/Python structural extraction, and the "caller-provided registry
    wins" path.
  - All 673 core tests green. Pre-existing extract-structure.test.mjs
    skill failure is on main and unrelated.