Skip to content

fix(bootstrap): cover files + oauth_clients forward-refs (closes #974, #1018)#1045

Closed
joshwilks111-max wants to merge 1 commit into
garrytan:masterfrom
joshwilks111-max:joshwilks111-max/bootstrap-forward-refs-wave
Closed

fix(bootstrap): cover files + oauth_clients forward-refs (closes #974, #1018)#1045
joshwilks111-max wants to merge 1 commit into
garrytan:masterfrom
joshwilks111-max:joshwilks111-max/bootstrap-forward-refs-wave

Conversation

@joshwilks111-max
Copy link
Copy Markdown

Summary

Fix-wave that closes a class of upgrade-path bootstrap-gap bugs. Adds 4 missing forward-reference probes to applyForwardReferenceBootstrap (both engines), updates REQUIRED_BOOTSTRAP_COVERAGE, adds rewound-brain unit + E2E coverage, and fixes the misleading doctor hint #1018 called out.

Closes:

What's broken

Brains at schema version pre-v0.18 (or pre-v60 for the oauth_clients case) wedge on gbrain init --migrate-only and gbrain apply-migrations --yes with a one-line column "<name>" does not exist error. No stack trace. The wedge fires inside SCHEMA_SQL apply when the embedded schema blob's CREATE INDEX references a column the brain's table predates.

applyForwardReferenceBootstrap is supposed to add these forward-referenced columns BEFORE SCHEMA_SQL replay so the indexes can build. It probes 18 columns but missed:

Column Migration Issue SCHEMA_SQL forward ref
files.source_id v0.18 Step 7 #974 CREATE INDEX idx_files_source_id ON files(source_id) (schema.sql:499)
files.page_id v0.18 Step 7 #974 (+ column half of #820) CREATE INDEX idx_files_page_id ON files(page_id) (schema.sql:498)
oauth_clients.source_id v60 (#861) #1018 partial CREATE INDEX idx_oauth_clients_source_id ON oauth_clients(source_id) WHERE source_id IS NOT NULL (schema.sql:434-435)
oauth_clients.federated_read v61 (#876) #1018 GIN CREATE INDEX idx_oauth_clients_federated_read ON oauth_clients USING GIN (federated_read) (schema.sql:436-437)

Changes

src/core/postgres-engine.ts + src/core/pglite-engine.ts

  • 4 new EXISTS probes added to the round-trip detection SELECT (mirrors existing 18-probe pattern)
  • 2 new needsXBootstrap flags + corresponding ALTER blocks (needsFilesBootstrap + needsOauthClientsBootstrap)
  • Branch ordering preserved: files and oauth_clients ALTERs run AFTER the needsPagesBootstrap block (which creates sources) so the FK targets exist
  • oauth_clients.source_id uses ON DELETE RESTRICT to match src/schema.sql:427 and v64's final intent (avoids redundant subsequent ALTER; per the TODOS.md "v60 idempotency guard" note)
  • All ALTER statements use IF NOT EXISTS for idempotency (matches existing pattern)

src/commands/doctor.ts (per #1018 secondary suggestion)

  • schema_version check now points users at gbrain init --migrate-only first, then gbrain apply-migrations --yes. When bootstrap is wedged, apply-migrations itself can't proceed (its Phase A calls init --migrate-only), so the original hint was a dead-end. Two-line message change in two sites.

test/schema-bootstrap-coverage.test.ts

  • 4 new entries in REQUIRED_BOOTSTRAP_COVERAGE — exercised automatically by the existing assertion loop
  • 4 corresponding DROP statements in both DROP setup blocks (the per-target asserts test + the SCHEMA_SQL replay test)

test/bootstrap.test.ts

4 new unit cases:

  • pre-v0.18 files shape: bootstrap adds source_id + page_id (closes #974) — also asserts SCHEMA_SQL replay succeeds afterward
  • pre-v60 oauth_clients shape: bootstrap adds source_id + federated_read (closes #1018) — also asserts federated_read is TEXT[] per schema, and that SCHEMA_SQL replay succeeds afterward (the actual PGLite upgrade wedge: applyForwardReferenceBootstrap missing v60 oauth_clients forward refs #1018 bug surface)
  • absent oauth_clients table (very old brain): bootstrap no-ops the oauth branch — defense against pre-v0.26 brains that have no oauth_clients table at all
  • absent files table (very old brain): bootstrap no-ops the files branch — same defense pattern

test/e2e/postgres-bootstrap.test.ts

New rewound-brain E2E that strips ALL 4 newly-probed columns + pages.source_id together, then runs PostgresEngine.initSchema() and asserts:

  • config.version reaches LATEST_VERSION
  • Each newly-probed column exists post-bootstrap
  • Every newly-built index (idx_files_page_id, idx_files_source_id, idx_oauth_clients_source_id, idx_oauth_clients_federated_read) is present

This is the contract test for branch ordering. If files or oauth_clients ALTER fires before the sources table is created (FK target missing), the test fails loud.

Mirrors commit 336597c's pattern (rewound-brain E2E for v39-v41 forward-reference bootstrap).

What's NOT in scope

  • Structural upgrade-path test — the class of bug that keeps recurring needs a structural fix, not just another column added to the hand-maintained list. Filed as Add structural upgrade-path test to catch the bootstrap-gap class at PR time #1044 with two design sketches (snapshot-based vs replay-based) for a separate PR.
  • VERSION + CHANGELOG bump — leaving for maintainer's fix-wave packaging (matches the 7d39527 pattern for prior bootstrap fixes).
  • v60 idempotency guard against --force-retry race with v64 (per TODOS.md) — directly relevant to this PR's ON DELETE RESTRICT decision but a separate code change in migrate.ts. Documented here for context.
  • content_chunks.source_id — initially included in the PR draft but trimmed after verification: the column is in src/schema.sql:218 but NOT in src/core/schema-embedded.ts (the runtime SQL Postgres replays), so it isn't a SCHEMA_SQL forward-reference. Adding it via bootstrap created real schema drift vs PGLite that the schema-drift.test.ts gate caught. The schema.sqlschema-embedded.ts discrepancy is a separate concern for a different PR.

Test plan

  • bun run typecheck — clean
  • bun test test/bootstrap.test.ts test/schema-bootstrap-coverage.test.ts15/15 pass (4 new cases + the array test exercising 4 new entries)
  • bun test test/e2e/schema-drift.test.ts test/e2e/postgres-bootstrap.test.ts against fresh Postgres — 9/9 pass including the new rewound-brain E2E and the schema-drift parity gate
  • Targeted unit suite (9 files, 297 tests): 294 pass / 3 fail. The 3 failures are pre-existing Windows Git-Bash environment issues (Bun.spawn(['bun', ...]) ENOENT — the spawned process can't find bun via bare name). Unrelated to this PR.

Diagnosis context

I hit oauth_clients.source_id first on my own brain (the federation-attribution bug PR #776 fixed already, then needed schema migration to unlock --strategy code indexing for Cathedral II). Searching the issue tracker surfaced the recurring pattern across #974, #1018, #820 — same shape every time. Bundling all four columns into one PR closes 3 issues at once and signals the pattern recognition the bootstrap docstring asks for ("11 wedge incidents and counting").

…ytan#974, garrytan#1018)

applyForwardReferenceBootstrap probes 18 columns but missed four that
SCHEMA_SQL declares unguarded indexes against:

  - files.source_id            (v0.18 Step 7, schema.sql:499 idx)
  - files.page_id              (v0.18 Step 7, schema.sql:498 idx)
  - oauth_clients.source_id    (v60, schema.sql:434-435 partial idx)
  - oauth_clients.federated_read (v61, schema.sql:436-437 GIN idx)

Pre-v0.18 brains wedge on `gbrain init --migrate-only` /
`gbrain apply-migrations --yes` with a one-line `column "<name>" does
not exist` and no stack trace, identical shape to the 11 prior wedge
incidents tracked in REQUIRED_BOOTSTRAP_COVERAGE's docstring.

Same fix structure as the existing 18 probe blocks. Both engines
(postgres-engine.ts + pglite-engine.ts) get the same 4 EXISTS probes,
needs flags, and ALTER ADD blocks. Branch ordering preserves
`needsPagesBootstrap` first so the sources(id) FK target exists when
files/oauth_clients ALTERs run. oauth_clients.source_id uses ON DELETE
RESTRICT to match schema.sql:427 + v64's final intent (skips a
redundant subsequent ALTER per the v60-idempotency-guard TODO).

Doctor's schema_version check (per garrytan#1018 secondary suggestion) now
points users at `gbrain init --migrate-only` first then
`apply-migrations --yes` — when bootstrap is wedged, apply-migrations
itself can't proceed (its Phase A calls init --migrate-only).

Test coverage:
  - REQUIRED_BOOTSTRAP_COVERAGE gains 4 entries + DROP statements in
    both setup blocks, exercised by the existing assertion loop
  - test/bootstrap.test.ts: 4 new unit cases (files, oauth_clients +
    SCHEMA_SQL replay-succeeds-after-bootstrap, plus defense-in-depth
    cases for very-old brains lacking the files/oauth_clients tables
    entirely)
  - test/e2e/postgres-bootstrap.test.ts: rewound-brain E2E that strips
    all 4 columns + pages.source_id together against real Postgres,
    asserts initSchema() reaches LATEST_VERSION with each column +
    each previously-wedged index built. Mirrors commit 336597c's
    pattern for the v39-v41 wave.

Verified: typecheck clean; bootstrap-focused unit tests 15/15 pass;
fresh-container E2E (postgres-bootstrap + schema-drift) 9/9 pass.

Diagnosed in a Claude Code session against my own brain wedge.
@garrytan
Copy link
Copy Markdown
Owner

Thanks @joshwilks111-max — already shipped in master as of v0.35.5.0. applyForwardReferenceBootstrap in both src/core/pglite-engine.ts and src/core/postgres-engine.ts now probes files.source_id, files.page_id, oauth_clients.source_id, oauth_clients.federated_read, sources.archived, sources.archived_at, and sources.archive_expires_at — before SCHEMA_SQL replays. Closes the pre-v18/v34/v60 forward-reference wedges. test/bootstrap.test.ts + test/schema-bootstrap-coverage.test.ts enforce.

If your install is still wedged on gbrain upgrade, please reopen with the output of gbrain doctor --json and your starting schema_version.

Closing as already-shipped. Real appreciation for chasing this — the structural fix-set with the schema-bootstrap-coverage CI guard now prevents the entire bug class.

@garrytan garrytan closed this May 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants