fix(bootstrap): cover files + oauth_clients forward-refs (closes #974, #1018) #1045
…ytan#974, garrytan#1018)

applyForwardReferenceBootstrap probes 18 columns but missed four that SCHEMA_SQL declares unguarded indexes against:

- files.source_id (v0.18 Step 7, schema.sql:499 idx)
- files.page_id (v0.18 Step 7, schema.sql:498 idx)
- oauth_clients.source_id (v60, schema.sql:434-435 partial idx)
- oauth_clients.federated_read (v61, schema.sql:436-437 GIN idx)

Pre-v0.18 brains wedge on `gbrain init --migrate-only` / `gbrain apply-migrations --yes` with a one-line `column "<name>" does not exist` and no stack trace, identical in shape to the 11 prior wedge incidents tracked in REQUIRED_BOOTSTRAP_COVERAGE's docstring.

Same fix structure as the existing 18 probe blocks. Both engines (postgres-engine.ts + pglite-engine.ts) get the same 4 EXISTS probes, needs flags, and ALTER ADD blocks. Branch ordering preserves `needsPagesBootstrap` first so the sources(id) FK target exists when the files/oauth_clients ALTERs run. oauth_clients.source_id uses ON DELETE RESTRICT to match schema.sql:427 + v64's final intent (skips a redundant subsequent ALTER per the v60-idempotency-guard TODO).

Doctor's schema_version check (per garrytan#1018 secondary suggestion) now points users at `gbrain init --migrate-only` first, then `apply-migrations --yes`. When bootstrap is wedged, apply-migrations itself can't proceed (its Phase A calls init --migrate-only).

Test coverage:

- REQUIRED_BOOTSTRAP_COVERAGE gains 4 entries + DROP statements in both setup blocks, exercised by the existing assertion loop
- test/bootstrap.test.ts: 4 new unit cases (files, oauth_clients + SCHEMA_SQL replay-succeeds-after-bootstrap, plus defense-in-depth cases for very-old brains lacking the files/oauth_clients tables entirely)
- test/e2e/postgres-bootstrap.test.ts: rewound-brain E2E that strips all 4 columns + pages.source_id together against real Postgres, asserts initSchema() reaches LATEST_VERSION with each column + each previously-wedged index built. Mirrors commit 336597c's pattern for the v39-v41 wave.

Verified: typecheck clean; bootstrap-focused unit tests 15/15 pass; fresh-container E2E (postgres-bootstrap + schema-drift) 9/9 pass. Diagnosed in a Claude Code session against my own brain wedge.
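
The probe, needs-flag, and ordered-ALTER structure described in this commit message can be sketched roughly as follows. This is a minimal TypeScript illustration under stated assumptions, not the code in `postgres-engine.ts` or `pglite-engine.ts`: the `Probe` type, the `PROBES` list, `planBootstrap`, and the sample brain shape are all hypothetical names, and `pages.source_id` merely stands in for the existing `needsPagesBootstrap` branch. It returns the columns to add rather than DDL, because the column types are not stated here.

```typescript
// Hypothetical sketch of the probe -> needs-flag -> ordered ALTER pattern.
// None of these names come from the real engines.

interface Probe {
  table: string;
  column: string;
}

// Order matters: the pages branch goes first so the sources(id) FK target
// exists before the files / oauth_clients columns are added.
const PROBES: Probe[] = [
  { table: "pages", column: "source_id" }, // stand-in for needsPagesBootstrap
  { table: "files", column: "source_id" },
  { table: "files", column: "page_id" },
  { table: "oauth_clients", column: "source_id" },
  { table: "oauth_clients", column: "federated_read" },
];

// A brain's current shape: table name -> columns that exist today.
type BrainShape = Map<string, Set<string>>;

// Returns the columns bootstrap would need to add, in FK-safe order.
function planBootstrap(shape: BrainShape): Probe[] {
  return PROBES.filter(({ table, column }) => {
    const columns = shape.get(table);
    // Table missing entirely (very old brain): no-op for this branch.
    if (columns === undefined) return false;
    return !columns.has(column);
  });
}

// Assumed rewound shape; the "id" / "client_id" columns are placeholders.
const rewoundBrain: BrainShape = new Map([
  ["pages", new Set(["id"])],
  ["files", new Set(["id"])],
  ["oauth_clients", new Set(["client_id"])],
]);

const plan = planBootstrap(rewoundBrain).map((p) => `${p.table}.${p.column}`);
```

For the rewound brain above, `plan` lists `pages.source_id` first and then the four newly probed columns. A brain with no `files` or `oauth_clients` table contributes no entries for that branch, which mirrors the defense-in-depth unit cases.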
Thanks @joshwilks111-max, this already shipped in master as of v0.35.5.0. If your install is still wedged on … Closing as already-shipped. Real appreciation for chasing this: the structural fix-set with the schema-bootstrap-coverage CI guard now prevents the entire bug class.
Summary
Fix-wave that closes a class of upgrade-path bootstrap-gap bugs. Adds 4 missing forward-reference probes to `applyForwardReferenceBootstrap` (both engines), updates `REQUIRED_BOOTSTRAP_COVERAGE`, adds rewound-brain unit + E2E coverage, and fixes the misleading doctor hint #1018 called out.

Closes:
- `files.source_id` + `files.page_id`
- `oauth_clients.source_id` + `oauth_clients.federated_read` + the doctor hint suggestion
- `files.page_id` reference

What's broken
Brains at schema version pre-v0.18 (or pre-v60 for the oauth_clients case) wedge on `gbrain init --migrate-only` and `gbrain apply-migrations --yes` with a one-line `column "<name>" does not exist` error. No stack trace. The wedge fires inside SCHEMA_SQL apply when the embedded schema blob's `CREATE INDEX` references a column the brain's table predates.

`applyForwardReferenceBootstrap` is supposed to add these forward-referenced columns BEFORE SCHEMA_SQL replay so the indexes can build. It probes 18 columns but missed:

- `files.source_id`: `CREATE INDEX idx_files_source_id ON files(source_id)` (schema.sql:499)
- `files.page_id`: `CREATE INDEX idx_files_page_id ON files(page_id)` (schema.sql:498)
- `oauth_clients.source_id`: `CREATE INDEX idx_oauth_clients_source_id ON oauth_clients(source_id) WHERE source_id IS NOT NULL` (schema.sql:434-435)
- `oauth_clients.federated_read`: `CREATE INDEX idx_oauth_clients_federated_read ON oauth_clients USING GIN (federated_read)` (schema.sql:436-437)

Changes
`src/core/postgres-engine.ts` + `src/core/pglite-engine.ts`
- `needsXBootstrap` flags + corresponding ALTER blocks (`needsFilesBootstrap` + `needsOauthClientsBootstrap`)
- `files` and `oauth_clients` ALTERs run AFTER the `needsPagesBootstrap` block (which creates `sources`) so the FK targets exist
- `oauth_clients.source_id` uses `ON DELETE RESTRICT` to match `src/schema.sql:427` and v64's final intent (avoids a redundant subsequent ALTER; per the `TODOS.md` "v60 idempotency guard" note)
- `IF NOT EXISTS` for idempotency (matches existing pattern)

`src/commands/doctor.ts` (per #1018 secondary suggestion)
- `schema_version` check now points users at `gbrain init --migrate-only` first, then `gbrain apply-migrations --yes`. When bootstrap is wedged, `apply-migrations` itself can't proceed (its Phase A calls `init --migrate-only`), so the original hint was a dead-end. Two-line message change in two sites.

`test/schema-bootstrap-coverage.test.ts`
- 4 new entries in `REQUIRED_BOOTSTRAP_COVERAGE`, exercised automatically by the existing assertion loop

`test/bootstrap.test.ts`
4 new unit cases:
- `pre-v0.18 files shape: bootstrap adds source_id + page_id (closes #974)`: also asserts SCHEMA_SQL replay succeeds afterward
- `pre-v60 oauth_clients shape: bootstrap adds source_id + federated_read (closes #1018)`: also asserts `federated_read` is `TEXT[]` per schema, and that SCHEMA_SQL replay succeeds afterward (the actual #1018 bug surface)
- `absent oauth_clients table (very old brain): bootstrap no-ops the oauth branch`: defense against pre-v0.26 brains that have no oauth_clients table at all
- `absent files table (very old brain): bootstrap no-ops the files branch`: same defense pattern

`test/e2e/postgres-bootstrap.test.ts`
New rewound-brain E2E that strips ALL 4 newly-probed columns + `pages.source_id` together, then runs `PostgresEngine.initSchema()` and asserts:
- `config.version` reaches `LATEST_VERSION`
- each column and each previously-wedged index (`idx_files_page_id`, `idx_files_source_id`, `idx_oauth_clients_source_id`, `idx_oauth_clients_federated_read`) is present

This is the contract test for branch ordering. If the `files` or `oauth_clients` ALTER fires before the `sources` table is created (FK target missing), the test fails loud.

Mirrors commit `336597c`'s pattern (rewound-brain E2E for v39-v41 forward-reference bootstrap).

What's NOT in scope
- … `7d39527` pattern for prior bootstrap fixes).
- `v60 idempotency guard against --force-retry race with v64` (per `TODOS.md`): directly relevant to this PR's `ON DELETE RESTRICT` decision but a separate code change in `migrate.ts`. Documented here for context.
- `content_chunks.source_id`: initially included in the PR draft but trimmed after verification. The column is in `src/schema.sql:218` but NOT in `src/core/schema-embedded.ts` (the runtime SQL Postgres replays), so it isn't a SCHEMA_SQL forward-reference. Adding it via bootstrap created real schema drift vs PGLite that the `schema-drift.test.ts` gate caught. The `schema.sql` ↔ `schema-embedded.ts` discrepancy is a separate concern for a different PR.

Test plan
- `bun run typecheck`: clean
- `bun test test/bootstrap.test.ts test/schema-bootstrap-coverage.test.ts`: 15/15 pass (4 new cases + the array test exercising 4 new entries)
- `bun test test/e2e/schema-drift.test.ts test/e2e/postgres-bootstrap.test.ts` against fresh Postgres: 9/9 pass, including the new rewound-brain E2E and the schema-drift parity gate
- … `Bun.spawn(['bun', ...])` ENOENT: the spawned process can't find `bun` via bare name). Unrelated to this PR.

Diagnosis context
I hit `oauth_clients.source_id` first on my own brain (the federation-attribution bug PR #776 fixed already, then needed schema migration to unlock `--strategy code` indexing for Cathedral II). Searching the issue tracker surfaced the recurring pattern across #974, #1018, #820: same shape every time. Bundling all four columns into one PR closes 3 issues at once and signals the pattern recognition the bootstrap docstring asks for ("11 wedge incidents and counting").
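
The recurring shape described above (a `CREATE INDEX` in the replayed schema pointing at a column that bootstrap never probes) is mechanical enough to check for. The sketch below is a hypothetical TypeScript illustration of that idea, not the project's actual `schema-bootstrap-coverage` guard: `findIndexRefs`, `uncoveredForwardRefs`, and the `baseline` / `covered` inputs are invented names, and the regex only handles simple `CREATE INDEX` statements with plain column lists.

```typescript
// Hypothetical sketch: flag indexed columns that neither the oldest table
// shape (baseline) nor the bootstrap probe list (covered) accounts for.

interface IndexRef {
  index: string;
  table: string;
  columns: string[];
}

// Parses simple statements of the form
//   CREATE [UNIQUE] INDEX [IF NOT EXISTS] name ON table [USING method] (cols)
// Expression entries and entries with modifiers (e.g. "col DESC") are skipped.
function findIndexRefs(sql: string): IndexRef[] {
  const re =
    /CREATE\s+(?:UNIQUE\s+)?INDEX\s+(?:IF\s+NOT\s+EXISTS\s+)?(\w+)\s+ON\s+(\w+)\s*(?:USING\s+\w+\s*)?\(([^)]*)\)/gi;
  const refs: IndexRef[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(sql)) !== null) {
    const columns = m[3]
      .split(",")
      .map((c) => c.trim())
      .filter((c) => /^\w+$/.test(c));
    refs.push({ index: m[1], table: m[2], columns });
  }
  return refs;
}

// Returns "table.column" keys that an index references but that are in
// neither the baseline shape nor the bootstrap coverage list.
function uncoveredForwardRefs(
  schemaSql: string,
  baseline: Set<string>,
  covered: Set<string>,
): string[] {
  const gaps: string[] = [];
  const seen = new Set<string>();
  for (const ref of findIndexRefs(schemaSql)) {
    for (const column of ref.columns) {
      const key = `${ref.table}.${column}`;
      if (baseline.has(key) || covered.has(key) || seen.has(key)) continue;
      seen.add(key);
      gaps.push(key);
    }
  }
  return gaps;
}

// The four index statements quoted in this PR.
const SAMPLE_SQL = `
CREATE INDEX idx_files_source_id ON files(source_id);
CREATE INDEX idx_files_page_id ON files(page_id);
CREATE INDEX idx_oauth_clients_source_id ON oauth_clients(source_id) WHERE source_id IS NOT NULL;
CREATE INDEX idx_oauth_clients_federated_read ON oauth_clients USING GIN (federated_read);
`;

const gapsBeforeFix = uncoveredForwardRefs(SAMPLE_SQL, new Set<string>(), new Set<string>());
const gapsAfterFix = uncoveredForwardRefs(SAMPLE_SQL, new Set<string>(), new Set(gapsBeforeFix));
```

With an empty coverage list, all four columns from this PR are reported as gaps; once they are listed as covered, the result is empty. A real guard would need a proper SQL parser and an accurate baseline, so treat this only as an outline of the check.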