Workspace lifecycle refactor: provision at identity, strict resolver, tighten placement API#67
Merged
mgoldsborough merged 6 commits intomainfrom Apr 24, 2026
Merged
Conversation
Move workspace provisioning from the data path (resolveWorkspace middleware, fired on the first tool call) to the identity path (provisionUser in each adapter, fired once on first login). Establishes the invariant "authenticated user has ≥1 workspace" at the identity boundary so request resolvers can treat it as a hard requirement. Fixes a timing bug where GET /v1/bootstrap returns before any workspace exists for a first-login user, leading to a blank workspace selector and unfiltered placements leaking apps from other workspaces. New src/workspace/provisioning.ts::ensureUserWorkspace is idempotent and concurrent-safe: a deterministic slug + WorkspaceConflictError / MemberConflictError reconciliation guarantees exactly one workspace per user even under racing first-login requests. Replaces the old timestamp-suffix fallback in resolveWorkspace which could create duplicate workspaces under StrictMode or retries. Wired into WorkOS (exchangeCode -> provisionUser), OIDC (first valid JWT -> verifyRequest), and Dev (ensureDefaults). resolveWorkspace's auto-provision branch stays for now as a safety net; removed in Phase 2.
resolveWorkspace (src/api/auth-middleware.ts) no longer picks defaults or auto-provisions. Both behaviors were footguns: - Silent single-workspace resolution: a client that "just worked" with one workspace would 400 the next day when the user was added to a second. Honest contract: the caller names the workspace. - Auto-provisioning on zero: side effect on the data path. Phase 1 moved this to the identity boundary; now the fallback is removed. The resolver now requires either an X-Workspace-Id header or a conversation-derived wsId (chat paths, tied to stored state). Missing → 400 workspace_required, with a message pointing at the bootstrap endpoint or Settings → Profile → MCP Connection. handleBootstrap (GET /v1/bootstrap) does NOT mount requireWorkspace middleware — it's the one endpoint where a client legitimately cannot yet know a wsId. The handler does local permissive resolution: honor X-Preferred-Workspace hint when it matches a membership, else pick the user's first workspace. Placement filter is now unconditional — if userWorkspaces is empty at bootstrap time, the Phase 1 invariant has broken and we 500 loudly instead of leaking placements from every workspace on the tenant (which was the original Mario bug). Test changes: integration tests that previously relied on silent single-workspace resolution now send the X-Workspace-Id header explicitly. Three blocks in api-integration.test.ts were using the default work dir (shared state across parallel suites, masked by the silent resolver) — added isolated work dirs. No behavior regressions against the pre-existing lifecycle test failures on main.
Reduce the registry's read surface to a single method, forWorkspace(wsId), that returns ambient (no wsId) plus scoped-to-this-workspace entries merged and sorted. Remove all() and forSlot(). The removed methods were footguns: in a multi-tenant system no legitimate caller wants placements unrelated to a workspace. all() was the exact API that let the original Mario/HQ-tenant bug compile — handlers could fetch every tenant's nav entries with one call. With forWorkspace as the only accessor, the leak becomes unwritable. The concept of "global" vs "workspace-scoped" placements was also wrong in spirit. Home/Conversations/Files/Settings aren't global — they operate inside a workspace; they just happen to be provided by the platform rather than by an installed bundle. Renaming to "ambient" (always present within a workspace) makes the semantics honest. Collateral deletions: - src/runtime/workspace-access.ts — filterPlacementsForWorkspace folded into PlacementRegistry.forWorkspace. Deleted. - test/unit/runtime/workspace-access.test.ts — contract now covered by placement-registry.test.ts. Deleted. handleBootstrap and handleShell become one-line readers. handleShell is also tightened to require workspaceId (always set by the requireWorkspace middleware that already gates the route); the silent "no workspaceId → return everything" fallback is gone. Zero new test failures against the pre-existing 37 lifecycle-test failures on main.
Addresses PR review feedback. Phase 1 took workspaceStore as optional (workspaceStore?: WorkspaceStore) to minimize test-site churn, but that's a silent footgun: a future provider author could forget to pass it, provisioning would no-op, and the user would hit a confusing 400 on their first data request rather than a clear error at wiring time. The API principle is the same one Phase 3 applied to PlacementRegistry (remove all() and forSlot()): make wrong code unwritable. Required parameter is the strongest form of the guarantee. Production wiring (createIdentityProvider factory, server.ts dev-mode construction) already passes workspaceStore — only tests were relying on the optional form. Also removes the now-dead `if (this.workspaceStore)` guards inside each provider's provisioning call site, since the type now excludes null. userStore stays optional on the WorkOS provider — documented in-line why: WorkOS itself is the source of truth for users (managedUsers: true), and the local profile is a preferences cache. workspaceStore does not have that asymmetry.
Addresses three warnings and two suggestions from code review.
W1+W2: self-healing workspace provisioning on every verifyRequest.
Move ensureUserWorkspace out of first-login-only gates. OIDC, WorkOS
(both AuthKit and regular paths), and dev now call it on every
successful verifyRequest. Happy path is one filesystem read (same
tier as existing per-request work) and no writes.
Fixes:
- Pre-existing users whose workspace was lost (admin deletion,
partial failure, migration drop) would permanently 500 at
/v1/bootstrap with no client-side recovery path. They now recover
on next login.
- The AuthKit/MCP-OAuth path never routes through exchangeCode, so
users authenticated via that path had no workspace provisioned.
They do now.
Remove the redundant ensureUserWorkspace from WorkOS provisionUser
(verifyRequest handles it). Rename dev's ensureDefaults →
ensureUserProfile to reflect that it only seeds the profile.
W3: drop deriveSlug truncation.
Old code sliced user IDs to 16 chars. For OIDC IDs (usr_oidc_<12
hex>) that left ~7 hex of entropy and birthday-collided at ~16K
users. On collision, reconcileConflict silently added both users as
admins of the same workspace — a cross-tenant leak. WORKSPACE_ID_RE
allows 64 chars; no reason to truncate. Drop it. Remove the unit
test that codified the collision behavior as correct. Add a
regression test asserting different OIDC IDs with shared hex
prefixes now yield distinct workspaces.
S4+S5: sharpen comments.
reconcileConflict's !existing branch comment narrowed to the real
trigger (concurrent delete between create() throw and our re-read).
ensureUserWorkspace doc tightened to accurately describe what
WorkspaceStore.create atomically detects vs what it doesn't.
Header unification: delete X-Preferred-Workspace.
Pre-existing dual-header design ("X-Preferred-Workspace" for
bootstrap, "X-Workspace-Id" for data endpoints) violated the same
"API shapes should not require context-specific rules" principle
Phase 3 applied to PlacementRegistry. Unify on X-Workspace-Id:
bootstrap reads it as a soft hint (only place the server defaults,
per Phase 2); data endpoints keep authoritative semantics. Remove
from CORS allowlist, web client, handler. One header, one name.
AuthKit: new test asserting verifyRequest provisions a workspace on
successful MCP-OAuth auth. Covers the path that has no exchangeCode.
OIDC integration: new self-heal test that deletes a user's workspace
and asserts the next verifyRequest re-establishes the invariant
instead of leaving the user stuck.
Zero unit-test regressions. Integration suite is stable against the
pre-existing 37 lifecycle failures on main. (One intermittent flake —
"10 concurrent authenticated chat requests" — passes in isolation but
can time out under parallel load; exposed by Phase 2 actually reaching
the stream path. Tracked separately, not a behavior regression.)
resolveWorkspace took a `conversationWorkspaceId?: string` second-fallback
parameter intended for chat paths — the idea being that a request with a
known conversationId could derive workspace from the conversation's
stored wsId instead of requiring X-Workspace-Id.
No caller passed it. requireWorkspace middleware and the /mcp route both
call the three-arg form. The fallback branch was unreachable for the
entire time it existed.
Two unit tests in workspace-context.test.ts exercised the dead parameter
directly ("resolves from conversation's workspaceId when no header",
"header takes precedence over conversation workspaceId"). They verified
behavior no production code ever relied on. Deleted.
Now that X-Workspace-Id is the one-and-only way to name a workspace on
data endpoints (bootstrap handles its own defaulting), this parameter
was just a footgun waiting to grow. Delete it — matches the same
"API shapes make wrong code unwritable" principle applied to
PlacementRegistry.all() and the header unification.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the HQ-tenant bug Mario hit on first login (blank workspace selector + duplicate apps in sidebar) and removes the bug class that allowed it.
ensureUserWorkspacecalled from each identity provider's first-login hook (WorkOSprovisionUser, OIDCverifyRequest, DevensureDefaults). Idempotent and concurrent-safe. Invariant "authenticated user has ≥1 workspace" holds at the identity boundary — no more half-provisioned user state between auth and first data request.resolveWorkspacebecomes pure selection: header or conversation-derived wsId, else 400. Both the silent single-workspace default and the data-path auto-provision are gone./v1/bootstrapis exempt fromrequireWorkspacemiddleware — it's the discovery endpoint and does its own local resolution (honorsX-Preferred-Workspace, defaults to first membership).all()andforSlot(). OnlyforWorkspace(wsId)remains, returning ambient + scoped merged. The original placement-leak bug becomes unwritable, not merely caught.workspace-access.tsdeleted (folded into registry).Three phases, three commits for clean review/bisect. Net
+714/−428across 34 files; production-code surface shrinks by ~100 LOC after Phase 3's deletions.Test plan
X-Workspace-Idheader continue working