diff --git a/docs/02_architecture/C4_SYSTEM_DIAGRAM.md b/docs/02_architecture/C4_SYSTEM_DIAGRAM.md
new file mode 100644
index 0000000..4ee47e8
--- /dev/null
+++ b/docs/02_architecture/C4_SYSTEM_DIAGRAM.md
@@ -0,0 +1,352 @@
+# C4 System Diagrams — Code Kit Ultra
+
+**Status:** Authoritative
+**Version:** 1.2.0
+**Last reviewed:** 2026-04-04
+**See also:** `docs/02_architecture/SYSTEM_ARCHITECTURE.md`, `docs/02_architecture/AUTH_ARCHITECTURE.md`
+
+---
+
+## Overview
+
+This document presents the Code Kit Ultra architecture at three levels of abstraction following the [C4 model](https://c4model.com/):
+
+- **Level 1 — System Context:** Code Kit Ultra in relation to users and external systems.
+- **Level 2 — Container Diagram:** The deployable units (apps, packages) and their responsibilities.
+- **Level 3 — Component Diagrams:** Internal component breakdown for the Orchestrator and Auth containers.
+
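+All diagrams use [Mermaid](https://mermaid.js.org/) syntax and render in GitHub, GitLab, and other documentation tooling with Mermaid support. Mermaid marks its C4 diagram type as experimental, so rendering may vary between Mermaid versions.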
+
+---
+
+## Level 1 — System Context
+
+```mermaid
+C4Context
+  title System Context — Code Kit Ultra
+
+  Person(developer, "Developer / Operator", "Human user who submits ideas, reviews gates, and approves actions via CLI or Web UI.")
+  Person(operator, "Operator (Automated)", "CI/CD system or script that drives runs via the Control Service API.")
+  Person(svcAccount, "Service Account", "Machine identity issued by Code Kit Ultra for non-interactive automation flows.")
+
+  System(cku, "Code Kit Ultra", "Orchestration, governance, execution, and learning plane for AI-assisted software engineering. Runs are submitted, planned, gated, executed, healed, and recorded here.")
+
+  System_Ext(insforge, "InsForge Identity Platform", "Issues RS256-signed session JWTs via Supabase Auth. Provides JWKS endpoint, Supabase PostgreSQL, object storage, and Realtime SSE infrastructure.")
+  System_Ext(aiProviders, "AI Providers", "LLM inference endpoints: Anthropic Claude, Google Gemini, OpenAI GPT-4o, Cursor, Windsurf, AntiGravity. Receive adapter-routed prompts and return structured completions.")
+  System_Ext(github, "GitHub", "Source code host. The GitHub provider adapter reads repositories, creates branches, commits files, and opens pull requests.")
+  System_Ext(redis, "Redis", "JWT jti revocation blacklist and JWKS cache TTL store. Used for sub-second token invalidation without DB round-trips.")
+
+  Rel(developer, cku, "Submits ideas, approves gates, views audit trail", "HTTPS / CLI stdin")
+  Rel(operator, cku, "Drives runs programmatically", "HTTPS REST API")
+  Rel(svcAccount, cku, "Executes automated runs", "HTTPS + HS256 JWT")
+
+  Rel(cku, insforge, "Verifies session JWTs via JWKS; reads/writes run data to Supabase DB; streams events via Supabase Realtime", "HTTPS / WebSocket")
+  Rel(cku, aiProviders, "Routes inference requests through adapter layer", "HTTPS / vendor SDK")
+  Rel(cku, github, "Reads repos, creates branches, commits files, opens PRs", "HTTPS / GitHub REST API")
+  Rel(cku, redis, "Checks/writes jti blacklist; caches JWKS public keys", "TCP / Redis protocol")
+
+  UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="2")
+```
+
+### System Context Notes
+
+| Actor / System | Role |
+|---|---|
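+| Developer / Operator | Primary human interface. Uses `apps/cli` or `apps/web-control-plane`. |
+| Operator (Automated) | CI/CD system or script. Drives runs through the Control Service REST API. |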
+| Service Account | Machine-to-machine identity. JWT issued by `packages/auth/src/service-account.ts`. |
+| InsForge | Identity plane. Code Kit Ultra does **not** own human identity. |
+| AI Providers | Stateless inference backends. All routing is done by `packages/adapters`. |
+| GitHub | Target environment for code-producing runs. |
+| Redis | Fast revocation store. The system falls back to an in-memory cache if Redis is unavailable. |
+
+---
+
+## Level 2 — Container Diagram
+
+```mermaid
+C4Container
+  title Container Diagram — Code Kit Ultra
+
+  Person(developer, "Developer / Operator")
+
+  System_Boundary(cku, "Code Kit Ultra") {
+
+    Container(cli, "CLI", "Node.js / TypeScript", "Interactive terminal interface. Issues commands (run, approve, rollback, validate). Located at apps/cli/.")
+    Container(webUI, "Web Control Plane", "React + Vite / TypeScript", "Browser-based dashboard. Displays run status, gate decisions, audit trail, and live SSE stream. Located at apps/web-control-plane/.")
+    Container(controlService, "Control Service", "Node.js / Express / TypeScript", "Single HTTP API server. Handles auth middleware, command routing, SSE event stream, and realtime bridging. Located at apps/control-service/.")
+
+    Container(orchestrator, "Orchestrator", "TypeScript (internal package)", "Drives the full run lifecycle. Contains: phase-engine, execution-engine, intake, planner, gate-manager, action-runner, mode-controller, batch-queue, outcome-engine, healing-integration, resume-run, rollback-engine.")
+    Container(governance, "Governance", "TypeScript (internal package)", "Evaluates the 9 governance gates. Contains: gate-controller, governed-pipeline, confidence-engine, consensus-engine, validation-engine, constraint-engine, intent-engine, adaptive-consensus, kill-switch.")
+    Container(adapters, "Adapters", "TypeScript (internal package)", "AI routing adapters (claude, gemini, openai, cursor, windsurf, antigravity) and provider adapters (FileSystem, Terminal, GitHub). Translates internal action contracts to vendor APIs.")
+    Container(auth, "Auth", "TypeScript (internal package)", "Verifies InsForge session JWTs, resolves session context, issues per-run execution tokens, and manages service accounts.")
+    Container(healing, "Healing", "TypeScript (internal package)", "Post-failure recovery. Classifies failures, selects healing strategies, executes recovery actions, and revalidates outcomes.")
+    Container(learning, "Learning", "TypeScript (internal package)", "Post-run intelligence. Persists outcome records, updates reliability scores, tunes execution policies, and surfaces execution optimisations.")
+    Container(audit, "Audit", "TypeScript (internal package)", "Writes immutable AuditEvents to the DB with SHA256 hash chain for tamper detection.")
+    Container(events, "Events", "TypeScript (internal package)", "Publishes CanonicalEvents to the SSE stream and Supabase Realtime channel using domain.noun.verb naming convention.")
+    Container(observability, "Observability", "TypeScript (internal package)", "Structured trace engine, timeline builder, logger, report renderer, and score explainer for run introspection.")
+    Container(security, "Security", "TypeScript (internal package)", "Action policy enforcement, batch signing, and batch provenance tracking.")
+    Container(policy, "Policy", "TypeScript (internal package)", "RBAC permission resolution, role mapping, and permission constants.")
+    Container(skillEngine, "Skill Engine", "TypeScript (internal package)", "Selects skills for a run plan, resolves manifests, and validates skill schemas.")
+    Container(commandEngine, "Command Engine", "TypeScript (internal package)", "17 command handlers (execute, approve-batch, rollback, validate, etc.) that translate API routes into orchestrator calls.")
+    Container(memory, "Memory", "TypeScript (internal package)", "Run state persistence (run-store.ts) used as the authoritative in-process run record.")
+    Container(shared, "Shared", "TypeScript (internal package)", "Cross-package type definitions: types.ts, contracts.ts, governance-types.ts, observability-types.ts.")
+  }
+
+  System_Ext(insforge, "InsForge Platform", "Supabase Auth JWKS + PostgreSQL + Realtime")
+  System_Ext(aiProviders, "AI Providers", "Claude / Gemini / OpenAI / Cursor / Windsurf / AntiGravity")
+  System_Ext(github, "GitHub")
+  System_Ext(redis, "Redis")
+
+  Rel(developer, cli, "Runs commands", "stdin / stdout")
+  Rel(developer, webUI, "Views dashboard, approves gates", "HTTPS browser")
+  Rel(cli, controlService, "Issues API requests", "HTTPS REST + SSE")
+  Rel(webUI, controlService, "Issues API requests, consumes SSE", "HTTPS REST + SSE")
+
+  Rel(controlService, auth, "Resolves session on every request", "in-process import")
+  Rel(controlService, commandEngine, "Dispatches to command handlers", "in-process import")
+  Rel(commandEngine, orchestrator, "Starts / resumes / rolls back runs", "in-process import")
+  Rel(orchestrator, governance, "Evaluates gates during gating phase", "in-process import")
+  Rel(orchestrator, adapters, "Executes adapter actions during building phase", "in-process import")
+  Rel(orchestrator, healing, "Triggers healing on step failure", "in-process import")
+  Rel(orchestrator, skillEngine, "Selects skills during skills phase", "in-process import")
+  Rel(orchestrator, audit, "Emits AuditEvents at each lifecycle boundary", "in-process import")
+  Rel(orchestrator, events, "Publishes CanonicalEvents for SSE", "in-process import")
+  Rel(orchestrator, learning, "Records outcomes post-run", "in-process import")
+  Rel(orchestrator, security, "Validates action policy and signs batches", "in-process import")
+  Rel(orchestrator, memory, "Reads/writes run state", "in-process import")
+  Rel(orchestrator, observability, "Traces phases and steps", "in-process import")
+  Rel(adapters, aiProviders, "Routes inference requests", "HTTPS / vendor SDK")
+  Rel(adapters, github, "Executes GitHub actions", "HTTPS / GitHub REST API")
+  Rel(auth, insforge, "Fetches JWKS, validates JWTs", "HTTPS")
+  Rel(auth, redis, "Checks jti revocation blacklist", "TCP")
+  Rel(events, insforge, "Publishes to Supabase Realtime channel", "WebSocket")
+
+  UpdateLayoutConfig($c4ShapeInRow="4", $c4BoundaryInRow="2")
+```
+
+### Container Technology Summary
+
+| Container | Runtime | Key Technology | Notes |
+|---|---|---|---|
+| CLI (`apps/cli`) | Node.js | TypeScript, commander or similar | No business logic — translates commands to API calls |
+| Web Control Plane (`apps/web-control-plane`) | Browser | React, Vite, TypeScript | Consumes SSE for live updates |
+| Control Service (`apps/control-service`) | Node.js | Express, TypeScript | Sole HTTP ingress point |
+| Orchestrator | Node.js (in-process) | TypeScript | Stateful phase/step runner |
+| Governance | Node.js (in-process) | TypeScript | 9-gate evaluation pipeline |
+| Adapters | Node.js (in-process) | TypeScript, vendor SDKs | 6 AI + 3 provider adapters |
+| Auth | Node.js (in-process) | TypeScript, jose (JWKS/JWT) | Three-strategy auth chain |
+| Healing | Node.js (in-process) | TypeScript | Strategy-registry pattern |
+| Learning | Node.js (in-process) | TypeScript | Post-run outcome processing |
+| Audit | Node.js (in-process) | TypeScript, crypto (SHA256) | Append-only hash chain |
+| Events | Node.js (in-process) | TypeScript, SSE | domain.noun.verb naming |
+
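+The append-only hash chain noted for the Audit container can be illustrated with a minimal sketch. It assumes each event stores the SHA-256 of the prior event's content plus the prior event's own `previousHash`. The `ChainedEvent` shape, the `GENESIS` sentinel, the `|` field separator, and all function names are hypothetical and are not taken from `packages/audit`.
+
+```typescript
+import { createHash } from "node:crypto";
+
+// Hypothetical event shape; the real AuditEvent carries more fields.
+interface ChainedEvent {
+  eventType: string;
+  payload: string; // assumption: payload already serialized to a stable string
+  previousHash: string;
+}
+
+// Assumed sentinel for the first event in a chain.
+const GENESIS = "0".repeat(64);
+
+// Hash an event's content together with the hash it points back to.
+function hashEvent(e: ChainedEvent): string {
+  return createHash("sha256")
+    .update(`${e.eventType}|${e.payload}|${e.previousHash}`)
+    .digest("hex");
+}
+
+// Append without mutating: the new event links to the hash of the prior one.
+function appendEvent(chain: ChainedEvent[], eventType: string, payload: string): ChainedEvent[] {
+  const prior = chain[chain.length - 1];
+  const previousHash = prior ? hashEvent(prior) : GENESIS;
+  return [...chain, { eventType, payload, previousHash }];
+}
+
+// Walk the chain and confirm every link matches the recomputed hash.
+function verifyChain(chain: ChainedEvent[]): boolean {
+  let expected = GENESIS;
+  for (const event of chain) {
+    if (event.previousHash !== expected) return false;
+    expected = hashEvent(event);
+  }
+  return true;
+}
+```
+
+In this sketch, editing any event other than the last one breaks the link stored on its successor, which is what makes tampering detectable.
+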
+### Container Boundary Rules
+
+The following call directions are **permitted**:
+
+```
+CLI, Web UI      → Control Service
+Control Service  → Auth, Command Engine
+Command Engine   → Orchestrator
+Orchestrator     → Governance, Adapters, Healing, Skill Engine, Audit,
+                   Events, Learning, Security, Memory, Observability
+Adapters         → AI Providers (external), GitHub (external)
+Auth             → InsForge JWKS (external), Redis (external)
+Events           → InsForge Realtime (external)
+```
+
+The following calls are **prohibited** to maintain layering integrity:
+
+- `Adapters → Orchestrator` (adapters are leaves)
+- `Governance → Orchestrator` (governance is a pure evaluator)
+- `Audit → Orchestrator` (audit is append-only)
+- `CLI / Web UI → Orchestrator` (must route through Control Service)
+
+---
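+The boundary rules above could be checked mechanically, for example in a lint step over the package import graph. The sketch below is one hypothetical way to encode them. The `PERMITTED` map, the kebab-case container names, and both function names are assumptions, and calls to external systems are left out of scope.
+
+```typescript
+// Permitted internal call directions, keyed by the calling container.
+// Containers with no entry (adapters, governance, audit, ...) are leaves.
+const PERMITTED: Record<string, readonly string[]> = {
+  "control-service": ["auth", "command-engine"],
+  "command-engine": ["orchestrator"],
+  orchestrator: [
+    "governance", "adapters", "healing", "skill-engine", "audit",
+    "events", "learning", "security", "memory", "observability",
+  ],
+};
+
+// True only when the edge appears in the permitted list.
+function isPermitted(from: string, to: string): boolean {
+  return PERMITTED[from]?.includes(to) ?? false;
+}
+
+// Return every edge of an import graph that breaks the layering.
+function findViolations(edges: Array<[string, string]>): Array<[string, string]> {
+  return edges.filter(([from, to]) => !isPermitted(from, to));
+}
+```
+
+Treating the map as an allow-list means any edge not listed is rejected by default, so the four prohibited directions need no separate encoding.
+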
+
+## Level 3 — Orchestrator Components
+
+```mermaid
+C4Component
+  title Component Diagram — Orchestrator Package
+
+  Container_Boundary(orch, "packages/orchestrator/src") {
+
+    Component(phaseEngine, "phase-engine.ts", "TypeScript module", "Top-level phase sequencer. Iterates the 8 phases (intake → deployment). Calls sub-engines per phase. Emits phase-level AuditEvents and CanonicalEvents.")
+    Component(executionEngine, "execution-engine.ts", "TypeScript module", "10-step pipeline runner for the building phase. Executes audit-start, policy-eval, adapter-lookup, simulation, approval-gate, validation, execution-with-retry, outcome-verify, healing-integration (step 10.5), and rollback.")
+    Component(intake, "intake.ts", "TypeScript module", "Phase handler for the intake phase. Calls normalizeIdeaText, inferSolutionCategory, and generateClarifyingQuestions.")
+    Component(planner, "planner.ts", "TypeScript module", "Phase handler for the planning phase. Builds a structured PlanTask[] from clarification answers using the active AI adapter.")
+    Component(gateManager, "gate-manager.ts", "TypeScript module", "Coordinates the 9 governance gates. Delegates to packages/governance. Pauses the run if any gate returns NEEDS_REVIEW.")
+    Component(actionRunner, "action-runner.ts", "TypeScript module", "Executes individual adapter actions with retry logic. Reports success/failure to the execution engine.")
+    Component(modeController, "mode-controller.ts", "TypeScript module", "Resolves the active execution Mode (turbo | builder | pro | expert | safe | balanced | god) and sets mode-specific constraints on gate thresholds and retry limits.")
+    Component(batchQueue, "batch-queue.ts", "TypeScript module", "Manages ordered execution batches. Handles sequential/parallel step dispatch to the action runner.")
+    Component(outcomeEngine, "outcome-engine.ts", "TypeScript module", "Post-run outcome aggregation. Computes quality score, records failures, writes OutcomeRecord, forwards to learning engine.")
+    Component(healingIntegration, "healing-integration.ts", "TypeScript module", "Bridge between execution-engine (step 10.5) and packages/healing. Invokes the failure classifier and healing strategy pipeline.")
+    Component(resumeRun, "resume-run.ts", "TypeScript module", "Resumes a paused run after a gate approval event. Re-enters the phase engine at the paused checkpoint.")
+    Component(rollbackEngine, "rollback-engine.ts", "TypeScript module", "Executes compensating actions when healing is exhausted or a rollback command is issued. Records rollback_actions rows.")
+  }
+
+  Container(governance, "packages/governance", "", "Gate evaluation layer")
+  Container(adapters, "packages/adapters", "", "AI and provider adapters")
+  Container(healing, "packages/healing", "", "Healing strategy pipeline")
+  Container(learning, "packages/learning", "", "Outcome and learning recording")
+  Container(audit, "packages/audit", "", "Immutable audit event writer")
+  Container(events, "packages/events", "", "SSE CanonicalEvent publisher")
+
+  Rel(phaseEngine, intake, "Calls for intake phase")
+  Rel(phaseEngine, planner, "Calls for planning phase")
+  Rel(phaseEngine, gateManager, "Calls for gating phase")
+  Rel(phaseEngine, executionEngine, "Calls executeRunBundle for building phase")
+  Rel(phaseEngine, outcomeEngine, "Calls post-run")
+  Rel(phaseEngine, audit, "Emits run.started, phase.completed events")
+  Rel(phaseEngine, events, "Publishes run.phase.changed CanonicalEvents")
+
+  Rel(executionEngine, modeController, "Reads mode constraints")
+  Rel(executionEngine, batchQueue, "Dispatches action batches")
+  Rel(executionEngine, actionRunner, "Executes individual actions")
+  Rel(executionEngine, healingIntegration, "Invokes on step failure (step 10.5)")
+  Rel(executionEngine, rollbackEngine, "Invokes on healing exhaustion")
+  Rel(executionEngine, audit, "Emits action.executed, action.failed events")
+
+  Rel(gateManager, governance, "Evaluates 9 governance gates")
+  Rel(gateManager, resumeRun, "Calls after approval received")
+
+  Rel(actionRunner, adapters, "Routes to AI and provider adapters")
+  Rel(healingIntegration, healing, "Delegates to healing engine pipeline")
+  Rel(outcomeEngine, learning, "Sends OutcomeRecord for learning")
+
+  UpdateLayoutConfig($c4ShapeInRow="4", $c4BoundaryInRow="1")
+```
+
+### Orchestrator Phase-to-Component Mapping
+
+| Phase | Primary Component | Secondary Components |
+|---|---|---|
+| `intake` | `intake.ts` | `phaseEngine`, `audit`, `events` |
+| `planning` | `planner.ts` | `phaseEngine`, AI adapter |
+| `skills` | `skill-engine selector` | `phaseEngine` |
+| `gating` | `gate-manager.ts` | `governance` (all 9 gates) |
+| `building` | `execution-engine.ts` | `batchQueue`, `actionRunner`, `adapters` |
+| `testing` | `phaseEngine` (simulated) | `audit`, `events` |
+| `reviewing` | `phaseEngine` (simulated) | `audit`, `events` |
+| `deployment` | `phaseEngine` (simulated) | `audit`, `events` |
+| Recovery | `healing-integration.ts` | `healing`, `rollback-engine.ts` |
+| Post-run | `outcome-engine.ts` | `learning` |
+
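+The phase-to-component mapping can be read as a simple sequencer loop. The sketch below is a hypothetical, synchronous simplification of that loop, not the actual `phase-engine.ts`. The `PhaseHandler` signature, the `runPhases` function, and the treatment of phases without a handler as simulated successes are all assumptions.
+
+```typescript
+const PHASES = [
+  "intake", "planning", "skills", "gating",
+  "building", "testing", "reviewing", "deployment",
+] as const;
+
+type Phase = (typeof PHASES)[number];
+type PhaseResult = "completed" | "paused" | "failed";
+type PhaseHandler = (runId: string) => PhaseResult;
+
+// Run phases in order from `startAt`; stop at the first phase that
+// does not complete, so a paused run can later resume from that phase.
+function runPhases(
+  runId: string,
+  handlers: Partial<Record<Phase, PhaseHandler>>,
+  startAt: Phase = "intake",
+): { status: PhaseResult; phase: Phase } {
+  let last: Phase = startAt;
+  for (const phase of PHASES.slice(PHASES.indexOf(startAt))) {
+    last = phase;
+    // Assumption: a phase with no handler is simulated and always completes.
+    const result = handlers[phase]?.(runId) ?? "completed";
+    if (result !== "completed") return { status: result, phase };
+  }
+  return { status: "completed", phase: last };
+}
+```
+
+Returning the phase alongside the status gives a resume path the checkpoint it needs: calling `runPhases` again with `startAt` set to the paused phase re-enters the sequence there.
+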
+---
+
+## Level 3 — Auth Components
+
+```mermaid
+C4Component
+  title Component Diagram — Auth Package
+
+  Container_Boundary(authPkg, "packages/auth/src") {
+
+    Component(resolveSession, "resolve-session.ts", "TypeScript module", "Entry point for all auth resolution. Determines which strategy applies (session JWT, service account JWT, legacy API key) and delegates accordingly. Returns a unified ResolvedSession object.")
+    Component(verifyInsforgeToken, "verify-insforge-token.ts", "TypeScript module", "Verifies RS256-signed InsForge session JWTs. Fetches and caches the JWKS from INSFORGE_JWKS_URI (10-min TTL). Validates iss, exp, and aud, then checks the jti against the Redis revocation blacklist.")
+    Component(issueExecutionToken, "issue-execution-token.ts", "TypeScript module", "Issues short-lived (10-min) HS256 execution tokens scoped to a specific runId and orgId. Used by adapters to authenticate outgoing calls without exposing the primary session JWT.")
+    Component(serviceAccount, "service-account.ts", "TypeScript module", "Verifies HS256-signed service account JWTs issued by Code Kit Ultra itself. Resolves scopes, orgId, workspaceId, and projectId from claims. Also provides issueServiceAccountToken() for enrollment flows.")
+  }
+
+  System_Ext(insforgeJwks, "InsForge JWKS Endpoint", "RS256 public key set")
+  System_Ext(redis, "Redis", "jti revocation blacklist")
+  Container(controlService, "Control Service", "", "authenticate.ts middleware calls resolve-session")
+  Container(orchestrator, "Orchestrator", "", "Receives resolved session; calls issue-execution-token per run")
+  Container(policy, "packages/policy", "", "Receives resolved session to run permission checks")
+
+  Rel(controlService, resolveSession, "Calls on every authenticated request")
+  Rel(resolveSession, verifyInsforgeToken, "Delegates when Bearer token matches InsForge format")
+  Rel(resolveSession, serviceAccount, "Delegates when Bearer token is a service-account JWT")
+  Rel(resolveSession, resolveSession, "Falls through to legacy API key check if both fail")
+
+  Rel(verifyInsforgeToken, insforgeJwks, "Fetches JWKS (cached 10 min)", "HTTPS")
+  Rel(verifyInsforgeToken, redis, "Checks jti blacklist", "TCP")
+
+  Rel(serviceAccount, redis, "Checks jti blacklist for service account tokens", "TCP")
+
+  Rel(orchestrator, issueExecutionToken, "Issues per-run scoped execution token")
+  Rel(resolveSession, policy, "Resolved session passed to permission resolver")
+
+  UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="1")
+```
+
+### Auth Strategy Resolution Order
+
+```
+Request arrives at Control Service
+       │
+       ▼
+  Extract Bearer token from Authorization header
+       │
+       ├── token.iss === INSFORGE_ISSUER?
+       │         └── YES → verify-insforge-token.ts
+       │                    ├── Fetch/cache JWKS
+       │                    ├── Verify RS256 signature
+       │                    ├── Validate iss / exp / aud
+       │                    ├── Redis jti revocation check
+       │                    └── Build ResolvedSession { authMode: 'session' }
+       │
+       ├── token has `svc:` prefix in sub or known service-account issuer?
+       │         └── YES → service-account.ts
+       │                    ├── Verify HS256 with SERVICE_ACCOUNT_JWT_SECRET
+       │                    ├── Validate exp, scopes
+       │                    ├── Redis jti revocation check
+       │                    └── Build ResolvedSession { authMode: 'service-account' }
+       │
+       └── Legacy API key header (X-Api-Key)?
+                 └── YES → legacy key lookup in DB
+                            └── Build ResolvedSession { authMode: 'legacy-api-key' }
+                                ⚠ DEPRECATED — planned for removal
+```
+
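+The routing step of the resolution order above can be sketched as a pure function. This is a hypothetical illustration, not the code in `resolve-session.ts`: the `RequestCredentials` shape, the `selectStrategy` name, and the `"none"` result are assumptions. The claims are decoded without verification and used only to pick a strategy; signature, expiry, and revocation checks still happen inside the selected verifier.
+
+```typescript
+import { Buffer } from "node:buffer";
+
+type AuthStrategy = "session" | "service-account" | "legacy-api-key" | "none";
+
+// Hypothetical view of the two headers the flow inspects.
+interface RequestCredentials {
+  authorization?: string; // Authorization header
+  apiKey?: string;        // X-Api-Key header
+}
+
+// Decode the JWT payload segment WITHOUT verifying the signature.
+function decodeClaims(token: string): Record<string, unknown> | null {
+  const parts = token.split(".");
+  if (parts.length !== 3) return null;
+  try {
+    const parsed: unknown = JSON.parse(Buffer.from(parts[1] ?? "", "base64url").toString("utf8"));
+    return typeof parsed === "object" && parsed !== null
+      ? (parsed as Record<string, unknown>)
+      : null;
+  } catch {
+    return null;
+  }
+}
+
+// Pick a strategy in the documented order: session, service account, legacy key.
+function selectStrategy(creds: RequestCredentials, insforgeIssuer: string): AuthStrategy {
+  const header = creds.authorization ?? "";
+  const bearer = header.startsWith("Bearer ") ? header.slice("Bearer ".length) : "";
+  if (bearer) {
+    const claims = decodeClaims(bearer);
+    if (claims?.iss === insforgeIssuer) return "session";
+    const sub = claims?.sub;
+    if (typeof sub === "string" && sub.startsWith("svc:")) return "service-account";
+  }
+  if (creds.apiKey) return "legacy-api-key";
+  return "none";
+}
+```
+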
+### Execution Token Lifecycle
+
+```
+Orchestrator starts a new run
+       │
+       ▼
+issue-execution-token.ts
+  sign({ sub: actorId, runId, orgId, scope: 'run:execute' }, HS256, exp: +10min)
+       │
+       ▼
+Token stored in run context (not persisted to DB)
+       │
+       ▼
+Adapters use token for outgoing calls to AI providers
+       │
+       ▼
+Token expires automatically after 10 minutes
+(No explicit revocation path — expiry is the revocation mechanism)
+```
+
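+The `sign(...)` step in the lifecycle above is pseudocode. The sketch below shows what issuing and checking such a token could look like using only Node's built-in `crypto` module. The function names, the `ExecutionClaims` shape, and the injected `nowSeconds` clock are hypothetical. Since the Auth package is listed as using `jose`, the real module most likely delegates to that library instead.
+
+```typescript
+import { createHmac, timingSafeEqual } from "node:crypto";
+import { Buffer } from "node:buffer";
+
+interface ExecutionClaims {
+  sub: string;
+  runId: string;
+  orgId: string;
+  scope: "run:execute";
+  iat: number;
+  exp: number;
+}
+
+const TTL_SECONDS = 10 * 60; // 10-minute lifetime, per the lifecycle above
+
+const b64url = (text: string): string => Buffer.from(text, "utf8").toString("base64url");
+
+// Build a compact JWS: base64url(header).base64url(claims).HMAC-SHA256 signature.
+function issueExecutionToken(
+  secret: string, actorId: string, runId: string, orgId: string,
+  nowSeconds: number = Math.floor(Date.now() / 1000),
+): string {
+  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
+  const claims: ExecutionClaims = {
+    sub: actorId, runId, orgId, scope: "run:execute",
+    iat: nowSeconds, exp: nowSeconds + TTL_SECONDS,
+  };
+  const body = b64url(JSON.stringify(claims));
+  const sig = createHmac("sha256", secret).update(`${header}.${body}`).digest("base64url");
+  return `${header}.${body}.${sig}`;
+}
+
+// Return the claims if the signature matches and the token has not expired.
+// Simplification: the header's alg field is not inspected here.
+function verifyExecutionToken(secret: string, token: string, nowSeconds: number): ExecutionClaims | null {
+  const parts = token.split(".");
+  if (parts.length !== 3) return null;
+  const [header, body, sig] = parts;
+  if (!header || !body || !sig) return null;
+  const expected = createHmac("sha256", secret).update(`${header}.${body}`).digest();
+  const actual = Buffer.from(sig, "base64url");
+  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
+  const claims = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as ExecutionClaims;
+  return claims.exp > nowSeconds ? claims : null;
+}
+```
+
+Because there is no revocation path, the expiry comparison in `verifyExecutionToken` is the only thing that ends a token's validity in this sketch.
+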
+---
+
+## Cross-Cutting Architecture Notes
+
+### Deployment Topology
+
+```
+┌─────────────────────────────────────────────────────┐
+│                  Single Node.js Process              │
+│  apps/control-service                                │
+│    ├── Express HTTP server (port configurable)       │
+│    ├── SSE endpoint: GET /v1/events                  │
+│    ├── All packages imported in-process              │
+│    └── No inter-service network calls (monolith)     │
+└──────────────────────┬──────────────────────────────┘
+                       │ external calls only
+            ┌──────────┼──────────────┐
+            ▼          ▼              ▼
+         InsForge    Redis       AI Providers
+       (Supabase)               (Claude, etc.)
+```
+
+All packages (`packages/*`) are compiled TypeScript imported directly into the control service process. There are no separate microservices. This is an intentional **modular monolith** design that minimises operational complexity while keeping internal boundaries enforced through module imports rather than network contracts.
+
+### Key Design Invariants
+
+1. **Identity plane separation:** Code Kit Ultra never issues or stores human passwords or primary identity. All human auth is delegated to InsForge.
+2. **Governance immutability:** AuditEvents are never updated or deleted. Gate decisions are permanent records.
+3. **Adapter isolation:** AI providers are never called directly from orchestrator, governance, or auth. All calls route through `packages/adapters`.
+4. **CLI/UI are surfaces only:** `apps/cli` and `apps/web-control-plane` contain no business logic. All logic lives in packages imported by `apps/control-service`.
+5. **Execution tokens are ephemeral:** Short-lived HS256 tokens (10 min) prevent long-lived credential leakage to adapters.
diff --git a/docs/02_architecture/ERD.md b/docs/02_architecture/ERD.md
new file mode 100644
index 0000000..5c0c76f
--- /dev/null
+++ b/docs/02_architecture/ERD.md
@@ -0,0 +1,470 @@
+# Entity-Relationship Diagram — Code Kit Ultra
+
+**Status:** Authoritative
+**Version:** 1.2.0
+**Last reviewed:** 2026-04-04
+**See also:** `docs/02_architecture/DATA_MODEL.md`, `/db/schema.sql`, `packages/shared/src/types.ts`
+
+---
+
+## Overview
+
+The Code Kit Ultra database schema is hosted in **Supabase (PostgreSQL)** as part of the InsForge plane. The central entity is a **Run** — every other table exists to scope, govern, observe, or recover runs.
+
+The schema reflects three architectural concerns:
+
+1. **Multi-tenancy:** `organizations → workspaces → projects → runs` forms the ownership hierarchy. Every run is scoped to at least an `organization_id`; its `workspace_id` and `project_id` are nullable.
+2. **Governance traceability:** `run_gates`, `run_events`, and `audit_logs` record every decision, event, and action taken during a run's lifecycle. These are append-only.
+3. **Operational recovery:** `healing_actions` and `rollback_actions` provide full forensic traceability for automated recovery operations.
+
+> **Table naming note:** The canonical spec names used in this document map to the following repo table names:
+> `gate_decisions` → `run_gates` (run_approvals in older migrations),
+> `audit_events` → `audit_logs`,
+> `canonical_events` → `run_events`.
+> See `DATA_MODEL.md §Schema Alignment` for the rename migration reference.
+
+---
+
+## Entity-Relationship Diagram
+
+```mermaid
+erDiagram
+
+    %% ─────────────────────────────────────────────
+    %% TENANT HIERARCHY
+    %% ─────────────────────────────────────────────
+
+    organizations {
+        uuid   id             PK
+        text   name           "NOT NULL"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    workspaces {
+        uuid   id              PK
+        uuid   organization_id FK
+        text   name            "NOT NULL"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    projects {
+        uuid   id            PK
+        uuid   workspace_id  FK
+        text   name          "NOT NULL"
+        text   slug          "NOT NULL, UNIQUE per workspace"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    %% ─────────────────────────────────────────────
+    %% IDENTITY & MEMBERSHIP
+    %% ─────────────────────────────────────────────
+
+    users {
+        text   id            PK  "actorId — sourced from InsForge sub claim"
+        text   email         "NOT NULL"
+        text   display_name
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    organization_memberships {
+        uuid   id              PK
+        uuid   organization_id FK
+        text   user_id         FK
+        text   role            "owner | admin | member | viewer"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    project_memberships {
+        uuid   id         PK
+        uuid   project_id FK
+        text   user_id    FK
+        text   role       "owner | admin | member | viewer"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    service_accounts {
+        text   id             PK  "svc_<slug>"
+        text   name           "NOT NULL"
+        uuid   org_id         FK
+        uuid   workspace_id   FK  "nullable — org-level if null"
+        uuid   project_id     FK  "nullable — workspace-level if null"
+        jsonb  scopes         "NOT NULL DEFAULT '[]'"
+        text   created_by     "actorId of creator"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    %% ─────────────────────────────────────────────
+    %% RBAC
+    %% ─────────────────────────────────────────────
+
+    permissions {
+        uuid   id          PK
+        text   name        "e.g. run:create, gate:approve, rollback:trigger"
+        text   description
+    }
+
+    role_permissions {
+        uuid   id            PK
+        text   role          "FK-equivalent to role column on memberships"
+        uuid   permission_id FK
+    }
+
+    %% ─────────────────────────────────────────────
+    %% RUN CORE
+    %% ─────────────────────────────────────────────
+
+    runs {
+        uuid   id              PK  "run_YYYYMMDD_NNNN format"
+        uuid   organization_id FK
+        uuid   workspace_id    FK  "nullable"
+        uuid   project_id      FK  "nullable"
+        text   actor_id        "actorId — human or service account"
+        text   actor_type      "human | service-account | system"
+        text   auth_mode       "session | service-account | legacy-api-key"
+        text   correlation_id  "NOT NULL — ties audit trail together"
+        text   idea            "NOT NULL — raw operator intent"
+        text   mode            "turbo | builder | pro | expert | safe | balanced | god"
+        text   status          "planned | running | paused | completed | failed | cancelled"
+        text   priority        "speed | quality | cost"
+        text   deliverable     "app | api | script | report"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+        timestamptz updated_at "NOT NULL DEFAULT now()"
+    }
+
+    plan_tasks {
+        uuid   id              PK
+        uuid   run_id          FK
+        text   phase           "intake | planning | skills | gating | building | testing | reviewing | deployment"
+        text   title           "NOT NULL"
+        text   description     "NOT NULL"
+        text   done_definition "NOT NULL — revalidation target"
+        text   status          "pending | running | success | failed | paused | skipped | rolled-back"
+        int    position        "NOT NULL — ordering within run"
+    }
+
+    %% ─────────────────────────────────────────────
+    %% GOVERNANCE
+    %% ─────────────────────────────────────────────
+
+    run_gates {
+        uuid   id           PK
+        uuid   run_id       FK
+        text   gate_type    "risk_threshold | policy_compliance | confidence_score | kill_switch | consensus | constraint | validation | intent_alignment | approval"
+        text   status       "pending | pass | needs-review | blocked | approved | rejected"
+        text   reason       "NOT NULL — human-readable evaluation result"
+        bool   should_pause "NOT NULL DEFAULT false"
+        text   decided_by   "actorId of approver/rejecter (nullable)"
+        timestamptz decided_at   "nullable — set on approval/rejection"
+        text   decision_note "optional operator note on decision"
+    }
+
+    %% ─────────────────────────────────────────────
+    %% EVENTS & AUDIT
+    %% ─────────────────────────────────────────────
+
+    run_events {
+        uuid   id            PK
+        uuid   run_id        FK  "nullable — org-level events have no run"
+        text   event_name    "NOT NULL — domain.noun.verb e.g. run.phase.completed"
+        jsonb  payload       "NOT NULL"
+        text   actor_id
+        text   actor_type
+        uuid   org_id        FK
+        uuid   workspace_id  FK  "nullable"
+        uuid   project_id    FK  "nullable"
+        text   auth_mode
+        text   correlation_id
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    audit_logs {
+        uuid   id            PK
+        uuid   run_id        FK  "nullable"
+        uuid   organization_id FK
+        uuid   workspace_id  FK  "nullable"
+        uuid   project_id    FK  "nullable"
+        text   actor_id      "NOT NULL"
+        text   actor_type    "NOT NULL — human | service-account | system"
+        text   auth_mode     "NOT NULL"
+        text   correlation_id "NOT NULL"
+        text   event_type    "NOT NULL — e.g. run.created, gate.approved"
+        jsonb  payload       "NOT NULL"
+        text   previous_hash "SHA256 of prior event content + prior hash"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    outcome_records {
+        uuid    id             PK
+        uuid    run_id         FK
+        bool    success        "NOT NULL"
+        jsonb   failures       "NOT NULL DEFAULT '[]'"
+        int     retry_count    "NOT NULL DEFAULT 0"
+        int     duration_ms    "NOT NULL"
+        numeric quality_score  "0.0000 – 1.0000 (nullable)"
+        text    user_feedback  "nullable — operator-provided text"
+        int     operator_rating "1–5 CHECK constraint (nullable)"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+    }
+
+    %% ─────────────────────────────────────────────
+    %% RECOVERY
+    %% ─────────────────────────────────────────────
+
+    healing_actions {
+        uuid   id           PK
+        uuid   run_id       FK
+        text   step_id      "plan_tasks.id reference (text for flexibility)"
+        text   strategy     "retry-same | fallback-adapter | prompt-revision | partial-replan | add-context | escalate-mode"
+        int    attempt      "NOT NULL — attempt number within this healing episode"
+        text   status       "pending | running | success | failed | exhausted"
+        jsonb  input        "Action input passed to healing engine"
+        jsonb  output       "Action output / error from healing attempt"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+        timestamptz updated_at "NOT NULL DEFAULT now()"
+    }
+
+    rollback_actions {
+        uuid   id           PK
+        uuid   run_id       FK
+        text   step_id      "plan_tasks.id reference (text for flexibility)"
+        text   action_type  "compensating action type"
+        text   status       "pending | running | success | failed"
+        jsonb  payload      "Compensating action parameters"
+        text   triggered_by "actorId or 'system' for automatic rollbacks"
+        timestamptz created_at "NOT NULL DEFAULT now()"
+        timestamptz updated_at "NOT NULL DEFAULT now()"
+    }
+
+    %% ─────────────────────────────────────────────
+    %% RELATIONSHIPS
+    %% ─────────────────────────────────────────────
+
+    organizations       ||--o{ workspaces             : "contains"
+    organizations       ||--o{ organization_memberships : "has members"
+    organizations       ||--o{ service_accounts        : "owns"
+    organizations       ||--o{ runs                    : "scopes"
+    organizations       ||--o{ audit_logs              : "scopes"
+    organizations       ||--o{ run_events              : "scopes"
+
+    workspaces          ||--o{ projects                : "contains"
+    workspaces          ||--o{ service_accounts        : "scoped to (optional)"
+    workspaces          ||--o{ runs                    : "scopes (optional)"
+
+    projects            ||--o{ project_memberships     : "has members"
+    projects            ||--o{ service_accounts        : "scoped to (optional)"
+    projects            ||--o{ runs                    : "scopes (optional)"
+
+    users               ||--o{ organization_memberships : "member of"
+    users               ||--o{ project_memberships     : "member of"
+
+    role_permissions    }o--|| permissions             : "grants"
+
+    runs                ||--o{ plan_tasks              : "contains"
+    runs                ||--o{ run_gates               : "evaluated by"
+    runs                ||--o{ run_events              : "generates"
+    runs                ||--o{ audit_logs              : "recorded in"
+    runs                ||--o|  outcome_records        : "produces"
+    runs                ||--o{ healing_actions         : "attempts"
+    runs                ||--o{ rollback_actions        : "reverses with"
+
+    plan_tasks          ||--o{ healing_actions         : "healed via"
+    plan_tasks          ||--o{ rollback_actions        : "rolled back via"
+```
+
+---
+
+## Commentary
+
+### 1. Tenant Hierarchy and Run Scoping
+
+The schema defines a four-level ownership hierarchy in which only the organization link is mandatory for a run:
+
+```
+organizations
+  └── workspaces          (organization_id FK → organizations.id)
+        └── projects      (workspace_id FK → workspaces.id)
+              └── runs    (organization_id FK required; workspace_id + project_id optional)
+```
+
+Every `run` row carries `organization_id` as a **required** foreign key, making org-level tenancy the mandatory scoping unit. `workspace_id` and `project_id` are optional — a run may be scoped as narrowly as a specific project or as broadly as an entire organization.
+
+This design supports three run contexts:
+
+| Context | organization_id | workspace_id | project_id |
+|---|---|---|---|
+| Org-level run | Required | NULL | NULL |
+| Workspace-level run | Required | Required | NULL |
+| Project-level run | Required | Required | Required |
+
+The `runs` table's `correlation_id` column is set at request ingress (sourced from the InsForge JWT `jti` or generated) and is threaded through every subsequent `audit_logs` and `run_events` row for the run. This makes it possible to reconstruct the complete causal chain of a run from any event by filtering on `correlation_id`.
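+
+The three contexts in the table above can be expressed as a small classifier. This is a minimal sketch, not code from the repository: `RunScope`, `RunContext`, and `classifyRunScope` are hypothetical names, and rejecting a `project_id` without a `workspace_id` is an assumption drawn from the table rather than a confirmed database constraint.
+
+```typescript
+// Hypothetical shape: only the three scoping columns discussed above.
+interface RunScope {
+  organizationId: string;
+  workspaceId: string | null;
+  projectId: string | null;
+}
+
+type RunContext = "org" | "workspace" | "project";
+
+// Returns the run context, or throws when the combination is not one of
+// the three rows in the table (assumption: such rows are invalid).
+function classifyRunScope(scope: RunScope): RunContext {
+  if (!scope.organizationId) {
+    throw new Error("organization_id is required");
+  }
+  if (scope.projectId !== null) {
+    if (scope.workspaceId === null) {
+      throw new Error("project_id requires workspace_id");
+    }
+    return "project";
+  }
+  return scope.workspaceId !== null ? "workspace" : "org";
+}
+```
+
+A check of this kind could run at request ingress, before the `runs` row is written, so that an invalid combination is rejected with a clear error instead of surfacing later as a foreign-key or policy failure.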
+
+---
+
+### 2. RBAC Through Memberships and Role Permissions
+
+Access control resolves a membership role to named permissions through `role_permissions` and `permissions`:
+
+```
+users → organization_memberships.role
+              │
+              └── role_permissions.role
+                        │
+                        └── permissions.name
+                                 (e.g. 'run:create', 'gate:approve', 'rollback:trigger')
+```
+
+`organization_memberships` and `project_memberships` record a `role` text column (values: `owner`, `admin`, `member`, `viewer`). The `role_permissions` table maps each role to a set of `permissions` rows. Permission resolution at runtime uses `packages/policy/src/resolve-permissions.ts`, which joins these tables and returns a `PermissionSet` object attached to `req.auth`.
+
+**Key design points:**
+
+- Project memberships override organization memberships — a user can have `viewer` at the org level but `admin` at a specific project.
+- Service accounts carry an explicit `scopes` JSONB array (e.g., `["run:create", "gate:read"]`) and bypass the membership tables entirely. `service-account.ts` evaluates their permissions directly from the JWT claims.
+- The `permissions` table is the canonical enumeration of every grantable capability in the system. Changes to what a role can do require a migration that updates `role_permissions` rows.
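+
+The design points above can be sketched in a few lines. This is an illustration under stated assumptions, not the contents of `resolve-permissions.ts`: `resolvePermissionsSketch`, `Actor`, and the `ROLE_PERMISSIONS` mapping are hypothetical, and the role-to-permission assignments are invented for the example (the real ones live in `role_permissions` rows).
+
+```typescript
+type Role = "owner" | "admin" | "member" | "viewer";
+
+// Illustrative mapping only; the real assignments are rows in role_permissions.
+const ROLE_PERMISSIONS: Record<Role, string[]> = {
+  owner: ["run:create", "gate:approve", "rollback:trigger"],
+  admin: ["run:create", "gate:approve"],
+  member: ["run:create"],
+  viewer: [],
+};
+
+// Hypothetical actor shape combining the inputs described above.
+interface Actor {
+  orgRole?: Role;      // from organization_memberships.role
+  projectRole?: Role;  // from project_memberships.role, if present
+  scopes?: string[];   // service accounts only
+}
+
+function resolvePermissionsSketch(actor: Actor): Set<string> {
+  // Service accounts bypass the membership tables: scopes are used as-is.
+  if (actor.scopes) {
+    return new Set(actor.scopes);
+  }
+  // A project membership overrides the organization membership.
+  const role = actor.projectRole ?? actor.orgRole;
+  return new Set(role ? ROLE_PERMISSIONS[role] : []);
+}
+```
+
+With this mapping, a user who is `viewer` at the org level and `admin` on a project resolves to the `admin` permissions for requests scoped to that project.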
+
+---
+
+### 3. Run Lineage — runs → run_gates → run_events
+
+The traceability chain for any run is:
+
+```
+runs (1)
+  ├── run_gates (0..*) — one per governance gate evaluated
+  ├── run_events (0..*) — one per CanonicalEvent emitted
+  ├── plan_tasks (0..*) — one per planned step
+  └── outcome_records (0..1) — at most one post-run summary
+```
+
+**`run_gates`** is the authoritative record of every governance decision. It captures:
+- `gate_type` — which of the 9 gates was evaluated.
+- `status` — the final status after any human decisions.
+- `should_pause` — whether this evaluation caused a run pause.
+- `decided_by` / `decided_at` / `decision_note` — human approval/rejection attribution.
+
+Because `should_pause` and `status` are set at evaluation time and then updated only on approval/rejection, the full decision history (initial evaluation + subsequent human action) is captured in a single row. This differs from an event-sourced model where two rows would be written. A full event log is still available via `audit_logs` for forensic reconstruction.
+
+**`run_events`** holds all `CanonicalEvents` (the SSE stream persisted to DB). These use the `domain.noun.verb` naming convention (e.g., `run.phase.completed`, `gate.approval.required`). They are optimised for timeline rendering and are indexed on `(run_id, created_at ASC)` to support ordered replay. Unlike `audit_logs`, `run_events` rows may be queried and filtered by `event_name` without decoding JSONB.
+
+**`plan_tasks`** maps 1:N to both `healing_actions` and `rollback_actions`, allowing post-run analysis of which specific steps required recovery and what strategies were attempted.
+
+---
+
+### 4. Audit Integrity — SHA256 Hash Chain in `audit_logs`
+
+The `audit_logs` table provides governance-grade tamper evidence through a **SHA256 hash chain**, implemented in `packages/audit/src/write-audit-event.ts`:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│  audit_logs row N-1                                      │
+│  previous_hash: <hash of row N-2>                        │
+│  this_hash:     sha256(content_N-1 + previous_hash_N-1) │
+└─────────────────────────────────────────────────────────┘
+                      │
+                      │ previous_hash_N = this_hash_N-1
+                      ▼
+┌─────────────────────────────────────────────────────────┐
+│  audit_logs row N                                        │
+│  previous_hash: <hash of row N-1>                        │
+│  this_hash:     sha256(content_N + previous_hash_N)     │
+└─────────────────────────────────────────────────────────┘
+```
+
+The genesis event uses `previous_hash = '0'.repeat(64)`.
+
+Tampering with a historical row changes that row's recomputed hash, which then no longer matches the `previous_hash` stored in the following row, so a chain verification scan detects the break. The hash covers the full event content (`id`, `event_type`, `payload`, `actor_id`, `created_at`) plus the prior hash.
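+
+A chain verification scan of the kind described here might look like the following. It is a sketch under assumptions, not the logic of `write-audit-event.ts`: `AuditRow`, `hashRow`, and `findChainBreak` are hypothetical names, and the serialization (a JSON array of the covered fields followed by the prior hash) is assumed, since the exact byte layout is not specified in this document.
+
+```typescript
+import { createHash } from "node:crypto";
+
+// Hypothetical in-memory view of an audit_logs row (covered fields only).
+interface AuditRow {
+  id: string;
+  eventType: string;
+  payload: unknown;
+  actorId: string;
+  createdAt: string;
+  previousHash: string;
+}
+
+const GENESIS_HASH = "0".repeat(64);
+
+// Assumed serialization: covered fields in a fixed order, then the prior hash.
+function hashRow(row: AuditRow): string {
+  const content = JSON.stringify([
+    row.id, row.eventType, row.payload, row.actorId, row.createdAt,
+  ]);
+  return createHash("sha256").update(content + row.previousHash).digest("hex");
+}
+
+// Walks rows in write order. Returns the index of the first row whose
+// previous_hash does not match the recomputed hash of its predecessor,
+// or -1 when the chain is intact.
+function findChainBreak(rows: AuditRow[]): number {
+  let expected = GENESIS_HASH;
+  for (let i = 0; i < rows.length; i++) {
+    if (rows[i].previousHash !== expected) {
+      return i;
+    }
+    expected = hashRow(rows[i]);
+  }
+  return -1;
+}
+```
+
+Because only `previous_hash` is stored, a modification to row N surfaces at row N+1. Under this sketch, a change to the newest row is detectable only after a later row has been written.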
+
+**Known limitation:** The `lastHash` state is held in module-level memory in the current implementation. On process restart, the last hash must be loaded from the DB before writing new events, and multi-replica deployments require a DB-level advisory lock or sequence to prevent chain forks. This is tracked as risk R-09 in `docs/04_tracking/risk-log.md`.
+
+**Immutability guarantees:**
+- No `UPDATE` or `DELETE` paths exist in `write-audit-event.ts`.
+- The `audit_logs` table has no application-level soft-delete column.
+- Row-level security in Supabase should be configured to deny `UPDATE`/`DELETE` for the application role.
+
+---
+
+### 5. Healing and Rollback Traceability
+
+Two tables capture operational recovery events:
+
+**`healing_actions`** records every attempt by `packages/healing/src/healing-engine.ts` to recover a failed step:
+
+- One row per attempt (not per episode) — `attempt` column distinguishes retries within a single episode.
+- `strategy` identifies which `HealingStrategy` from `healing-strategy-registry.ts` was applied.
+- `status` progresses: `pending → running → success | failed | exhausted`.
+- `input` / `output` JSONB columns store the full action parameters and result for forensic replay.
+
+**`rollback_actions`** records compensating actions executed by `rollback-engine.ts` when healing is exhausted or a manual rollback command is issued:
+
+- One row per compensating action (one per completed `plan_tasks` step, executed in reverse order).
+- `triggered_by` distinguishes automatic rollback (`'system'`) from operator-initiated rollback (actorId).
+- `status` tracks whether each individual compensating action succeeded.
+
+Together, these tables allow a post-incident investigator to reconstruct the exact sequence: which step failed, what healing was attempted, how many attempts were made, which strategy succeeded or failed, and exactly which compensating actions were executed to restore system state.
+
+**Example forensic query (healing and rollback timeline for a run):**
+
+```sql
+-- Full healing and rollback timeline for one run.
+-- run_id is a uuid column, so the literal must be a valid UUID (placeholder value shown).
+SELECT
+    'healing'                AS record_type,
+    ha.step_id,
+    ha.strategy,
+    ha.attempt,
+    ha.status,
+    ha.created_at
+FROM healing_actions ha
+WHERE ha.run_id = '00000000-0000-0000-0000-000000000042'
+
+UNION ALL
+
+SELECT
+    'rollback'               AS record_type,
+    ra.step_id,
+    ra.action_type           AS strategy,
+    NULL::int                AS attempt,
+    ra.status,
+    ra.created_at
+FROM rollback_actions ra
+WHERE ra.run_id = '00000000-0000-0000-0000-000000000042'
+
+ORDER BY created_at ASC;
+```
+
+---
+
+## Key Index Summary
+
+| Index | Purpose |
+|---|---|
+| `idx_runs_project_status (project_id, status, created_at DESC)` | Dashboard and CLI run-list queries |
+| `idx_runs_org (organization_id, created_at DESC)` | Org-level run history |
+| `idx_run_gates_run (run_id, status)` | Gate status lookups per run |
+| `idx_run_gates_pending (status) WHERE status = 'pending'` | Approval queue queries |
+| `idx_audit_logs_org_type (organization_id, event_type, created_at DESC)` | Governance audit queries |
+| `idx_audit_logs_correlation (correlation_id)` | Cross-event correlation chain reconstruction |
+| `idx_run_events_run (run_id, created_at ASC)` | Ordered timeline rendering for UI |
+| `idx_outcome_records_success (success, created_at DESC)` | Learning engine analytics |
+
+---
+
+## Table Ownership Summary
+
+| Table | Owner Package | Written by | Read by |
+|---|---|---|---|
+| `organizations` | `packages/core` | Control Service (org create) | Auth, Policy |
+| `workspaces` | `packages/core` | Control Service | Auth, Policy |
+| `projects` | `packages/core` | Control Service | Auth, Policy |
+| `users` | `packages/core` | InsForge sync | Auth, Policy |
+| `organization_memberships` | `packages/policy` | Control Service | Policy |
+| `project_memberships` | `packages/policy` | Control Service | Policy |
+| `service_accounts` | `packages/auth` | `service-account.ts` | Auth |
+| `permissions` | `packages/policy` | Migrations only | Policy |
+| `role_permissions` | `packages/policy` | Migrations only | Policy |
+| `runs` | `packages/memory` | `run-store.ts` | Orchestrator, Command Engine |
+| `plan_tasks` | `packages/memory` | `run-store.ts` | Orchestrator, Planner |
+| `run_gates` | `packages/governance` | `gate-manager.ts` | Gate handlers, Approval API |
+| `run_events` | `packages/events` | `publish-event.ts` | Realtime, Observability |
+| `audit_logs` | `packages/audit` | `write-audit-event.ts` | Audit API, Compliance |
+| `outcome_records` | `packages/learning` | `outcome-engine.ts` | Learning Engine |
+| `healing_actions` | `packages/healing` | `healing-engine.ts` | Observability, Audit |
+| `rollback_actions` | `packages/orchestrator` | `rollback-engine.ts` | Observability, Audit |
diff --git a/docs/02_architecture/SEQUENCE_DIAGRAMS.md b/docs/02_architecture/SEQUENCE_DIAGRAMS.md
new file mode 100644
index 0000000..5bb28cf
--- /dev/null
+++ b/docs/02_architecture/SEQUENCE_DIAGRAMS.md
@@ -0,0 +1,476 @@
+# Sequence Diagrams — Code Kit Ultra
+
+**Status:** Authoritative
+**Version:** 1.2.0
+**Last reviewed:** 2026-04-04
+**See also:** `docs/02_architecture/AUTH_ARCHITECTURE.md`, `docs/02_architecture/SYSTEM_ARCHITECTURE.md`, `docs/02_architecture/C4_SYSTEM_DIAGRAM.md`
+
+---
+
+## Overview
+
+This document provides detailed sequence diagrams for the four most critical flows in Code Kit Ultra:
+
+1. **Auth & Session Resolution** — how every authenticated request is verified.
+2. **Run Lifecycle (Happy Path)** — the end-to-end flow from CLI submission to run completion.
+3. **Gate Approval Flow** — how a gate pause is raised, reviewed, and resolved.
+4. **Healing Loop (Phase 10.5)** — how step failures are classified, healed, or rolled back.
+
+All diagrams use [Mermaid `sequenceDiagram`](https://mermaid.js.org/syntax/sequenceDiagram.html) syntax.
+
+---
+
+## Flow 1 — Auth & Session Resolution
+
+This flow executes on **every authenticated API request**. The `authenticate.ts` middleware in `apps/control-service` is the entry point. It delegates to `packages/auth/src/resolve-session.ts`, which fans out to the appropriate strategy module.
+
+```mermaid
+sequenceDiagram
+    autonumber
+    participant Client as Client<br/>(CLI / Web UI / Service Account)
+    participant API as Control Service<br/>authenticate.ts
+    participant RS as resolve-session.ts<br/>packages/auth
+    participant VIT as verify-insforge-token.ts<br/>packages/auth
+    participant SA as service-account.ts<br/>packages/auth
+    participant JWKS as InsForge JWKS Endpoint<br/>(external)
+    participant Redis as Redis<br/>(jti blacklist)
+    participant Policy as resolve-permissions.ts<br/>packages/policy
+    participant IET as issue-execution-token.ts<br/>packages/auth
+    participant Orch as Orchestrator<br/>packages/orchestrator
+
+    Client->>API: HTTP request<br/>Authorization: Bearer <token>
+
+    API->>RS: resolveSession(token)
+
+    alt InsForge Session JWT (Primary — RS256)
+        RS->>VIT: verifyInsforgeToken(token)
+
+        VIT->>JWKS: GET /.well-known/jwks.json
+        note over VIT,JWKS: Response cached for 10 minutes.<br/>Subsequent requests use in-memory cache.
+        JWKS-->>VIT: { keys: [...] }
+
+        VIT->>VIT: Verify RS256 signature<br/>Validate iss === INSFORGE_ISSUER<br/>Validate exp not expired<br/>Extract sub (actorId), aud (tenancy claims)
+
+        VIT->>Redis: SISMEMBER jti_blacklist <token.jti>
+        Redis-->>VIT: 0 (not revoked) or 1 (revoked)
+
+        alt jti is revoked
+            VIT-->>RS: Error: TOKEN_REVOKED
+            RS-->>API: 401 Unauthorized
+            API-->>Client: 401 { error: "Token has been revoked" }
+        end
+
+        VIT-->>RS: { actorId: sub, tenancy, authMode: 'session', claims }
+
+    else Service Account JWT (Secondary — HS256)
+        RS->>SA: verifyServiceAccountToken(token)
+
+        SA->>SA: Verify HS256 with SERVICE_ACCOUNT_JWT_SECRET<br/>Validate exp not expired<br/>Extract serviceAccountId, orgId, scopes
+
+        SA->>Redis: SISMEMBER jti_blacklist <token.jti>
+        Redis-->>SA: 0 (not revoked)
+
+        SA-->>RS: { actorId: serviceAccountId, tenancy, authMode: 'service-account', scopes }
+
+    else Legacy API Key (Deprecated — X-Api-Key header)
+        RS->>RS: Lookup key in DB → resolve orgId / actorId
+        note over RS: ⚠ Deprecated. Planned for removal.<br/>No jti tracking. Revocation via DB delete only.
+        RS-->>RS: { actorId, tenancy, authMode: 'legacy-api-key' }
+    end
+
+    RS->>Policy: resolvePermissions(authMode, role, scopes)
+    Policy-->>RS: PermissionSet
+
+    RS-->>API: ResolvedSession { actor, tenant, permissions, authMode, correlationId }
+
+    API->>API: Attach session to req.auth<br/>Proceed to command handler
+
+    note over API,Orch: When orchestrator starts a new run,<br/>it issues a scoped execution token.
+
+    API->>Orch: startRun(runInput, session)
+    Orch->>IET: issueExecutionToken({ actorId, runId, orgId, exp: +10min })
+    IET->>IET: Sign HS256 { sub: actorId, runId, orgId, scope: 'run:execute' }
+    IET-->>Orch: executionToken (10-min HS256 JWT)
+    note over Orch: Token stored in run context only.<br/>Not persisted to DB. Expires automatically.
+    Orch->>Orch: Attach executionToken to all adapter calls<br/>within this run
+```
+
+### Auth Notes
+
+| Aspect | Detail |
+|---|---|
+| JWKS Cache TTL | 10 minutes (in-memory). First request per instance fetches from InsForge. |
+| jti Revocation | Redis `SISMEMBER` on the `jti_blacklist` set. Falls back to an in-memory set if Redis is unavailable. |
+| Execution Token | HS256, 10-min expiry, scoped to `{ runId, orgId, scope: 'run:execute' }`. Never persisted. |
+| Legacy Key | Deprecated. No jti, so revocation requires DB row deletion. Scheduled for removal in a future release. |
+| Auth Failure | Returns HTTP 401 with structured `{ error, code }` body. No partial session built. |
+
+---
+
+## Flow 2 — Run Lifecycle (Happy Path)
+
+This flow covers the full end-to-end journey of a run from CLI submission through all 8 phases to completion. The "happy path" assumes all gates pass and no step failures occur.
+
+```mermaid
+sequenceDiagram
+    autonumber
+    participant CLI as CLI<br/>apps/cli
+    participant API as Control Service<br/>apps/control-service
+    participant Auth as resolve-session.ts
+    participant CMD as execute.ts<br/>command-engine
+    participant Orch as phase-engine.ts<br/>orchestrator
+    participant Intake as intake.ts
+    participant Planner as planner.ts
+    participant Skills as skill-engine selector
+    participant Gates as gate-manager.ts
+    participant Gov as governance<br/>9 gates
+    participant ExecEng as execution-engine.ts
+    participant Adapters as Adapters<br/>claude / gemini / etc.
+    participant Audit as write-audit-event.ts
+    participant Events as publish-event.ts<br/>(SSE)
+    participant Memory as run-store.ts
+
+    CLI->>API: POST /v1/runs { idea, mode, projectId }
+
+    API->>Auth: resolveSession(bearerToken)
+    Auth-->>API: ResolvedSession
+
+    API->>CMD: execute(runInput, session)
+    CMD->>Memory: createRun({ id, status: 'planned', ... })
+    Memory-->>CMD: RunState
+
+    CMD->>Audit: writeAuditEvent(run.created, { runId, actorId })
+    CMD->>Events: publishEvent(run.created, { runId })
+    note over Events: SSE stream delivers run.created to<br/>subscribed CLI / Web UI clients.
+
+    CMD->>Orch: startPhaseEngine(run, session)
+    Orch->>Memory: updateRun({ status: 'running' })
+    Orch->>Audit: writeAuditEvent(run.started, { runId })
+    Orch->>Events: publishEvent(run.started, { runId })
+
+    rect rgb(240, 248, 255)
+        note right of Orch: PHASE: intake
+        Orch->>Intake: runIntakePhase(run)
+        Intake->>Adapters: normalizeIdeaText(idea) → structured summary
+        Adapters-->>Intake: normalized idea
+        Intake->>Adapters: inferSolutionCategory(summary) → category
+        Adapters-->>Intake: { category, confidence }
+        Intake->>Adapters: generateClarifyingQuestions(summary, category)
+        Adapters-->>Intake: clarifyingQuestions[]
+        Intake-->>Orch: IntakeResult { summary, category, questions }
+        Orch->>Events: publishEvent(run.phase.completed, { phase: 'intake' })
+    end
+
+    rect rgb(240, 255, 240)
+        note right of Orch: PHASE: planning
+        Orch->>Planner: runPlanningPhase(intakeResult)
+        Planner->>Adapters: buildTaskPlan(clarificationAnswers)
+        Adapters-->>Planner: PlanTask[] (ordered tasks per phase)
+        Planner-->>Orch: PlanningResult { tasks }
+        Orch->>Memory: updateRun({ planTasks })
+        Orch->>Events: publishEvent(run.phase.completed, { phase: 'planning' })
+    end
+
+    rect rgb(255, 255, 240)
+        note right of Orch: PHASE: skills
+        Orch->>Skills: selectSkills(planTasks)
+        Skills->>Skills: resolveManifest + validateSchema per skill
+        Skills-->>Orch: SkillSelection { selectedSkills }
+        Orch->>Events: publishEvent(run.phase.completed, { phase: 'skills' })
+    end
+
+    rect rgb(255, 245, 230)
+        note right of Orch: PHASE: gating (9 gates evaluated)
+        Orch->>Gates: evaluateGates(run, plan, skills)
+        loop For each of 9 gates
+            Gates->>Gov: evaluateGate(gateType, context)
+            Gov-->>Gates: GateDecision { status, reason, shouldPause }
+            Gates->>Audit: writeAuditEvent(gate.evaluated, { gateType, status })
+            Gates->>Events: publishEvent(gate.evaluated, { gateType, status })
+        end
+        note over Gates: If any gate returns needs-review → pause.<br/>(See Flow 3 for gate approval detail.)
+        Gates-->>Orch: GateResult { allPassed: true }
+        Orch->>Events: publishEvent(run.phase.completed, { phase: 'gating' })
+    end
+
+    rect rgb(245, 230, 255)
+        note right of Orch: PHASE: building — executeRunBundle
+        Orch->>ExecEng: executeRunBundle(run, session, executionToken)
+
+        ExecEng->>Audit: writeAuditEvent(execution.started, { runId })
+
+        loop For each step/action in plan
+            ExecEng->>Adapters: executeAction(action, executionToken)
+            Adapters-->>ExecEng: ActionResult { success, output }
+            ExecEng->>Audit: writeAuditEvent(action.executed, { actionId, success })
+            ExecEng->>Events: publishEvent(run.step.completed, { stepId, status: 'success' })
+            ExecEng->>Memory: updateStep({ status: 'success' })
+        end
+
+        ExecEng-->>Orch: ExecutionResult { success: true }
+        Orch->>Events: publishEvent(run.phase.completed, { phase: 'building' })
+    end
+
+    rect rgb(230, 255, 245)
+        note right of Orch: PHASES: testing / reviewing / deployment (simulated)
+        Orch->>Orch: runSimulatedPhase('testing')
+        Orch->>Events: publishEvent(run.phase.completed, { phase: 'testing' })
+        Orch->>Orch: runSimulatedPhase('reviewing')
+        Orch->>Events: publishEvent(run.phase.completed, { phase: 'reviewing' })
+        Orch->>Orch: runSimulatedPhase('deployment')
+        Orch->>Events: publishEvent(run.phase.completed, { phase: 'deployment' })
+    end
+
+    Orch->>Memory: updateRun({ status: 'completed' })
+    Orch->>Audit: writeAuditEvent(run.completed, { runId, durationMs })
+    Orch->>Events: publishEvent(run.completed, { runId, status: 'completed' })
+
+    note over Orch: outcome-engine.ts runs post-completion
+    Orch->>Orch: outcomeEngine.record(run)
+    note over Orch: learning-engine.ts updates reliability scores
+
+    CMD-->>API: { runId, status: 'completed' }
+    API-->>CLI: 200 { runId, status: 'completed' }
+```
+
+### Run Lifecycle Notes
+
+| Phase | Handler | AI Call | Gate Check |
+|---|---|---|---|
+| `intake` | `intake.ts` | Yes (normalize, categorize, questions) | No |
+| `planning` | `planner.ts` | Yes (task plan) | No |
+| `skills` | `skill-engine/selector.ts` | No | No |
+| `gating` | `gate-manager.ts` | Depends on gate type | Yes (9 gates) |
+| `building` | `execution-engine.ts` | Yes (via adapters) | Yes (approval gate, step 5) |
+| `testing` | `phase-engine.ts` | No (simulated) | No |
+| `reviewing` | `phase-engine.ts` | No (simulated) | No |
+| `deployment` | `phase-engine.ts` | No (simulated) | No |
+
+---
+
+## Flow 3 — Gate Approval Flow
+
+This flow covers the case where a governance gate returns `needs-review`, pausing the run until a human operator approves or rejects the gate via the API. The flow branches on approval vs. rejection.
+
+```mermaid
+sequenceDiagram
+    autonumber
+    participant Gates as gate-manager.ts<br/>orchestrator
+    participant Gov as governance gate<br/>(any of 9)
+    participant Memory as run-store.ts
+    participant Audit as write-audit-event.ts
+    participant Events as publish-event.ts<br/>(SSE)
+    participant CLI as CLI / Web UI<br/>(operator)
+    participant API as Control Service<br/>approve handler
+    participant Resume as resume-run.ts<br/>orchestrator
+    participant Orch as phase-engine.ts<br/>orchestrator
+
+    Gates->>Gov: evaluateGate(gateType, context)
+    Gov-->>Gates: GateDecision { status: 'needs-review', reason, shouldPause: true }
+
+    Gates->>Memory: updateGateDecision({ status: 'needs-review' })
+    Gates->>Memory: updateRun({ status: 'paused' })
+
+    Gates->>Audit: writeAuditEvent(gate.paused, { gateType, reason, runId })
+    Gates->>Events: publishEvent(gate.approval.required, { runId, gateType, reason })
+    note over Events: SSE stream pushes gate.approval.required<br/>to all subscribers on this run channel.
+
+    Gates-->>Orch: GateResult { needsReview: true, gateType }
+    Orch->>Orch: Suspend phase-engine execution<br/>(awaiting resume signal)
+
+    CLI->>CLI: Operator receives SSE notification:<br/>"Gate [type] requires review"
+    CLI->>CLI: Operator reviews reason and context
+
+    alt Operator APPROVES
+
+        CLI->>API: POST /v1/gates/:gateId/approve { note }
+        API->>API: Authenticate + authorize request
+        API->>Memory: updateGateDecision({ status: 'approved', decidedBy, decidedAt, decisionNote })
+        API->>Memory: updateRun({ status: 'running' })
+
+        API->>Audit: writeAuditEvent(gate.approved, { gateId, gateType, decidedBy, note })
+        API->>Events: publishEvent(gate.approved, { runId, gateType, decidedBy })
+
+        API->>Resume: resumeRun(runId, session)
+        Resume->>Memory: loadRun(runId) → RunState with checkpoint
+        Resume->>Orch: reenterPhaseEngine(run, checkpoint)
+
+        Orch->>Orch: Continue execution from paused checkpoint
+        Orch->>Events: publishEvent(run.resumed, { runId })
+
+        note over Orch: Run continues with the remaining gates<br/>and then proceeds to the building phase.
+
+    else Operator REJECTS
+
+        CLI->>API: POST /v1/gates/:gateId/reject { reason }
+        API->>API: Authenticate + authorize request
+        API->>Memory: updateGateDecision({ status: 'rejected', decidedBy, decidedAt, decisionNote: reason })
+        API->>Memory: updateRun({ status: 'cancelled' })
+
+        API->>Audit: writeAuditEvent(gate.rejected, { gateId, gateType, decidedBy, reason })
+        API->>Events: publishEvent(gate.rejected, { runId, gateType, reason })
+
+        note over Events: SSE stream pushes gate.rejected.<br/>CLI / Web UI marks run as cancelled.
+
+        API-->>CLI: 200 { runId, status: 'cancelled' }
+    end
+```
+
+### Gate Type Reference
+
+The 9 governance gates evaluated during the `gating` phase, in evaluation order:
+
+| # | Gate Type | Evaluator | Pause Trigger |
+|---|---|---|---|
+| 1 | Risk Threshold Gate | `gate-controller.ts` | Risk score exceeds mode threshold |
+| 2 | Policy Compliance Gate | `governed-pipeline.ts` + `constraint-engine.ts` | Policy violation detected |
+| 3 | Confidence Score Gate | `confidence-engine.ts` | Score below mode minimum |
+| 4 | Kill Switch Gate | `kill-switch.ts` | Kill switch active for org/workspace |
+| 5 | Consensus Gate | `consensus-engine.ts` + `adaptive-consensus.ts` | Consensus not reached across adapters |
+| 6 | Constraint Gate | `constraint-engine.ts` | Hard constraint violated |
+| 7 | Validation Gate | `validation-engine.ts` | Output fails validation schema |
+| 8 | Intent Alignment Gate | `intent-engine.ts` | Plan intent diverges from idea |
+| 9 | Approval Gate | `gate-controller.ts` | Mode requires explicit human approval |
+
+### GateStatus Transitions
+
+```
+pending → pass          (gate evaluated and passed — run continues)
+pending → needs-review  (gate requires human decision — run paused)
+pending → blocked       (gate hard-blocked — run fails immediately)
+needs-review → approved (human approved — run resumes)
+needs-review → rejected (human rejected — run cancelled)
+```
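+
+The transitions listed above can be encoded as a lookup table so that an invalid status change is rejected before it is persisted. This is a hedged sketch: the `GateStatus` union mirrors the statuses listed in this section, while `GATE_TRANSITIONS` and `canTransition` are hypothetical names that are not confirmed to exist in `gate-manager.ts`.
+
+```typescript
+type GateStatus =
+  | "pending"
+  | "pass"
+  | "needs-review"
+  | "blocked"
+  | "approved"
+  | "rejected";
+
+// Allowed next statuses, taken from the transition list above.
+// Statuses with an empty list are terminal.
+const GATE_TRANSITIONS: Record<GateStatus, readonly GateStatus[]> = {
+  pending: ["pass", "needs-review", "blocked"],
+  "needs-review": ["approved", "rejected"],
+  pass: [],
+  blocked: [],
+  approved: [],
+  rejected: [],
+};
+
+function canTransition(from: GateStatus, to: GateStatus): boolean {
+  return GATE_TRANSITIONS[from].includes(to);
+}
+```
+
+An approve or reject handler could call `canTransition(current, next)` before updating `run_gates`, which would, for example, prevent a gate that already passed from being approved a second time.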
+
+---
+
+## Flow 4 — Healing Loop (Phase 10.5)
+
+This flow executes within `execution-engine.ts` at step 10.5 — between a step failure (step 7) and the final rollback decision (step 10). It is triggered automatically whenever an action returns a failure result.
+
+```mermaid
+sequenceDiagram
+    autonumber
+    participant ExecEng as execution-engine.ts<br/>(step 7 → 10.5)
+    participant HealInt as healing-integration.ts<br/>orchestrator
+    participant FC as failure-classifier.ts<br/>packages/healing
+    participant Registry as healing-strategy-registry.ts<br/>packages/healing
+    participant HealEng as healing-engine.ts<br/>packages/healing
+    participant Reval as revalidation.ts<br/>packages/healing
+    participant Adapters as Adapters<br/>(retry target)
+    participant Rollback as rollback-engine.ts<br/>orchestrator
+    participant Audit as write-audit-event.ts
+    participant Events as publish-event.ts<br/>(SSE)
+    participant Memory as run-store.ts
+
+    ExecEng->>ExecEng: Action fails (step 7 — execution with retry exhausted)
+    ExecEng->>Audit: writeAuditEvent(action.failed, { actionId, error, attempt })
+    ExecEng->>Events: publishEvent(run.step.failed, { stepId, error })
+    ExecEng->>Memory: updateStep({ status: 'failed' })
+
+    ExecEng->>HealInt: invokeHealingPipeline(failure, runContext)
+    note over HealInt: Phase 10.5 — healing-integration bridges<br/>execution-engine to packages/healing.
+
+    HealInt->>FC: classifyFailure(failure)
+    note over FC: Analyses error type, stack trace, and context.<br/>Maps to a FailureCategory enum.
+    FC-->>HealInt: FailureClassification { category, severity, retryable }
+
+    alt Not retryable (e.g. auth failure, schema violation)
+        HealInt-->>ExecEng: HealingResult { healed: false, reason: 'not-retryable' }
+        ExecEng->>Rollback: triggerRollback(run, failedStep)
+        note over Rollback: Skip healing loop — go straight to rollback.
+    end
+
+    HealInt->>Registry: resolveStrategy(classification)
+    note over Registry: Matches classification to registered<br/>healing strategies (retry, fallback-adapter,<br/>partial-replan, prompt-revision, etc.)
+    Registry-->>HealInt: HealingStrategy { strategyId, maxAttempts, actions }
+
+    loop Healing attempts (up to maxAttempts per strategy)
+        HealInt->>HealEng: executeHealingStrategy(strategy, failure, runContext)
+
+        HealEng->>HealEng: Apply strategy actions<br/>(e.g. swap adapter, revise prompt,<br/>reduce scope, add context)
+
+        HealEng->>Adapters: Retry action with modified parameters
+        Adapters-->>HealEng: ActionResult { success, output }
+
+        alt Action succeeds
+            HealEng->>Reval: revalidate(output, step.doneDefinition)
+            Reval-->>HealEng: RevalidationResult { valid, score }
+
+            alt Revalidation passes
+                HealEng-->>HealInt: HealingResult { healed: true, attempt, strategy }
+                HealInt->>Audit: writeAuditEvent(healing.succeeded, { stepId, strategy, attempt })
+                HealInt->>Events: publishEvent(run.step.healed, { stepId, strategy })
+                HealInt->>Memory: updateHealingAction({ status: 'success', runId, stepId })
+                HealInt-->>ExecEng: HealingResult { healed: true }
+                ExecEng->>ExecEng: Continue with next step
+            else Revalidation fails
+                HealEng->>HealEng: Increment attempt counter
+                note over HealEng: Output did not meet doneDefinition.<br/>Try next healing attempt.
+            end
+
+        else Action fails again
+            HealEng->>HealEng: Increment attempt counter
+            HealEng->>Audit: writeAuditEvent(healing.attempt.failed, { attempt, error })
+        end
+    end
+
+    note over HealInt: All healing attempts exhausted without success.
+
+    HealInt->>Audit: writeAuditEvent(healing.exhausted, { runId, stepId, attempts })
+    HealInt->>Events: publishEvent(run.healing.exhausted, { runId, stepId })
+    HealInt->>Memory: updateHealingAction({ status: 'exhausted' })
+    HealInt-->>ExecEng: HealingResult { healed: false, reason: 'exhausted' }
+
+    ExecEng->>Rollback: triggerRollback(run, failedStep)
+    note over Rollback: rollback-engine executes compensating actions<br/>in reverse order for all completed steps.
+
+    loop For each completed step (reverse order)
+        Rollback->>Adapters: executeCompensatingAction(step)
+        Adapters-->>Rollback: CompensationResult
+        Rollback->>Memory: updateStep({ status: 'rolled-back' })
+        Rollback->>Audit: writeAuditEvent(rollback.action.executed, { stepId })
+        Rollback->>Memory: writeRollbackAction({ runId, stepId, status })
+    end
+
+    Rollback->>Memory: updateRun({ status: 'failed' })
+    Rollback->>Audit: writeAuditEvent(run.failed, { runId, reason: 'healing-exhausted' })
+    Rollback->>Events: publishEvent(run.failed, { runId, reason: 'healing-exhausted' })
+```
+
+### Healing Strategy Types
+
+| Strategy | Trigger Condition | Action |
+|---|---|---|
+| `retry-same` | Transient network/timeout error | Retry identical action with exponential backoff |
+| `fallback-adapter` | AI provider error or low-confidence output | Switch to next AI adapter in priority order |
+| `prompt-revision` | Output failed validation but adapter responded | Revise prompt with additional constraints |
+| `partial-replan` | Step scope too large for single action | Decompose step into smaller sub-actions |
+| `add-context` | Insufficient context in original action | Inject additional context from run memory |
+| `escalate-mode` | Low confidence across all adapters | Temporarily elevate execution mode |
+
+### Healing Loop Limits
+
+| Mode | Max Healing Attempts per Step |
+|---|---|
+| `turbo` | 1 |
+| `builder` | 2 |
+| `pro` | 3 |
+| `expert` | 3 |
+| `safe` | 5 |
+| `balanced` | 3 |
+| `god` | 5 |
+
+When `maxAttempts` is exhausted, the healing integration returns `{ healed: false, reason: 'exhausted' }` and `execution-engine.ts` immediately invokes `rollback-engine.ts`.
+
+### Audit Events in Healing
+
+| Event Type | When Emitted |
+|---|---|
+| `action.failed` | Initial step failure (execution-engine, step 7) |
+| `healing.attempt.started` | Each healing attempt begins |
+| `healing.attempt.failed` | A healing attempt produces failure |
+| `healing.succeeded` | Healing attempt produces passing revalidation |
+| `healing.exhausted` | All attempts used without success |
+| `rollback.action.executed` | Each compensating action runs |
+| `run.failed` | Run status transitions to `failed` after rollback |
diff --git a/docs/03_specs/SPEC_EXECUTION_ENGINE.md b/docs/03_specs/SPEC_EXECUTION_ENGINE.md
new file mode 100644
index 0000000..1963ff0
--- /dev/null
+++ b/docs/03_specs/SPEC_EXECUTION_ENGINE.md
@@ -0,0 +1,402 @@
+# SPEC — Execution Engine
+**Status:** Draft
+**Version:** 1.0
+**Linked to:** packages/orchestrator/src/execution-engine.ts
+**Implements:** executeRunBundle pipeline, per-task execution contract, retry policy, healing integration, rollback, and manual retry/rollback operations
+
+---
+
+## Objective
+
+Define the complete behavioral contract for the Execution Engine — the component responsible for running a `RunBundle` through a sequential task pipeline. This spec covers the 6-stage per-task pipeline (within `executeTask`), the outer bundle loop (within `executeRunBundle`), retry semantics, adapter selection, risk simulation, approval gating, outcome capture, healing integration, rollback mechanics, and the manual `retryTask` and `rollbackTask` operations.
+
+---
+
+## Scope
+
+- `executeRunBundle` function: bundle-level orchestration loop
+- `executeTask` function: per-task 6-stage pipeline
+- Retry policy: configurable attempts, healing extension
+- Adapter selection: `createProviderAdapters` + `findAdapter`
+- Simulation and risk assessment per task
+- Approval gating within task execution
+- Learning optimizer: `optimizeTasks` applied before the task loop
+- Outcome capture after bundle completion or failure
+- Healing integration: `healFailedStep` invocation and result handling
+- Automatic rollback via `adapter.rollback`
+- `retryTask`: manual single-step retry
+- `rollbackTask`: manual single-step rollback
+- Audit events and step log entries emitted at each stage
+
+Out of scope: phase engine coordination, gate evaluation, and post-deployment observability.
+
+---
+
+## Inputs / Outputs
+
+| Direction | Item | Type | Description |
+|-----------|------|------|-------------|
+| Input | `bundle` | `RunBundle` | Complete run bundle including plan, state, and existing logs |
+| Input | `actor` | `string` | Identity string for audit records (default: `"system"`) |
+| Output | `RunBundle` | `RunBundle` | Mutated bundle with updated `state`, `executionLog`, `adapters`, and `auditLog` |
+
+---
+
+## Data Structures
+
+```typescript
+// packages/shared/src/types.ts
+
+interface RunBundle {
+  state: RunState;              // mutable run state
+  plan: PlanArtifact;          // tasks to execute
+  intake: IntakeArtifact;
+  adapters: AdapterLog;        // per-task adapter execution summaries
+  executionLog: ExecutionLog;  // ordered step execution log
+  auditLog: AuditLog;          // append-only audit entries
+  gates: GateDecision[];
+  reportMarkdown: string;
+}
+
+interface PlanTask {
+  id: string;
+  title: string;
+  adapterId: string;
+  payload: Record<string, unknown>;
+  rollbackPayload?: Record<string, unknown>;
+  requiresApproval?: boolean;
+  retryPolicy?: { maxAttempts: number };
+  dependencies?: string[];
+}
+
+interface StepExecutionLog {
+  stepId: string;
+  title: string;
+  adapter: string;
+  attempt: number;
+  status: StepStatus;    // "pending" | "running" | "success" | "failed" | "paused" | "rolled-back"
+  startedAt: string;
+  finishedAt?: string;
+  output?: string;
+  error?: string;
+  rollbackAvailable: boolean;
+  risk?: ExecutionRisk;           // "low" | "medium" | "high"
+  simulationSummary?: string;
+  verificationStatus?: "passed" | "failed";
+  verificationSummary?: string;
+  fixSuggestion?: string;
+}
+
+interface AdapterExecutionSummary {
+  taskId: string;
+  adapter: string;
+  status: "success" | "failed" | "rolled-back";
+  attempts: number;
+  output: string;
+}
+```
+
+---
+
+## Interfaces / APIs
+
+### `executeRunBundle`
+
+```typescript
+export async function executeRunBundle(
+  bundle: RunBundle,
+  actor: string = "system"
+): Promise<RunBundle>
+```
+
+### `retryTask`
+
+```typescript
+export async function retryTask(
+  runId: string,
+  targetStepId?: string,
+  actor: string = "system"
+): Promise<RunBundle>
+```
+
+### `rollbackTask`
+
+```typescript
+export async function rollbackTask(
+  runId: string,
+  targetStepId?: string,
+  actor: string = "system"
+): Promise<RunBundle>
+```
+
+---
+
+## `executeRunBundle`: Full Pipeline
+
+### Pre-loop: Learning Optimizer
+
+Before the task loop begins, the execution engine applies the learning optimizer:
+
+1. `loadLearningStore()` retrieves historical run outcome data.
+2. `optimizeTasks(bundle.plan.tasks, store)` returns an optimized task list with adjusted `retryLimit` values and a list of `suggestions`.
+3. Any task with an adjusted `retryLimit` has its `retryPolicy.maxAttempts` overwritten in `bundle.plan.tasks`.
+4. If suggestions are non-empty, an `OPTIMIZER_SUGGESTIONS_APPLIED` audit event is written.
+
+### Task Loop
+
+```
+for index = bundle.state.currentStepIndex to bundle.plan.tasks.length - 1:
+    task = bundle.plan.tasks[index]
+    result = await executeTask(bundle, task, index, actor)
+    if result.completed === false:
+        if bundle.state.status === "failed":
+            recordRunOutcome(success=false)
+        return bundle  // exits early on pause or failure
+// all tasks completed:
+writeAuditEvent("RUN_COMPLETED")
+markState(bundle, "completed", { currentStepIndex: bundle.plan.tasks.length })
+recordRunOutcome(success=true)
+return bundle
+```
+
+The loop starts at `bundle.state.currentStepIndex`, enabling resume from a checkpoint without re-executing already-completed steps.
+
+---
+
+## `executeTask`: 6-Stage Per-Task Pipeline
+
+Each invocation of `executeTask` runs the following stages in strict sequence:
+
+### Stage 1: Audit Start
+
+`writeAuditEvent` with action `TASK_EXECUTION_ATTEMPT`. Fields included: `runId`, `actorName`, `actorId`, `actorType`, `orgId`, `workspaceId`, `projectId`, `correlationId`, `role`, `stepId`, `details.index`, `details.title`, `details.adapter`.
+
+### Stage 2: Policy Evaluation
+
+`evaluatePolicy(task)` is called. If `policyResult.allowed === false`:
+- Writes `POLICY_BLOCK` audit event with `details.reason`
+- Appends `StepExecutionLog` with `status: "failed"`
+- Upserts `AdapterExecutionSummary` with `status: "failed"`
+- Calls `markState(bundle, "failed", { currentStepIndex: index })`
+- Returns `{ completed: false }`
+
+### Stage 3: Adapter Lookup
+
+`findAdapter(createProviderAdapters(), task.adapterId)` resolves the adapter. If `null` is returned:
+- Writes `ADAPTER_NOT_FOUND` audit event
+- Appends step log with `status: "failed"`, upserts adapter summary
+- Calls `markState(bundle, "failed")`
+- Returns `{ completed: false }`
+
+### Stage 4: Simulation and Risk Assessment
+
+If the adapter exposes a `simulate` method, `adapter.simulate(task.payload)` is called and the result's `risk` field becomes `estimatedRisk`. If there is no `simulate` method but the adapter exposes `estimateRisk`, `adapter.estimateRisk(task.payload)` is called instead. If neither exists, `estimatedRisk` defaults to `"medium"`.
+
+`requiresApproval` is determined as:
+```
+requiresApproval = task.requiresApproval
+  || policyResult.requiresApproval
+  || simulation?.requiresApproval
+  || estimatedRisk === "high"
+```
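+
+The fallback chain and the approval predicate above can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the engine's actual code: `RiskAdapter`, `Assessment`, and `assessTask` are hypothetical names introduced here, and the two boolean parameters stand in for `task.requiresApproval` and `policyResult.requiresApproval`.
+
+```typescript
+type ExecutionRisk = "low" | "medium" | "high";
+
+interface Simulation {
+  risk: ExecutionRisk;
+  summary: string;
+  requiresApproval?: boolean;
+}
+
+// Hypothetical slice of the adapter interface: only the two optional risk hooks.
+interface RiskAdapter {
+  simulate?(payload: Record<string, unknown>): Promise<Simulation>;
+  estimateRisk?(payload: Record<string, unknown>): Promise<ExecutionRisk>;
+}
+
+interface Assessment {
+  estimatedRisk: ExecutionRisk;
+  requiresApproval: boolean;
+}
+
+// Hypothetical helper: resolves the risk, then evaluates the approval predicate.
+async function assessTask(
+  adapter: RiskAdapter,
+  payload: Record<string, unknown>,
+  taskRequiresApproval = false,
+  policyRequiresApproval = false
+): Promise<Assessment> {
+  let simulation: Simulation | undefined;
+  let estimatedRisk: ExecutionRisk = "medium"; // default when neither hook exists
+
+  if (adapter.simulate) {
+    // simulate takes precedence over estimateRisk
+    simulation = await adapter.simulate(payload);
+    estimatedRisk = simulation.risk;
+  } else if (adapter.estimateRisk) {
+    estimatedRisk = await adapter.estimateRisk(payload);
+  }
+
+  const requiresApproval =
+    taskRequiresApproval ||
+    policyRequiresApproval ||
+    simulation?.requiresApproval === true ||
+    estimatedRisk === "high";
+
+  return { estimatedRisk, requiresApproval };
+}
+```
+
+In this sketch, an adapter with neither hook yields `"medium"` risk and no approval requirement unless the task or the policy result demands one.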
+
+### Stage 4b: Approval Gating
+
+If `requiresApproval === true` and `bundle.state.approved === false`:
+- `markState(bundle, "paused", { currentStepIndex: index, approvalRequired: true, pauseReason: <reason> })`
+- Appends step log with `status: "paused"`, `attempt: 0`
+- Writes `APPROVAL_REQUIRED` audit event
+- Returns `{ completed: false, paused: true }`
+
+### Stage 5: Validation
+
+`adapter.validate(task.payload)` must resolve to a truthy value. Otherwise:
+- `adapter.suggestFix` is called if available; result stored as `fixSuggestion`
+- Writes `VALIDATION_FAILED` audit event with `fixSuggestion`
+- Appends step log with `status: "failed"` and `fixSuggestion`
+- Upserts adapter summary, calls `markState(bundle, "failed")`
+- Returns `{ completed: false }`
+
+### Stage 6: Execution with Retry and Outcome Verification
+
+A loop runs from `attempt = 1` to `maxAttempts` (from `task.retryPolicy?.maxAttempts ?? 1`):
+
+**On each attempt:**
+1. Writes `STEP_EXECUTION_STARTED` audit event with `attempt` and `risk`.
+2. Calls `adapter.execute(task.payload)`. If `result.success === false`, throws `result.error`.
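+3. If the adapter defines `verify`, calls `adapter.verify(task.payload, result)`; if `!verification.ok`, throws `"Verification failed: {summary}"`. Without a `verify` hook, the successful execution is accepted as verified.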
+4. On success: writes `STEP_EXECUTION_SUCCEEDED`, appends step log `status: "success"`, upserts adapter summary, increments `currentStepIndex`, resets `approved = false`.
+
+**On catch (error):**
+1. Writes `STEP_EXECUTION_FAILED` audit event with attempt, error, and `fixSuggestion`.
+2. Appends step log with `status: "failed"`.
+3. If this is the final attempt, invokes healing integration (Stage 6b).
+
+### Stage 6b: Healing Integration (final attempt only)
+
+`healFailedStep(context)` is called with `runId`, `stepId`, `adapterId`, `errorMessage`, `payload`, and `scope`.
+
+| Healing result | Action |
+|----------------|--------|
+| `status === "verified"` | Writes `HEALING_APPLIED_AND_VERIFIED`; increments `maxAttempts` by 1; continues loop for one additional retry |
+| `approvalRequired === true` | `markState(bundle, "paused", { ... })`; returns `{ completed: false, paused: true }` |
+| Any other status | Writes `HEALING_ATTEMPTED_BUT_ESCALATED`; falls through to automatic rollback |
+| `healFailedStep` throws | Writes `HEALING_ENGINE_ERROR`; falls through to automatic rollback |
+
+### Stage 6c: Automatic Rollback (after final failed attempt)
+
+If `task.rollbackPayload` exists and `adapter.rollback` is defined:
+1. Calls `adapter.rollback(task.rollbackPayload)`.
+2. Writes `ROLLBACK_COMPLETED` audit event with `details.automatic: true`.
+3. Appends step log with `status: "rolled-back"`.
+
+After rollback (or if rollback is not available):
+- Upserts adapter summary with `status: "failed"`, output includes `fixSuggestion` if present.
+- `markState(bundle, "failed", { currentStepIndex: index })`.
+- Returns `{ completed: false }`.
+
+---
+
+## Retry Policy
+
+| Property | Source | Default |
+|----------|--------|---------|
+| `maxAttempts` | `task.retryPolicy.maxAttempts` | `1` |
+| Learning-adjusted limit | `optimizeTasks` → `retryLimit` field | Overrides task default |
+| Healing extension | On `healFailedStep` returning `"verified"`, `maxAttempts += 1` | One additional attempt granted |
+
+**Retryable errors:** All thrown errors are retried up to `maxAttempts`. There is no per-error-type allow-list; the retry decision is purely count-based. Non-retryable conditions (policy block, adapter not found, validation failure) exit before the retry loop.
+
+**Backoff:** Not currently implemented. All retries execute immediately with no delay.
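+
+The count-based retry loop and the healing extension can be sketched as follows. This is an illustrative sketch, not the code in `execution-engine.ts`: `runWithRetry`, `Execute`, `Heal`, and `AttemptOutcome` are hypothetical names, the two callbacks stand in for `adapter.execute` (plus verification) and `healFailedStep`, and the `hardCap` parameter models the mitigation proposed in the Risks section rather than current behavior.
+
+```typescript
+type HealingStatus = "verified" | "escalated";
+
+interface AttemptOutcome {
+  succeeded: boolean;
+  attemptsUsed: number;
+}
+
+// Hypothetical stand-ins for adapter.execute (throws on failure) and healFailedStep.
+type Execute = (attempt: number) => Promise<void>;
+type Heal = (errorMessage: string) => Promise<{ status: HealingStatus }>;
+
+async function runWithRetry(
+  execute: Execute,
+  heal: Heal,
+  configuredMaxAttempts = 1,
+  hardCap = configuredMaxAttempts + 1 // assumption: at most one healing extension
+): Promise<AttemptOutcome> {
+  let maxAttempts = configuredMaxAttempts;
+
+  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
+    try {
+      await execute(attempt);
+      return { succeeded: true, attemptsUsed: attempt };
+    } catch (err) {
+      // Every thrown error is retried; the decision is purely count-based, with no delay.
+      if (attempt !== maxAttempts) continue;
+
+      // Final attempt: consult healing once before giving up.
+      const message = err instanceof Error ? err.message : String(err);
+      let verified = false;
+      try {
+        verified = (await heal(message)).status === "verified";
+      } catch {
+        verified = false; // a throwing healer is treated like an unsuccessful heal
+      }
+
+      if (verified && maxAttempts < hardCap) {
+        maxAttempts += 1; // healing extension: grant one more attempt
+        continue;
+      }
+      return { succeeded: false, attemptsUsed: attempt }; // caller proceeds to rollback
+    }
+  }
+  return { succeeded: false, attemptsUsed: maxAttempts };
+}
+```
+
+Because `maxAttempts` is re-read by the loop condition on every iteration, incrementing it inside the `catch` block is enough to grant the extra attempt.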
+
+---
+
+## Adapter Selection
+
+```typescript
+const adapters = createProviderAdapters(); // from packages/adapters/src
+const adapter = findAdapter(adapters, task.adapterId);
+```
+
+`createProviderAdapters` returns all registered adapters. `findAdapter` looks up the adapter by `adapterId`; if none matches, the Stage 3 failure path fires. The adapter interface requires:
+- `validate(payload): Promise<boolean>`
+- `execute(payload): Promise<{ success: boolean; output?: unknown; error?: string }>`
+
+Optional methods that enhance behavior:
+- `simulate(payload): Promise<{ risk: ExecutionRisk; summary: string; requiresApproval?: boolean }>`
+- `estimateRisk(payload): Promise<ExecutionRisk>`
+- `verify(payload, result): Promise<{ ok: boolean; summary: string }>`
+- `suggestFix(error, payload): Promise<string>`
+- `rollback(rollbackPayload): Promise<void>`
+
+---
+
+## Outcome Capture
+
+`recordRunOutcome` from `outcome-engine.ts` is called in two places:
+
+| When | `success` | `dominantFailureType` |
+|------|-----------|----------------------|
+| Bundle loop exits early with `status === "failed"` | `false` | `"step-failed"` |
+| All tasks complete successfully | `true` | `undefined` |
+
+`computeMetrics(bundle, success)` calculates:
+- `timeTakenMs`: sum of `finishedAt - startedAt` across all steps
+- `retryCount`: count of steps where `attempt > 1`
+- `qualityScore`: `1` if success, `0` otherwise
+- `adaptersUsed`: deduplicated list of adapter IDs from step logs
+
+`recordRunOutcome` delegates to `learnFromOutcome` in the learning engine to update the learning store for future optimizer runs.
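+
+The four metrics listed above can be sketched as a single pass over the step log. This is a hedged illustration: `computeMetricsSketch`, `StepLog`, and `RunMetrics` are hypothetical names, the function takes the step list directly rather than a full `RunBundle`, and skipping steps without `finishedAt` is an assumption not stated in this spec.
+
+```typescript
+// Hypothetical subset of StepExecutionLog: only the fields the metrics need.
+interface StepLog {
+  adapter: string;
+  attempt: number;
+  startedAt: string;   // ISO-8601 timestamp
+  finishedAt?: string; // may be absent, e.g. for a paused step
+}
+
+interface RunMetrics {
+  timeTakenMs: number;
+  retryCount: number;
+  qualityScore: 0 | 1;
+  adaptersUsed: string[];
+}
+
+function computeMetricsSketch(steps: StepLog[], success: boolean): RunMetrics {
+  let timeTakenMs = 0;
+  let retryCount = 0;
+  const adapters = new Set<string>();
+
+  for (const step of steps) {
+    // Assumption: steps that never finished contribute no duration.
+    if (step.finishedAt) {
+      timeTakenMs += Date.parse(step.finishedAt) - Date.parse(step.startedAt);
+    }
+    if (step.attempt > 1) retryCount += 1;
+    adapters.add(step.adapter);
+  }
+
+  return {
+    timeTakenMs,
+    retryCount,
+    qualityScore: success ? 1 : 0,
+    adaptersUsed: Array.from(adapters), // deduplicated, in first-seen order
+  };
+}
+```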
+
+---
+
+## Concurrency Model
+
+Execution is **strictly sequential**. The task loop processes one task at a time. `executeRunBundle` does not use `Promise.all` or the batch queue for task execution. The batch queue (`batch-queue.ts`) is a separate utility used by the action runner for agent-generated action batches; it does not influence the core execution engine loop.
+
+---
+
+## `retryTask` Specification
+
+1. `loadRunBundle(runId)` — throws if not found.
+2. Resolves `stepId`: uses `targetStepId` if provided; otherwise uses `bundle.plan.tasks[bundle.state.currentStepIndex].id`.
+3. Finds task index by `id`. Throws if not found.
+4. Updates `RunState` (`currentStepIndex = index`, `status = "running"`, `pauseReason = undefined`) and persists it via `updateRunState`.
+5. Writes `STEP_RETRY_REQUESTED` audit event.
+6. Calls `executeTask(bundle, task, index, actor)`.
+7. Reloads bundle from store via `loadRunBundle(runId)` and returns it.
+
+Note: `retryTask` re-runs the full 6-stage pipeline for the single target task. It does not continue to subsequent tasks after the retry.
+
+---
+
+## `rollbackTask` Specification
+
+1. `loadRunBundle(runId)` — throws if not found.
+2. Resolves `stepId`: uses `targetStepId` if provided; otherwise uses the `stepId` of the last entry in `bundle.executionLog.steps`.
+3. Finds `PlanTask` by `stepId`. Throws if not found.
+4. Resolves adapter via `findAdapter`. Throws if `adapter.rollback` is absent or `task.rollbackPayload` is absent.
+5. Calls `adapter.rollback(task.rollbackPayload)`.
+6. Writes a `TASK_ROLLBACK_MANUAL` audit event; `role` is set to `"operator"` when `actor` is not `"system"`.
+7. Appends step log with `status: "rolled-back"`, `title: "{title} manual rollback"`.
+8. Upserts adapter summary with `status: "rolled-back"`.
+9. Decrements `bundle.state.currentStepIndex` by 1 (if > 0).
+10. Calls `updateRunState`.
+11. Reloads and returns updated bundle.
+
+---
+
+## Dependencies
+
+| Dependency | Package | Purpose |
+|-----------|---------|---------|
+| `adapters/src` | `packages/adapters` | `createProviderAdapters`, `findAdapter` |
+| `memory/src/run-store` | `packages/memory` | `loadRunBundle`, `updateAdapters`, `updateExecutionLog`, `updateRunState` |
+| `core/src/policy-engine` | `packages/core` | `evaluatePolicy` — allows/blocks tasks |
+| `audit/src` | `packages/audit` | `writeAuditEvent` — append-only audit trail |
+| `learning/src/store` | `packages/learning` | `loadLearningStore` — historical outcome data |
+| `learning/src/execution-optimizer` | `packages/learning` | `optimizeTasks` — adjusts retry limits |
+| `healing-integration.ts` | `packages/orchestrator` | `healFailedStep` — invokes healing engine |
+| `outcome-engine.ts` | `packages/orchestrator` | `recordRunOutcome` → `learnFromOutcome` |
+
+---
+
+## Edge Cases
+
+- **Zero tasks in plan:** The task loop exits immediately; `RUN_COMPLETED` is written and `status = "completed"`.
+- **Resume at last step:** `currentStepIndex === tasks.length - 1`; only that one step re-runs.
+- **`adapter.verify` absent:** Treated as verified; step log records `"No verification hook; accepting successful execution."`.
+- **Healing throws:** Caught internally; `HEALING_ENGINE_ERROR` written; automatic rollback proceeds normally.
+- **`rollbackTask` with no prior step log:** `bundle.executionLog.steps.at(-1)` returns `undefined`; `stepId` is `undefined`; `task` lookup fails; throws `"Step not found for rollback: undefined"`.
+- **`retryTask` on a completed run:** Allowed by implementation — the step is re-executed. Callers should check `bundle.state.status` before calling to avoid redundant retries.
+- **Healing extends `maxAttempts` beyond the loop bounds:** The `continue` statement after `maxAttempts += 1` re-enters the `for` loop with the new limit, so execution correctly attempts one more time.
+
+---
+
+## Risks
+
+| Risk | Likelihood | Impact | Mitigation |
+|------|-----------|--------|-----------|
+| No backoff between retries causes rapid failure cascade | Medium | Medium | Add configurable backoff to `retryPolicy`; expose `backoffMs` in `PlanTask` |
+| `approved` reset after each step may cause re-pause on next high-risk step | Low | Medium | Expected behavior; document that approval is per-step, not per-run |
+| Healing extending `maxAttempts` indefinitely if healing keeps succeeding | Low | High | Cap total attempts at a hard limit (e.g., the originally configured `maxAttempts` plus one, never more) |
+| Learning optimizer applying incorrect retry limits from stale store | Medium | Medium | Version the learning store; include store timestamp in optimizer suggestions |
+| Manual rollback decrementing index incorrectly when step was not the current one | Medium | Low | `rollbackTask` decrements whenever `currentStepIndex > 0`, regardless of which step was rolled back; validate that the rolled-back step corresponds to the current position before decrementing |
+
+---
+
+## Definition of Done
+
+- [ ] `executeRunBundle` processes all tasks sequentially and returns completed bundle when all succeed
+- [ ] Policy block at stage 2 produces `status: failed` and correct audit events
+- [ ] Adapter not found at stage 3 produces `status: failed` with `ADAPTER_NOT_FOUND` event
+- [ ] Approval gating at stage 4b pauses bundle with correct `pauseReason` and `APPROVAL_REQUIRED` event
+- [ ] Validation failure at stage 5 includes `fixSuggestion` in step log and audit
+- [ ] Retry loop runs `maxAttempts` times before triggering healing
+- [ ] Healing `"verified"` result causes one additional retry attempt
+- [ ] Healing `approvalRequired` result pauses the run
+- [ ] Automatic rollback fires when `rollbackPayload` is present after final failure
+- [ ] `recordRunOutcome` called with `success=false` on early loop exit and `success=true` on completion
+- [ ] `retryTask` resumes from exact step index with `STEP_RETRY_REQUESTED` audit event
+- [ ] `rollbackTask` throws when adapter has no `rollback` method or task has no `rollbackPayload`
+- [ ] Learning optimizer suggestions are logged as `OPTIMIZER_SUGGESTIONS_APPLIED` when non-empty
+- [ ] Zero-task bundle completes immediately with `RUN_COMPLETED` event
diff --git a/docs/03_specs/SPEC_ORCHESTRATOR.md b/docs/03_specs/SPEC_ORCHESTRATOR.md
new file mode 100644
index 0000000..188a62f
--- /dev/null
+++ b/docs/03_specs/SPEC_ORCHESTRATOR.md
@@ -0,0 +1,480 @@
+# SPEC — Orchestrator
+**Status:** Draft
+**Version:** 1.0
+**Linked to:** packages/orchestrator/src/index.ts
+**Implements:** System-level orchestration wiring, entry points, mode control, batch queue, trust pipeline, healing hooks, resume/rollback flows, log writing, and canonical event contract
+
+---
+
+## Objective
+
+Define the complete behavioral contract for the Orchestrator package — the central coordination layer that ties together intake, planning, skill selection, gate evaluation, execution, outcome capture, healing, resume, and rollback into coherent pipelines. This spec describes how all sub-components are connected, which entry points exist, how mode policy propagates, and which events the orchestrator must emit in which order.
+
+---
+
+## Scope
+
+- System overview and component wiring
+- Entry points: `runVerticalSlice`, `runOrchestrationStep`, `resumeRun`
+- Mode controller: `getModePolicy`, `trimQuestionsByMode`
+- Batch queue: `QueuedBatch` lifecycle and storage
+- Trust pipeline: `prepareTrustedBatch` and what it evaluates
+- Healing integration hook within execution loop
+- Resume flow: state reconstruction and continuation
+- Rollback engine: `rollbackRun` and `rollbackTask`
+- Log writer: artifact and log file conventions
+- Canonical events contract: ordering, required fields, scoping rules
+
+Out of scope: individual adapter implementations, authentication infrastructure, database schema migrations.
+
+---
+
+## Inputs / Outputs
+
+| Direction | Item | Type | Description |
+|-----------|------|------|-------------|
+| Input | `RunVerticalSliceInput` | `{ idea, mode?, dryRun?, approvedGates?, currentRun? }` | Full pipeline trigger |
+| Input | `RunReport` | Full report | Single-step trigger via `runOrchestrationStep` |
+| Input | `runId` + `approve` flag | `string`, `boolean` | Resume trigger via `resumeRun` |
+| Output | `RunVerticalSliceResult` | `{ report, artifactDirectory, artifactReportPath, memoryPath, overallGateStatus, currentPhase }` | Full result after pipeline completes or halts |
+| Output | `RunBundle` | Persisted in run-store | State snapshot after any execution step |
+
+---
+
+## Data Structures
+
+```typescript
+// packages/orchestrator/src/run-vertical-slice.ts
+interface RunVerticalSliceInput {
+  idea: string;
+  mode?: Mode;           // default: "builder"
+  dryRun?: boolean;
+  approvedGates?: string[];
+  currentRun?: RunReport; // inject existing report to continue from checkpoint
+}
+
+interface RunVerticalSliceResult {
+  report: RunReport;
+  artifactDirectory: string;
+  artifactReportPath: string;
+  memoryPath: string;
+  overallGateStatus: string;
+  currentPhase: string;
+}
+
+// packages/orchestrator/src/batch-queue.ts
+interface QueuedBatch {
+  id: string;          // "batch_" + random 8-char hex
+  runId: string;
+  phase: string;
+  createdAt: string;
+  status: QueueStatus; // "pending" | "approved" | "executed" | "blocked"
+  riskSummary: { low: number; medium: number; high: number };
+  generatedBy: string;
+  summary: string;
+  batch: BuilderActionBatch;
+}
+
+// packages/orchestrator/src/rollback-metadata.ts
+interface RollbackEntry {
+  type: "file_write" | "file_append" | "dir_create" | "command";
+  target: string;       // relative path from workspaceRoot
+  timestamp: string;
+  note: string;
+}
+
+interface RollbackMetadata {
+  runId: string;
+  entries: RollbackEntry[];
+}
+```
+
+---
+
+## Interfaces / APIs
+
+### Public Exports from `packages/orchestrator/src/index.ts`
+
+```typescript
+// Mode control
+export { getModePolicy, trimQuestionsByMode } from "./mode-controller";
+
+// Intake
+export { runIntake } from "./intake";
+
+// Gate evaluation
+export { evaluateGates } from "./gate-manager";
+
+// Full pipeline + single-step
+export { runVerticalSlice } from "./run-vertical-slice";
+
+// Task execution
+export * from "./execution-engine";    // executeRunBundle, retryTask, rollbackTask
+
+// Run management
+export * from "./resume-run";          // resumeRun, inspectRun
+export * from "./rollback-engine";     // rollbackRun
+export * from "./outcome-engine";      // recordRunOutcome
+export * from "./healing-integration"; // healFailedStep
+
+// Utility
+export * from "./action-runner";       // runActionBatch
+export * from "./log-writer";          // writeArtifact, writeJsonRecord, writeActionLog
+export * from "./rollback-metadata";   // RollbackEntry, RollbackMetadata types
+export * from "./batch-queue";         // createQueuedBatch, listQueuedBatches, etc.
+```
+
+---
+
+## System Overview: Component Wiring
+
+The following diagram shows how orchestrator sub-components are invoked during a standard `runVerticalSlice` call.
+
+```
+runVerticalSlice(input)
+  │
+  ├─► getModePolicy(mode)
+  │     └─► Returns ModePolicy (gateThresholds, execution config)
+  │
+  ├─► runOrchestrationStep(report) [loop until blocked/finished/expert-mode-pause]
+  │     │
+  │     ├─[intake phase]──► runIntake(idea, mode)
+  │     │                       ├─► normalizeIdeaText
+  │     │                       ├─► inferSolutionCategory
+  │     │                       ├─► deriveAssumptions
+  │     │                       ├─► generateClarifyingQuestions
+  │     │                       └─► trimQuestionsByMode(questions, mode)
+  │     │
+  │     ├─[planning phase]─► buildPlanFromClarification(intakeResult)
+  │     │
+  │     ├─[skills phase]──► selectSkills(clarification, plan)
+  │     │
+  │     ├─[gating phase]──► evaluateGates(intakeResult, plan, skills, mode, approvedGates)
+  │     │                       └─► [if needs-review] emitGateAwaitingApproval(state, reason)
+  │     │
+  │     └─[building phase]─► executeRunBundle(bundle, actor)
+  │                               ├─► [pre-loop] optimizeTasks(tasks, learningStore)
+  │                               ├─► [per-task] executeTask(bundle, task, index, actor)
+  │                               │       ├─► writeAuditEvent(TASK_EXECUTION_ATTEMPT)
+  │                               │       ├─► evaluatePolicy(task)
+  │                               │       ├─► findAdapter(adapters, task.adapterId)
+  │                               │       ├─► adapter.simulate / adapter.estimateRisk
+  │                               │       ├─► [if approval needed] markState(paused)
+  │                               │       ├─► adapter.validate
+  │                               │       ├─► adapter.execute + adapter.verify [retry loop]
+  │                               │       ├─► [on final failure] healFailedStep(context)
+  │                               │       │       └─► attemptHealing (healing-engine)
+  │                               │       └─► [on heal fail] adapter.rollback(rollbackPayload)
+  │                               │
+  │                               └─► [on completion] recordRunOutcome → learnFromOutcome
+  │
+  └─► recordRun(report)
+        └─► Persists to memory; returns artifactDirectory, reportPath, memoryPath
+```
+
+---
+
+## Mode Controller
+
+`getModePolicy(mode: Mode): ModePolicy` returns a `ModePolicy` object containing:
+
+- `maxClarifyingQuestions`: upper bound on questions surfaced to user
+- `gateThresholds`: numeric thresholds used by all 5 gate evaluators
+- `execution`: flags controlling approval requirements and dry-run behavior
+
+`trimQuestionsByMode<T extends QuestionLike>(questions: T[], mode: Mode): T[]`:
+- Sorts questions in descending order of priority weight: `required` (100) > `critical` (90) > `high` (70) > `medium` (50) > default (40) > `low` (30)
+- Slices to `policy.maxClarifyingQuestions`
+- Used in `intake.ts` to limit clarifying questions surfaced per mode
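+
+The sort-then-slice behaviour described above can be sketched as follows. This is a minimal sketch under stated assumptions: `trimQuestionsSketch`, `weightOf`, and the `priority` field on `QuestionLike` are hypothetical, and the per-mode limits are copied from the Mode Policy Summary table in this spec instead of being read from `getModePolicy`.
+
+```typescript
+type Mode = "turbo" | "builder" | "pro" | "expert" | "safe" | "balanced" | "god";
+
+// Assumption: questions carry their priority as an optional string label.
+interface QuestionLike {
+  priority?: string;
+}
+
+// Limits copied from the Mode Policy Summary table.
+const MAX_QUESTIONS: Record<Mode, number> = {
+  turbo: 2, builder: 5, pro: 8, expert: 15, safe: 20, balanced: 10, god: 0,
+};
+
+const PRIORITY_WEIGHT = new Map<string, number>([
+  ["required", 100], ["critical", 90], ["high", 70], ["medium", 50], ["low", 30],
+]);
+const DEFAULT_WEIGHT = 40; // used when the priority is missing or not listed
+
+function weightOf(question: QuestionLike): number {
+  return PRIORITY_WEIGHT.get(question.priority ?? "") ?? DEFAULT_WEIGHT;
+}
+
+function trimQuestionsSketch<T extends QuestionLike>(questions: T[], mode: Mode): T[] {
+  return questions
+    .slice()                                    // copy so the caller's array is not reordered
+    .sort((a, b) => weightOf(b) - weightOf(a))  // highest weight first
+    .slice(0, MAX_QUESTIONS[mode]);             // keep at most the mode's limit
+}
+```
+
+Because `Array.prototype.sort` is stable in current JavaScript engines, questions with equal weight keep their original relative order in this sketch.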
+
+### Mode Policy Summary
+
+| Mode | Max Questions | Medium Risk Approval | High Risk Approval | Dry Run Default | Command Exec |
+|------|--------------|---------------------|-------------------|-----------------|-------------|
+| turbo | 2 | No | Yes | No | Yes |
+| builder | 5 | Yes | Yes | No | Yes |
+| pro | 8 | Yes | Yes | Yes | Yes |
+| expert | 15 | Yes | Yes | Yes | Yes |
+| safe | 20 | Yes | Yes | Yes | No |
+| balanced | 10 | Yes | Yes | Yes | Yes |
+| god | 0 | No | No | No | Yes |
+
+---
+
+## Batch Queue
+
+The batch queue is a file-backed queue stored at `{workspaceRoot}/.ck/queue/{id}.json`. It is used by the action runner to stage agent-generated `BuilderActionBatch` objects for approval or execution.
+
+### Lifecycle
+
+```
+createQueuedBatch(params)   → writes {id}.json with status "pending"
+updateQueuedBatchStatus(id) → overwrites file with new status
+getQueuedBatch(id)          → reads and parses {id}.json
+listQueuedBatches(runId?)   → reads all .json files, filters by runId, sorts by createdAt
+```
+
+### Status Transitions
+
+```
+pending → approved  (human approves the batch)
+pending → blocked   (gate evaluation or policy rejects)
+approved → executed (action runner processes the batch)
+```
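+
+A transition table like the one above can be checked with a small lookup. The sketch below is illustrative only: `ALLOWED_TRANSITIONS`, `canTransition`, and `assertTransition` are hypothetical names, this spec does not state that `batch-queue.ts` enforces transitions, and treating `executed` and `blocked` as terminal is an assumption based on no outgoing transitions being listed.
+
+```typescript
+type QueueStatus = "pending" | "approved" | "executed" | "blocked";
+
+// Allowed next states, taken from the transition list above.
+const ALLOWED_TRANSITIONS: Record<QueueStatus, readonly QueueStatus[]> = {
+  pending: ["approved", "blocked"],
+  approved: ["executed"],
+  executed: [], // assumed terminal
+  blocked: [],  // assumed terminal
+};
+
+function canTransition(from: QueueStatus, to: QueueStatus): boolean {
+  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
+}
+
+// Throws when the requested transition is not in the table.
+function assertTransition(from: QueueStatus, to: QueueStatus): void {
+  if (!canTransition(from, to)) {
+    throw new Error(`Illegal batch status transition: ${from} -> ${to}`);
+  }
+}
+```
+
+A guard of this kind would typically run before the status file is overwritten, so that an `executed` batch cannot silently return to `pending`.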
+
+### Risk Summary
+
+Each `QueuedBatch` carries a `riskSummary: { low, medium, high }` count summarizing the actions in its `BuilderActionBatch`. This is used to surface a concise approval prompt to operators.
+
+### Concurrency
+
+The queue has no locking mechanism. Concurrent writes to the same batch file resolve as last-writer-wins, and consumers must not assume atomic updates across multiple batches.
+
+---
+
+## Trust Pipeline
+
+`prepareTrustedBatch` in `trust-pipeline.ts` is called before a `BuilderActionBatch` is executed to establish cryptographic provenance and a diff preview.
+
+### What It Evaluates and Produces
+
+1. **Diff preview:** `writeDiffPreview(workspaceRoot, batch)` generates a human-readable preview of all file changes in the batch. Written to the artifact store.
+2. **Provenance record:** `createBatchProvenance({ batch, sourcePhase, sourceArtifact, actor })` captures who generated the batch and from which phase/artifact.
+3. **Batch signing:** `signBatch({ batch, provenance, secret })` produces a signed envelope (`BatchSignedEnvelope`) using the provided `signingSecret`. Written to the workspace as a signature file.
+
+### Return Values
+
+```typescript
+{
+  diffArtifactPath: string;    // path to generated diff preview
+  provenancePath: string;      // path to provenance JSON
+  signaturePath: string;       // path to signed batch envelope
+  envelope: BatchSignedEnvelope;
+}
+```
+
+### When It Is Invoked
+
+`prepareTrustedBatch` is called by components that generate and stage action batches before execution (e.g., agent-generated file write batches). It is not automatically invoked by `executeRunBundle` — it is an opt-in call from the action runner or phase-level code that produces batch artifacts.
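+
+To make the signing step concrete, here is a minimal sketch of an HMAC-based envelope. It is an assumption that the security package uses HMAC-SHA256; the actual `signBatch` algorithm and the `BatchSignedEnvelope` shape are not specified in this document. `SketchEnvelope`, `signBatchSketch`, and `verifyEnvelopeSketch` are hypothetical names. The sketch rejects an empty secret outright.
+
+```typescript
+import { createHmac, timingSafeEqual } from "node:crypto";
+
+// Hypothetical shape: the real BatchSignedEnvelope may differ.
+interface SketchEnvelope {
+  payload: string;   // JSON serialization of batch + provenance
+  signature: string; // hex-encoded HMAC-SHA256 of payload
+}
+
+function signBatchSketch(batch: unknown, provenance: unknown, secret: string): SketchEnvelope {
+  if (secret.length === 0) {
+    throw new Error("signing secret must not be empty");
+  }
+  // Note: JSON.stringify depends on key insertion order, so the payload string
+  // itself (not a re-serialization) is what gets verified later.
+  const payload = JSON.stringify({ batch, provenance });
+  const signature = createHmac("sha256", secret).update(payload).digest("hex");
+  return { payload, signature };
+}
+
+function verifyEnvelopeSketch(envelope: SketchEnvelope, secret: string): boolean {
+  const expected = createHmac("sha256", secret).update(envelope.payload).digest();
+  const actual = Buffer.from(envelope.signature, "hex");
+  // timingSafeEqual throws on length mismatch, so compare lengths first.
+  return actual.length === expected.length && timingSafeEqual(actual, expected);
+}
+```
+
+Verifying the stored payload string rather than re-serializing the batch avoids false mismatches caused by key ordering.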
+
+---
+
+## Healing Integration Hook
+
+`healFailedStep(context: FailedStepContext): Promise<HealingAttempt>` is the orchestrator's integration point with the healing engine. It is invoked exclusively from within `executeTask` in `execution-engine.ts`, on the final retry attempt of a failing task.
+
+```typescript
+interface FailedStepContext {
+  runId: string;
+  stepId: string;
+  adapterId: string;
+  errorMessage: string;
+  payload?: Record<string, unknown>;
+  workingDirectory?: string;
+  scope?: ExecutionScope;
+}
+```
+
+`healFailedStep` delegates to `attemptHealing` from `packages/healing/src/healing-engine`. The `HealingAttempt` return type (from `packages/shared/src/phase10_5-types`) carries:
+
+| Outcome | Effect |
+|-------|---------|
+| `status: "verified"` | Healing applied and re-validated; execution engine grants one more retry |
+| `approvalRequired: true` | Healing strategy selected but requires human approval; run pauses |
+| Any other status | Healing could not resolve; execution engine proceeds to automatic rollback |
+
+The orchestrator does not retry healing more than once per step failure. Healing outcome is recorded via `writeAuditEvent` with one of: `HEALING_APPLIED_AND_VERIFIED`, `HEALING_ATTEMPTED_BUT_ESCALATED`, or `HEALING_ENGINE_ERROR`.
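+
+The three outcomes in the table can be read as a simple decision function. This is a sketch under stated assumptions: `HealingAttemptSketch` models only the two documented fields, `decideAfterHealing` and `NextAction` are hypothetical names, and checking `approvalRequired` before `status` is an assumed precedence that the source does not confirm.
+
+```typescript
+// Minimal stand-in for HealingAttempt: only the two documented fields are modelled.
+interface HealingAttemptSketch {
+  status: string;
+  approvalRequired?: boolean;
+}
+
+type NextAction = "retry-once" | "pause-for-approval" | "rollback";
+
+function decideAfterHealing(attempt: HealingAttemptSketch): NextAction {
+  // Assumed precedence: a pending approval wins over any status value.
+  if (attempt.approvalRequired === true) return "pause-for-approval";
+  if (attempt.status === "verified") return "retry-once";
+  // Any other status: healing could not resolve the failure.
+  return "rollback";
+}
+```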
+
+---
+
+## Resume Flow
+
+`resumeRun(runId: string, approve: boolean, actor: string): Promise<RunBundle>` reconstructs and continues a paused run.
+
+### Steps
+
+1. `loadRunBundle(runId)` loads the full `RunBundle` from the memory store. Throws `"Run not found: {runId}"` if absent.
+2. If `approve === true`:
+   - Sets `bundle.state.approved = true`
+   - Sets `bundle.state.approvalRequired = false`
+   - Sets `bundle.state.updatedAt = now()`
+   - Calls `updateRunState(runId, bundle.state)` to persist the approval
+3. If `bundle.state.status === "completed"`, returns immediately without re-executing.
+4. Calls `executeRunBundle(bundle, actor)` which resumes from `bundle.state.currentStepIndex`.
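+
+The four steps above can be sketched with the store and engine injected as dependencies. The shapes below are trimmed-down, hypothetical stand-ins (`RunStateSketch`, `RunBundleSketch`, `ResumeDeps`, `resumeRunSketch`), and treating `loadRunBundle` as synchronous is an assumption made to keep the example short.
+
+```typescript
+// Hypothetical, trimmed-down shapes for illustration only.
+interface RunStateSketch {
+  status: "planned" | "running" | "paused" | "completed" | "failed" | "cancelled";
+  approved: boolean;
+  approvalRequired: boolean;
+  updatedAt: string;
+  currentStepIndex: number;
+}
+interface RunBundleSketch { state: RunStateSketch }
+
+interface ResumeDeps {
+  loadRunBundle(runId: string): RunBundleSketch | undefined;
+  updateRunState(runId: string, state: RunStateSketch): void;
+  executeRunBundle(bundle: RunBundleSketch, actor: string): Promise<RunBundleSketch>;
+}
+
+async function resumeRunSketch(
+  deps: ResumeDeps, runId: string, approve: boolean, actor: string,
+): Promise<RunBundleSketch> {
+  const bundle = deps.loadRunBundle(runId);                 // step 1
+  if (!bundle) throw new Error(`Run not found: ${runId}`);
+  if (approve) {                                            // step 2
+    bundle.state.approved = true;
+    bundle.state.approvalRequired = false;
+    bundle.state.updatedAt = new Date().toISOString();
+    deps.updateRunState(runId, bundle.state);
+  }
+  if (bundle.state.status === "completed") return bundle;   // step 3: no re-execution
+  return deps.executeRunBundle(bundle, actor);              // step 4
+}
+```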
+
+### Phase-level Resume
+
+`runVerticalSlice` supports phase-level resume via `input.currentRun`. When provided:
+- The report's `completedPhases`, `currentPhase`, `approvedGates`, and all artifacts are preserved.
+- The orchestration loop begins at `currentPhase` without re-running already-completed phases.
+- This enables resuming a run that was blocked at gating after adding new gate approvals.
+
+---
+
+## Rollback Engine
+
+`rollbackRun(workspaceRoot: string, runId: string): RollbackOutcome` coordinates a full multi-step rollback based on persisted rollback metadata files.
+
+### How It Works
+
+1. Reads all files matching `{workspaceRoot}/.ck/logs/{runId}/*-rollback.json`.
+2. Processes files in **reverse chronological order** (most recent first).
+3. For each file, processes entries in **reverse order** (last action undone first).
+
+### Entry Type Handling
+
+| Entry Type | Action | Can Be Reverted? |
+|-----------|--------|-----------------|
+| `file_write` | `fs.unlinkSync(target)` if file exists | Yes |
+| `dir_create` | `fs.rmdirSync(target)` if empty | Conditional (non-empty dirs skipped) |
+| `file_append` | Skipped with note | No (manual only) |
+| `command` | Skipped with note | No (manual only) |
+| Unknown | Skipped with note | No |
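+
+The ordering and entry handling described above can be sketched as a pure planning function that touches no files. Assumptions are labeled: `RollbackFileSketch`, `PlannedUndo`, and `planRollback` are hypothetical names, and the sketch assumes rollback file names sort chronologically (for example via a timestamp or sequence prefix), which this document does not state.
+
+```typescript
+interface RollbackEntrySketch { type: string; target: string }
+// Assumption: file names sort chronologically, e.g. through a sequence prefix.
+interface RollbackFileSketch { name: string; entries: RollbackEntrySketch[] }
+
+interface PlannedUndo {
+  target: string;
+  action: "unlink" | "rmdir-if-empty" | "skip";
+}
+
+function planRollback(files: RollbackFileSketch[]): PlannedUndo[] {
+  // Most recent file first.
+  const newestFirst = [...files].sort((a, b) => b.name.localeCompare(a.name));
+  const plan: PlannedUndo[] = [];
+  for (const file of newestFirst) {
+    // Within a file, the last recorded action is undone first.
+    for (const entry of [...file.entries].reverse()) {
+      if (entry.type === "file_write") {
+        plan.push({ target: entry.target, action: "unlink" });
+      } else if (entry.type === "dir_create") {
+        plan.push({ target: entry.target, action: "rmdir-if-empty" });
+      } else {
+        // file_append, command, and unknown types are left for manual handling.
+        plan.push({ target: entry.target, action: "skip" });
+      }
+    }
+  }
+  return plan;
+}
+```
+
+Reversing both levels means a file written inside a newly created directory is removed before the directory itself is checked for emptiness.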
+
+### Return Value
+
+```typescript
+interface RollbackOutcome {
+  runId: string;
+  attempted: number;  // total entries processed
+  reverted: number;   // successfully undone
+  skipped: number;    // not undoable automatically
+  notes: string[];    // human-readable log of each action
+}
+```
+
+### Relationship to `rollbackTask`
+
+`rollbackRun` (in `rollback-engine.ts`) operates on filesystem metadata records — it is a coarse-grained undo of file system changes. `rollbackTask` (in `execution-engine.ts`) calls `adapter.rollback(rollbackPayload)` which is a fine-grained, adapter-aware undo of a single task's side effects. Both can be used independently.
+
+---
+
+## Log Writer
+
+`log-writer.ts` provides three file-writing utilities:
+
+### `writeArtifact(workspaceRoot, runId, phase, markdown): string`
+- Path: `{workspaceRoot}/.ck/artifacts/{runId}/{phase}.md`
+- Used to persist markdown reports for each phase execution
+- Returns the full written path
+
+### `writeJsonRecord(workspaceRoot, bucket, runId, filename, payload): string`
+- Path: `{workspaceRoot}/.ck/{bucket}/{runId}/{filename}`
+- Used for structured JSON records (e.g., gate results, plan artifacts)
+- Returns the full written path
+
+### `writeActionLog(workspaceRoot, runId, filename, payload): string`
+- Path: `{workspaceRoot}/.ck/logs/{runId}/{filename}`
+- Accepts string or JSON-serializable payload
+- Used for rollback metadata files and action execution logs
+- Returns the full written path
+
+All three functions call `fs.mkdirSync(dir, { recursive: true })` before writing, so directories are always created as needed.
+
+---
+
+## Canonical Events Contract
+
+During a standard run, the orchestrator must emit the events listed in this section in the order shown. All events are published via `publishEvent(eventType, payload)` from `packages/events/src`.
+
+### Event Emission Rules
+
+1. Events are only emitted when `RunState.orgId` and `RunState.workspaceId` are present. Missing tenant scope silently skips emission.
+2. Every event payload includes: `runId`, `tenant: { orgId, workspaceId, projectId }`, `actor: { id, type, authMode }`, `correlationId`.
+3. Events are fire-and-forget from the orchestrator's perspective; emission errors do not halt execution.
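+
+Rules 1 and 3 can be illustrated with a small guard around the publisher. This is a sketch, not the actual `emitOrchestratorEvent`: `emitIfScoped`, `PublishFn`, and `TenantScopedState` are hypothetical, the publisher is assumed to return a promise, and the `actor` block from rule 2 is omitted for brevity.
+
+```typescript
+interface TenantScopedState {
+  runId: string;
+  orgId?: string;
+  workspaceId?: string;
+  projectId?: string;
+  correlationId?: string;
+}
+
+// Assumption: the publisher is asynchronous and may reject.
+type PublishFn = (eventType: string, payload: Record<string, unknown>) => Promise<void>;
+
+function emitIfScoped(publish: PublishFn, eventType: string, state: TenantScopedState): boolean {
+  // Rule 1: without both orgId and workspaceId the event is silently skipped.
+  if (!state.orgId || !state.workspaceId) {
+    return false;
+  }
+  const payload = {
+    runId: state.runId,
+    tenant: { orgId: state.orgId, workspaceId: state.workspaceId, projectId: state.projectId },
+    correlationId: state.correlationId,
+  };
+  // Rule 3: fire-and-forget. Neither a rejection nor a synchronous throw reaches the caller.
+  try {
+    publish(eventType, payload).catch(() => undefined);
+  } catch {
+    // Swallowed on purpose: emission errors must not halt execution.
+  }
+  return true;
+}
+```
+
+Returning a boolean makes skipped emissions countable, which is one way to add the telemetry suggested under Risks.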
+
+### Canonical Sequence for a Successful Run
+
+| Order | Event Type | Emitted By | Trigger |
+|-------|-----------|-----------|---------|
+| 1 | `execution.started` | phase-engine.ts building handler | Before `executeRunBundle` is called |
+| 2 | `execution.completed` | phase-engine.ts building handler | After `executeRunBundle` returns with `status: "completed"` |
+
+### Canonical Sequence for a Paused Run (approval required)
+
+| Order | Event Type | Emitted By | Trigger |
+|-------|-----------|-----------|---------|
+| 1 | `execution.started` | phase-engine.ts building handler | Before `executeRunBundle` is called |
+| 2 | `gate.awaiting_approval` | phase-engine.ts building handler | After `executeRunBundle` returns with `status: "paused"` |
+
+### Canonical Sequence for a Gating Pause
+
+| Order | Event Type | Emitted By | Trigger |
+|-------|-----------|-----------|---------|
+| 1 | `gate.awaiting_approval` | phase-engine.ts gating handler | When `evaluateGates` returns `overallStatus: "needs-review"` |
+
+### Canonical Sequence for a Failed Run
+
+| Order | Event Type | Emitted By | Trigger |
+|-------|-----------|-----------|---------|
+| 1 | `execution.started` | phase-engine.ts building handler | Before `executeRunBundle` is called |
+| 2 | `execution.failed` | phase-engine.ts building handler | After `executeRunBundle` returns with `status: "failed"` |
+
+### Additional Events (verification)
+
+| Event Type | Emitted By | Trigger |
+|-----------|-----------|---------|
+| `verification.completed` | Callers using `emitVerificationCompleted` helper | After adapter verification step, if caller opts in |
+
+---
+
+## Dependencies
+
+| Dependency | Package | Purpose |
+|-----------|---------|---------|
+| `intake.ts` | `packages/orchestrator` | Phase 1: idea normalization and clarification |
+| `planner.ts` | `packages/orchestrator` | Phase 2: task plan generation |
+| `skill-engine` | `packages/skill-engine` | Phase 3: skill selection |
+| `gate-manager.ts` | `packages/orchestrator` | Phase 4: gate evaluation |
+| `execution-engine.ts` | `packages/orchestrator` | Phase 5: task execution |
+| `mode-controller.ts` | `packages/orchestrator` | Policy per mode |
+| `events.ts` | `packages/orchestrator` | Orchestrator-scoped event emission helpers |
+| `memory/src` | `packages/memory` | `recordRun`, `loadRunBundle`, `updateRunState` |
+| `healing/src` | `packages/healing` | `attemptHealing` via `healing-integration.ts` |
+| `learning/src` | `packages/learning` | `learnFromOutcome` via `outcome-engine.ts` |
+| `security/src` | `packages/security` | Batch provenance and signing via `trust-pipeline.ts` |
+| `audit/src` | `packages/audit` | `writeAuditEvent` |
+| `events/src` | `packages/events` | `publishEvent` |
+
+---
+
+## Edge Cases
+
+- **`runVerticalSlice` with `mode === "expert"`:** Only one phase executes per call. The caller must re-invoke with the returned report as `currentRun` to advance. `shouldContinue` returns `false` immediately in expert mode.
+- **`runVerticalSlice` with `mode === "turbo"`:** The inner `while` loop continues until `isFinished === true` or `status !== "in-progress"`, resulting in a fully autonomous single-call pipeline.
+- **`prepareTrustedBatch` called without signing secret:** `signBatch` receives an empty string; signing behavior depends on the security package's handling of empty secrets — this should be treated as an error.
+- **`rollbackRun` with no rollback log directory:** Returns `RollbackOutcome` with `attempted=0`, `reverted=0`, and note `"No rollback logs found."` — a safe no-op.
+- **Event emission with partial tenant scope (orgId present but workspaceId missing):** `emitOrchestratorEvent` silently skips; no warning is logged by default (there is a commented-out `console.warn` in the source).
+- **`resumeRun` called on a completed run:** Returns the bundle immediately without re-execution. Idempotent.
+- **Batch queue file corruption:** `getQueuedBatch` will throw a JSON parse error. No error recovery is built in; the file must be manually repaired or deleted.
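+
+The expert and turbo behaviors in the first two bullets can be sketched as one driver loop. This is a simplified illustration, not the real `runVerticalSlice`: `driveSketch` and `StepReport` are hypothetical, every non-expert mode is treated like turbo, and the `maxIterations` guard is an addition that the current loop reportedly lacks.
+
+```typescript
+interface StepReport { status: string; isFinished: boolean }
+
+async function driveSketch(
+  initial: StepReport,
+  mode: string,
+  step: (report: StepReport) => Promise<StepReport>,
+  maxIterations = 32, // hypothetical guard, not present in the documented loop
+): Promise<StepReport> {
+  // Every mode runs at least one phase per call.
+  let report = await step(initial);
+  let iterations = 1;
+  // Expert mode stops after one phase; other modes continue until finished or no longer in progress.
+  while (mode !== "expert" && !report.isFinished && report.status === "in-progress") {
+    if (iterations >= maxIterations) {
+      throw new Error(`Phase loop exceeded ${maxIterations} iterations`);
+    }
+    report = await step(report);
+    iterations++;
+  }
+  return report;
+}
+```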
+
+---
+
+## Risks
+
+| Risk | Likelihood | Impact | Mitigation |
+|------|-----------|--------|-----------|
+| Event emission skipped silently when tenant scope is missing | High | Medium | Enable the commented-out `console.warn`; add telemetry for skipped event count |
+| Batch queue ID collision when concurrent `createQueuedBatch` calls generate identical random IDs | Very Low | Low | Collision probability for `Math.random().toString(36).slice(2,10)` is negligible; switch to a UUID for production |
+| `rollbackRun` deletes files without confirming they were created by this run | Low | High | Rollback metadata entries should include a run-scoped checksum or creation marker |
+| Trust pipeline signing secret passed as plain string in function call | Medium | High | Move `signingSecret` to environment variable retrieval inside `trust-pipeline.ts` |
+| `runVerticalSlice` has no timeout; in turbo mode the loop could run indefinitely if a phase keeps reporting `in-progress` without finishing | Low | Medium | Add a maximum phase iteration count guard inside the `while` loop |
+
+---
+
+## Definition of Done
+
+- [ ] `runVerticalSlice` completes full 8-phase pipeline in builder mode with a valid idea
+- [ ] `runVerticalSlice` stops after phase 1 in expert mode and returns correct `currentPhase`
+- [ ] `runVerticalSlice` loops to completion in turbo mode without requiring multiple calls
+- [ ] `resumeRun(runId, approve=true)` resumes a paused bundle from the correct `currentStepIndex`
+- [ ] `resumeRun` on a completed run returns the bundle without re-executing
+- [ ] `rollbackRun` processes rollback files in reverse chronological order with entries in reverse order
+- [ ] `rollbackRun` skips `file_append` and `command` entries with appropriate notes
+- [ ] `prepareTrustedBatch` produces a diff artifact, provenance file, and signature file
+- [ ] All 5 canonical event types emit with correct `runId`, `tenant`, `actor`, `correlationId` fields
+- [ ] Events are not emitted when `orgId` or `workspaceId` is missing from `RunState`
+- [ ] Batch queue `listQueuedBatches` returns results sorted by `createdAt` ascending
+- [ ] `getModePolicy("god")` returns policy with `requireApprovalForHighRisk: false`
+- [ ] `trimQuestionsByMode` returns no more than `maxClarifyingQuestions` items sorted by priority weight
+- [ ] `writeArtifact`, `writeJsonRecord`, and `writeActionLog` create parent directories automatically
diff --git a/docs/03_specs/SPEC_PHASE_ENGINE.md b/docs/03_specs/SPEC_PHASE_ENGINE.md
new file mode 100644
index 0000000..2bd0801
--- /dev/null
+++ b/docs/03_specs/SPEC_PHASE_ENGINE.md
@@ -0,0 +1,319 @@
+# SPEC — Phase Engine
+**Status:** Draft
+**Version:** 1.0
+**Linked to:** packages/orchestrator/src/phase-engine.ts
+**Implements:** Sequential phase execution pipeline, per-phase contracts, runOrchestrationStep function, and mode-influenced behavior
+
+---
+
+## Objective
+
+Define the complete behavioral contract for the Phase Engine — the component that orchestrates a run through eight sequential phases from idea intake to deployment. This spec covers the function signature and return semantics of `runOrchestrationStep`, the interface every phase handler must satisfy, how mode policy shapes each phase, how progress is checkpointed for resume, and which audit events are emitted at each phase boundary.
+
+---
+
+## Scope
+
+- 8-phase sequence definition and ordering
+- `PhaseHandler` interface contract
+- `runOrchestrationStep` function specification
+- Per-phase: inputs, outputs, preconditions, postconditions
+- Gating phase: 5-gate sequential evaluation and short-circuit logic
+- Building phase: full execution delegation to execution engine
+- Mode influence table per phase
+- Checkpointing and resume behavior
+- Audit events emitted per phase transition
+
+Out of scope: individual gate logic (see `SPEC_RUN_LIFECYCLE.md`), adapter implementation, and post-deployment observability.
+
+---
+
+## Inputs / Outputs
+
+| Direction | Item | Type | Description |
+|-----------|------|------|-------------|
+| Input | `RunReport` | `RunReport` | Full report snapshot at the start of each step |
+| Output | `RunReport & { isFinished: boolean }` | Extended report | Updated report with new phase, status, and finished flag |
+
+---
+
+## Data Structures
+
+```typescript
+// Defined in packages/orchestrator/src/phase-engine.ts
+
+export interface PhaseContext {
+  idea: string;         // normalized idea text from report.input.idea
+  mode: Mode;           // execution mode from report.input.mode
+  approvedGates: string[]; // list of gate IDs manually approved
+  report: Partial<RunReport>; // full report snapshot passed into the step
+}
+
+export type PhaseHandler = (context: PhaseContext) => Promise<{
+  nextPhase: Phase | null;    // null signals the pipeline is finished
+  updates: Partial<RunReport>; // fields to merge into RunReport
+  status: "success" | "blocked" | "awaiting-approval";
+}>;
+
+// Phase sequence order (PHASE_HANDLERS keys, evaluated left to right)
+type Phase = "intake" | "planning" | "skills" | "gating" | "building"
+           | "testing" | "reviewing" | "deployment";
+```
+
+---
+
+## Interfaces / APIs
+
+### `runOrchestrationStep`
+
+```typescript
+export async function runOrchestrationStep(
+  report: RunReport
+): Promise<RunReport & { isFinished: boolean }>
+```
+
+**Behavior:**
+1. Look up the handler for `report.currentPhase` in `PHASE_HANDLERS`.
+2. Build a `PhaseContext` from the report (idea, mode, approvedGates, report).
+3. Invoke the handler and await the result.
+4. Merge `result.updates` onto `report` and set `updatedAt = now()`.
+5. Apply status and phase advancement logic:
+   - `result.status === "success"` + `nextPhase != null` → advance `currentPhase`, push completed phase into `completedPhases`, set `status = "in-progress"`, `isFinished = false`
+   - `result.status === "success"` + `nextPhase === null` → set `status = "success"`, `isFinished = true`
+   - `result.status === "blocked"` → set `status = "blocked"`, `isFinished = false`
+   - `result.status === "awaiting-approval"` → set `status = "awaiting-approval"`, `isFinished = false`
+6. Return the merged report with `isFinished` attached.
+
+**Error handling:** If `PHASE_HANDLERS[report.currentPhase]` is undefined, throws `Error("Unknown phase: {phase}")`. Individual phase handlers may throw; errors propagate to the caller (`runVerticalSlice`).
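+
+Steps 4 to 6 of the behavior list can be sketched as a pure merge function. The types are reduced, hypothetical stand-ins for `RunReport` and the handler result (`ReportSketch`, `HandlerResultSketch`), and `applyPhaseResult` is an illustrative name rather than a function in `phase-engine.ts`.
+
+```typescript
+type PhaseSketch = "intake" | "planning" | "skills" | "gating" | "building"
+                 | "testing" | "reviewing" | "deployment";
+
+interface ReportSketch {
+  currentPhase: PhaseSketch;
+  completedPhases: PhaseSketch[];
+  status: string;
+  updatedAt: string;
+  summary?: string;
+}
+
+interface HandlerResultSketch {
+  nextPhase: PhaseSketch | null;
+  updates: Partial<ReportSketch>;
+  status: "success" | "blocked" | "awaiting-approval";
+}
+
+function applyPhaseResult(
+  report: ReportSketch,
+  result: HandlerResultSketch,
+  now: string = new Date().toISOString(),
+): ReportSketch & { isFinished: boolean } {
+  // Step 4: merge handler updates onto the report and stamp updatedAt.
+  const merged: ReportSketch = { ...report, ...result.updates, updatedAt: now };
+  // Blocked or awaiting approval: the phase does not advance.
+  if (result.status !== "success") {
+    return { ...merged, status: result.status, isFinished: false };
+  }
+  // Success with no next phase: the pipeline is finished.
+  if (result.nextPhase === null) {
+    return { ...merged, status: "success", isFinished: true };
+  }
+  // Success with a next phase: record the completed phase and advance.
+  return {
+    ...merged,
+    currentPhase: result.nextPhase,
+    completedPhases: [...report.completedPhases, report.currentPhase],
+    status: "in-progress",
+    isFinished: false,
+  };
+}
+```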
+
+---
+
+## Phase Sequence Definition
+
+```
+intake → planning → skills → gating → building → testing → reviewing → deployment
+```
+
+Phases are strictly sequential. No phase can be skipped by the engine itself (only mode policy can reduce their work). Each phase reads exclusively from `PhaseContext` and writes exclusively to `Partial<RunReport>` via its `updates` return value.
+
+---
+
+## Per-Phase Contracts
+
+### Phase 1: intake
+
+| | Detail |
+|---|---|
+| **Preconditions** | `context.idea` must be a non-empty string |
+| **Inputs consumed** | `context.idea`, `context.mode` |
+| **Operations** | `runIntake({ idea, mode })` → calls `normalizeIdeaText`, `inferSolutionCategory`, `deriveAssumptions`, `generateClarifyingQuestions`, `trimQuestionsByMode` |
+| **Outputs** | `updates.intakeResult`, `updates.assumptions`, `updates.clarifyingQuestions` |
+| **Next phase** | `"planning"` |
+| **Postconditions** | `report.intakeResult` is populated; `report.clarifyingQuestions` is trimmed to mode limit |
+| **Can block?** | No — always returns `status: "success"` |
+
+### Phase 2: planning
+
+| | Detail |
+|---|---|
+| **Preconditions** | `context.report.intakeResult` must exist; throws otherwise |
+| **Inputs consumed** | `context.report.intakeResult` |
+| **Operations** | `buildPlanFromClarification(intakeResult)` → produces task list |
+| **Outputs** | `updates.plan` |
+| **Next phase** | `"skills"` |
+| **Postconditions** | `report.plan` is a non-empty `Task[]` |
+| **Can block?** | No |
+
+### Phase 3: skills
+
+| | Detail |
+|---|---|
+| **Preconditions** | `context.report.intakeResult` and `context.report.plan` must exist; throws otherwise |
+| **Inputs consumed** | `context.report.intakeResult`, `context.report.plan` |
+| **Operations** | `selectSkills({ clarification, plan })` from skill-engine |
+| **Outputs** | `updates.selectedSkills` |
+| **Next phase** | `"gating"` |
+| **Postconditions** | `report.selectedSkills` is a `SelectedSkill[]` |
+| **Can block?** | No |
+
+### Phase 4: gating
+
+| | Detail |
+|---|---|
+| **Preconditions** | `intakeResult`, `plan`, `selectedSkills` must all be populated |
+| **Inputs consumed** | All three artifacts plus `context.mode`, `context.approvedGates` |
+| **Operations** | `evaluateGates(...)` → evaluates 5 gates sequentially |
+| **Outputs** | `updates.gates`, `updates.overallGateStatus`, `updates.status` |
+| **Next phase** | `"building"` if `overallStatus === "pass"`; stays `"gating"` otherwise |
+| **Postconditions** | `report.gates` contains 5 `GateDecision` entries; `overallGateStatus` is one of `pass \| needs-review \| blocked` |
+| **Can block?** | Yes — `blocked` propagates as handler `status: "blocked"` |
+| **Can pause?** | Yes — `needs-review` propagates as `status: "awaiting-approval"`, emits `gate.awaiting_approval` |
+
+### Phase 5: building
+
+| | Detail |
+|---|---|
+| **Preconditions** | `context.report.id` or auto-generated runId; `context.report.input` required to initialize a new bundle |
+| **Inputs consumed** | Existing `RunBundle` from `loadRunBundle(runId)` or initialized from context |
+| **Operations** | `emitExecutionStarted` → `executeRunBundle(bundle)` → emit completion/failure/pause event |
+| **Outputs** | `updates.id`, `updates.summary`, `updates.status` |
+| **Next phase** | `"testing"` if `isCompleted`; stays `"building"` if paused or failed |
+| **Postconditions** | `bundle.state.status` reflects actual execution outcome; `RunBundle` persisted to run-store |
+| **Can block?** | No — delegates failure to execution engine |
+| **Can pause?** | Yes — when `bundle.state.status === "paused"`, returns `awaiting-approval` |
+
+### Phase 6: testing
+
+| | Detail |
+|---|---|
+| **Preconditions** | Building phase must have completed |
+| **Operations** | Simulated — returns immediately with placeholder summary |
+| **Outputs** | `updates.summary = "Testing phase completed (SIMULATED)."` |
+| **Next phase** | `"reviewing"` |
+| **Can block?** | No |
+
+### Phase 7: reviewing
+
+| | Detail |
+|---|---|
+| **Operations** | Simulated — returns immediately with placeholder summary |
+| **Next phase** | `"deployment"` |
+| **Can block?** | No |
+
+### Phase 8: deployment
+
+| | Detail |
+|---|---|
+| **Operations** | Simulated — returns immediately with success summary |
+| **Outputs** | `updates.status = "success"`, `updates.summary = "Deployment completed (SIMULATED). Pipeline finished."` |
+| **Next phase** | `null` — signals `isFinished = true` to `runOrchestrationStep` |
+| **Can block?** | No |
+
+---
+
+## Phase Transition Rules
+
+A phase advances to its `nextPhase` only when its handler returns `status: "success"` with a non-null `nextPhase`. The remaining outcomes are handled as follows:
+
+- **Blocked:** The handler returned `status: "blocked"`. `currentPhase` does not change. `runOrchestrationStep` returns with `status: "blocked"` and `isFinished: false`. The caller must resolve the block before re-invoking.
+- **Awaiting approval:** The handler returned `status: "awaiting-approval"`. `currentPhase` does not change. The run persists in its current phase until a resume call grants approval.
+- **Finished:** `nextPhase === null` and `status: "success"`. `isFinished = true`; `runReport.status = "success"`.
+
+---
+
+## Gating Phase: 5-Gate Sequential Evaluation
+
+`evaluateGates` in `gate-manager.ts` runs the following 5 evaluators in order. Short-circuit semantics apply at the `getOverallGateStatus` aggregation level, not at individual gate evaluation — all 5 gates always evaluate, but the first `blocked` result wins overall.
+
+| # | Gate ID | Evaluator | Block Condition | Review Condition |
+|---|---------|-----------|-----------------|-----------------|
+| 1 | `objective-clarity` | `evaluateObjectiveClarityGate` | No normalized idea | Category is `"unknown"` or `"unclear"` |
+| 2 | `requirements-completeness` | `evaluateRequirementsCompletenessGate` | Questions ≥ `maxQuestionsBeforeBlock` | Questions ≥ `maxQuestionsBeforeReview` |
+| 3 | `plan-readiness` | `evaluatePlanReadinessGate` | Zero tasks in plan | Tasks < `minimumPlanTasks` or no dependencies |
+| 4 | `skill-coverage` | `evaluateSkillCoverageGate` | Zero skills selected | Skills < `minimumSelectedSkills` or no specialist skills |
+| 5 | `ambiguity-risk` | `evaluateAmbiguityRiskGate` | Questions ≥ `ambiguityBlockThreshold` | Assumptions > 6 or questions ≥ `ambiguityReviewThreshold` |
+
+**Turbo mode override:** After all 5 decisions are computed, any gate with `status === "needs-review"` is rewritten to `status: "pass"` with the reason suffix `"(AUTO-PASSED VIA TURBO)"` and `shouldPause = false`.
+
+**Manual approval override:** Any gate whose `gate` ID appears in `context.approvedGates` is rewritten to `status: "pass"` with the reason suffix `"(MANUALLY APPROVED)"` and `shouldPause = false`. This override is applied before the turbo override.
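+
+The override order and the aggregation rule can be sketched as follows. `GateDecisionSketch`, `applyOverrides`, and `overallStatus` are hypothetical stand-ins for the real `GateDecision` and `getOverallGateStatus`; ranking `needs-review` below `blocked` and above `pass` in the aggregate is an assumption consistent with this section but not confirmed against the source.
+
+```typescript
+type GateStatusSketch = "pass" | "needs-review" | "blocked";
+
+interface GateDecisionSketch {
+  gate: string;
+  status: GateStatusSketch;
+  reason: string;
+  shouldPause: boolean;
+}
+
+function applyOverrides(
+  decisions: GateDecisionSketch[],
+  approvedGates: string[],
+  mode: string,
+): GateDecisionSketch[] {
+  return decisions
+    // Manual approval is applied first, to any gate listed in approvedGates.
+    .map((d): GateDecisionSketch =>
+      approvedGates.includes(d.gate)
+        ? { ...d, status: "pass", reason: `${d.reason} (MANUALLY APPROVED)`, shouldPause: false }
+        : d)
+    // The turbo override then auto-passes whatever is still needs-review.
+    .map((d): GateDecisionSketch =>
+      mode === "turbo" && d.status === "needs-review"
+        ? { ...d, status: "pass", reason: `${d.reason} (AUTO-PASSED VIA TURBO)`, shouldPause: false }
+        : d);
+}
+
+function overallStatus(decisions: GateDecisionSketch[]): GateStatusSketch {
+  // All gates are evaluated; any blocked decision wins the aggregate.
+  if (decisions.some((d) => d.status === "blocked")) return "blocked";
+  if (decisions.some((d) => d.status === "needs-review")) return "needs-review";
+  return "pass";
+}
+```
+
+Because turbo only rewrites `needs-review`, a `blocked` gate still blocks a turbo run unless it was manually approved.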
+
+---
+
+## Building Phase: Execution Delegation
+
+When the building phase handler runs:
+
+1. **Load or initialize `RunBundle`:** `loadRunBundle(runId)` is attempted. On miss, a new bundle is constructed from context (intake artifact, plan artifact, run state initialized to `status: "planned"`, `currentStepIndex: 0`).
+2. **Persist artifacts:** `updateIntake`, `updatePlan`, `updateRunState` are called before execution begins.
+3. **Emit start event:** `emitExecutionStarted(bundle.state)` fires `execution.started` event (requires `orgId` + `workspaceId`).
+4. **Delegate to execution engine:** `executeRunBundle(bundle)` runs the full 6-step per-task pipeline (see `SPEC_EXECUTION_ENGINE.md`).
+5. **Emit outcome event:** Based on returned bundle state: `emitExecutionCompleted`, `emitExecutionFailed`, or `emitGateAwaitingApproval`.
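+6. **Return phase result:** The bundle state maps to the handler result as follows: completed → `"success"` with nextPhase `"testing"`; paused → `"awaiting-approval"` with nextPhase `"building"`; failed → handler `status: "failure"` (treated as non-success by the engine) with nextPhase `"building"`. Note that `"failure"` is not a member of the `PhaseHandler` status union declared under Data Structures, and the `runOrchestrationStep` behavior list defines no branch for it; the two need to be reconciled.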
+
+---
+
+## Mode Influence Per Phase
+
+| Phase | turbo | builder | pro | expert | safe | balanced | god |
+|-------|-------|---------|-----|--------|------|----------|-----|
+| intake | max 2 questions | max 5 | max 8 | max 15 | max 20 | max 10 | max 0 |
+| planning | no change | no change | no change | no change | no change | no change | no change |
+| skills | min 1 skill required | min 2 | min 2 | min 3 | min 3 | min 2 | min 0 |
+| gating | needs-review auto-passed | standard | stricter thresholds | strictest thresholds | most strict | moderate | all gates effectively bypassed |
+| building | medium-risk no approval | medium+high require approval | dry-run default | dry-run default | no commands allowed | dry-run + approval | no approval required for any risk |
+| testing/reviewing/deployment | simulated | simulated | simulated | simulated | simulated | simulated | simulated |
+
+---
+
+## Audit Events Per Phase
+
+| Phase | Audit Action | Source |
+|-------|-------------|--------|
+| gating (needs-review) | `gate.awaiting_approval` via `emitGateAwaitingApproval` | phase-engine.ts gating handler |
+| building (start) | `execution.started` via `emitExecutionStarted` | phase-engine.ts building handler |
+| building (complete) | `execution.completed` via `emitExecutionCompleted` | phase-engine.ts building handler |
+| building (failed) | `execution.failed` via `emitExecutionFailed` | phase-engine.ts building handler |
+| building (paused) | `gate.awaiting_approval` via `emitGateAwaitingApproval` | phase-engine.ts building handler |
+
+The events in the table above are orchestrator events published through `publishEvent`. Per-step audit events (`TASK_EXECUTION_ATTEMPT`, `STEP_EXECUTION_STARTED`, `STEP_EXECUTION_SUCCEEDED`, `STEP_EXECUTION_FAILED`, etc.) are emitted by `execution-engine.ts` via `writeAuditEvent`, not by the phase engine.
+
+---
+
+## Checkpointing
+
+Partial progress is preserved for resume through the following mechanism:
+
+1. **Per-task checkpoint:** After each successful `executeTask`, `bundle.state.currentStepIndex` is incremented and `updateRunState` is called immediately.
+2. **On pause:** `markState(bundle, "paused", { currentStepIndex: index, ... })` persists the exact index of the paused task.
+3. **On resume:** `resumeRun(runId, true, actor)` calls `loadRunBundle(runId)` to reload state from the store, then calls `executeRunBundle(bundle, actor)`, which starts the loop at `bundle.state.currentStepIndex`, the exact step that was paused.
+4. **`runVerticalSlice` continuation:** Supports `input.currentRun` parameter to inject an existing `RunReport` (carrying `completedPhases`, `currentPhase`, etc.) for phase-level resume without re-running completed phases.
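+
+The per-task checkpoint and pause behavior can be sketched as a loop that persists its index after every step. This is an illustration under assumptions: `CheckpointState`, `StepOutcome`, and `runFromCheckpoint` are hypothetical, steps are modelled as plain async functions, and `persist` stands in for `updateRunState`.
+
+```typescript
+interface CheckpointState {
+  currentStepIndex: number;
+  status: "running" | "paused" | "completed";
+}
+
+type StepOutcome = "success" | "needs-approval";
+
+async function runFromCheckpoint(
+  state: CheckpointState,
+  steps: Array<() => Promise<StepOutcome>>,
+  persist: (snapshot: CheckpointState) => void, // stand-in for updateRunState
+): Promise<CheckpointState> {
+  state.status = "running";
+  // Start at the persisted index, not at 0, so completed steps are never re-run.
+  for (let index = state.currentStepIndex; index < steps.length; index++) {
+    const step = steps[index];
+    if (step === undefined) break;
+    const outcome = await step();
+    if (outcome === "needs-approval") {
+      // On pause, keep the index of the paused step so resume retries exactly this step.
+      state.status = "paused";
+      state.currentStepIndex = index;
+      persist({ ...state });
+      return state;
+    }
+    // Per-task checkpoint: advance and persist immediately after each success.
+    state.currentStepIndex = index + 1;
+    persist({ ...state });
+  }
+  state.status = "completed";
+  persist({ ...state });
+  return state;
+}
+```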
+
+---
+
+## Dependencies
+
+| Dependency | Package | Purpose |
+|-----------|---------|---------|
+| `intake.ts` | `packages/orchestrator` | Phase 1 operations |
+| `planner.ts` | `packages/orchestrator` | Phase 2 operations |
+| `skill-engine` | `packages/skill-engine` | Phase 3 skill selection |
+| `gate-manager.ts` | `packages/orchestrator` | Phase 4 gate evaluation |
+| `execution-engine.ts` | `packages/orchestrator` | Phase 5 task execution |
+| `events.ts` | `packages/orchestrator` | Orchestrator-scoped event emission |
+| `run-store` | `packages/memory` | Bundle persistence and retrieval |
+| `mode-controller.ts` | `packages/orchestrator` | Policy per mode for question trimming |
+
+---
+
+## Edge Cases
+
+- **Unknown phase value:** `PHASE_HANDLERS[report.currentPhase]` returns `undefined`; `runOrchestrationStep` throws immediately before any state mutation.
+- **Missing `intakeResult` at planning phase:** Handler throws `"Intake result missing"` — this indicates an out-of-sequence call; caller must ensure phases run in order.
+- **Building phase with no `context.report.input`:** Throws `"Cannot initialize RunBundle: ctx.report.input is undefined."` — only safe when bundle already exists in the store.
+- **`runVerticalSlice` in expert mode:** Executes exactly one phase and returns, regardless of `isFinished`. The caller must call `runVerticalSlice` again with the returned report as `input.currentRun` to advance.
+- **Re-entering gating after manual approval:** If `POST /v1/gates/{id}/approve` adds a gate to `approvedGates`, the caller must re-invoke the run from the gating phase with the updated `approvedGates` list; the phase engine does not automatically re-evaluate.
+
+---
+
+## Risks
+
+| Risk | Likelihood | Impact | Mitigation |
+|------|-----------|--------|-----------|
+| Phase state divergence if handler throws mid-update | Medium | High | Ensure all `updates` are built before any async operation; wrap handler calls in try/catch at `runOrchestrationStep` level |
+| Testing/reviewing/deployment phases returning success without real work | High | Medium | These phases are explicitly marked `(SIMULATED)` — replace with real implementations before production readiness |
+| `buildInitialPlan` in building phase diverging from `buildPlanFromClarification` used in planning | Medium | High | Planning phase result must be propagated into building phase via `RunBundle.plan`; avoid re-planning from raw input |
+
+---
+
+## Definition of Done
+
+- [ ] All 8 phases have unit tests asserting correct `nextPhase` and `updates` shape
+- [ ] `runOrchestrationStep` tested with each possible handler return status: success/blocked/awaiting-approval
+- [ ] Gating phase tests cover all 5 gates individually and in combination (block + needs-review)
+- [ ] Turbo auto-pass override tested against a `needs-review` gate result
+- [ ] Manual approval override tested against a gate that would otherwise block
+- [ ] Building phase tested for completed, paused, and failed execution engine outcomes
+- [ ] Checkpointing test: pause at step N, resume, verify execution continues at step N not step 0
+- [ ] Expert mode single-step test: `runVerticalSlice` returns after first phase with `isFinished: false`
+- [ ] `runVerticalSlice` with `currentRun` input correctly skips already-completed phases
+- [ ] Unknown phase throws typed error — confirmed not a silent no-op
diff --git a/docs/03_specs/SPEC_RUN_LIFECYCLE.md b/docs/03_specs/SPEC_RUN_LIFECYCLE.md
new file mode 100644
index 0000000..391d9df
--- /dev/null
+++ b/docs/03_specs/SPEC_RUN_LIFECYCLE.md
@@ -0,0 +1,311 @@
+# SPEC — Run Lifecycle State Machine
+**Status:** Draft
+**Version:** 1.0
+**Linked to:** docs/02_architecture/ARCHITECTURE.md
+**Implements:** Complete state machine for RunStatus, StepStatus, and GateStatus across all phases of execution
+
+---
+
+## Objective
+
+Define the canonical state machine governing the lifecycle of a Run in Code Kit Ultra. This spec establishes every valid state, every legal transition, the trigger that causes each transition, the side effects that must occur (events emitted, DB writes, tokens issued), and the invariants that must hold at all times. All runtime components (orchestrator, execution engine, gate manager, and API layer) must conform to this machine.
+
+---
+
+## Scope
+
+- `RunStatus` state machine (top-level run lifecycle)
+- `StepStatus` state machine (per-task execution within a run)
+- `GateStatus` state machine (per-gate evaluation result)
+- Phase-to-status mapping
+- API endpoints that drive state transitions
+- Error states and recovery paths
+- Invariants and consistency rules
+
+Out of scope: individual adapter behavior, billing lifecycle, authentication flows.
+
+---
+
+## Inputs / Outputs
+
+| Direction | Item | Type | Description |
+|-----------|------|------|-------------|
+| Input | Run creation request | `RunVerticalSliceInput` | idea, mode, dryRun, approvedGates |
+| Input | Resume request | runId + approve flag | Triggers paused → running transition |
+| Input | Gate approval | gateId, actorId | Satisfies a `needs-review` gate |
+| Input | Cancel request | runId, actorId | Triggers running/paused → cancelled |
+| Output | `RunBundle` | Persistent bundle in run-store | Full snapshot of run state at any point |
+| Output | Audit events | `writeAuditEvent` calls | Append-only audit log per transition |
+| Output | Orchestrator events | `publishEvent` via events.ts | Scoped to orgId + workspaceId |
+
+---
+
+## Data Structures
+
+```typescript
+// Defined in packages/shared/src/types.ts
+type RunStatus = "planned" | "running" | "paused" | "completed" | "failed" | "cancelled";
+type StepStatus = "pending" | "running" | "success" | "failed" | "paused" | "skipped" | "rolled-back";
+type GateStatus = "pass" | "fail" | "needs-review" | "blocked" | "pending";
+
+interface RunState {
+  runId: string;
+  createdAt: string;
+  updatedAt: string;
+  currentStepIndex: number;
+  status: RunStatus;
+  approvalRequired: boolean;
+  approved: boolean;
+  pauseReason?: string;
+  orgId?: string;
+  workspaceId?: string;
+  projectId?: string;
+  actorId?: string;
+  actorType?: ActorType;
+  correlationId?: string;
+}
+
+interface StepExecutionLog {
+  stepId: string;
+  title: string;
+  adapter: string;
+  attempt: number;
+  status: StepStatus;
+  startedAt: string;
+  finishedAt?: string;
+  output?: string;
+  error?: string;
+  rollbackAvailable: boolean;
+  risk?: ExecutionRisk;
+  simulationSummary?: string;
+  verificationStatus?: "passed" | "failed";
+  verificationSummary?: string;
+  fixSuggestion?: string;
+}
+```
+
+---
+
+## Interfaces / APIs
+
+### API Endpoints That Trigger Transitions
+
+| Endpoint | Transition | Required Role |
+|----------|-----------|---------------|
+| `POST /v1/runs` | `[*] → planned`, then `planned → running` | operator, admin |
+| `POST /v1/runs/{id}/resume` | `paused → running` | operator, admin |
+| `POST /v1/runs/{id}/cancel` | `running/paused → cancelled` | operator, admin |
+| `POST /v1/gates/{id}/approve` | gate: `needs-review → pass` | reviewer, admin |
+| `POST /v1/gates/{id}/reject` | gate: `needs-review → blocked` | reviewer, admin |
+
+---
+
+## RunStatus State Machine
+
+```mermaid
+stateDiagram-v2
+    [*] --> planned : POST /v1/runs created
+
+    planned --> running : runVerticalSlice or executeRunBundle called
+    running --> paused : approvalRequired gate encountered
+    running --> completed : all tasks succeed
+    running --> failed : task fails after all retries and healing
+    running --> cancelled : POST /v1/runs/{id}/cancel
+
+    paused --> running : POST /v1/runs/{id}/resume (approve=true)
+    paused --> cancelled : POST /v1/runs/{id}/cancel
+
+    completed --> [*]
+    failed --> [*]
+    cancelled --> [*]
+```
+
+### Transition Details
+
+| From | To | Trigger | Actions |
+|------|----|---------|---------|
+| `planned` | `running` | `executeRunBundle` called; `markState(bundle, "running")` | Write `RunState.status = running`, audit `TASK_EXECUTION_ATTEMPT`, emit `execution.started` |
+| `running` | `paused` | Task requires approval and `bundle.state.approved === false` | Write `RunState.status = paused`, `approvalRequired = true`, `pauseReason = <reason>`, audit `APPROVAL_REQUIRED`, emit `gate.awaiting_approval` |
+| `running` | `completed` | All tasks in `bundle.plan.tasks` complete successfully | Write `RunState.status = completed`, `currentStepIndex = tasks.length`, audit `RUN_COMPLETED`, emit `execution.completed`, call `recordRunOutcome(success=true)` |
+| `running` | `failed` | `executeTask` returns `completed: false` after retry exhaustion | Write `RunState.status = failed`, audit `STEP_EXECUTION_FAILED`, emit `execution.failed`, call `recordRunOutcome(success=false)` |
+| `running` | `cancelled` | API call to cancel endpoint | Write `RunState.status = cancelled`, audit `RUN_CANCELLED` |
+| `paused` | `running` | `resumeRun(runId, approve=true)` called | Write `RunState.approved = true`, `approvalRequired = false`, re-invoke `executeRunBundle` |
+| `paused` | `cancelled` | API call to cancel endpoint while paused | Write `RunState.status = cancelled` |
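+
+The transition table above can be enforced with a small lookup. The sketch below is illustrative only: `RUN_TRANSITIONS`, `canTransition`, and `assertTransition` are hypothetical names rather than existing exports, and the map simply mirrors the rows of the table.
+
+```typescript
+type RunStatus = "planned" | "running" | "paused" | "completed" | "failed" | "cancelled";
+
+// Legal transitions copied from the table above. Terminal states have no exits.
+const RUN_TRANSITIONS: Record<RunStatus, readonly RunStatus[]> = {
+  planned: ["running"],
+  running: ["paused", "completed", "failed", "cancelled"],
+  paused: ["running", "cancelled"],
+  completed: [],
+  failed: [],
+  cancelled: [],
+};
+
+// Hypothetical helper: true only if the state machine allows from -> to.
+function canTransition(from: RunStatus, to: RunStatus): boolean {
+  return RUN_TRANSITIONS[from].includes(to);
+}
+
+// Hypothetical helper: throws instead of silently writing an illegal status.
+function assertTransition(from: RunStatus, to: RunStatus): void {
+  if (!canTransition(from, to)) {
+    throw new Error(`Illegal RunStatus transition: ${from} -> ${to}`);
+  }
+}
+```
+
+A guard of this kind would sit in front of any `RunState.status` write, so that a late API call cannot move a terminal run back to `running`.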
+
+---
+
+## Phase-to-Status Mapping
+
+The phase engine (`phase-engine.ts`) runs phases sequentially. The following table shows which `RunReport.status` values are valid at each phase boundary. These values are distinct from the `RunStatus` values defined under Data Structures.
+
+| Phase | Entry Status | Exit on Success | Exit on Blocked | Exit on Awaiting-Approval |
+|-------|-------------|-----------------|-----------------|--------------------------|
+| intake | `in-progress` | `in-progress` | `blocked` | — |
+| planning | `in-progress` | `in-progress` | — | — |
+| skills | `in-progress` | `in-progress` | — | — |
+| gating | `in-progress` | `in-progress` | `blocked` | `awaiting-approval` |
+| building | `in-progress` | `in-progress` | — | `awaiting-approval` |
+| testing | `in-progress` | `in-progress` | — | — |
+| reviewing | `in-progress` | `in-progress` | — | — |
+| deployment | `in-progress` | `success` | — | — |
+
+Note: `runOrchestrationStep` maps phase handler `status` to `RunReport.status` as follows:
+- handler `"success"` + `nextPhase != null` → report status `"in-progress"`
+- handler `"success"` + `nextPhase === null` → report status `"success"`, `isFinished = true`
+- handler `"blocked"` → report status `"blocked"`
+- handler `"awaiting-approval"` → report status `"awaiting-approval"`
+
+---
+
+## StepStatus State Machine
+
+```mermaid
+stateDiagram-v2
+    [*] --> pending : task created in plan
+
+    pending --> running : executeTask begins attempt
+    running --> success : adapter.execute succeeds and verification passes
+    running --> failed : adapter.execute throws after all retries
+    running --> paused : requiresApproval=true and bundle.state.approved=false
+    running --> rolled-back : healing fails and rollbackPayload present
+    failed --> rolled-back : automatic rollback via adapter.rollback
+    paused --> running : resumeRun(approve=true) restarts from currentStepIndex
+    success --> [*]
+    rolled-back --> [*]
+    failed --> [*]
+```
+
+### StepStatus Transition Details
+
+| From | To | Trigger |
+|------|----|---------|
+| `pending` | `running` | `executeTask` invoked for task at `currentStepIndex` |
+| `running` | `success` | `adapter.execute` returns `{ success: true }` and `adapter.verify` returns `{ ok: true }` |
+| `running` | `failed` | Error thrown on final retry attempt; healing does not produce `"verified"` status |
+| `running` | `paused` | `requiresApproval === true` and `bundle.state.approved === false` |
+| `running` | `rolled-back` | Execution fails; `task.rollbackPayload` exists; `adapter.rollback(task.rollbackPayload)` called |
+| `paused` | `running` | `resumeRun` sets `approved = true`, re-enters task loop at saved `currentStepIndex` |
+| `failed` | `rolled-back` | Automatic rollback path in `executeTask` after retry exhaustion |
+
+---
+
+## GateStatus State Machine
+
+```mermaid
+stateDiagram-v2
+    [*] --> pending : gate registered
+
+    pending --> pass : evaluateGates returns pass
+    pending --> needs-review : evaluateGates returns needs-review
+    pending --> blocked : evaluateGates returns blocked
+
+    needs-review --> pass : POST /v1/gates/{id}/approve (manual approval)
+    needs-review --> blocked : POST /v1/gates/{id}/reject
+    needs-review --> pass : mode=turbo auto-pass applied
+
+    pass --> [*]
+    blocked --> [*]
+```
+
+### Gate Resolution Priority
+
+`getOverallGateStatus` applies the following precedence across all 5 gate decisions:
+1. If any gate is `"blocked"` → overall is `"blocked"` (run cannot proceed)
+2. If any gate is `"needs-review"` → overall is `"needs-review"` (run pauses for human)
+3. If all gates are `"pass"` → overall is `"pass"` (run advances to building phase)
+
+Manual approvals are tracked in `RunReport.approvedGates: string[]`. Any gate whose ID appears in that array is force-set to `"pass"` regardless of evaluation result.
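+
+As a rough illustration of the precedence rules and the manual-approval override, consider the sketch below. `GateDecision`, `applyApprovals`, and `overallGateStatusSketch` are hypothetical names and do not reproduce the real `getOverallGateStatus`. The `"pending"` fallback is an assumption for combinations the rules above do not cover.
+
+```typescript
+type GateStatus = "pass" | "fail" | "needs-review" | "blocked" | "pending";
+
+interface GateDecision {
+  id: string;
+  status: GateStatus;
+}
+
+// Force any manually approved gate to "pass", regardless of its evaluation result.
+function applyApprovals(decisions: GateDecision[], approvedGates: string[]): GateDecision[] {
+  const approved = new Set(approvedGates);
+  return decisions.map((d): GateDecision => (approved.has(d.id) ? { ...d, status: "pass" } : d));
+}
+
+// Precedence: blocked > needs-review > pass.
+function overallGateStatusSketch(decisions: GateDecision[]): GateStatus {
+  if (decisions.some((d) => d.status === "blocked")) return "blocked";
+  if (decisions.some((d) => d.status === "needs-review")) return "needs-review";
+  if (decisions.every((d) => d.status === "pass")) return "pass";
+  return "pending"; // Assumption: not specified by the rules above.
+}
+```
+
+Applying approvals before computing the overall status means an approved gate no longer holds the run in `needs-review`.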
+
+---
+
+## Invariants
+
+The following rules must hold at all times across the system:
+
+1. **Terminal states are final.** A run with status `completed`, `failed`, or `cancelled` must never transition to another status. Re-running the work requires an explicit retry mechanism that creates a new run.
+2. **`currentStepIndex` monotonicity.** `RunState.currentStepIndex` only decreases during explicit `rollbackTask` calls; it never decreases automatically during forward execution.
+3. **`approved` resets after each step.** Upon successful step completion, `bundle.state.approved` is reset to `false` to prevent approval bleed to subsequent steps.
+4. **Audit events are append-only.** `writeAuditEvent` is never called to overwrite or delete prior events.
+5. **Events require tenant scope.** `emitOrchestratorEvent` silently skips emission if `orgId` or `workspaceId` is missing from `RunState`, preventing orphan events.
+6. **Gate approval list is cumulative.** `approvedGates` only grows during a run; approved gates are never un-approved during a single run lifecycle.
+7. **Paused run state is persisted before returning.** `markState(bundle, "paused", {...})` calls `updateRunState` before the function returns, ensuring the checkpoint is durable.
+
+---
+
+## Error States and Recovery Paths
+
+### Policy Block (`POLICY_BLOCK`)
+- **Cause:** `evaluatePolicy(task)` returns `allowed: false`
+- **State written:** `RunStatus = failed`, `StepStatus = failed`
+- **Recovery:** Fix the policy rule or request an exemption; no automatic recovery
+
+### Adapter Not Found (`ADAPTER_NOT_FOUND`)
+- **Cause:** `findAdapter(adapters, task.adapterId)` returns null
+- **State written:** `RunStatus = failed`, `StepStatus = failed`
+- **Recovery:** Register the missing adapter in the adapter registry; then call `retryTask`
+
+### Validation Failure (`VALIDATION_FAILED`)
+- **Cause:** `adapter.validate(task.payload)` returns `false`
+- **State written:** `RunStatus = failed`, audit includes `fixSuggestion` if adapter provides one
+- **Recovery:** Use `fixSuggestion` to correct payload; call `retryTask`
+
+### Execution Failure with Healing
+- **Cause:** `adapter.execute` throws on final retry attempt
+- **Path:** `healFailedStep` invoked → if `status === "verified"`, `maxAttempts` incremented and execution resumes; if `approvalRequired`, run pauses; otherwise run fails
+- **State written:** `RunStatus = failed` if healing cannot recover
+- **Recovery:** Call `retryTask(runId, stepId)` after fixing root cause
+
+### Resume After Approval
+- **Cause:** Run is `paused` due to `requiresApproval`
+- **Path:** `POST /v1/runs/{id}/resume` → `resumeRun(runId, approve=true)` → `bundle.state.approved = true` → `executeRunBundle` resumes from `currentStepIndex`
+- **Side effect:** `approved` resets to `false` after the approved step completes
+
+---
+
+## Dependencies
+
+| Dependency | Package | Purpose |
+|-----------|---------|---------|
+| `run-store` | `packages/memory` | `loadRunBundle`, `updateRunState`, persist state |
+| `execution-engine` | `packages/orchestrator` | `executeRunBundle`, transitions running/paused/completed/failed |
+| `gate-manager` | `packages/orchestrator` | `evaluateGates`, GateStatus transitions |
+| `events.ts` | `packages/orchestrator` | `emitExecutionStarted/Completed/Failed`, `emitGateAwaitingApproval` |
+| `audit` | `packages/audit` | `writeAuditEvent`, append-only audit record |
+| `policy-engine` | `packages/core` | `evaluatePolicy`, blocks or allows task execution |
+
+---
+
+## Edge Cases
+
+- **God mode with no tenant scope:** `emitOrchestratorEvent` skips event emission; run still executes but is unobservable via event stream.
+- **Expert mode exits after each phase:** `runVerticalSlice` returns after the first `runOrchestrationStep` call in expert mode, requiring the caller to re-invoke for each phase.
+- **Turbo mode auto-passes `needs-review` gates:** In `evaluateGates`, any gate returning `needs-review` is rewritten to `pass` when `mode === "turbo"`, bypassing the approval pause.
+- **Resume of a completed run:** `resumeRun` checks `bundle.state.status === "completed"` and returns early without re-executing, preventing double-execution.
+- **Rollback of step with no rollback payload:** `rollbackTask` throws `"Rollback not available for step: {id}"` — callers must check `StepExecutionLog.rollbackAvailable` before calling.
+- **Concurrent approval and cancellation:** No lock is held on `RunState`; last writer wins. API layer must serialize concurrent state mutations per runId.
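+
+One way to serialize state mutations per `runId`, as the last edge case requires, is a promise chain keyed by run. This is a sketch under stated assumptions: `serializePerRun` and `runQueues` are hypothetical names, and the queue lives in one process only, so a multi-instance deployment would still need a database-level lock or equivalent.
+
+```typescript
+// Tail of the pending mutation chain for each run. Tails never reject.
+const runQueues = new Map<string, Promise<unknown>>();
+
+// Hypothetical helper: runs `mutation` only after earlier mutations for the same run settle.
+function serializePerRun<T>(runId: string, mutation: () => Promise<T>): Promise<T> {
+  const previous = runQueues.get(runId) ?? Promise.resolve();
+  const next = previous.then(() => mutation());
+  // Swallow errors on the stored tail so one failed mutation does not block later ones.
+  const tail = next.catch(() => undefined);
+  runQueues.set(runId, tail);
+  // Drop the entry once this run has no further queued work.
+  void tail.then(() => {
+    if (runQueues.get(runId) === tail) runQueues.delete(runId);
+  });
+  return next;
+}
+```
+
+With this in place, an approve call and a cancel call for the same run execute in arrival order instead of racing, while mutations for different runs still proceed concurrently.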
+
+---
+
+## Risks
+
+| Risk | Likelihood | Impact | Mitigation |
+|------|-----------|--------|-----------|
+| State desync between memory store and audit log | Medium | High | Wrap `markState` and `writeAuditEvent` in the same call path; do not split them |
+| `approved` flag persisting across steps due to resume edge cases | Low | High | Invariant #3 enforced in `executeTask` success path; covered by tests |
+| Orphan paused runs (no actor to resume) | Medium | Medium | `gate.awaiting_approval` event must trigger notification; add TTL on paused state |
+| Terminal state re-entry via concurrent API calls | Low | High | Idempotency guard needed at API layer before `executeRunBundle` |
+
+---
+
+## Definition of Done
+
+- [ ] All `RunStatus` transitions are covered by integration tests with state assertions
+- [ ] All `StepStatus` transitions emit the correct audit event with expected fields
+- [ ] Gate approval flow tested end-to-end: `needs-review → pass → run resumes`
+- [ ] `resumeRun` with `approve=false` does not re-approve the paused step
+- [ ] `rollbackTask` throws a typed error when `rollbackPayload` is absent
+- [ ] Turbo mode auto-pass behavior validated against all 5 gates
+- [ ] Expert mode single-step behavior validated in `runVerticalSlice`
+- [ ] All 5 gate IDs (`objective-clarity`, `requirements-completeness`, `plan-readiness`, `skill-coverage`, `ambiguity-risk`) are reflected in `approvedGates` tracking
+- [ ] Terminal state guard tested: completed run re-submitted returns bundle without re-executing
+- [ ] Mermaid diagrams in this spec render correctly in the project documentation site
diff --git a/docs/06_validation/GO_NO_GO_CHECKLIST.md b/docs/06_validation/GO_NO_GO_CHECKLIST.md
new file mode 100644
index 0000000..a678ca5
--- /dev/null
+++ b/docs/06_validation/GO_NO_GO_CHECKLIST.md
@@ -0,0 +1,159 @@
+# Go/No-Go Release Gate — v1.3.0
+
+| Field | Value |
+|-------|-------|
+| Release | v1.3.0 |
+| Target date | [TBD] |
+| Decision date | [to be filled at review meeting] |
+| Decision makers | Engineering Lead, Security Lead, Product Owner |
+| Meeting format | Synchronous review of this document |
+| Document status | Draft — all gates open |
+| Last updated | 2026-04-04 |
+
+---
+
+## Purpose
+
+This document is the formal gate that controls whether Code-Kit-Ultra v1.3.0 may
+proceed to production release. It is reviewed in a synchronous meeting attended
+by all decision makers listed above. No release may proceed unless the outcome
+recorded in the Decision Log is "GO", or "CONDITIONAL GO" with documented
+exceptions approved by the Product Owner and Engineering Lead (see Gate 4).
+
+This document is completed fresh for each release. Items are verified — not
+assumed — before being checked. Evidence (test run URLs, coverage reports, or
+audit screenshots) must be linked in the notes for every Security Gate item.
+
+---
+
+## Gate 1 — Security Gate
+
+> **HARD BLOCK — the release cannot proceed if any item in this gate is unchecked.**
+> Evidence of verification is required for every item.
+
+- [ ] Zero P0 (Critical) open security vulnerabilities
+  - _Evidence:_ link to security scan results
+- [ ] Zero P1 (High) open security vulnerabilities
+  - _Evidence:_ link to security scan results
+- [ ] **R-01 verified:** SA secret loaded from env var (`SA_SECRET`); service startup
+      throws and refuses to start if the env var is absent or empty
+  - _Test:_ start service without `SA_SECRET` set → must exit with non-zero code and
+    log `FATAL: SA_SECRET is required`
+  - _Evidence:_ CI run link
+- [ ] **R-02 verified:** default org bypass removed from `resolveSession`; cross-tenant
+      access blocked at API layer
+  - _Test:_ `POST /v1/runs` with `orgId="default"` → `400 INVALID_ORG_ID`
+  - _Evidence:_ passing test case in `TEST_PLAN_RUN_SCOPING.md §4.5`
+- [ ] **R-03 verified:** Redis jti blacklist implemented; revoked session tokens return 401
+  - _Test:_ issue token, revoke it via logout endpoint, reuse token → `401 TOKEN_REVOKED`
+  - _Evidence:_ passing security test `auth/revocation.test.ts`
+- [ ] **R-04 verified:** execution token validated on every protected API call;
+      expired or missing execution token returns 401
+  - _Test:_ call `POST /v1/runs/{id}/resume` with expired exec token → `401`
+  - _Evidence:_ passing security test `exec-token-validation.test.ts`
+- [ ] **R-05 verified:** audit hash chain is restart-safe (uses DB-persisted `lastHash`,
+      not module-level variable); chain integrity survives service restart
+  - _Test:_ append 50 events, restart service, append 10 more, run chain verifier → no
+    mismatch
+  - _Evidence:_ passing test in `SECURITY_TESTING_PLAN.md §3 Audit Integrity`
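+
+The R-05 test above relies on a chain verifier. The sketch below shows one possible shape for it and is not the project's implementation. The `AuditEvent` shape, the all-zero `GENESIS_HASH`, and the SHA-256 construction over previous hash plus payload are all assumptions. An in-memory array stands in for the persisted audit table, so the previous hash is always read from stored data rather than from a module-level variable.
+
+```typescript
+import { createHash } from "node:crypto";
+
+// Assumed event shape: each event stores the hash of its predecessor.
+interface AuditEvent {
+  payload: string;
+  prevHash: string;
+  hash: string;
+}
+
+const GENESIS_HASH = "0".repeat(64); // Assumed seed for the first event.
+
+function hashEvent(prevHash: string, payload: string): string {
+  return createHash("sha256").update(prevHash + "\n" + payload).digest("hex");
+}
+
+// Reads the last hash from the stored chain, which is what makes it restart-safe.
+function appendEvent(chain: AuditEvent[], payload: string): AuditEvent[] {
+  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS_HASH;
+  return [...chain, { payload, prevHash, hash: hashEvent(prevHash, payload) }];
+}
+
+// Returns the index of the first broken link, or -1 if the chain is intact.
+function verifyChain(chain: AuditEvent[]): number {
+  let expectedPrev = GENESIS_HASH;
+  for (let i = 0; i < chain.length; i++) {
+    const event = chain[i];
+    if (event.prevHash !== expectedPrev || event.hash !== hashEvent(event.prevHash, event.payload)) {
+      return i;
+    }
+    expectedPrev = event.hash;
+  }
+  return -1;
+}
+```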
+
+---
+
+## Gate 2 — Quality Gate
+
+> **HARD BLOCK — the release cannot proceed if any item in this gate is unchecked.**
+
+- [ ] All smoke tests pass on staging environment
+  - _Command:_ `pnpm test:smoke --env=staging`
+  - _Evidence:_ CI run link
+- [ ] `packages/auth` test coverage ≥ 90% (measured, not estimated)
+  - _Command:_ `pnpm test --coverage --filter=auth`
+  - _Evidence:_ coverage report screenshot or artifact link
+- [ ] `packages/orchestrator` test coverage ≥ 80%
+  - _Command:_ `pnpm test --coverage --filter=orchestrator`
+  - _Evidence:_ coverage report
+- [ ] Zero P0 functional bugs open
+  - _Evidence:_ link to issue tracker filtered by P0 + open
+- [ ] Zero regressions from v1.2.0 verified by regression test suite
+  - _Command:_ `pnpm test:regression`
+  - _Evidence:_ CI run link
+
+---
+
+## Gate 3 — Operations Gate
+
+> **HARD BLOCK — the release cannot proceed if any item in this gate is unchecked.**
+
+- [ ] Staging deployment successful: Dockerfile built and service started without errors
+  - _Evidence:_ deployment log link
+- [ ] DB migrations ran cleanly on staging against a clean schema (no pre-existing tables)
+  - _Evidence:_ migration runner output in deployment log
+- [ ] Rollback tested: deployed v1.3.0 on staging, rolled back to v1.2.0, verified core
+      functionality remained intact
+  - _Evidence:_ rollback test log link
+- [ ] Health and readiness endpoints functional on staging
+  - `GET /health` → `200 {"status":"healthy"}`
+  - `GET /ready` → `200` when DB and Redis are reachable; `503` when either is down
+  - _Evidence:_ curl output
+- [ ] Alerts configured and tested for P0 errors (5xx bursts, auth failures)
+  - _Evidence:_ alert rule screenshot + test notification confirmation
+
+---
+
+## Gate 4 — Product Gate
+
+> **CONDITIONAL** — release may proceed with documented exceptions approved by
+> the Product Owner and Engineering Lead. Any unchecked item must be logged in
+> the Decision Log with a resolution date.
+
+- [ ] Product Owner sign-off on feature completeness for v1.3.0 scope
+  - _Sign-off by:_ [name, date]
+- [ ] Customer-facing changelog reviewed and approved for accuracy
+  - _Evidence:_ link to reviewed `CHANGELOG.md` diff
+- [ ] Documentation complete: OpenAPI 3.1 spec generated and matches implementation
+  - _Evidence:_ spec file path + validation command output
+- [ ] `README.md` and quickstart guide updated for v1.3.0 changes
+  - _Evidence:_ PR link
+
+---
+
+## Outcome
+
+| Outcome | Condition | Action |
+|---------|-----------|--------|
+| GO | All Gate 1 + 2 + 3 items checked; Gate 4 items checked | Proceed to production release |
+| CONDITIONAL GO | All Gate 1 + 2 + 3 items checked; one or more Gate 4 items pending with Product Owner approval | Release with documented limitations; Gate 4 items tracked as follow-up |
+| NO-GO | Any Gate 1, 2, or 3 item unchecked | Release blocked — schedule remediation sprint, re-convene for re-review |
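+
+The outcome rules can also be read as a small decision function. This is only a sketch: `GateReport`, `GateTally`, and `decideOutcome` are hypothetical names, and treating unchecked Gate 4 items without approved exceptions as NO-GO is an assumption, since the table does not cover that case.
+
+```typescript
+type Outcome = "GO" | "CONDITIONAL GO" | "NO-GO";
+
+interface GateTally {
+  total: number;
+  checked: number;
+}
+
+interface GateReport {
+  security: GateTally;
+  quality: GateTally;
+  operations: GateTally;
+  product: GateTally;
+  productExceptionsApproved: boolean; // Gate 4 gaps signed off and logged.
+}
+
+function decideOutcome(report: GateReport): Outcome {
+  const complete = (gate: GateTally): boolean => gate.checked === gate.total;
+  // Gates 1 to 3 are hard blocks: any unchecked item means NO-GO.
+  const hardGatesPass = [report.security, report.quality, report.operations].every(complete);
+  if (!hardGatesPass) return "NO-GO";
+  if (complete(report.product)) return "GO";
+  // Assumption: pending Gate 4 items without approved exceptions also block.
+  return report.productExceptionsApproved ? "CONDITIONAL GO" : "NO-GO";
+}
+```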
+
+---
+
+## Decision Log
+
+| Date | Release | Outcome | Blocker (if NO-GO) | Resolved By | Sign-off |
+|------|---------|---------|-------------------|-------------|----------|
+| — | v1.3.0 | Pending | — | — | — |
+
+---
+
+## Current Status — v1.3.0
+
+> Status as of 2026-04-04. All gates open; work in progress.
+
+| Gate | Items | Checked | Remaining | Status |
+|------|-------|---------|-----------|--------|
+| Gate 1 — Security | 7 | 0 | 7 | OPEN (HARD BLOCK) |
+| Gate 2 — Quality | 5 | 0 | 5 | OPEN (HARD BLOCK) |
+| Gate 3 — Operations | 5 | 0 | 5 | OPEN (HARD BLOCK) |
+| Gate 4 — Product | 4 | 0 | 4 | OPEN (CONDITIONAL) |
+| **Overall** | **21** | **0** | **21** | **NO-GO** |
+
+---
+
+## Related Documents
+
+- `docs/06_validation/PRODUCTION_READINESS.md` — detailed effort estimates and owners
+- `docs/06_validation/SECURITY_TESTING_PLAN.md` — security test cases and evidence requirements
+- `docs/06_validation/TEST_PLAN_RUN_SCOPING.md` — run isolation test plan (required for Gate 1 R-02)
+- `docs/06_validation/TEST_PLAN_AUTH.md` — auth package test plan (required for Gate 2 coverage)
+- `docs/SECURITY_AUDIT.md` — open risk register (R-01 through R-08)
+- `docs/ROLLBACK.md` — rollback procedure (required for Gate 3)
diff --git a/docs/06_validation/PRODUCTION_READINESS.md b/docs/06_validation/PRODUCTION_READINESS.md
new file mode 100644
index 0000000..6b70363
--- /dev/null
+++ b/docs/06_validation/PRODUCTION_READINESS.md
@@ -0,0 +1,135 @@
+# Production Readiness Checklist
+
+- **Document type**: Release Checklist
+- **Version target**: v1.3.0
+- **Last updated**: 2026-04-04
+- **Status**: In progress — all items open
+
+---
+
+## Purpose
+
+This checklist must be completed in full before any production release. Work
+through each item with the designated owner, mark it checked, and record the
+date and reviewer in the notes column. Categories 1–3 are hard gates: a single
+unchecked item in Security, Reliability, or Observability blocks release.
+Categories 4–6 are strong recommendations; any exception requires explicit
+sign-off from the Engineering Lead.
+
+**How to use this checklist**
+
+1. Open this file at the start of each release milestone.
+2. Assign owners to unchecked items.
+3. Work items in priority order: Security first, then Reliability, then
+   Observability, then Testing, then Deployment, then Documentation.
+4. Mark each item `[x]` when verified — not just implemented.
+5. Bring outstanding items to the Go/No-Go review meeting
+   (`GO_NO_GO_CHECKLIST.md`).
+
+---
+
+## Category 1 — Security
+
+> **HARD GATE — release cannot proceed if any item in this category is unchecked.**
+
+| # | Item | Effort | Owner | Done |
+|---|------|--------|-------|------|
+| S-01 | R-01: SA secret loaded from env var (`SA_SECRET`); service throws on startup if env var is absent or empty | 1h | eng | [ ] |
+| S-02 | R-02: Default org bypass removed from `resolveSession`; cross-tenant access blocked at API layer | 2h | eng | [ ] |
+| S-03 | R-03: Redis-backed session revocation via jti blacklist; revoked tokens return 401 | 4h | eng | [ ] |
+| S-04 | R-04: Execution token validated on every protected API call; expired/missing exec token returns 401 | 3h | eng | [ ] |
+| S-05 | R-05: Audit hash chain uses DB-persisted `lastHash` (not module-level variable); chain survives service restart | 3h | eng | [ ] |
+| S-06 | R-06: Rate limiting enforced — 100 req/min per actor globally, 10/min for token creation endpoint | 4h | eng | [ ] |
+| S-07 | Secrets (tokens, keys, passwords) are never written to application logs; audit log sanitization verified | 2h | eng | [ ] |
+| S-08 | HTTPS enforced in production (HTTP → HTTPS redirect at load balancer or app layer) | 1h | infra | [ ] |
+| S-09 | CORS policy configured with an explicit origin allowlist — wildcard (`*`) not permitted in production | 1h | eng | [ ] |
+| S-10 | CSP headers configured on the web control plane (`Content-Security-Policy` response header present) | 2h | eng | [ ] |
+
+---
+
+## Category 2 — Reliability
+
+> **HARD GATE — release cannot proceed if any item in this category is unchecked.**
+
+| # | Item | Effort | Owner | Done |
+|---|------|--------|-------|------|
+| R-01 | PostgreSQL runtime wired: `run-store.ts` reads and writes to DB (not in-memory map) | 8h | eng | [ ] |
+| R-02 | Connection pooling configured: pg pool `min=2`, `max=10`; connections reused across requests | 1h | eng | [ ] |
+| R-03 | DB migrations run automatically on service startup (via migration runner in entrypoint) | 2h | eng | [ ] |
+| R-04 | Graceful shutdown: `SIGTERM` handler drains in-flight requests and closes DB pool before exit | 3h | eng | [ ] |
+| R-05 | `GET /health` returns `200 {"status":"healthy"}` and does not gate on DB connectivity | 1h | eng | [ ] |
+| R-06 | `GET /ready` gates on both DB and Redis connectivity; returns `503` if either is unreachable | 2h | eng | [ ] |
+
+---
+
+## Category 3 — Observability
+
+> **HARD GATE — release cannot proceed if any item in this category is unchecked.**
+
+| # | Item | Effort | Owner | Done |
+|---|------|--------|-------|------|
+| O-01 | Structured JSON logging in place (`logger.ts`); all `console.log` / `console.error` calls replaced | 4h | eng | [ ] |
+| O-02 | Trace ID (`X-Trace-ID` header) injected on every inbound request and included in all log lines | 2h | eng | [ ] |
+| O-03 | `GET /metrics` Prometheus endpoint exposed; request count, latency histograms, error rates present | 6h | eng | [ ] |
+| O-04 | Error alerting configured: critical errors (5xx bursts, auth failures) route to alert channel | 3h | infra | [ ] |
+| O-05 | Audit log persisted to DB for every material action (run create/cancel/resume, gate approve/reject) | 4h | eng | [ ] |
+
+---
+
+## Category 4 — Testing
+
+| # | Item | Effort | Owner | Done |
+|---|------|--------|-------|------|
+| T-01 | `packages/auth` test coverage ≥ 90% (measured via `pnpm test --coverage`, not estimated) | 8h | eng | [ ] |
+| T-02 | `packages/orchestrator` test coverage ≥ 80% | 12h | eng | [ ] |
+| T-03 | Governance gates package test coverage ≥ 80% | 8h | eng | [ ] |
+| T-04 | All smoke tests pass on staging environment (`pnpm test:smoke --env=staging`) | 2h | qa | [ ] |
+| T-05 | Zero P0/P1 security vulnerabilities open (`npm audit --audit-level=high` returns clean) | varies | security | [ ] |
+
+---
+
+## Category 5 — Deployment
+
+| # | Item | Effort | Owner | Done |
+|---|------|--------|-------|------|
+| D-01 | Dockerfile exists, builds successfully (`docker build .` passes with no errors) | 4h | eng | [ ] |
+| D-02 | `.env.example` documents every required environment variable with description and example value | 1h | eng | [ ] |
+| D-03 | DB migrations tested on a completely clean schema (no pre-existing tables) | 1h | eng | [ ] |
+| D-04 | Rollback procedure documented (`docs/ROLLBACK.md`) and tested: v1.3.0 → v1.2.0 verified working | 2h | eng | [ ] |
+| D-05 | Zero-downtime deploy strategy defined and documented (blue/green or rolling; no forced restarts mid-request) | 4h | infra | [ ] |
+
+---
+
+## Category 6 — Documentation
+
+| # | Item | Effort | Owner | Done |
+|---|------|--------|-------|------|
+| Doc-01 | OpenAPI 3.1 spec generated and validated against implementation (all routes, request/response schemas present) | 8h | eng | [ ] |
+| Doc-02 | `CHANGELOG.md` updated with v1.3.0 entries (new features, bug fixes, breaking changes, security fixes) | 1h | eng | [ ] |
+| Doc-03 | `SECURITY.md` current with accurate vulnerability contact information and disclosure timeline | 30m | security | [ ] |
+
+---
+
+## Summary Table
+
+> Update this table manually at each milestone review.
+
+| Category | Total Items | Checked | Remaining | Effort Remaining |
+|----------|-------------|---------|-----------|-----------------|
+| Security | 10 | 0 | 10 | 23h |
+| Reliability | 6 | 0 | 6 | 17h |
+| Observability | 5 | 0 | 5 | 19h |
+| Testing | 5 | 0 | 5 | 30h + T-05 (varies) |
+| Deployment | 5 | 0 | 5 | 12h |
+| Documentation | 3 | 0 | 3 | 9.5h |
+| **Total** | **34** | **0** | **34** | **110.5h + T-05 (varies)** |
+
+---
+
+## Related Documents
+
+- `docs/06_validation/GO_NO_GO_CHECKLIST.md`
+- `docs/06_validation/SECURITY_TESTING_PLAN.md`
+- `docs/06_validation/TEST_PLAN_RUN_SCOPING.md`
+- `docs/SECURITY_AUDIT.md`
+- `docs/RELEASE_CHECKLIST.md`
diff --git a/docs/06_validation/SECURITY_TESTING_PLAN.md b/docs/06_validation/SECURITY_TESTING_PLAN.md
new file mode 100644
index 0000000..ca558ca
--- /dev/null
+++ b/docs/06_validation/SECURITY_TESTING_PLAN.md
@@ -0,0 +1,499 @@
+# Security Testing Plan
+
+- **Document type**: Security Test Plan
+- **Version target**: v1.3.0
+- **Last updated**: 2026-04-04
+- **Status**: Draft — Phase 1 required before v1.3.0 release
+
+---
+
+## 1. Scope
+
+### In scope
+
+| System | What is tested |
+|--------|---------------|
+| `apps/control-service` API | All authenticated endpoints, rate limiting, input validation |
+| `packages/auth` | JWT validation, session resolution, token revocation |
+| Tenant isolation layer | Cross-org access prevention, run scoping |
+| SSE realtime stream (`GET /v1/stream`) | Cross-tenant event leakage |
+| Audit log subsystem | Hash chain integrity, restart safety |
+
+### Out of scope
+
+- InsForge platform internals (JWT issuance, JWKS endpoint) — not operated by this team
+- Third-party AI provider APIs (OpenAI, Anthropic, etc.)
+- Network-layer controls (TLS termination, DDoS mitigation) — handled by infrastructure
+- Client-side / browser security of any frontend applications
+
+---
+
+## 2. Testing Phases
+
+### Phase 1 — Pre-release (required for v1.3.0)
+
+| Activity | Owner | Timing |
+|----------|-------|--------|
+| Manual security review of open risks R-01 through R-07 | Security Lead | 2 weeks before release |
+| Automated security test suite (`pnpm test:security`) | Engineering | CI, every PR to main |
+| OWASP ZAP API scan against staging | Security Lead | 1 week before release |
+| `npm audit` dependency vulnerability scan | Engineering | CI, weekly |
+| Static analysis: `eslint-plugin-security` | Engineering | CI, every PR |
+
+**Exit criteria for Phase 1:** Zero High or Critical findings from ZAP; all automated
+security tests pass; all P0/P1 risks from the open risk register resolved.
+
+### Phase 2 — Post-release (scheduled)
+
+| Activity | Frequency | Owner |
+|----------|-----------|-------|
+| Automated `npm audit` | Weekly (CI cron) | Engineering |
+| OWASP ZAP scan against production (read-only, non-destructive) | Monthly | Security Lead |
+| Dependency update review | Monthly | Engineering |
+
+### Phase 3 — Ongoing (CI enforcement)
+
+- All tests in `pnpm test:security` run on every PR to `main`.
+- Any PR that removes or skips a security test requires Security Lead approval.
+- New security bug fixes must be accompanied by a regression test before merge.
+
+---
+
+## 3. Test Cases by Category
+
+All automated test cases live under `packages/auth/tests/security/` and
+`apps/control-service/tests/security/`. Run with:
+
+```bash
+pnpm test:security
+```
+
+---
+
+### 3.1 Authentication Attacks
+
+**JWT algorithm confusion**
+- Send a token signed with HS256 using the RS256 public key as the HMAC secret.
+  Server expects RS256 (from JWKS). Must reject with `401 INVALID_TOKEN`.
+
+**JWT "none" algorithm**
+- Craft a token with header `{"alg":"none","typ":"JWT"}` and no signature.
+  Must reject with `401 INVALID_TOKEN` — server must never accept `alg: "none"`.
+
+**Expired token**
+- Issue a valid RS256 token with `exp` set 1 hour in the past.
+  Must reject with `401 TOKEN_EXPIRED`.
+
+**Wrong issuer**
+- Issue a token with `iss: "https://evil.example.com"` signed with a valid key.
+  Must reject with `401 INVALID_ISSUER`.
+
+**Tampered payload**
+- Take a valid token, decode the payload, flip one character in `sub`, re-encode
+  with the original signature (now invalid).
+  Must reject with `401 INVALID_SIGNATURE`.
+
+**Revoked token (jti blacklist)**
+- Issue a valid token. Call the logout endpoint to revoke it (adds jti to Redis
+  blacklist). Immediately reuse the same token on a protected endpoint.
+  Must reject with `401 TOKEN_REVOKED`.
+
+**Brute force rate limit**
+- Send 11 login/token-creation requests within 60 seconds from the same actor.
+  The 11th request must receive `429 Too Many Requests` with a `Retry-After`
+  header.
+
+```typescript
+describe('Authentication Attacks', () => {
+  it('rejects HS256 algorithm confusion token', async () => {
+    const token = buildAlgConfusionToken({ alg: 'HS256', secret: RS256_PUBLIC_KEY });
+    const res = await api.get('/v1/runs').set('Authorization', `Bearer ${token}`);
+    expect(res.status).toBe(401);
+    expect(res.body.code).toBe('INVALID_TOKEN');
+  });
+
+  it('rejects alg:none token', async () => {
+    const token = buildNoneAlgToken({ sub: 'user-a-admin' });
+    const res = await api.get('/v1/runs').set('Authorization', `Bearer ${token}`);
+    expect(res.status).toBe(401);
+    expect(res.body.code).toBe('INVALID_TOKEN');
+  });
+
+  it('rejects expired token', async () => {
+    const token = buildExpiredToken({ sub: 'user-a-admin', ageSeconds: 3600 });
+    const res = await api.get('/v1/runs').set('Authorization', `Bearer ${token}`);
+    expect(res.status).toBe(401);
+    expect(res.body.code).toBe('TOKEN_EXPIRED');
+  });
+
+  it('rejects wrong issuer', async () => {
+    const token = buildToken({ sub: 'user-a-admin', iss: 'https://evil.example.com' });
+    const res = await api.get('/v1/runs').set('Authorization', `Bearer ${token}`);
+    expect(res.status).toBe(401);
+    expect(res.body.code).toBe('INVALID_ISSUER');
+  });
+
+  it('rejects tampered payload', async () => {
+    const token = buildTamperedPayloadToken({ sub: 'user-a-admin' });
+    const res = await api.get('/v1/runs').set('Authorization', `Bearer ${token}`);
+    expect(res.status).toBe(401);
+    expect(res.body.code).toBe('INVALID_SIGNATURE');
+  });
+
+  it('rejects revoked token (jti in Redis blacklist)', async () => {
+    const { token } = await issueToken({ sub: 'user-a-admin' });
+    await api.post('/v1/auth/logout').set('Authorization', `Bearer ${token}`);
+    const res = await api.get('/v1/runs').set('Authorization', `Bearer ${token}`);
+    expect(res.status).toBe(401);
+    expect(res.body.code).toBe('TOKEN_REVOKED');
+  });
+
+  it('rate limits token creation on the 11th attempt within a minute', async () => {
+    for (let i = 0; i < 10; i++) {
+      await api.post('/v1/auth/token').send({ clientId: 'x', clientSecret: 'bad' });
+    }
+    const res = await api.post('/v1/auth/token').send({ clientId: 'x', clientSecret: 'bad' });
+    expect(res.status).toBe(429);
+    expect(res.headers['retry-after']).toBeDefined();
+  });
+});
+```
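+
+The token-building helpers used in the tests above are not defined in this plan. As a minimal sketch of what two of them could look like, the following uses only Node's `Buffer`. The function bodies, the `tamperSub` name, and its signature are assumptions for illustration, not the suite's actual helpers.
+
+```typescript
+// Hypothetical helper bodies; only buildNoneAlgToken's name comes from the tests above.
+type Claims = Record<string, unknown>;
+
+const b64url = (value: object): string =>
+  Buffer.from(JSON.stringify(value)).toString('base64url');
+
+// Builds an unsigned token whose header declares alg "none".
+function buildNoneAlgToken(claims: Claims): string {
+  // The trailing dot keeps the three-segment shape with an empty signature.
+  return `${b64url({ alg: 'none', typ: 'JWT' })}.${b64url(claims)}.`;
+}
+
+// Replaces `sub` in an existing token while keeping the original signature,
+// so the signature no longer matches the payload.
+function tamperSub(validToken: string, newSub: string): string {
+  const [header, payload, signature] = validToken.split('.');
+  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8')) as Claims;
+  return `${header}.${b64url({ ...claims, sub: newSub })}.${signature}`;
+}
+```
+
+Neither helper needs a signing key, which is the point of both attacks: a server that verifies the signature against the JWKS key before trusting any claim rejects both tokens.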
+
+---
+
+### 3.2 Authorization Bypass
+
+**Cross-tenant run access**
+- Authenticate as an orgA user with a valid token. Request `GET /v1/runs/{id}` where
+  `id` belongs to orgB. Must receive `404` — not `200` (leak) or `403` (information
+  disclosure that the run exists).
+
+**Gate approval without permission**
+- Authenticate as a user without `gate:approve` permission. Call the gate approval
+  endpoint. Must receive `403 FORBIDDEN`.
+
+**Privilege escalation via crafted JWT**
+- Craft a token with `roles: ["org:admin"]` for an account that is only
+  `workspace:member`. Because the token is not signed by InsForge's private key,
+  it must be rejected with `401 INVALID_SIGNATURE`.
+
+**Service account accessing admin endpoint**
+- Authenticate as a service account (which has `actorType: service_account`).
+  Call an admin-only endpoint (e.g., `GET /v1/admin/orgs`).
+  Must receive `403 FORBIDDEN`.
+
+**Forged execution token**
+- Issue an execution token for run-a1. Substitute run-a2's ID into the token payload
+  without re-signing. Call `POST /v1/runs/run-a2/resume` with this token.
+  Must receive `401 INVALID_EXEC_TOKEN`.
+
+```typescript
+describe('Authorization Bypass', () => {
+  it('returns 404 (not 200 or 403) for cross-tenant run access', async () => {
+    const res = await authed(orgAToken).get('/v1/runs/run-b1');
+    expect(res.status).toBe(404);
+  });
+
+  it('returns 403 for gate approval without gate:approve permission', async () => {
+    const res = await authed(viewerToken).post(`/v1/runs/run-a1/gates/1/approve`);
+    expect(res.status).toBe(403);
+  });
+
+  it('rejects crafted admin JWT with invalid signature', async () => {
+    const token = craftAdminToken({ sub: 'user-a-member' }); // not signed by InsForge
+    const res = await api.get('/v1/admin/orgs').set('Authorization', `Bearer ${token}`);
+    expect(res.status).toBe(401);
+    expect(res.body.code).toBe('INVALID_SIGNATURE');
+  });
+
+  it('returns 403 for service account accessing admin endpoint', async () => {
+    const res = await authed(saToken).get('/v1/admin/orgs');
+    expect(res.status).toBe(403);
+  });
+
+  it('rejects forged execution token for different run', async () => {
+    const forgedToken = forgeExecToken({ originalRunId: 'run-a1', targetRunId: 'run-a2' });
+    const res = await api
+      .post('/v1/runs/run-a2/resume')
+      .set('X-Execution-Token', forgedToken);
+    expect(res.status).toBe(401);
+    expect(res.body.code).toBe('INVALID_EXEC_TOKEN');
+  });
+});
+```
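+
+The 404-instead-of-403 rule for cross-tenant access can be sketched as a lookup that treats "missing" and "owned by another org" identically. The `Run` shape, the in-memory `Map`, and the `lookupRun` name are assumptions for illustration; the real service presumably scopes its database query by org.
+
+```typescript
+interface Run {
+  id: string;
+  orgId: string;
+}
+
+type Lookup = { status: 200; run: Run } | { status: 404 };
+
+// Hypothetical lookup: resolves a run only when it belongs to the caller's org.
+function lookupRun(runs: Map<string, Run>, runId: string, actorOrgId: string): Lookup {
+  const run = runs.get(runId);
+  // A missing run and a run owned by another org are indistinguishable to the caller.
+  if (!run || run.orgId !== actorOrgId) return { status: 404 };
+  return { status: 200, run };
+}
+```
+
+Because both failure cases share one branch, the response cannot be used to probe which run IDs exist in another tenant.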
+
+---
+
+### 3.3 Input Validation
+
+**SQL injection in runId**
+- Send `GET /v1/runs/'; DROP TABLE runs; --` as the run ID path parameter.
+  The query must use parameterized statements; the response must be `400` or `404`
+  with no DB error. The `runs` table must still exist after the request.
+
+**XSS in idea text**
+- Submit `POST /v1/runs` with `idea: "<script>alert(1)</script>"`.
+  The stored and returned value must be sanitized or escaped — the literal
+  `<script>` tag must not appear unescaped in API responses.
+
+**Oversized payload**
+- Submit a `POST /v1/runs` request with a 10 MB `idea` field.
+  Must receive `413 Entity Too Large` before the payload is processed.
+
+**Malformed JSON**
+- Send a `POST /v1/runs` request with body `{broken json`.
+  Must receive `400 Bad Request` with a parse error message.
+
+**Negative step index**
+- Send `PATCH /v1/runs/{id}/step` with `stepIndex: -1`.
+  Must receive `400` with a validation error indicating `stepIndex` must be
+  a non-negative integer.
+
+```typescript
+describe('Input Validation', () => {
+  it('handles SQL injection in runId path param without DB error', async () => {
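+    // Encode the payload so it reaches the server as a single path segment
+    const runId = encodeURIComponent("'; DROP TABLE runs; --");
+    const res = await authed(orgAToken).get(`/v1/runs/${runId}`);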
+    expect([400, 404]).toContain(res.status);
+    // Verify runs table still exists
+    const check = await authed(orgAToken).get('/v1/runs');
+    expect(check.status).toBe(200);
+  });
+
+  it('sanitizes XSS in idea text', async () => {
+    const res = await authed(orgAToken)
+      .post('/v1/runs')
+      .send({ orgId: 'org-a', workspaceId: 'ws-1', idea: '<script>alert(1)</script>' });
+    expect(res.status).toBe(201);
+    expect(res.body.run.idea).not.toMatch(/<script>/i);
+  });
+
+  it('returns 413 for oversized payload', async () => {
+    const res = await authed(orgAToken)
+      .post('/v1/runs')
+      .send({ orgId: 'org-a', workspaceId: 'ws-1', idea: 'x'.repeat(10 * 1024 * 1024) });
+    expect(res.status).toBe(413);
+  });
+
+  it('returns 400 for malformed JSON', async () => {
+    const res = await api
+      .post('/v1/runs')
+      .set('Authorization', `Bearer ${orgAToken}`)
+      .set('Content-Type', 'application/json')
+      .send('{broken json');
+    expect(res.status).toBe(400);
+  });
+
+  it('returns 400 for negative stepIndex', async () => {
+    const res = await authed(orgAToken)
+      .patch('/v1/runs/run-a1/step')
+      .send({ stepIndex: -1 });
+    expect(res.status).toBe(400);
+    expect(res.body.error).toMatch(/stepIndex/i);
+  });
+});
+```
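+
+As an illustration of the `stepIndex` rule above, a validator could look like the sketch below. The function name, the result shape, and the error wording are assumptions; the plan only requires a `400` whose error mentions `stepIndex`.
+
+```typescript
+type StepIndexResult =
+  | { ok: true; stepIndex: number }
+  | { ok: false; error: string };
+
+// Hypothetical validator: accepts only non-negative integers.
+function validateStepIndex(value: unknown): StepIndexResult {
+  if (typeof value !== 'number' || !Number.isInteger(value) || value < 0) {
+    return { ok: false, error: 'stepIndex must be a non-negative integer' };
+  }
+  return { ok: true, stepIndex: value };
+}
+```
+
+Checking the type before the range means string inputs such as `"3"` are rejected rather than coerced.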
+
+---
+
+### 3.4 Tenant Isolation
+
+**Valid orgA token with orgB runId in path**
+- Must return `404` with no body content that reveals the run exists in orgB.
+
+**Run enumeration via pagination**
+- Exhaust all pages of `GET /v1/runs` as an orgA user; assert zero orgB runs appear
+  across all pages regardless of page size.
+
+**SSE stream cross-tenant event leak**
+- Connect to `GET /v1/stream` as an orgA user. Trigger an orgB run state change.
+  Assert that no event with `orgId: "org-b"` is received on the orgA connection.
+
+**Cross-tenant audit log query**
+- Call `GET /v1/audit` as an orgA admin. Assert that every event in the response
+  has `orgId: "org-a"`.
+
+```typescript
+describe('Tenant Isolation', () => {
+  it('returns 404 for valid orgA token + orgB runId', async () => {
+    const res = await authed(orgAToken).get('/v1/runs/run-b1');
+    expect(res.status).toBe(404);
+  });
+
+  it('does not leak orgB runs through pagination', async () => {
+    const allRuns: Run[] = [];
+    let page = 1, hasMore = true;
+    while (hasMore) {
+      const res = await authed(orgAToken).get(`/v1/runs?page=${page}&limit=50`);
+      allRuns.push(...res.body.runs);
+      hasMore = res.body.hasNextPage;
+      page++;
+    }
+    expect(allRuns.every((r) => r.orgId === 'org-a')).toBe(true);
+  });
+
+  it('SSE stream delivers no orgB events to orgA subscriber', async () => {
+    const received: SSEEvent[] = [];
+    const stream = connectSSE('/v1/stream', orgAToken);
+    stream.on('event', (e: SSEEvent) => received.push(e));
+
+    await authed(orgBToken).post('/v1/runs').send({ idea: 'b event', orgId: 'org-b' });
+    await wait(300);
+    stream.close();
+
+    expect(received.filter((e) => e.orgId === 'org-b')).toHaveLength(0);
+  });
+
+  it('audit log returns only orgA events for orgA admin', async () => {
+    const res = await authed(orgAToken).get('/v1/audit');
+    expect(res.status).toBe(200);
+    expect(res.body.events.every((e: AuditEvent) => e.orgId === 'org-a')).toBe(true);
+  });
+});
+```
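+
+One way to make the SSE isolation property hold by construction is to key subscribers by org, so that publishing never iterates over another tenant's listeners. The class below is a sketch under that assumption; `OrgScopedBroadcaster` and `StreamEvent` are hypothetical names, not the service's actual types.
+
+```typescript
+interface StreamEvent {
+  orgId: string;
+  type: string;
+}
+
+type Listener = (event: StreamEvent) => void;
+
+class OrgScopedBroadcaster {
+  private readonly listeners = new Map<string, Set<Listener>>();
+
+  // Registers a listener for one org and returns an unsubscribe function.
+  subscribe(orgId: string, listener: Listener): () => void {
+    const set = this.listeners.get(orgId) ?? new Set<Listener>();
+    set.add(listener);
+    this.listeners.set(orgId, set);
+    return () => {
+      set.delete(listener);
+    };
+  }
+
+  // Delivers the event only to listeners registered under the event's own org.
+  publish(event: StreamEvent): void {
+    this.listeners.get(event.orgId)?.forEach((listener) => listener(event));
+  }
+}
+```
+
+With this shape, the SSE test above exercises a structural guarantee rather than a per-event filter that could be forgotten on a new code path.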
+
+---
+
+### 3.5 Rate Limiting
+
+**Global actor rate limit**
+- Send 101 requests within 60 seconds from the same actor.
+  The 101st request must return `429 Too Many Requests` with a `Retry-After` header.
+  After 60 seconds the limit must reset and the 102nd request must succeed.
+
+**Token creation endpoint rate limit**
+- Send 11 token creation attempts within 60 seconds.
+  The 11th attempt must return `429`.
+
+```typescript
+describe('Rate Limiting', () => {
+  it('returns 429 on 101st request in same minute', async () => {
+    for (let i = 0; i < 100; i++) {
+      await authed(orgAToken).get('/v1/runs');
+    }
+    const res = await authed(orgAToken).get('/v1/runs');
+    expect(res.status).toBe(429);
+    expect(res.headers['retry-after']).toBeDefined();
+  });
+
+  it('returns 429 on 11th token creation attempt in same minute', async () => {
+    for (let i = 0; i < 10; i++) {
+      await api.post('/v1/auth/token').send({ clientId: 'x', clientSecret: 'bad' });
+    }
+    const res = await api.post('/v1/auth/token').send({ clientId: 'x', clientSecret: 'bad' });
+    expect(res.status).toBe(429);
+  });
+
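+  it('rate limit resets after 60 seconds', async () => {
+    // Fake only Date so HTTP I/O keeps working. Assumes an in-process limiter that reads Date.now().
+    vi.useFakeTimers({ toFake: ['Date'] });
+    try {
+      for (let i = 0; i < 101; i++) await authed(orgAToken).get('/v1/runs'); // exhaust the limit
+      vi.setSystemTime(Date.now() + 61_000); // move the clock past the 60 s window
+      const res = await authed(orgAToken).get('/v1/runs');
+      expect(res.status).toBe(200);
+    } finally {
+      vi.useRealTimers();
+    }
+  });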
+});
+```
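+
+The behaviour these tests expect (100 requests allowed, the 101st rejected with `Retry-After`, reset after 60 seconds) matches a fixed-window counter. The sketch below assumes that algorithm and an injectable clock; the plan does not say whether the real limiter is fixed-window, sliding-window, or Redis-backed, and `FixedWindowLimiter` is a hypothetical name.
+
+```typescript
+interface WindowState {
+  start: number;
+  count: number;
+}
+
+type LimitDecision = { status: 200 } | { status: 429; retryAfterSeconds: number };
+
+class FixedWindowLimiter {
+  private readonly windows = new Map<string, WindowState>();
+  private readonly limit: number;
+  private readonly windowMs: number;
+  private readonly now: () => number;
+
+  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
+    this.limit = limit;
+    this.windowMs = windowMs;
+    this.now = now;
+  }
+
+  // Returns the status a request would receive and, when limited, the Retry-After value.
+  check(actorId: string): LimitDecision {
+    const t = this.now();
+    let state = this.windows.get(actorId);
+    // Start a fresh window on first use or once the previous window has elapsed.
+    if (!state || t - state.start >= this.windowMs) {
+      state = { start: t, count: 0 };
+      this.windows.set(actorId, state);
+    }
+    if (state.count >= this.limit) {
+      const retryAfterSeconds = Math.ceil((state.start + this.windowMs - t) / 1000);
+      return { status: 429, retryAfterSeconds };
+    }
+    state.count++;
+    return { status: 200 };
+  }
+}
+```
+
+Injecting `now` lets a test advance time by changing a variable, which avoids both a real 60-second wait and global timer mocking.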
+
+---
+
+### 3.6 Audit Integrity
+
+**Hash chain end-to-end validity**
+- POST 100 audit events in sequence. Retrieve the full chain and verify each
+  entry's `hash` equals `SHA256(previousHash + eventPayload)`. The chain must be
+  valid from entry 1 to entry 100.
+
+**Hash chain survives service restart (R-05)**
+- POST 50 events. Restart the control service. POST 10 more events. Run the chain
+  verifier across all 60 entries. No mismatch must be detected. This confirms that
+  `lastHash` is loaded from DB on startup, not initialized to a static value.
+
+**Tampered entry detection**
+- POST 20 events. Directly UPDATE one row in `audit_log` to change its payload.
+  Run the chain verifier. Must report a mismatch at the modified entry.
+
+```typescript
+describe('Audit Integrity', () => {
+  it('SHA256 hash chain is valid across 100 events', async () => {
+    await postAuditEvents(100);
+    const valid = await verifyAuditChain();
+    expect(valid.ok).toBe(true);
+    expect(valid.invalidAt).toBeNull();
+  });
+
+  it('hash chain continues correctly after service restart (R-05)', async () => {
+    await postAuditEvents(50);
+    await restartService(); // helper stops and restarts the control-service process
+    await postAuditEvents(10);
+    const valid = await verifyAuditChain();
+    expect(valid.ok).toBe(true);
+  });
+
+  it('chain verifier detects a tampered audit entry', async () => {
+    await postAuditEvents(20);
+    await db.query("UPDATE audit_log SET payload = 'tampered' WHERE sequence = 10");
+    const valid = await verifyAuditChain();
+    expect(valid.ok).toBe(false);
+    expect(valid.invalidAt).toBe(10);
+  });
+});
+```
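+
+The chain rule stated above, `hash = SHA256(previousHash + eventPayload)`, can be sketched with Node's built-in `crypto` module. The `AuditEntry` shape, the empty-string genesis hash, and both function names are assumptions; the real `verifyAuditChain` helper may differ.
+
+```typescript
+import { createHash } from 'node:crypto';
+
+interface AuditEntry {
+  sequence: number;
+  payload: string;
+  hash: string;
+}
+
+// Assumption: the first entry chains from an empty string.
+const GENESIS_HASH = '';
+
+const sha256 = (input: string): string =>
+  createHash('sha256').update(input).digest('hex');
+
+// Appends an entry whose hash covers the previous entry's hash and its own payload.
+function appendEntry(chain: AuditEntry[], payload: string): AuditEntry {
+  const previousHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS_HASH;
+  const entry = { sequence: chain.length + 1, payload, hash: sha256(previousHash + payload) };
+  chain.push(entry);
+  return entry;
+}
+
+// Recomputes every hash in order and reports the first sequence that does not match.
+function verifyChain(chain: AuditEntry[]): { ok: boolean; invalidAt: number | null } {
+  let previousHash = GENESIS_HASH;
+  for (const entry of chain) {
+    if (entry.hash !== sha256(previousHash + entry.payload)) {
+      return { ok: false, invalidAt: entry.sequence };
+    }
+    previousHash = entry.hash;
+  }
+  return { ok: true, invalidAt: null };
+}
+```
+
+The restart case (R-05) corresponds to `appendEntry` reading the last stored hash rather than falling back to `GENESIS_HASH` when earlier entries exist.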
+
+---
+
+## 4. Tools
+
+| Tool | Purpose | When run |
+|------|---------|----------|
+| Vitest security suite (`pnpm test:security`) | Automated execution of all cases in §3 | CI on every PR to `main` |
+| `npm audit` | Known vulnerability scanning of all dependencies | CI weekly cron + before each release |
+| OWASP ZAP (API scan mode) | Active scanning of staging API for OWASP Top 10 | Manual, pre-release (Phase 1) and monthly (Phase 2) |
+| `eslint-plugin-security` | Static analysis for common JS/TS security anti-patterns | CI on every PR |
+| Manual code review | Review of auth, session, and tenant-resolution code paths | Security Lead review before Gate 1 sign-off |
+
+---
+
+## 5. Pass Criteria
+
+| Criterion | Requirement | Gate |
+|-----------|-------------|------|
+| All P0 (Critical) issues resolved | Zero open P0 security risks | Gate 1 — HARD BLOCK |
+| All P1 (High) issues resolved | Zero open P1 security risks (R-03, R-04, R-05) | Gate 1 — HARD BLOCK |
+| Automated security tests | `pnpm test:security` passes with zero failures | Gate 2 — HARD BLOCK |
+| OWASP ZAP scan | Zero High or Critical findings in pre-release scan | Gate 1 — HARD BLOCK |
+| `npm audit` | Zero High or Critical dependency vulnerabilities | Gate 2 — HARD BLOCK |
+
+Any finding that cannot be resolved before the release date must be documented
+with an accepted risk sign-off from the Security Lead. P0/P1 findings cannot
+be accepted-risk deferred — they block release unconditionally.
+
+---
+
+## 6. Regression Testing
+
+- All test cases in §3 are part of the `pnpm test:security` suite and run on
+  every PR to `main`.
+- Any security bug fix must include a new test case that would have caught the
+  bug. The test is written first (red), then the fix is applied (green).
+- Security tests may not be skipped (`it.skip`) or excluded from coverage without
+  an issue reference and Security Lead approval recorded in the PR.
+- The security suite is isolated from the main test suite so it can be run
+  independently in time-sensitive CI pipelines.
+
+---
+
+## 7. Responsible Disclosure
+
+For reporting vulnerabilities discovered in Code-Kit-Ultra, refer to
+`docs/SECURITY.md` (or `SECURITY.md` at the repository root). That document
+contains the contact address, expected response timeline, and disclosure
+embargo policy.
+
+Vulnerabilities found during internal testing are tracked in the private security
+issue tracker referenced in `SECURITY.md`. Do not open public issues for
+unpatched security vulnerabilities.
+
+---
+
+## Related Documents
+
+- `docs/06_validation/GO_NO_GO_CHECKLIST.md` — release gate (Gate 1 references this plan)
+- `docs/06_validation/PRODUCTION_READINESS.md` — full readiness checklist (S-01 through S-10)
+- `docs/06_validation/TEST_PLAN_RUN_SCOPING.md` — tenant isolation test plan
+- `docs/06_validation/TEST_PLAN_AUTH.md` — auth package unit test plan
+- `docs/SECURITY_AUDIT.md` — open risk register (R-01 through R-08)
diff --git a/docs/06_validation/SMOKE_TEST_PACK.md b/docs/06_validation/SMOKE_TEST_PACK.md
new file mode 100644
index 0000000..acce4d2
--- /dev/null
+++ b/docs/06_validation/SMOKE_TEST_PACK.md
@@ -0,0 +1,562 @@
+# Smoke Test Pack — Code-Kit-Ultra
+
+**Version:** 1.2.0 → 1.3.0  
+**Last Updated:** 2026-04-04  
+**Owner:** Engineering Lead  
+**Run Command:** `pnpm test:smoke`  
+**Target Duration:** < 5 minutes
+
+---
+
+## 1. Purpose
+
+Smoke tests are the first line of defence. They run in under 5 minutes and verify the system
+is alive and its critical paths are functional before the full test suite executes or a
+deployment proceeds.
+
+Smoke tests are intentionally narrow. They do not test edge cases, error paths, or performance.
+They answer one question: **"Is the system basically working right now?"**
+
+If any smoke test fails, the pipeline stops immediately — no further tests run and no
+deployment proceeds until the failure is resolved.
+
+---
+
+## 2. When to Run
+
+| Trigger | Action |
+|---|---|
+| Every `git push` to any branch | Run smoke tests in CI before lint/unit/integration |
+| Every pull request | Run as a required status check — blocks merge if failing |
+| Before every staging deploy | Run against staging environment |
+| Every production deploy | Run against the production canary after it is deployed and before full rollout |
+| After any infrastructure change | Confirm no breakage |
+| After any DB migration | Run against migrated environment |
+
+---
+
+## 3. Environment Requirements
+
+The smoke suite expects the following environment variables to be set:
+
+```
+BASE_URL=http://localhost:3000        # Override for staging/prod
+SMOKE_CLIENT_ID=smoke-test-client
+SMOKE_CLIENT_SECRET=<test secret>
+DATABASE_URL=postgresql://localhost:5432/cku_smoke
+REDIS_URL=redis://localhost:6379
+JWT_ISSUER=https://auth.codekit.local
+JWKS_URI=https://auth.codekit.local/.well-known/jwks.json
+```
+
+The smoke suite uses a dedicated test database and a throwaway Redis namespace
+(`smoke:<timestamp>`) that is flushed after the run. It does not touch production data.
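+
+A pre-flight check can fail fast when one of these variables is missing. The sketch below treats all seven as required, which is an assumption (the example tests fall back to a default for `BASE_URL`); `missingSmokeEnv` is a hypothetical helper name.
+
+```typescript
+const REQUIRED_SMOKE_ENV = [
+  "BASE_URL",
+  "SMOKE_CLIENT_ID",
+  "SMOKE_CLIENT_SECRET",
+  "DATABASE_URL",
+  "REDIS_URL",
+  "JWT_ISSUER",
+  "JWKS_URI",
+] as const;
+
+// Returns the names of required variables that are unset or empty.
+function missingSmokeEnv(env: Record<string, string | undefined>): string[] {
+  return REQUIRED_SMOKE_ENV.filter((name) => !env[name]);
+}
+```
+
+Called with `process.env` in a global setup file, a non-empty result can be turned into a single error that lists every missing name at once.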
+
+---
+
+## 4. Test Categories
+
+### 4.1 Startup Checks
+
+Verify the server process starts correctly and all infrastructure dependencies are reachable.
+
+| ID | Test | Expected Result | Timeout |
+|---|---|---|---|
+| S-001 | Server binds to `PORT` (default 3000) | TCP connection accepted | 10 s |
+| S-002 | `GET /health` returns 200 | `{ "status": "ok" }` | 2 s |
+| S-003 | `GET /health` includes DB status | `"db": "connected"` in body | 2 s |
+| S-004 | `GET /health` includes Redis status | `"redis": "connected"` in body | 2 s |
+| S-005 | `GET /ready` returns 200 | Server reports ready to serve traffic | 2 s |
+
+**Failure behaviour:** If the server does not start or `/health` returns non-200, all
+subsequent smoke tests are skipped and the pipeline is blocked.
+
+---
+
+### 4.2 Auth Smoke Tests
+
+Verify the InsForge JWT (RS256/JWKS) authentication stack accepts valid credentials and
+rejects invalid ones.
+
+| ID | Test | Input | Expected Result |
+|---|---|---|---|
+| A-001 | Valid credentials return token | POST `/auth/token` with correct client_id + secret | 200, body contains `access_token` |
+| A-002 | Invalid credentials return 401 | POST `/auth/token` with wrong secret | 401, body contains `error` |
+| A-003 | Missing body returns 400 | POST `/auth/token` with empty body | 400 |
+| A-004 | Token uses RS256 algorithm | Decode returned token header | `alg: "RS256"`, `typ: "JWT"` |
+| A-005 | Token contains expected claims | Decode returned token payload | Contains `sub`, `org`, `iat`, `exp` |
+| A-006 | No auth header returns 401 | `GET /v1/runs` with no Authorization | 401 |
+
+---
+
+### 4.3 Run Lifecycle Smoke
+
+Verify the core run management endpoints function end-to-end across the 8-phase execution
+pipeline.
+
+| ID | Test | Input | Expected Result |
+|---|---|---|---|
+| R-001 | Create run | POST `/v1/runs` with valid idea payload | 201, body contains `run.id` |
+| R-002 | Fetch run by ID | GET `/v1/runs/{id}` | 200, body contains `run.id` matching created run |
+| R-003 | List runs | GET `/v1/runs` | 200, body contains runs array with length >= 1 |
+| R-004 | Cancel run | POST `/v1/runs/{id}/cancel` | 200, body contains `status: "cancelled"` |
+| R-005 | Fetch cancelled run | GET `/v1/runs/{id}` | 200, `status: "cancelled"` |
+| R-006 | Fetch non-existent run | GET `/v1/runs/00000000-0000-0000-0000-000000000000` | 404 |
+
+Tests R-002 through R-005 depend on the `run.id` produced by R-001.
+
+---
+
+### 4.4 Gate Smoke
+
+Verify the 9-gate governance system responds correctly to basic gate operations.
+
+| ID | Test | Input | Expected Result |
+|---|---|---|---|
+| G-001 | List gates for run | GET `/v1/runs/{id}/gates` | 200, body contains gates array |
+| G-002 | Gate list is non-empty | Inspect gates array | At least 1 gate object present |
+| G-003 | Gate object has required fields | Inspect first gate | Contains `id`, `name`, `status`, `phase` |
+| G-004 | Approve gate | POST `/v1/gates/{id}/approve` with approver token | 200, `status: "approved"` |
+| G-005 | Gate status updated | GET `/v1/runs/{id}/gates` | Matching gate shows `status: "approved"` |
+
+Approving a gate (G-004) requires a token with the `gate:approve` permission scope. The smoke
+suite uses a pre-configured approver service account for this purpose.
+
+---
+
+### 4.5 Realtime (SSE) Smoke
+
+Verify the Server-Sent Events stream connects and emits heartbeat events.
+
+| ID | Test | Input | Expected Result |
+|---|---|---|---|
+| E-001 | Stream connects | GET `/v1/events/stream` with valid auth | HTTP 200, `Content-Type: text/event-stream` |
+| E-002 | Heartbeat received within 5 s | Hold SSE connection open | At least 1 `event: heartbeat` received |
+| E-003 | Stream closes cleanly | Abort connection client-side | No server-side error logged |
+| E-004 | Run-scoped events arrive | Create a run while stream is open | `event: run.created` received on stream |
+
+The SSE smoke test uses a 5-second connection window. Full realtime isolation and ordering
+tests live in the integration suite.
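+
+Checks such as E-002 and E-004 depend on recognising complete events in a byte stream that can split an event across reads. As a sketch, the parser below returns the complete events in a buffer plus the unparsed remainder. `parseSseFrames` and `SseEvent` are hypothetical names, and only the `event:` and `data:` fields are handled.
+
+```typescript
+interface SseEvent {
+  event: string;
+  data: string;
+}
+
+// Splits a buffer into complete SSE events plus the trailing incomplete text.
+function parseSseFrames(buffer: string): { events: SseEvent[]; rest: string } {
+  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
+  // The last piece is incomplete until its terminating blank line arrives.
+  const rest = frames.pop() ?? "";
+  const events: SseEvent[] = [];
+  for (const frame of frames) {
+    let event = "message"; // default event type when no event: field is present
+    const data: string[] = [];
+    let sawField = false;
+    for (const line of frame.split("\n")) {
+      if (line.startsWith("event:")) {
+        event = line.slice(6).trim();
+        sawField = true;
+      } else if (line.startsWith("data:")) {
+        data.push(line.slice(5).trim());
+        sawField = true;
+      }
+    }
+    // Frames made only of comment lines (": ping") carry no event.
+    if (sawField) events.push({ event, data: data.join("\n") });
+  }
+  return { events, rest };
+}
+```
+
+A reader loop would append each decoded chunk to `rest` and call the parser again, so an event name split across two reads is still matched.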
+
+---
+
+### 4.6 CLI Smoke
+
+Verify the `ck` CLI binary is installed and responds to basic commands.
+
+| ID | Test | Command | Expected Result |
+|---|---|---|---|
+| C-001 | Version flag | `ck --version` | Prints version matching `package.json`, exit code 0 |
+| C-002 | Help command | `ck help` | Prints command list, exit code 0 |
+| C-003 | Run create help | `ck runs create --help` | Prints usage for `runs create`, exit code 0 |
+| C-004 | Run list help | `ck runs list --help` | Prints usage for `runs list`, exit code 0 |
+
+---
+
+## 5. Execution
+
+### 5.1 Running Locally
+
+```bash
+# Start infrastructure dependencies
+docker compose up -d postgres redis
+
+# Run DB migrations
+pnpm db:migrate
+
+# Start the server
+pnpm dev &
+
+# Run smoke tests
+pnpm test:smoke
+
+# Or against a specific base URL
+BASE_URL=https://staging.codekit.internal pnpm test:smoke
+```
+
+### 5.2 CI Configuration
+
+Smoke tests run as the first job in `.github/workflows/ci.yml`. All other jobs depend on it.
+
+```yaml
+jobs:
+  smoke:
+    name: Smoke Tests
+    runs-on: ubuntu-latest
+    timeout-minutes: 5
+    services:
+      postgres:
+        image: postgres:16
+        env:
+          POSTGRES_DB: cku_smoke
+          POSTGRES_USER: cku
+          POSTGRES_PASSWORD: ${{ secrets.CI_DB_PASSWORD }}
+        # Publish the port so steps running on the runner can reach the service at localhost
+        ports:
+          - 5432:5432
+        options: >-
+          --health-cmd pg_isready
+          --health-interval 5s
+          --health-timeout 5s
+          --health-retries 5
+      redis:
+        image: redis:7
+        ports:
+          - 6379:6379
+        options: >-
+          --health-cmd "redis-cli ping"
+          --health-interval 5s
+          --health-retries 5
+    steps:
+      - uses: actions/checkout@v4
+      - uses: pnpm/action-setup@v3
+        with:
+          version: 9
+      - run: pnpm install --frozen-lockfile
+      - run: pnpm db:migrate
+      - run: pnpm build
+      - run: pnpm test:smoke
+        env:
+          BASE_URL: http://localhost:3000
+          SMOKE_CLIENT_ID: ${{ secrets.SMOKE_CLIENT_ID }}
+          SMOKE_CLIENT_SECRET: ${{ secrets.SMOKE_CLIENT_SECRET }}
+
+  lint:
+    needs: smoke
+  unit:
+    needs: smoke
+  integration:
+    needs: smoke
+  build:
+    needs: [lint, unit, integration]
+```
+
+---
+
+## 6. Example Smoke Test Code
+
+All smoke tests live in `tests/smoke/`. They use Vitest with the native `fetch` API.
+
+### 6.1 Startup Checks (`tests/smoke/startup.smoke.test.ts`)
+
+```typescript
+import { describe, it, expect } from "vitest";
+
+const BASE_URL = process.env.BASE_URL ?? "http://localhost:3000";
+
+describe("Startup checks", () => {
+  it("S-002: GET /health returns 200", async () => {
+    const res = await fetch(`${BASE_URL}/health`);
+    expect(res.status).toBe(200);
+  });
+
+  it("S-003: GET /health reports db connected", async () => {
+    const res = await fetch(`${BASE_URL}/health`);
+    const body = await res.json();
+    expect(body.db).toBe("connected");
+  });
+
+  it("S-004: GET /health reports redis connected", async () => {
+    const res = await fetch(`${BASE_URL}/health`);
+    const body = await res.json();
+    expect(body.redis).toBe("connected");
+  });
+
+  it("S-005: GET /ready returns 200", async () => {
+    const res = await fetch(`${BASE_URL}/ready`);
+    expect(res.status).toBe(200);
+  });
+});
+```
+
+### 6.2 Auth Smoke (`tests/smoke/auth.smoke.test.ts`)
+
+```typescript
+import { describe, it, expect } from "vitest";
+import { decodeProtectedHeader, decodeJwt } from "jose";
+
+const BASE_URL = process.env.BASE_URL ?? "http://localhost:3000";
+const CLIENT_ID = process.env.SMOKE_CLIENT_ID ?? "smoke-test-client";
+const CLIENT_SECRET = process.env.SMOKE_CLIENT_SECRET ?? "";
+
+async function getToken(): Promise<string> {
+  const res = await fetch(`${BASE_URL}/auth/token`, {
+    method: "POST",
+    headers: { "Content-Type": "application/json" },
+    body: JSON.stringify({ client_id: CLIENT_ID, client_secret: CLIENT_SECRET }),
+  });
+  const body = await res.json();
+  return body.access_token as string;
+}
+
+describe("Auth smoke tests", () => {
+  it("A-001: valid credentials return access_token", async () => {
+    const res = await fetch(`${BASE_URL}/auth/token`, {
+      method: "POST",
+      headers: { "Content-Type": "application/json" },
+      body: JSON.stringify({ client_id: CLIENT_ID, client_secret: CLIENT_SECRET }),
+    });
+    expect(res.status).toBe(200);
+    const body = await res.json();
+    expect(body).toHaveProperty("access_token");
+    expect(typeof body.access_token).toBe("string");
+  });
+
+  it("A-002: invalid credentials return 401", async () => {
+    const res = await fetch(`${BASE_URL}/auth/token`, {
+      method: "POST",
+      headers: { "Content-Type": "application/json" },
+      body: JSON.stringify({ client_id: CLIENT_ID, client_secret: "wrong-secret" }),
+    });
+    expect(res.status).toBe(401);
+  });
+
+  it("A-003: missing body returns 400", async () => {
+    const res = await fetch(`${BASE_URL}/auth/token`, {
+      method: "POST",
+      headers: { "Content-Type": "application/json" },
+      body: "{}",
+    });
+    expect(res.status).toBe(400);
+  });
+
+  it("A-004: returned token uses RS256 algorithm", async () => {
+    const token = await getToken();
+    const header = decodeProtectedHeader(token);
+    expect(header.alg).toBe("RS256");
+  });
+
+  it("A-005: token payload contains required claims", async () => {
+    const token = await getToken();
+    const claims = decodeJwt(token);
+    expect(claims.sub).toBeDefined();
+    expect(claims.org).toBeDefined();
+    expect(claims.iat).toBeDefined();
+    expect(claims.exp).toBeDefined();
+  });
+
+  it("A-006: no Authorization header returns 401", async () => {
+    const res = await fetch(`${BASE_URL}/v1/runs`);
+    expect(res.status).toBe(401);
+  });
+});
+```
+
+### 6.3 Run Lifecycle (`tests/smoke/runs.smoke.test.ts`)
+
+```typescript
+import { describe, it, expect, beforeAll } from "vitest";
+
+const BASE_URL = process.env.BASE_URL ?? "http://localhost:3000";
+let authToken: string;
+let runId: string;
+
+beforeAll(async () => {
+  const res = await fetch(`${BASE_URL}/auth/token`, {
+    method: "POST",
+    headers: { "Content-Type": "application/json" },
+    body: JSON.stringify({
+      client_id: process.env.SMOKE_CLIENT_ID,
+      client_secret: process.env.SMOKE_CLIENT_SECRET,
+    }),
+  });
+  const body = await res.json();
+  authToken = body.access_token;
+});
+
+function authHeaders() {
+  return {
+    Authorization: `Bearer ${authToken}`,
+    "Content-Type": "application/json",
+  };
+}
+
+describe("Run lifecycle smoke", () => {
+  it("R-001: POST /v1/runs creates a run", async () => {
+    const res = await fetch(`${BASE_URL}/v1/runs`, {
+      method: "POST",
+      headers: authHeaders(),
+      body: JSON.stringify({ idea: "Smoke test run — auto-generated, safe to ignore" }),
+    });
+    expect(res.status).toBe(201);
+    const body = await res.json();
+    expect(body.run).toHaveProperty("id");
+    runId = body.run.id;
+  });
+
+  it("R-002: GET /v1/runs/{id} returns the run", async () => {
+    const res = await fetch(`${BASE_URL}/v1/runs/${runId}`, { headers: authHeaders() });
+    expect(res.status).toBe(200);
+    const body = await res.json();
+    expect(body.run.id).toBe(runId);
+    expect(body.run.status).toBeDefined();
+  });
+
+  it("R-003: GET /v1/runs includes the created run", async () => {
+    const res = await fetch(`${BASE_URL}/v1/runs`, { headers: authHeaders() });
+    expect(res.status).toBe(200);
+    const body = await res.json();
+    expect(Array.isArray(body.runs)).toBe(true);
+    expect(body.runs.some((r: { id: string }) => r.id === runId)).toBe(true);
+  });
+
+  it("R-004: POST /v1/runs/{id}/cancel cancels the run", async () => {
+    const res = await fetch(`${BASE_URL}/v1/runs/${runId}/cancel`, {
+      method: "POST",
+      headers: authHeaders(),
+    });
+    expect(res.status).toBe(200);
+    const body = await res.json();
+    expect(body.status).toBe("cancelled");
+  });
+
+  it("R-006: GET non-existent run returns 404", async () => {
+    const res = await fetch(
+      `${BASE_URL}/v1/runs/00000000-0000-0000-0000-000000000000`,
+      { headers: authHeaders() }
+    );
+    expect(res.status).toBe(404);
+  });
+});
+```
+
+### 6.4 SSE Stream (`tests/smoke/sse.smoke.test.ts`)
+
+```typescript
+import { describe, it, expect } from "vitest";
+
+const BASE_URL = process.env.BASE_URL ?? "http://localhost:3000";
+
+async function getToken(): Promise<string> {
+  const res = await fetch(`${BASE_URL}/auth/token`, {
+    method: "POST",
+    headers: { "Content-Type": "application/json" },
+    body: JSON.stringify({
+      client_id: process.env.SMOKE_CLIENT_ID,
+      client_secret: process.env.SMOKE_CLIENT_SECRET,
+    }),
+  });
+  const body = await res.json();
+  return body.access_token as string;
+}
+
+describe("SSE realtime smoke", () => {
+  it("E-001 / E-002: stream connects and receives heartbeat within 5s", async () => {
+    const token = await getToken();
+    const controller = new AbortController();
+    const timeout = setTimeout(() => controller.abort(), 5000);
+    let heartbeatReceived = false;
+
+    try {
+      const res = await fetch(`${BASE_URL}/v1/events/stream`, {
+        headers: { Authorization: `Bearer ${token}` },
+        signal: controller.signal,
+      });
+
+      expect(res.status).toBe(200);
+      expect(res.headers.get("content-type")).toContain("text/event-stream");
+
+      const reader = res.body!.getReader();
+      const decoder = new TextDecoder();
+      let buffer = "";
+
+      while (true) {
+        const { done, value } = await reader.read();
+        if (done) break;
+        // Accumulate chunks so an event name split across two reads is still matched
+        buffer += decoder.decode(value, { stream: true });
+        if (buffer.includes("event: heartbeat") || buffer.includes("event: connected")) {
+          heartbeatReceived = true;
+          await reader.cancel();
+          break;
+        }
+      }
+    } catch (e: unknown) {
+      if ((e as Error).name !== "AbortError") throw e;
+    } finally {
+      clearTimeout(timeout);
+    }
+
+    expect(heartbeatReceived).toBe(true);
+  });
+});
+```
+
+### 6.5 CLI Smoke (`tests/smoke/cli.smoke.test.ts`)
+
+```typescript
+import { describe, it, expect } from "vitest";
+import { execSync } from "child_process";
+
+describe("CLI smoke", () => {
+  it("C-001: ck --version exits 0 and prints a version string", () => {
+    const output = execSync("ck --version", { encoding: "utf-8" });
+    expect(output).toMatch(/\d+\.\d+\.\d+/);
+  });
+
+  it("C-002: ck help exits 0 and lists commands", () => {
+    const output = execSync("ck help", { encoding: "utf-8" });
+    expect(output.toLowerCase()).toContain("run");
+  });
+
+  it("C-003: ck runs create --help exits 0", () => {
+    expect(() => execSync("ck runs create --help")).not.toThrow();
+  });
+
+  it("C-004: ck runs list --help exits 0", () => {
+    expect(() => execSync("ck runs list --help")).not.toThrow();
+  });
+});
+```
+
+---
+
+## 7. Pass Criteria
+
+A smoke run **passes** when:
+
+1. Every test in every category exits with `pass`.
+2. Total wall-clock time is under **5 minutes**.
+3. Each individual test completes within its 10-second timeout.
+4. The test runner exits with code `0`.
+
+A smoke run **fails** when:
+
+- Any single assertion fails.
+- Any test times out (individual timeout: 10 seconds).
+- The test runner itself crashes.
+- Total runtime exceeds 5 minutes.
+
+---
+
+## 8. Failure Behaviour
+
+When any smoke test fails in CI:
+
+1. The failing test name, assertion message, and stack trace are printed to the job log.
+2. All downstream CI jobs (`lint`, `unit`, `integration`, `build`, `deploy`) are
+   **immediately cancelled**.
+3. The PR or branch is marked **failing** — merge is blocked.
+4. The on-call engineer is notified via the configured alert channel.
+5. No deployment to any environment proceeds until smoke tests pass on a new commit.
+
+**There are no exceptions to this policy.** A failing smoke test always blocks deployment.
+
+---
+
+## 9. Maintenance
+
+| Action | Owner | Frequency |
+|---|---|---|
+| Review smoke test coverage after each release | Engineering Lead | Per release |
+| Update `BASE_URL` targets when environments change | Infra | As needed |
+| Add smoke test for each new top-level API endpoint | Feature owner | Per feature |
+| Review total duration; refactor if approaching 4 minutes | Engineering | Monthly |
+| Fix or quarantine flaky smoke tests | Feature owner | Within 2 business days of detection |
+
+Flaky smoke tests undermine the gate. If a smoke test fails intermittently it must be
+stabilised or removed from the smoke suite immediately and tracked as a bug.
+
+---
+
+## 10. Related Documents
+
+- `docs/06_validation/PRODUCTION_READINESS.md` — full production checklist
+- `docs/06_validation/GO_NO_GO_CHECKLIST.md` — release gate criteria
+- `docs/06_validation/SECURITY_TESTING_PLAN.md` — security-focused test cases
+- `docs/06_validation/TEST_STRATEGY.md` — overall test strategy
+- `.github/workflows/ci.yml` — CI pipeline configuration
+- `SECURITY_AUDIT.md` — open risk register
diff --git a/docs/06_validation/TEST_PLAN_AUTH.md b/docs/06_validation/TEST_PLAN_AUTH.md
new file mode 100644
index 0000000..1c6dd26
--- /dev/null
+++ b/docs/06_validation/TEST_PLAN_AUTH.md
@@ -0,0 +1,523 @@
+# Test Plan — Auth and Session
+
+**Version**: 1.0.0
+**Date**: 2026-04-04
+**Status**: Active
+**Owner**: Platform Security
+**Related packages**: `packages/auth`
+
+---
+
+## Table of Contents
+
+1. [Scope](#1-scope)
+2. [Test Cases — InsForge Token Verification](#2-test-cases--insforge-token-verification)
+3. [Test Cases — Session Resolution](#3-test-cases--session-resolution)
+4. [Test Cases — Execution Token](#4-test-cases--execution-token)
+5. [Test Cases — Service Account](#5-test-cases--service-account)
+6. [Mock Requirements](#6-mock-requirements)
+7. [Test File Locations](#7-test-file-locations)
+8. [Example Test Code](#8-example-test-code)
+
+---
+
+## 1. Scope
+
+This test plan covers the four source files that form the auth package's public surface:
+
+| Source file | Exported symbols | Risk level |
+|---|---|---|
+| `packages/auth/src/verify-insforge-token.ts` | `verifyInsForgeToken` | Critical |
+| `packages/auth/src/resolve-session.ts` | `resolveInsForgeSession`, `mapClaimsToSession` | Critical |
+| `packages/auth/src/issue-execution-token.ts` | `issueExecutionToken` | High |
+| `packages/auth/src/service-account.ts` | `ServiceAccountAuth` | High |
+
+The existing tests in `verify-insforge-token.test.ts` and `resolve-session.test.ts` cover the happy
+path and basic error cases. This plan documents the full set of cases required to reach the ≥ 90%
+coverage target and satisfy the security requirements in `TEST_STRATEGY.md`.
+
+Out of scope: HTTP middleware, rate limiting, refresh token rotation. Those are tested in
+`apps/control-service/test/session.test.ts`.
+
+---
+
+## 2. Test Cases — InsForge Token Verification
+
+**File**: `packages/auth/src/verify-insforge-token.test.ts`
+**Function under test**: `verifyInsForgeToken(token: string): Promise<Record<string, unknown>>`
+
+### Existing Coverage (do not duplicate)
+
+- Valid RS256 JWT returns decoded claims. (PASS)
+- Invalid signature rejects with error message. (PASS)
+- Expired token rejects with error message. (PASS)
+- Missing env config rejects with configuration error. (PASS)
+
+### Missing Test Cases
+
+#### TC-AUTH-001: Wrong issuer throws descriptive error
+
+```
+Given: a valid RS256 JWT whose `iss` claim is "https://evil.example.com"
+When: verifyInsForgeToken is called
+Then: throws an error containing "InsForge token verification failed"
+```
+
+Rationale: issuer mismatch is the primary defence against token replay from a foreign IdP.
+
+#### TC-AUTH-002: Wrong audience throws descriptive error
+
+```
+Given: a valid RS256 JWT whose `aud` claim is "other-service"
+When: verifyInsForgeToken is called with INSFORGE_JWT_AUDIENCE="cku-api"
+Then: throws an error containing "InsForge token verification failed"
+```
+
+#### TC-AUTH-003: JWKS key fetch failure triggers retry then throws
+
+```
+Given: the JWKS endpoint returns HTTP 503 on every attempt
+When: verifyInsForgeToken is called
+Then:
+  - the JWKS client is called at least twice (retry behaviour)
+  - the promise rejects with an error indicating JWKS unavailability
+```
+
+#### TC-AUTH-004: JWKS key fetch succeeds on second attempt (transient failure)
+
+```
+Given: the JWKS endpoint returns HTTP 503 on attempt 1 and HTTP 200 on attempt 2
+When: verifyInsForgeToken is called
+Then: the promise resolves with the decoded claims (no error thrown)
+```
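+
+TC-AUTH-003 and TC-AUTH-004 both depend on retry behaviour. The sketch below shows one way a test
+could drive it with an injected fetcher. `fetchWithRetry` and `flakyFetcher` are hypothetical names
+used for illustration only; the real JWKS client in `packages/auth` may configure retries differently.
+
+```typescript
+// Hypothetical helper: the real JWKS client may implement retries differently.
+type KeyFetcher = () => Promise<string>;
+
+async function fetchWithRetry(fetchKeys: KeyFetcher, maxAttempts = 2): Promise<string> {
+  let lastError: unknown;
+  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
+    try {
+      return await fetchKeys();
+    } catch (err) {
+      lastError = err; // remember the failure and try again
+    }
+  }
+  const reason = lastError instanceof Error ? lastError.message : String(lastError);
+  throw new Error(`JWKS unavailable after ${maxAttempts} attempts: ${reason}`);
+}
+
+// Fake fetcher that fails for the first `failures` calls, then succeeds.
+function flakyFetcher(failures: number): { fetch: KeyFetcher; calls: () => number } {
+  let calls = 0;
+  return {
+    fetch: async () => {
+      calls += 1;
+      if (calls <= failures) throw new Error("HTTP 503");
+      return "jwks-payload";
+    },
+    calls: () => calls,
+  };
+}
+```
+
+With `flakyFetcher(1)` the helper resolves on the second call (TC-AUTH-004). With a fetcher that
+never recovers it rejects after `maxAttempts` calls, which is the shape TC-AUTH-003 asserts.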
+
+#### TC-AUTH-005: Token with unknown `kid` throws
+
+```
+Given: a JWT signed with a key whose `kid` is not present in the JWKS response
+When: verifyInsForgeToken is called
+Then: throws an error (key not found in JWKS)
+```
+
+#### TC-AUTH-006: Revoked token (jti in Redis blacklist) throws
+
+```
+Given: a structurally valid, unexpired JWT
+And: the token's `jti` claim is present in the Redis revocation set
+When: verifyInsForgeToken is called (with revocation check enabled)
+Then: throws an error indicating the token has been revoked
+Note: this test requires a Redis mock (see Section 6)
+```
+
+#### TC-AUTH-007: Token with no `jti` claim is accepted (revocation check skipped)
+
+```
+Given: a structurally valid JWT with no `jti` claim
+And: the revocation store is empty
+When: verifyInsForgeToken is called
+Then: resolves successfully with decoded claims
+```
+
+---
+
+## 3. Test Cases — Session Resolution
+
+**File**: `packages/auth/src/resolve-session.test.ts`
+**Functions under test**: `resolveInsForgeSession`, `mapClaimsToSession`
+
+### Existing Coverage (do not duplicate)
+
+- `mapClaimsToSession`: valid full claims → correct `ResolvedSession` shape. (PASS)
+- `mapClaimsToSession`: minimal claims → default `viewer` role, `default-org`. (PASS)
+- `resolveInsForgeSession`: delegates to `verifyInsForgeToken` and maps correctly. (PASS)
+
+### Missing Test Cases
+
+#### TC-SESSION-001: Service account JWT returns `service_account` actor type
+
+```
+Given: a token whose decoded payload contains `type: "service_account"`
+When: resolveSession(token) is called (the unified resolver)
+Then: the returned ResolvedSession has actor.actorType === "service_account"
+```
+
+#### TC-SESSION-002: Legacy API key header returns `legacy_api_key` actor type
+
+```
+Given: an Authorization header value of "ApiKey sk-live-abc123"
+When: resolveSession is called with this header
+Then: returns a ResolvedSession with actor.actorType === "legacy_api_key"
+And: actor.roles defaults to ["viewer"] unless the API key record grants more
+```
+
+#### TC-SESSION-003: Missing Authorization header throws UnauthorizedError
+
+```
+Given: no Authorization header (undefined or empty string)
+When: resolveSession is called
+Then: throws an error with a message suitable for a 401 response
+```
+
+#### TC-SESSION-004: Malformed bearer token throws MalformedTokenError
+
+```
+Given: Authorization header value of "Bearer not.a.jwt"
+When: resolveSession is called
+Then: throws an error indicating the token is malformed (not a valid JWT structure)
+Note: must NOT leak the raw token value in the error message
+```
+
+#### TC-SESSION-005: Claims with multiple roles accumulate all permissions
+
+```
+Given: claims with roles: ["operator", "reviewer"]
+When: mapClaimsToSession is called
+Then: session.actor.roles contains both "operator" and "reviewer"
+```
+
+#### TC-SESSION-006: Role alias normalisation
+
+```
+Given: claims with roles: ["Owner", "Reviewer"]  (InsForge capitalised aliases)
+When: mapClaimsToSession is called
+Then: session.actor.roles resolves to ["admin", "reviewer"] after alias normalisation
+```
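+
+A minimal sketch of the alias normalisation that TC-SESSION-006 expects. The alias table is an
+assumption: only the `Owner` to `admin` mapping is stated above, and `normaliseRoles` is a
+hypothetical name rather than an export of `resolve-session.ts`.
+
+```typescript
+// Assumed alias table: only the Owner -> admin mapping is stated in TC-SESSION-006.
+const ROLE_ALIASES: Record<string, string> = { owner: "admin" };
+
+// Lower-case each role, apply aliases, and drop duplicates while keeping order.
+function normaliseRoles(rawRoles: readonly string[]): string[] {
+  const seen = new Set<string>();
+  for (const raw of rawRoles) {
+    const lower = raw.trim().toLowerCase();
+    seen.add(ROLE_ALIASES[lower] ?? lower);
+  }
+  return Array.from(seen); // a Set preserves insertion order
+}
+```
+
+Under these assumptions, `normaliseRoles(["Owner", "Reviewer"])` yields `["admin", "reviewer"]`.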
+
+---
+
+## 4. Test Cases — Execution Token
+
+**File**: `packages/auth/src/issue-execution-token.test.ts` (to be created)
+**Function under test**: `issueExecutionToken(scope: ExecutionScope): Promise<string>`
+
+All test cases in this section are new (no existing coverage).
+
+#### TC-EXEC-001: Issued token is a valid HS256 JWT
+
+```
+Given: a valid ExecutionScope with runId, orgId, workspaceId, projectId, actorId
+And: INSFORGE_SERVICE_ROLE_KEY is set
+When: issueExecutionToken(scope) is called
+Then:
+  - the returned string is a three-part JWT (header.payload.signature)
+  - jwt.decode(token, { complete: true }).header.alg === "HS256"
+```
+
+#### TC-EXEC-002: Token payload contains all scope fields
+
+```
+Given: scope = { runId: "run-001", tenant: { orgId: "org-1", workspaceId: "ws-1", projectId: "proj-1" }, actor: { actorId: "user-1", ... }, correlationId: "corr-1" }
+When: issueExecutionToken(scope) is called
+Then: decoded token payload includes:
+  - sub === "user-1"
+  - run_id === "run-001"
+  - org_id === "org-1"
+  - workspace_id === "ws-1"
+  - project_id === "proj-1"
+  - correlation_id === "corr-1"
+```
+
+#### TC-EXEC-003: Token expires 10 minutes after issuance
+
+```
+Given: current time is T
+When: issueExecutionToken(scope) is called
+Then: decoded token exp ≈ T + 600 seconds (within a 5-second tolerance)
+```
+
+```typescript
+it("should expire the execution token after 10 minutes", async () => {
+  const before = Math.floor(Date.now() / 1000);
+  const token = await issueExecutionToken(testScope);
+  const decoded = jwt.decode(token) as any;
+  const after = Math.floor(Date.now() / 1000);
+
+  expect(decoded.exp).toBeGreaterThanOrEqual(before + 600);
+  expect(decoded.exp).toBeLessThanOrEqual(after + 600 + 5);
+});
+```
+
+#### TC-EXEC-004: Token is rejected if verified with wrong secret
+
+```
+Given: a token issued with secret "correct-secret"
+When: jwt.verify is called with "wrong-secret"
+Then: throws JsonWebTokenError (invalid signature)
+```
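+
+To make the mechanics behind TC-EXEC-004 concrete, here is a dependency-free sketch of HS256
+signing and verification using only `node:crypto`. It is not how `issueExecutionToken` is
+implemented (the skeleton in Section 8 uses `jsonwebtoken`); `signHs256` and `verifyHs256` are
+hypothetical helpers, and the claim names simply mirror TC-EXEC-002.
+
+```typescript
+import { createHmac, timingSafeEqual } from "node:crypto";
+
+const b64url = (text: string): string => Buffer.from(text, "utf8").toString("base64url");
+
+// Sign a payload as an HS256 JWT (header.payload.signature).
+function signHs256(payload: Record<string, unknown>, secret: string): string {
+  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
+  const body = b64url(JSON.stringify(payload));
+  const signature = createHmac("sha256", secret).update(`${header}.${body}`).digest("base64url");
+  return `${header}.${body}.${signature}`;
+}
+
+// Recompute the signature with the given secret and compare in constant time.
+function verifyHs256(token: string, secret: string): boolean {
+  const parts = token.split(".");
+  if (parts.length !== 3) return false;
+  const [header, body, signature] = parts as [string, string, string];
+  const expected = createHmac("sha256", secret).update(`${header}.${body}`).digest();
+  const actual = Buffer.from(signature, "base64url");
+  return actual.length === expected.length && timingSafeEqual(actual, expected);
+}
+
+const issuedAt = Math.floor(Date.now() / 1000);
+const sampleToken = signHs256(
+  { sub: "user-1", run_id: "run-001", exp: issuedAt + 600 },
+  "correct-secret"
+);
+```
+
+Verifying `sampleToken` with `"correct-secret"` succeeds and with `"wrong-secret"` fails, because
+the recomputed HMAC no longer matches the signature segment.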
+
+#### TC-EXEC-005: Missing INSFORGE_SERVICE_ROLE_KEY throws configuration error
+
+```
+Given: INSFORGE_SERVICE_ROLE_KEY is not set in process.env
+When: issueExecutionToken(scope) is called
+Then: throws an error containing "Internal execution token foundation"
+```
+
+#### TC-EXEC-006: Token audience is scoped to execution-engine-worker
+
+```
+Given: a valid scope
+When: issueExecutionToken(scope) is called
+Then: decoded token aud === "execution-engine-worker"
+```
+
+---
+
+## 5. Test Cases — Service Account
+
+**File**: `packages/auth/src/service-account.test.ts` (to be created)
+**Class under test**: `ServiceAccountAuth`
+
+#### TC-SA-001: issueToken returns a decodable JWT with correct payload
+
+```
+Given: a ServiceAccount object with id, name, orgId, workspaceId, roles
+When: ServiceAccountAuth.issueToken(sa) is called
+Then:
+  - returns a non-empty string
+  - jwt.decode(token).sub === sa.id
+  - jwt.decode(token).type === "service_account"
+  - jwt.decode(token).orgId === sa.orgId
+  - jwt.decode(token).roles deep-equals sa.roles
+```
+
+#### TC-SA-002: issueToken with custom expiry is reflected in exp claim
+
+```
+Given: ServiceAccountAuth.issueToken(sa, "1h")
+When: the token is decoded
+Then: token.exp ≈ now + 3600 (within 5-second tolerance)
+```
+
+#### TC-SA-003: verifyToken with valid SA token returns ResolvedSession
+
+```
+Given: a token issued by issueToken for a SA with roles: ["operator"]
+When: ServiceAccountAuth.verifyToken(token) is called
+Then:
+  - returns a ResolvedSession where actor.actorType === "service_account"
+  - actor.roles deep-equals ["operator"]
+  - tenant.orgId === sa.orgId
+  - tenant.workspaceId === sa.workspaceId
+```
+
+#### TC-SA-004: verifyToken with expired token throws
+
+```
+Given: a token issued with expiresIn: "1ms" (immediately expired)
+When: ServiceAccountAuth.verifyToken(token) is called after a brief delay
+Then: throws an error containing "Service Account verification failed"
+```
+
+#### TC-SA-005: verifyToken with a non-SA token (user JWT) throws
+
+```
+Given: a standard user JWT that does NOT contain type: "service_account"
+When: ServiceAccountAuth.verifyToken(token) is called
+Then: throws "Invalid token type: Not a service account token"
+```
+
+#### TC-SA-006: isServiceAccountToken correctly identifies SA tokens
+
+```
+Given: a token with type: "service_account" in payload
+When: ServiceAccountAuth.isServiceAccountToken(token)
+Then: returns true
+
+Given: a token with no type field
+When: ServiceAccountAuth.isServiceAccountToken(token)
+Then: returns false
+```
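+
+TC-SA-006 only needs the payload to be readable, not verified. A sketch of that kind of check,
+assuming the classifier inspects the unverified payload (an assumption about
+`ServiceAccountAuth.isServiceAccountToken`, whose source is not shown here). All function names
+below are hypothetical.
+
+```typescript
+// Decode the payload segment WITHOUT verifying the signature.
+function decodePayload(token: string): Record<string, unknown> | null {
+  const parts = token.split(".");
+  if (parts.length !== 3) return null;
+  try {
+    const json = Buffer.from(parts[1] ?? "", "base64url").toString("utf8");
+    const parsed: unknown = JSON.parse(json);
+    return typeof parsed === "object" && parsed !== null
+      ? (parsed as Record<string, unknown>)
+      : null;
+  } catch {
+    return null; // payload segment is not valid JSON
+  }
+}
+
+// Classification only: a true result says nothing about whether the token is valid.
+function looksLikeServiceAccountToken(token: string): boolean {
+  return decodePayload(token)?.type === "service_account";
+}
+
+// Builds a token-shaped string for illustration; the signature segment is a placeholder.
+function fakeToken(payload: Record<string, unknown>): string {
+  const enc = (value: unknown): string =>
+    Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
+  return `${enc({ alg: "HS256", typ: "JWT" })}.${enc(payload)}.sig`;
+}
+```
+
+Because the signature is never checked, a result of `true` should only route the token to
+`verifyToken`; it must not be treated as authentication.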
+
+#### TC-SA-007: Secret rotation — old token rejected after secret change (integration)
+
+```
+Given: a token issued with OLD_SECRET
+And: CKU_SERVICE_ACCOUNT_SECRET env var is changed to NEW_SECRET
+When: ServiceAccountAuth.verifyToken(old_token) is called
+Then: throws (invalid signature with new secret)
+
+And: a new token issued with NEW_SECRET
+When: ServiceAccountAuth.verifyToken(new_token) is called
+Then: resolves successfully
+```
+
+---
+
+## 6. Mock Requirements
+
+### JWKS Mock
+
+The JWKS mock must be started by the `globalSetup` file registered in `vitest.config.ts`. That file exports a `setup` function that starts the server and a `teardown` function that stops it.
+
+```typescript
+// vitest.config.ts
+import { defineConfig } from "vitest/config";
+
+export default defineConfig({
+  test: {
+    globalSetup: ["./tests/setup/jwks-server.ts"],
+    env: {
+      INSFORGE_JWKS_URL: "http://localhost:9999/.well-known/jwks.json",
+    },
+  },
+});
+```
+
+Individual tests that need to simulate JWKS failures should use `vi.mock("jwks-rsa")` to override
+the module-level client, as demonstrated in the existing `verify-insforge-token.test.ts`.
+
+### Redis Mock (for jti revocation — TC-AUTH-006)
+
+Use `ioredis-mock` or `vi.mock` the Redis client module. The mock must support:
+- `SADD key value` — add a jti to the revocation set.
+- `SISMEMBER key value` — check if a jti is revoked.
+
+```typescript
+vi.mock("ioredis", () => ({
+  default: vi.fn().mockImplementation(() => ({
+    sadd: vi.fn(),
+    sismember: vi.fn().mockResolvedValue(1), // 1 = member exists (revoked)
+  })),
+}));
+```
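+
+The stub above reports every `jti` as revoked, which suits TC-AUTH-006 but not TC-AUTH-007. As an
+alternative, a small stateful fake can cover both cases. This is a sketch only: the key name
+`revoked-jtis` and the `isRevoked` helper are hypothetical, and the real check lives inside
+`verifyInsForgeToken`.
+
+```typescript
+// Hand-rolled stand-in for the two Redis set commands the revocation check needs.
+class FakeRevocationStore {
+  private readonly sets = new Map<string, Set<string>>();
+
+  // Mirrors SADD: returns the number of members that were newly added.
+  async sadd(key: string, member: string): Promise<number> {
+    const members = this.sets.get(key) ?? new Set<string>();
+    const added = members.has(member) ? 0 : 1;
+    members.add(member);
+    this.sets.set(key, members);
+    return added;
+  }
+
+  // Mirrors SISMEMBER: 1 when the member is in the set, 0 otherwise.
+  async sismember(key: string, member: string): Promise<number> {
+    return this.sets.get(key)?.has(member) ? 1 : 0;
+  }
+}
+
+// Hypothetical key name, chosen for this example only.
+const REVOCATION_KEY = "revoked-jtis";
+
+async function isRevoked(store: FakeRevocationStore, claims: { jti?: string }): Promise<boolean> {
+  if (!claims.jti) return false; // TC-AUTH-007: no jti, so the check is skipped
+  return (await store.sismember(REVOCATION_KEY, claims.jti)) === 1;
+}
+```
+
+A test can call `sadd` to revoke one `jti` and then assert that only that token is rejected.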
+
+---
+
+## 7. Test File Locations
+
+| File | Status | Notes |
+|---|---|---|
+| `packages/auth/src/verify-insforge-token.test.ts` | Exists — extend with TC-AUTH-001 through TC-AUTH-007 | |
+| `packages/auth/src/resolve-session.test.ts` | Exists — extend with TC-SESSION-001 through TC-SESSION-006 | |
+| `packages/auth/src/issue-execution-token.test.ts` | To be created | TC-EXEC-001 through TC-EXEC-006 |
+| `packages/auth/src/service-account.test.ts` | To be created | TC-SA-001 through TC-SA-007 |
+| `packages/auth/test/session-revocation.test.ts` | To be created | Integration test; requires Redis mock |
+
+---
+
+## 8. Example Test Code
+
+### issue-execution-token.test.ts (skeleton)
+
+```typescript
+import { describe, it, expect, beforeEach, afterEach } from "vitest";
+import jwt from "jsonwebtoken";
+import { issueExecutionToken } from "./issue-execution-token";
+import type { ExecutionScope } from "../../shared/src/types";
+
+const TEST_SECRET = "test-service-role-key-must-be-32-chars";
+
+const testScope: ExecutionScope = {
+  runId: "run-tc-001",
+  tenant: { orgId: "org-001", workspaceId: "ws-001", projectId: "proj-001" },
+  actor: { actorId: "user-001", actorType: "user", actorName: "Test User", roles: ["operator"] },
+  correlationId: "corr-001",
+};
+
+describe("issueExecutionToken", () => {
+  beforeEach(() => {
+    process.env.INSFORGE_SERVICE_ROLE_KEY = TEST_SECRET;
+  });
+
+  afterEach(() => {
+    delete process.env.INSFORGE_SERVICE_ROLE_KEY;
+  });
+
+  it("should return a valid HS256 JWT containing all scope fields", async () => {
+    const token = await issueExecutionToken(testScope);
+    const decoded = jwt.decode(token) as Record<string, unknown>;
+
+    expect(typeof token).toBe("string");
+    expect(token.split(".")).toHaveLength(3);
+    expect(decoded.sub).toBe("user-001");
+    expect(decoded.run_id).toBe("run-tc-001");
+    expect(decoded.org_id).toBe("org-001");
+    expect(decoded.workspace_id).toBe("ws-001");
+    expect(decoded.project_id).toBe("proj-001");
+    expect(decoded.correlation_id).toBe("corr-001");
+  });
+
+  it("should expire the token after 10 minutes", async () => {
+    const before = Math.floor(Date.now() / 1000);
+    const token = await issueExecutionToken(testScope);
+    const decoded = jwt.decode(token) as any;
+
+    expect(decoded.exp).toBeGreaterThanOrEqual(before + 600);
+    expect(decoded.exp).toBeLessThanOrEqual(before + 605);
+  });
+
+  it("should throw when INSFORGE_SERVICE_ROLE_KEY is not configured", async () => {
+    delete process.env.INSFORGE_SERVICE_ROLE_KEY;
+    await expect(issueExecutionToken(testScope)).rejects.toThrow(
+      "Internal execution token foundation"
+    );
+  });
+
+  it("should be rejected when verified with the wrong secret", async () => {
+    const token = await issueExecutionToken(testScope);
+    expect(() => jwt.verify(token, "wrong-secret")).toThrow();
+  });
+});
+```
+
+### service-account.test.ts (skeleton)
+
+```typescript
+import { describe, it, expect, beforeEach, afterEach } from "vitest";
+import jwt from "jsonwebtoken";
+import { ServiceAccountAuth, type ServiceAccount } from "./service-account";
+
+const TEST_SECRET = "test-secret";
+
+const testSA: ServiceAccount = {
+  id: "sa-001",
+  name: "CI Bot",
+  orgId: "org-001",
+  workspaceId: "ws-001",
+  roles: ["operator"],
+};
+
+describe("ServiceAccountAuth", () => {
+  let previousSecret: string | undefined;
+
+  beforeEach(() => {
+    // Assumes the secret is read from process.env at call time, not at import time.
+    previousSecret = process.env.CKU_SERVICE_ACCOUNT_SECRET;
+    process.env.CKU_SERVICE_ACCOUNT_SECRET = TEST_SECRET;
+  });
+
+  afterEach(() => {
+    // Assigning undefined to process.env stores the string "undefined", so delete instead.
+    if (previousSecret === undefined) {
+      delete process.env.CKU_SERVICE_ACCOUNT_SECRET;
+    } else {
+      process.env.CKU_SERVICE_ACCOUNT_SECRET = previousSecret;
+    }
+  });
+
+  describe("issueToken", () => {
+    it("should return a JWT with service_account type claim", () => {
+      const token = ServiceAccountAuth.issueToken(testSA);
+      const decoded = jwt.decode(token) as Record<string, unknown>;
+      expect(decoded.type).toBe("service_account");
+      expect(decoded.sub).toBe("sa-001");
+    });
+  });
+
+  describe("verifyToken", () => {
+    it("should return a ResolvedSession with service_account actor type", async () => {
+      const token = ServiceAccountAuth.issueToken(testSA);
+      const session = await ServiceAccountAuth.verifyToken(token);
+      expect(session.actor.actorType).toBe("service_account");
+      expect(session.actor.roles).toEqual(["operator"]);
+      expect(session.tenant.orgId).toBe("org-001");
+    });
+
+    it("should throw when the token does not have service_account type", async () => {
+      // A plain JWT without the type field, signed with the configured secret so that
+      // only the missing type claim can cause the rejection.
+      const plain = jwt.sign({ sub: "user-1" }, TEST_SECRET);
+
+      await expect(ServiceAccountAuth.verifyToken(plain)).rejects.toThrow(
+        "Invalid token type: Not a service account token"
+      );
+    });
+  });
+
+  describe("isServiceAccountToken", () => {
+    it("should return true for a token with type: service_account", () => {
+      const token = ServiceAccountAuth.issueToken(testSA);
+      expect(ServiceAccountAuth.isServiceAccountToken(token)).toBe(true);
+    });
+
+    it("should return false for a token without the type field", () => {
+      const token = jwt.sign({ sub: "user-1" }, "any-secret");
+      expect(ServiceAccountAuth.isServiceAccountToken(token)).toBe(false);
+    });
+  });
+});
+```
diff --git a/docs/06_validation/TEST_PLAN_GATES.md b/docs/06_validation/TEST_PLAN_GATES.md
new file mode 100644
index 0000000..c46a4e1
--- /dev/null
+++ b/docs/06_validation/TEST_PLAN_GATES.md
@@ -0,0 +1,660 @@
+# Test Plan — Gate Evaluation and Approval
+
+**Version**: 1.0.0
+**Date**: 2026-04-04
+**Status**: Active
+**Owner**: Governance
+**Related packages**: `packages/governance`, `packages/orchestrator`
+
+---
+
+## Table of Contents
+
+1. [Scope](#1-scope)
+2. [Gate Reference](#2-gate-reference)
+3. [Test Cases — Individual Gates](#3-test-cases--individual-gates)
+4. [Test Cases — Gate Sequencing](#4-test-cases--gate-sequencing)
+5. [Test Cases — Approval and Rejection Flows](#5-test-cases--approval-and-rejection-flows)
+6. [Test Cases — Mode Influence](#6-test-cases--mode-influence)
+7. [Mock Requirements](#7-mock-requirements)
+8. [Test File Locations](#8-test-file-locations)
+9. [Example Test Code](#9-example-test-code)
+
+---
+
+## 1. Scope
+
+This test plan covers gate evaluation, sequencing, and the human approval/rejection flow.
+
+| Source file | What it implements |
+|---|---|
+| `packages/governance/src/gate-controller.ts` | Gate persistence and status transitions |
+| `packages/governance/src/confidence-engine.ts` | Weighted confidence score calculation |
+| `packages/governance/src/consensus-engine.ts` | Multi-agent consensus decision |
+| `packages/validation-engine.ts` | Input and output validation |
+| `packages/governance/src/constraint-engine.ts` | Policy constraint evaluation |
+| `packages/governance/src/intent-engine.ts` | Intent-to-plan alignment |
+| `packages/governance/src/kill-switch.ts` | Kill switch gate |
+| `packages/orchestrator/src/gate-manager.ts` | `evaluateGates`, `getOverallGateStatus` |
+
+Gate evaluation is the most complex safety-critical subsystem in Code Kit Ultra. A defect here
+could allow an unsafe run to proceed without approval, or incorrectly block a safe run.
+
+---
+
+## 2. Gate Reference
+
+The system defines 9 governance gates. Each gate can produce one of these statuses:
+`pass | fail | needs-review | blocked | pending`.
+
+| # | Gate ID | Source engine | Blocking status |
+|---|---|---|---|
+| 1 | `objective-clarity` | `gate-manager.ts` | `blocked` if no objective |
+| 2 | `requirements-completeness` | `gate-manager.ts` | `blocked` if too many open questions |
+| 3 | `plan-readiness` | `gate-manager.ts` | `blocked` if no tasks |
+| 4 | `skill-coverage` | `gate-manager.ts` | `blocked` if no skills |
+| 5 | `ambiguity-risk` | `gate-manager.ts` | `blocked` if ambiguity exceeds threshold |
+| 6 | `confidence-score` | `confidence-engine.ts` | `needs-review` if score below threshold |
+| 7 | `kill-switch` | `kill-switch.ts` | `blocked` immediately when active |
+| 8 | `constraint-violation` | `constraint-engine.ts` | `fail` if policy violated |
+| 9 | `approval` | `gate-controller.ts` | `needs-review` (human required, mode-dependent) |
+
+---
+
+## 3. Test Cases — Individual Gates
+
+### Gate 1: Objective Clarity
+
+**File**: `packages/orchestrator/src/gate-manager.test.ts`
+
+#### TC-GATE-OBJ-001: No normalised objective → blocked
+
+```
+Given: clarificationResult.normalizedIdea === ""
+When: evaluateGates(input) is called
+Then:
+  - the "objective-clarity" decision has status "blocked"
+  - overallStatus is "blocked"
+  - decision.shouldPause === true
+```
+
+#### TC-GATE-OBJ-002: Objective present but category unknown → needs-review
+
+```
+Given: clarificationResult.normalizedIdea === "Build a CRM"
+And: clarificationResult.inferredProjectType === "unclear"
+When: evaluateGates(input) is called
+Then:
+  - the "objective-clarity" decision has status "needs-review"
+  - overallStatus is at least "needs-review"
+```
+
+#### TC-GATE-OBJ-003: Clear objective and known category → pass
+
+```
+Given: normalizedIdea === "Build a CRM"; inferredProjectType === "web-app"
+When: evaluateGates(input) is called
+Then: "objective-clarity" decision has status "pass"
+```
+
+### Gate 2: Requirements Completeness
+
+#### TC-GATE-REQ-001: Exceeds maxQuestionsBeforeBlock → blocked
+
+```
+Given: clarifyingQuestions.length >= policy.gateThresholds.maxQuestionsBeforeBlock
+When: evaluateGates(input) is called
+Then: "requirements-completeness" status is "blocked"
+```
+
+#### TC-GATE-REQ-002: Within review threshold → needs-review
+
+```
+Given: clarifyingQuestions.length >= maxQuestionsBeforeReview AND < maxQuestionsBeforeBlock
+When: evaluateGates(input) is called
+Then: "requirements-completeness" status is "needs-review"
+```
+
+#### TC-GATE-REQ-003: Below review threshold → pass
+
+```
+Given: clarifyingQuestions.length < maxQuestionsBeforeReview
+When: evaluateGates(input) is called
+Then: "requirements-completeness" status is "pass"
+```
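+
+The three requirements-completeness cases reduce to two threshold comparisons. A sketch of that
+logic, assuming the gate depends only on the question count; `requirementsCompleteness` is a
+hypothetical name, and the real `evaluateGates` may take additional inputs into account.
+
+```typescript
+type RequirementsStatus = "pass" | "needs-review" | "blocked";
+
+interface QuestionThresholds {
+  maxQuestionsBeforeReview: number;
+  maxQuestionsBeforeBlock: number;
+}
+
+// Check the stricter threshold first so "blocked" takes priority over "needs-review".
+function requirementsCompleteness(
+  questionCount: number,
+  thresholds: QuestionThresholds
+): RequirementsStatus {
+  if (questionCount >= thresholds.maxQuestionsBeforeBlock) return "blocked";
+  if (questionCount >= thresholds.maxQuestionsBeforeReview) return "needs-review";
+  return "pass";
+}
+```
+
+Boundary values are the most useful inputs here: one below the review threshold, exactly at it,
+one below the block threshold, and exactly at the block threshold.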
+
+### Gate 3: Plan Readiness
+
+#### TC-GATE-PLAN-001: Empty task list → blocked
+
+```
+Given: plan = []
+When: evaluateGates(input) is called
+Then: "plan-readiness" status is "blocked"
+```
+
+#### TC-GATE-PLAN-002: Tasks present but no dependencies → needs-review
+
+```
+Given: plan = [{ id: "t1", dependencies: [] }]
+When: evaluateGates(input) is called (in builder mode where minimumPlanTasks > 1)
+Then: "plan-readiness" status is "needs-review"
+```
+
+#### TC-GATE-PLAN-003: Sufficient tasks with dependencies → pass
+
+```
+Given: plan has >= minimumPlanTasks tasks, at least one with non-empty dependencies
+When: evaluateGates(input) is called
+Then: "plan-readiness" status is "pass"
+```
+
+### Gate 4: Skill Coverage
+
+#### TC-GATE-SKILL-001: No skills selected → blocked
+
+```
+Given: selectedSkills = []
+When: evaluateGates(input) is called
+Then: "skill-coverage" status is "blocked"
+```
+
+#### TC-GATE-SKILL-002: Only fallback skills → needs-review
+
+```
+Given: selectedSkills = [{ skillId: "s1", category: "fallback", ... }]
+When: evaluateGates(input) is called
+Then: "skill-coverage" status is "needs-review"
+```
+
+#### TC-GATE-SKILL-003: Has specialist skills → pass
+
+```
+Given: selectedSkills includes at least one with category !== "fallback"
+And: selectedSkills.length >= minimumSelectedSkills
+When: evaluateGates(input) is called
+Then: "skill-coverage" status is "pass"
+```
+
+### Gate 5: Ambiguity Risk
+
+#### TC-GATE-AMB-001: Ambiguity signal above block threshold → blocked
+
+```
+Given: questionCount >= policy.gateThresholds.ambiguityBlockThreshold
+When: evaluateGates(input) is called
+Then: "ambiguity-risk" status is "blocked"
+```
+
+#### TC-GATE-AMB-002: High assumptions with some questions → needs-review
+
+```
+Given: assumptions.length > 6
+When: evaluateGates(input) is called
+Then: "ambiguity-risk" status is "needs-review"
+```
+
+#### TC-GATE-AMB-003: Low ambiguity → pass
+
+```
+Given: questionCount < ambiguityReviewThreshold and assumptions.length <= 6
+When: evaluateGates(input) is called
+Then: "ambiguity-risk" status is "pass"
+```
+
+### Gate 6: Confidence Score
+
+**File**: `packages/governance/src/confidence-engine.test.ts`
+
+#### TC-CONF-001: All sub-scores at maximum → overall score near 1.0
+
+```
+Given:
+  intent = { confidence: 1.0 }
+  validation = { valid: true }
+  constraints = { valid: true }
+  consensus = { finalDecision: "approve", agreementScore: 1.0 }
+When: scoreExecution(params) is called
+Then: result.overall ≈ 1.0 (within floating-point tolerance)
+```
+
+#### TC-CONF-002: Failed validation reduces score proportionally
+
+```
+Given: validation = { valid: false }, all other inputs at maximum
+When: scoreExecution(params) is called
+Then: result.validationScore === 0.4
+And: result.overall < 1.0
+```
+
+#### TC-CONF-003: Consensus "revise" reduces consensus score by 40%
+
+```
+Given: consensus = { finalDecision: "revise", agreementScore: 1.0 }
+When: scoreExecution(params) is called
+Then: result.consensusScore === 0.6
+```
+
+#### TC-CONF-004: Consensus "reject" produces near-zero consensus score
+
+```
+Given: consensus = { finalDecision: "reject", agreementScore: 0.0 }
+When: scoreExecution(params) is called
+Then: result.consensusScore === 0.1 (or near zero per the formula)
+```
+
+#### TC-CONF-005: Weights sum to 1.0 (regression guard)
+
+```
+This test asserts that the formula coefficients (0.35 + 0.20 + 0.25 + 0.20) sum to 1.0, compared
+within a small floating-point tolerance, to prevent silent drift in the weighting.
+```
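+
+A sketch of the weighted sum and the regression guard from TC-CONF-005. The four coefficients come
+from the test case above, but which coefficient belongs to which sub-score is an assumption, as
+are the names `WEIGHTS`, `weightedOverall`, and `approxEqual`. This is not the body of
+`scoreExecution`.
+
+```typescript
+type SubScoreName = "intent" | "validation" | "constraints" | "consensus";
+
+// Assumed pairing of weights to sub-scores; only the four values are given in TC-CONF-005.
+const WEIGHTS: Record<SubScoreName, number> = {
+  intent: 0.35,
+  validation: 0.2,
+  constraints: 0.25,
+  consensus: 0.2,
+};
+
+// Weighted sum of sub-scores, each expected to lie in [0, 1].
+function weightedOverall(scores: Record<SubScoreName, number>): number {
+  return (Object.keys(WEIGHTS) as SubScoreName[]).reduce<number>(
+    (sum, key) => sum + WEIGHTS[key] * scores[key],
+    0
+  );
+}
+
+// Floating-point sums may not be exact, so compare against a small tolerance.
+function approxEqual(a: number, b: number, eps = 1e-9): boolean {
+  return Math.abs(a - b) < eps;
+}
+
+const weightSum = Object.values(WEIGHTS).reduce<number>((sum, w) => sum + w, 0);
+```
+
+In a Vitest suite the same guard would typically be written with `toBeCloseTo` rather than strict
+equality.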
+
+### Gate 7: Kill Switch
+
+**File**: `packages/governance/src/kill-switch.test.ts`
+
+#### TC-KS-001: Kill switch active → gate blocked immediately
+
+```
+Given: the kill switch flag is active (returned by the kill-switch store)
+When: the kill switch gate is evaluated
+Then: gate status is "blocked" and shouldPause is true
+```
+
+#### TC-KS-002: Kill switch inactive → gate passes
+
+```
+Given: the kill switch flag is NOT active
+When: the kill switch gate is evaluated
+Then: gate status is "pass"
+```
+
+### Gate 8: Constraint Violation
+
+**File**: `packages/governance/src/constraint-engine.test.ts`
+
+#### TC-CONSTRAINT-001: No constraints violated → pass
+
+```
+Given: no policy constraints are violated by the plan
+When: the constraint engine evaluates the plan
+Then: ConstraintResult.valid === true
+```
+
+#### TC-CONSTRAINT-002: Blocked command in plan → fail
+
+```
+Given: a PlanTask payload that calls a command in the policy blockList
+When: the constraint engine evaluates the plan
+Then: ConstraintResult.valid === false; violations array is non-empty
+```
+
+### Gate 9: Approval Gate
+
+Approval gate tests are covered in Section 5 (Approval and Rejection Flows) and Section 6
+(Mode Influence). The approval gate itself is exercised via the gate-controller.
+
+---
+
+## 4. Test Cases — Gate Sequencing
+
+**File**: `packages/orchestrator/src/gate-manager.test.ts`
+
+#### TC-SEQ-001: First blocked gate short-circuits overall status
+
+```
+Given: gate 1 is "blocked", gate 2 is "pass", gate 3 is "needs-review"
+When: getOverallGateStatus(decisions) is called
+Then: overallStatus === "blocked"
+Note: the function inspects ALL decisions but the first blocking one wins.
+```
+
+#### TC-SEQ-002: needs-review gates accumulate; first one pauses the run
+
+```
+Given: gate 1 is "pass", gate 2 is "needs-review", gate 3 is "needs-review"
+When: getOverallGateStatus(decisions) is called
+Then: overallStatus === "needs-review"
+```
+
+#### TC-SEQ-003: All gates pass → overallStatus is "pass"
+
+```
+Given: all decisions have status "pass"
+When: getOverallGateStatus(decisions) is called
+Then: overallStatus === "pass"
+```
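+
+TC-SEQ-001 through TC-SEQ-003 imply a precedence order between statuses. A sketch of that
+aggregation follows. Only `blocked` over `needs-review` over `pass` is implied by the cases above;
+the positions of `fail` and `pending`, and the name `overallStatus`, are assumptions rather than
+the behaviour of `getOverallGateStatus`.
+
+```typescript
+type DecisionStatus = "pass" | "fail" | "needs-review" | "blocked" | "pending";
+
+interface GateDecisionLike {
+  gate: string;
+  status: DecisionStatus;
+}
+
+// Assumed precedence, most severe first. The ranks of "fail" and "pending" are guesses.
+const SEVERITY: readonly DecisionStatus[] = ["blocked", "fail", "needs-review", "pending", "pass"];
+
+// Return the most severe status present among all decisions.
+function overallStatus(decisions: readonly GateDecisionLike[]): DecisionStatus {
+  for (const candidate of SEVERITY) {
+    if (decisions.some((d) => d.status === candidate)) return candidate;
+  }
+  return "pass"; // empty input; treating it as a pass is also an assumption
+}
+```
+
+Every decision is inspected, so the position of a blocked gate in the list does not change the
+result.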
+
+#### TC-SEQ-004: Manual approval overrides a needs-review gate
+
+```
+Given:
+  - "objective-clarity" gate would normally return "needs-review"
+  - input.approvedGates includes "objective-clarity"
+When: evaluateGates(input) is called
+Then:
+  - the "objective-clarity" decision has status "pass"
+  - decision.reason contains "MANUALLY APPROVED"
+```
+
+---
+
+## 5. Test Cases — Approval and Rejection Flows
+
+**File**: `apps/control-service/test/approvals.test.ts` (already exists — extend)
+**File**: `packages/orchestrator/test/gate-rejection.test.ts` (to be created)
+
+#### TC-APPR-001: Valid approver approves a pending gate → gate passes, run resumes
+
+```
+Given: a run in "paused" status with gate G1 in "needs-review"
+And: the caller has the gate:approve permission
+When: POST /v1/gates/:G1Id/approve is called
+Then:
+  - gate G1 status becomes "pass"
+  - the run status transitions back to "running"
+  - an audit event is recorded with the approver's actorId
+```
+
+#### TC-APPR-002: Valid approver rejects a gate → gate fails, run is cancelled @security
+
+```
+Given: a run in "paused" status with gate G1 in "needs-review"
+And: the caller has the gate:reject permission
+When: POST /v1/gates/:G1Id/reject is called
+Then:
+  - gate G1 status becomes "fail"
+  - the run status transitions to "cancelled"
+  - no further phases are executed
+  - an audit event is recorded
+```
+
+#### TC-APPR-003: Approval of an already-decided gate returns 409
+
+```
+Given: gate G1 already has status "pass"
+When: POST /v1/gates/:G1Id/approve is called
+Then: HTTP 409 Conflict
+```
+
+#### TC-APPR-004: Rejection of an already-rejected gate returns 409
+
+```
+Given: gate G1 already has status "fail"
+When: POST /v1/gates/:G1Id/reject is called
+Then: HTTP 409 Conflict
+```
+
+#### TC-APPR-005: Approval by user without permission returns 403
+
+```
+Given: a pending gate; caller has role "viewer" (no gate:approve)
+When: POST /v1/gates/:id/approve is called
+Then: HTTP 403; gate status unchanged
+```
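+
+The status codes expected by TC-APPR-001 through TC-APPR-005 can be summarised as one decision function. This is a hedged sketch, not the control-service handler: `decideGateSketch` and its types are hypothetical, and checking the permission before the conflict is an assumption, since the cases above do not say which check runs first.
+
+```typescript
+// Hypothetical types for illustration only.
+type GateStateSketch = "needs-review" | "pass" | "fail";
+type GateActionSketch = "approve" | "reject";
+
+interface GateDecisionResultSketch {
+  httpStatus: 200 | 403 | 409;
+  gateStatus: GateStateSketch;
+}
+
+function decideGateSketch(
+  current: GateStateSketch,
+  action: GateActionSketch,
+  permissions: string[],
+): GateDecisionResultSketch {
+  // Assumed order: authorisation first, so callers without the permission never learn the gate state.
+  const required = action === "approve" ? "gate:approve" : "gate:reject";
+  if (permissions.indexOf(required) === -1) {
+    return { httpStatus: 403, gateStatus: current };
+  }
+  // A gate that already has a final decision cannot be decided again.
+  if (current !== "needs-review") {
+    return { httpStatus: 409, gateStatus: current };
+  }
+  return { httpStatus: 200, gateStatus: action === "approve" ? "pass" : "fail" };
+}
+```
+
+In every non-200 branch the gate status is returned unchanged, which matches the "gate status unchanged" expectation in TC-APPR-005.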
+
+---
+
+## 6. Test Cases — Mode Influence
+
+**File**: `packages/orchestrator/src/gate-manager.test.ts`
+
+Mode policies are defined in `packages/orchestrator/src/mode-controller.ts` and affect gate
+thresholds. The following tests verify that `evaluateGates` correctly applies mode policies.
+
+#### TC-MODE-001: turbo mode auto-passes needs-review gates
+
+```
+Given: a gate that would produce "needs-review" in builder mode
+And: mode === "turbo"
+When: evaluateGates(input) is called
+Then: the gate decision has status "pass" with reason containing "AUTO-PASSED VIA TURBO"
+And: overallStatus === "pass"
+```
+
+#### TC-MODE-002: turbo mode does NOT override blocked gates
+
+```
+Given: a gate that produces "blocked" (objective missing)
+And: mode === "turbo"
+When: evaluateGates(input) is called
+Then: the gate decision still has status "blocked"
+Note: "AUTO-PASSED VIA TURBO" only applies to "needs-review", not "blocked".
+```
+
+#### TC-MODE-003: safe mode uses stricter thresholds than builder mode
+
+```
+Given: a plan with 2 tasks and 1 clarifying question
+When: evaluateGates(input) is called with mode === "safe"
+Then: "requirements-completeness" status is at least "needs-review"
+
+When: evaluateGates(input) is called with mode === "turbo"
+Then: "requirements-completeness" status is "pass"
+(because turbo mode has higher maxQuestionsBeforeReview)
+```
+
+#### TC-MODE-004: god mode — all gates pass trivially
+
+```
+Given: worst-case inputs (no plan, no skills, no objective)
+And: mode === "god"
+When: evaluateGates(input) is called (if god mode bypasses gates)
+Then: overallStatus === "pass" (or no gates are evaluated)
+Note: this test documents expected behaviour; implement once god mode policy is defined.
+```
+
+#### TC-MODE-005: builder mode (default) uses standard thresholds
+
+```
+Given: mode is omitted from input
+When: evaluateGates(input) is called
+Then: the effective mode defaults to "builder"
+And: gate thresholds match getModePolicy("builder") values
+```
+
+---
+
+## 7. Mock Requirements
+
+### Mode Policy Mock
+
+Some tests need to control mode thresholds precisely. Mock `getModePolicy` to return a controlled
+policy object:
+
+```typescript
+vi.mock("./mode-controller", () => ({
+  getModePolicy: vi.fn(() => ({
+    mode: "builder",
+    gateThresholds: {
+      maxQuestionsBeforeBlock: 5,
+      maxQuestionsBeforeReview: 2,
+      minimumPlanTasks: 2,
+      minimumSelectedSkills: 1,
+      ambiguityBlockThreshold: 5,
+      ambiguityReviewThreshold: 3,
+    },
+  })),
+}));
+```
+
+### Kill Switch Store Mock
+
+```typescript
+vi.mock("../kill-switch", () => ({
+  isKillSwitchActive: vi.fn().mockResolvedValue(false), // default inactive
+}));
+```
+
+### Gate Approval HTTP Tests (integration)
+
+Require:
+- A running test instance of `apps/control-service`.
+- A seeded database with at least one run in "paused" status and one gate in "needs-review".
+- Tokens for users with different roles (see `TEST_PLAN_RBAC.md` token factory).
+
+---
+
+## 8. Test File Locations
+
+| File | Status | Tests |
+|---|---|---|
+| `packages/orchestrator/src/gate-manager.test.ts` | To be created | TC-GATE-OBJ-*, TC-GATE-REQ-*, TC-GATE-PLAN-*, TC-GATE-SKILL-*, TC-GATE-AMB-*, TC-SEQ-*, TC-MODE-* |
+| `packages/governance/src/confidence-engine.test.ts` | To be created | TC-CONF-001 through TC-CONF-005 |
+| `packages/governance/src/kill-switch.test.ts` | To be created | TC-KS-001, TC-KS-002 |
+| `packages/governance/src/constraint-engine.test.ts` | To be created | TC-CONSTRAINT-001, TC-CONSTRAINT-002 |
+| `apps/control-service/test/approvals.test.ts` | Exists — extend | TC-APPR-001 through TC-APPR-005 |
+| `packages/orchestrator/test/gate-rejection.test.ts` | To be created | TC-APPR-002 (integration) |
+
+---
+
+## 9. Example Test Code
+
+### gate-manager.test.ts (skeleton)
+
+```typescript
+import { describe, it, expect, vi, beforeEach } from "vitest";
+import { evaluateGates, getOverallGateStatus } from "./gate-manager";
+import type { GateEvaluationInput } from "./gate-manager";
+
+vi.mock("./mode-controller", () => ({
+  getModePolicy: vi.fn(() => ({
+    mode: "builder",
+    gateThresholds: {
+      maxQuestionsBeforeBlock: 5,
+      maxQuestionsBeforeReview: 2,
+      minimumPlanTasks: 2,
+      minimumSelectedSkills: 1,
+      ambiguityBlockThreshold: 5,
+      ambiguityReviewThreshold: 3,
+    },
+  })),
+}));
+
+const baseInput: GateEvaluationInput = {
+  clarificationResult: {
+    normalizedIdea: "Build a task manager app",
+    inferredProjectType: "web-app",
+    assumptions: [],
+    clarifyingQuestions: [],
+    completeness: "sufficient-for-initial-planning",
+  },
+  plan: [
+    { id: "t1", title: "Setup", description: "", status: "pending", type: "planning", dependencies: [], metadata: {} },
+    { id: "t2", title: "Build", description: "", status: "pending", type: "implementation", dependencies: ["t1"], metadata: {} },
+  ],
+  selectedSkills: [
+    { skillId: "s1", name: "React Expert", category: "frontend", reason: "UI needed", source: "registry" },
+  ],
+  mode: "builder",
+};
+
+describe("evaluateGates", () => {
+  it("should return overallStatus pass when all inputs are valid", () => {
+    const result = evaluateGates(baseInput);
+    expect(result.overallStatus).toBe("pass");
+    expect(result.decisions.every(d => d.status === "pass")).toBe(true);
+  });
+
+  it("should return blocked when no normalized objective is provided", () => {
+    const input: GateEvaluationInput = {
+      ...baseInput,
+      clarificationResult: { ...baseInput.clarificationResult, normalizedIdea: "" },
+    };
+    const result = evaluateGates(input);
+    const gate = result.decisions.find(d => d.gate === "objective-clarity");
+    expect(gate?.status).toBe("blocked");
+    expect(result.overallStatus).toBe("blocked");
+  });
+
+  it("should auto-pass needs-review gates in turbo mode", () => {
+    const input: GateEvaluationInput = {
+      ...baseInput,
+      mode: "turbo",
+      clarificationResult: {
+        ...baseInput.clarificationResult,
+        inferredProjectType: "unclear", // would produce needs-review in builder
+      },
+    };
+    const result = evaluateGates(input);
+    const gate = result.decisions.find(d => d.gate === "objective-clarity");
+    expect(gate?.status).toBe("pass");
+    expect(gate?.reason).toContain("AUTO-PASSED VIA TURBO");
+  });
+
+  it("should mark a manually approved gate as pass regardless of evaluation", () => {
+    const input: GateEvaluationInput = {
+      ...baseInput,
+      clarificationResult: { ...baseInput.clarificationResult, inferredProjectType: "unclear" },
+      approvedGates: ["objective-clarity"],
+    };
+    const result = evaluateGates(input);
+    const gate = result.decisions.find(d => d.gate === "objective-clarity");
+    expect(gate?.status).toBe("pass");
+    expect(gate?.reason).toContain("MANUALLY APPROVED");
+  });
+});
+
+describe("getOverallGateStatus", () => {
+  it("should return blocked when any decision is blocked", () => {
+    const decisions = [
+      { gate: "g1", status: "pass" as const, reason: "", shouldPause: false },
+      { gate: "g2", status: "blocked" as const, reason: "", shouldPause: true },
+      { gate: "g3", status: "needs-review" as const, reason: "", shouldPause: true },
+    ];
+    expect(getOverallGateStatus(decisions)).toBe("blocked");
+  });
+
+  it("should return needs-review when no blocked but some needs-review", () => {
+    const decisions = [
+      { gate: "g1", status: "pass" as const, reason: "", shouldPause: false },
+      { gate: "g2", status: "needs-review" as const, reason: "", shouldPause: true },
+    ];
+    expect(getOverallGateStatus(decisions)).toBe("needs-review");
+  });
+
+  it("should return pass when all decisions pass", () => {
+    const decisions = [
+      { gate: "g1", status: "pass" as const, reason: "", shouldPause: false },
+      { gate: "g2", status: "pass" as const, reason: "", shouldPause: false },
+    ];
+    expect(getOverallGateStatus(decisions)).toBe("pass");
+  });
+});
+```
+
+### confidence-engine.test.ts (skeleton)
+
+```typescript
+import { describe, it, expect } from "vitest";
+import { scoreExecution } from "./confidence-engine";
+
+const maxInputs = {
+  intent: { confidence: 1.0, aligned: true, summary: "" },
+  validation: { valid: true, errors: [] },
+  constraints: { valid: true, violations: [] },
+  consensus: { finalDecision: "approve" as const, agreementScore: 1.0, votes: [] },
+};
+
+describe("scoreExecution", () => {
+  it("should return overall score of 1.0 when all sub-scores are at maximum", () => {
+    const result = scoreExecution(maxInputs);
+    expect(result.overall).toBe(1.0);
+  });
+
+  it("should reduce the score when validation fails", () => {
+    const result = scoreExecution({ ...maxInputs, validation: { valid: false, errors: ["missing field"] } });
+    expect(result.validationScore).toBe(0.4);
+    expect(result.overall).toBeLessThan(1.0);
+  });
+
+  it("should reduce consensus score by 40% when decision is revise", () => {
+    const result = scoreExecution({
+      ...maxInputs,
+      consensus: { finalDecision: "revise", agreementScore: 1.0, votes: [] },
+    });
+    expect(result.consensusScore).toBeCloseTo(0.6, 2);
+  });
+});
+```
diff --git a/docs/06_validation/TEST_PLAN_RBAC.md b/docs/06_validation/TEST_PLAN_RBAC.md
new file mode 100644
index 0000000..c171741
--- /dev/null
+++ b/docs/06_validation/TEST_PLAN_RBAC.md
@@ -0,0 +1,498 @@
+# Test Plan — RBAC and Permissions
+
+**Version**: 1.0.0
+**Date**: 2026-04-04
+**Status**: Active
+**Owner**: Platform Security
+**Related packages**: `packages/policy`, `packages/orchestrator`, `packages/auth`
+
+---
+
+## Table of Contents
+
+1. [Scope](#1-scope)
+2. [Permission Matrix](#2-permission-matrix)
+3. [Test Cases — Permission Resolution](#3-test-cases--permission-resolution)
+4. [Test Cases — Cross-Tenant Isolation](#4-test-cases--cross-tenant-isolation-critical)
+5. [Test Cases — Gate Approval Permissions](#5-test-cases--gate-approval-permissions)
+6. [Mock Requirements](#6-mock-requirements)
+7. [Test File Locations](#7-test-file-locations)
+8. [Example Test Code](#8-example-test-code)
+
+---
+
+## 1. Scope
+
+This test plan covers the permission and RBAC subsystem, specifically:
+
+| Source file | Exported symbols | Risk level |
+|---|---|---|
+| `packages/policy/src/permissions.ts` | `Permission` type | Reference |
+| `packages/policy/src/role-mapping.ts` | `ROLE_PERMISSIONS`, `ROLE_ALIASES` | Critical |
+| `packages/policy/src/resolve-permissions.ts` | `resolvePermissions` | Critical |
+
+It also covers integration-level tests that validate multi-tenant isolation across the
+orchestrator and HTTP layer. These integration tests require a seeded test database.
+
+Out of scope: UI permission rendering, API key scoping, invitation flows.
+
+---
+
+## 2. Permission Matrix
+
+The following matrix defines the expected output of `resolvePermissions` for each role.
+Every cell in this matrix must be exercised by at least one test.
+
+| Permission | admin | operator | reviewer | viewer | service_account |
+|---|:---:|:---:|:---:|:---:|:---:|
+| `run:create` | Y | Y | N | N | Y |
+| `run:view` | Y | Y | Y | Y | Y |
+| `run:cancel` | Y | Y | N | N | Y |
+| `gate:view` | Y | Y | Y | Y | Y |
+| `gate:approve` | Y | Y | Y | N | Y |
+| `gate:reject` | Y | N | Y | N | N |
+| `execution:view` | Y | Y | Y | Y | Y |
+| `execution:high_risk` | Y | Y | N | N | Y |
+| `execution:rollback` | Y | N | N | N | N |
+| `healing:invoke` | Y | Y | N | N | Y |
+| `policy:view` | Y | Y | Y | Y | N |
+| `policy:manage` | Y | N | N | N | N |
+| `audit:view` | Y | N | Y | Y | N |
+| `service_account:manage` | Y | N | N | N | N |
+| `service_account:view` | Y | N | N | N | N |
+
+Source of truth: `packages/policy/src/role-mapping.ts` → `ROLE_PERMISSIONS`.
+
+---
+
+## 3. Test Cases — Permission Resolution
+
+**File**: `packages/policy/src/resolve-permissions.test.ts` (to be created)
+**Function under test**: `resolvePermissions(roles: string[]): Permission[]`
+
+#### TC-RBAC-001: Admin role grants all permissions
+
+```
+Given: roles = ["admin"]
+When: resolvePermissions(["admin"]) is called
+Then: the result includes every permission defined in ROLE_PERMISSIONS.admin
+```
+
+#### TC-RBAC-002: Viewer role grants only read permissions
+
+```
+Given: roles = ["viewer"]
+When: resolvePermissions(["viewer"]) is called
+Then:
+  - result includes "run:view", "gate:view", "execution:view", "policy:view", "audit:view"
+  - result does NOT include "run:create", "run:cancel", "gate:approve", "gate:reject",
+    "execution:high_risk", "execution:rollback", "healing:invoke",
+    "policy:manage", "service_account:manage", "service_account:view"
+```
+
+#### TC-RBAC-003: Operator role cannot reject gates or manage policies
+
+```
+Given: roles = ["operator"]
+When: resolvePermissions(["operator"]) is called
+Then:
+  - result includes "run:create", "gate:approve", "execution:high_risk", "healing:invoke"
+  - result does NOT include "gate:reject", "policy:manage", "execution:rollback",
+    "service_account:manage"
+```
+
+#### TC-RBAC-004: Reviewer role can approve and reject gates but cannot create runs
+
+```
+Given: roles = ["reviewer"]
+When: resolvePermissions(["reviewer"]) is called
+Then:
+  - result includes "gate:approve", "gate:reject", "audit:view"
+  - result does NOT include "run:create", "run:cancel", "execution:high_risk",
+    "execution:rollback", "healing:invoke", "policy:manage"
+```
+
+#### TC-RBAC-005: Service account role cannot view policy or audit logs
+
+```
+Given: roles = ["service_account"]
+When: resolvePermissions(["service_account"]) is called
+Then:
+  - result includes "run:create", "gate:approve", "execution:high_risk", "healing:invoke"
+  - result does NOT include "policy:view", "policy:manage", "audit:view",
+    "service_account:manage", "gate:reject", "execution:rollback"
+```
+
+#### TC-RBAC-006: Multiple roles accumulate permissions (union)
+
+```
+Given: roles = ["operator", "reviewer"]
+When: resolvePermissions(["operator", "reviewer"]) is called
+Then:
+  - result includes all permissions from both operator AND reviewer
+  - "gate:reject" is included (from reviewer)
+  - "run:create" is included (from operator)
+  - no permission appears twice in the array
+```
+
+#### TC-RBAC-007: Unknown role grants no permissions
+
+```
+Given: roles = ["unknown-role"]
+When: resolvePermissions(["unknown-role"]) is called
+Then: result is an empty array
+```
+
+#### TC-RBAC-008: Role alias "Owner" resolves to admin permissions
+
+```
+Given: roles = ["Owner"]  (InsForge capitalised alias)
+When: resolvePermissions(["Owner"]) is called
+Then: result equals resolvePermissions(["admin"])
+```
+
+#### TC-RBAC-009: Role alias "ServiceAccount" resolves to service_account permissions
+
+```
+Given: roles = ["ServiceAccount"]
+When: resolvePermissions(["ServiceAccount"]) is called
+Then: result equals resolvePermissions(["service_account"])
+```
+
+#### TC-RBAC-010: Empty roles array returns empty permissions
+
+```
+Given: roles = []
+When: resolvePermissions([]) is called
+Then: result is []
+```
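+
+The behaviour that TC-RBAC-006 through TC-RBAC-010 describe (alias normalisation, union across roles, no duplicates, empty result for unknown roles) can be sketched as follows. This is a minimal sketch under stated assumptions: the maps below are abbreviated, hypothetical stand-ins for `ROLE_PERMISSIONS` and `ROLE_ALIASES`, and `resolvePermissionsSketch` is not the real implementation.
+
+```typescript
+// Abbreviated, hypothetical permission set; the real type lives in permissions.ts.
+type PermissionSketch = "run:create" | "run:view" | "gate:approve" | "gate:reject";
+
+// A Map avoids accidental hits on Object.prototype keys such as "constructor".
+const rolePermissionsSketch = new Map<string, PermissionSketch[]>([
+  ["admin", ["run:create", "run:view", "gate:approve", "gate:reject"]],
+  ["operator", ["run:create", "run:view", "gate:approve"]],
+  ["reviewer", ["run:view", "gate:approve", "gate:reject"]],
+  ["viewer", ["run:view"]],
+]);
+
+const roleAliasesSketch = new Map<string, string>([["Owner", "admin"]]);
+
+function resolvePermissionsSketch(roles: string[]): PermissionSketch[] {
+  const granted = new Set<PermissionSketch>();
+  for (const role of roles) {
+    // Normalise aliases first, then fall back to the role name itself.
+    const canonical = roleAliasesSketch.get(role) ?? role;
+    // Unknown roles contribute nothing.
+    for (const perm of rolePermissionsSketch.get(canonical) ?? []) {
+      granted.add(perm);
+    }
+  }
+  return Array.from(granted);
+}
+```
+
+Collecting into a `Set` gives the union and the no-duplicates property in one step, so the TC-RBAC-006 assertion on array length versus set size holds by construction.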
+
+---
+
+## 4. Test Cases — Cross-Tenant Isolation (CRITICAL)
+
+Tests tagged `@security` in this section run on every PR. All tests in this section require a seeded
+test database with two separate orgs (orgA and orgB), each with their own workspaces, projects, runs, and users.
+
+**File**: `packages/policy/test/cross-tenant.test.ts` (to be created)
+
+The integration tests in this section operate through the HTTP layer of `apps/control-service` to
+test the full request → permission check → DB query → response chain.
+
+#### TC-CROSS-001: Run owned by orgA is not visible to a user from orgB @security
+
+```
+Given:
+  - runA belongs to orgA / wsA / projA
+  - userB is authenticated to orgB with role "admin"
+When: GET /v1/runs/:runAId is called with userB's token
+Then:
+  - HTTP 404 is returned (not 403)
+  - Response body does not contain any data about runA
+Note: returning 404 instead of 403 prevents enumeration attacks.
+```
+
+#### TC-CROSS-002: Listing runs filters to actor's tenant @security
+
+```
+Given:
+  - 5 runs exist in orgA
+  - 3 runs exist in orgB
+  - userB is authenticated to orgB
+When: GET /v1/runs is called
+Then:
+  - Response contains exactly 3 runs (orgB's runs)
+  - None of orgA's runIds appear in the response
+```
+
+#### TC-CROSS-003: Gate approval by a user from a different org returns 404 @security
+
+```
+Given:
+  - a gate on a run in orgA / projA
+  - userB is authenticated to orgB with role "admin"
+When: POST /v1/gates/:gateId/approve is called with userB's token
+Then: HTTP 404 is returned
+```
+
+#### TC-CROSS-004: Service account cannot access runs in a different org @security
+
+```
+Given:
+  - SA-A is a service account in orgA
+  - runB exists in orgB
+When: GET /v1/runs/:runBId is called with SA-A's token
+Then: HTTP 404 is returned
+```
+
+#### TC-CROSS-005: runId from a different org in path param returns 404 @security
+
+```
+Given:
+  - userA is authenticated to orgA
+  - runB.runId is a valid run in orgB
+When: GET /v1/runs/:runBId is called with userA's token
+Then: HTTP 404 (not 403)
+Rationale: 403 would confirm the runId exists; 404 prevents cross-tenant enumeration.
+```
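+
+The 404-versus-403 rationale in TC-CROSS-001 and TC-CROSS-005 amounts to making the tenant filter part of the lookup itself, so a foreign run and a missing run take the same code path. The sketch below illustrates that idea with an in-memory array; `RunRecordSketch` and `getRunForActorSketch` are hypothetical names, and the real service would apply the same filter in its database query.
+
+```typescript
+// Hypothetical record shape for illustration only.
+interface RunRecordSketch {
+  id: string;
+  orgId: string;
+}
+
+interface RunLookupResultSketch {
+  status: 200 | 404;
+  run?: RunRecordSketch;
+}
+
+function getRunForActorSketch(
+  runs: RunRecordSketch[],
+  runId: string,
+  actorOrgId: string,
+): RunLookupResultSketch {
+  // Match on id AND org in one step: a run in another org is treated exactly like a missing run.
+  const run = runs.filter((r) => r.id === runId && r.orgId === actorOrgId)[0];
+  return run ? { status: 200, run } : { status: 404 };
+}
+```
+
+Because there is no separate "exists but not yours" branch, the response cannot be used to confirm that a run ID exists in another tenant.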
+
+#### TC-CROSS-006: Pause run in wrong org returns 404 @security
+
+```
+Given: runB is in orgB; userA is authenticated to orgA
+When: POST /v1/runs/:runBId/pause is called with userA's token
+Then: HTTP 404
+```
+
+#### TC-CROSS-007: User in org but not in project cannot access project resources
+
+```
+Given:
+  - userA is a member of orgA and wsA but NOT projA
+  - runP is a run in projA
+When: GET /v1/runs/:runPId is called with userA's token
+Then: HTTP 404 (project-scoped resource not visible)
+```
+
+#### TC-CROSS-008: User with project-level role override gets project permissions
+
+```
+Given:
+  - userA has role "viewer" at org level
+  - userA has role "operator" at projA level (project-level override)
+When: POST /v1/runs is called within projA context
+Then: HTTP 201 (operator can create runs; org-level viewer cannot)
+```
+
+---
+
+## 5. Test Cases — Gate Approval Permissions
+
+**File**: `apps/control-service/test/approvals.test.ts` (already exists — extend)
+
+#### TC-GATE-PERM-001: User with gate:approve permission can approve
+
+```
+Given: a pending gate; userA has role "operator" (which grants gate:approve)
+When: POST /v1/gates/:gateId/approve is called
+Then: HTTP 200; gate status transitions to "pass"
+```
+
+#### TC-GATE-PERM-002: User without gate:approve permission gets 403
+
+```
+Given: a pending gate; userA has role "viewer" (which does NOT grant gate:approve)
+When: POST /v1/gates/:gateId/approve is called
+Then: HTTP 403; gate status is unchanged
+```
+
+#### TC-GATE-PERM-003: Only admin and reviewer can reject gates
+
+```
+Given: a pending gate; userA has role "operator"
+When: POST /v1/gates/:gateId/reject is called
+Then: HTTP 403
+```
+
+```
+Given: a pending gate; userB has role "reviewer"
+When: POST /v1/gates/:gateId/reject is called
+Then: HTTP 200; gate status transitions to "fail"
+```
+
+#### TC-GATE-PERM-004: Approving an already-approved gate returns 409
+
+```
+Given: a gate that has already been approved (status: "pass")
+When: POST /v1/gates/:gateId/approve is called again
+Then: HTTP 409 Conflict
+```
+
+#### TC-GATE-PERM-005: Rejecting an already-rejected gate returns 409
+
+```
+Given: a gate with status "fail"
+When: POST /v1/gates/:gateId/reject is called
+Then: HTTP 409 Conflict
+```
+
+---
+
+## 6. Mock Requirements
+
+### Test Database Fixtures
+
+The cross-tenant tests require a fully seeded test database. The seed script must create:
+
+```sql
+-- Two organisations
+INSERT INTO organisations (id, name) VALUES
+  ('org-A', 'Org Alpha'),
+  ('org-B', 'Org Beta');
+
+-- Three workspaces (2 in orgA, 1 in orgB)
+INSERT INTO workspaces (id, org_id, name) VALUES
+  ('ws-A1', 'org-A', 'Alpha Workspace 1'),
+  ('ws-A2', 'org-A', 'Alpha Workspace 2'),
+  ('ws-B1', 'org-B', 'Beta Workspace 1');
+
+-- Five projects
+INSERT INTO projects (id, workspace_id, name) VALUES
+  ('proj-A1', 'ws-A1', 'Alpha Project 1'),
+  ('proj-A2', 'ws-A1', 'Alpha Project 2'),
+  ('proj-A3', 'ws-A2', 'Alpha Project 3'),
+  ('proj-B1', 'ws-B1', 'Beta Project 1'),
+  ('proj-B2', 'ws-B1', 'Beta Project 2');
+
+-- Users with roles
+INSERT INTO users (id, name) VALUES
+  ('user-admin-A', 'Admin Alpha'),
+  ('user-viewer-A', 'Viewer Alpha'),
+  ('user-admin-B', 'Admin Beta'),
+  ('user-sa-A', 'Service Account Alpha');
+
+INSERT INTO memberships (user_id, org_id, role) VALUES
+  ('user-admin-A', 'org-A', 'admin'),
+  ('user-viewer-A', 'org-A', 'viewer'),
+  ('user-admin-B', 'org-B', 'admin');
+```
+
+### Token Factories for Integration Tests
+
+```typescript
+// tests/fixtures/token-factory.ts
+import jwt from "jsonwebtoken";
+
+const TEST_SECRET = process.env.INSFORGE_SERVICE_ROLE_KEY ?? "test-secret";
+
+export function makeUserToken(userId: string, orgId: string, roles: string[]): string {
+  return jwt.sign(
+    { sub: userId, org_id: orgId, actor_type: "user", roles },
+    TEST_SECRET,
+    { expiresIn: "1h" }
+  );
+}
+
+export function makeServiceAccountToken(saId: string, orgId: string): string {
+  return jwt.sign(
+    { sub: saId, orgId, type: "service_account", roles: ["operator"] },
+    process.env.CKU_SERVICE_ACCOUNT_SECRET ?? "test-sa-secret",
+    { expiresIn: "1h" }
+  );
+}
+```
+
+---
+
+## 7. Test File Locations
+
+| File | Status | Notes |
+|---|---|---|
+| `packages/policy/src/resolve-permissions.test.ts` | To be created | TC-RBAC-001 through TC-RBAC-010 |
+| `packages/policy/test/cross-tenant.test.ts` | To be created | TC-CROSS-001 through TC-CROSS-008; integration |
+| `apps/control-service/test/approvals.test.ts` | Exists — extend | TC-GATE-PERM-001 through TC-GATE-PERM-005 |
+
+---
+
+## 8. Example Test Code
+
+### resolve-permissions.test.ts
+
+```typescript
+import { describe, it, expect } from "vitest";
+import { resolvePermissions } from "./resolve-permissions";
+
+describe("resolvePermissions", () => {
+  it("should grant all permissions to admin", () => {
+    const perms = resolvePermissions(["admin"]);
+    expect(perms).toContain("run:create");
+    expect(perms).toContain("gate:reject");
+    expect(perms).toContain("policy:manage");
+    expect(perms).toContain("service_account:manage");
+    expect(perms).toContain("execution:rollback");
+  });
+
+  it("should grant only read permissions to viewer", () => {
+    const perms = resolvePermissions(["viewer"]);
+    expect(perms).toContain("run:view");
+    expect(perms).not.toContain("run:create");
+    expect(perms).not.toContain("gate:approve");
+    expect(perms).not.toContain("execution:rollback");
+  });
+
+  it("should accumulate permissions from multiple roles without duplicates", () => {
+    const perms = resolvePermissions(["operator", "reviewer"]);
+    expect(perms).toContain("run:create");   // from operator
+    expect(perms).toContain("gate:reject");  // from reviewer
+    const unique = new Set(perms);
+    expect(unique.size).toBe(perms.length); // no duplicates
+  });
+
+  it("should normalise the Owner alias to admin permissions", () => {
+    const ownerPerms = resolvePermissions(["Owner"]);
+    const adminPerms = resolvePermissions(["admin"]);
+    // Copy before sorting so the arrays returned by resolvePermissions are not mutated.
+    expect([...ownerPerms].sort()).toEqual([...adminPerms].sort());
+  });
+
+  it("should return empty array for an unknown role", () => {
+    expect(resolvePermissions(["unknown"])).toEqual([]);
+  });
+
+  it("should return empty array for an empty roles list", () => {
+    expect(resolvePermissions([])).toEqual([]);
+  });
+});
+```
+
+### cross-tenant.test.ts (integration skeleton)
+
+```typescript
+import { describe, it, expect, beforeAll, afterAll } from "vitest";
+import { createTestApp } from "../../../apps/control-service/test/helpers/test-app";
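+// Assumption: fixtures live in the repo-root `tests/fixtures/` directory (see token-factory.ts above).
+import { seedMultiTenantFixtures, clearFixtures } from "../../../tests/fixtures/multi-tenant";
+import { makeUserToken } from "../../../tests/fixtures/token-factory";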
+
+describe("Cross-Tenant Isolation @security", () => {
+  let app: Awaited<ReturnType<typeof createTestApp>>;
+  let runAId: string;
+
+  beforeAll(async () => {
+    app = await createTestApp();
+    const { runIds } = await seedMultiTenantFixtures();
+    runAId = runIds.orgA[0];
+  });
+
+  afterAll(async () => {
+    await clearFixtures();
+    await app.close();
+  });
+
+  it("should return 404 when a user from orgB requests a run from orgA", async () => {
+    const tokenB = makeUserToken("user-admin-B", "org-B", ["admin"]);
+
+    const response = await app.inject({
+      method: "GET",
+      url: `/v1/runs/${runAId}`,
+      headers: { authorization: `Bearer ${tokenB}` },
+    });
+
+    expect(response.statusCode).toBe(404);
+  });
+
+  it("should list only orgB runs when authenticated as orgB admin", async () => {
+    const tokenB = makeUserToken("user-admin-B", "org-B", ["admin"]);
+
+    const response = await app.inject({
+      method: "GET",
+      url: "/v1/runs",
+      headers: { authorization: `Bearer ${tokenB}` },
+    });
+
+    expect(response.statusCode).toBe(200);
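+    const body = JSON.parse(response.body);
+    // Guard against a vacuous pass when the list is empty.
+    expect(body.runs.length).toBeGreaterThan(0);
+    for (const run of body.runs) {
+      expect(run.orgId).toBe("org-B");
+    }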
+  });
+});
+```
diff --git a/docs/06_validation/TEST_PLAN_RUN_SCOPING.md b/docs/06_validation/TEST_PLAN_RUN_SCOPING.md
new file mode 100644
index 0000000..25cd631
--- /dev/null
+++ b/docs/06_validation/TEST_PLAN_RUN_SCOPING.md
@@ -0,0 +1,450 @@
+# Test Plan: Run Scoping and Multi-Tenant Isolation
+
+- **Document type**: Test Plan
+- **Version target**: v1.3.0
+- **Last updated**: 2026-04-04
+- **Status**: Draft — not yet executed
+- **Related risks**: R-02 (Default org bypass — Critical)
+
+---
+
+## 1. Purpose
+
+Verify that all run operations are correctly scoped to the tenant hierarchy
+(org → workspace → project) and that cross-tenant access is impossible under
+any authenticated request. A user or service account belonging to orgA must
+never be able to read, mutate, or enumerate runs that belong to orgB, even if
+they hold a valid JWT.
+
+This plan also directly verifies the remediation of security risk **R-02**:
+the `default` org exemption in `resolveSession` must be removed, and the
+`default` org must be blocked at the API layer, before v1.3.0 ships.
+
+---
+
+## 2. Scope
+
+### In scope
+
+| Area | Package / App |
+|------|---------------|
+| Run creation (POST /v1/runs) | `apps/control-service` |
+| Run retrieval (GET /v1/runs, GET /v1/runs/:id) | `apps/control-service` |
+| Run state transitions (cancel, pause, resume) | `apps/control-service` |
+| Session resolution and tenant scoping | `packages/auth` |
+| Audit event scoping | `apps/control-service` |
+| SSE stream tenant isolation | `apps/control-service` |
+| RBAC permission checks on run operations | `packages/auth`, `apps/control-service` |
+| Orchestrator run context propagation | `packages/orchestrator` |
+
+### Out of scope
+
+- InsForge platform internals
+- Third-party AI provider calls
+- Billing and usage metering
+
+---
+
+## 3. Test Infrastructure
+
+### 3.1 Multi-Tenant DB Fixture
+
+All tests in this plan require a seeded test database with the following
+entities. The fixture must be created fresh for each test file in `beforeAll`
+and torn down in `afterAll`.
+
+**Organizations**
+
+| ID | Name |
+|----|------|
+| `org-a` | Org Alpha |
+| `org-b` | Org Bravo |
+
+**Workspaces**
+
+| ID | Org | Name |
+|----|-----|------|
+| `ws-1` | org-a | Alpha Primary |
+| `ws-2` | org-a | Alpha Secondary |
+| `ws-3` | org-b | Bravo Primary |
+
+**Projects**
+
+| ID | Workspace | Name |
+|----|-----------|------|
+| `proj-1` | ws-1 | Alpha P1 |
+| `proj-2` | ws-1 | Alpha P2 |
+| `proj-3` | ws-2 | Alpha P3 |
+| `proj-4` | ws-3 | Bravo P1 |
+| `proj-5` | ws-3 | Bravo P2 |
+
+**Users**
+
+| ID | Org | Role | Member of |
+|----|-----|------|-----------|
+| `user-a-admin` | org-a | org:admin | all org-a workspaces |
+| `user-a-member` | org-a | workspace:member | ws-1 only |
+| `user-a-viewer` | org-a | project:viewer | proj-1 only |
+| `user-b-admin` | org-b | org:admin | all org-b workspaces |
+| `user-b-member` | org-b | workspace:member | ws-3 only |
+| `sa-org-a` | org-a | service_account | — |
+
+**Pre-seeded Runs**
+
+| ID | Org | Workspace | Project | Created by |
+|----|-----|-----------|---------|------------|
+| `run-a1` | org-a | ws-1 | proj-1 | user-a-admin |
+| `run-a2` | org-a | ws-1 | proj-2 | user-a-member |
+| `run-a3` | org-a | ws-2 | proj-3 | user-a-admin |
+| `run-b1` | org-b | ws-3 | proj-4 | user-b-admin |
+| `run-b2` | org-b | ws-3 | proj-5 | user-b-member |
+
+### 3.2 Vitest Fixture Code
+
+```typescript
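+// tests/fixtures/multi-tenant-db.ts
+import { db } from '../../src/db';
+// Assumption: the table objects are exported from a schema module; adjust the path to match the repo.
+import { orgs, workspaces, projects, users, runs } from '../../src/db/schema';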
+
+export async function seedMultiTenantFixture() {
+  await db.transaction(async (tx) => {
+    // Orgs
+    await tx.insert(orgs).values([
+      { id: 'org-a', name: 'Org Alpha' },
+      { id: 'org-b', name: 'Org Bravo' },
+    ]);
+
+    // Workspaces
+    await tx.insert(workspaces).values([
+      { id: 'ws-1', orgId: 'org-a', name: 'Alpha Primary' },
+      { id: 'ws-2', orgId: 'org-a', name: 'Alpha Secondary' },
+      { id: 'ws-3', orgId: 'org-b', name: 'Bravo Primary' },
+    ]);
+
+    // Projects
+    await tx.insert(projects).values([
+      { id: 'proj-1', workspaceId: 'ws-1', orgId: 'org-a' },
+      { id: 'proj-2', workspaceId: 'ws-1', orgId: 'org-a' },
+      { id: 'proj-3', workspaceId: 'ws-2', orgId: 'org-a' },
+      { id: 'proj-4', workspaceId: 'ws-3', orgId: 'org-b' },
+      { id: 'proj-5', workspaceId: 'ws-3', orgId: 'org-b' },
+    ]);
+
+    // Users and memberships (abbreviated — expand per user table schema)
+    await tx.insert(users).values([
+      { id: 'user-a-admin', orgId: 'org-a', role: 'org:admin' },
+      { id: 'user-a-member', orgId: 'org-a', role: 'workspace:member' },
+      { id: 'user-a-viewer', orgId: 'org-a', role: 'project:viewer' },
+      { id: 'user-b-admin', orgId: 'org-b', role: 'org:admin' },
+      { id: 'user-b-member', orgId: 'org-b', role: 'workspace:member' },
+    ]);
+
+    // Pre-seeded runs
+    await tx.insert(runs).values([
+      { id: 'run-a1', orgId: 'org-a', workspaceId: 'ws-1', projectId: 'proj-1', actorId: 'user-a-admin' },
+      { id: 'run-a2', orgId: 'org-a', workspaceId: 'ws-1', projectId: 'proj-2', actorId: 'user-a-member' },
+      { id: 'run-a3', orgId: 'org-a', workspaceId: 'ws-2', projectId: 'proj-3', actorId: 'user-a-admin' },
+      { id: 'run-b1', orgId: 'org-b', workspaceId: 'ws-3', projectId: 'proj-4', actorId: 'user-b-admin' },
+      { id: 'run-b2', orgId: 'org-b', workspaceId: 'ws-3', projectId: 'proj-5', actorId: 'user-b-member' },
+    ]);
+  });
+}
+
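+export async function teardownMultiTenantFixture() {
+  await db.transaction(async (tx) => {
+    // Delete children before parents so foreign-key constraints are not violated:
+    // users reference orgs, so they must be removed before the orgs themselves.
+    await tx.delete(runs);
+    await tx.delete(users);
+    await tx.delete(projects);
+    await tx.delete(workspaces);
+    await tx.delete(orgs);
+  });
+}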
+```
+
+---
+
+## 4. Test Cases
+
+### 4.1 Describe: Run Creation Scoping
+
+```typescript
+describe('Run Creation Scoping', () => {
+  it('should scope run to provided orgId/workspaceId/projectId', async () => {
+    const res = await api.post('/v1/runs', {
+      orgId: 'org-a',
+      workspaceId: 'ws-1',
+      projectId: 'proj-1',
+      idea: 'test run',
+    }, { as: 'user-a-admin' });
+
+    expect(res.status).toBe(201);
+    expect(res.body.run.orgId).toBe('org-a');
+    expect(res.body.run.workspaceId).toBe('ws-1');
+    expect(res.body.run.projectId).toBe('proj-1');
+  });
+
+  it('should scope run to workspace when no projectId provided', async () => {
+    const res = await api.post('/v1/runs', {
+      orgId: 'org-a',
+      workspaceId: 'ws-1',
+      idea: 'workspace-level run',
+    }, { as: 'user-a-admin' });
+
+    expect(res.status).toBe(201);
+    expect(res.body.run.orgId).toBe('org-a');
+    expect(res.body.run.workspaceId).toBe('ws-1');
+    expect(res.body.run.projectId).toBeNull();
+  });
+
+  it("should reject POST /v1/runs with orgId from caller's non-member org → 403", async () => {
+    // user-a-admin is in org-a, attempts to create a run in org-b
+    const res = await api.post('/v1/runs', {
+      orgId: 'org-b',
+      workspaceId: 'ws-3',
+      projectId: 'proj-4',
+      idea: 'cross-tenant attempt',
+    }, { as: 'user-a-admin' });
+
+    expect(res.status).toBe(403);
+    expect(res.body.code).toBe('FORBIDDEN');
+  });
+
+  it("should scope service account run to SA's organization", async () => {
+    const res = await api.post('/v1/runs', {
+      idea: 'sa run',
+    }, { as: 'sa-org-a' });
+
+    expect(res.status).toBe(201);
+    expect(res.body.run.orgId).toBe('org-a');
+    expect(res.body.run.actorType).toBe('service_account');
+  });
+
+  it('should include actorId and actorType in created run', async () => {
+    const res = await api.post('/v1/runs', {
+      orgId: 'org-a',
+      workspaceId: 'ws-1',
+      idea: 'actor check',
+    }, { as: 'user-a-member' });
+
+    expect(res.status).toBe(201);
+    expect(res.body.run.actorId).toBe('user-a-member');
+    expect(res.body.run.actorType).toBe('user');
+  });
+});
+```
+
+---
+
+### 4.2 Describe: Run Retrieval Isolation
+
+```typescript
+describe('Run Retrieval Isolation', () => {
+  it("GET /v1/runs should return only runs in caller's accessible projects", async () => {
+    const res = await api.get('/v1/runs', { as: 'user-a-member' });
+
+    expect(res.status).toBe(200);
+    const ids = res.body.runs.map((r: Run) => r.id);
+    expect(ids).toContain('run-a2');       // user-a-member's own run in ws-1
+    expect(ids).not.toContain('run-b1');   // org-b run — must not appear
+    expect(ids).not.toContain('run-b2');   // org-b run — must not appear
+  });
+
+  it("GET /v1/runs/{id} where id belongs to orgB, caller is orgA user → 404", async () => {
+    const res = await api.get('/v1/runs/run-b1', { as: 'user-a-admin' });
+
+    // Must be 404, not 403 — do not reveal that the run exists
+    expect(res.status).toBe(404);
+  });
+
+  it('admin user should see all runs in their org', async () => {
+    const res = await api.get('/v1/runs', { as: 'user-a-admin' });
+
+    expect(res.status).toBe(200);
+    const ids = res.body.runs.map((r: Run) => r.id);
+    expect(ids).toContain('run-a1');
+    expect(ids).toContain('run-a2');
+    expect(ids).toContain('run-a3');
+    expect(ids).not.toContain('run-b1');
+    expect(ids).not.toContain('run-b2');
+  });
+
+  it("viewer in project should see only that project's runs", async () => {
+    // user-a-viewer has project:viewer on proj-1 only
+    const res = await api.get('/v1/runs', { as: 'user-a-viewer' });
+
+    expect(res.status).toBe(200);
+    const ids = res.body.runs.map((r: Run) => r.id);
+    expect(ids).toContain('run-a1');     // proj-1 run — visible
+    expect(ids).not.toContain('run-a2'); // proj-2 run — not visible
+    expect(ids).not.toContain('run-a3'); // proj-3 run — not visible
+  });
+
+  it('pagination should not allow enumerating runs across tenants', async () => {
+    // Fetch many pages as user-a-admin; none should contain org-b runs
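+    let page = 1;
+    let totalOrgBLeaks = 0;
+    let hasMore = true;
+    const maxPages = 50; // safety cap so a hasNextPage bug cannot hang the suite
+
+    while (hasMore && page <= maxPages) {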
+      const res = await api.get(`/v1/runs?page=${page}&limit=2`, { as: 'user-a-admin' });
+      expect(res.status).toBe(200);
+      const leaked = res.body.runs.filter((r: Run) => r.orgId === 'org-b');
+      totalOrgBLeaks += leaked.length;
+      hasMore = res.body.hasNextPage;
+      page++;
+    }
+
+    expect(totalOrgBLeaks).toBe(0);
+  });
+});
+```
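+
+The visibility rules these tests exercise can be sketched as a pure filter. This is a minimal illustration, not the service's actual query logic: the `Caller` and `RunRow` shapes and the `visibleRuns` name are hypothetical, and the role scopes are inferred from the fixture table (org admins see the whole org, workspace members their workspaces, project viewers their projects).
+
+```typescript
+// Hypothetical shapes for illustration; the real models live in the codebase.
+type Role = "org:admin" | "workspace:member" | "project:viewer";
+
+interface Caller {
+  orgId: string;
+  role: Role;
+  workspaceIds: string[]; // consulted for workspace:member
+  projectIds: string[]; // consulted for project:viewer
+}
+
+interface RunRow {
+  id: string;
+  orgId: string;
+  workspaceId: string;
+  projectId: string | null;
+}
+
+export function visibleRuns(caller: Caller, runs: RunRow[]): RunRow[] {
+  return runs.filter((run) => {
+    // Tenant boundary first: nothing outside the caller's org is ever visible.
+    if (run.orgId !== caller.orgId) return false;
+    switch (caller.role) {
+      case "org:admin":
+        return true;
+      case "workspace:member":
+        return caller.workspaceIds.includes(run.workspaceId);
+      case "project:viewer":
+        return run.projectId !== null && caller.projectIds.includes(run.projectId);
+    }
+  });
+}
+```
+
+Checking the org boundary before any role logic means a mistake in the role branches can widen access within a tenant but cannot leak rows across tenants.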
+
+---
+
+### 4.3 Describe: Run State Management Isolation
+
+```typescript
+describe('Run State Management Isolation', () => {
+  it('POST /v1/runs/{id}/cancel where run is in orgB → 404 not 403', async () => {
+    // user-a-admin targets a run that belongs to org-b
+    const res = await api.post('/v1/runs/run-b1/cancel', {}, { as: 'user-a-admin' });
+
+    expect(res.status).toBe(404);
+    // Must not be 403 — that would confirm the run's existence
+    expect(res.status).not.toBe(403);
+  });
+
+  it('POST /v1/runs/{id}/resume by user without run:update permission → 403', async () => {
+    // user-a-viewer only has project:viewer, not run:update
+    const res = await api.post('/v1/runs/run-a1/resume', {}, { as: 'user-a-viewer' });
+
+    expect(res.status).toBe(403);
+    expect(res.body.code).toBe('FORBIDDEN');
+  });
+
+  it('POST /v1/runs/{id}/pause by different org user → 404', async () => {
+    const res = await api.post('/v1/runs/run-a1/pause', {}, { as: 'user-b-member' });
+
+    expect(res.status).toBe(404);
+  });
+});
+```
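+
+The 404-versus-403 distinction asserted above can be expressed as a small decision function. This is a sketch under assumptions: `authorizeRunAction`, `Actor`, and `TargetRun` are hypothetical names, and permissions are modelled as plain strings such as `run:update`.
+
+```typescript
+// Hypothetical types for illustration only.
+interface Actor {
+  orgId: string;
+  permissions: string[];
+}
+
+interface TargetRun {
+  id: string;
+  orgId: string;
+}
+
+export function authorizeRunAction(
+  actor: Actor,
+  run: TargetRun | undefined,
+  permission: string,
+): 200 | 403 | 404 {
+  // A missing run and a run in another tenant must look identical to the caller.
+  if (run === undefined || run.orgId !== actor.orgId) return 404;
+  // Same tenant: the caller may already know the run exists, so 403 leaks nothing.
+  if (!actor.permissions.includes(permission)) return 403;
+  return 200;
+}
+```
+
+Evaluating the tenant check before the permission check is what keeps a cross-tenant caller from ever receiving a 403 for a run they should not know about.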
+
+---
+
+### 4.4 Describe: Audit Trail Scoping
+
+```typescript
+describe('Audit Trail Scoping', () => {
+  it('audit events should include orgId, workspaceId, projectId', async () => {
+    const res = await api.post('/v1/runs', {
+      orgId: 'org-a',
+      workspaceId: 'ws-1',
+      projectId: 'proj-1',
+      idea: 'audit check',
+    }, { as: 'user-a-admin' });
+
+    expect(res.status).toBe(201);
+    const runId = res.body.run.id;
+
+    const auditRes = await api.get(`/v1/audit?runId=${runId}`, { as: 'user-a-admin' });
+    expect(auditRes.status).toBe(200);
+
+    const event = auditRes.body.events[0];
+    expect(event.orgId).toBe('org-a');
+    expect(event.workspaceId).toBe('ws-1');
+    expect(event.projectId).toBe('proj-1');
+  });
+
+  it("GET /v1/audit should only return events from caller's tenant", async () => {
+    const res = await api.get('/v1/audit', { as: 'user-a-admin' });
+
+    expect(res.status).toBe(200);
+    const orgIds = res.body.events.map((e: AuditEvent) => e.orgId);
+    expect(orgIds.every((id: string) => id === 'org-a')).toBe(true);
+  });
+
+  it('SSE stream should not leak events from other tenants', async () => {
+    const received: SSEEvent[] = [];
+
+    // Connect as user-a-admin
+    const stream = await sseClient.connect('/v1/stream', { as: 'user-a-admin' });
+    stream.on('event', (e: SSEEvent) => received.push(e));
+
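+    // Trigger an org-b action as user-b-admin and confirm it succeeded;
+    // otherwise the "no leak" assertion below would pass vacuously
+    const created = await api.post('/v1/runs', { idea: 'b event', orgId: 'org-b', workspaceId: 'ws-3' }, { as: 'user-b-admin' });
+    expect(created.status).toBe(201);
+    await wait(200); // allow SSE propagation window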
+
+    stream.close();
+    const orgBLeaks = received.filter((e) => e.orgId === 'org-b');
+    expect(orgBLeaks).toHaveLength(0);
+  });
+});
+```
+
+---
+
+### 4.5 Describe: Default Org Bypass (Security Bug R-02)
+
+```typescript
+describe('Default Org Bypass — R-02', () => {
+  it("POST /v1/runs with orgId='default' should be rejected with 400", async () => {
+    const res = await api.post('/v1/runs', {
+      orgId: 'default',
+      workspaceId: 'ws-1',
+      idea: 'bypass attempt',
+    }, { as: 'user-a-admin' });
+
+    expect(res.status).toBe(400);
+    expect(res.body.code).toBe('INVALID_ORG_ID');
+  });
+
+  it("resolveSession with orgId='default' should not bypass tenant checks", async () => {
+    // Call resolveSession directly (unit test)
+    const { resolveSession } = await import('packages/auth/src/session');
+    const session = await resolveSession({
+      token: buildValidToken({ sub: 'user-a-admin', org: 'org-a' }),
+      requestedOrgId: 'default',
+    });
+
+    expect(session).toBeNull();
+  });
+
+  it("removing the 'default' exemption should not break normal runs", async () => {
+    // Confirm org-a runs still work correctly after the fix
+    const res = await api.post('/v1/runs', {
+      orgId: 'org-a',
+      workspaceId: 'ws-1',
+      idea: 'normal run after fix',
+    }, { as: 'user-a-admin' });
+
+    expect(res.status).toBe(201);
+    expect(res.body.run.orgId).toBe('org-a');
+  });
+});
+```
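+
+One way the R-02 fix might look at the validation layer is sketched below. It is an assumption, not the actual patch: `checkRequestedOrg` and `RESERVED_ORG_IDS` are hypothetical names, while the `INVALID_ORG_ID` and `FORBIDDEN` codes are the ones the tests above expect.
+
+```typescript
+// Sentinel values that must never be accepted as a tenant identifier.
+const RESERVED_ORG_IDS = new Set(["default"]);
+
+type OrgCheck =
+  | { ok: true; orgId: string }
+  | { ok: false; status: 400 | 403; code: string };
+
+export function checkRequestedOrg(
+  requestedOrgId: string,
+  memberOrgIds: readonly string[],
+): OrgCheck {
+  // Reject reserved IDs outright instead of treating them as an exemption.
+  if (RESERVED_ORG_IDS.has(requestedOrgId)) {
+    return { ok: false, status: 400, code: "INVALID_ORG_ID" };
+  }
+  // Every remaining ID goes through the normal membership check.
+  if (!memberOrgIds.includes(requestedOrgId)) {
+    return { ok: false, status: 403, code: "FORBIDDEN" };
+  }
+  return { ok: true, orgId: requestedOrgId };
+}
+```
+
+The function has no branch that returns `ok: true` without a membership match, so no org ID can skip the tenant check.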
+
+---
+
+## 5. Pass Criteria
+
+All tests in this plan must pass before the v1.3.0 release tag is cut. Failure
+of any isolation test, particularly those in section 4.5, is a **hard block**.
+
+| Category | Required | Notes |
+|----------|----------|-------|
+| Run Creation Scoping | 5/5 pass | — |
+| Run Retrieval Isolation | 5/5 pass | — |
+| Run State Management Isolation | 3/3 pass | — |
+| Audit Trail Scoping | 3/3 pass | — |
+| Default Org Bypass (R-02) | 3/3 pass | Hard block — cannot ship with any failure |
+
+---
+
+## 6. Related Documents
+
+- `docs/06_validation/SECURITY_TESTING_PLAN.md`
+- `docs/06_validation/GO_NO_GO_CHECKLIST.md`
+- `docs/06_validation/PRODUCTION_READINESS.md`
+- `docs/06_validation/TEST_PLAN_AUTH.md`
+- `docs/06_validation/TEST_PLAN_RBAC.md`
diff --git a/docs/06_validation/TEST_STRATEGY.md b/docs/06_validation/TEST_STRATEGY.md
new file mode 100644
index 0000000..01aa09a
--- /dev/null
+++ b/docs/06_validation/TEST_STRATEGY.md
@@ -0,0 +1,485 @@
+# Test Strategy — Code-Kit-Ultra
+
+**Version**: 2.0.0
+**Date**: 2026-04-04
+**Status**: Active
+**Owner**: Platform Engineering
+
+---
+
+## Table of Contents
+
+1. [Testing Philosophy](#1-testing-philosophy)
+2. [Test Pyramid](#2-test-pyramid)
+3. [Coverage Targets](#3-coverage-targets)
+4. [Test Naming Convention](#4-test-naming-convention)
+5. [Test Categories and Owners](#5-test-categories-and-owners)
+6. [Test Infrastructure Requirements](#6-test-infrastructure-requirements)
+7. [Environment Setup](#7-environment-setup)
+8. [CI Integration](#8-ci-integration)
+9. [Flakiness Policy](#9-flakiness-policy)
+10. [Coverage Reporting](#10-coverage-reporting)
+11. [Phase-Based Rollout](#11-phase-based-rollout)
+12. [Definition of Done](#12-definition-of-done)
+
+---
+
+## 1. Testing Philosophy
+
+Code-Kit-Ultra is a multi-tenant AI orchestration platform where a single defect in auth, gating, or
+tenancy isolation can silently expose one customer's data to another. Our testing philosophy reflects
+that risk profile directly.
+
+### Contract-First
+
+Every public interface — HTTP routes, token verification, gate evaluation, permission resolution —
+is tested against its stated contract before any implementation detail is examined. If the contract
+changes, the tests must change first. This ensures:
+
+- Consumers of a package can rely on documented behaviour.
+- Refactors that preserve the contract do not break tests.
+- Breaking changes are always visible as red tests before a PR is merged.
+
+### Behavior-Driven
+
+Tests describe *observable system behaviour* from the perspective of the caller, not internal
+implementation steps. A test for `verifyInsForgeToken` asserts "given an expired token, the caller
+receives a `TokenExpiredError`" — not "the `jwt.verify` callback received an error object with
+message `jwt expired`". This means:
+
+- Tests survive implementation refactors.
+- Failure messages communicate user-visible regressions.
+- Mocks are used only at system boundaries (HTTP, DB, external APIs).
+
+### Security-Aware
+
+Security properties — cross-tenant isolation, token revocation, permission enforcement — are treated
+as first-class test requirements, not afterthoughts. Each security invariant has at least one
+dedicated test case. Security tests are labelled `@security` and always run on every PR regardless
+of which files changed.
+
+---
+
+## 2. Test Pyramid
+
+```
+                    ┌──────────────┐
+                    │   E2E (10%)  │  Full pipeline runs against a real DB and JWKS mock
+                    ├──────────────┤
+                    │ Integration  │  Cross-package flows, real DB queries, HTTP layer
+                    │    (20%)     │
+                    ├──────────────┤
+                    │  Unit (70%)  │  Pure functions, mocked I/O, fast feedback
+                    └──────────────┘
+```
+
+### Unit Tests (70%)
+
+Unit tests cover a single function or class with all I/O mocked. They run in under 10 ms each and
+require no external services. Examples:
+
+| File | What is tested |
+|---|---|
+| `packages/auth/src/verify-insforge-token.test.ts` | JWKS key fetch + jwt.verify contract |
+| `packages/auth/src/resolve-session.test.ts` | Claims mapping, default role injection |
+| `packages/auth/src/issue-execution-token.test.ts` | HS256 token payload, 10-min expiry |
+| `packages/policy/src/resolve-permissions.test.ts` | Role → permission matrix, alias expansion |
+| `packages/governance/src/confidence-engine.test.ts` | Weighted score calculation |
+| `packages/governance/src/gate-manager.test.ts` | Gate sequencing, turbo mode auto-pass |
+| `packages/healing/src/failure-classifier.test.ts` | Error type classification |
+
+### Integration Tests (20%)
+
+Integration tests exercise a slice of the system across at least two packages, using a real (seeded)
+test database and mocked external services (JWKS, GitHub API). They run in under 2 seconds each.
+Examples:
+
+| File | What is tested |
+|---|---|
+| `packages/orchestrator/test/run-lifecycle.test.ts` | planned → running → completed transition |
+| `packages/orchestrator/test/gate-rejection.test.ts` | Gate reject → run cancelled |
+| `packages/auth/test/session-revocation.test.ts` | jti blacklist prevents session use |
+| `packages/policy/test/cross-tenant.test.ts` | orgA run not visible to orgB user |
+| `apps/control-service/test/approvals.test.ts` | POST /v1/gates/:id/approve HTTP contract |
+
+### E2E Tests (10%)
+
+E2E tests run the full 8-phase pipeline from intake to deployment artifact against a containerised
+Postgres instance and a JWKS mock server. They validate the happy path and critical failure paths
+for each supported mode. Examples:
+
+| Scenario | Modes covered |
+|---|---|
+| Full pipeline completes successfully | turbo, builder, safe |
+| Gate rejection cancels run | safe |
+| Healing loop recovers transient failure | builder |
+| Service account executes run end-to-end | turbo |
+
+---
+
+## 3. Coverage Targets
+
+Coverage is measured per package with Vitest's built-in coverage support (the `v8` provider configured in Section 10).
+Targets are enforced as hard failures in CI.
+
+| Package | Line coverage target | Branch coverage target | Notes |
+|---|---|---|---|
+| `packages/auth` | ≥ 90% | ≥ 85% | Security-critical; no exceptions |
+| `packages/governance` | ≥ 80% | ≥ 75% | 9 gates must each be tested |
+| `packages/policy` | ≥ 80% | ≥ 75% | Permission matrix fully exercised |
+| `packages/orchestrator` | ≥ 80% | ≥ 70% | All phase transitions covered |
+| `packages/healing` | ≥ 80% | ≥ 70% | All classifier branches covered |
+| `packages/adapters` | ≥ 70% | ≥ 60% | Adapter contracts, not LLM output |
+| `packages/shared` | ≥ 60% | ≥ 50% | Mostly type declarations |
+| `apps/control-service` | ≥ 75% | ≥ 65% | HTTP layer + middleware |
+
+Current overall coverage: ~40%. Target overall coverage: ≥ 78%.
+
+---
+
+## 4. Test Naming Convention
+
+All test files follow a single naming convention to make failure output unambiguous.
+
+```typescript
+// File: packages/auth/src/verify-insforge-token.test.ts
+
+describe("verifyInsForgeToken", () => {
+  describe("when the JWKS endpoint is reachable", () => {
+    it("should return decoded claims for a valid RS256 token", async () => { ... });
+    it("should throw TokenExpiredError when the token exp has passed", async () => { ... });
+    it("should throw InvalidIssuerError when iss does not match config", async () => { ... });
+  });
+
+  describe("when the JWKS endpoint is unreachable", () => {
+    it("should retry three times then throw JwksUnavailableError", async () => { ... });
+  });
+});
+```
+
+Rules:
+- `describe` block names are the name of the exported function, class, or route (e.g. `"evaluateGates"`, `"POST /v1/runs"`).
+- Nested `describe` blocks group by precondition: `"when [context]"`.
+- `it` names always start with `"should"` and include both the expected outcome and the triggering condition: `"should [outcome] when [trigger]"`.
+- Test file names mirror source file names with `.test.ts` suffix.
+- Test files live adjacent to source files for unit tests; in a `test/` subdirectory for integration and E2E tests.
+
+---
+
+## 5. Test Categories and Owners
+
+| Category | Packages | Owner team | Priority |
+|---|---|---|---|
+| **Auth** | `packages/auth` | Platform Security | P0 — blocking |
+| **Run Lifecycle** | `packages/orchestrator`, `packages/shared` | Orchestration | P0 — blocking |
+| **Gate Evaluation** | `packages/governance`, `packages/orchestrator` | Governance | P0 — blocking |
+| **RBAC and Permissions** | `packages/policy` | Platform Security | P0 — blocking |
+| **Cross-Tenant Isolation** | `packages/policy`, `packages/orchestrator` | Platform Security | P0 — blocking |
+| **Adapter Execution** | `packages/adapters`, `packages/skill-engine` | Adapters | P1 — high value |
+| **Persistence** | `packages/storage`, `db/` | Data | P1 — high value |
+| **Healing Loop** | `packages/healing` | Resilience | P1 — high value |
+| **Observability** | `packages/observability`, `packages/audit` | Observability | P2 — nice to have |
+| **Realtime SSE** | `packages/realtime` | Platform | P2 — nice to have |
+
+---
+
+## 6. Test Infrastructure Requirements
+
+All infrastructure components must be available before integration and E2E tests can run.
+
+### JWKS Mock Server
+
+- Purpose: serve RS256 public keys without requiring a live InsForge instance.
+- Implementation: `msw` (Mock Service Worker) in Node mode, or a lightweight `fastify` server
+  started in `globalSetup`.
+- Must expose: `GET /.well-known/jwks.json` returning a valid JWKS payload with at least one RSA
+  key.
+- Key pair: generated at test startup using `node:crypto` `generateKeyPairSync("rsa", { modulusLength: 2048 })`.
+- Tokens for tests are signed with the corresponding private key.
+
+```typescript
+// tests/mocks/jwks-server.ts
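+import { createServer } from "node:http";
+import { createPublicKey, generateKeyPairSync } from "node:crypto";
+
+export const { privateKey, publicKey } = generateKeyPairSync("rsa", {
+  modulusLength: 2048,
+  publicKeyEncoding: { type: "spki", format: "pem" },
+  privateKeyEncoding: { type: "pkcs8", format: "pem" },
+});
+
+// JWK form of the public key. The `kid` value is illustrative; tokens signed in
+// tests must carry the same `kid` in their header.
+const jwk = {
+  ...createPublicKey(publicKey).export({ format: "jwk" }),
+  kid: "test-key-1",
+  use: "sig",
+  alg: "RS256",
+};
+
+export function startJwksMockServer(port = 9999) {
+  const server = createServer((req, res) => {
+    if (req.url === "/.well-known/jwks.json") {
+      res.writeHead(200, { "Content-Type": "application/json" });
+      res.end(JSON.stringify({ keys: [jwk] }));
+      return;
+    }
+    // Answer every other path so stray requests fail fast instead of hanging.
+    res.writeHead(404);
+    res.end();
+  });
+  server.listen(port);
+  return server;
+}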
+```
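+
+Signing test tokens with the private key can be done with `node:crypto` alone. The sketch below is illustrative: `signTestToken` and `verifyTestToken` are hypothetical helper names, the `kid` default is an assumption, and production code should rely on a maintained JWT library rather than hand-rolled encoding.
+
+```typescript
+import { createSign, createVerify, generateKeyPairSync } from "node:crypto";
+
+const b64url = (input: Buffer): string => input.toString("base64url");
+
+// Builds a compact RS256 JWT: base64url(header).base64url(payload).base64url(signature).
+export function signTestToken(
+  claims: Record<string, unknown>,
+  privateKeyPem: string,
+  kid = "test-key-1", // assumed key ID; must match the JWKS entry
+): string {
+  const header = b64url(Buffer.from(JSON.stringify({ alg: "RS256", typ: "JWT", kid })));
+  const payload = b64url(Buffer.from(JSON.stringify(claims)));
+  const signingInput = `${header}.${payload}`;
+  const signature = createSign("RSA-SHA256").update(signingInput).sign(privateKeyPem);
+  return `${signingInput}.${b64url(signature)}`;
+}
+
+// Checks only the signature; claim validation (exp, iss, aud) is out of scope here.
+export function verifyTestToken(token: string, publicKeyPem: string): boolean {
+  const parts = token.split(".");
+  if (parts.length !== 3) return false;
+  const [header, payload, signature] = parts as [string, string, string];
+  return createVerify("RSA-SHA256")
+    .update(`${header}.${payload}`)
+    .verify(publicKeyPem, Buffer.from(signature, "base64url"));
+}
+
+// Usage: generate a throwaway key pair and round-trip a token.
+const keys = generateKeyPairSync("rsa", {
+  modulusLength: 2048,
+  publicKeyEncoding: { type: "spki", format: "pem" },
+  privateKeyEncoding: { type: "pkcs8", format: "pem" },
+});
+const token = signTestToken({ sub: "user-a-admin", org: "org-a" }, keys.privateKey);
+console.log(verifyTestToken(token, keys.publicKey)); // true
+```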
+
+### Test Database
+
+- Engine: PostgreSQL 15 running in Docker (or `pg-mem` for unit tests that need SQL without Docker).
+- Lifecycle: schema applied from `db/migrations/`, fixtures seeded before each test suite, cleaned
+  after each suite.
+- Fixture sets:
+  - `fixtures/multi-tenant.sql`: 2 orgs, 3 workspaces, 5 projects, 8 users across roles.
+  - `fixtures/run-states.sql`: runs in each `RunStatus` state.
+  - `fixtures/gate-decisions.sql`: gates in each `GateStatus` state.
+- Access: exposed via `TEST_DATABASE_URL` environment variable.
+
+### GitHub API Mock
+
+- Purpose: prevent adapter tests from hitting real GitHub APIs.
+- Implementation: `msw` handlers registered in test setup for `https://api.github.com/*`.
+- Covers: repo creation, PR creation, commit, status checks.
+
+### SSE Listener Utility
+
+```typescript
+// tests/utils/sse-listener.ts
+export async function collectSseEvents(url: string, count: number): Promise<unknown[]> {
+  // Opens an EventSource, collects `count` events, then closes.
+}
+```
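+
+A listener like this needs to turn the raw `text/event-stream` body into discrete events. The parser below is a minimal sketch of that step, assuming the whole body is available as one string; `parseSseFrames` and `SseFrame` are hypothetical names, and the `id:` and `retry:` fields are ignored.
+
+```typescript
+export interface SseFrame {
+  event: string;
+  data: string;
+}
+
+export function parseSseFrames(raw: string): SseFrame[] {
+  const frames: SseFrame[] = [];
+  // Events are separated by a blank line.
+  for (const block of raw.split(/\r?\n\r?\n/)) {
+    let event = "message"; // default event type when no `event:` field is present
+    const data: string[] = [];
+    for (const line of block.split(/\r?\n/)) {
+      if (line.startsWith(":")) continue; // comment or keep-alive ping
+      if (line.startsWith("event:")) event = line.slice(6).trim();
+      else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
+    }
+    // Blocks without any data field are not dispatched as events.
+    if (data.length > 0) frames.push({ event, data: data.join("\n") });
+  }
+  return frames;
+}
+```
+
+Tenant-isolation tests can then `JSON.parse` each frame's `data` and assert on its `orgId`.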
+
+### Fixture Builders
+
+Type-safe factory functions for constructing test data without manual object assembly.
+
+```typescript
+// tests/fixtures/builders.ts
+import type { RunState, RunBundle, PlanTask } from "@cku/shared";
+
+export function buildRunState(overrides: Partial<RunState> = {}): RunState {
+  return {
+    runId: "run-test-001",
+    createdAt: new Date().toISOString(),
+    updatedAt: new Date().toISOString(),
+    currentStepIndex: 0,
+    status: "planned",
+    approvalRequired: false,
+    approved: false,
+    orgId: "org-test-001",
+    workspaceId: "ws-test-001",
+    projectId: "proj-test-001",
+    actorId: "user-test-001",
+    actorType: "user",
+    correlationId: "corr-test-001",
+    ...overrides,
+  };
+}
+
+export function buildPlanTask(overrides: Partial<PlanTask> = {}): PlanTask {
+  return {
+    id: "task-001",
+    title: "Test task",
+    description: "A test task",
+    phase: "building",
+    doneDefinition: "File created",
+    taskType: "implementation",
+    adapterId: "code-writer",
+    payload: {},
+    ...overrides,
+  };
+}
+```
+
+---
+
+## 7. Environment Setup
+
+### Prerequisites
+
+```bash
+# Install dependencies
+pnpm install
+
+# Start test infrastructure (Docker required)
+docker compose -f docker-compose.test.yml up -d postgres
+```
+
+### Running Tests
+
+```bash
+# All tests in a single package
+pnpm --filter @cku/auth test
+
+# Auth package only (alias)
+pnpm test:auth
+
+# All unit tests across the monorepo
+pnpm vitest run
+
+# Watch mode (development)
+pnpm vitest
+
+# Coverage report for a single package
+pnpm --filter @cku/auth vitest run --coverage
+
+# Coverage report for all packages
+pnpm vitest run --coverage
+```
+
+### Environment Variables for Tests
+
+```bash
+# .env.test (committed without secrets, used by vitest.config.ts)
+INSFORGE_JWT_ISSUER=https://auth.insforge.test
+INSFORGE_JWT_AUDIENCE=cku-api-test
+INSFORGE_JWKS_URL=http://localhost:9999/.well-known/jwks.json
+INSFORGE_SERVICE_ROLE_KEY=test-service-role-key-32-chars-min
+CKU_SERVICE_ACCOUNT_SECRET=test-sa-secret-change-in-prod
+TEST_DATABASE_URL=postgresql://cku_test:cku_test@localhost:5433/cku_test
+```
+
+---
+
+## 8. CI Integration
+
+### On Every Pull Request
+
+The following test suites run as required status checks. A PR cannot be merged if any of these fail.
+
+| Suite | Command | Timeout |
+|---|---|---|
+| Unit tests (all packages) | `pnpm vitest run --reporter=verbose` | 3 min |
+| Auth package coverage gate | `pnpm --filter @cku/auth vitest run --coverage` | 2 min |
+| Policy package coverage gate | `pnpm --filter @cku/policy vitest run --coverage` | 2 min |
+| Security-labelled tests | `pnpm vitest run --reporter=verbose -t @security` | 2 min |
+| TypeScript type check | `pnpm tsc --noEmit` | 2 min |
+
+### On Merge to Main
+
+All PR checks plus:
+
+| Suite | Command | Timeout |
+|---|---|---|
+| Integration tests | `pnpm vitest run --config vitest.integration.config.ts` | 10 min |
+| E2E tests | `pnpm vitest run --config vitest.e2e.config.ts` | 20 min |
+| Full coverage report | `pnpm vitest run --coverage --coverage.reporter=lcov` | 15 min |
+
+### On Release Tag
+
+All of the above plus manual smoke test checklist in `docs/06_validation/VALIDATION_MASTER.md`.
+
+---
+
+## 9. Flakiness Policy
+
+Flaky tests are treated with the same urgency as production bugs. A test is considered flaky if it
+fails on more than 1 in 20 runs without a code change.
+
+Rules:
+
+- **No `sleep()` or `setTimeout()` in test bodies.** Use `vi.useFakeTimers()` and `vi.advanceTimersByTime()` for time-dependent code.
+- **All async operations must be properly awaited.** Never use floating promises. ESLint rule `@typescript-eslint/no-floating-promises` is enabled.
+- **No hardcoded ports.** Use `0` (OS-assigned) for servers started in tests, or use a port registry utility to avoid conflicts.
+- **No shared mutable state between tests.** Each test must be independently runnable. Use `beforeEach` / `afterEach` for setup and teardown.
+- **No tests that depend on execution order.** Running `vitest run --shuffle` must produce the same pass/fail result.
+- Any flaky test discovered in CI must be fixed within 2 business days or temporarily skipped with a tracked issue reference:
+  ```typescript
+  it.skip("should [...] — SKIP: tracked in GH-1234", async () => { ... });
+  ```
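+
+The "no hardcoded ports" rule can be followed with plain `node:http` by listening on port `0` and reading back the port the OS assigned. This is a sketch; `listenOnFreePort` and `startEphemeralServer` are hypothetical helper names, not existing utilities in the repo.
+
+```typescript
+import { createServer, type Server } from "node:http";
+import type { AddressInfo } from "node:net";
+
+export function listenOnFreePort(server: Server): Promise<number> {
+  return new Promise((resolve, reject) => {
+    server.once("error", reject);
+    // Port 0 asks the OS for any free port; the real port is known once listening.
+    server.listen(0, "127.0.0.1", () => {
+      resolve((server.address() as AddressInfo).port);
+    });
+  });
+}
+
+export async function startEphemeralServer(): Promise<{ server: Server; url: string }> {
+  const server = createServer((_req, res) => {
+    res.writeHead(204);
+    res.end();
+  });
+  const port = await listenOnFreePort(server);
+  return { server, url: `http://127.0.0.1:${port}` };
+}
+```
+
+Tests would pass the returned `url` to the code under test (for example as the JWKS URL) instead of reading a fixed port from configuration, and call `server.close()` in teardown.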
+
+---
+
+## 10. Coverage Reporting
+
+Coverage is collected using Vitest's `v8` coverage provider.
+
+```typescript
+// vitest.config.ts (relevant section)
+import { defineConfig } from "vitest/config";
+
+export default defineConfig({
+  test: {
+    coverage: {
+      provider: "v8",
+      reporter: ["text", "lcov", "html"],
+      thresholds: {
+        lines: 78,
+        branches: 70,
+        functions: 75,
+        statements: 78,
+      },
+      include: ["packages/*/src/**/*.ts"],
+      exclude: ["**/*.test.ts", "**/*.d.ts", "**/index.ts"],
+    },
+  },
+});
+```
+
+### Viewing the Report
+
+```bash
+# Generate HTML report
+pnpm vitest run --coverage
+
+# Open the report (macOS; use xdg-open on Linux)
+open coverage/index.html
+```
+
+### CI Artifact
+
+The `lcov.info` file is uploaded as a CI artifact and sent to the coverage dashboard on every merge
+to `main`. Coverage regressions of more than 2% relative to the previous merge block the release
+pipeline.
+
+---
+
+## 11. Phase-Based Rollout
+
+### Phase 1 — Blocking (8 dev-days estimated)
+
+These tests must pass before any production release. They cover the highest-risk security and
+correctness invariants.
+
+| Test file to create | Owner | Days |
+|---|---|---|
+| `packages/auth/src/issue-execution-token.test.ts` | Platform Security | 0.5 |
+| `packages/auth/src/service-account.test.ts` | Platform Security | 0.5 |
+| `packages/auth/test/session-revocation.test.ts` | Platform Security | 1.0 |
+| `packages/policy/src/resolve-permissions.test.ts` | Platform Security | 0.5 |
+| `packages/policy/test/cross-tenant.test.ts` | Platform Security | 1.5 |
+| `packages/governance/src/gate-manager.test.ts` | Governance | 1.0 |
+| `packages/orchestrator/test/gate-rejection.test.ts` | Orchestration | 1.0 |
+| `packages/orchestrator/test/run-lifecycle.test.ts` | Orchestration | 2.0 |
+
+### Phase 2 — High Value (7 dev-days estimated)
+
+These tests significantly raise confidence and coverage but are not strictly blocking for the first
+production release.
+
+| Test file to create | Owner | Days |
+|---|---|---|
+| `packages/healing/src/healing-engine.test.ts` | Resilience | 2.0 |
+| `packages/healing/src/failure-classifier.test.ts` | Resilience | 0.5 |
+| `packages/governance/src/confidence-engine.test.ts` | Governance | 0.5 |
+| `packages/governance/src/constraint-engine.test.ts` | Governance | 0.5 |
+| `packages/governance/src/kill-switch.test.ts` | Governance | 0.5 |
+| `packages/storage/test/persistence.test.ts` | Data | 2.0 |
+| `apps/control-service/test/session.test.ts` | Platform | 1.0 |
+
+### Phase 3 — Nice to Have
+
+- SSE event stream tests (`packages/realtime`)
+- Adapter execution tests (`packages/adapters`)
+- Learning / memory package tests
+- Performance benchmarks for gate evaluation at scale
+
+---
+
+## 12. Definition of Done
+
+The test strategy for Code-Kit-Ultra is considered complete when:
+
+1. All Phase 1 test files exist and pass in CI with zero skipped tests.
+2. Coverage targets in Section 3 are met for `packages/auth`, `packages/policy`, and `packages/governance`.
+3. The JWKS mock server, test database, and fixture builders are implemented and documented.
+4. The flakiness policy in Section 9 is enforced via ESLint and reviewed in every PR.
+5. CI gates in Section 8 are configured and actively blocking merges on failure.
+6. All security-labelled tests (`@security`) run on every PR without exception.
+
+Until all six criteria are met, the project is not cleared for production traffic from external tenants.
diff --git a/docs/07_cicd/CONVENTIONAL_COMMITS.md b/docs/07_cicd/CONVENTIONAL_COMMITS.md
new file mode 100644
index 0000000..84110bf
--- /dev/null
+++ b/docs/07_cicd/CONVENTIONAL_COMMITS.md
@@ -0,0 +1,343 @@
+# Conventional Commits Specification
+
+**Project:** Code-Kit-Ultra v1.2.0
+**Enforced by:** `lint:commits` script (`tsx scripts/release/lint-commits.ts`)
+**Workflow:** `lint-commits.yml` runs on every PR
+
+---
+
+## 1. Format Specification
+
+Every commit message must follow this structure:
+
+```
+<type>(<scope>): <subject>
+
+[optional body]
+
+[optional footer(s)]
+```
+
+- The first line (header) is mandatory. Body and footers are optional.
+- A blank line separates the header from the body, and the body from footers.
+- The header must not exceed 72 characters.
+
+---
+
+## 2. Commit Types
+
+| Type | Semver Impact | Description | Example |
+|---|---|---|---|
+| `feat` | MINOR | New feature added to the public API or user-facing behavior | `feat(auth): add Redis session revocation` |
+| `fix` | PATCH | Bug fix that corrects incorrect behavior | `fix(orchestrator): handle null steps in phase-engine` |
+| `security` | PATCH | Security fix or vulnerability remediation | `security(auth): remove hardcoded SA secret` |
+| `docs` | none | Documentation only changes | `docs(orchestrator): add SPEC_ORCHESTRATOR.md` |
+| `chore` | none | Maintenance work that does not affect runtime behavior | `chore(deps): update vitest to 1.5.0` |
+| `test` | none | Adding or updating tests with no production code change | `test(auth): add cross-tenant isolation tests` |
+| `refactor` | none | Code restructure with no behavior change | `refactor(governance): extract gate-controller from gate-manager` |
+| `perf` | PATCH | Performance improvement | `perf(db): add index on runs.org_id` |
+| `ci` | none | Changes to CI/CD workflows or scripts | `ci: add integration test workflow` |
+| `build` | none | Changes to the build system or tooling configuration | `build: add vitest.integration.config.ts` |
+| `revert` | varies | Reverts a previous commit | `revert: feat(auth): add session revocation` |
+| `BREAKING CHANGE` | MAJOR | Breaking API change — use in footer, not as type | See Section 7 |
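+
+The type-to-semver mapping in this table can be sketched as a lookup, with a `BREAKING CHANGE:` footer overriding the type. This is an illustration, not the project's release tooling; `semverImpact`, `Bump`, and `TYPE_BUMPS` are hypothetical names.
+
+```typescript
+type Bump = "major" | "minor" | "patch" | "none" | "varies";
+
+// Mirrors the "Semver Impact" column of the table above.
+const TYPE_BUMPS: Record<string, Bump> = {
+  feat: "minor",
+  fix: "patch",
+  security: "patch",
+  perf: "patch",
+  docs: "none",
+  chore: "none",
+  test: "none",
+  refactor: "none",
+  ci: "none",
+  build: "none",
+  revert: "varies",
+};
+
+export function semverImpact(message: string): Bump | undefined {
+  // A BREAKING CHANGE footer wins regardless of the commit type.
+  if (/^BREAKING CHANGE: /m.test(message)) return "major";
+  const type = /^([a-z]+)(?:\([^)]*\))?:/.exec(message)?.[1];
+  // Unknown or missing types yield undefined so the caller can reject them.
+  return type === undefined ? undefined : TYPE_BUMPS[type];
+}
+```
+
+A release script could reduce the impacts of all commits since the last tag to the highest bump; how `varies` is resolved for reverts is left to that script.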
+
+---
+
+## 3. Allowed Scopes
+
+Scopes are organized by layer. Use at most one scope per commit; the scope may be omitted for
+repo-wide changes (as in `ci: add integration test workflow`). Multi-scope commits must be split
+into separate commits.
+
+**Packages:**
+`auth`, `orchestrator`, `governance`, `adapters`, `healing`, `learning`, `observability`,
+`audit`, `events`, `realtime`, `storage`, `shared`, `policy`, `memory`, `security`,
+`agents`, `skill-engine`, `command-engine`, `core`, `tools`
+
+**Apps:**
+`cli`, `api`, `web`
+
+**Infrastructure:**
+`db`, `ci`, `deps`, `docker`, `k8s`
+
+**Docs:**
+`docs`, `specs`, `changelog`
+
+---
+
+## 4. Subject Rules
+
+- Use imperative mood: "add" not "added" or "adds"
+- No capital first letter after the colon
+- No period at the end
+- Maximum 72 characters for the full header (`type(scope): subject`)
+- Describe WHAT was done, not WHY — the why belongs in the body
+
+| Correct | Incorrect |
+|---|---|
+| `feat(auth): add session refresh endpoint` | `feat(auth): Added session refresh endpoint` |
+| `fix(orchestrator): handle empty step array` | `fix(orchestrator): fixed the bug` |
+| `docs(specs): add SPEC_GOVERNANCE.md` | `docs(specs): Add SPEC_GOVERNANCE.md.` |
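+
+Most of the header rules above are mechanical and can be checked with a regular expression plus a few string tests. The sketch below is one possible way to do that; `checkHeader` and its `allowedScopes` parameter are hypothetical and do not describe how `scripts/release/lint-commits.ts` is actually written. Imperative mood cannot be verified this way and still needs human review.
+
+```typescript
+// Commit types from Section 2 (BREAKING CHANGE is a footer, not a type).
+const TYPES = [
+  "feat", "fix", "security", "docs", "chore", "test",
+  "refactor", "perf", "ci", "build", "revert",
+];
+
+// type, optional (scope), colon, space, subject
+const HEADER = /^([a-z]+)(?:\(([^)]*)\))?: (.*)$/;
+
+// Returns a list of rule violations; an empty list means the header passes.
+function checkHeader(header: string, allowedScopes: string[]): string[] {
+  const problems: string[] = [];
+  if (header.length > 72) {
+    problems.push("header exceeds 72 characters");
+  }
+  const match = HEADER.exec(header);
+  if (!match) {
+    problems.push("header is not in type(scope): subject form");
+    return problems;
+  }
+  const type = match[1] ?? "";
+  const scope: string | undefined = match[2];
+  const subject = match[3] ?? "";
+
+  if (!TYPES.includes(type)) {
+    problems.push(`unknown type "${type}"`);
+  }
+  if (scope !== undefined && !allowedScopes.includes(scope)) {
+    problems.push(`scope "${scope}" is not allowed`);
+  }
+  if (subject.trim() === "") {
+    problems.push("subject is empty");
+  }
+  if (/^[A-Z]/.test(subject)) {
+    problems.push("subject starts with a capital letter");
+  }
+  if (subject.endsWith(".")) {
+    problems.push("subject ends with a period");
+  }
+  return problems;
+}
+```
+
+With `allowedScopes` set to the list in Section 3, the header `docs(specs): Add SPEC_GOVERNANCE.md.` produces two problems (capital letter and trailing period), while `feat(auth): add session refresh endpoint` produces none.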
+
+---
+
+## 5. Body Rules
+
+- Separate the body from the subject with a blank line
+- Wrap lines at 72 characters
+- Explain WHY the change was made and WHAT the impact is, not HOW it was implemented
+- Reference related behavior, prior decisions, or spec links when useful
+
+Example body:
+
+```
+Previously the orchestrator would silently skip phases when the step
+array was null rather than throwing. This masked misconfigurations
+upstream and made debugging harder. Now a MissingStepsError is thrown
+at bundle validation time with the phase name and bundle ID in context.
+```
+
+---
+
+## 6. Footer Rules
+
+**Breaking changes** — must appear in the footer for MAJOR version bumps:
+
+```
+BREAKING CHANGE: executeRunBundle now requires actorId as second parameter.
+Update all call sites to pass a valid actorId string.
+```
+
+**Issue references** — link to GitHub issues:
+
+```
+Fixes #123
+Closes #456
+Related to #789
+```
+
+**Co-authorship:**
+
+```
+Co-authored-by: Jane Smith <jane@example.com>
+```
+
+Multiple footers are allowed; each must appear on its own line.
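+
+To illustrate how the three footer kinds can be told apart, here is a small TypeScript sketch that scans a commit message line by line. `parseFooters` and the `Footers` shape are hypothetical names; the sketch assumes that a `BREAKING CHANGE:` footer may continue over following non-blank lines, as in the example above.
+
+```typescript
+// Hypothetical result shape for illustration.
+interface Footers {
+  breaking: string | null; // text of the BREAKING CHANGE footer, if any
+  issues: number[];        // issue numbers from Fixes / Closes / Related to
+  coAuthors: string[];     // "Name <email>" entries
+}
+
+function parseFooters(message: string): Footers {
+  const breakingLines: string[] = [];
+  const issues: number[] = [];
+  const coAuthors: string[] = [];
+  let inBreaking = false;
+
+  for (const line of message.split("\n")) {
+    const breaking = /^BREAKING CHANGE: (.+)$/.exec(line);
+    const issue = /^(?:Fixes|Closes|Related to) #(\d+)$/.exec(line);
+    const coAuthor = /^Co-authored-by: (.+ <.+>)$/.exec(line);
+
+    if (breaking) {
+      breakingLines.push(breaking[1] ?? "");
+      inBreaking = true;
+    } else if (issue) {
+      issues.push(Number(issue[1]));
+      inBreaking = false;
+    } else if (coAuthor) {
+      coAuthors.push(coAuthor[1] ?? "");
+      inBreaking = false;
+    } else if (inBreaking && line.trim() !== "") {
+      // Continuation line of a multi-line BREAKING CHANGE footer.
+      breakingLines.push(line.trim());
+    } else {
+      inBreaking = false;
+    }
+  }
+
+  return {
+    breaking: breakingLines.length > 0 ? breakingLines.join(" ") : null,
+    issues,
+    coAuthors,
+  };
+}
+```
+
+A message without any footer yields `breaking: null` and two empty arrays, which a release script could treat as "no MAJOR bump, nothing to link".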
+
+---
+
+## 7. Breaking Changes — Full Example
+
+```
+feat(orchestrator): require actorId in executeRunBundle
+
+Previously actor was optional and defaulted to "system". This was a
+security gap — all executions must be attributed to a known actor so
+that audit events and RBAC checks are meaningful.
+
+Callers that omit actorId will receive a compile-time error after this
+change. The "system" actor is still valid as an explicit value for
+internal automation.
+
+BREAKING CHANGE: executeRunBundle(bundle, actor) — the actor parameter
+is now required. Update all call sites to pass a valid actorId string.
+Passing "system" explicitly is acceptable for automated tasks.
+
+Closes #341
+```
+
+---
+
+## 8. Example Commits
+
+The following 30+ examples cover realistic changes across this codebase. Use them as
+reference when writing commit messages.
+
+```
+feat(auth): add Redis-backed session revocation
+
+feat(orchestrator): implement phase retry with backoff
+
+feat(governance): add gate approval webhook notifications
+
+feat(skill-engine): support async skill execution via job queue
+
+feat(cli): add `ck run status` subcommand
+
+feat(adapters): add Slack adapter for notification delivery
+
+feat(healing): add automatic rollback on consecutive phase failures
+
+fix(orchestrator): handle null steps array in phase-engine
+
+fix(auth): resolve infinite loop in session refresh under high load
+
+fix(governance): stop expired policy from blocking gate approval
+
+fix(adapters): validate Slack webhook URL before dispatch
+
+fix(realtime): close leaked WebSocket connection on client disconnect
+
+fix(cli): handle empty run list in `ck run list`
+
+security(auth): remove hardcoded service account secret from config
+
+security(policy): restrict policy evaluation to org-scoped rules only
+
+security(audit): ensure all run create/cancel events emit orgId
+
+docs(orchestrator): add SPEC_ORCHESTRATOR.md phase lifecycle section
+
+docs(governance): document gate approval and rejection flow
+
+docs(auth): clarify session resolution and RBAC check order
+
+chore(deps): update vitest from 1.3.0 to 1.5.0
+
+chore(deps): replace deprecated @types/node with current version
+
+chore(shared): remove unused utility functions from string-utils
+
+test(auth): add cross-tenant session isolation tests
+
+test(orchestrator): cover phase-engine null step edge cases
+
+test(governance): add gate approval rejection integration test
+
+test(auth): verify workspace-scoped permission inheritance
+
+refactor(governance): extract gate-controller from gate-manager module
+
+refactor(orchestrator): split phase-engine into scheduler and executor
+
+perf(db): add composite index on runs (org_id, status, created_at)
+
+perf(observability): batch metric writes to reduce DB round-trips
+
+ci: add integration test job to ci.yml workflow
+
+ci: cache pnpm store in GitHub Actions to reduce install time
+
+build: add vitest.integration.config.ts with 30s timeout
+
+build: configure path aliases in tsconfig for packages/shared
+
+revert: feat(auth): add session refresh endpoint
+```
+
+**Multi-line commit with body and footer:**
+
+```
+fix(healing): prevent double-rollback when recovery phase also fails
+
+When a run phase failed and triggered the healing rollback, if the
+rollback phase itself encountered an error the healing subsystem would
+attempt a second rollback, causing duplicate audit events and
+inconsistent run state in the database.
+
+Added a guard in HealingCoordinator.attemptRecovery() that checks
+run.recoveryAttempted before invoking the rollback path. The state is
+set atomically before the rollback begins.
+
+Fixes #298
+Co-authored-by: Alex Chen <alex@example.com>
+```
+
+**Revert with context:**
+
+```
+revert: feat(adapters): add async Slack adapter
+
+Reverting due to unresolved rate-limit errors under load. Will
+reintroduce after the queue-based dispatch layer is in place.
+
+This reverts commit a3f9c12.
+```
+
+---
+
+## 9. Anti-Patterns to Avoid
+
+These commit messages will be rejected by the `lint:commits` script or blocked in code review:
+
+| Bad message | Problem |
+|---|---|
+| `fix: fixed stuff` | Too vague; subject does not describe what was fixed |
+| `feat: Add Redis session revocation` | Capital letter after colon |
+| `WIP: auth changes` | Not a valid conventional commit type |
+| `feat(auth,orchestrator): big refactor` | Multiple scopes not allowed; split into two commits |
+| `feat(auth): add Redis-backed session revocation and also fix the connection pool leak and update tests` | Subject exceeds 72 characters; covers multiple concerns |
+| `update auth` | No type, no scope, no imperative subject |
+| `feat(AUTH): add session revocation` | Scope must be lowercase |
+| `fix(auth): Fix session resolution.` | Capital first letter and trailing period |
+| `feat: ` | Empty subject |
+
+---
+
+## 10. Linting Integration
+
+The `lint:commits` script (`tsx scripts/release/lint-commits.ts`) enforces this convention
+automatically on CI.
+
+**How it works:**
+1. The `lint-commits.yml` workflow triggers on every pull request
+2. It runs `pnpm lint:commits` against all commits on the PR branch
+3. Any commit that violates the format causes the workflow to fail
+4. A failing lint-commits check blocks the PR from being merged
+
+**Running locally:**
+
+```bash
+pnpm lint:commits
+```
+
+This validates commits on the current branch. Run it before pushing to catch issues early.
+
+---
+
+## 11. Changelog Generation
+
+`pnpm release:prepare` runs `npm run release:notes` and then `npm run changelog:update`. The
+changelog step reads all conventional commits since the last release tag and generates
+structured `CHANGELOG.md` entries.
+
+Entries are grouped by type in this order:
+
+1. `feat` — New Features
+2. `fix` — Bug Fixes
+3. `security` — Security Fixes
+4. `perf` — Performance Improvements
+5. `refactor` — Refactoring
+6. `docs` — Documentation
+7. `chore`, `ci`, `build`, `test` — Maintenance
+
+Commits with `BREAKING CHANGE` in the footer appear in a dedicated section at the top of
+the release entry.
+
+Commits with type `chore`, `ci`, `build`, or `test` are excluded from the public-facing
+release notes but are included in the internal `CHANGELOG.md`.
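+
+The grouping rules above can be sketched as a small rendering function. This is an illustration only: `renderEntry`, the `ChangelogCommit` shape, and the `###` heading format are assumptions, not the behavior of `scripts/release/update-changelog.ts`. The sketch also assumes a breaking commit is listed both in the breaking section and under its own type, which the text above does not specify.
+
+```typescript
+// Hypothetical commit shape for illustration.
+interface ChangelogCommit {
+  type: string;
+  subject: string;
+  breaking: boolean;
+}
+
+// Section order from the list above; Maintenance is internal-only.
+const SECTIONS: Array<{ title: string; types: string[]; publicNotes: boolean }> = [
+  { title: "New Features", types: ["feat"], publicNotes: true },
+  { title: "Bug Fixes", types: ["fix"], publicNotes: true },
+  { title: "Security Fixes", types: ["security"], publicNotes: true },
+  { title: "Performance Improvements", types: ["perf"], publicNotes: true },
+  { title: "Refactoring", types: ["refactor"], publicNotes: true },
+  { title: "Documentation", types: ["docs"], publicNotes: true },
+  { title: "Maintenance", types: ["chore", "ci", "build", "test"], publicNotes: false },
+];
+
+function renderEntry(commits: ChangelogCommit[], audience: "internal" | "public"): string {
+  const lines: string[] = [];
+
+  // Breaking changes always come first.
+  const breaking = commits.filter((c) => c.breaking);
+  if (breaking.length > 0) {
+    lines.push("### Breaking Changes", ...breaking.map((c) => `- ${c.subject}`));
+  }
+
+  for (const section of SECTIONS) {
+    if (audience === "public" && !section.publicNotes) {
+      continue; // maintenance commits are left out of public release notes
+    }
+    const matching = commits.filter((c) => section.types.includes(c.type));
+    if (matching.length === 0) {
+      continue; // skip empty sections
+    }
+    lines.push(`### ${section.title}`, ...matching.map((c) => `- ${c.subject}`));
+  }
+  return lines.join("\n");
+}
+```
+
+Calling `renderEntry(commits, "public")` and `renderEntry(commits, "internal")` on the same input differs only in whether the Maintenance section is present.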
+
+---
+
+## 12. PR Title Convention
+
+PR titles must follow conventional commit format because the PR title becomes the
+squash-merge commit message when PRs are merged to `main`.
+
+**Required format:**
+
+```
+type(scope): subject
+```
+
+**Examples:**
+
+```
+feat(auth): add Redis-backed session revocation
+fix(orchestrator): handle null step array in phase-engine
+chore(deps): update vitest to 1.5.0
+```
+
+The `lint-commits.yml` workflow validates the PR title against the conventional commit regex
+whenever a PR is opened, edited, synchronized, or reopened. A malformed PR title blocks the merge.
+
+If the PR covers multiple scopes, choose the primary scope and reference the others in the
+PR description body.
diff --git a/docs/07_cicd/GITHUB_ACTIONS_PIPELINE.md b/docs/07_cicd/GITHUB_ACTIONS_PIPELINE.md
new file mode 100644
index 0000000..c16d5d3
--- /dev/null
+++ b/docs/07_cicd/GITHUB_ACTIONS_PIPELINE.md
@@ -0,0 +1,630 @@
+# GitHub Actions Pipeline Guide
+
+> Code-Kit-Ultra v1.2.0 — CI/CD reference for all GitHub Actions workflows.
+
+---
+
+## Overview
+
+### Pipeline Strategy
+
+Code-Kit-Ultra uses **trunk-based development** with short-lived feature branches. The pipeline enforces quality gates at every stage of the development lifecycle.
+
+```
+feature/* ──────► main ──────► release/vX.Y ──────► tag vX.Y.Z ──────► public
+                    │                                       │
+                 ci.yml                            version-bump-release.yml
+               pr-gate.yml                          release-control-center.yml
+             lint-commits.yml                           public-release.yml
+```
+
+**Branch conventions:**
+
+| Branch pattern | Purpose |
+|---|---|
+| `main` | Trunk; always deployable |
+| `feature/<slug>` | Short-lived feature work |
+| `release/<version>` | Stabilization branch for a version |
+| `hotfix/<slug>` | Emergency patch off main |
+| `chore/<slug>` | Tooling, dependencies, docs |
+
+**Key principles:**
+- Every push and PR to `main`, `develop`, or `release/**` runs the full verification suite.
+- PRs require branch name validation (`feature/`, `release/`, `hotfix/`, `chore/` prefixes).
+- PR titles must follow conventional commit format (enforced by `lint-commits.yml`).
+- Releases are always gated by the Release Control Center (RCC) governance step.
+- Tags matching `v*` trigger the public release publication workflow.
+
+---
+
+## Current Workflow Inventory
+
+### `ci.yml` — Core CI
+
+**Trigger:** Push and PR to `main`, `develop`, `release/**`
+
+**Purpose:** The primary health check for every code change. Ensures the monorepo can typecheck, tests pass, and the build does not regress.
+
+**Steps:**
+1. Checkout repository
+2. Set up Node.js 20
+3. Set up pnpm 10
+4. Cache the pnpm store (`pnpm-lock.yaml` hash key)
+5. `pnpm install --frozen-lockfile`
+6. `pnpm run typecheck` — TypeScript strict-mode check across all packages
+7. `pnpm run test:auth` — auth package unit tests
+8. Build check via `scripts/build-public-release.sh` (if present)
+
+**Runtime:** ~3–5 minutes on `ubuntu-latest`.
+
+---
+
+### `pr-gate.yml` — PR Quality Gate
+
+**Trigger:** PR events: `opened`, `synchronize`, `reopened`
+
+**Purpose:** Validates branch naming conventions and runs a fast typecheck gate before the full CI suite.
+
+**Steps:**
+1. **Branch name check** — regex validates `feature/`, `release/`, `hotfix/`, or `chore/` prefix; fails immediately if invalid
+2. Checkout, Node 20, pnpm 10 install
+3. `pnpm run typecheck`
+4. Repo health summary (prints branch, Node version, TypeScript version to logs)
+
+**Note:** This job is a required status check. PRs cannot merge if the branch name fails validation.
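+
+The exact regex used by the branch name check is not reproduced in this guide. As a rough sketch, assuming the check only requires one of the four documented prefixes followed by a non-empty slug, the logic could look like this in TypeScript (`BRANCH_PATTERN` and `isValidBranchName` are hypothetical names):
+
+```typescript
+// Assumed pattern: a documented prefix, a slash, then at least one character.
+const BRANCH_PATTERN = /^(feature|release|hotfix|chore)\/.+$/;
+
+// Returns true when the branch name would pass the assumed check.
+function isValidBranchName(branch: string): boolean {
+  return BRANCH_PATTERN.test(branch);
+}
+
+// Builds the kind of message a gate step might print before failing.
+function describeBranch(branch: string): string {
+  return isValidBranchName(branch)
+    ? `Branch "${branch}" follows the naming convention.`
+    : `Branch "${branch}" must start with feature/, release/, hotfix/, or chore/.`;
+}
+```
+
+Under this assumption `feature/session-revocation` and `release/v1.2` pass, while `bugfix/foo` and a bare `feature/` fail.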
+
+---
+
+### `lint-commits.yml` — Lint Commit Messages
+
+**Trigger:** Push to `main`; PR events: `opened`, `edited`, `synchronize`, `reopened`
+
+**Purpose:** Enforces conventional commit format on every commit in a PR and validates the PR title.
+
+**Steps:**
+1. Checkout with full history (`fetch-depth: 0`)
+2. Node 20, pnpm 10 install
+3. On push: `tsx scripts/release/lint-commits.ts "<before>..<after>"`
+4. On PR: `tsx scripts/release/lint-commits.ts "origin/<base>..HEAD"`
+5. On PR: `tsx scripts/release/lint-pr-title.ts "<pr-title>"`
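+
+The difference between the push and PR paths in the steps above comes down to which commit range is linted and whether a title check runs. A minimal TypeScript sketch of that decision follows; the `LintEvent` and `LintPlan` shapes and the `planLint` helper are hypothetical, and the real workflow reads these values from the GitHub Actions event context.
+
+```typescript
+// Hypothetical event shapes for illustration.
+type LintEvent =
+  | { kind: "push"; before: string; after: string }
+  | { kind: "pull_request"; baseRef: string; title: string };
+
+interface LintPlan {
+  commitRange: string;    // argument passed to lint-commits.ts
+  prTitle: string | null; // argument passed to lint-pr-title.ts, PRs only
+}
+
+function planLint(event: LintEvent): LintPlan {
+  if (event.kind === "push") {
+    // Lint exactly the commits introduced by this push.
+    return { commitRange: `${event.before}..${event.after}`, prTitle: null };
+  }
+  // Lint every commit on the PR branch that is not yet on the base branch.
+  return { commitRange: `origin/${event.baseRef}..HEAD`, prTitle: event.title };
+}
+```
+
+The `origin/<base>..HEAD` range is also the reason the checkout step needs full history: with a shallow clone the base commit may not be available locally.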
+
+**See also:** `docs/07_cicd/CONVENTIONAL_COMMITS.md`
+
+---
+
+### `release.yml` — Release Milestone
+
+**Trigger:** Manual (`workflow_dispatch`) with inputs: `version` (e.g., `v1.2.0`), `milestone_title`
+
+**Purpose:** Runs verification gates then creates a named GitHub Release with a milestone description.
+
+**Steps:**
+1. Verification job: checkout → Node 20 → pnpm install → typecheck → `test:auth`
+2. (After verify) Create GitHub Release via `actions/create-release@v1` with the tag and milestone title
+
+**Inputs:**
+
+| Input | Required | Default |
+|---|---|---|
+| `version` | Yes | `v1.2.0` |
+| `milestone_title` | Yes | `Phase 10 Production Integration` |
+
+---
+
+### `version-bump-release.yml` — Manual Release Preparation
+
+**Trigger:** Manual (`workflow_dispatch`) with inputs: `bump_type` (major/minor/patch or explicit version), `release_summary`
+
+**Purpose:** Full automated release preparation — bumps version, generates release notes, updates changelog, runs governance gate, commits metadata, creates tag, and publishes a GitHub Release.
+
+**Steps:**
+1. Checkout with full history, Node 20, pnpm install
+2. `pnpm run typecheck`
+3. Configure git identity for the commit
+4. `pnpm run version:bump -- <bump_type>`
+5. `pnpm run release:notes`
+6. `pnpm run changelog:update`
+7. `pnpm run release:control-center` (RCC governance)
+8. Parse RCC JSON manifest — exits 1 on `NO-GO`
+9. `git commit` and `git push` release metadata files
+10. Create and push annotated git tag
+11. `gh release create` with generated release notes
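+
+Step 4 accepts either a bump keyword or an explicit version. The TypeScript sketch below shows one way such an input could be resolved. It is an assumption about the general technique, not a description of `scripts/release/bump-version.ts`; `nextVersion` is a hypothetical helper and pre-release suffixes are deliberately not handled.
+
+```typescript
+// Parses "1.2.3" or "v1.2.3" into numeric parts, or returns null.
+function parseVersion(value: string): [number, number, number] | null {
+  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(value);
+  return match ? [Number(match[1]), Number(match[2]), Number(match[3])] : null;
+}
+
+// Resolves bump_type (major | minor | patch | explicit version) against the current version.
+function nextVersion(current: string, bumpType: string): string {
+  const parsed = parseVersion(current);
+  if (!parsed) {
+    throw new Error(`not a semantic version: ${current}`);
+  }
+  const [major, minor, patch] = parsed;
+
+  switch (bumpType) {
+    case "major":
+      return `${major + 1}.0.0`;
+    case "minor":
+      return `${major}.${minor + 1}.0`;
+    case "patch":
+      return `${major}.${minor}.${patch + 1}`;
+    default: {
+      // Anything else must be an explicit version such as "2.0.0".
+      const explicit = parseVersion(bumpType);
+      if (!explicit) {
+        throw new Error(`invalid bump_type: ${bumpType}`);
+      }
+      return explicit.join(".");
+    }
+  }
+}
+```
+
+For example, `nextVersion("1.2.0", "minor")` returns `1.3.0`, and `nextVersion("1.2.0", "2.0.0")` returns `2.0.0` unchanged.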
+
+---
+
+### `release-control-center.yml` — Release Control Center
+
+**Trigger:** Manual (`workflow_dispatch`) with inputs: `milestone_name`, `release_summary`, `verification_status`, `blockers`, `risks`
+
+**Purpose:** Runs the governance layer (RCC) independently. Generates a release control report, uploads it as an artifact, and fails the workflow if the decision is `NO-GO`.
+
+**Steps:**
+1. Checkout, Node 20, pnpm install
+2. `pnpm run typecheck`
+3. `pnpm run release:control-center` (with env vars from inputs)
+4. Upload `releases/control-center/` as workflow artifact
+5. Print the latest report to the GitHub Actions step summary
+6. Grep for `NO-GO` in the report; exit 1 if found
+
+---
+
+### `public-release.yml` — Public Release
+
+**Trigger:** Push of tags matching `v*`
+
+**Purpose:** Runs the public release build (stripped binaries, checksums) and publishes release artifacts.
+
+**Steps:**
+1. Checkout, Node 20, pnpm install
+2. `pnpm run build:public-release`
+3. `pnpm run checksums`
+
+---
+
+## Proposed Additional Workflows
+
+### 1. `integration-tests.yml` — Integration Test Suite
+
+Runs the full integration test suite on every merge to `main`. Requires a live test database and Redis instance, provisioned via GitHub Actions services.
+
+```yaml
+name: Integration Tests
+
+on:
+  push:
+    branches:
+      - main
+
+jobs:
+  integration:
+    name: Run Integration Tests
+    runs-on: ubuntu-latest
+
+    services:
+      postgres:
+        image: postgres:16
+        env:
+          POSTGRES_USER: cktest
+          POSTGRES_PASSWORD: cktest
+          POSTGRES_DB: cku_integration
+        ports:
+          - 5432:5432
+        options: >-
+          --health-cmd pg_isready
+          --health-interval 10s
+          --health-timeout 5s
+          --health-retries 5
+
+      redis:
+        image: redis:7-alpine
+        ports:
+          - 6379:6379
+        options: >-
+          --health-cmd "redis-cli ping"
+          --health-interval 10s
+          --health-timeout 5s
+          --health-retries 5
+
+    env:
+      DATABASE_URL: postgresql://cktest:cktest@localhost:5432/cku_integration
+      REDIS_URL: redis://localhost:6379
+      NODE_ENV: test
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+
+      - name: Setup pnpm
+        uses: pnpm/action-setup@v4
+        with:
+          version: 10
+          run_install: false
+
+      - name: Get pnpm store directory
+        run: echo "STORE_PATH=$(pnpm store path --silent)" >> $GITHUB_ENV
+
+      - name: Cache pnpm store
+        uses: actions/cache@v4
+        with:
+          path: ${{ env.STORE_PATH }}
+          key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-store-
+
+      - name: Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Run database migrations
+        run: pnpm run db:migrate
+
+      - name: Run integration tests
+        run: pnpm run test:integration
+
+      - name: Upload test results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: integration-test-results
+          path: test-results/
+          retention-days: 7
+```
+
+---
+
+### 2. `security-scan.yml` — Security Scan (Scheduled)
+
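+Runs an OWASP ZAP baseline scan and `pnpm audit` daily. The ZAP report is uploaded as an artifact. As written, the `pnpm audit` step uses `continue-on-error: true`, so audit findings are logged but do not fail the job. Surfacing findings as GitHub Security alerts would additionally require a SARIF upload step, which this proposal does not yet include.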
+
+```yaml
+name: Security Scan
+
+on:
+  schedule:
+    - cron: '0 3 * * *'   # Daily at 03:00 UTC
+  workflow_dispatch:        # Allow manual trigger
+
+jobs:
+  dependency-audit:
+    name: Dependency Audit
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+
+      - name: Setup pnpm
+        uses: pnpm/action-setup@v4
+        with:
+          version: 10
+          run_install: false
+
+      - name: Get pnpm store directory
+        run: echo "STORE_PATH=$(pnpm store path --silent)" >> $GITHUB_ENV
+
+      - name: Cache pnpm store
+        uses: actions/cache@v4
+        with:
+          path: ${{ env.STORE_PATH }}
+          key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-store-
+
+      - name: Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Run pnpm audit
+        run: pnpm audit --audit-level=moderate
+        continue-on-error: true
+
+      - name: Run security-focused tests
+        run: pnpm run test:security
+        env:
+          INSFORGE_JWKS_URL: ${{ secrets.INSFORGE_JWKS_URL }}
+          NODE_ENV: test
+
+  owasp-zap-scan:
+    name: OWASP ZAP Baseline Scan
+    runs-on: ubuntu-latest
+    needs: dependency-audit
+
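+    # NOTE: a bare node:20-alpine service container does not start the API.
+    # Replace the image with one that serves the control service on port 4000,
+    # or start the app in a step before the scan runs.
+    services: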
+      app:
+        image: node:20-alpine
+        env:
+          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}
+          REDIS_URL: ${{ secrets.TEST_REDIS_URL }}
+          NODE_ENV: test
+        ports:
+          - 4000:4000
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Run OWASP ZAP Baseline Scan
+        uses: zaproxy/action-baseline@v0.12.0
+        with:
+          target: 'http://localhost:4000'
+          rules_file_name: '.zap/rules.tsv'
+          cmd_options: '-a'
+
+      - name: Upload ZAP Report
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: zap-security-report
+          path: report_html.html
+          retention-days: 30
+```
+
+---
+
+### 3. `coverage-report.yml` — Coverage Report
+
+Generates a Vitest coverage report, uploads it as an artifact, publishes it to GitHub Pages on pushes to `main`, and posts a coverage summary comment on pull requests.
+
+```yaml
+name: Coverage Report
+
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+
+jobs:
+  coverage:
+    name: Generate Coverage Report
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+
+      - name: Setup pnpm
+        uses: pnpm/action-setup@v4
+        with:
+          version: 10
+          run_install: false
+
+      - name: Get pnpm store directory
+        run: echo "STORE_PATH=$(pnpm store path --silent)" >> $GITHUB_ENV
+
+      - name: Cache pnpm store
+        uses: actions/cache@v4
+        with:
+          path: ${{ env.STORE_PATH }}
+          key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-store-
+
+      - name: Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Run coverage
+        run: pnpm run test:coverage
+        env:
+          NODE_ENV: test
+
+      - name: Upload coverage artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: coverage-report
+          path: coverage/
+          retention-days: 14
+
+      - name: Publish coverage to GitHub Pages
+        if: github.ref == 'refs/heads/main'
+        uses: peaceiris/actions-gh-pages@v4
+        with:
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          publish_dir: ./coverage
+          destination_dir: coverage
+
+      - name: Comment coverage summary on PR
+        if: github.event_name == 'pull_request'
+        uses: davelosert/vitest-coverage-report-action@v2
+        with:
+          github-token: ${{ secrets.GITHUB_TOKEN }}
+          json-summary-path: coverage/coverage-summary.json
+```
+
+---
+
+### 4. `smoke-test-staging.yml` — Smoke Test (Staging)
+
+Runs the smoke test pack against the staging environment after a deploy. Triggered manually or by a downstream deploy workflow.
+
+```yaml
+name: Smoke Test (Staging)
+
+on:
+  workflow_dispatch:
+    inputs:
+      staging_url:
+        description: 'Staging API base URL'
+        required: true
+        default: 'https://staging.code-kit-ultra.internal'
+  workflow_call:
+    inputs:
+      staging_url:
+        type: string
+        required: true
+
+jobs:
+  smoke:
+    name: Run Smoke Tests Against Staging
+    runs-on: ubuntu-latest
+    timeout-minutes: 15
+
+    env:
+      STAGING_URL: ${{ inputs.staging_url }}
+      INSFORGE_JWKS_URL: ${{ secrets.INSFORGE_JWKS_URL }}
+      TEST_TENANT_ID: ${{ secrets.TEST_TENANT_ID }}
+      TEST_SERVICE_TOKEN: ${{ secrets.TEST_SERVICE_TOKEN }}
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+
+      - name: Setup pnpm
+        uses: pnpm/action-setup@v4
+        with:
+          version: 10
+          run_install: false
+
+      - name: Get pnpm store directory
+        run: echo "STORE_PATH=$(pnpm store path --silent)" >> $GITHUB_ENV
+
+      - name: Cache pnpm store
+        uses: actions/cache@v4
+        with:
+          path: ${{ env.STORE_PATH }}
+          key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-store-
+
+      - name: Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Wait for staging to be healthy
+        run: |
+          for i in $(seq 1 10); do
+            STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$STAGING_URL/health" || true)
+            if [ "$STATUS" = "200" ]; then
+              echo "Staging is healthy."
+              exit 0
+            fi
+            echo "Attempt $i: staging returned $STATUS, waiting 15s..."
+            sleep 15
+          done
+          echo "Staging did not become healthy in time."
+          exit 1
+
+      - name: Run smoke tests
+        run: pnpm run test:smoke
+
+      - name: Upload smoke test results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: smoke-test-results
+          path: test-results/smoke/
+          retention-days: 3
+```
+
+---
+
+## Required GitHub Secrets
+
+Configure these secrets in **Settings → Secrets and variables → Actions**:
+
+| Secret | Used By | Description |
+|---|---|---|
+| `GITHUB_TOKEN` | All | Auto-provisioned by GitHub; write access for releases and pages |
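+| `DATABASE_URL` | `integration-tests.yml` | PostgreSQL connection string for integration tests; the proposed workflow sets it inline for the service container, so a secret is only needed when targeting an external database |
+| `REDIS_URL` | `integration-tests.yml` | Redis connection string for session/cache tests; set inline in the proposed workflow, secret only needed for an external instance |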
+| `TEST_DATABASE_URL` | `security-scan.yml` | Read-only test DB for ZAP scan target |
+| `TEST_REDIS_URL` | `security-scan.yml` | Redis URL for ZAP scan target app |
+| `INSFORGE_JWKS_URL` | `security-scan.yml`, `smoke-test-staging.yml` | JWKS endpoint for JWT verification in tests |
+| `TEST_TENANT_ID` | `smoke-test-staging.yml` | Tenant ID for smoke test auth context |
+| `TEST_SERVICE_TOKEN` | `smoke-test-staging.yml` | Service account token for smoke test API calls |
+
+> **Never commit secrets to source code.** Use the `security` commit type (for example, `security(auth): remove hardcoded service account secret from config`) to flag changes that remove secrets from code and move them to environment configuration.
+
+---
+
+## Branch Protection Rules
+
+Configure on **Settings → Branches → main**:
+
+| Rule | Setting |
+|---|---|
+| Require a pull request before merging | Enabled |
+| Required approvals | 1 (recommended: 2 for security changes) |
+| Dismiss stale pull request approvals when new commits are pushed | Enabled |
+| Require status checks to pass before merging | Enabled |
+| Required status checks | `Verify Base` (ci.yml), `Verified Run` (pr-gate.yml), `Lint Commit Messages` (lint-commits.yml) |
+| Require branches to be up to date before merging | Enabled |
+| Do not allow bypassing the above settings | Enabled |
+| Restrict who can push to matching branches | Limit to release managers for direct pushes |
+
+---
+
+## Cache Strategy
+
+All workflows share a consistent pnpm store cache pattern keyed on `pnpm-lock.yaml`:
+
+```yaml
+- name: Get pnpm store directory
+  run: echo "STORE_PATH=$(pnpm store path --silent)" >> $GITHUB_ENV
+
+- name: Cache pnpm store
+  uses: actions/cache@v4
+  with:
+    path: ${{ env.STORE_PATH }}
+    key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
+    restore-keys: |
+      ${{ runner.os }}-pnpm-store-
+```
+
+**How it works:**
+- The primary cache key is an exact match on the lockfile hash. A lockfile change invalidates the cache and forces a full install.
+- The restore key `${{ runner.os }}-pnpm-store-` allows partial cache hits from the most recent run with a different lockfile.
+- Typical cache hit time: < 30 seconds. Cold install: 2–3 minutes.
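+
+The exact-then-prefix lookup described above can be modeled in a few lines. The TypeScript sketch below is a simplified simulation for illustration, not the `actions/cache` implementation: `CacheEntry` and `resolveCache` are hypothetical names, and branch scoping and eviction are ignored.
+
+```typescript
+// Hypothetical stored cache entry.
+interface CacheEntry {
+  key: string;
+  createdAt: number; // larger means newer
+}
+
+interface CacheResult {
+  hit: "exact" | "partial" | "miss";
+  entry: CacheEntry | null;
+}
+
+function resolveCache(entries: CacheEntry[], key: string, restoreKeys: string[]): CacheResult {
+  // 1. An exact key match means the lockfile hash is unchanged.
+  const exact = entries.find((e) => e.key === key);
+  if (exact) {
+    return { hit: "exact", entry: exact };
+  }
+  // 2. Otherwise try each restore key as a prefix and take the newest match.
+  for (const prefix of restoreKeys) {
+    const candidates = entries
+      .filter((e) => e.key.startsWith(prefix))
+      .sort((a, b) => b.createdAt - a.createdAt);
+    const newest = candidates[0];
+    if (newest) {
+      return { hit: "partial", entry: newest };
+    }
+  }
+  // 3. Nothing usable: a cold install follows.
+  return { hit: "miss", entry: null };
+}
+```
+
+A partial hit restores a store built for an older lockfile, so `pnpm install` only has to fetch the packages that changed.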
+
+---
+
+## Matrix Strategy
+
+To validate against multiple Node.js versions, extend the CI job with a matrix:
+
+```yaml
+jobs:
+  verify:
+    name: Verify (Node ${{ matrix.node-version }})
+    runs-on: ubuntu-latest
+
+    strategy:
+      matrix:
+        node-version: ['18', '20']
+      fail-fast: false   # Run all matrix variants even if one fails
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Node.js ${{ matrix.node-version }}
+        uses: actions/setup-node@v4
+        with:
+          node-version: ${{ matrix.node-version }}
+
+      - name: Setup pnpm
+        uses: pnpm/action-setup@v4
+        with:
+          version: 10
+          run_install: false
+
+      - name: Get pnpm store directory
+        run: echo "STORE_PATH=$(pnpm store path --silent)" >> $GITHUB_ENV
+
+      - name: Cache pnpm store
+        uses: actions/cache@v4
+        with:
+          path: ${{ env.STORE_PATH }}
+          key: ${{ runner.os }}-node${{ matrix.node-version }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
+          restore-keys: |
+            ${{ runner.os }}-node${{ matrix.node-version }}-pnpm-store-
+
+      - name: Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Typecheck
+        run: pnpm run typecheck
+
+      - name: Test
+        run: pnpm run test:auth
+```
+
+**Note:** Node 18 reached end-of-life on 30 April 2025. The matrix is provided as a cross-version validation pattern; the canonical target runtime is Node 20.
diff --git a/docs/07_cicd/MONOREPO_SCRIPTS.md b/docs/07_cicd/MONOREPO_SCRIPTS.md
new file mode 100644
index 0000000..6b9f785
--- /dev/null
+++ b/docs/07_cicd/MONOREPO_SCRIPTS.md
@@ -0,0 +1,314 @@
+# Monorepo Scripts Reference
+
+**Project:** Code-Kit-Ultra v1.2.0
+**Package manager:** pnpm workspaces
+**Runtime:** Node 20, tsx, TypeScript strict
+
+---
+
+## 1. Philosophy
+
+Scripts in this monorepo follow a three-layer model:
+
+**Root (`package.json`)** — orchestration only. Root scripts coordinate across packages, run
+the full test suite, trigger releases, or start the full development stack. They do not contain
+package-specific logic. Naming convention: `namespace:name` for namespaced commands (`test:auth`,
+`db:migrate`) and plain verbs for top-level commands (`build`, `typecheck`, `dev`).
+
+**App-level (`apps/*/package.json`)** — app-specific concerns. Each app owns its own `dev`,
+`build`, and `start` scripts. Root scripts may delegate to app scripts via `pnpm --filter`,
+`npm run <script> -w <workspace>` (as `dev:web` does), or `concurrently`.
+
+**Package-level (`packages/*/package.json`)** — library concerns. Every package exposes a
+standard set of scripts (`build`, `typecheck`, `test`) so that workspace-wide commands like
+`pnpm -r build` work uniformly. Packages must not reference each other's scripts directly.
+
+**Naming conventions:**
+- Namespaced scripts use a colon separator: `test:auth`, `db:seed`, `dev:api`
+- Release-related scripts are prefixed with `release:` or `version:`
+- Database scripts are prefixed with `db:`
+- Docker scripts are prefixed with `docker:`
+- Development shortcuts use `dev` as the root or first segment
+
+---
+
+## 2. Current Scripts Inventory
+
+| Script | Command | Purpose | When to use |
+|---|---|---|---|
+| `ck` | `tsx apps/cli/src/index.ts` | Run the CLI entry point via tsx | Local CLI development and manual testing |
+| `preflight` | `pnpm run typecheck && pnpm run test:auth` | Full type check + auth test suite | Before opening a PR; run by CI on push |
+| `build` | `pnpm -r build` | Build all packages and apps recursively | Before release; after pulling changes that affect build output |
+| `typecheck` | `tsc --noEmit` | TypeScript compiler check across entire workspace | Daily development; always run before commit |
+| `test:phase10_5` | `tsx examples/healing-test.ts` | Ad-hoc healing subsystem test | Manual validation of healing package behavior |
+| `test:auth` | `vitest run packages/auth` | Run auth package test suite | Auth changes; included in `preflight` |
+| `test:session` | `vitest run packages/auth/src/resolve-session.ts` | Run session resolution tests only | Targeted debugging of session logic |
+| `test:rbac` | `vitest run packages/auth/src/rbac.ts` | Run RBAC tests only | Targeted debugging of permission logic |
+| `db:migrate` | `echo 'Migration logic placeholder'` | Placeholder; prints a message and does not run migrations yet | Once implemented: before first run; after pulling schema changes |
+| `db:seed` | `echo 'Seed logic placeholder'` | Placeholder; prints a message and does not seed data yet | Once implemented: after a fresh migration; resetting local state |
+| `dev:web` | `npm run dev -w apps/web-control-plane` | Start the Vite dev server for the web control plane | Frontend development |
+| `version:bump` | `tsx scripts/release/bump-version.ts` | Bump version in VERSION file and package.json | Part of release flow; do not run manually |
+| `release:notes` | `tsx scripts/release/generate-release-notes.ts` | Generate release notes from conventional commits | Automated; called by `release:prepare` |
+| `changelog:update` | `tsx scripts/release/update-changelog.ts` | Append new entries to CHANGELOG.md | Automated; called by `release:prepare` |
+| `lint:commits` | `tsx scripts/release/lint-commits.ts` | Validate commit messages against conventional commits spec | Run by `lint-commits.yml` on every PR |
+| `release:prepare` | `npm run release:notes && npm run changelog:update` | Generate release notes and update changelog in sequence | Step 2 of the release workflow |
+| `package:release` | `scripts/package-release.sh` | Package build artifacts for distribution | Run by `release.yml` workflow |
+| `release:control-center` | `tsx scripts/release/control-center.ts` | Interactive release control panel | Run by `release-control-center.yml` |
+| `release:control` | `npm run release:control-center` | Alias for release control center | Convenience alias |
+| `release:control:milestone` | `MILESTONE_NAME='Phase 10 Milestone' tsx ...` | Release control center scoped to a milestone | Milestone-specific releases |
+| `release:validate` | `tsx scripts/release/validate-release.ts` | Validate release artifacts and metadata | Final check before publishing |
+| `cku` | `npx tsx ./codekit/apps/cli/src/index.ts` | Run the codekit variant of the CLI | Codekit-specific CLI testing |
+
+---
+
+## 3. Missing Scripts to Add
+
+The following scripts are not yet present in `package.json` and should be added. Each fills
+a gap in the development or CI workflow.
+
+**Testing gaps:**
+- `test:unit` — runs all package-level unit tests in one command, required for CI
+- `test:integration` — runs integration tests with a dedicated config; separate from unit
+  tests to allow different timeouts and setup
+- `test:smoke` — fast sanity-check tests for post-deploy validation
+- `test:coverage` — generates a coverage report across all packages; run before release
+- `test:all` — convenience script combining unit and integration; used in local preflight
+- `test:security` — runs security-focused test suite; should block release if failing
+
+**Development gaps:**
+- `dev` — single command to start all local services concurrently using `concurrently`
+- `dev:api` — start the control-service API with hot reload via `tsx watch`
+
+**Code quality gaps:**
+- `lint` — ESLint across packages and apps; currently absent from CI
+- `format` — Prettier formatting; must be run before commit
+
+**Database gaps:**
+- `db:reset` — drops and recreates the local database; destructive, local only
+
+**Docker gaps:**
+- `docker:build` — build the production Docker image
+- `docker:run` — run the container locally with environment file
+
+Add the following to the `"scripts"` block in the root `package.json`:
+
+```json
+{
+  "test:unit": "vitest run packages/",
+  "test:integration": "vitest run --config vitest.integration.config.ts",
+  "test:smoke": "vitest run tests/smoke/",
+  "test:coverage": "vitest run --coverage packages/",
+  "test:all": "pnpm run test:unit && pnpm run test:integration",
+  "test:security": "vitest run tests/security/",
+  "dev": "concurrently \"pnpm run dev:api\" \"pnpm run dev:web\"",
+  "dev:api": "tsx watch apps/control-service/src/index.ts",
+  "dev:web": "npm run dev -w apps/web-control-plane",
+  "lint": "eslint packages/ apps/ --ext .ts,.tsx",
+  "format": "prettier --write packages/ apps/",
+  "db:migrate": "tsx scripts/db/migrate.ts",
+  "db:seed": "tsx scripts/db/seed.ts",
+  "db:reset": "tsx scripts/db/reset.ts",
+  "docker:build": "docker build -t code-kit-ultra .",
+  "docker:run": "docker run -p 4000:4000 --env-file .env code-kit-ultra"
+}
+```
+
+> Note: `dev:web` already exists; `db:migrate` and `db:seed` exist with placeholder commands
+> and should be replaced with the real `tsx` invocations above.
+
+---
+
+## 4. Per-Package Scripts Standard
+
+Every package under `packages/` must expose the following scripts in its own `package.json`.
+This enables `pnpm -r` commands to work uniformly across the workspace.
+
+```json
+{
+  "scripts": {
+    "build": "tsc",
+    "typecheck": "tsc --noEmit",
+    "test": "vitest run src/"
+  }
+}
+```
+
+**Packages that must conform to this standard:**
+
+`adapters`, `agents`, `audit`, `auth`, `command-engine`, `core`, `events`, `governance`,
+`healing`, `learning`, `memory`, `observability`, `orchestrator`, `policy`, `realtime`,
+`security`, `shared`, `skill-engine`, `storage`, `tools`, `cku`
+
+To audit compliance across all packages:
+
+```bash
+# Single quotes stop an interactive shell from history-expanding the "!" below.
+pnpm -r exec -- node -e '
+  const p = require("./package.json");
+  ["build", "typecheck", "test"].forEach((s) => {
+    if (!p.scripts?.[s]) console.log(p.name, "missing:", s);
+  });
+'
+```
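+
+The same check can be expressed as a small function, which is easier to unit test or call
+from a CI script. This is a minimal sketch: `PackageManifest` and `missingScripts` are
+hypothetical names and do not exist in the repository.
+
+```typescript
+// Hypothetical helper; mirrors the three scripts required by the standard above.
+interface PackageManifest {
+  name?: string;
+  scripts?: Record<string, string>;
+}
+
+const REQUIRED_SCRIPTS = ["build", "typecheck", "test"] as const;
+
+// Returns the required script names that the manifest does not define.
+function missingScripts(manifest: PackageManifest): string[] {
+  return REQUIRED_SCRIPTS.filter((name) => !manifest.scripts?.[name]);
+}
+```
+
+Calling `missingScripts` on a parsed `package.json` returns an empty array when the package
+conforms, so a CI step could fail whenever the array is non-empty.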
+
+---
+
+## 5. pnpm Workspace Filtering
+
+Use `--filter` to target specific packages or dependency graphs. This avoids rebuilding the
+entire workspace for isolated changes.
+
+| Command | Effect |
+|---|---|
+| `pnpm --filter ./packages/auth build` | Build only the `auth` package |
+| `pnpm -r --filter './packages/*' test` | Run `test` in every package |
+| `pnpm -r --filter '...{packages/auth}' build` | Build `auth` and all packages that depend on it |
+| `pnpm -r --filter '{packages/auth}...' build` | Build `auth` and all of its dependencies |
+| `pnpm --filter ./apps/cli dev` | Start only the CLI app |
+| `pnpm -r --filter './apps/*' build` | Build all apps |
+| `pnpm --filter ./packages/auth test -- --reporter=verbose` | Run auth tests with verbose output |
+
+**Filtering by changed files (useful in CI for affected-only runs):**
+
+```bash
+# Build packages changed since origin/main, plus every package that depends on them
+pnpm -r --filter '...[origin/main]' build
+```
+
+**Running a one-off command in a package without a script:**
+
+```bash
+pnpm --filter ./packages/auth exec -- tsc --version
+```
+
+---
+
+## 6. Environment Variables
+
+The following table documents which scripts require specific environment variables. Missing
+variables will cause silent failures or runtime errors.
+
+| Script | Required Env Vars | Description |
+|---|---|---|
+| `dev:api` | `DATABASE_URL`, `JWT_SECRET`; `PORT` optional (defaults to 4000) | API will not start without a valid DB connection and JWT secret |
+| `dev:web` | `VITE_API_URL` | Vite dev server needs the API base URL for proxying |
+| `dev` | All of `dev:api` + `dev:web` | Combines both sets |
+| `test:auth` | `JWT_SECRET`, `TEST_DATABASE_URL` | Auth tests spin up isolated DB connections |
+| `test:integration` | `DATABASE_URL`, `JWT_SECRET`, `REDIS_URL` | Integration tests require full service stack |
+| `test:security` | `JWT_SECRET`, `TEST_DATABASE_URL` | Security tests exercise auth and permission boundaries |
+| `db:migrate` | `DATABASE_URL` | Migrations connect directly to the configured DB |
+| `db:seed` | `DATABASE_URL` | Seeds require an already-migrated schema |
+| `db:reset` | `DATABASE_URL` | Destructive — drops and recreates; never point at a production URL |
+| `docker:run` | All vars passed via `--env-file .env` | Container reads from `.env` at runtime |
+| `release:prepare` | `GITHUB_TOKEN` | Changelog generation may query GitHub API for PR metadata |
+| `package:release` | `GITHUB_TOKEN`, `NPM_TOKEN` (if publishing) | Publishing requires auth tokens |
+
+Copy `.env.example` to `.env` and populate all variables before running any local
+development script.
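+
+Because missing variables can fail silently, a script can check its requirements up front
+and exit with a clear message. The sketch below is illustrative only: `REQUIRED_ENV` and
+`missingEnv` are hypothetical names, and the map covers just three rows of the table.
+
+```typescript
+type Env = Record<string, string | undefined>;
+
+// Subset of the table above; extend it with the remaining scripts as needed.
+const REQUIRED_ENV: Record<string, string[]> = {
+  "dev:web": ["VITE_API_URL"],
+  "test:integration": ["DATABASE_URL", "JWT_SECRET", "REDIS_URL"],
+  "db:migrate": ["DATABASE_URL"],
+};
+
+// Returns the variables that are unset or empty for the given script.
+// Scripts that are not in the map are treated as having no requirements.
+function missingEnv(script: string, env: Env): string[] {
+  const required = REQUIRED_ENV[script] ?? [];
+  return required.filter((key) => !env[key]);
+}
+```
+
+In a Node entry point, `missingEnv("db:migrate", process.env)` would return the names to
+report before the script attempts a connection.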
+
+---
+
+## 7. Development Workflow Walkthrough
+
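+Follow these steps to go from a fresh clone to a running local environment. Steps 4 and 5
+use `dev` and `test:all`, which are available only after the Section 3 additions are applied.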
+
+**Step 1 — Clone and install**
+
+```bash
+git clone https://github.com/your-org/code-kit-ultra.git
+cd code-kit-ultra
+pnpm install
+```
+
+pnpm will install all workspace dependencies and link packages to each other. Do not use
+`npm install` or `yarn` — the lockfile is pnpm-specific.
+
+**Step 2 — Set up environment variables**
+
+```bash
+cp .env.example .env
+```
+
+Open `.env` and fill in all required values. At minimum: `DATABASE_URL`, `JWT_SECRET`.
+See Section 6 for the full list.
+
+**Step 3 — Set up the database**
+
+```bash
+pnpm db:migrate
+pnpm db:seed
+```
+
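+This creates the schema and loads development fixtures. While `db:migrate` and `db:seed`
+still point at their placeholder `echo` commands, both steps only print a message and change
+nothing. Once `db:reset` has been added, run `pnpm db:reset` if you need a clean slate.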
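+
+Since `db:reset` is destructive, the reset script can refuse to run against anything that
+does not look like a local database. The following is a sketch under stated assumptions:
+`isLocalDatabaseUrl` is a hypothetical helper, and treating only `localhost` and
+`127.0.0.1` as local is a choice made for this example.
+
+```typescript
+// Hosts treated as local; an assumption for this sketch.
+const LOCAL_HOSTS = new Set(["localhost", "127.0.0.1"]);
+
+// Returns true only when the URL parses and its host is in LOCAL_HOSTS.
+function isLocalDatabaseUrl(databaseUrl: string): boolean {
+  try {
+    return LOCAL_HOSTS.has(new URL(databaseUrl).hostname);
+  } catch {
+    return false; // Unparseable URLs are rejected rather than trusted.
+  }
+}
+```
+
+A reset script could call this on `DATABASE_URL` first and exit non-zero when it returns
+`false`, before dropping anything.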
+
+**Step 4 — Start development servers**
+
+```bash
+pnpm dev
+```
+
+This starts the control-service API (with tsx watch for hot reload) and the Vite web dev
+server concurrently. API defaults to port 4000; web defaults to port 5173.
+
+**Step 5 — Run the test suite**
+
+```bash
+pnpm test:all        # unit + integration
+pnpm test:auth       # auth package only (fast, good for TDD)
+pnpm typecheck       # type errors only
+```
+
+**Step 6 — Build for production**
+
+```bash
+pnpm build
+```
+
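+Run `pnpm typecheck` and confirm it is clean before building. Do not rely on the build to
+catch type errors: `tsc` still emits output when type errors are present unless
+`noEmitOnError` is set, and Vite transpiles TypeScript without type checking.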
+
+---
+
+## 8. Release Workflow
+
+Follow these steps in order. Do not skip steps — the release workflows are sequential and
+depend on each other's outputs.
+
+**Step 1 — Run preflight**
+
+```bash
+pnpm preflight
+```
+
+Runs `typecheck` and `test:auth`. Both must pass cleanly. Fix any failures before proceeding.
+
+**Step 2 — Prepare release artifacts**
+
+```bash
+pnpm release:prepare
+```
+
+Generates release notes from conventional commits since the last tag, then appends a new
+entry to `CHANGELOG.md`. Review the generated output before committing.
+
+**Step 3 — Open a PR**
+
+Commit the updated `CHANGELOG.md` and any version bump changes. Open a pull request
+targeting `main`. The PR title must follow conventional commit format (see
+`CONVENTIONAL_COMMITS.md`).
+
+**Step 4 — CI validation**
+
+On push, `ci.yml` runs automatically: checkout → install → typecheck → auth tests → build
+check. The `lint-commits.yml` and `pr-gate.yml` workflows also run on the PR. All checks
+must pass before merge is allowed.
+
+**Step 5 — Merge to main**
+
+After approval, merge the PR. This triggers `ci.yml` on `main`. Do not trigger the release
+workflow until this run is green.
+
+**Step 6 — Trigger the release**
+
+Go to Actions → `release.yml` → Run workflow. Provide the version number and milestone name.
+The workflow runs typecheck + tests, then creates a GitHub release with the generated notes
+and packaged artifacts.
diff --git a/docs/07_cicd/PR_ISSUE_TEMPLATES.md b/docs/07_cicd/PR_ISSUE_TEMPLATES.md
new file mode 100644
index 0000000..5f24c1f
--- /dev/null
+++ b/docs/07_cicd/PR_ISSUE_TEMPLATES.md
@@ -0,0 +1,391 @@
+# PR and Issue Templates Guide
+
+**Project:** Code-Kit-Ultra v1.2.0
+**Template location:** `.github/` (PR template) and `.github/ISSUE_TEMPLATE/` (issue templates)
+**Maintained by:** Core team. Changes to templates require a PR reviewed by at least one
+maintainer and approval from the team lead.
+
+---
+
+## Overview
+
+This document describes every GitHub template used in this repository, provides the full
+content of each template, and explains when and how to use them. Templates enforce consistent
+issue and PR quality, ensure security considerations are captured, and reduce the back-and-forth
+between contributors and reviewers.
+
+| Template | File | Triggers on |
+|---|---|---|
+| Pull Request | `.github/pull_request_template.md` | Every new PR |
+| Bug Report | `.github/ISSUE_TEMPLATE/bug_report.md` | "Bug report" issue type |
+| Feature Request | `.github/ISSUE_TEMPLATE/feature_request.md` | "Feature request" issue type |
+| Security Vulnerability | `.github/ISSUE_TEMPLATE/security_vulnerability.md` | "Security" issue type |
+| Documentation Gap | `.github/ISSUE_TEMPLATE/doc_gap.md` | "Documentation" issue type |
+
+---
+
+## Section 1: Pull Request Template
+
+**File:** `.github/pull_request_template.md`
+
+This template loads automatically when a contributor opens a new PR. Every field must be
+filled in. Reviewers should reject PRs where the template is left blank or removed.
+
+```markdown
+## Summary
+<!-- 1-3 bullet points describing what this PR does -->
+
+## Type of Change
+- [ ] Bug fix (non-breaking change that fixes an issue)
+- [ ] New feature (non-breaking change that adds functionality)
+- [ ] Security fix (addresses a security vulnerability)
+- [ ] Breaking change (fix or feature that would cause existing functionality to change)
+- [ ] Documentation update
+- [ ] Refactor (no behavior change)
+- [ ] CI/CD change
+
+## Changes Made
+<!-- List the key files and what changed in each -->
+
+## Testing
+<!-- How did you test this? What test cases cover this change? -->
+- [ ] Unit tests added/updated
+- [ ] Integration tests added/updated
+- [ ] Manual testing performed — describe steps
+
+## Security Considerations
+<!-- Did this change touch auth, permissions, or audit? -->
+- [ ] No security impact
+- [ ] Auth flow modified — reviewed by security
+- [ ] Multi-tenant scoping verified
+- [ ] Audit events emitted for all material actions
+- [ ] No secrets introduced in code
+
+## Definition of Done
+- [ ] Code compiles (`pnpm typecheck` passes)
+- [ ] Tests pass (`pnpm test:auth` passes, relevant tests added)
+- [ ] PR title follows conventional commit format
+- [ ] No `console.log` in production code paths
+- [ ] Linked to spec (if implementing a `SPEC_*.md`)
+
+## Related Issues
+<!-- Closes #123, Related to #456 -->
+```
+
+**Usage notes:**
+- The PR title must follow conventional commit format: `type(scope): subject`. See
+  `CONVENTIONAL_COMMITS.md`.
+- The `pr-gate.yml` workflow validates the PR title on open and every subsequent edit.
+- If the PR is a security fix, add the `security-review-required` label before requesting
+  review. Do not merge until a maintainer with security authority has approved.
+- Breaking changes require the `breaking-change` label and must document migration steps in
+  the PR body or a linked spec.
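+
+The title rule can be checked locally before pushing. This is a minimal sketch, not the
+logic used by `pr-gate.yml`: `isValidPrTitle` is a hypothetical function, the list of
+allowed types is an assumption (the authoritative list belongs in `CONVENTIONAL_COMMITS.md`),
+and the scope is treated as optional.
+
+```typescript
+// Assumed set of types; replace with the list from CONVENTIONAL_COMMITS.md.
+const ALLOWED_TYPES = ["feat", "fix", "docs", "refactor", "test", "chore", "ci"];
+
+// type, optional (scope), a colon, exactly one space, then a non-empty subject.
+const TITLE_PATTERN = /^([a-z]+)(?:\(([a-z0-9-]+)\))?: (\S.*)$/;
+
+function isValidPrTitle(title: string): boolean {
+  const match = TITLE_PATTERN.exec(title);
+  return match !== null && ALLOWED_TYPES.includes(match[1]);
+}
+```
+
+For example, `fix(auth): reject expired session tokens` passes, while `Fix auth bug` fails
+because it has no lowercase type followed by a colon.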
+
+---
+
+## Section 2: Bug Report Template
+
+**File:** `.github/ISSUE_TEMPLATE/bug_report.md`
+
+```markdown
+---
+name: Bug Report
+about: Report a reproducible defect in Code-Kit-Ultra
+labels: type:bug
+assignees: ''
+---
+
+## Environment
+
+- **Version:** (e.g., v1.2.0 — check `VERSION` file or `pnpm ck --version`)
+- **Node version:** (run `node --version`)
+- **OS:** (e.g., macOS 14, Ubuntu 22.04)
+- **Deployment:** (local dev / Docker / cloud — specify)
+
+## Steps to Reproduce
+
+1.
+2.
+3.
+
+## Expected Behavior
+
+<!-- What should have happened? -->
+
+## Actual Behavior
+
+<!-- What actually happened? -->
+
+## Logs / Error Output
+
+~~~
+<!-- Paste relevant logs, stack traces, or error messages here -->
+~~~
+
+## Security Impact
+
+<!-- Was any sensitive data exposed? Were permissions bypassed? If yes, DO NOT
+file a public issue — use the Security Vulnerability template instead or email
+security@[company].com -->
+
+- [ ] No security impact
+- [ ] Potential security impact — please review the Security Vulnerability template
+
+## Workaround
+
+<!-- Is there a known workaround? Document it here if so. -->
+```
+
+**Usage notes:**
+- Set the `priority:P*` label based on user impact before assigning.
+- If the bug involves data exposure or a permission bypass, close this issue and report it
+  privately through the channels described in Section 4 instead.
+- Add the `component:*` label matching the affected package or app.
+
+---
+
+## Section 3: Feature Request Template
+
+**File:** `.github/ISSUE_TEMPLATE/feature_request.md`
+
+```markdown
+---
+name: Feature Request
+about: Propose a new capability or enhancement
+labels: type:feature
+assignees: ''
+---
+
+## Problem Statement
+
+<!-- As a [role], I need [capability] so that [outcome]. -->
+
+## Proposed Solution
+
+<!-- Describe how you envision this working. Include API shape, UX flow, or
+configuration format if applicable. -->
+
+## Alternatives Considered
+
+<!-- What other approaches did you consider and why did you rule them out? -->
+
+## Acceptance Criteria
+
+- [ ] (Describe verifiable, testable outcomes)
+- [ ] (Each item should be independently checkable)
+- [ ] (Avoid vague criteria like "works correctly")
+
+## Related Specs
+
+<!-- Link to any SPEC_*.md in docs/03_specs/ that this feature relates to or
+would require updating. If no spec exists, note whether one should be created. -->
+
+## Priority
+
+<!-- Select one and explain briefly -->
+- [ ] P0 — Critical: blocking a release or a major customer
+- [ ] P1 — High: significant user impact, no adequate workaround
+- [ ] P2 — Medium: meaningful improvement, workaround exists
+- [ ] P3 — Low: nice to have, minimal user impact
+```
+
+**Usage notes:**
+- A feature request does not guarantee implementation. The core team triages requests weekly.
+- If the feature changes a public API or requires a spec, add the `status:needs-spec` label
+  and link the relevant spec directory.
+- P0 and P1 requests are discussed in the next planning cycle. P2/P3 are backlogged.
+
+---
+
+## Section 4: Security Vulnerability Template
+
+**File:** `.github/ISSUE_TEMPLATE/security_vulnerability.md`
+
+> IMPORTANT: This template exists so reporters know the correct process. The template itself
+> instructs reporters NOT to use a public GitHub issue for vulnerabilities.
+
+```markdown
+---
+name: Security Vulnerability
+about: Report a security vulnerability in Code-Kit-Ultra
+labels: type:security, security-review-required
+assignees: ''
+---
+
+## STOP — Do Not File Public Security Issues
+
+If you have found a security vulnerability, **do not open a public GitHub issue**.
+Public issues are visible to everyone, including potential attackers.
+
+**How to report a security vulnerability:**
+
+1. **Email:** security@[company].com with subject line `[SECURITY] Brief description`
+2. **GitHub Private Advisory:** Use GitHub's private security advisory feature at
+   `Security` → `Advisories` → `Report a vulnerability` in this repository.
+
+We will acknowledge your report within 48 hours and provide a remediation timeline.
+
+---
+
+If you have already confirmed this is NOT a sensitive vulnerability (e.g., it is a
+publicly known issue with a published CVE and no exploit is available), you may
+continue with this template:
+
+## Impact Assessment
+
+<!-- What data or systems are at risk? Who is affected (all users, admins only,
+specific orgs)? What can an attacker do? -->
+
+## Steps to Reproduce
+
+<!-- Only share details in this public issue if the vulnerability is already
+publicly known. Otherwise, share reproduction steps in the private advisory. -->
+
+## Affected Versions
+
+<!-- Which versions are affected? Is the latest release affected? -->
+
+## CVSS Score
+
+<!-- If you have computed a CVSS score, include it here. If not, leave blank. -->
+
+## Suggested Remediation
+
+<!-- If you have a suggested fix or mitigation, describe it here. -->
+```
+
+**Usage notes:**
+- The `security-review-required` label is applied automatically. Do not remove it.
+- A maintainer with security authority must acknowledge and triage all security issues within
+  48 hours of filing.
+- After a fix is merged, a security advisory is published and a CVE is requested if
+  warranted.
+
+---
+
+## Section 5: Documentation Gap Template
+
+**File:** `.github/ISSUE_TEMPLATE/doc_gap.md`
+
+```markdown
+---
+name: Documentation Gap
+about: Report missing, incorrect, or outdated documentation
+labels: type:docs
+assignees: ''
+---
+
+## Which Document Is Missing or Incorrect
+
+<!-- Provide the file path (e.g., docs/03_specs/SPEC_ORCHESTRATOR.md) or describe
+what documentation you expected to find and could not. -->
+
+## What the Correct Behavior Is
+
+<!-- Describe what the system actually does (per the implementation or an existing
+spec), or what documentation should say. -->
+
+## Suggested Content or Correction
+
+<!-- If you know what the correct documentation should say, write it here. Even a
+rough draft is helpful. If you are reporting that documentation simply does not
+exist, describe what topics it should cover. -->
+
+## Related Implementation
+
+<!-- Link to the relevant source file, package, or SPEC_*.md if applicable. -->
+```
+
+**Usage notes:**
+- If the documentation gap reveals a discrepancy between a spec and the implementation,
+  add both `type:docs` and `type:bug` labels.
+- Documentation-only fixes can be merged by any maintainer without a second review if
+  the change is non-controversial (correcting a typo, adding a missing step, etc.).
+
+---
+
+## Section 6: PR Review Checklist
+
+Reviewers must verify the following before approving any PR. This checklist is not embedded
+in the PR template — it is a reviewer responsibility.
+
+**Code quality:**
+- [ ] TypeScript strict mode — no `any` without an explicit justification comment
+- [ ] No `console.log` statements in production code paths (use the `observability` package)
+- [ ] No hardcoded secrets, tokens, or credentials in any file
+- [ ] New behavior has corresponding test coverage (unit tests at minimum)
+
+**Security:**
+- [ ] Multi-tenant scoping enforced: `orgId`, `workspaceId`, and `projectId` are passed and
+      validated at all relevant boundaries
+- [ ] If touching auth: session resolution path tested, RBAC permissions checked
+- [ ] Audit events emitted for all material actions (run create/cancel, gate approve/reject,
+      policy change)
+- [ ] No new dependencies added without a brief justification in the PR description
+
+**Database:**
+- [ ] If schema changes: a migration file is included alongside the code change
+- [ ] No raw SQL string concatenation — use parameterized queries exclusively
+- [ ] New queries are covered by at least one test that exercises the DB layer
+
+**Process:**
+- [ ] PR title follows conventional commit format (`type(scope): subject`)
+- [ ] All PR template checkboxes are completed (not left blank)
+- [ ] If implementing a spec: the spec is linked in the PR description and any deviations
+      are documented
+
+---
+
+## Section 7: GitHub Labels
+
+Configure the following labels in the repository settings. Labels should be created before
+the first PR is opened. Use the exact names and hex colors listed.
+
+**Type labels:**
+
+| Label | Color | Description |
+|---|---|---|
+| `type:bug` | `#d73a4a` | Something is not working correctly |
+| `type:feature` | `#0075ca` | New capability or enhancement |
+| `type:security` | `#e11d48` | Security vulnerability or hardening |
+| `type:docs` | `#0052cc` | Documentation missing, incorrect, or outdated |
+| `type:chore` | `#e4e669` | Maintenance with no user-visible change |
+| `type:test` | `#cfd3d7` | Test-only addition or update |
+
+**Priority labels:**
+
+| Label | Color | Description |
+|---|---|---|
+| `priority:P0` | `#b60205` | Critical — blocking a release or major customer |
+| `priority:P1` | `#e4002b` | High — significant impact, no adequate workaround |
+| `priority:P2` | `#fbca04` | Medium — meaningful improvement, workaround exists |
+| `priority:P3` | `#c2e0c6` | Low — nice to have, minimal user impact |
+
+**Component labels:**
+
+| Label | Color | Description |
+|---|---|---|
+| `component:auth` | `#1d76db` | packages/auth — session, RBAC, permissions |
+| `component:orchestrator` | `#1d76db` | packages/orchestrator — phase engine, run lifecycle |
+| `component:governance` | `#1d76db` | packages/governance — gates, policies |
+| `component:adapters` | `#1d76db` | packages/adapters — external integrations |
+| `component:db` | `#1d76db` | Database schema, migrations, queries |
+| `component:cli` | `#1d76db` | apps/cli — CLI entry point and commands |
+| `component:api` | `#1d76db` | apps/control-service — REST API |
+| `component:web` | `#1d76db` | apps/web-control-plane — Vite frontend |
+
+**Status labels:**
+
+| Label | Color | Description |
+|---|---|---|
+| `status:blocked` | `#b60205` | Cannot proceed — waiting on external dependency |
+| `status:in-review` | `#0075ca` | Actively being reviewed by maintainers |
+| `status:needs-spec` | `#fbca04` | Requires a SPEC_*.md before implementation begins |
+| `status:wont-fix` | `#cfd3d7` | Acknowledged but will not be addressed |
+
+**Special labels:**
+
+| Label | Color | Description |
+|---|---|---|
+| `breaking-change` | `#b60205` | PR introduces a breaking API or behavior change |
+| `security-review-required` | `#e11d48` | Must be reviewed by a maintainer with security authority |
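+
+If the labels are created by script rather than by hand, note that the GitHub REST API
+expects the color as six hex digits without the leading `#`. The sketch below shows the
+conversion only; `LabelSpec` and `toApiPayload` are hypothetical names, and the actual API
+call is left out.
+
+```typescript
+interface LabelSpec {
+  name: string;
+  color: string; // As written in the tables above, e.g. "#d73a4a".
+  description: string;
+}
+
+// Converts a table row into the shape the label API accepts.
+// Throws if the color is not six hex digits, with or without a leading "#".
+function toApiPayload(label: LabelSpec): LabelSpec {
+  const color = label.color.replace(/^#/, "").toLowerCase();
+  if (!/^[0-9a-f]{6}$/.test(color)) {
+    throw new Error(`Invalid color for ${label.name}: ${label.color}`);
+  }
+  return { ...label, color };
+}
+```
+
+Keeping the rows as data and converting them in one place makes it easier to keep the
+repository labels in sync with this document.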