Skip to content

feat: per-persona model selection + self-diagnosing resume handoff#29

Merged
spencermarx merged 41 commits intomainfrom
feat/agent-sessions-and-team-models
May 6, 2026
Merged

feat: per-persona model selection + self-diagnosing resume handoff#29
spencermarx merged 41 commits intomainfrom
feat/agent-sessions-and-team-models

Conversation

@spencermarx
Copy link
Copy Markdown
Owner

@spencermarx spencermarx commented May 6, 2026

Closes #27.

Lands per-persona model selection (the original ask) plus the supporting infrastructure that surfaced as load-bearing once we started actually shipping multi-model team configs through the dashboard:

  • A live event-stream timeline so users can watch their review progress with proper provenance per agent.
  • A self-diagnosing terminal-handoff panel that turns the previous opaque "fresh-start fallback" into typed UnresumableReason cases with copyable diagnostics — backed by a single-owner SessionCaptureService, the per-execution events JSONL as a recovery primitive, and a durable cross-process spawn marker so workflow-id linkage doesn't depend on env-var inheritance.

Highlights

Team multi-model configuration

.ocr/config.yaml default_team now accepts a model per reviewer type, and the dashboard's Team page surfaces a model dropdown per persona/instance with per-instance overrides. The CLI's team-config parser handles three forms (count shorthand, object-with-model, list-of-instance-configs). Adapters declare supportsPerTaskModel; command-runner now surfaces a structured warning when the active vendor lacks per-subagent support and the team configures per-instance models — the previous silent UX failure documented in the archived add-agent-sessions-and-team-models change.

ocr-default-team-composition ocr-default-reviewer-model-configuration ocr-per-review-model-configuration

Self-diagnosing "Pick up in terminal"

When a workflow can't be resumed the panel renders a typed reason
(workflow-not-found / no-session-id-captured / host-binary-missing)
with structured microcopy (headline / cause / remediation) and a
copy-for-issue-report diagnostic block. When it can, vendor-native
cd + claude --resume <id> (or the OpenCode equivalent) commands
are produced server-side from the shared vendor-resume.ts helper —
a single source of truth across the dashboard adapters and the CLI's
own ocr review --resume.

Architecture

  • Branch-by-Abstraction at the SQL layer holds: every vendor_session_id and workflow_id write goes through two primitives in @open-code-review/cli/db (recordVendorSessionIdForExecution, linkDashboardInvocationToWorkflow).
  • SessionCaptureService façade is the single entry point for capture, linkage, and resume-context resolution. Three contract methods (stable across future refactors) plus two internal linkage-discovery strategies that may evolve without spec amendment.
  • Cross-process linkage has three sources of truth in precedence order: --dashboard-uid flag → OCR_DASHBOARD_EXECUTION_UID env var → .ocr/data/dashboard-active-spawn.json marker file (PID-liveness checked). Whichever survives the AI's shell wins.
  • JSONL event journal is read-only on the recovery path. Writers go through SQL primitives; readers may consult JSONL for backfill, never as authoritative state.
  • UnresumableReason is single-sourced from a const-assertion array in unresumable-microcopy.ts. Adding a variant fails CI in three directions (Record exhaustiveness, runtime parity, microcopy completeness lint) — both server- and client-side via type-only re-export.

Three rounds of dogfood-driven self-review (the OpenSpec add-self-diagnosing-resume-handoff change drove the SessionCaptureService work to closure):

  • Round 1: 4 blockers / 5 should-fix → REQUEST CHANGES
  • Round 2: 1 blocker / 5 should-fix → REQUEST CHANGES
  • Round 3: 0 blockers / 7 should-fix / 9 suggestions → APPROVE WITH NITS

All three rounds' final.mds + discourse files are in .ocr/sessions/ for the curious.

Verification

npx nx run-many -t build       ✓
npx nx run-many -t test        ✓ 312 dashboard unit + cli + platform
npx nx run dashboard-api-e2e   ✓ 30 tests
npx nx run cli-e2e             ✓ 31 tests
openspec validate --strict     ✓

Known follow-ups (deferred with explicit doc)

  • tasks.md 8.5/8.6/8.7 — live verification (fresh review / simulated failure / simulated recovery against a real dashboard) — ~15 min of human QA.
  • tasks.md 8.8 — terminal-handoff-panel.test.tsx rendering panel through @testing-library/react + jsdom for each UnresumableReason. Vitest is currently environment: 'node'; this is a follow-up infra PR.
  • Architectural backlog (next-PR / future):
    • WorkflowLinkagePolicy extraction to shrink the service back to its three contract methods.
    • Cross-package shared UnresumableReason source (currently dashboard client uses a type-only re-export from server).
    • Adapter buildResumeCommand 4-line pass-through cleanup.
    • CI lint rule for proc.stdout?.on('data', (chunk: Buffer)) without adjacent setEncoding — would have caught the round-2 UTF-8 regression mechanically.

Test plan

  • Verify per-persona model selection in .ocr/config.yaml is honored when running ocr review from the CLI
  • Verify the dashboard Team page persists per-instance models and shows a structured warning when the active vendor doesn't support per-task models
  • Run a fresh review through the dashboard, copy the resume command from the panel, paste into a terminal, confirm the session reattaches
  • Force a failure (delete vendor_session_id from the DB), confirm the panel renders structured reason + remediation
  • Force recovery (delete vendor_session_id from the DB but leave the events JSONL intact), confirm the panel renders the resumable command after replay backfills

🤖 Generated with claude-flow

spencermarx and others added 30 commits April 30, 2026 15:53
Co-Authored-By: claude-flow <ruv@ruv.net>
Aligns OCR's vision for model selection on Issue #27 with a written
problem brief — the constraints, the user need, and the design
non-goals — so the proposal that follows ties directly to a documented
problem rather than a one-off conversation.

Co-Authored-By: claude-flow <ruv@ruv.net>
Introduces the OpenSpec change for first-class agent session
journaling, three-form per-instance model configuration, and the
dashboard team-management surface. Includes proposal, design rationale,
tasks, and per-capability spec deltas across cli/config/dashboard/
review-orchestration/reviewer-management/session-management/sqlite-state.

Will be archived after implementation lands.

Co-Authored-By: claude-flow <ruv@ruv.net>
Adds esbuild configurations for the new library modules so the
dashboard can re-bundle them without inlining banners. Marks yaml
external in team-config to keep CJS require shims out of the
dashboard server bundle.

Co-Authored-By: claude-flow <ruv@ruv.net>
Migration v11 extends command_executions with the lifecycle columns
that previously lived in a parallel agent_sessions table: workflow_id,
parent_id, vendor, vendor_session_id, persona, instance_index, name,
resolved_model, last_heartbeat_at, notes. Liveness is derived rather
than stored — running rows have no finished_at; the sweep marks
heartbeat-stale rows orphaned (exit_code = -3).

Why: command history and agent sessions describe the same lifecycle
event. Two journals diverge under load. Unifying them gives the
command-history UI free liveness/resume affordances and removes the
duplicate writer pattern.

Co-Authored-By: claude-flow <ruv@ruv.net>
Adds start-instance, bind-vendor-id, beat, end-instance, and list
subcommands. The AI calls these from its workflow to journal what it
spawned, what model it ran, what vendor session it created, and when
it last reported alive. OCR doesn't spawn; it just records.

runtime-config.ts reads runtime.agent_heartbeat_seconds from
.ocr/config.yaml with a 60s default and validates positive integers.

Co-Authored-By: claude-flow <ruv@ruv.net>
Three-tier resolution: native enumeration via the vendor CLI when
available, then a small bundled known-good list (claude-opus-4-7,
sonnet-4-6, haiku-4-5; opencode provider-prefixed ids), then free
text. The user's CLI ultimately decides what's valid — we never
gatekeep against unknown ids.

--vendor flag bypasses PATH detection so this is testable on CI
runners without claude/opencode installed.

Co-Authored-By: claude-flow <ruv@ruv.net>
Adds a three-form default_team schema (shorthand count, object with
shared model, list of per-instance configs) plus the ReviewerInstance
normalization seam that all consumers read through. Form is picked by
YAML type — mixing forms within one persona key is rejected at parse.

ocr team set --stdin uses parseDocument so existing comments and
unrelated keys survive a save: scalar values mutate in place when
form is unchanged, the whole pair is replaced only when the form
changes (e.g. scalar to seq). Inline comments on unchanged team
entries are preserved.

set also regenerates reviewers-meta.json so is_default flags reflect
the new composition immediately. The dashboard's file watcher fires
on this write and refreshes badges via reviewers:updated.

installer.ts switches from regex-based parsing to the team-config
seam for is_default derivation.

Co-Authored-By: claude-flow <ruv@ruv.net>
Adds the review entrypoint that wires the team/session/handoff
commands together. --resume <workflow-id> looks up the latest
captured vendor_session_id for that workflow and re-spawns the AI
CLI with vendor-specific resume flags. CLI index registers the new
session, models, team, and review commands.

state/types.ts gains the resume token plumbing.

Co-Authored-By: claude-flow <ruv@ruv.net>
Updates the OCR skill (both the workspace .ocr copy and the package
source) to teach the host AI Phase 4: read the resolved team via
ocr team resolve, journal each spawned reviewer with ocr session
start-instance, bind vendor session ids when emitted, beat between
phases, and end-instance on completion. config.yaml gains commented
examples of the three-form default_team schema.

The AI orchestrates itself — OCR provides specs, data, and an audit
journal. This documentation update makes that contract explicit.

Co-Authored-By: claude-flow <ruv@ruv.net>
…w session_id capture

Each adapter now exposes listModels with a three-tier fallback (native
enumeration, bundled known-good list, free text). SpawnOptions accepts
model so per-Task model overrides flow through.

command-runner stops dropping session_id events for workflow runs:
the adapter's session_id event is now persisted to the active
command_executions row via bindVendorSessionIdOpportunistically. AI
commands get workflow_id, vendor, and last_heartbeat_at on their
INSERT so liveness derivation works immediately.

Co-Authored-By: claude-flow <ruv@ruv.net>
Adds GET /api/agent-sessions?workflow=<id> reading from the unified
command_executions table, GET /api/sessions/:id/handoff returning
both OCR-mediated and vendor-native resume command pairs (with
hostBinaryAvailable PATH detection), and a startup-time WAL
checkpoint truncate to reclaim stale write-ahead logs.

DbSyncWatcher uses directory-level chokidar with polling+interval=200
to survive sql.js's atomic-rename save semantics, plus a pull-on-read
sync hook so route handlers see the latest CLI writes without waiting
on watcher debounce.

Co-Authored-By: claude-flow <ruv@ruv.net>
GET /api/team/resolved wraps ocr team resolve --json. POST
/api/team/default writes via ocr team set --stdin and emits a
team:updated socket event so badges refresh without a full requery.
GET /api/team/models proxies the active adapter's listModels.

Client hooks (useResolvedTeam, useSetDefaultTeam, useAvailableModels)
hide the React Query plumbing; api-types.ts exports the shared
ReviewerInstance and ModelDescriptor shapes.

Co-Authored-By: claude-flow <ruv@ruv.net>
Custom combobox component used by the team configuration surfaces.
Two-line option rendering (friendly name + raw model id mono),
listbox/option ARIA, click-outside, ESC, arrow-key navigation, and
a free-text fallback for CLIs that don't enumerate models.
defaultOpen + onOpenChange let transient pickers (e.g. Add reviewer)
open immediately and notify parents on close for cancel-on-blur.

Replaces native <select> elements that didn't match the dashboard's
zinc-on-white aesthetic.

Co-Authored-By: claude-flow <ruv@ruv.net>
Relocates default-team configuration from the Command Center to its
proper home alongside the reviewer library. Teams render as a card
grid showing each reviewer's count and resolved model; clicking a
card opens a focused edit dialog with count stepper, same-model /
per-reviewer toggle, and per-instance model dropdowns.

Edits stage in a local draft. Header surfaces Save changes and
Discard buttons only when dirty; Cmd/Ctrl+S saves; beforeunload
warns on navigation. Card visual states differentiate pristine,
modified (small amber dot), newly added (dashed indigo + New pill),
and pending removal (dashed amber + Will remove on save + undo
button). Removing a disk-resident card stages the removal; removing
a draft-only card discards it instantly.

ReviewerDialog (Command Center per-run override) gains a per-card
Advanced disclosure with the same uniform / per-instance model
picker. ReviewerSelection extends with optional models array.
serializeTeam emits JSON when any selection carries models;
otherwise the existing persona:count shorthand.

Reviewer cards in the library show an In default team xN badge
that updates live via reviewers:updated.

Co-Authored-By: claude-flow <ruv@ruv.net>
LivenessHeader derives running / stalled / orphaned states from the
unified command_executions journal and renders them above the phase
tracker on the session detail page. ResumeCard surfaces the
in-dashboard 'Continue here' affordance for stalled and orphaned
sessions.

TerminalHandoffPanel is the explicit hand-off surface. Two modes:
OCR-mediated (default — re-enter the review workflow with
ocr review --resume) and vendor-native (escape hatch — claude
--resume, opencode run --session --continue). Two-step copyable
command sequence with explicit cd. Best-effort PATH detection warns
when the host CLI isn't visible.

use-agent-sessions provides the React Query hook over /api/agent-sessions.

Co-Authored-By: claude-flow <ruv@ruv.net>
…nal handoff

Extends StatusFilter and getStatus with stalled (running rows whose
heartbeat is stale) and orphaned (rows the sweep marked exit_code -3)
states, reusing the LivenessHeader palette. Rows with vendor_session_id
gain a per-row 'Pick up in terminal' affordance that opens the same
TerminalHandoffPanel modal scoped to that workflow.

Expanded rows show vendor, resolved_model, and workflow_id metadata
from the new command_executions columns. CommandHistoryEntry type in
use-commands.ts gains those fields.

Co-Authored-By: claude-flow <ruv@ruv.net>
31 tests covering ocr session start-instance / bind-vendor-id / beat
/ end-instance lifecycle and sweep semantics, ocr team resolve across
all three forms plus session-overrides and alias expansion, ocr team
set --stdin round-trip plus reviewers-meta regeneration plus YAML
comment preservation, ocr models list bundled fallbacks, and
ocr review --resume validation.

Real CLI subprocesses, real DB, real config files. Khorikov classical
school — no mocking the integration seams under test.

Co-Authored-By: claude-flow <ruv@ruv.net>
23 tests across the unified journal API surface: GET /api/agent-sessions
returns rows the CLI inserted with their lifecycle fields visible,
GET /api/sessions/:id/handoff returns OCR-mediated and vendor-native
command pairs after binding plus the fresh-start fallback when no
vendor session id is captured, and constructs vendor-correct commands
for both Claude Code and OpenCode.

Cross-process visibility test confirms the dashboard sees CLI writes
via the directory-watching DbSyncWatcher even after sql.js's atomic
saveDatabase rename.

Co-Authored-By: claude-flow <ruv@ruv.net>
Co-Authored-By: claude-flow <ruv@ruv.net>
`**/.claude-flow/data/` accumulates per-agent insights, traces, and
intermediate state that shouldn't reach PRs. Round-1 SF6.

Co-Authored-By: claude-flow <ruv@ruv.net>
Capture/handoff drift across the dashboard surfaced as silent
"fresh-start" fallbacks with no explanation. Proposal extracts a
single-owner SessionCaptureService, replaces the boolean fallback with
a typed UnresumableReason discriminated union + per-reason microcopy,
and uses the per-execution events JSONL as a recovery primitive that
backfills missed bindings before the unresumable outcome is computed.

Co-Authored-By: claude-flow <ruv@ruv.net>
`buildResumeArgs(vendor, sessionId)` and `buildResumeCommand` are the
single source of truth for resume invocations across processes. Both
the dashboard adapters and the CLI's `ocr review --resume` consume
the helper, eliminating the parallel-implementation hazard that round-2
SF11 surfaced (the OpenCode `run "" --session <id> --continue` shape
was wrong in three places at once).

Built as a separate library bundle (`dist/lib/vendor-resume.js`) and
exported via the package's `./vendor-resume` subpath so the dashboard
can `import` it without bundling the full CLI.

Co-Authored-By: claude-flow <ruv@ruv.net>
Two new primitives in `agent-sessions.ts` are the only writers of
`vendor_session_id` and `workflow_id` on `command_executions`
(Branch-by-Abstraction at the SQL layer, per the proposal):

- `recordVendorSessionIdForExecution` — COALESCE-idempotent capture
- `linkDashboardInvocationToWorkflow` — late-link by uid

`ocr state init` now reads the dashboard's spawn marker file
(`.ocr/data/dashboard-active-spawn.json`) when neither the
`OCR_DASHBOARD_EXECUTION_UID` env var nor the `--dashboard-uid` flag
is present. Three-source resolution (flag > env > marker) closes the
linkage gap on shells that strip unfamiliar env vars. PID-liveness
check via `process.kill(pid, 0)` skips stale markers from crashed
dashboards. Round-2 work; round-3 SF4 documents drift-skips-heartbeat.

Co-Authored-By: claude-flow <ruv@ruv.net>
`AiCliAdapter` gains `buildResumeArgs(sessionId): string[]` and
`buildResumeCommand(sessionId): string` (both delegating to the shared
`@open-code-review/cli/vendor-resume` helper). The service can now
look up an adapter by binary name and ask it to produce the resume
command — zero `if vendor === 'claude'` switches at the service layer.

`AiCliService.isAdapterAvailable(vendor)` reads cached startup
detection rather than per-request `spawnSync(binary, ['--version'])`.
The previous probe blocked the event loop for up to 3s on every
handoff request.

`supportsPerTaskModel` is now wired to a real consumer in command-runner
(emits a structured warning when the team has per-instance models and
the active vendor doesn't support them — previously a silent UX failure
per the archived spec).

Adapter unit tests pin the resume argv shape (regression guard for
the round-2 OpenCode `run ""` shape) and the UTF-8 boundary stitch.

Co-Authored-By: claude-flow <ruv@ruv.net>
…oute

The single owner for vendor_session_id capture, workflow_id linkage,
and resume context resolution. Implements the proposal's contract:

- `recordSessionId(executionId, vendorSessionId)` — idempotent capture
  with once-per-execution drift logging (drift is anomaly signal,
  doesn't refresh heartbeat — round-3 SF4 spec amendment).
- `linkInvocationToWorkflow(uid, workflowId)` — late-link primitive.
- `resolveResumeContext(workflowId)` — discriminated `ResumeOutcome`
  union; recovery via `recoverFromEventsJsonl` runs unconditionally
  before unresumable is computed (load-bearing for type completeness).
- Internal linkage-discovery strategies (`autoLinkPendingDashboardExecution`,
  `linkExecutionToActiveSession`) — server-side fallbacks for cross-
  process uid propagation. SQL filtered by status='active' + 30-min
  upper window to avoid mis-binding under concurrent reviews.

`UnresumableReason` derives from `ALL_UNRESUMABLE_REASONS as const`
in `unresumable-microcopy.ts`. Adding a variant in one place fails CI
in three directions (Record exhaustiveness, runtime parity, lint test).

Handoff route (`routes/handoff.ts`) is a pure delegate: request →
`resolveResumeContext` → response. `projectDir` lives on the envelope
(round-3 Sug 4) so the discriminated union discriminates outcome only.

Tests: 22 capture-service tests + 6 recover-from-events scenarios + 3
microcopy completeness + drift-warn-once + concurrent-review SQL
filter regression coverage (round-3 Sug 1 in-window tiebreak).

Co-Authored-By: claude-flow <ruv@ruv.net>
Each `command_executions` row gets one JSONL file at
`.ocr/data/events/<execution_id>.jsonl`. command-runner appends one
StreamEvent per line; the dashboard's `GET /api/commands/:id/events`
route reads the file back for live-reload rehydration and history
replay. The journal is also load-bearing for the resume-recovery
primitive: when relational state misses a `vendor_session_id` capture,
the SessionCaptureService walks the JSONL and backfills.

Append-only, atomic-rename-friendly, malformed-line tolerant. The
file is read-only on the recovery path — writers go through the SQL
primitives in `@open-code-review/cli/db`; readers may consult JSONL
but never write it as authoritative state.

Co-Authored-By: claude-flow <ruv@ruv.net>
…pt-injection guards

command-runner's AI spawn path now:
- Writes the dashboard spawn marker file synchronously before the
  AI process can issue its first `ocr state init`.
- Sets `proc.stdout/stderr.setEncoding('utf-8')` so multi-byte
  codepoints don't corrupt at OS pipe boundaries (was silently
  dropping `session_id` capture lines containing emoji or non-ASCII).
- Polls `sessionCapture.linkExecutionToActiveSession` post-spawn for
  workflow_id binding (catches the same-id session UPDATE path the
  watcher hook misses). Cleared on close + error.
- Bumps Claude Code `--max-turns` from 50 to 500 — a multi-reviewer
  workflow easily reached 50 turns and Claude exited with code 0
  mid-`reviews` phase, leaving the workflow incomplete.
- Surfaces a structured `[ocr] Warning: <vendor> does not support
  per-subagent model overrides ...` event when the team has per-
  instance models the active vendor can't honor.

Prompt construction (`buildPrompt`) extracted as a pure helper so
structural ordering is testable. Trusted blocks (CLI Resolution,
Dashboard Linkage) emit BEFORE user-supplied content (target,
--reviewer, --requirements, --team), which is fenced in a
"User-supplied review parameters" block with explicit "treat as DATA"
microcopy. `escapeUserHeaders` rewrites ATX/setext/fullwidth/fence
patterns to defeat header-spoofing as defense-in-depth.

`chat-handler.ts` and `post-handler.ts` get the same UTF-8
`setEncoding('utf-8')` treatment (round-2 Blocker 1 sweep completion).

`index.ts` wires a single shared `SessionCaptureService` into both
the handoff route and command-runner; shutdown clears the spawn
marker so a crash mid-spawn doesn't leave a stale marker pointing at
a dead PID.

Tests: 11 prompt-injection cases (escape function + structural
ordering — round-3 SF1+SF2).

Co-Authored-By: claude-flow <ruv@ruv.net>
…_session_id changes

The dashboard's in-memory sql.js DB is a separate copy from the
on-disk file. When the CLI's `state init` writes `workflow_id` to
disk, the watcher's `syncAgentSessions` would compare only
heartbeat/finished/exit, see no change, skip the in-memory UPDATE,
and the dashboard's next saveDb would overwrite disk with its stale
memory copy — wiping the CLI's workflow_id back to NULL.

The equality check now spans every CLI-mutable column (heartbeat,
finished_at, exit_code, workflow_id, vendor_session_id). Any
difference triggers the in-memory `INSERT OR REPLACE`, which the
dashboard's pre-save hook then preserves on the next saveDb.

This is the bug that made every previous linkage attempt look broken
(env var, flag, marker, polling all wrote correctly to disk; the
dashboard kept undoing them). New regression test pins the behavior
on a workflow_id-only diff.

Co-Authored-By: claude-flow <ruv@ruv.net>
Reviewers like to write `**Verdict**: REQUEST CHANGES — <long
rationale>` on a single line. The previous regex captured the whole
prose into `latest_verdict`, which then rendered as a paragraph-long
badge on session cards.

`normalizeVerdict` reduces captured text to a known-keyword prefix
(REQUEST CHANGES, APPROVED, LGTM, NEEDS DISCUSSION, etc.) or clips
unknown phrasings at the first sentence break. Two regression tests
cover the long-rationale and unknown-phrasing cases.

Co-Authored-By: claude-flow <ruv@ruv.net>
spencermarx and others added 11 commits May 6, 2026 16:38
Renders the structured event stream from the AI CLI adapter as a
typed timeline: thinking blocks, tool calls with inputs and outputs,
text deltas accumulated into message blocks, structured errors. Each
block is wrapped in an AgentRail keyed by agentId so multi-agent
fan-outs (orchestrator + reviewers) thread their provenance through
the feed; the rail collapses to nothing when a single agent is the
sole producer (most common case).

Empty/loading state is a centered composition (icon halo + primary
state + supporting microcopy) instead of a bare "Waiting for output"
paragraph. Sticky-to-bottom hook keeps the feed pinned during
streaming, surfaces a "Jump to live" button when the user scrolls
away. Friendly running header replaces the previous raw-command dump
with a parsed verb + reviewer-count chip.

Co-Authored-By: claude-flow <ruv@ruv.net>
…ring

The "Pick up in terminal" panel renders a `ResumeOutcome`
discriminated union: resumable carries the vendor-native command
pair (cd + claude/opencode --resume); unresumable carries microcopy
(headline / cause / remediation) + a copyable diagnostic block for
issue reports. No fabricated commands on failure — Copy buttons and
toggle UI are structurally hidden on the unresumable arm.

The panel renders through `createPortal(document.body)` to escape
parent layout containers' margin-bottom (Tailwind `space-y-*`
applies to fixed-positioned children too, which shifted the modal's
effective bottom edge up by 24px in nested round-page contexts).

Verdict-banner and session-card defensively normalize verdict text
(prefix-match against known labels, fallback to neutral "Verdict"
badge for unknown phrasings) so a long-prose verdict in older data
doesn't crash the round-page render.

Stalled threshold bumped from 60s to 15min — multi-reviewer rounds
sit on a single Claude turn for minutes between heartbeat bumps, so
the previous threshold cried stall on healthy reviews.

Client `UnresumableReason` type re-exports from the server's
`unresumable-microcopy.ts` (round-3 SF3) so client/server can't
drift; type-only import gets erased by the bundler. `HandoffPayload`
hoists `projectDir` to the envelope (round-3 Sug 4) — the
discriminated union now discriminates outcome only.

Co-Authored-By: claude-flow <ruv@ruv.net>
`[vite] ws proxy socket error: write EPIPE` was logged on every
browser tab close/refresh during socket.io upgrades — cosmetic noise,
not a server bug. The error fires from a socket-level handler that
the proxy `configure` callback can't intercept; the durable fix is a
custom Vite logger that drops the specific noise pattern when the
underlying error code is EPIPE or ECONNRESET. Real proxy errors
still surface.

Co-Authored-By: claude-flow <ruv@ruv.net>
End-to-end coverage for the `/api/sessions/:id/handoff` and
`/api/commands/:id/events` routes:

- Each `UnresumableReason` reachable via constructed scenarios
  (workflow-not-found, no-session-id-captured, host-binary-missing).
- Resumable path returns the vendor-native command string with a
  regression guard against the round-2 broken OpenCode shape
  (`run "" --session ...`).
- `state init` env-var late-linkage round-trip: dashboard execution
  → state init with OCR_DASHBOARD_EXECUTION_UID → handoff resolves.
- Events API smoke test — JSONL replay shape.

Co-Authored-By: claude-flow <ruv@ruv.net>
Pins the cross-process linkage contract: `ocr state init` accepts
`--dashboard-uid <uid>` (preferred, survives shell sandboxing) and
falls back to `OCR_DASHBOARD_EXECUTION_UID` env var. After linkage,
`getLatestAgentSessionWithVendorId(workflowId)` resolves the
dashboard parent execution and the resume command is buildable.

Co-Authored-By: claude-flow <ruv@ruv.net>
`.ocr/reviewers-meta.json` and `.ocr/cli-config.json` reflect the
team config + reviewer set used during multi-round dogfooding of
this proposal.

Co-Authored-By: claude-flow <ruv@ruv.net>
Co-Authored-By: claude-flow <ruv@ruv.net>
…o source

CI's nx run dashboard:test failed on fresh checkout because the
adapters import @open-code-review/cli/vendor-resume but the
vitest alias map didn't include it. The published export points at
dist/, which doesn't exist before a build runs. Add the source-file
alias alongside the existing /db and /platform entries so unit tests
work without requiring a prior CLI build.

Co-Authored-By: claude-flow <ruv@ruv.net>
…ondition

CI's e2e jobs failed because nx run dashboard:build runs esbuild on the
server before the CLI's dist/lib/*.js files exist. The dashboard imports
four CLI subpaths (runtime-config, team-config, models, vendor-resume),
all whose package.json exports point at dist/lib/*.js for the standard
import condition.

Adding --conditions=source tells esbuild to honor the source condition
already declared in the CLI package.json, which points at the TS files
in src/lib/. esbuild transpiles those inline during the dashboard
server bundle, so dashboard:build no longer depends on cli:build.

This also avoids a circular Nx graph: cli implicitDependencies on
dashboard (CLI bundles the dashboard build), so we can't add
dashboard:build dependsOn cli:build without breaking the graph.

Co-Authored-By: claude-flow <ruv@ruv.net>
Three screenshots referenced by package READMEs:
- ocr-default-team-composition.png — Team page composition editor
- ocr-default-reviewer-model-configuration.png — per-reviewer model picker
- ocr-per-review-model-configuration.png — Command Center per-review model overrides

Co-Authored-By: claude-flow <ruv@ruv.net>
Document the team/model capabilities that landed in this branch and
were previously absent from any README. Pattern matches the existing
progressive-disclosure layout: capability → one-line value prop →
screenshot.

Top-level README:
- Add "Multi-Model Teams" to the Features TOC and the section itself
  (shorthand, { count, model }, per-instance list, with model aliases)
- Add three Dashboard subsections — default team composition,
  per-reviewer model configuration, per-review model overrides —
  one screenshot each
- Configuration snippet now points at the model forms + aliases

Dashboard package README:
- Three new feature bullets matching the top-level subsections, each
  with the relevant screenshot

CLI package README:
- Bump persona count to 28
- New "Multi-Model Teams" section with config example

Agents package README:
- Brief multi-model paragraph linking to the main README

Co-Authored-By: claude-flow <ruv@ruv.net>
@spencermarx
Copy link
Copy Markdown
Owner Author

Screenshot 2026-05-06 at 10 38 10 PM

Final manual tests passed ✅

@spencermarx spencermarx merged commit 631cc22 into main May 6, 2026
3 checks passed
@spencermarx spencermarx deleted the feat/agent-sessions-and-team-models branch May 6, 2026 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: per-persona model selection for AI CLI adapters

1 participant