feat(trial): zero-friction URL-to-workspace onboarding MVP by simple-agent-manager[bot] · Pull Request #758 · raphaeltm/simple-agent-manager

simple-agent-manager · 2026-04-18T21:35:34Z

Summary

Implements the zero-friction URL-to-workspace onboarding MVP from idea 01KPGJQ853C44JEREXWEZS1GQ8. Anonymous visitors paste a public GitHub repo URL, watch a live discovery agent analyze it, and get pre-generated suggestion chips that lead into a full SAM workspace after a 2-click login.

Built as a single orchestrated PR in three waves (Wave 0 foundation, Wave 1 with four parallel tracks, Wave 2 integration) against the sam/trial-onboarding-mvp integration branch. Not to be merged to main: this is flagged for @raphaeltm manual review before merge and before production configuration is applied.

cc @raphaeltm — Configuration Checklist Before Merge

Staging (`sammy.party`) — zero manual steps required

The deploy pipeline provisions + flips everything automatically:

TRIAL_CLAIM_TOKEN_SECRET — auto-generated by Pulumi (infra/resources/secrets.ts), stored encrypted in the Pulumi R2-backed state, pushed as a Worker secret by configure-secrets.sh (commit 086f4ded)
trials:enabled=true in KV — written by the staging deploy workflow on every run (deploy-reusable.yml + commit b15ca27c removing an invalid --remote flag)
TRIAL_LLM_PROVIDER=workers-ai — already wired in wrangler.toml vars
TRIAL_MODEL=@cf/meta/llama-3.1-8b-instruct
TRIAL_MONTHLY_CAP=1500
sam_anonymous_trials sentinel user — seeded via migration 0043

→ Nothing to click on the staging environment. A fresh workflow_dispatch on deploy-staging.yml gives you a working trial surface.

Production (`simple-agent-manager.org`): manual checklist (the key is the only item to procure)

Procure Anthropic API key budgeted for trials
wrangler secret put ANTHROPIC_API_KEY_TRIAL --env production (separate from platform key)
Set TRIAL_LLM_PROVIDER=anthropic, TRIAL_MODEL=claude-3-5-haiku-latest, TRIAL_AGENT_TYPE=claude-code in production vars
Set TRIAL_MONTHLY_CAP to your preferred prod cap (default 500)
Flip the kill switch when ready: pnpm --filter @simple-agent-manager/api exec wrangler kv key put "trials:enabled" "true" --binding KV --env production
Confirm sam_anonymous_trials sentinel user exists on prod D1
Confirm trial_counter KV namespace + TrialCounter DO bindings exist in prod wrangler env

Cookies

HMAC key for trial fingerprint cookies reuses TRIAL_CLAIM_TOKEN_SECRET (auto-provisioned on staging, manual on prod if desired).

Kill Switches

Set KV trials:enabled=false to instantly pause trial creation. /try cleanly falls back to "Trials are paused" — verified on staging.
TRIAL_MONTHLY_CAP=0 is also a hard stop.

What Shipped

Wave 0 — Foundation (`e253c08e`)

Shared Valibot schemas (packages/shared/src/trial.ts) for requests, responses, SSE events, idea shape
D1 migration 0043: trial_projects, trial_waitlist, sam_anonymous_trials sentinel user
Durable Objects: TrialCounter (monthly cap), TrialEventBus (SSE fan-out)
HMAC-signed cookie helpers (apps/api/src/services/trial/cookies.ts) for fingerprint (7d) and claim (48h) tokens
Kill-switch + cap helpers, discovery prompt template, route stubs
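
As a rough illustration of the HMAC-signed cookie helpers listed above, a minimal sketch follows. It is not the code in apps/api/src/services/trial/cookies.ts: the `value|expiry.signature` layout, the names `signToken` and `verifyToken`, and the use of Node's `node:crypto` module are all assumptions made for the example. Only the 7-day and 48-hour lifetimes come from the list above.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Lifetimes stated in the PR description.
export const FINGERPRINT_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days
export const CLAIM_TTL_MS = 48 * 60 * 60 * 1000; // 48 hours

// Hypothetical helper: base64url HMAC-SHA256 of the payload.
function hmac(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("base64url");
}

// Assumed token layout: "<value>|<expiresAtMs>.<signature>".
export function signToken(
  value: string,
  ttlMs: number,
  secret: string,
  now: number = Date.now(),
): string {
  const payload = `${value}|${now + ttlMs}`;
  return `${payload}.${hmac(payload, secret)}`;
}

// Returns the original value, or null when the token is malformed,
// tampered with, or expired. The signature check is constant-time.
export function verifyToken(
  token: string,
  secret: string,
  now: number = Date.now(),
): string | null {
  const dot = token.lastIndexOf(".");
  if (dot <= 0) return null;
  const payload = token.slice(0, dot);
  const given = Buffer.from(token.slice(dot + 1));
  const expected = Buffer.from(hmac(payload, secret));
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) {
    return null;
  }
  const bar = payload.lastIndexOf("|");
  if (bar < 0) return null;
  const expiresAt = Number(payload.slice(bar + 1));
  if (!Number.isFinite(expiresAt) || now >= expiresAt) return null;
  return payload.slice(0, bar);
}
```

Verifying before parsing means an unsigned or forged value is never trusted, which is the property the later fingerprint-cookie fix in this PR depends on.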

Wave 1 Track A — Backend Lifecycle (`4ca29ea6`)

POST /api/trial/create — validates repo URL, checks kill switch + cap, creates project under sentinel user, starts discovery session
GET /api/trial/status — enabled + remaining slots + reset date (public, no auth)
POST /api/trial/waitlist — cap-exceeded email capture
Cron: month-rollover counter reset + 30d waitlist purge

Wave 1 Track B — Backend Claim + SSE (`6ba2e101`)

GET /api/trial/:trialId/events — SSE stream multiplexed from TrialEventBus DO
POST /api/trial/:trialId/claim — post-OAuth handler that transfers the anonymous project from sentinel user to the newly-signed-in user, validates claim cookie
OAuth callback integration (claim=<trialId> query param round-trip)
Agent wiring: discovery session uses TRIAL_LLM_PROVIDER + TRIAL_MODEL

Wave 1 Track C — Frontend Discovery (`e8088705`)

/try landing page (mobile-first, repo URL input, kill-switch + cap-exceeded fallbacks)
/try/:trialId discovery feed consuming the SSE event stream
/try/cap-exceeded + /try/waitlist/thanks pages
React Router entries wired into App.tsx

Wave 1 Track D — Frontend Chat Gate (`1114c8fc`)

ChatGate component: suggestion chip carousel + textarea + send button
LoginSheet modal triggering GitHub OAuth with claim cookie preserved
useTrialDraft hook: localStorage persistence of the draft across the OAuth round-trip
useTrialClaim hook: post-login auto-submit of the stashed draft to the claimed project's chat

Wave 2 — Integration, Automation, and Live Fix

Merged all 4 Wave 1 tracks into sam/trial-onboarding-mvp. Two conflicts resolved:
- apps/api/src/env.ts — kept both Track A + Track B TRIAL_* env vars.
- apps/web/src/components/trial/ChatGate.tsx — kept Track D's real implementation; adapted Track C's TryDiscovery to Track D's TrialIdea contract + onAuthenticatedSubmit callback.
Automated the staging trial secret (commit 086f4ded): added infra/resources/secrets.ts entry that auto-generates TRIAL_CLAIM_TOKEN_SECRET via @pulumi/random, and wired configure-secrets.sh to push it as a Worker secret. No manual wrangler secret put on staging ever.
Automated the staging kill-switch (commits 086f4ded + b15ca27c): added a conditional step to .github/workflows/deploy-reusable.yml that writes trials:enabled=true to KV on every staging deploy (and only staging). The initial attempt passed --remote, which wrangler kv key put rejected with `Unknown argument: remote` (run 24614985380); the flag was removed in b15ca27c.
Discovered and fixed a Wave 1 integration bug (commit db1d6332): Track A was persisting new trials to D1 only, while Track B readers (events.ts, claim.ts, trial-runner.ts) look up trials in KV via readTrial(). Every SSE connection 404'd with "Trial not found". Fix mirrors the trial to KV in POST /api/trial/create after the D1 insert, before issuing cookies, with rollback on KV failure (D1 row deleted, TrialCounter slot released). writeTrial() also hardened to skip the trial-by-project: index when projectId is empty (would otherwise collide all pending trials on a single key). Added regression test asserting KV.put("trial:<id>", ...) is invoked on the happy path.
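
The writeTrial() hardening described above can be pictured with the small sketch below. It uses a `Map` as a stand-in for KV, and the value stored under each index key (the trialId) is an assumption; only the key prefixes `trial:`, `trial-by-project:` and `trial-by-fingerprint:` appear in this PR.

```typescript
// Hypothetical record shape; the real one lives in the trial-store service.
export interface TrialRecord {
  trialId: string;
  projectId: string; // empty string while the project row does not exist yet
  fingerprint: string;
}

// In-memory stand-in for the KV namespace (assumption for the example).
export function writeTrial(store: Map<string, string>, record: TrialRecord): void {
  store.set(`trial:${record.trialId}`, JSON.stringify(record));
  store.set(`trial-by-fingerprint:${record.fingerprint}`, record.trialId);
  // Skip the by-project index for pending trials: with an empty projectId
  // every pending trial would overwrite the same "trial-by-project:" key.
  if (record.projectId !== "") {
    store.set(`trial-by-project:${record.projectId}`, record.trialId);
  }
}
```

Calling it once with `projectId: ""` at create time and again once the project row exists matches the rewrite behaviour the fix describes.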

Non-negotiable Constraints Verified

Mobile-first (375×667 authoritative) — all four trial screens rendered and screenshot-verified at mobile width
Public GitHub repos only — GITHUB_REPO_URL_REGEX in shared schemas
Locked initial prompt — discovery prompt template owned by the backend; user cannot write the first message
Login gate on chat interactions — ChatGate triggers LoginSheet on any send attempt by an anonymous visitor
Monthly cap + kill switch: TrialCounter DO + KV trials:enabled flag
Staging uses opencode + Workers AI; production will use claude-code + Anthropic
Valibot for runtime validation — every request schema in packages/shared/src/trial.ts
System user pattern — no schema change to projects.userId; anonymous projects owned by sam_anonymous_trials until claimed
HMAC-signed claim cookie — uses auto-provisioned TRIAL_CLAIM_TOKEN_SECRET
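
For the "public GitHub repos only" constraint, a URL-shape check along the following lines is one possibility. The pattern is a guess for illustration, not the actual GITHUB_REPO_URL_REGEX from packages/shared/src/trial.ts, and `parseGithubRepoUrl` is a hypothetical helper. A regex can only validate the URL shape; whether the repo is really public still needs the GitHub probe mentioned in the Track A commit.

```typescript
// Assumed pattern: https only, github.com only, owner/name with an optional
// ".git" suffix and optional trailing slash. Deeper paths are rejected.
export const GITHUB_REPO_URL_REGEX =
  /^https:\/\/github\.com\/([A-Za-z0-9][A-Za-z0-9-]{0,38})\/([A-Za-z0-9._-]{1,100}?)(?:\.git)?\/?$/;

// Hypothetical helper returning the owner and repo name, or null.
export function parseGithubRepoUrl(
  input: string,
): { owner: string; name: string } | null {
  const match = GITHUB_REPO_URL_REGEX.exec(input.trim());
  if (!match) return null;
  const owner = match[1];
  const name = match[2];
  if (!owner || !name) return null;
  return { owner, name };
}
```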

Local Quality Gates

pnpm typecheck — clean across all packages
pnpm lint — 0 errors
API unit tests — 3773 / 3773 passing (includes new writeTrial regression test)
Web unit tests — 1863 / 1863 passing

Staging Deployments

| Run | Commit | Result |
| --- | --- | --- |
| 24614206706 | `c2780059` | ✅ initial merge deploy |
| 24614985380 | pre-`b15ca27c` | ❌ `Unknown argument: remote`, fixed by removing the `--remote` flag |
| 24615223155 | post-`db1d6332` | ✅ final green with kill-switch KV put + all fixes |

Staging Verification (Playwright + curl, live app)

trials:enabled=true in KV on staging, end-to-end happy path exercised:

| Check | Result |
| --- | --- |
| `GET /api/trial/status` | `{"enabled":true,"remaining":1500,"resetsAt":"2026-05-01"}` ✅ |
| `POST /api/trial/create` with `https://github.com/sindresorhus/is` | `201` with `Set-Cookie: sam_trial_fingerprint=…` + `sam_trial_claim=…` ✅ |
| `GET /api/trial/:trialId/events` via real cookies | `HTTP/2 200`, `content-type: text/event-stream`, `: connected` heartbeat ✅ |
| `/try` landing form submission on mobile 375×667 | navigates to `/try/:trialId`, ChatGate renders "Live" status, feed waits for events, zero console errors ✅ |
| Same on desktop 1280×800 | ✅ |

Screenshots: trial-sse-live-mobile.png, trial-sse-live-desktop.png (in .codex/tmp/playwright-screenshots/).

Regression spot-check

Authenticated via smoke-test token login → /dashboard renders, project list loads, 0 console errors
Navigation sidebar, command palette, notifications panel all intact
/health → 200 healthy

What was NOT verified end-to-end

The OAuth claim + post-login auto-submit leg (chat gate → login sheet → GitHub OAuth → /api/trial/:trialId/claim → stashed draft replay) requires a real GitHub OAuth round-trip with a human. All individual components have unit + integration coverage; the OAuth leg is gated behind a real sign-in and deferred to Raphaël's manual review.

Review Status

Full specialist review was not dispatched because this PR is flagged for manual review by @raphaeltm before merge. The needs-human-review label is applied. Raphaël will decide whether to dispatch additional reviewers, flip production config, and proceed to merge.

Do NOT Merge Yet

❌ Do NOT merge to main until Raphaël has reviewed the configuration checklist.
❌ Do NOT deploy to production until the Anthropic key is procured and the OAuth claim leg has been exercised at least once.

🤖 Generated with Claude Code

Lays groundwork for /try:

- shared types (Valibot)
- DB migration 0043 (system user sentinel + trial_waitlist table)
- wrangler TRIAL_COUNTER DO binding (v7 migration) + trial env vars
- trial services (HMAC-signed cookies with constant-time compare, KV kill-switch with 30s cache + fail-closed, discovery prompt)
- 501 route stubs under /api/trial/*
- TrialCounter DO with atomic transactionSync increment/decrement
- frontend Try/TryDiscovery stubs mounted at /try + /try/:trialId
- operator docs at docs/guides/trial-configuration.md
- 43 unit tests covering cookie round-trip/tamper/expiry, kill-switch cache/TTL/fail-closed, and TrialCounter cap enforcement

Trials remain disabled by default (kill-switch fails closed) so this is safe to deploy without setting TRIAL_CLAIM_TOKEN_SECRET. Wave 1 will wire the live create/events/claim/waitlist handlers.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
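
The "KV kill-switch with 30s cache + fail-closed" behaviour mentioned in this commit might be sketched as below. The class name `KillSwitch`, the `KvReader` interface, and the choice to cache a failed read for the full TTL are assumptions for illustration; only the `trials:enabled` key, the 30-second cache, and the fail-closed rule come from this PR.

```typescript
// Minimal slice of a KV namespace (assumed shape for the example).
export interface KvReader {
  get(key: string): Promise<string | null>;
}

const KILL_SWITCH_KEY = "trials:enabled";
const CACHE_TTL_MS = 30_000;

export class KillSwitch {
  private readonly kv: KvReader;
  private readonly ttlMs: number;
  private cached: { enabled: boolean; fetchedAt: number } | null = null;

  constructor(kv: KvReader, ttlMs: number = CACHE_TTL_MS) {
    this.kv = kv;
    this.ttlMs = ttlMs;
  }

  async isEnabled(now: number = Date.now()): Promise<boolean> {
    if (this.cached && now - this.cached.fetchedAt < this.ttlMs) {
      return this.cached.enabled;
    }
    let enabled = false;
    try {
      // Only the exact string "true" enables trials; a missing key disables them.
      enabled = (await this.kv.get(KILL_SWITCH_KEY)) === "true";
    } catch {
      enabled = false; // fail closed: a KV error means trials are off
    }
    this.cached = { enabled, fetchedAt: now };
    return enabled;
  }
}
```

One consequence of the cache is that flipping the flag to false can take up to one TTL to be observed by a given isolate.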

Implements backend lifecycle for zero-friction trial onboarding (Wave 1 Track A):

- trials table + sentinel-installation workaround (migration 0044)
- TrialCounter DO: fetch surface + tryIncrement/prune RPC methods
- POST /api/trial/create with Valibot validation, kill-switch gate, GitHub repo probe (size/privacy), DO slot allocation, and counter-decrement rollback on D1 failure
- GET /api/trial/status with fail-closed fallback when DO throws
- POST /api/trial/waitlist with lowercase-email dedupe via onConflictDoNothing(email, resetDate)
- Three scheduled modules wired into cron dispatch:
  - trial-expire: 5-min sweep marks expired trials
  - trial-rollover: monthly DO pruning (0 3 1 * *)
  - trial-waitlist-cleanup: daily notified-row purge (0 4 * * *)
- All configurable via DEFAULT_* constants + env overrides (Principle XI)
- 92 new behavioral tests covering resolution branches, DO RPC surface, fallback semantics, cookie issuance, and fail-closed error paths

Co-Authored-By: Claude Opus 4.6 <[email protected]>
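
The slot allocation and rollback described in this commit can be modelled in memory as follows. This is a sketch only: `TrialCounterModel`, `monthKey`, and the per-month `Map` are stand-ins invented for the example, whereas the real TrialCounter is a Durable Object using transactionSync. The names `tryIncrement` and `prune` mirror the RPC methods listed above, but their real signatures are not shown in this PR.

```typescript
// "YYYY-MM" bucket for a given date (UTC); an assumed keying scheme.
export function monthKey(date: Date): string {
  return date.toISOString().slice(0, 7);
}

export class TrialCounterModel {
  private counts = new Map<string, number>();

  // Reserves one slot if the cap allows it. A cap of 0 is a hard stop.
  tryIncrement(month: string, cap: number): { ok: boolean; count: number } {
    const current = this.counts.get(month) ?? 0;
    if (cap <= 0 || current >= cap) return { ok: false, count: current };
    this.counts.set(month, current + 1);
    return { ok: true, count: current + 1 };
  }

  // Releases a slot, e.g. when the D1 insert fails after allocation.
  decrement(month: string): number {
    const next = Math.max(0, (this.counts.get(month) ?? 0) - 1);
    this.counts.set(month, next);
    return next;
  }

  // Drops every bucket except the one to keep; returns how many were removed.
  prune(keepMonth: string): number {
    let removed = 0;
    for (const key of Array.from(this.counts.keys())) {
      if (key !== keepMonth) {
        this.counts.delete(key);
        removed++;
      }
    }
    return removed;
  }
}
```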

Builds the frontend components that gate the trial experience behind GitHub auth: a chat input with suggestion chips for anonymous users, and a login sheet that opens when they send their first message. Integration into TryDiscovery (SSE streaming `trial.idea` events) lands in wave-2 alongside the live /claim handler.

Components

- ChatGate: autogrowing textarea + horizontally-scrolling chip row; Cmd/Ctrl+Enter submits, Enter inserts newline; disabled state when empty/whitespace; surfaces submit errors without clearing the draft
- LoginSheet: responsive dialog (mobile bottom-sheet, desktop centered modal) with Escape/backdrop/close-button dismissal, focus trap between primary CTA + close, body scroll lock, return-to URL construction (trialId URL-encoded, ?claim=1 sentinel)
- SuggestionChip: 44px-tall touch target with title + optional summary, aria-label compose, disabled state

Hooks

- useTrialDraft: per-trialId localStorage draft with 400ms debounce (flush-on-unmount), synchronous writes when debounceMs=0, rehydrates on trialId change, no-ops with undefined trialId
- useTrialClaim: idle → claiming → submitting → done/error state machine; injectable claim/submit fns for testing; StrictMode-safe (single claim per mount); clears draft only on successful submit; preserves projectId when submit fails so UI can retry

Harness + tests

- TrialChatGateHarness at /__test/trial-chat-gate (public, not linked from nav) renders ChatGate + LoginSheet with query-param-driven mock data (ideas=0..20, long=1, auth=1, loginOpen=1) so Playwright can capture screenshots without hitting the real claim flow
- 43 new unit tests across components + hooks covering rendering, interactions, persistence, error states, focus management
- 13 Playwright visual scenarios at 375x667 + 1280x800: empty state, 1/5/20 chips (page-level overflow asserted false; chip row owns its horizontal scroll), long-text wrapping, anonymous send opening LoginSheet, bottom-sheet vs centered-modal layouts, 44px touch targets on send button + suggestion chips

Co-Authored-By: Claude Opus 4.6 <[email protected]>
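
The useTrialClaim state machine (idle → claiming → submitting → done/error) can be expressed as a pure reducer, sketched below. The event names and the `claimReducer` function are hypothetical; the hook's real internals are not shown in this PR. The sketch keeps two behaviours the commit calls out: a repeated start is ignored, and projectId survives a failed submit so the UI can retry.

```typescript
export type ClaimState =
  | { status: "idle" }
  | { status: "claiming" }
  | { status: "submitting"; projectId: string }
  | { status: "done"; projectId: string }
  | { status: "error"; message: string; projectId?: string };

// Hypothetical events driving the machine.
export type ClaimEvent =
  | { type: "start" }
  | { type: "claimed"; projectId: string }
  | { type: "submitted" }
  | { type: "failed"; message: string };

export function claimReducer(state: ClaimState, event: ClaimEvent): ClaimState {
  if (event.type === "start" && state.status === "idle") {
    return { status: "claiming" };
  }
  if (event.type === "claimed" && state.status === "claiming") {
    return { status: "submitting", projectId: event.projectId };
  }
  if (event.type === "submitted" && state.status === "submitting") {
    return { status: "done", projectId: state.projectId };
  }
  if (event.type === "failed" && state.status === "claiming") {
    return { status: "error", message: event.message };
  }
  if (event.type === "failed" && state.status === "submitting") {
    // The claim already succeeded, so keep projectId for a retry.
    return { status: "error", message: event.message, projectId: state.projectId };
  }
  // Any other combination (e.g. a second "start") leaves the state unchanged.
  return state;
}
```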

Wire trial onboarding backend so the post-OAuth claim flow and the per-trial event stream work end-to-end.

- TrialEventBus DO: in-memory ring buffer (MAX_BUFFERED_EVENTS=500) with long-poll /poll, /append with terminal-event auto-close, /close, waiter-wake semantics. Configurable via TRIAL_EVENT_BUS_DEFAULT_POLL_TIMEOUT_MS.
- trial-store service: KV-backed writeTrial/readTrial/markTrialClaimed with 3-key indexing (by trialId, by projectId, by fingerprint).
- trial-runner: mode-aware config resolution (staging=opencode+workers-ai, production=claude-code+anthropic); production requires ANTHROPIC_API_KEY_TRIAL. startDiscoveryAgent creates chat + ACP session with discovery prompt. emitTrialEvent/emitTrialEventForProject append to TrialEventBus best-effort.
- GET /api/trial/:trialId/events: fingerprint-cookie-authenticated SSE. Verifies trial record + HMAC signature + UUID match (fails closed on any mismatch). Heartbeat every TRIAL_SSE_HEARTBEAT_MS (default 15s); long-poll DO every TRIAL_SSE_POLL_TIMEOUT_MS; max duration TRIAL_SSE_MAX_DURATION_MS. Closes on terminal event.
- POST /api/trial/claim: auth-required; verifies HMAC claim cookie; atomic D1 UPDATE with WHERE userId=TRIAL_SENTINEL_USER_ID precondition; clears claim cookie; returns {projectId, claimedAt}. Returns 409 on UPDATE-changes=0 race.
- OAuth callback hook (maybeAttachTrialClaimCookie): on 2xx/3xx response from /callback/github, if a valid fingerprint cookie maps to an unclaimed non-expired trial, sign a claim token, set sam_trial_claim cookie, and rewrite Location to https://app.${BASE_DOMAIN}/try/:trialId?claim=1.
- Env + wrangler binding for TRIAL_EVENT_BUS Durable Object.

70 new unit tests (6 files) cover DO long-poll/waiter-wake/terminal-close, SSE auth-failure matrix + happy path, claim route 400/404/409/200 branches, oauth-hook bail-out matrix + rewrite happy path, trial-runner config resolution + error paths, and trial-store round-trips.
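
A bounded event buffer with cursor-based polling and terminal-event auto-close, as described for TrialEventBus, might look like the sketch below. It leaves out the long-poll waiters and the Durable Object plumbing. The class name, the `seq` cursor, and the choice of `trial.ready` and `trial.error` as terminal event types are assumptions made for the example; only MAX_BUFFERED_EVENTS=500 is taken from the commit message.

```typescript
export interface BufferedEvent {
  seq: number; // monotonically increasing cursor (assumed scheme)
  type: string;
  data: unknown;
}

export const MAX_BUFFERED_EVENTS = 500;

// Assumption: these two event types end the stream.
const TERMINAL_TYPES: ReadonlySet<string> = new Set(["trial.ready", "trial.error"]);

export class TrialEventBuffer {
  private events: BufferedEvent[] = [];
  private nextSeq = 1;
  private closed = false;
  private readonly max: number;

  constructor(max: number = MAX_BUFFERED_EVENTS) {
    this.max = max;
  }

  // Appends an event; returns false once the buffer has been closed.
  append(type: string, data: unknown): boolean {
    if (this.closed) return false;
    this.events.push({ seq: this.nextSeq++, type, data });
    if (this.events.length > this.max) this.events.shift(); // drop the oldest
    if (TERMINAL_TYPES.has(type)) this.closed = true;
    return true;
  }

  // Returns every buffered event newer than the caller's cursor.
  poll(cursor: number): { events: BufferedEvent[]; cursor: number; closed: boolean } {
    const events = this.events.filter((e) => e.seq > cursor);
    const last = events[events.length - 1];
    return { events, cursor: last ? last.seq : cursor, closed: this.closed };
  }
}
```

A reader that reconnects with its last cursor only misses events if more than `max` were appended while it was away.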

Replaces Wave 0 stubs with full trial discovery flow:

- Try landing page with GitHub URL validation + error branches (invalid_url, repo_private, trials_disabled, cap_exceeded, existing_trial)
- TryDiscovery streams SSE events (started, progress, knowledge, idea, ready) with exponential backoff reconnect (max 5 retries) and renders repo header, progress, knowledge graph, ideas, and workspace-ready CTA
- TryCapExceeded page with waitlist email capture + inline validation
- TryWaitlistThanks confirmation page
- trial-api client: createTrial, joinWaitlist, openTrialEventStream
- ChatGate stub placeholder for Track D integration

Tests:

- Vitest component tests for Try + TryCapExceeded (11 cases: URL validation, success nav, existing-trial resume, each error branch, email validation, waitlist submit, API error)
- Playwright visual audit at 375x667 and 1280x800 covering landing, discovery (streaming/ready/empty), cap-exceeded, waitlist-thanks, and all inline error states; overflow asserted on every test

Mobile-first with design tokens; 56px primary CTA, 44px secondary touch targets; env(safe-area-inset-*) padding.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
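
The "exponential backoff reconnect (max 5 retries)" used by TryDiscovery could be computed as in this sketch. The function name, the 1-second base delay, and the 30-second ceiling are assumptions; the commit only states the retry limit.

```typescript
export const MAX_RETRIES = 5;

// attempt is the number of retries already made (0 for the first retry).
// Returns the delay before the next reconnect, or null when retries are exhausted.
export function reconnectDelayMs(
  attempt: number,
  baseMs: number = 1_000,
  maxMs: number = 30_000,
): number | null {
  if (attempt >= MAX_RETRIES) return null;
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

With the assumed defaults the delays are 1s, 2s, 4s, 8s, and 16s, after which the caller would surface an error state instead of reconnecting.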

… integration

Resolves conflict in ChatGate.tsx by keeping Track D's real implementation; adapts TryDiscovery to Track D's ChatGate contract (TrialIdea shape, onAuthenticatedSubmit handler that navigates to the claimed project chat with the message staged in sessionStorage).

… kill-switch

Previously, self-hosters had to manually run `wrangler secret put TRIAL_CLAIM_TOKEN_SECRET` and `wrangler kv key put trials:enabled true` before the /try flow would work on staging. Wire both into the standard deployment pipeline so staging trials are live out of the box.

Changes:

- infra/resources/secrets.ts: add `trial-claim-token-secret` RandomId resource (32 bytes base64) + export `trialClaimTokenSecret` Pulumi output, same persistence pattern as encryptionKey / jwtPrivateKey.
- infra/index.ts: re-export the new output.
- scripts/deploy/configure-secrets.sh: read trialClaimTokenSecret from Pulumi state and set it as a required Worker secret on every deploy.
- .github/workflows/deploy-reusable.yml: add a staging-only step that sets KV `trials:enabled=true` via wrangler after the worker deploys. Production stays opt-in per spec (operator flips the flag manually when ready to accept live trial traffic).
- docs/guides/trial-configuration.md: document the automation: no more manual secret-put or kv-put steps for staging.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

`wrangler kv key put` writes to remote by default; --remote is not a valid flag for that subcommand and caused the staging deploy's trial kill-switch step to fail. Co-Authored-By: Claude Opus 4.6 <[email protected]>

…olve it

Track A (create.ts) inserted trial records into D1 only; Track B readers (events.ts, claim.ts, trial-runner.ts) all look trials up via trial-store.readTrial() which reads from KV. The result: every SSE connection 404'd with "Trial not found or expired" seconds after the trial was created.

Integration fix:

- create.ts calls writeTrial() after the D1 insert, with projectId='' (Track B's orchestrator rewrites the KV record once the project row exists). On KV failure, roll back the D1 row and release the TrialCounter slot so we don't burn a cap entry.
- writeTrial() skips the trial-by-project index when projectId is empty, preventing all pending trials from colliding on `trial-by-project:`.
- events.ts: use errors.notFound('Trial'); the previous argument produced a doubled "Trial not found or expired not found".

Added a regression test asserting writeTrial is invoked from the happy path (captures the exact KV put) so this bug cannot silently recur.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

simple-agent-manager · 2026-04-18T22:31:44Z

Staging verification update — trials automation + integration fix

Two follow-up commits landed after the initial PR:

1. Deploy automation (commits 086f4ded + b15ca27c)

TRIAL_CLAIM_TOKEN_SECRET is now auto-provisioned by Pulumi (infra/resources/secrets.ts) and pushed by configure-secrets.sh on every deploy — no manual wrangler secret put
trials:enabled KV flag is set automatically by deploy-reusable.yml on staging deploys — no manual wrangler kv key put
Production remains opt-in (operator flips the flag when ready)

2. Wave 1 integration bug fix (commit db1d6332)

Track A persisted trials to D1; Track B read from KV via trial-store.readTrial(). Nothing wrote KV → every SSE /events call returned 404 "Trial not found".
Fix: create.ts calls writeTrial() after the D1 insert with projectId='' (Track B's orchestrator rewrites the record once the project row exists). On KV failure, D1 row is rolled back and the TrialCounter slot released.
Hardened writeTrial() to skip the by-project index when projectId is empty, preventing pending-trial collisions.
Added regression test asserting writeTrial is invoked — this bug cannot silently recur.

Staging verification evidence (run 24615223155, 2026-04-18 22:22Z):

✅ /api/trial/status → {"enabled":true,"remaining":1498,"resetsAt":"2026-05-01"}
✅ POST /api/trial/create with public repo URL → 201 with set-cookie fingerprint + claim cookies, returns trialId
✅ GET /api/trial/:trialId/events with fingerprint cookie → HTTP/2 200 text/event-stream, : connected heartbeat received
✅ /try/:trialId page renders ChatGate in "Live" state (green), zero console errors, on mobile 375×667 and desktop 1280×800

Updated configuration checklist for @raphaeltm:

✅ ~~TRIAL_CLAIM_TOKEN_SECRET~~ — auto-provisioned by Pulumi, no action needed
✅ ~~Staging kill-switch~~ — auto-set by deploy workflow, no action needed
Production kill-switch — flip trials:enabled=true manually when ready: pnpm --filter @simple-agent-manager/api exec wrangler kv key put "trials:enabled" "true" --binding KV --env production
Production Anthropic key — set ANTHROPIC_API_KEY_TRIAL via wrangler secret put ... --env production once procured (required for production trials — staging uses Workers AI, no key needed)
Optional tunables in apps/api/wrangler.toml: TRIAL_MONTHLY_CAP (default 1500), TRIAL_WORKSPACE_TTL_MS (default 20 min), TRIAL_DATA_RETENTION_HOURS (default 168)

Production deploy and merge remain deferred per your instructions.

…760) * task: move trial-orchestrator-wire-up to active Co-Authored-By: Claude Opus 4.6 <[email protected]> * feat(shared): add trial orchestrator timing/retry constants Introduce DEFAULT_TRIAL_ORCHESTRATOR_* and DEFAULT_TRIAL_KNOWLEDGE_* constants used by the alarm-driven TrialOrchestrator DO and the fast-path GitHub knowledge probes fired from POST /api/trial/create. Every value is env-var overridable (Constitution Principle XI). Co-Authored-By: Claude Opus 4.6 <[email protected]> * feat(trial): add TrialOrchestrator DO binding, env vars, sentinel installation - Declare TRIAL_ORCHESTRATOR DO binding + v9 migration in wrangler.toml - Extend Env interface with TrialOrchestrator/Knowledge tuning knobs and TRIAL_ANONYMOUS_INSTALLATION_ID override - Migration 0045 seeds the system_anonymous_trials_installation sentinel row so anonymous trial projects can satisfy the NOT NULL + FK constraint on projects.installation_id without owning a real GitHub App install The DO class itself is added in the next commit. * feat(trial): add TrialOrchestrator DO state machine Adds the alarm-driven TrialOrchestrator Durable Object (one per trialId) that replaces the fire-and-forget `waitUntil(provisionTrial())` pattern with a resumable state machine. Module layout mirrors TaskRunner: - types.ts — TrialOrchestratorStep union + persisted state shape - helpers.ts — re-exports TaskRunner helpers; adds sentinel-user / sentinel-installation resolvers + safeEmitTrialEvent. - steps.ts — per-step handlers (project_creation, node_selection, node_provisioning, node_agent_ready, workspace_creation, workspace_ready, discovery_agent_start, running). - index.ts — DO class: start(), alarm() dispatch, backoff retry, overall-timeout guard, trial.error emission on failure. Each step emits `trial.progress` at entry so the SSE stream reflects where the orchestrator is. 
Terminal `running` step is idle — the ACP bridge (wired separately) is responsible for emitting `trial.ready` after the discovery agent produces its first assistant turn. All timing/retry knobs read from env vars with DEFAULT_* fallbacks (Constitution Principle XI). Adds two new optional env fields: TRIAL_VM_SIZE and TRIAL_VM_LOCATION for trial-specific VM overrides. Exports the class from apps/api/src/index.ts so the Workers runtime can instantiate it via the TRIAL_ORCHESTRATOR binding (already declared in wrangler.toml v9 migration). Task: tasks/active/2026-04-19-trial-orchestrator-wire-up.md * feat(trial): bridge ACP/MCP events into trial SSE stream Adds a dedicated `services/trial/bridge.ts` module with three helpers that hook into existing hot paths and fan qualifying events out as `trial.*` SSE events: - bridgeAcpSessionTransition: `running` → trial.ready (with workspaceUrl derived from BASE_DOMAIN + workspaceId), `failed` → trial.error. - bridgeKnowledgeAdded: fires trial.knowledge when the discovery agent adds a knowledge observation via MCP. - bridgeIdeaCreated: fires trial.idea with a summary-clipped excerpt when the discovery agent creates an idea via MCP. All three helpers short-circuit on non-trial projects after a single `readTrialByProject(env, projectId)` KV lookup, so normal (non-trial) project traffic only pays that one extra KV read on qualifying events. Hook sites: - ProjectData DO `transitionAcpSession` — dynamic-imports the bridge and dispatches after the transition succeeds, guarded by `if (projectId)` and wrapped in try/catch so bridge errors never block the transition. Casts `this.env` through unknown to the worker-scope Env because the DO's local Env type is intentionally narrow. - `handleAddKnowledge` MCP handler — dispatches after addKnowledgeObservation. - `handleCreateIdea` MCP handler — dispatches after the DB insert. 
Every dispatch is fire-and-forget; bridge errors are already caught inside each helper but the call sites add a second try/catch for defense. Task: tasks/active/2026-04-19-trial-orchestrator-wire-up.md Co-Authored-By: Claude Opus 4.6 <[email protected]> * feat(trial): wire TrialOrchestrator + GitHub knowledge into POST /api/trial/create Adds two fire-and-forget dispatches after the trial record is written and before the HTTP response returns, via c.executionCtx.waitUntil: 1. TrialOrchestrator DO `start()` — kicks off the alarm-driven state machine that provisions a project, workspace, and discovery agent session. The DO is idempotent on `start()`, so accidental re-invocations no-op. 2. emitGithubKnowledgeEvents() — hits unauthenticated GitHub REST endpoints (`/repos/:o/:n`, `/repos/:o/:n/languages`, `/repos/:o/:n/readme`) in parallel and emits up to `TRIAL_KNOWLEDGE_MAX_EVENTS` `trial.knowledge` events within ~`TRIAL_KNOWLEDGE_GITHUB_TIMEOUT_MS` each. Surfaces description, primary language, stars, topics, license, language breakdown, and README first paragraph so the SSE stream shows activity within ~3s while the VM provisions in the background. Both helpers fully swallow errors — an orchestrator dispatch failure or GitHub rate-limit hit never blocks the response or crashes the Worker. All knobs are env-configurable per Constitution Principle XI: - TRIAL_KNOWLEDGE_GITHUB_TIMEOUT_MS (default 5000) - TRIAL_KNOWLEDGE_MAX_EVENTS (default 10) Co-Authored-By: Claude Opus 4.6 <[email protected]> * test(trial): cover orchestrator dispatch, bridge, and GitHub knowledge probe Adds four categories of behavioral tests for the trial onboarding wiring: 1. trial-create.ts.test.ts (+2 cases) - Asserts TrialOrchestrator.start() is dispatched via waitUntil with trialId, repoOwner, repoName, and canonical repoUrl. - Asserts a rejecting start() does NOT propagate — the HTTP response still returns 201 (fire-and-forget contract). 
- Updates makeEnv() to stub TRIAL_ORCHESTRATOR + TRIAL_EVENT_BUS bindings and introduces makeExecutionCtx() helper. - Also adds a graceful-fallback in create.ts so routes that run without a Worker executionCtx (unit tests) still complete instead of 500-ing on Hono's "This context has no ExecutionContext" throw. 2. trial-github-knowledge.test.ts (new, 5 cases) - Happy path: verifies description, primary language, stars, topics, license, language breakdown, and README paragraph are all emitted. - TRIAL_KNOWLEDGE_MAX_EVENTS cap is enforced. - Total network failure → 0 events, no throw. - Non-2xx repo metadata response → 0 events, no throw. - emitTrialEvent rejection → no throw (last line of defense). 3. trial-orchestrator.test.ts (new, 4 cases) - start() persists initial state with currentStep='project_creation' and schedules an alarm. - start() is idempotent — second call with same input is a no-op and does not re-schedule the alarm. - alarm() on a completed state is a terminal no-op. - alarm() emits trial.error and marks completed when the overall timeout budget is exceeded. 4. trial-bridge.test.ts (new, 9 cases) - bridgeAcpSessionTransition: no-ops on non-trial projects, emits trial.ready on 'running' with ws-{id}.{BASE_DOMAIN} URL, emits trial.error on 'failed', no-ops on other transitions, swallows emitter errors. - bridgeKnowledgeAdded / bridgeIdeaCreated: no-op on non-trial, emit correct event shape when trial exists, swallow errors. All 3,793 tests pass; typecheck clean. Co-Authored-By: Claude Opus 4.6 <[email protected]> * docs(trial): document TrialOrchestrator + GitHub knowledge fast-path Adds an "Orchestrator and Fast-Path Knowledge" section to the trial configuration guide covering the two fire-and-forget background tasks dispatched from POST /api/trial/create (TrialOrchestrator DO and the GitHub REST knowledge probe) plus the ACP/MCP event bridge, with tunables tables for both. 
Also records the change in CLAUDE.md "Recent Changes" and marks the corresponding checklist items in the task file.

* style(trial): sort imports per eslint rules

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix(trial): emit trial.started event from orchestrator start()

The SSE stream's first real event must be `trial.started` so the frontend can transition out of the "Warming up..." empty state. Without it, viewers sat on the placeholder until `trial.progress` or `trial.knowledge` arrived — which could be 3-5s later.

Added unit test asserting `emitTrialEvent` is called exactly once with type='trial.started' and the expected shape.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* test(trial): capability test chaining start() + alarm() through event bus

Addresses task-completion-validator HIGH finding #2: no capability test exercised the full orchestrator state machine through the event bus seam. Existing per-method tests covered each transition in isolation but did not chain them.

New test drives:
start() → persist + setAlarm + emit trial.started → (simulate expired budget) → alarm() → mark failed + emit trial.error

The `emitTrialEvent` mock is the event-bus seam; its downstream is already covered by tests/unit/routes/trial-events.test.ts which verifies the bus → SSE stream path.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* chore(trial): archive orchestrator wire-up task

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* test(trial): cover alarm() retry/backoff + step handler invariants

Addresses test-engineer review HIGH findings #1 and #2 (partial).

Finding #1 — alarm() retry/backoff: Added 4 tests driving the step-error catch branches via a `./steps` vi.mock. Covers:
- transient-error + retries-remaining (increments counter and schedules backoff, no failTrial)
- permanent-error (immediate failTrial regardless of budget)
- transient-error with retries exhausted (promotes to failTrial)
- the null-state guard (alarm fires before start())

Finding #2 — step handlers: New `trial-orchestrator-steps.test.ts` covers the two highest-value invariants that don't need D1/DO plumbing mocks:
- handleRunning marks state.completed = true
- handleDiscoveryAgentStart throws permanent on missing IDs
- handleDiscoveryAgentStart is idempotent when session already linked

Broader per-handler coverage (project_creation / node_selection / node_provisioning / node_agent_ready / workspace_creation / workspace_ready) tracked in tasks/backlog/2026-04-19-trial-orchestrator-step-handler-coverage.md — those paths require mocks for drizzle + node-agent + project-data services and are out of scope for this PR.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix(trial): remove hardcoded BASE_DOMAIN fallback + extract heartbeat skew constant

Addresses constitution-validator findings:

HIGH — bridge.ts:41 had `env.BASE_DOMAIN || 'workspaces.example.com'` fallback. BASE_DOMAIN is a non-optional binding; a misconfiguration that let it be empty would silently generate workspace URLs pointing at workspaces.example.com instead of failing loudly. Removed the fallback.

MEDIUM — steps.ts had a hardcoded `30_000` heartbeat-skew window. Extracted to DEFAULT_TRIAL_ORCHESTRATOR_HEARTBEAT_SKEW_MS (shared), TRIAL_ORCHESTRATOR_HEARTBEAT_SKEW_MS env override, getHeartbeatSkewMs() getter on the DO, threaded through TrialOrchestratorContext.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix(trial): per-IP rate limit on POST /api/trial/create + SSE injection guard

Addresses security-auditor HIGH findings:

1. Rate limit on POST /api/trial/create (was missing)
   - New rateLimitTrialCreate() factory (useIp=true, keyPrefix=trial-create)
   - Default 10 req/hr, configurable via RATE_LIMIT_TRIAL_CREATE env var
   - Tighter than the general anonymous bucket because each trial create allocates a Durable Object, fires ~4 GitHub API calls, and consumes a monthly trial slot
   - Mounted per-route in create.ts so the limiter sees request env
   - Regression test exercises 429 path with IP-scoped KV window
2. SSE event-name sanitization in formatSse()
   - Strips CR/LF to prevent SSE-frame injection if a future caller ever bypasses the TrialEvent discriminated union via `as never` casts or dynamic event names
   - Function now exported for direct testing
   - New trial-events-format.test.ts covers: happy path stable shape, CR/LF strip on hostile event name (single event frame survives), and JSON data escaping for embedded newlines

* fix(trial): switch TrialOrchestrator to new_sqlite_classes + drop premature status gate

Addresses cloudflare-specialist HIGH findings:

1. wrangler.toml v9 migration: new_classes -> new_sqlite_classes
   Cloudflare recommends SQLite-backed storage for new DO classes; the KV-style ctx.storage.put() API works identically on both backends but SQLite is the future-forward choice. TrialOrchestrator has not yet been deployed to any environment (introduced in this PR chain), so flipping the migration type is safe.
2. handleNodeProvisioning: remove synchronous status='running' gate
   After provisionNode() returns, async-IP providers (Scaleway, GCP) leave the node in 'creating' status — the IP and status='running' flip happens on the first heartbeat. Synchronously requiring status='running' here forced every async-IP trial through the retry/backoff cycle until the heartbeat landed, wasting retry budget and risking permanent failure on slow VM boots. The next step (node_agent_ready) polls heartbeat freshness with its own timeout, which correctly handles both sync (Hetzner) and async (Scaleway/GCP) provisioning paths.

Regression test: handleNodeProvisioning advances to node_agent_ready even when provisionNode() leaves the node in 'creating' status.

* fix(trial): HMAC-verify fingerprint cookie before reusing UUID

Security-auditor HIGH: the old code extracted the fingerprint UUID from the `sam_trial_fingerprint` cookie by splitting on the last `.` without verifying the HMAC signature. An attacker who learned a victim's fingerprint UUID (from logs, a captured cookie, or a prior trial row) could forge `<victimUuid>.anything` to overwrite the `trial-by-fingerprint:<victimUuid>` KV index to point at their own trial. The victim's subsequent OAuth hook lookup would then redirect them to the attacker's trial project.

Fix: call verifyFingerprint(existingFp, secret) and only trust the returned UUID. Fall back to crypto.randomUUID() on invalid / missing signature. The secret is already resolved earlier in the same handler (line 195-203).

Added regression test in trial-create.ts.test.ts — a forged cookie MUST NOT reuse the victim's UUID; a fresh UUID is minted instead. Updated the "reuses existing fingerprint" test to use a validly-signed cookie.

---------

Co-authored-by: Raphaël Titsworth-Morin <[email protected]>
Co-authored-by: Claude Opus 4.6 <[email protected]>
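The fingerprint fix above can be illustrated with a minimal sketch. It assumes a `<uuid>.<hex HMAC-SHA256>` cookie layout and uses Node's `node:crypto` for brevity; the real `verifyFingerprint` signature, encoding, and runtime (likely Web Crypto on Workers) may differ, and `signFingerprint` and `resolveFingerprintUuid` are hypothetical names.

```typescript
import { createHmac, randomUUID, timingSafeEqual } from "node:crypto";

// Hypothetical helper: hex HMAC-SHA256 of the UUID under the shared secret.
function sign(uuid: string, secret: string): string {
  return createHmac("sha256", secret).update(uuid).digest("hex");
}

// Assumed cookie layout: "<uuid>.<signature>".
function signFingerprint(uuid: string, secret: string): string {
  return `${uuid}.${sign(uuid, secret)}`;
}

// Returns the UUID only when the signature matches; otherwise null.
function verifyFingerprint(cookie: string, secret: string): string | null {
  const dot = cookie.lastIndexOf(".");
  if (dot <= 0) return null;
  const uuid = cookie.slice(0, dot);
  const given = Buffer.from(cookie.slice(dot + 1), "utf8");
  const expected = Buffer.from(sign(uuid, secret), "utf8");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (given.length !== expected.length) return null;
  return timingSafeEqual(given, expected) ? uuid : null;
}

// Trust the cookie's UUID only after verification; otherwise mint a fresh one.
function resolveFingerprintUuid(cookie: string | undefined, secret: string): string {
  const verified = cookie ? verifyFingerprint(cookie, secret) : null;
  return verified ?? randomUUID();
}
```

With this shape, a forged `<victimUuid>.anything` cookie never yields the victim's UUID, because the UUID is returned only from the verified branch.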

* task: move trial-onboarding-ux-polish to active

* feat(trial): polish discovery feed with skeleton timeline + knowledge grouping

- Extract all timing/threshold constants to trial-ui-config.ts (Constitution XI)
- Add STAGE_LABELS map + friendlyStageLabel() for orchestrator stage strings
- TryDiscovery: render StageSkeleton timeline before first SSE event arrives
- TryDiscovery: group rapid trial.knowledge events into a single card
- TryDiscovery: surface "taking longer than usual" hint when SSE silent for 20s
- TryDiscovery: retry-aware terminal error panel
- ChatGate: spinner + aria-busy on send, snap-x chip scroll, anonymous hint copy
- Try: friendlier validation copy, testid hooks for landing audit

* test(trial): cover stage-label mapping + skeleton/error/knowledge-burst Playwright cases

* task: archive trial-onboarding-ux-polish

* fix(trial): SSE replay dedup, accessible badges, larger touch targets

Addresses Phase 5 review findings on the trial onboarding UX polish PR:

CRITICAL — SSE event replay duplication
EventSource silently re-opens after a transport error and the server may replay any buffered events the client missed. Without dedup, the feed duplicated every replayed event. Add a composite (`type:at`) dedup set in TryDiscovery that resets on trialId change.

HIGH — color-only ConnectionBadge (WCAG 1.4.1)
Status was conveyed by background color alone. Prepend a Unicode shape indicator (●/✕/↺/○) so the meaning is also conveyed in monochrome.

HIGH — knowledge toggle hit area (WCAG 2.5.5)
The "+N more" toggle on grouped knowledge cards was 24px tall — below the 44px touch-target minimum. Promote to min-h-11 with vertical hit padding.

MEDIUM — semantic header role + truncation hint
The sticky discovery header used role="banner" (reserved for the page-wide masthead) and the truncated repo title had no full-text hover affordance. Switch to role="region" + aria-label and move the title attribute to the truncating wrapper.

LOW — error CTA touch targets
The "Try again" / "Join the waitlist" Links were below 44px. Promote to inline-flex min-h-[44px].

Tests
- try-discovery-dedup.test.ts: behavioural coverage of eventDedupKey and the dedup branch in onEvent (3 scenarios: identical replay, chronological non-collision, type-vs-timestamp collision).
- try-discovery-build-feed.test.ts: boundary coverage of buildFeed (within-window merge, exact-boundary `<=` merge, +1ms split, interleaved non-knowledge break, error-event exclusion).
- ChatGate.test.tsx: spinner visible/hidden behavioural test using a deferred promise (idle → sending → resolved transitions).
- trial-ui-audit.spec.ts: knowledge-burst test now asserts exactly one grouped card (was: presence only) and exercises the expand toggle.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix(trial): keep StageSkeleton visible after lone trial.started; forward Alert testid

Two narrow fixes uncovered by Playwright visual audit:

1. **StageSkeleton hides too eagerly.** `showSkeleton = events.length === 0` meant a lone `trial.started` event (which is just an acknowledgement, not visible progress) caused the "Setting things up" roadmap to vanish while the user was still staring at a blank screen. Tighten to "no substantive events yet" — keep showing the roadmap until a real progress / knowledge / idea / ready / error event arrives.
2. **`Alert` drops `data-testid`.** The shared design-system `Alert` component didn't declare or forward `data-testid`, so `<Alert variant="error" data-testid="trial-error-panel">` silently discarded the prop and the terminal-error Playwright assertion couldn't find the panel. Add the prop to `AlertProps` and forward it to the rendered `<div role="alert">`.

All 45 Playwright trial-ui-audit tests now pass across iPhone SE, iPhone 14, and Desktop projects.

---------

Co-authored-by: Raphaël Titsworth-Morin <[email protected]>
Co-authored-by: Claude Opus 4.6 <[email protected]>
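The composite `type:at` dedup described in the CRITICAL finding can be sketched roughly as follows. The `TrialEvent` shape (a `type` string plus a numeric `at` timestamp) and the `createDeduper` wrapper are assumptions for illustration; only `eventDedupKey` is named in the commit message.

```typescript
// Assumed minimal event shape: a discriminator plus an emission timestamp.
interface TrialEvent {
  type: string;
  at: number;
}

// Composite key, so two different event types sharing a timestamp do not collide.
function eventDedupKey(event: TrialEvent): string {
  return `${event.type}:${event.at}`;
}

// Hypothetical wrapper around the "seen" set used by the feed.
function createDeduper() {
  const seen = new Set<string>();
  return {
    // Returns true when the event is new and should be rendered.
    accept(event: TrialEvent): boolean {
      const key = eventDedupKey(event);
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    },
    // Called when the trialId changes, so a new trial starts with a clean slate.
    reset(): void {
      seen.clear();
    },
  };
}
```

In a component, `accept` would gate the state update inside the `onmessage` handler, and `reset` would run in the effect that reacts to a new trialId.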

)

* task: move trial-events-debug to active

* task: instrument trial event bus path for staging triage

Add high-signal log.info points at every boundary in the trial event flow so `wrangler tail` can show exactly where the pipeline drops:
- create.ts: log dispatch_begin, orchestrator_task.{enter,stub_ready, start_returned}, knowledge_task.{enter,done}, waitUntil_registered
- trial-runner.ts:emitTrialEvent — log emit_begin / emit_ok
- trial-orchestrator: start.enter, state_put, alarm_set, trial_started_emitted; alarm.enter
- trial-event-bus: handleAppend.enter / stored / rejected_closed

Pure instrumentation — no behavior change. Will be pared back or removed once the failure mode is identified on staging.

* fix(trial): emit unnamed SSE frames so EventSource.onmessage fires

Root cause of the zero-events-on-staging incident (2026-04-19): formatSse() wrote named SSE frames ('event: trial.knowledge\ndata: {...}') but the frontend subscribes via source.onmessage, which only fires for the default (unnamed) event. Bytes arrived on the wire — curl saw them — but no frontend-visible event was ever dispatched.

Change the SSE serializer to emit unnamed frames ('data: {...}'). The TrialEvent payload itself carries a 'type' discriminator so no information is lost. Update the unit test to lock in the new contract (no 'event:' line) and point at the post-mortem.

Also fix a latent eventsUrl contract mismatch: POST /api/trial/create returned '/api/trial/events?trialId=X' while the real route is '/api/trial/:trialId/events'. The frontend builds its own URL so end-users weren't affected, but the response-field contract was wrong. The previous unit test used toContain() on a substring, masking the drift.

See docs/notes/2026-04-19-trial-sse-named-events-postmortem.md.

* test(trial): add TrialEventBus → SSE capability test

Regression guard for the 2026-04-19 incident. Seeds a trial in KV, appends events directly on the TrialEventBus DO (identical to emitTrialEvent()), opens the SSE stream via SELF.fetch with a valid fingerprint cookie, reads the raw stream bytes, and asserts:
- HTTP 200 + correct content-type
- At least one 'data: {...}' frame
- No 'event:' line anywhere (the regression guard)
- The parsed JSON payload round-trips through the bus intact

Also add TRIAL_EVENT_BUS DO binding and TRIAL_* env bindings to the workers vitest config so this test (and future trial-related worker tests) can construct stubs.

Note: the existing workers test pool is currently broken on this branch and base (miniflare WebSocket exits unexpectedly on all 6 pre-existing worker tests too — not caused by this change). Once the pool is unblocked this test runs as-is.

* docs(trial): post-mortem + rule 13 ban curl-only SSE verification

Post-mortem covers what broke, the two-layer contract mismatch (named SSE events + wrong eventsUrl shape), timeline, why it wasn't caught (no E2E capability test, curl used instead of a real browser, frontend test path not exercised), the class of bug, and the process fixes landing in this PR.

Update rule 13 (staging verification) to explicitly ban curl-only verification for browser-consumed SSE/WebSocket streams — curl confirms the byte stream, only a real browser confirms dispatch to onmessage.

* task: record root cause + fixes on trial SSE events task

* test(trial): update trial-events.test SSE assertion for unnamed frames

The integration test for GET /api/trial/:trialId/events was asserting the old named-event contract ('event: trial.ready'). With the formatSse() fix the frame is unnamed; update the assertion to lock in the new contract (data: line present, no event: line).

* task: archive trial SSE events debugging task

* chore(trial): address review findings on SSE events fix

- Add TRIAL_ORCHESTRATOR + TRIAL_COUNTER DO bindings to apps/api/vitest.workers.config.ts (cloudflare-specialist MEDIUM)
- CLAUDE.md: prepend 'trial-sse-events-fix' entry to Recent Changes (doc-sync-validator MEDIUM)
- Fix broken link in postmortem (tasks/active -> docs/notes) and tick the completed rule-13 follow-up checkbox (doc-sync-validator LOW)
- Add cross-reference from .claude/rules/02-quality-gates.md to the rule-13 curl-only SSE-verification ban (doc-sync-validator LOW)
- File pre-existing HIGH (AbortController not propagated into busStub.fetch) and MEDIUM (nextCursor persistence) as backlog tasks so they're tracked but don't block this fix PR

---------

Co-authored-by: Raphaël Titsworth-Morin <[email protected]>
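The unnamed-frame contract from the SSE fix can be shown with a small sketch. The `TrialEvent` shape here is an assumption, and the real `formatSse()` may carry extra fields (an `id:` line, for example); the point is only that the frame has a `data:` line and no `event:` line, so `EventSource.onmessage` receives it.

```typescript
// Assumed payload shape: the "type" discriminator travels inside the JSON body.
type TrialEvent = { type: string; at: number; [key: string]: unknown };

// Serialize as an unnamed SSE frame. JSON.stringify escapes embedded newlines,
// so the payload always fits on a single "data:" line.
function formatSse(event: TrialEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// What a consumer does with one frame: strip the prefix and parse the JSON.
function parseSseFrame(frame: string): TrialEvent {
  const line = frame.split("\n")[0];
  return JSON.parse(line.slice("data: ".length)) as TrialEvent;
}
```

A named frame (`event: trial.knowledge`) would instead require `source.addEventListener("trial.knowledge", ...)` on the client, which is why the `onmessage` subscriber saw nothing.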

…764)

* task: move trial orchestrator agent-boot task to active

* feat(trial): boot discovery agent on VM + detect real default branch

Two bugs blocked the trial demo from working end-to-end:

1. handleDiscoveryAgentStart only created chat + ACP session records but never called createAgentSessionOnNode / startAgentSessionOnNode. The ACP session sat in `pending` forever, never transitioning to `running`, so `trial.ready` never fired.
2. Project defaultBranch + workspace branch were hardcoded to 'main', so trials on master-default repos (e.g. octocat/Hello-World) failed the VM-side `git clone --branch main`.

Fix (mirrors TaskRunner's agent-session-step pattern):
- Add `defaultBranch`, `mcpToken`, `agentSessionCreatedOnVm`, `agentStartedOnVm`, `acpAssignedOnVm`, `acpRunningOnVm` fields to TrialOrchestratorState for crash-safe idempotency.
- `fetchDefaultBranch()` probes GitHub's public API with a 5s AbortController timeout (TRIAL_GITHUB_TIMEOUT_MS override), falls back to 'main' on any failure. Threaded through both `projects.default_branch` and the workspace-side `git clone --branch`.
- `handleDiscoveryAgentStart` now runs a 5-step idempotent flow:
  1. startDiscoveryAgent (existing) -> chat + ACP session records.
  2. createAgentSessionOnNode (new) -> D1 agent_sessions row + VM agent registers the session.
  3. generateMcpToken + storeMcpToken (new) -> KV token so the agent can call add_knowledge / create_idea.
  4. startAgentSessionOnNode (new) -> VM agent boots the agent subprocess with the discovery prompt + MCP server URL.
  5. transitionAcpSession pending -> assigned -> running -> the trial bridge emits `trial.ready` with workspaceUrl.
- Trial's synthetic taskId = state.trialId (trials have no tasks row), so MCP rate-limiting keys per-trial. Drop get_instructions from the initial prompt since it'd 404 against the tasks table.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* test(trial): capability coverage for orchestrator VM agent boot

Adds trial-orchestrator-agent-boot.test.ts asserting the 3-step VM boot pattern + ACP pending→assigned→running transitions + idempotency across crash/retry. Updates trial-orchestrator-steps.test.ts for the new nodeId requirement and adds mocks for node-agent/mcp-token/project-data services. Also adds fetchDefaultBranch coverage (master, 404 fallback, network error fallback, idempotent re-entry).

Post-mortem at docs/notes/2026-04-19-trial-orchestrator-agent-boot-postmortem.md. Process fix: adds port-of-pattern coverage bullet to .claude/rules/10-e2e-verification.md so a port of TaskRunner's agent-session pattern into a new consumer must assert every step fired.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* task: archive trial orchestrator agent-boot task

* docs(trial): add CLAUDE.md Recent Changes + TRIAL_GITHUB_TIMEOUT_MS row

* fix(trial): persist defaultBranch before D1 insert + redact mcpToken in getStatus

Cloudflare-specialist review (HIGH): two fixes

1. handleProjectCreation now persists state.defaultBranch before the D1 projects insert. Previously a crash between the D1 write and the DO state persist could cause a retry to re-probe GitHub and resolve a different branch than what had already landed in the projects row.
2. getStatus() now redacts the live mcpToken bearer credential before returning state to any debug/admin caller. The stale comment claiming the DO doesn't store secrets is corrected.

* fix(trial): revoke MCP token on failure + redaction test + review doc sync

Addresses Phase 5 reviewer findings from the trial-agent-boot PR:

security-auditor HIGH:
- Revoke state.mcpToken in failTrial() before emitting trial.error. Mirrors TaskRunner's state-machine.ts:265-275 pattern; closes the 4-hour TTL window where a leaked/botched-trial bearer token stays usable.
- Document the intentional non-revocation in handleRunning() — orchestrator terminates but the discovery agent still needs the token for MCP calls during the 20-min workspace TTL.
- Document the sentinel userId scoping limitation on resolveAnonymousUserId so future trial code remembers that per-user queries do NOT isolate trials from each other; projectId/trialId scoping is mandatory.

task-completion-validator MEDIUM:
- New test coverage for getStatus() mcpToken redaction (both populated and uninitialized state branches).
- New test coverage for failTrial revocation (happy path + KV-error tolerance).

doc-sync-validator HIGH:
- Add Trial Onboarding section to .claude/skills/env-reference/SKILL.md cross-referencing docs/guides/trial-configuration.md for the full table.

* fix(trial): allow multiple trials per repo (partial unique index)

The `(user_id, installation_id, repository)` unique index on `projects` prevented more than one anonymous trial per public repo — every trial after the first on the same repo hit a UNIQUE constraint failure during the projects insert in TrialOrchestrator.handleProjectCreation. The DO retried 6 times on alarm backoff then emitted a terminal `trial.error` ("step_failed"), so the user saw the 10% progress event repeat and then fail.

Why it slipped through earlier reviews: the capability tests mock D1, so no test exercised the real constraint. Staging verification only tested a single trial per repo. This surfaced the moment a second trial on `octocat/Hello-World` landed during Phase 6 verification.

Fix:
- Migration 0046 drops + recreates the index as a partial unique index that excludes the trial-sentinel user `system_anonymous_trials`. Real users still can't register duplicate project rows; sentinel-owned trial rows are isolated by `projectId` (per helpers.ts sentinel scope note).
- Drizzle schema updated with matching `.where()` clause so codegen and migration stay in sync.

Verified locally: trial-orchestrator tests pass (28/28); typecheck clean; lint clean (no new warnings).

Co-Authored-By: Claude Opus 4.6 <[email protected]>

---------

Co-authored-by: Raphaël Titsworth-Morin <[email protected]>
Co-authored-by: Claude Opus 4.6 <[email protected]>
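The crash-safe idempotency flags added for the agent boot flow can be sketched in a simplified form. This is an illustration only: the real handlers are asynchronous and talk to D1, KV, and the VM agent, whereas the `BootDeps` callbacks, `BootState` subset, and `runBoot` function below are hypothetical and synchronous to keep the example short.

```typescript
// Subset of the state fields named in the commit message.
interface BootState {
  agentSessionCreatedOnVm?: boolean;
  mcpToken?: string;
  agentStartedOnVm?: boolean;
}

// Hypothetical side-effect seams; the real calls are async service functions.
interface BootDeps {
  createSession(): void;
  issueToken(): string;
  startAgent(token: string): void;
  persist(state: BootState): void;
}

// Each step runs at most once: its flag is persisted right after it succeeds,
// so a retry after a crash skips everything that already landed.
function runBoot(state: BootState, deps: BootDeps): void {
  if (!state.agentSessionCreatedOnVm) {
    deps.createSession();
    state.agentSessionCreatedOnVm = true;
    deps.persist(state);
  }

  let token = state.mcpToken;
  if (!token) {
    token = deps.issueToken();
    state.mcpToken = token;
    deps.persist(state);
  }

  if (!state.agentStartedOnVm) {
    deps.startAgent(token);
    state.agentStartedOnVm = true;
    deps.persist(state);
  }
}
```

If `startAgent` throws, the next alarm-driven retry re-enters `runBoot`, finds the first two flags set, and repeats only the failed step.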

trial.ready is a provisioning milestone (workspace is up), not a signal that discovery is complete. The discovery agent continues producing trial.knowledge and trial.idea events after the workspace is provisioned.

Changes:
- Event bus: only auto-close on trial.error, not trial.ready
- Frontend: keep EventSource open after trial.ready with a 3-minute grace timer (TRIAL_DISCOVERY_STREAM_TIMEOUT_MS) for late-arriving discovery events
- Header shows "Discovering <repo>…" while stream is still open after trial.ready, then "Ready: <repo>" after stream closes

Co-Authored-By: Claude Opus 4.6 <[email protected]>
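A minimal sketch of the close rule described above, assuming an in-memory bus; the real TrialEventBus is a Durable Object with persisted state, and the `EventBusSketch` class is hypothetical. Event type names are the ones used in this PR.

```typescript
type TrialEventType =
  | "trial.started"
  | "trial.progress"
  | "trial.knowledge"
  | "trial.idea"
  | "trial.ready"
  | "trial.error";

// In-memory stand-in for the bus: only trial.error is terminal.
class EventBusSketch {
  private closed = false;
  private readonly events: TrialEventType[] = [];

  // Returns false when the bus has already been closed by a terminal event.
  append(type: TrialEventType): boolean {
    if (this.closed) return false;
    this.events.push(type);
    if (type === "trial.error") this.closed = true;
    return true;
  }

  get size(): number {
    return this.events.length;
  }
}
```

Because `trial.ready` no longer closes the bus, knowledge and idea events emitted after provisioning are still accepted and streamed.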

…icons

- Add TrialAgentActivityEvent type and bridgeAgentActivity() to pipe agent messages/tool calls into the trial SSE stream
- Hook message persistence path to emit trial.agent_activity events
- Render agent activity cards in the feed (grouped, showing tool names)
- Replace misleading "Workspace ready — chat below" with informative message about agent analyzing repository
- Replace emoji icons (📎, ★) with lucide-react icons (BookOpen, Lightbulb, Brain, Wrench, Terminal) matching platform design
- Add auto-scroll to bottom on new events (scrollIntoView smooth)

Co-Authored-By: Claude Opus 4.6 <[email protected]>

- Deduplicate consecutive progress events with the same stage in the feed — the orchestrator re-emits keepalive progress while waiting for the agent, creating visual spam (3x "Starting the agent" at 70%)
- Clean up agent activity text: strip XML tags, collapse JSON blobs, add line-clamp-2 for overflow
- Change "AGENT WORKING..." from uppercase to normal case
- Add cleanActivityText() helper for readable tool output summaries

Co-Authored-By: Claude Opus 4.6 <[email protected]>
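The consecutive-progress dedup can be sketched as a single pass over the feed. The `FeedEvent` shape and the `dedupeConsecutiveProgress` name are assumptions, and this version keeps the first event of each run; the real feed builder may keep the latest instead.

```typescript
// Assumed feed item shape: progress events carry a stage string.
interface FeedEvent {
  type: string;
  stage?: string;
}

// Drop a progress event when the previous kept item is a progress event
// with the same stage. Any other event type breaks the run.
function dedupeConsecutiveProgress(events: FeedEvent[]): FeedEvent[] {
  const kept: FeedEvent[] = [];
  for (const event of events) {
    const previous = kept[kept.length - 1];
    const isRepeat =
      event.type === "trial.progress" &&
      previous !== undefined &&
      previous.type === "trial.progress" &&
      previous.stage === event.stage;
    if (!isRepeat) kept.push(event);
  }
  return kept;
}
```

Only adjacent repeats collapse, so a stage that legitimately recurs after other activity still shows up again.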

…orker secrets The Anthropic API key for the AI proxy should come from admin-managed platform credentials (stored encrypted in D1 via /admin/platform-credentials), not from a Worker secret. This aligns with the existing credential architecture where admins configure shared keys through the UI. The proxy now resolves the key by looking up a 'claude-code' platform credential at request time. No new Worker secrets or deployment steps needed. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Add admin UI tab "AI Proxy" where admins select the default model for the platform inference proxy. Config is stored in KV so the default can be changed without redeploying.

Model resolution priority: KV admin override > env var > shared constant. Out-of-box default is a free Workers AI model (Llama 4 Scout 17B). Anthropic models (Claude Haiku) are only selectable when an admin has added a Claude Code platform credential on the Credentials tab.

- New API routes: GET/PUT/DELETE /api/admin/ai-proxy/config
- AI proxy route and runtime agent-key endpoint read default from KV
- Admin UI model picker with availability indicators
- Revert DEFAULT_AI_PROXY_MODEL to free Workers AI model
- File backlog idea for PLATFORM_TRIAL_ENABLED env var

Co-Authored-By: Claude Opus 4.6 <[email protected]>
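The resolution priority stated above (KV admin override, then env var, then shared constant) can be sketched as a pure function. The function name, parameter names, and the placeholder model ID are assumptions; the real code reads the override from KV asynchronously before reaching a step like this.

```typescript
// Placeholder value standing in for the shared DEFAULT_AI_PROXY_MODEL constant.
const FALLBACK_MODEL = "example/default-model";

// Pick the first non-empty candidate in priority order:
// KV admin override > env var > shared constant.
function resolveDefaultModel(
  kvOverride: string | null,
  envModel: string | undefined,
  fallback: string = FALLBACK_MODEL,
): string {
  const fromKv = kvOverride?.trim();
  if (fromKv) return fromKv;
  const fromEnv = envModel?.trim();
  if (fromEnv) return fromEnv;
  return fallback;
}
```

Treating blank strings as unset keeps an accidentally empty KV value or env var from overriding a usable lower-priority default.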

…g-mvp

…into sam/trial-onboarding-mvp

Merge sam/trial-discovery-stream-fix into trial MVP branch, bringing:
- Auto-scroll to bottom on new events
- Agent activity cards grouped in feed with Lucide icons
- Progress card deduplication and text cleanup
- Stream stays open after trial.ready (agent continues producing events)
- Default model switched to Qwen 3 30B

Update trial-event-bus test to match new behavior: trial.ready no longer closes the bus since the discovery agent continues producing knowledge and idea events after workspace provisioning.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Add AI usage section to the admin analytics dashboard, powered by the AI Gateway Logs API. Shows token usage, estimated cost, trial vs. authenticated breakdown, per-model metrics, and daily trends.

Backend:
- New admin endpoint GET /api/admin/analytics/ai-usage?period=7d queries AI Gateway logs with pagination and aggregates by model/day
- AI proxy now tags requests with projectId and trialId in cf-aig-metadata for trial usage attribution
- Configurable via AI_USAGE_PAGE_SIZE, AI_USAGE_MAX_PAGES env vars

Frontend:
- AIUsageChart component with KPI cards, stacked bar chart (tokens by model), daily usage area chart, and model breakdown table
- Integrated into admin analytics dashboard above DAU chart
- Graceful fallback if AI Gateway is not configured (catch + null)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
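The aggregation by model and day, with a trial vs. authenticated split, might look roughly like this. The `UsageLog` and `UsageSummary` shapes are hypothetical simplifications, not the AI Gateway log schema; the sketch only assumes that each row has already been reduced to a model, a day, token counts, and an optional trialId taken from the request metadata.

```typescript
// Hypothetical, already-normalized log row.
interface UsageLog {
  model: string;
  day: string; // e.g. "2026-04-20"
  tokensIn: number;
  tokensOut: number;
  trialId?: string; // present when the request was tagged as a trial
}

interface UsageSummary {
  totalTokens: number;
  trialTokens: number;
  authenticatedTokens: number;
  byModel: Record<string, number>;
  byDay: Record<string, number>;
}

// Fold the rows into totals; an empty input yields an all-zero summary.
function aggregateUsage(logs: UsageLog[]): UsageSummary {
  const summary: UsageSummary = {
    totalTokens: 0,
    trialTokens: 0,
    authenticatedTokens: 0,
    byModel: {},
    byDay: {},
  };
  for (const log of logs) {
    const tokens = log.tokensIn + log.tokensOut;
    summary.totalTokens += tokens;
    if (log.trialId) summary.trialTokens += tokens;
    else summary.authenticatedTokens += tokens;
    summary.byModel[log.model] = (summary.byModel[log.model] ?? 0) + tokens;
    summary.byDay[log.day] = (summary.byDay[log.day] ?? 0) + tokens;
  }
  return summary;
}
```

Returning zero counts for an empty input gives the dashboard a uniform shape to render when no logs are available.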

…stics The CF AI Gateway Logs API uses `order_by_direction` (not `direction`) for sort order, and error responses now include the upstream body for easier debugging. Co-Authored-By: Claude Opus 4.6 <[email protected]>

The Cloudflare AI Gateway Logs API enforces a maximum per_page of 50. Co-Authored-By: Claude Opus 4.6 <[email protected]>
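The two Logs API constraints noted in these fixes (`order_by_direction` as the sort parameter and a `per_page` ceiling of 50) can be captured in a small query builder. The `page` parameter, the `"desc"` value, and the `buildLogsQuery` name are assumptions for illustration; only `order_by_direction` and the `per_page` limit come from the commit messages.

```typescript
// Ceiling reported for the AI Gateway Logs API in this PR.
const MAX_PER_PAGE = 50;

// Build the query string for one page of logs, clamping the page size
// into the 1..50 range regardless of what the caller or env var asked for.
function buildLogsQuery(page: number, requestedPageSize: number): string {
  const perPage = Math.min(Math.max(1, Math.floor(requestedPageSize)), MAX_PER_PAGE);
  const params = new URLSearchParams({
    page: String(page),
    per_page: String(perPage),
    order_by_direction: "desc",
  });
  return params.toString();
}
```

Clamping in one place means a misconfigured AI_USAGE_PAGE_SIZE degrades to more requests rather than a rejected request.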

* fix(trial): address review findings from trial onboarding subagents

Security and correctness fixes from 7 specialist reviewers:

CRITICAL:
- Fix cookie domain mismatch: claim.ts clearClaimCookie and oauth-hook.ts buildClaimCookie now pass domain from BASE_DOMAIN (matching create.ts)

HIGH:
- TrialEventBus DO: persist `closed` flag to storage so it survives eviction
- AI proxy: sanitize error bodies — log raw errors server-side, return generic messages to clients (prevents internal URL/config leakage)
- Admin AI usage: sanitize CF API error responses the same way
- SSE events endpoint: add per-IP rate limiting (30 req/5min via KV)
- Deploy pipeline: forward ANTHROPIC_API_KEY_TRIAL as optional Worker secret
- sync-wrangler-config: inject ENVIRONMENT var into generated env sections
- Remove hardcoded DEFAULT_GATEWAY_ID; require AI_GATEWAY_ID from env

MEDIUM:
- Cron collision: move trial counter rollover from 03:00 to 05:00 UTC (avoids collision with daily analytics forward job at 03:00)
- Replace magic number in create.ts with DEFAULT_TRIAL_CLAIM_TTL_MS constant
- Add trial secrets to secrets-taxonomy.md and trial-configuration.md
- Add comprehensive trial + AI proxy env vars to .env.example
- Fix test mocks: add ctx.storage to TrialEventBus tests, add KV to SSE tests

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix(trial): address CTO review — 6 quality improvements

1. Reject unknown IP: SSE rate limit now returns 400 when no client IP header is present, instead of sharing a single "unknown" bucket across all headerless clients. CF-Connecting-IP is always present on Workers.
2. Document KV rate limit trade-off: added inline comment explaining why KV's non-atomic read-modify-write is acceptable here (storm prevention, not exact enforcement) vs DO-based counters for credential rotation.
3. Clean up formatSse: removed unused _eventName parameter that gave the false impression the event name was being used. Updated all call sites and tests.
4. Cookie domain consistency test: new regression test suite asserting that buildClaimCookie, clearClaimCookie, and buildFingerprintCookie produce matching Domain= attributes. Explicitly demonstrates the bug where clearing without a domain fails to delete a domain-scoped cookie.
5. AI_GATEWAY_ID self-hoster safe: returns an empty summary (zero counts) when AI_GATEWAY_ID is not configured, instead of throwing. Self-hosters who don't use AI Gateway get a clean "no data" admin dashboard.
6. Fix .env.example cron default: TRIAL_CRON_ROLLOVER_CRON now shows "0 5 1 * *" matching the actual default after the collision fix.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

---------

Co-authored-by: Raphaël Titsworth-Morin <[email protected]>
Co-authored-by: Claude Opus 4.6 <[email protected]>
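The cookie domain mismatch fixed above comes down to one rule: a cookie set with a `Domain` attribute is only replaced or deleted by a `Set-Cookie` with the same name, domain, and path. A minimal sketch follows, with a hypothetical cookie name and attribute set; the real `buildClaimCookie` and `clearClaimCookie` signatures may differ.

```typescript
// Hypothetical cookie name used only for this illustration.
const CLAIM_COOKIE = "trial_claim_example";

// Set the claim cookie scoped to the given domain.
function buildClaimCookie(value: string, domain: string, maxAgeSec: number): string {
  return `${CLAIM_COOKIE}=${encodeURIComponent(value)}; Domain=${domain}; Path=/; Max-Age=${maxAgeSec}; HttpOnly; Secure; SameSite=Lax`;
}

// Clear it with the SAME Domain and Path; omitting Domain would target a
// host-only cookie and leave the domain-scoped one in place.
function clearClaimCookie(domain: string): string {
  return `${CLAIM_COOKIE}=; Domain=${domain}; Path=/; Max-Age=0; HttpOnly; Secure; SameSite=Lax`;
}

// Test helper: extract the Domain attribute from a Set-Cookie value.
function domainOf(setCookie: string): string | undefined {
  return /;\s*Domain=([^;]+)/i.exec(setCookie)?.[1];
}
```

A regression test in this style compares `domainOf` across every builder, which is the property the consistency test suite in item 4 asserts.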

Resolves package.json version conflict (take main's newer deps) and fixes simple-import-sort/exports error in packages/shared/src/constants/index.ts. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Co-Authored-By: Claude Opus 4.6 <[email protected]>

- Autofix export sort in apps/web/src/lib/api/index.ts
- Move useMemo before early return in AIUsageChart (rules-of-hooks)
- Prefix unused anthropicModels with _ in staging test
- Add FILE SIZE EXCEPTION comments for TryDiscovery.tsx and steps.ts

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Co-Authored-By: Claude Opus 4.6 <[email protected]>

sonarqubecloud · 2026-04-21T04:53:27Z

Quality Gate failed

Failed conditions
6 Security Hotspots
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud


raphaeltm and others added 9 commits April 18, 2026 20:17

merge: wave-1 track-a backend lifecycle into trial onboarding integration

4b87b6b

merge: wave-1 track-b backend claim + SSE streaming into trial onboarding integration

ac6c7b9

merge: wave-1 track-d chat gate + login sheet into trial onboarding integration

a2342ad

simple-agent-manager bot added the needs-human-review label (Agent could not complete all review gates; human must approve before merge) Apr 18, 2026

simple-agent-manager bot temporarily deployed to staging April 18, 2026 22:00 Inactive

simple-agent-manager bot had a problem deploying to staging April 18, 2026 22:00 Failure

fix(deploy): remove invalid --remote flag from wrangler kv key put

b15ca27

`wrangler kv key put` writes to remote by default; --remote is not a valid flag for that subcommand and caused the staging deploy's trial kill-switch step to fail. Co-Authored-By: Claude Opus 4.6 <[email protected]>

simple-agent-manager bot temporarily deployed to staging April 18, 2026 22:08 Inactive

simple-agent-manager bot temporarily deployed to staging April 18, 2026 22:13 Inactive

simple-agent-manager bot temporarily deployed to staging April 18, 2026 22:22 Inactive

simple-agent-manager bot temporarily deployed to staging April 18, 2026 22:26 Inactive

simple-agent-manager bot and others added 2 commits April 19, 2026 10:41

simple-agent-manager bot mentioned this pull request Apr 19, 2026

feat(web): support deeply nested chat sessions with context anchors #759

Merged

29 tasks

simple-agent-manager bot and others added 5 commits April 19, 2026 15:58

raphaeltm and others added 3 commits April 20, 2026 20:52

Merge branch 'sam/ai-proxy-anthropic-models' into sam/trial-onboarding-mvp

30f2373

simple-agent-manager bot temporarily deployed to staging April 20, 2026 21:35 Inactive

simple-agent-manager bot temporarily deployed to staging April 20, 2026 21:40 Inactive

raphaeltm and others added 2 commits April 21, 2026 01:17

Merge remote-tracking branch 'origin/sam/trial-discovery-stream-fix' into sam/trial-onboarding-mvp

cb40940

simple-agent-manager bot temporarily deployed to staging April 21, 2026 01:21 Inactive

simple-agent-manager bot temporarily deployed to staging April 21, 2026 01:25 Inactive

simple-agent-manager bot temporarily deployed to staging April 21, 2026 02:00 Inactive

simple-agent-manager bot temporarily deployed to staging April 21, 2026 02:05 Inactive

simple-agent-manager bot temporarily deployed to staging April 21, 2026 02:10 Inactive

simple-agent-manager bot temporarily deployed to staging April 21, 2026 02:14 Inactive

fix(api): reduce AI Gateway page size to CF max of 50

5c2e5cd

The Cloudflare AI Gateway Logs API enforces a maximum per_page of 50. Co-Authored-By: Claude Opus 4.6 <[email protected]>

simple-agent-manager bot temporarily deployed to staging April 21, 2026 02:17 Inactive

simple-agent-manager bot temporarily deployed to staging April 21, 2026 02:22 Inactive

simple-agent-manager bot and others added 5 commits April 21, 2026 04:25

Merge origin/main into sam/trial-onboarding-mvp + fix export sort lint

e3345fc

Resolves package.json version conflict (take main's newer deps) and fixes simple-import-sort/exports error in packages/shared/src/constants/index.ts. Co-Authored-By: Claude Opus 4.6 <[email protected]>

fix(lint): sort imports in workspaces/runtime.ts

f3e2716

Co-Authored-By: Claude Opus 4.6 <[email protected]>

fix(lint): remove unused anthropicModels variable

ea9b4b1

Co-Authored-By: Claude Opus 4.6 <[email protected]>

simple-agent-manager bot merged commit 1f92ecf into main Apr 21, 2026
16 of 19 checks passed


feat(trial): zero-friction URL-to-workspace onboarding MVP #758
simple-agent-manager[bot] merged 35 commits into main from sam/trial-onboarding-mvp


Summary

cc @raphaeltm — Configuration Checklist Before Merge

Staging (sammy.party) — zero manual steps required

Production (simple-agent-manager.org) — one manual step (the key)

Cookies

Kill Switches

What Shipped

Wave 0 — Foundation (e253c08e)

Wave 1 Track A — Backend Lifecycle (4ca29ea6)

Wave 1 Track B — Backend Claim + SSE (6ba2e101)

Wave 1 Track C — Frontend Discovery (e8088705)

Wave 1 Track D — Frontend Chat Gate (1114c8fc)

Wave 2 — Integration, Automation, and Live Fix

Non-negotiable Constraints Verified

Local Quality Gates

Staging Deployments

Staging Verification (Playwright + curl, live app)

Regression spot-check

What was NOT verified end-to-end

Review Status

Do NOT Merge Yet


simple-agent-manager bot commented Apr 18, 2026

Staging verification update — trials automation + integration fix
