fix(runtime): launch CLI agents on Windows #29
Agent detection used `which`, which only exists on POSIX shells; the studio server runs under `node` directly so every CLI agent (claude, codex, ...) was reported unavailable on Windows. Spawning also passed the bare bin name, so `claude` (really `claude.cmd`) failed to launch with ENOENT (-4058).

- detect.ts: use `where` on win32, `which` elsewhere; take first match
- spawn.ts: resolve the real bin path via `where` and run through a shell on win32 so `.cmd` shims launch; keep the cheap PATH path on POSIX. CLI spawn is now async (path resolution is async) but still returns a synchronous SpawnHandle via an AbortController.

Verified end-to-end on Windows: claude agent test returns exit 0 with real output.
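As a rough illustration of the detection change described in this commit, the lookup could be sketched as below. This is not the actual `detect.ts`; the helper names `lookupTool`, `firstMatch`, and `resolveBin` are hypothetical, and the timeout is left to the caller.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: choose the PATH lookup tool for the platform.
function lookupTool(platform: string): "where" | "which" {
  return platform === "win32" ? "where" : "which";
}

// `where` may print several matches, one per line; keep the first non-empty one.
function firstMatch(stdout: string): string | null {
  for (const line of stdout.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed) return trimmed;
  }
  return null;
}

// Hypothetical helper: resolve a CLI binary to a path, or null if it is not found.
async function resolveBin(
  bin: string,
  timeoutMs: number,
  platform: string = process.platform,
): Promise<string | null> {
  try {
    const { stdout } = await execFileAsync(lookupTool(platform), [bin], {
      timeout: timeoutMs,
      encoding: "utf8",
    });
    return firstMatch(stdout);
  } catch {
    // Non-zero exit (not found), timeout, or the lookup tool itself is missing.
    return null;
  }
}
```

Keeping the platform as a parameter makes the choice between `where` and `which` testable on any OS.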
Heads-up: PR #3 and PR #10 are also open against the same Windows CLI runtime path. You and the other authors may want to compare approaches so effort doesn't get duplicated while maintainers decide what lands.
…I-key agent

The studio composer returned empty replies mid-flow on Windows (e.g. at the confirm/generate phase) with "No ANTHROPIC_API_KEY in env", even though a logged-in `claude` CLI was present. Two causes:

- detectAll() probes ~13 agents with Promise.all, so a dozen `where` lookups spawn at once. Under that contention the 2s timeout was too tight and intermittently marked installed CLI agents (claude/codex) unavailable. Bump the which/where timeout to 8s.
- When no agent is pinned, the resolver fell back to `anthropic-api` (HTTP, needs a key). Combined with the flaky probe above and a non-persisted project.agentId, a later turn could resolve to anthropic-api and fail with "No ANTHROPIC_API_KEY". Now prefer a ready-to-run CLI agent, only use anthropic-api when it's actually configured, and persist the resolved agent on the project so every turn in a session uses the same one.

Verified end-to-end on Windows: detection returns claude=available stably (3/3), and a full opener→confirm→generate run produces HTML (preview_ready) via the claude CLI with no API key.
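The resolution order described in this commit could look roughly like the sketch below. It is an assumption-laden illustration, not the project's resolver: the `AgentStatus` and `Project` shapes and the `resolveAgent` function are hypothetical, and only the `anthropic-api` id, `project.agentId`, and `ANTHROPIC_API_KEY` come from the commit message.

```typescript
// Hypothetical shapes for illustration only.
interface AgentStatus {
  id: string;
  kind: "cli" | "api";
  available: boolean;
}

interface Project {
  agentId?: string;
}

// Pick an agent for this turn and persist the choice on the project.
function resolveAgent(
  project: Project,
  agents: AgentStatus[],
  env: Record<string, string | undefined>,
): string | null {
  // 1. A pinned or previously persisted agent wins, so every turn in a
  //    session uses the same one even if a later probe is flaky.
  if (project.agentId) return project.agentId;

  // 2. Prefer a CLI agent that detection reported as ready to run.
  const cli = agents.find((a) => a.kind === "cli" && a.available);

  // 3. Fall back to the HTTP agent only when a key is actually configured.
  const picked = cli?.id ?? (env.ANTHROPIC_API_KEY ? "anthropic-api" : null);

  if (picked) project.agentId = picked; // persist for later turns
  return picked;
}
```

Returning `null` when nothing is usable lets the caller report a clear "no agent available" error instead of failing later with a missing-key message.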
Added a second commit. Verified end-to-end on Windows: detection returns claude=available stably (3/3), and a full opener→confirm→generate run produces HTML (preview_ready) via the claude CLI with no API key.
One update since my earlier note: PR #31 has gone through multiple review rounds and was APPROVED. That overlaps with the core ENOENT fix in this PR. The pieces that look unique here and aren't in #31:
Worth watching what exactly lands from #31 first. If the Windows ENOENT fix ships via #31, narrowing this PR to those three unique pieces would make the path to merge faster and cleaner. Happy to keep this open and revisit once #31 resolves; just wanted to make sure you had the latest context.
Problem
On Windows the studio composer couldn't send any message — the agent picker showed every CLI agent (claude, codex, …) as unavailable, and selecting "Anthropic API (direct)" needed an API key. Forcing a CLI agent through failed with exit code -4058 (ENOENT) / empty reply.
Two Windows-only root causes:
- Agent detection used `which`, which only exists on POSIX shells. The studio server runs under `node` directly, so `execFile('which', …)` always failed → every CLI agent reported `available: false`, even when installed.
- Spawning passed the bare bin name. `claude` is really `claude.cmd`; Node's `spawn()` can't launch a batch file without a shell → ENOENT (-4058).

Fix
- `detect.ts`: use `where` on win32, `which` elsewhere; take the first match line.
- `spawn.ts`: resolve the real bin path via `where` and run through a shell on win32 so `.cmd` shims launch; keep the cheap PATH path on POSIX. The CLI spawn branch is now async (path resolution is async) but still returns a synchronous `SpawnHandle` via an `AbortController`.

No behaviour change on macOS/Linux (same `which` + bare-bin spawn path).

Verification
Verified end-to-end on Windows 11 against a real `claude` login (no API key): the claude agent test returns exit 0 with real output. Before the fix the same call returned `exit_code:-4058` (empty) / `exit_code:1`. Agent picker now correctly shows `claude` and `codex` as available.
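For readers who want a concrete picture of the spawn-side change, here is a minimal sketch of how the platform split might be expressed. It is not the actual `spawn.ts`: `SpawnPlan` and `planSpawn` are hypothetical names, and the quoting rule is an assumption about what `cmd.exe` needs when the resolved path contains spaces.

```typescript
// Hypothetical description of how a CLI agent would be launched.
interface SpawnPlan {
  command: string;
  args: string[];
  options: { shell?: boolean; windowsHide?: boolean };
}

function planSpawn(
  platform: string,
  bin: string,
  resolvedPath: string | null, // e.g. the first line printed by `where`
  args: string[],
): SpawnPlan {
  if (platform === "win32") {
    // Use the resolved path when available so `.cmd` shims are found.
    const target = resolvedPath ?? bin;
    // Assumption: with a shell, a path containing spaces must be quoted by the caller.
    const command = /\s/.test(target) ? `"${target}"` : target;
    return { command, args, options: { shell: true, windowsHide: true } };
  }
  // POSIX: keep the cheap path, bare bin name resolved via PATH, no shell.
  return { command: bin, args, options: {} };
}
```

The returned plan could then be handed to Node's `spawn(plan.command, plan.args, plan.options)`. Note that with `shell: true` the arguments reach the shell unescaped, so this sketch assumes they are trusted.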