fix(daemon): add backpressure control and command serialization to prevent IPC EAGAIN #529
Merged
ctate merged 1 commit into vercel-labs:main on Feb 23, 2026
…event IPC EAGAIN

- Add AGENT_BROWSER_DEFAULT_TIMEOUT env var to override Playwright's default 60s timeout (CDP/recording 10s timeouts unaffected)
- Add backpressure-aware safeWrite() that waits for drain when socket buffer is full, preventing data loss under load
- Serialize command execution per socket via queue to prevent concurrent writes that cause buffer contention

These daemon-side fixes complement vercel-labs#329 (CLI-side EAGAIN retry) by addressing the root causes: Playwright operations that outlast the CLI's IPC timeout, and concurrent socket.write() calls that overflow the kernel buffer.

Tested with heavy React app (1000+ DOM nodes): 10 consecutive snapshot commands complete without os error 35/11.

Refs vercel-labs#322
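The backpressure-aware write described in the commit message can be sketched roughly as follows. This is a hypothetical reconstruction, not the PR's actual code: the `SocketLike` interface and the error messages are assumptions made so the example stands alone, and `net.Socket` from Node is the intended real argument. The sketch relies only on documented Node stream behavior: `write()` returns `false` once the internal buffer exceeds its high-water mark, and `'drain'` fires when it is safe to write again.

```typescript
// Minimal structural type (an assumption for this sketch); Node's net.Socket
// is expected to satisfy it.
interface SocketLike {
  destroyed: boolean;
  write(data: string): boolean;
  once(event: string, listener: (...args: any[]) => void): unknown;
  off(event: string, listener: (...args: any[]) => void): unknown;
}

// Hypothetical version of safeWrite(): resolves once the payload has been
// accepted and the socket is ready for more data.
function safeWrite(socket: SocketLike, payload: string): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (socket.destroyed) {
      reject(new Error("socket is destroyed"));
      return;
    }

    // true: buffer has room, the caller may keep writing immediately.
    if (socket.write(payload)) {
      resolve();
      return;
    }

    // false: the data is still queued, but we wait for 'drain' before
    // letting the caller write again. Listeners are removed on every exit path.
    const cleanup = (): void => {
      socket.off("drain", onDrain);
      socket.off("close", onClose);
      socket.off("error", onError);
    };
    const onDrain = (): void => {
      cleanup();
      resolve();
    };
    const onClose = (): void => {
      cleanup();
      reject(new Error("socket closed before drain"));
    };
    const onError = (err: Error): void => {
      cleanup();
      reject(err);
    };

    socket.once("drain", onDrain);
    socket.once("close", onClose);
    socket.once("error", onError);
  });
}
```

Callers would `await safeWrite(socket, response)` and attach a `.catch()` so that a socket closing mid-write does not surface as an unhandled rejection.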
Collaborator
This is so good - thanks @shohu
Summary
Fixes the daemon-side root causes of EAGAIN (os error 35 on macOS / error 11 on Linux) that crash the Rust CLI during IPC reads.
This complements #329 (CLI-side retry logic) by preventing the conditions that trigger EAGAIN in the first place.
Changes
src/browser.ts: Configurable Playwright timeout via environment variable
- Add getDefaultTimeout() helper that reads the AGENT_BROWSER_DEFAULT_TIMEOUT env var
- Replace setDefaultTimeout(60000) calls (5 locations) with setDefaultTimeout(getDefaultTimeout())
src/daemon.ts: Backpressure-aware writes + command serialization
- Add safeWrite() helper that waits for the drain event when socket.write() returns false, with proper cleanup on close/error
- Serialize command execution per socket via a queue, preventing the concurrent socket.write() calls that cause kernel buffer contention
- Add .catch() guards for unhandled rejection safety
Root Cause Analysis
Two daemon-side issues combine to cause EAGAIN:
1. Playwright timeout > CLI IPC timeout: setDefaultTimeout(60000) means Playwright can block for up to 60s, but the Rust CLI times out earlier. The daemon never responds, and the CLI's read_line() hits EAGAIN.
2. Uncontrolled socket.write() concurrency: Multiple async command handlers can call socket.write() in parallel. When payloads are large (e.g., snapshot responses), the kernel buffer fills up and subsequent writes/reads fail with EAGAIN.
Testing
10 consecutive snapshot commands complete without os error 35
Usage
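The timeout is controlled by setting AGENT_BROWSER_DEFAULT_TIMEOUT in the daemon's environment. A minimal sketch of how a helper like getDefaultTimeout() might read it is shown below. The parsing rules (milliseconds, positive finite values only, fallback to 60000) and the explicit `env` parameter are assumptions for illustration; the PR's actual helper may validate differently.

```typescript
// 60000 ms is the value the PR says was previously hardcoded.
const FALLBACK_TIMEOUT_MS = 60000;

type Env = { [key: string]: string | undefined };

// Hypothetical version of getDefaultTimeout(). The environment is passed in
// explicitly so the function is easy to test; in the daemon it would
// presumably be called with process.env.
function getDefaultTimeout(env: Env): number {
  const raw = env["AGENT_BROWSER_DEFAULT_TIMEOUT"];
  if (raw === undefined || raw.trim() === "") {
    return FALLBACK_TIMEOUT_MS;
  }
  const parsed = Number(raw);
  // Assumption: reject NaN, Infinity, zero, and negative values rather than
  // passing them through to Playwright.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : FALLBACK_TIMEOUT_MS;
}

// Example: a 10s timeout keeps Playwright from blocking longer than a
// shorter CLI-side IPC timeout would tolerate.
const timeoutMs = getDefaultTimeout({ AGENT_BROWSER_DEFAULT_TIMEOUT: "10000" });
console.log(timeoutMs);
```

The returned value is what would be handed to Playwright's setDefaultTimeout(), which takes a timeout in milliseconds.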
Refs #322
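For reference, the per-socket command serialization listed under Changes can be sketched as a promise chain per connection. The names `CommandQueue` and `queueFor` are hypothetical and the PR's daemon may structure its queue differently; the point is only that each command starts after the previous one on the same socket has finished, so their writes cannot interleave.

```typescript
// Hypothetical serializer: every task is appended to the tail of a promise
// chain, so tasks for one socket run strictly one after another.
class CommandQueue {
  private tail: Promise<void> = Promise.resolve();

  enqueue(task: () => Promise<void>): Promise<void> {
    const run = this.tail.then(task);
    // Swallow the failure on the chain itself so one failed command does not
    // block later ones; the caller still sees the rejection through `run`.
    this.tail = run.catch(() => undefined);
    return run;
  }
}

// One queue per socket. A WeakMap lets the queue be garbage-collected
// together with the socket object once the connection is gone.
const queues = new WeakMap<object, CommandQueue>();

function queueFor(socket: object): CommandQueue {
  let queue = queues.get(socket);
  if (!queue) {
    queue = new CommandQueue();
    queues.set(socket, queue);
  }
  return queue;
}
```

In a daemon, each incoming command line would be wrapped in `queueFor(socket).enqueue(async () => { ... })`, with the handler and its response write inside the task and a `.catch()` on the returned promise.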