fix(anthropic-sse): finalize stream on mid-stream socket close (never-hang)#143
Open
jsboige wants to merge 1 commit into
Open
fix(anthropic-sse): finalize stream on mid-stream socket close (never-hang)#143jsboige wants to merge 1 commit into
jsboige wants to merge 1 commit into
Conversation
…-hang)
Z.AI and other Anthropic-compatible providers (MiniMax, Kimi) intermittently
close the TCP socket mid-stream under load / burst limits. The reader.read()
in the passthrough loop rejected, escaped to the outer `catch (e)` block,
which did a bare `controller.close()` with NO terminal message_stop event.
The Claude Code client then reported "API Error: The socket connection was
closed unexpectedly" and froze the turn — a hung agent, the worst proxy
failure mode.
Wrap the read loop in an inner try/catch that has a new `finalizeWithError`
helper in scope (declared in the same outer try). On a reader rejection it
emits a valid terminal Claude message (message_delta with end_turn +
message_stop), preserving whatever streamed before the cutoff so the client
ends the turn cleanly instead of freezing.
The helper handles both cases:
- no content reached the client yet → synthetic message_start + content
block + the notice, so message_stop is well-formed
- content already streaming → message_delta + message_stop (the client
tolerates a missing content_block_stop on early termination far better
than a missing message_stop, which hangs the turn)
Regression test: upstream body that emits message_start + a text delta then
errors. Asserts the partial content survives AND a terminal message_stop is
emitted. Fails without the fix (no message_stop reaches the client), passes
with it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Anthropic-compatible providers that speak native Anthropic SSE (Z.AI, MiniMax, Kimi) intermittently close the TCP socket mid-stream under load / burst limits. When that happens,
reader.read()in the passthrough loop rejects, the rejection escapes to the outercatch (e)block, which does a barecontroller.close()with no terminalmessage_stopevent.The Claude Code client then reports:
…and freezes the agent turn — a hung agent, the worst proxy failure mode.
Root cause
In
anthropic-sse.ts, the read loop's outer catch can only raw-close the controller. There was no path to emit the terminalmessage_stopthe client requires to end a turn, because a finalization helper was never in scope.Fix
Two changes, both additive:
finalizeWithError(errMsg, path)helper — declared inside the outertry, above the read loop. Emits a valid terminal Claude message so the client ends the turn cleanly:message_start+ a content block carrying a short notice, so the closing events are well-formedmessage_delta(end_turn)+message_stop(the client tolerates a missingcontent_block_stopon early termination far better than a missingmessage_stop, which hangs the turn)Inner
try/catcharound the read loop — a mid-stream reader rejection is now caught wherefinalizeWithErroris in scope and routed through it, instead of escaping to the bare outer close. The outer catch is preserved as a last-resort safety net.Whatever streamed before the cutoff is preserved — no content loss.
Why this matters
A frozen agent is far worse than a clean error. With providers that drop connections under load (observed in production with Z.AI serving GLM models), this turns a transient network blip into a permanently stuck session that the user must manually reset. The fix degrades it to a clean end-of-turn.
Test
Regression test added to
format-translation.test.ts:Replays an upstream body that emits
message_start+ a text delta, then errors (simulating socket close). Asserts:message_stopis presentstop_reason: end_turnFails without the fix (no
message_stopreaches the client), passes with it.Scope
packages/cli/src/handlers/shared/stream-parsers/anthropic-sse.ts🤖 Generated with Claude Code