Summary
After a turn completes (on-success fires), late-arriving session/update notifications carrying agent_message_chunk or agent_thought_chunk are rendered as if they belong to the next turn. The user sees stale output from the prior turn displayed as the response to their new prompt.
How to reproduce
- Use agent-shell with Claude Code (via claude-agent-acp) as the backend.
- Have the agent perform several tool calls in a single turn.
- When the turn completes and the prompt reappears, quickly type and submit a new question.
- The response to the new question may show text from the previous turn's output before eventually showing the real answer.
The symptom is most visible when the "Notices" drawer shows errors like:
No onPostToolUseHook found for tool use ID: toolu_XXXXX
These errors and the stale output tend to appear together.
Root cause
There are two layers to this bug:
1. Server-side: claude-agent-acp sends notifications after the RPC response
In claude-agent-acp (the ACP server wrapping the Claude Agent SDK), the prompt() method in acp-agent.ts returns as soon as the SDK yields a "result" message (line 520: return { stopReason: "end_turn", usage }). It does not await pending PostToolUse hook callbacks.
These hooks are registered during streaming via registerHookCallback() (tools.ts line 731-746) when tool_use content blocks are encountered. The Claude Agent SDK fires them asynchronously after tool execution completes. The registered callback (acp-agent.ts line 1482-1508) calls await client.sessionUpdate(notification) to send a tool_call_update notification to the pipe.
When a hook fires after prompt() has already returned, the SDK's Connection#processMessage (@agentclientprotocol/sdk, acp.js line 762) has already sent the RPC response. The hook's sessionUpdate() write goes through #sendMessage -> #writeQueue, but the response was already enqueued first. The notification arrives on the pipe after the response.
The global toolUseCallbacks map (tools.ts line 720-728) is the mechanism:
- Registered: during streaming, on first tool_use encounter (line 1480-1481)
- Fired: by SDK's PostToolUse hook callback (
tools.ts line 756-775)
- Cleaned up:
delete toolUseCallbacks[toolUseID] after firing (line 767)
- Error case: "No onPostToolUseHook found" logged when the ID was never registered or already consumed (line 769)
2. Client-side: agent-shell trusts notification ordering
In agent-shell.el, agent-shell--send-command resets :last-entry-type to nil when sending a new prompt (line 3939). The agent-shell--on-notification handler for agent_message_chunk (line 1207-1226) uses :last-entry-type to decide whether to create a new fragment or append to an existing one.
When a stale agent_message_chunk arrives after the reset:
:last-entry-type is nil (reset for the new turn)
- The handler treats it as the start of a new agent message
:chunked-group-count is incremented
agent-shell--update-fragment is called with :create-new t
- The stale text appears as the response to the user's new prompt
Evidence from transcript analysis
In a real transcript, the bug manifests as User and Agent timestamps being identical (same second), which is impossible for a real LLM response:
## User (YYYY-MM-DD HH:MM:SS)
why aren't the foo-bar-* root modules showing removed resources?
## Agent (YYYY-MM-DD HH:MM:SS)
All done. Working tree clean, all plans verified sensible. [stale text]
Proposed fix
Server-side (claude-agent-acp)
prompt() should await all pending PostToolUse hook callbacks before returning. Track pending hook promises and await Promise.all(...) before the return statements in the result handler. (agentclientprotocol/claude-agent-acp#353)
Client-side (agent-shell) -- this PR
Regardless of whether the server is fixed, the client should be resilient to late notifications. Add a :turn-completed flag to state that is set to t in the on-success callback and reset to nil when a new prompt is sent. Guard the agent_message_chunk and agent_thought_chunk handlers with (unless (map-elt state :turn-completed) ...) to silently drop stale notifications.
Summary
After a turn completes (
on-successfires), late-arrivingsession/updatenotifications carryingagent_message_chunkoragent_thought_chunkare rendered as if they belong to the next turn. The user sees stale output from the prior turn displayed as the response to their new prompt.How to reproduce
The symptom is most visible when the "Notices" drawer shows errors like:
These errors and the stale output tend to appear together.
Root cause
There are two layers to this bug:
1. Server-side: claude-agent-acp sends notifications after the RPC response
In
claude-agent-acp(the ACP server wrapping the Claude Agent SDK), theprompt()method inacp-agent.tsreturns as soon as the SDK yields a"result"message (line 520:return { stopReason: "end_turn", usage }). It does not await pendingPostToolUsehook callbacks.These hooks are registered during streaming via
registerHookCallback()(tools.tsline 731-746) whentool_usecontent blocks are encountered. The Claude Agent SDK fires them asynchronously after tool execution completes. The registered callback (acp-agent.tsline 1482-1508) callsawait client.sessionUpdate(notification)to send atool_call_updatenotification to the pipe.When a hook fires after
prompt()has already returned, the SDK'sConnection#processMessage(@agentclientprotocol/sdk,acp.jsline 762) has already sent the RPC response. The hook'ssessionUpdate()write goes through#sendMessage->#writeQueue, but the response was already enqueued first. The notification arrives on the pipe after the response.The global
toolUseCallbacksmap (tools.tsline 720-728) is the mechanism:tools.tsline 756-775)delete toolUseCallbacks[toolUseID]after firing (line 767)2. Client-side: agent-shell trusts notification ordering
In
agent-shell.el,agent-shell--send-commandresets:last-entry-typetonilwhen sending a new prompt (line 3939). Theagent-shell--on-notificationhandler foragent_message_chunk(line 1207-1226) uses:last-entry-typeto decide whether to create a new fragment or append to an existing one.When a stale
agent_message_chunkarrives after the reset::last-entry-typeisnil(reset for the new turn):chunked-group-countis incrementedagent-shell--update-fragmentis called with:create-new tEvidence from transcript analysis
In a real transcript, the bug manifests as User and Agent timestamps being identical (same second), which is impossible for a real LLM response:
Proposed fix
Server-side (claude-agent-acp)
prompt()should await all pending PostToolUse hook callbacks before returning. Track pending hook promises andawait Promise.all(...)before thereturnstatements in the result handler. (agentclientprotocol/claude-agent-acp#353)Client-side (agent-shell) -- this PR
Regardless of whether the server is fixed, the client should be resilient to late notifications. Add a
:turn-completedflag to state that is set totin theon-successcallback and reset tonilwhen a new prompt is sent. Guard theagent_message_chunkandagent_thought_chunkhandlers with(unless (map-elt state :turn-completed) ...)to silently drop stale notifications.