Rewrite notifications handling on the client using per-client channel buffer #75

Merged: nerzhulart merged 11 commits into master from n500/#74 (Mar 4, 2026)

Conversation

@nerzhulart
Contributor

Reworked notification handling to avoid blocking the notification handler while a session is being initialized.
See issue #74 and https://youtrack.jetbrains.com/issue/AIAE-76/Session-load-does-not-send-messages-to-the-actual-client#focus=Comments-27-13428158.0-0

Instead of suspending until the response for a newly created session is received (which leads to a deadlock), we store session-related notifications in a per-session channel queue (even if the session ID has not arrived yet) and return from the setNotificationHandler(AcpMethod.ClientMethods.SessionUpdate) {} lambda immediately. Then, once a session load/new/fork/resume response is received, we mark the session as initialized (see holder.completeSession) and drain the corresponding update channel.
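
For illustration, a minimal sketch of the buffer-then-drain idea described above, assuming kotlinx.coroutines is on the classpath. SessionHolderSketch, enqueue, and completeAndDrain are hypothetical names chosen for this example and are not the identifiers used in Client.kt; the real holder may differ.

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for the per-session holder (names are illustrative only).
class SessionHolderSketch<U> {
    // Unlimited buffer: trySend cannot fail for capacity reasons while the channel is open.
    private val pending = Channel<U>(Channel.UNLIMITED)

    // Called from the notification handler: enqueues and returns without suspending.
    fun enqueue(update: U): Boolean = pending.trySend(update).isSuccess

    // Called once the session response has arrived: deliver buffered updates in order.
    suspend fun completeAndDrain(deliver: suspend (U) -> Unit) {
        pending.close() // stop buffering; elements already queued remain readable
        for (update in pending) deliver(update)
    }
}

fun main() = runBlocking {
    val holder = SessionHolderSketch<String>()
    // Updates arrive before the session/load response.
    holder.enqueue("chunk-1")
    holder.enqueue("chunk-2")
    val delivered = mutableListOf<String>()
    // The response arrives: drain what was buffered.
    holder.completeAndDrain { delivered += it }
    println(delivered) // [chunk-1, chunk-2]
}
```

The point of the sketch is only that the handler side never suspends; how updates arriving after the drain are routed is left out here.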

Contributor

Copilot AI left a comment

Pull request overview

This PR reworks client-side session notification handling to prevent deadlocks when session/* responses are preceded by session/update notifications (e.g., during session/load replay), by buffering session updates per session until the session is initialized.

Changes:

  • Introduces a per-session holder with a notification buffer to queue session/update events until session creation completes.
  • Updates Client session lookup/initialization flow to use the new holder abstraction.
  • Adds regression tests covering “notifications before response” for both session/load and session/new, plus slow-agent/slow-client scenarios.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| build.gradle.kts | Bumps library version to 0.16.2. |
| acp/src/commonMain/kotlin/com/agentclientprotocol/client/Client.kt | Implements per-session buffering via ClientSessionHolder and adjusts session resolution logic. |
| acp-ktor-test/src/commonTest/kotlin/com/agentclientprotocol/SimpleAgentTest.kt | Adds tests ensuring updates arriving before responses are delivered and not lost under timing variance. |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

Comment on lines +53 to +60
suspend fun drainEventsAndCompleteSession(session: ClientSessionImpl) {
    @OptIn(ExperimentalCoroutinesApi::class)
    notifications.close()
    for ((notification, meta) in notifications) {
        session.executeWithSession {
            session.handleNotification(notification, meta)
        }
    }

Copilot AI Mar 4, 2026

drainEventsAndCompleteSession() closes the notifications channel before draining. While draining runs, any new SessionUpdate will hit the trySend failure path and the notification handler will suspend, blocking Protocol.start()'s read loop (notifications are handled inline). This can reintroduce message-processing stalls/deadlocks under load. Consider keeping the notification handler strictly non-suspending (always enqueue) and draining/processing from a dedicated consumer coroutine per session, or otherwise avoid closing the channel while new updates may still arrive.
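
As a rough sketch of the "dedicated consumer coroutine per session" alternative suggested in this comment, assuming kotlinx.coroutines: the handler only ever enqueues, and a single consumer waits for the session to become ready before delivering. ConsumerHolderSketch, markReady, and shutdown are hypothetical names, not part of the PR.

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical holder: the handler only enqueues; one consumer coroutine delivers.
class ConsumerHolderSketch<U>(scope: CoroutineScope, deliver: suspend (U) -> Unit) {
    private val updates = Channel<U>(Channel.UNLIMITED)
    private val sessionReady = CompletableDeferred<Unit>()

    private val consumer = scope.launch {
        sessionReady.await() // wait for the session/* response
        for (update in updates) deliver(update) // then deliver in arrival order
    }

    // Never suspends, before or after the session is ready.
    fun enqueue(update: U): Boolean = updates.trySend(update).isSuccess

    fun markReady() { sessionReady.complete(Unit) }

    // Close only when the session itself goes away, then wait for the consumer.
    suspend fun shutdown() { updates.close(); consumer.join() }
}

fun main() = runBlocking {
    val delivered = mutableListOf<String>()
    val holder = ConsumerHolderSketch<String>(this) { delivered += it }
    holder.enqueue("before-response")
    holder.markReady()
    holder.enqueue("after-response") // still accepted: the channel stays open
    holder.shutdown()
    println(delivered) // [before-response, after-response]
}
```

With this shape the channel is closed only at session teardown, so updates that arrive while earlier ones are being delivered are still queued rather than hitting a failed trySend.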

Comment on lines +45 to +50
// Don't make the channel limited, because it leads to a deadlock also:
// when client side makes loadSession/newSession and an agent sends updates more than channel.capacity
// the message with call response suspends because protocol thread is suspended in handleNotification
// if to address it we have to somehow reorder events, that's not obvious on the protocol level, so we pay with memory right now to handle it
private val notifications = Channel<Pair<SessionUpdate, JsonElement?>>(capacity = Channel.UNLIMITED)

Copilot AI Mar 4, 2026

Channel.UNLIMITED for queued session updates can lead to unbounded memory growth if an agent floods session/update (or if draining is slow). If this is intentional, consider at least adding a configurable cap and a defined overflow strategy (drop/close session/backpressure) to avoid OOM in production.
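
One possible shape for the cap mentioned here, sketched with the kotlinx.coroutines Channel factory and BufferOverflow.DROP_OLDEST. The capacity of 3 is an arbitrary value for demonstration, and dropping session updates loses data, so whether this trade-off is acceptable is an open question rather than a recommendation.

```kotlin
import kotlinx.coroutines.channels.BufferOverflow
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    // Assumed cap of 3 for demonstration. With DROP_OLDEST the sender never suspends:
    // when the buffer is full, the oldest queued element is discarded instead.
    val updates = Channel<Int>(capacity = 3, onBufferOverflow = BufferOverflow.DROP_OLDEST)

    for (i in 1..5) updates.trySend(i) // no receiver yet, so the buffer overflows twice
    updates.close()

    val kept = mutableListOf<Int>()
    for (u in updates) kept += u
    println(kept) // [3, 4, 5]
}
```

Unlike a bounded channel with the default suspending overflow behavior, this variant keeps the handler non-blocking, at the cost of silently discarding the oldest updates.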

Comment on lines +380 to +383
}.getOrElse { throwable ->
    sessionHolder.completeExceptionally(IllegalStateException("Failed to create session $sessionId", throwable))
    removeSessionHolder(sessionId)
    throw throwable

Copilot AI Mar 4, 2026

In createSession() error handling, the holder is completed exceptionally with a wrapped IllegalStateException("Failed to create session ..."), but the method rethrows the original throwable. This means callers of loadSession/newSession and code awaiting getSessionOrThrow() will observe different exception types/messages for the same failure, making debugging inconsistent. Consider throwing the same wrapped exception (or not wrapping at all) so both paths surface identical failures.
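
A small sketch of the "throw the same wrapped exception" option, assuming the holder behaves like a kotlinx.coroutines CompletableDeferred. createOrFail is a hypothetical simplification written for this example and does not reproduce the real createSession().

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.runBlocking

// Hypothetical simplification: wrap once, then use the same exception both to
// fail the holder and to throw to the direct caller.
fun <T> createOrFail(sessionId: String, holder: CompletableDeferred<T>, create: () -> T): T =
    runCatching(create).getOrElse { cause ->
        val wrapped = IllegalStateException("Failed to create session $sessionId", cause)
        holder.completeExceptionally(wrapped)
        throw wrapped
    }.also { holder.complete(it) }

fun main() = runBlocking {
    val holder = CompletableDeferred<String>()
    // What the caller of loadSession/newSession would observe.
    val thrown = runCatching {
        createOrFail<String>("s1", holder) { error("agent rejected") }
    }.exceptionOrNull()
    // What code awaiting the holder would observe.
    val awaited = runCatching { holder.await() }.exceptionOrNull()
    println(thrown?.message == awaited?.message) // true
}
```

Both paths now report "Failed to create session s1" with the original failure attached as the cause.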

@erokhins self-requested a review on March 4, 2026, 17:34

return clientSessionHolder ?: acpFail("Session $sessionId not found")
}

private fun removeSessionHolder(sessionId: SessionId) {
Contributor

Unused function

for ((id, holder) in hangingSessions) {
    logger.trace { "Removing hanging session $id" }
    // report it as non existent session
    holder.completeExceptionally(AcpExpectedError("Session $id not found"))
Contributor

nitpick: for sessions that failed to initialize, we will call completeExceptionally twice. We could check whether the session has already completed.
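
To illustrate the check suggested in this nitpick, a short sketch that assumes the holder is backed by a kotlinx.coroutines CompletableDeferred (an assumption; the holder's internals are not shown in this conversation). On CompletableDeferred, completeExceptionally returns false when the value was already completed, and isCompleted can be used as an explicit guard.

```kotlin
import kotlinx.coroutines.CompletableDeferred

fun main() {
    // Stand-in for a session holder whose initialization has already failed.
    val holder = CompletableDeferred<String>()
    val first = holder.completeExceptionally(IllegalStateException("Failed to create session s1"))

    // The cleanup pass tries again; the second call is a no-op and reports false.
    val second = holder.completeExceptionally(IllegalStateException("Session s1 not found"))
    println("$first $second") // true false

    // Explicit guard, if skipping the call (and its logging) is preferred.
    if (!holder.isCompleted) {
        holder.completeExceptionally(IllegalStateException("Session s1 not found"))
    }
}
```

Under that assumption the double call is harmless, since the first exception wins, but the guard makes the intent visible.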

@nerzhulart merged commit 0c9633d into master on Mar 4, 2026
1 check passed
@nerzhulart deleted the n500/#74 branch on March 4, 2026, 18:20

3 participants