Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -484,6 +484,17 @@ def _ensure_agent_ready(self) -> None:

self._agent_ready = True

def _should_initialize_agent_on_send_message(self) -> bool:
"""Return whether send_message() should eagerly initialize the agent.

ACPAgent startup is substantially heavier than regular agent
initialization because it launches and handshakes with an external ACP
subprocess. Deferring that work to run() keeps send_message() fast and
avoids HTTP client read timeouts on the remote conversation endpoint.
Comment on lines +487 to +493
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Documentation: The docstring explains why we defer ACP init (performance, HTTP timeouts) but doesn't explain the consequence.

Consider adding: "For ACP agents, the agent will not be initialized until run() is called. The message is enqueued in conversation state, but agent internals will not be accessible until after run() begins."

This matters if anyone tries to inspect agent state between send_message() and run() - they'll see different behavior for ACP vs other agents.

"""
agent_kind = getattr(self.agent, "kind", self.agent.__class__.__name__)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable: Using getattr() with string comparison violates the repo guideline to "avoid getattr/hasattr guards and instead enforce type correctness."

The cleaner approach would be:

# In AgentBase:
defer_initialization: bool = False

# In ACPAgent:
defer_initialization: bool = True

# Here:
return not self.agent.defer_initialization

However, your pragmatic choice is reasonable if:

  1. The kind attribute is stable (used for serialization)
  2. You're avoiding import cycles
  3. Adding a base property for one use case feels like over-engineering

This is the kind of trade-off where theory and practice clash. I'd lean toward the base property for maintainability, but I won't block on it given this solves a real production issue.

return agent_kind != "ACPAgent"

def switch_profile(self, profile_name: str) -> None:
"""Switch the agent's LLM to a named profile.

Expand Down Expand Up @@ -520,8 +531,12 @@ def send_message(self, message: str | Message, sender: str | None = None) -> Non
one agent delegates to another, the sender can be set to
identify which agent is sending the message.
"""
# Ensure agent is fully initialized (loads plugins and initializes agent)
self._ensure_agent_ready()
# ACPAgent startup can take much longer than a normal send_message()
# round-trip because it launches and initializes a subprocess-backed
# session. Defer that work to run() so enqueueing the user message
# remains fast for remote callers.
if self._should_initialize_agent_on_send_message():
self._ensure_agent_ready()

if isinstance(message, str):
message = Message(role="user", content=[TextContent(text=message)])
Expand Down
40 changes: 39 additions & 1 deletion tests/sdk/conversation/local/test_conversation_send_message.py
Original file line number Diff line number Diff line change
@@ -1,8 +1,14 @@
from unittest.mock import patch

from pydantic import SecretStr

from openhands.sdk.agent.acp_agent import ACPAgent
from openhands.sdk.agent.base import AgentBase
from openhands.sdk.conversation import Conversation, LocalConversation
from openhands.sdk.conversation.state import ConversationState
from openhands.sdk.conversation.state import (
ConversationExecutionStatus,
ConversationState,
)
from openhands.sdk.conversation.types import (
ConversationCallbackType,
ConversationTokenCallbackType,
Expand Down Expand Up @@ -153,3 +159,35 @@ def test_send_message_with_message_object():
assert len(user_event.llm_message.content) == 1
assert isinstance(user_event.llm_message.content[0], TextContent)
assert user_event.llm_message.content[0].text == test_text


def test_acp_send_message_defers_initialization_until_run(tmp_path):
"""ACP conversations should enqueue messages before starting ACP bootstrap."""

agent = ACPAgent(acp_command=["echo", "test"])
conversation = LocalConversation(agent=agent, workspace=str(tmp_path))

def _finish_immediately(self, conv, on_event, on_token=None):
conv.state.execution_status = ConversationExecutionStatus.FINISHED

with (
patch.object(ACPAgent, "init_state", autospec=True) as mock_init_state,
patch.object(
ACPAgent,
"step",
autospec=True,
side_effect=_finish_immediately,
) as mock_step,
):
Comment on lines +171 to +181
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Acceptable: This test is entirely mock-based - you're not exercising real ACP initialization, just verifying the control flow.

Per repo guidelines, "prefer tests that exercise real code paths and assert on outputs/state." However, this is probably acceptable because:

  1. Real ACP subprocess initialization would be slow/flaky in unit tests
  2. The test does prove the key behavior: init is deferred to run()
  3. You're doing manual validation on real swebench instances

The test serves its purpose - it would catch if someone accidentally removed the deferral logic.

conversation.send_message("Hello from ACP")

assert mock_init_state.call_count == 0
assert mock_step.call_count == 0
assert len(conversation.state.events) == 1
assert isinstance(conversation.state.events[-1], MessageEvent)
assert conversation.state.events[-1].source == "user"

conversation.run()

assert mock_init_state.call_count == 1
assert mock_step.call_count == 1
Loading