Your description is correct: once chats become large enough, the real limitation is token bounds, not whether the context is stored in a text file, a database, or anything else. Summarization is necessary, but after several cycles it inevitably starts losing critical meaning, so it cannot serve as a complete long-term solution by itself. At that point, the only practical implementation-level approach is to distribute memory responsibility across agents, including attaching additional agents that maintain scoped context; this has already been used in practice in an experimental repository fork.

Since full-history replay becomes infeasible and summarization cannot be relied on indefinitely, a minimal solution is to avoid passing full context and instead pass lightweight context references. This can be done by introducing a Context ID array at the protocol level. Each ID is a string whose meaning is defined by its naming: the agent interprets the string and decides which MCP/tool to use to resolve it. The ACP protocol does not define storage or structure; it only carries identifiers. The main requirement is a consistent naming convention, so that agents can reliably interpret the intent of each ID from the string itself.

I am focusing on agents here. For a user, passing context IDs is trivial, since they can generate and manage them within their own system. The real challenge appears when an agent needs to use the protocol itself — for example, to invoke MCP or interact with other agents. Then the problem becomes how the agent understands and correctly resolves those context references. The open question is how to standardize that naming pattern so that different agents can interpret it consistently and map it to the correct tool or retrieval mechanism.

I also reviewed the schema in detail. In principle, part of the intended solution may already exist through Embedded Context, and that direction could be extended with minimal changes. Still, this looks more like a discussion point and a direction for further development than something fully resolved at the protocol level.
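To make the "lightweight context references" idea concrete, here is a minimal sketch of what a Context ID array and a naming-convention resolver could look like. Every prefix, tool name, and ID format below is an assumption for illustration — none of this is defined by ACP today:

```typescript
// Hypothetical sketch: context IDs whose naming convention encodes how to
// resolve them. All prefixes and tool names here are invented examples.

type ContextId = string;

// Map a naming-convention prefix to the MCP tool (or other mechanism) an
// agent would use to resolve references with that prefix.
const RESOLVERS: Record<string, string> = {
  mem: "memory-store/get",     // e.g. "mem:project/decisions"
  doc: "docs-search/fetch",    // e.g. "doc:architecture-overview"
  chat: "chat-archive/replay", // e.g. "chat:session-42/turns-100-180"
};

// An agent reads only the prefix before ":" and picks the resolver;
// the protocol itself never inspects what the ID points to.
function resolverFor(id: ContextId): string | undefined {
  const prefix = id.split(":", 1)[0];
  return RESOLVERS[prefix];
}

// A request would then carry only lightweight references, not full context:
const request = {
  prompt: "Continue the refactor we discussed.",
  contextIds: ["mem:project/decisions", "chat:session-42/turns-100-180"],
};

for (const id of request.contextIds) {
  console.log(id, "->", resolverFor(id) ?? "unresolvable");
}
```

The design choice this illustrates is that the protocol stays storage-agnostic: only the naming convention (the prefix table) needs standardizing for different agents to interpret the same IDs consistently.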
I’ve been thinking a lot about long-running chats in ACP-native clients, and I’m curious how others are approaching this.
In Fabriqa, I now support switching between ACP agents within the same chat. Today this works in a pretty poor-man’s way: after initialize, I start a new agent process and send the existing chat log back as an initialization prompt so the new agent can pick up prior context. I already reduce some bloat by offloading heavy tool results instead of replaying everything verbatim, but this only works up to a point.
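The "offload heavy tool results" step above can be sketched as a simple pre-replay filter. The message shape and threshold here are assumptions for illustration, not Fabriqa's actual code:

```typescript
// Sketch: before replaying chat history to a freshly started agent,
// replace oversized tool outputs with a short reference placeholder.
// Types and the threshold are illustrative assumptions.

interface ChatMessage {
  role: "user" | "assistant" | "tool";
  content: string;
}

const MAX_TOOL_RESULT_CHARS = 2000; // illustrative cutoff

function slimForReplay(history: ChatMessage[]): ChatMessage[] {
  return history.map((msg, i) =>
    msg.role === "tool" && msg.content.length > MAX_TOOL_RESULT_CHARS
      ? {
          ...msg,
          // The agent could re-fetch the full result on demand if it
          // turns out to matter; most of the time it will not.
          content: `[tool result #${i} offloaded: ${msg.content.length} chars]`,
        }
      : msg
  );
}
```

This keeps replay cost roughly proportional to the conversational text rather than to tool output size, but as noted above it only delays the context-limit problem rather than solving it.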
If the user keeps using the same chat for a long time, I eventually hit context-limit concerns. Also, in many real cases the original ACP process is already gone when the user returns: next day, new machine, ephemeral sandbox, etc. So I can’t rely on the old session still existing.
In a full ACP-based client, I also may not have direct LLM access myself. I only have the ACP agent. So if I need a durable resumable summary, the only feasible fallback I see today is something like:
1. Try `session/load` first.
2. If that is unavailable or fails, create a summary via `session/fork`.
3. Persist that summary server-side in my own database with a timestamp or message boundary.
4. When resuming a very long chat, initialize a fresh session with:
   - the stored summary from the forked session
   - plus the messages after that summary point
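As a sketch of the decision logic in that fallback, here is a pure function that picks a resume strategy and assembles the initialization prompt. The types and the `StoredSummary` shape are my own assumptions, and the actual `session/load`/`session/fork` calls are abstracted away:

```typescript
// Sketch of the fallback resume flow described above. Whether
// session/load is available is passed in as a flag; the summary shape
// is an illustrative assumption, not a real ACP structure.

interface StoredSummary {
  text: string;
  boundaryIndex: number; // index of the last message the summary covers
}

type ResumePlan =
  | { mode: "load" }                    // session/load succeeded; nothing to rebuild
  | { mode: "replay"; prompt: string }; // fresh session seeded with a prompt

function planResume(
  loadAvailable: boolean,
  summary: StoredSummary | null,
  messages: string[]
): ResumePlan {
  if (loadAvailable) return { mode: "load" };
  if (!summary) {
    // No durable summary persisted yet: fall back to full replay (costly).
    return { mode: "replay", prompt: messages.join("\n") };
  }
  // Seed the fresh session with the stored summary plus only the
  // messages after the summary boundary.
  const tail = messages.slice(summary.boundaryIndex + 1);
  return {
    mode: "replay",
    prompt: [`Summary of earlier conversation:\n${summary.text}`, ...tail].join("\n"),
  };
}
```

The boundary index is what makes this safe: without knowing exactly which messages the summary covers, the client risks replaying content twice or dropping it entirely.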
This seems workable, but it feels wasteful because I am spending extra user tokens to generate a second summary purely for resumability.
What makes this more frustrating is that, at least on the Claude side, the underlying agent stack already seems to have this information:
- Claude Code / Claude Agent SDK exposes compaction summaries via `PostCompact.compact_summary`
- the SDK stream/types also seem to have compaction content blocks
- but current ACP adapters do not appear to forward that summary to ACP clients
- for example, from what I could tell, claude-agent-acp and codex-acp surface compaction as a generic event/message, not as reusable structured compacted state
So from an ACP-native client perspective, we are forced to re-summarize with session/fork even when the agent already has an internal compaction summary it trusts enough to continue from itself.
What I would really like ACP to support is a standard way for clients to access the agent’s internal compaction summaries, ideally with enough metadata to know what message/turn boundary they cover. Then a client could persist that server-side and later resume from:
- the compacted summary
- the compaction boundary
- the subsequent messages
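To make the proposal concrete, here is a hypothetical shape for a forwarded compaction summary and the resume assembly it would enable. None of these fields exist in the ACP schema today; this is only a sketch of the "summary + boundary + subsequent messages" idea:

```typescript
// Hypothetical structure an ACP adapter could forward when the underlying
// agent compacts its context. Field names are invented for illustration.

interface CompactionSummary {
  summary: string;           // the agent's own compaction text, verbatim
  coversThroughTurn: number; // last turn index the summary replaces
  createdAt: string;         // ISO timestamp, for server-side persistence
}

// On resume, combine the persisted compaction summary with only the
// turns recorded after the compaction boundary.
function buildResumeContext(
  compaction: CompactionSummary,
  turns: { index: number; text: string }[]
): string[] {
  const tail = turns.filter((t) => t.index > compaction.coversThroughTurn);
  return [compaction.summary, ...tail.map((t) => t.text)];
}
```

The key benefit over `session/fork` is that no extra tokens are spent: the client reuses a summary the agent already produced and already trusts enough to continue from itself.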
That would make long-running ACP chats much more practical, especially for clients that:
- do not have separate direct model access
- support agent switching
- need resumability after the original agent process is gone, on another machine/sandbox
Is anyone else thinking about this problem from the "ACP-only client, no separate LLM access" angle? It feels like long-lived ACP clients need a standard story for durable resumability once naive full-history replay stops being viable (in environments where the existing session can no longer be loaded, like ephemeral sandboxes).