feat: add agentic context management with model-driven compression tools #2754
notowen333 wants to merge 1 commit into
Introduces `contextManager: 'agentic'` which gives the model tools (summarize_context, truncate_context, pin) and injects context-status XML so it can autonomously manage its own context window. Refactors shared compression logic into a reusable compression/ module.
  ...retryStrategies,
  ...(config?.plugins ?? []),
- ...(config?.contextManager === 'auto' && !hasOffloader
+ ...((config?.contextManager === 'auto' || config?.contextManager === 'agentic') && !hasOffloader
Should we have larger offloading limits, or let the agent configure this at all?
I think it makes sense to have a higher token threshold for offloading if we are also exposing a tool for this.

Spoke offline: we'll skip the tool-based offloading setting for now.
  .optional()
  .describe(
    'Filter which messages to target. "tools" targets only tool use/result messages, ' +
      '"messages" targets only non-tool messages, "all" (default) targets everything.'
Should we filter by user/assistant message as well?
"Messages" meaning non-tool is a bit of a catch-all. What other items live in the messages array? Is there a way to do the inverse (i.e. select all messages except user messages)? I see that as a use case for future context for sure.

We can add this later since this isn't exposed, but just a note.
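To make the three filter values concrete, here is a minimal sketch of how a `MessageTypeFilter` could map to message indices. The `Message` and `ContentBlock` shapes, `isToolMessage`, and `selectIndices` are hypothetical simplifications for illustration, not the SDK's actual types or the code in this PR.

```typescript
// Hypothetical, simplified message shapes; the SDK's real types are not shown in this thread.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'toolUse'; name: string }
  | { type: 'toolResult'; content: string }

interface Message {
  role: 'user' | 'assistant'
  content: ContentBlock[]
}

type MessageTypeFilter = 'tools' | 'messages' | 'all'

// Assumption: a message is a "tool" message if any of its blocks is a tool use or tool result.
function isToolMessage(message: Message): boolean {
  return message.content.some((block) => block.type === 'toolUse' || block.type === 'toolResult')
}

// Return the indices of the messages that the given filter targets.
function selectIndices(messages: Message[], filter: MessageTypeFilter = 'all'): number[] {
  const indices: number[] = []
  messages.forEach((message, index) => {
    const tool = isToolMessage(message)
    if (filter === 'all' || (filter === 'tools' ? tool : !tool)) {
      indices.push(index)
    }
  })
  return indices
}

const history: Message[] = [
  { role: 'user', content: [{ type: 'text', text: 'List the files.' }] },
  { role: 'assistant', content: [{ type: 'toolUse', name: 'ls' }] },
  { role: 'user', content: [{ type: 'toolResult', content: 'a.ts b.ts' }] },
  { role: 'assistant', content: [{ type: 'text', text: 'There are two files.' }] },
]

const toolIndices = selectIndices(history, 'tools') // the tool use and its result
const chatIndices = selectIndices(history, 'messages') // the two plain text messages
```

A role-based or inverse filter, as raised above, could be layered on as an extra predicate without changing this shape.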
  }),
])
.describe(
  'How to select messages. "current_turn" pins the last user+assistant exchange. ' +
Can we pin only user/assistant exchanges?
Can we pin only user or only assistant messages? Can we pin tool results?

I think it's important that we can pin only user messages as an option.
Fine to ship as is; we can follow up on these after summit if they are not breaking.

I redesigned the tool interface, which should cover these.
let targetIndices: number[]

if (selector.type === 'current_turn') {
So a turn is defined as any batch of assistant messages followed by user messages, and both can be any length?
I'd want to iterate on this definition after summit.

It's now just walking backwards until we hit a user text block.
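The "walk backwards until we hit a user text block" rule can be sketched roughly as follows. The message shapes and the name `currentTurnIndices` are hypothetical, and treating tool-result-only user messages as part of the turn (not as its start) is an assumption drawn from the reply above, not confirmed against the PR.

```typescript
// Hypothetical, simplified message shapes for illustration only.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'toolUse'; name: string }
  | { type: 'toolResult'; content: string }

interface Message {
  role: 'user' | 'assistant'
  content: ContentBlock[]
}

// Walk backwards to the most recent user message that contains a text block,
// then return every index from that message to the end of the history.
function currentTurnIndices(messages: Message[]): number[] {
  for (let start = messages.length - 1; start >= 0; start--) {
    const message = messages[start]
    if (message === undefined) continue
    const isUserText =
      message.role === 'user' && message.content.some((block) => block.type === 'text')
    if (isUserText) {
      return messages.slice(start).map((_, offset) => start + offset)
    }
  }
  return [] // no user text block anywhere in the history
}

const conversation: Message[] = [
  { role: 'user', content: [{ type: 'text', text: 'Hi' }] },
  { role: 'assistant', content: [{ type: 'text', text: 'Hello!' }] },
  { role: 'user', content: [{ type: 'text', text: 'Read config.json' }] },
  { role: 'assistant', content: [{ type: 'toolUse', name: 'read_file' }] },
  { role: 'user', content: [{ type: 'toolResult', content: '{ "debug": true }' }] },
  { role: 'assistant', content: [{ type: 'text', text: 'Debug mode is on.' }] },
]

// The tool-result message at index 4 has role 'user' but no text block,
// so the walk continues past it and stops at index 2.
const turn = currentTurnIndices(conversation)
```

Under this definition a turn always starts at a user-authored text message, so tool round trips stay inside the turn they belong to.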
/**
 * Partition a range of messages into pinned and unpinned groups.
 */
export function partitionPinned(
Should this be in pin-messages.ts?

Actually, I'm pretty sure we already have this same function there and this is a duplicate.

It's not pure pinning, so I didn't put it there.

I.e. it uses a DS in the input that's only for the agentic mode.

Sorry, what is DS? Can we consolidate by calling the existing partition pinned with start=0? I think two functions with the same name that are basically the same is pretty confusing.

It's really simple functionality. I think it is actually simpler to have a separate 3-way partition instead of first reusing the 2-way one.

DS = data structure; specifically, I mean MessageTypeFilter.
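For readers following the thread, a single-pass 3-way partition might look something like the sketch below. The three group names (`pinned`, `targeted`, `skipped`), the `pinned` flag on the message, and the function name `partitionThreeWay` are all assumptions for illustration; the PR's actual `partitionPinned` signature is not shown here.

```typescript
// Hypothetical, simplified shapes; the `pinned` flag is an assumed representation of pin state.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'toolUse'; name: string }
  | { type: 'toolResult'; content: string }

interface Message {
  role: 'user' | 'assistant'
  content: ContentBlock[]
  pinned?: boolean
}

type MessageTypeFilter = 'tools' | 'messages' | 'all'

interface ThreeWayPartition {
  pinned: Message[] // never compressed
  targeted: Message[] // unpinned and matched by the filter
  skipped: Message[] // unpinned but outside the filter
}

function matchesFilter(message: Message, filter: MessageTypeFilter): boolean {
  if (filter === 'all') return true
  const tool = message.content.some((b) => b.type === 'toolUse' || b.type === 'toolResult')
  return filter === 'tools' ? tool : !tool
}

// One pass over the messages; pin status takes precedence over the type filter.
function partitionThreeWay(messages: Message[], filter: MessageTypeFilter): ThreeWayPartition {
  const result: ThreeWayPartition = { pinned: [], targeted: [], skipped: [] }
  for (const message of messages) {
    if (message.pinned) result.pinned.push(message)
    else if (matchesFilter(message, filter)) result.targeted.push(message)
    else result.skipped.push(message)
  }
  return result
}

const sample: Message[] = [
  { role: 'user', content: [{ type: 'text', text: 'Keep this requirement.' }], pinned: true },
  { role: 'assistant', content: [{ type: 'toolUse', name: 'search' }] },
  { role: 'user', content: [{ type: 'toolResult', content: 'a long result' }] },
  { role: 'assistant', content: [{ type: 'text', text: 'Here is what I found.' }] },
]

const parts = partitionThreeWay(sample, 'tools')
```

The alternative discussed above, calling a 2-way pinned/unpinned partition first and then filtering the unpinned group, would yield the same groups in two passes.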
Summary
- `contextManager: 'agentic'` option that registers context tools (`summarize_context`, `truncate_context`, `pin`) and injects `<context-status>` XML via middleware so the model can autonomously manage its own context window
- Shared compression logic from `SlidingWindowConversationManager` and `SummarizingConversationManager` refactored into a reusable `compression/` module
- `InvokeModelStage.Input` middleware stage for pre-model-call message transformation

Key design decisions
- Tools modify `agent.messages` directly (in-place splice) so compression effects persist across invokes
- `messageType` filter lets the model selectively target chat messages vs tool results for compression

Test plan
- `summarize_context` with `messageType: "messages"` for chat and `truncate_context` with `messageType: "tools"` for bulky tool results
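
The in-place splice decision can be illustrated with a small sketch. The `Message` shape, the `agent` object, and both helper functions are hypothetical stand-ins, and giving the summary a `user` role is an assumption, not something this PR states. The point is only that `Array.prototype.splice` mutates the array the agent already holds, whereas building a new array leaves the agent's history unchanged unless it is reassigned.

```typescript
// Hypothetical minimal message shape for illustration.
interface Message {
  role: 'user' | 'assistant'
  text: string
}

// Replace messages[start, end) with one summary message, mutating the array in place.
function compressInPlace(messages: Message[], start: number, end: number, summary: string): void {
  messages.splice(start, end - start, { role: 'user', text: summary })
}

// Non-mutating variant for contrast: returns a new array and leaves the input untouched.
function compressCopy(messages: Message[], start: number, end: number, summary: string): Message[] {
  return [...messages.slice(0, start), { role: 'user', text: summary }, ...messages.slice(end)]
}

// Stand-in for an agent that owns its message history.
const agent: { messages: Message[] } = {
  messages: [
    { role: 'user', text: 'Question 1' },
    { role: 'assistant', text: 'Answer 1' },
    { role: 'user', text: 'Question 2' },
    { role: 'assistant', text: 'Answer 2' },
  ],
}

// The copy is compressed, but the agent still holds all four messages.
const copy = compressCopy(agent.messages, 0, 3, 'Summary of the first three messages.')
const lengthAfterCopy = agent.messages.length

// The splice changes the agent's own array, so the next invoke sees the compressed history.
compressInPlace(agent.messages, 0, 3, 'Summary of the first three messages.')
const lengthAfterSplice = agent.messages.length
```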