-
Notifications
You must be signed in to change notification settings - Fork 78
feat(callbacks): Callbacks can edit messages #852
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
You're cooking, looks good to me |
|
Smolagents implementation pending improvement based on any responses from the HF team on huggingface/smolagents#1834 |
| "model": model_id, | ||
| "api_key": api_key, | ||
| "api_base": api_base, | ||
| "allow_running_loop": True, # Because smolagents uses sync api |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug in our code that was never found because we weren't testing callbacks in our integration tests
| # Only invoke callbacks on the first LLM call, not on retry attempts | ||
| # Retries can be detected by checking if there are error messages in the history | ||
| is_retry = any( | ||
| "Error:" in str(msg.content) and "Now let's retry" in str(msg.content) | ||
| for msg in messages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The smolagents implementation is certainly quirky. They append error messages as tool responses and have retry hooks a little differently from the other libraries. Putting this here which Claude helped me generate (with a bunch of guidance):
The Complete Flow with Message Modification in Smolagents librar
First Call (Step 1):
- _run_stream() line 576: calls _step_stream()
- _step_stream() line 1259: calls write_memory_to_messages()
- write_memory_to_messages() line 764: calls TaskStep.to_messages()
- TaskStep.to_messages() line 192: returns "New task:\nSay goodbye"
- Wrapper modifies: Changes last message to "Say hello and goodbye" ✓
- _step_stream() line 1284: calls model.generate() with modified messages
- LLM tries to call final_answer twice → raises AgentExecutionError
- _run_stream() line 597: stores error in action_step.error
- _run_stream() line 600: appends action_step (with error) to memory
Second Call (Step 2 - Retry):
- _run_stream() line 576: calls _step_stream() AGAIN
- _step_stream() line 1259: calls write_memory_to_messages() AGAIN
- write_memory_to_messages() rebuilds from memory:
- TaskStep.to_messages() → "New task:\nSay goodbye" (original task!)
- ActionStep.to_messages() → Error message: "Error:\n...Now let's retry..." - Messages are rebuilt from memory → modification is LOST ✗
- Wrapper modifies the last message (which is now the error message, not the task!)
- LLM receives original unmodified task + error message
Why This Happens
The design assumes messages are stateless and reproducible from memory structures. Smolagents never expects the messages parameter to be modified at runtime. The architecture is:
Memory Structures (TaskStep, ActionStep) → to_messages() → Fresh Message List
This happens before every LLM call, so any modifications to the message list itself are discarded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because of this, we don't want to redo the callbacks each retry, we want to only make that callback interference on the entry, not the retries (since that would be messing with the internal smolagents logic).
|
Integration tests: https://github.com/mozilla-ai/any-agent/actions/runs/18853291870 |
|
This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 3 days. |
|
Integration tests: https://github.com/mozilla-ai/any-agent/actions/runs/19055632789 |
|
This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 3 days. |
|
This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 3 days. |
|
This PR was closed because it has been stalled for 3 days with no activity. |
Feature: before_llm_call can edit the messages.
The pros:
The cons:
I don't see a good solution for the con here: callbacks to modify message behavior wildly varies from framework to framework and is heavily implementation specific. I lean towards merging this PR and fielding any bug reports if they come in, but I want to be clear that I'm not confident that this is going to be a robust long term solution, but I also don't see any other reasonable way to do this across all the agent frameworks.