
Conversation

@njbrake
Contributor

@njbrake njbrake commented Oct 21, 2025

Feature: before_llm_call can edit the messages.

The pros:

  • This feature works on all frameworks. I feel good that the integration test I added sufficiently exercises the code: if any underlying logic changes for the situation we are testing, the test will fail and we'll know one of our assumptions is broken.

The cons:

  • Because we have to inject ourselves into specific parts of the agent loop, we're now pretty tightly tied to the agent used inside each framework. If the user decides to use a different agent model from the default_agent_model in a framework, our callbacks will probably not work for them.

I don't see a good solution for the con here: how callbacks can modify message behavior varies wildly from framework to framework and is heavily implementation-specific. I lean towards merging this PR and fielding any bug reports if they come in. To be clear, I'm not confident this is going to be a robust long-term solution, but I also don't see any other reasonable way to do this across all the agent frameworks.
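For illustration, a minimal sketch of the kind of hook this PR enables. The `Context` dataclass and the bare `before_llm_call` function here are illustrative stand-ins, not any_agent's real API:

```python
# Hypothetical sketch of a before_llm_call callback that edits the
# message list before it reaches the model. Context and the hook
# signature are assumptions for illustration, not any_agent's real API.
from dataclasses import dataclass, field


@dataclass
class Context:
    messages: list = field(default_factory=list)


def before_llm_call(ctx: Context) -> Context:
    # Rewrite the latest user turn in place before the LLM sees it.
    if ctx.messages and ctx.messages[-1]["role"] == "user":
        ctx.messages[-1]["content"] = "Say hello and goodbye"
    return ctx


ctx = Context(messages=[{"role": "user", "content": "Say goodbye"}])
before_llm_call(ctx)
print(ctx.messages[-1]["content"])  # Say hello and goodbye
```

The hard part, as noted above, is not the hook itself but finding the one place in each framework's agent loop where the message list can be intercepted before the model call.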

@njbrake njbrake marked this pull request as draft October 21, 2025 16:25
@codecov

codecov bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 51.07914% with 204 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/any_agent/frameworks/google.py 24.69% 58 Missing and 3 partials ⚠️
src/any_agent/callbacks/wrappers/google.py 49.46% 36 Missing and 11 partials ⚠️
src/any_agent/callbacks/wrappers/smolagents.py 39.62% 30 Missing and 2 partials ⚠️
src/any_agent/callbacks/wrappers/agno.py 63.15% 13 Missing and 1 partial ⚠️
src/any_agent/callbacks/span_generation/agno.py 50.00% 7 Missing and 4 partials ⚠️
src/any_agent/callbacks/context.py 61.90% 8 Missing ⚠️
src/any_agent/frameworks/openai.py 55.55% 7 Missing and 1 partial ⚠️
src/any_agent/callbacks/wrappers/langchain.py 80.00% 6 Missing and 1 partial ⚠️
src/any_agent/callbacks/wrappers/openai.py 64.70% 5 Missing and 1 partial ⚠️
src/any_agent/callbacks/wrappers/llama_index.py 70.58% 4 Missing and 1 partial ⚠️
... and 2 more
Files with missing lines Coverage Δ
src/any_agent/callbacks/__init__.py 100.00% <100.00%> (ø)
src/any_agent/frameworks/any_agent.py 90.62% <100.00%> (+3.12%) ⬆️
src/any_agent/frameworks/langchain.py 57.62% <100.00%> (+3.91%) ⬆️
src/any_agent/frameworks/smolagents.py 71.31% <ø> (+2.45%) ⬆️
src/any_agent/frameworks/agno.py 66.90% <0.00%> (+2.15%) ⬆️
src/any_agent/callbacks/wrappers/tinyagent.py 93.25% <75.00%> (-2.30%) ⬇️
src/any_agent/callbacks/wrappers/llama_index.py 93.42% <70.58%> (-3.19%) ⬇️
src/any_agent/callbacks/wrappers/openai.py 90.27% <64.70%> (-4.27%) ⬇️
src/any_agent/callbacks/wrappers/langchain.py 85.85% <80.00%> (-0.08%) ⬇️
src/any_agent/callbacks/context.py 74.19% <61.90%> (-25.81%) ⬇️
... and 6 more

... and 46 files with indirect coverage changes


@daavoo
Contributor

daavoo commented Oct 21, 2025

> @daavoo before I get too deep into this PR, would you be able to give a quick look to make sure it aligns with what you were thinking too? I'm 90% sure it does, but want to check in before I go through the trouble of implementing it for all the frameworks

You're cooking, looks good to me

@njbrake
Contributor Author

njbrake commented Oct 27, 2025

Smolagents implementation pending improvement based on any responses from the HF team on huggingface/smolagents#1834

"model": model_id,
"api_key": api_key,
"api_base": api_base,
"allow_running_loop": True, # Because smolagents uses sync api
Contributor Author

This was a bug in our code that was never caught because we weren't exercising callbacks in our integration tests.

Comment on lines +91 to +95
# Only invoke callbacks on the first LLM call, not on retry attempts
# Retries can be detected by checking if there are error messages in the history
is_retry = any(
    "Error:" in str(msg.content) and "Now let's retry" in str(msg.content)
    for msg in messages
)
Contributor Author

The smolagents implementation is certainly quirky: they append error messages as tool responses and handle retries a little differently from the other libraries. Posting this walkthrough here, which Claude helped me generate (with a bunch of guidance):

The Complete Flow with Message Modification in the Smolagents Library

First Call (Step 1):

  1. _run_stream() line 576: calls _step_stream()
  2. _step_stream() line 1259: calls write_memory_to_messages()
  3. write_memory_to_messages() line 764: calls TaskStep.to_messages()
  4. TaskStep.to_messages() line 192: returns "New task:\nSay goodbye"
  5. Wrapper modifies: Changes last message to "Say hello and goodbye" ✓
  6. _step_stream() line 1284: calls model.generate() with modified messages
  7. LLM tries to call final_answer twice → raises AgentExecutionError
  8. _run_stream() line 597: stores error in action_step.error
  9. _run_stream() line 600: appends action_step (with error) to memory

Second Call (Step 2 - Retry):

  1. _run_stream() line 576: calls _step_stream() AGAIN
  2. _step_stream() line 1259: calls write_memory_to_messages() AGAIN
  3. write_memory_to_messages() rebuilds from memory:
    - TaskStep.to_messages() → "New task:\nSay goodbye" (original task!)
    - ActionStep.to_messages() → Error message: "Error:\n...Now let's retry..."
  4. Messages are rebuilt from memory → modification is LOST ✗
  5. Wrapper modifies the last message (which is now the error message, not the task!)
  6. LLM receives original unmodified task + error message

Why This Happens

The design assumes messages are stateless and reproducible from memory structures. Smolagents never expects the messages parameter to be modified at runtime. The architecture is:

Memory Structures (TaskStep, ActionStep) → to_messages() → Fresh Message List

This happens before every LLM call, so any modifications to the message list itself are discarded.
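The rebuild behaviour described above can be reproduced with a toy model. `TaskStep`, `ActionStep`, and `write_memory_to_messages` here mimic (but are not) the real smolagents classes:

```python
# Toy sketch of the smolagents pattern described above: messages are
# rebuilt from memory steps before every LLM call, so in-place edits
# to the returned list do not persist. All names are illustrative.
class TaskStep:
    def __init__(self, task):
        self.task = task

    def to_messages(self):
        return [{"role": "user", "content": f"New task:\n{self.task}"}]


class ActionStep:
    def __init__(self, error):
        self.error = error

    def to_messages(self):
        return [{"role": "user", "content": f"Error:\n{self.error}\nNow let's retry"}]


memory = [TaskStep("Say goodbye")]


def write_memory_to_messages():
    # A fresh message list is built from memory before every LLM call.
    msgs = []
    for step in memory:
        msgs.extend(step.to_messages())
    return msgs


# Step 1: the wrapper edits the freshly built list in place.
messages = write_memory_to_messages()
messages[-1]["content"] = "New task:\nSay hello and goodbye"

# The step errors; the error is appended to memory, not to `messages`.
memory.append(ActionStep("called final_answer twice"))

# Step 2 (retry): the list is rebuilt from memory, so the edit is gone
# and the last message is now the error, not the task.
retry_messages = write_memory_to_messages()
print(retry_messages[0]["content"])  # the original, unmodified task
```

Running this shows the retry seeing `"New task:\nSay goodbye"` again, exactly the "modification is LOST" step in the flow above.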

Contributor Author

Because of this, we don't want to re-run the callbacks on each retry; we only want the callback to intervene on the initial call, not the retries (since rewriting messages mid-retry would interfere with smolagents' internal logic).

@njbrake njbrake marked this pull request as ready for review October 27, 2025 19:21
@njbrake
Contributor Author

njbrake commented Oct 27, 2025

@github-actions
Contributor

github-actions bot commented Nov 4, 2025

This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 3 days.

@github-actions github-actions bot added the Stale label Nov 4, 2025
@njbrake njbrake removed the Stale label Nov 4, 2025
@njbrake
Contributor Author

njbrake commented Nov 4, 2025

@github-actions
Contributor

This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 3 days.

@github-actions github-actions bot added Stale and removed Stale labels Nov 12, 2025
@github-actions
Contributor

This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 3 days.

@github-actions github-actions bot added the Stale label Nov 23, 2025
@github-actions
Contributor

This PR was closed because it has been stalled for 3 days with no activity.

@github-actions github-actions bot closed this Nov 28, 2025
@github-actions github-actions bot removed the Stale label Nov 29, 2025


Development

Successfully merging this pull request may close these issues.

Allow to access and modify the actual inputs before the underlying LLM call

4 participants