feat(ai): add onStepFinish continuation support for validation and retry #10507
Conversation
- Fix condition to add assistant response to responseMessages when stepContinueResult.continue is true
- Add test case to verify assistant message is included in continuation without tool calls
- Fix TypeScript errors by checking Array.isArray before calling .some() on content
```ts
  supportedUrls: await model.supportedUrls,
  download,
});
do {
```
In generateObject with onStepFinish continuation and multiple retry attempts, continuation feedback messages from earlier attempts are discarded instead of accumulated. As a result, the model only receives the latest feedback message, losing context from previous failed attempts.
📝 Patch Details
```diff
diff --git a/packages/ai/src/generate-object/generate-object.ts b/packages/ai/src/generate-object/generate-object.ts
index 53de2c6b6..eb3ed5d6e 100644
--- a/packages/ai/src/generate-object/generate-object.ts
+++ b/packages/ai/src/generate-object/generate-object.ts
@@ -335,7 +335,7 @@ functionality that can be fully encapsulated in the provider.
   const initialMessages = standardizedPrompt.messages;
   let currentMessages: Array<ModelMessage> = [...initialMessages];
-  let nextStepContinuationMessages: Array<ModelMessage> = [];
+  let accumulatedContinuationMessages: Array<ModelMessage> = [];
   let result: string;
   let finishReason: FinishReason;
@@ -353,12 +353,11 @@ functionality that can be fully encapsulated in the provider.
   do {
     attemptCount++;
-    // Combine initial messages with continuation messages
+    // Combine initial messages with accumulated continuation messages
     const stepInputMessages = [
       ...currentMessages,
-      ...nextStepContinuationMessages,
+      ...accumulatedContinuationMessages,
     ];
-    nextStepContinuationMessages = []; // Clear after use
     const promptMessages = await convertToLanguageModelPrompt({
       prompt: {
@@ -530,8 +529,10 @@ functionality that can be fully encapsulated in the provider.
       'continue' in onStepFinishResult
     ) {
       if (onStepFinishResult.continue === true) {
-        // Store continuation messages for the next step's input
-        nextStepContinuationMessages = onStepFinishResult.messages;
+        // Accumulate continuation messages for the next step's input
+        accumulatedContinuationMessages.push(
+          ...onStepFinishResult.messages,
+        );
         shouldContinue = true;
       }
       // continue: false means stop
```
Analysis
Continuation feedback messages discarded on retry attempts in generateObject
What fails: In generateObject with onStepFinish continuation and multiple retry attempts (maxRetries > 0), continuation feedback messages from earlier failed attempts are discarded instead of accumulated. The model receives only the latest feedback message, losing context from previous validation failures.
How to reproduce:
When using generateObject with:
- A schema that fails validation on the first attempt
- An `onStepFinish` callback that returns `continue: true` with feedback messages
- Multiple retry attempts before success
The expected behavior is:
Attempt 1: [initial_prompt] → invalid
Attempt 2: [initial_prompt + feedback_1] → invalid
Attempt 3: [initial_prompt + feedback_1 + feedback_2] → success
Actual behavior (before fix):
Attempt 1: [initial_prompt] → invalid
Attempt 2: [initial_prompt + feedback_1] → invalid
Attempt 3: [initial_prompt + feedback_2] ← feedback_1 is LOST
Root cause: In packages/ai/src/generate-object/generate-object.ts, lines 338, 363, and 534:
- Line 338: `nextStepContinuationMessages` is initialized as an empty array
- Line 363 (old): `nextStepContinuationMessages = []` clears the messages after each use
- Line 534 (old): `nextStepContinuationMessages = onStepFinishResult.messages` replaces instead of accumulating
This caused messages to be replaced rather than accumulated across iterations, losing earlier feedback.
The fix: Changed the logic to accumulate continuation messages across attempts:
- Renamed `nextStepContinuationMessages` to `accumulatedContinuationMessages` for clarity
- Removed the line that cleared messages after use
- Changed line 534 from assignment to `push()` to accumulate messages: `accumulatedContinuationMessages.push(...onStepFinishResult.messages)`
This ensures the model receives full context from all previous validation failures, matching the pattern used in generateText with responseMessages accumulation and aligning with the documented behavior that "messages injected via continuation are added to message history."
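For concreteness, here is a minimal sketch of the usage pattern this fix affects, assuming the `StepContinueResult` shape described in this PR; the schema, feedback text, and the `step.text` field read in the callback are illustrative assumptions:

```ts
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: z.object({
    sku: z.string().regex(/^[A-Z]{3}-\d{4}$/),
    quantity: z.number().int().positive(),
  }),
  prompt: 'Extract the SKU and quantity from: "ship three units of abc-0042".',
  onStepFinish: step => {
    // Illustrative check on the raw step output (field name assumed):
    // a lowercase SKU like "abc-0042" fails the schema's regex.
    if (/[a-z]{3}-\d{4}/.test(step.text)) {
      // With accumulation, this feedback stays in the prompt for every
      // later attempt instead of being overwritten by newer feedback.
      return {
        continue: true,
        messages: [
          {
            role: 'user' as const,
            content: 'The SKU must be uppercase (e.g. ABC-0042). Regenerate the JSON.',
          },
        ],
      };
    }
  },
});
```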
Hmm, I see this is a legit problem, but it's probably best to give the developer control over the message history. The ideal solution would allow for accumulation, no accumulation, or direct control over the message history.
This highlights something else. Is it desirable to have a message history that preserves the validation errors? Illustration below:
Options

Current Behavior:

Message History:
- UserMessage: ...
- AssistantMessage: <-- validation success

Detailed History With Messages:

Message History:
- UserMessage: ...
- AssistantMessage: <-- validation failed
- UserMessage: "You failed bc of XYZ. Regenerate." <-- validation failure feedback
- AssistantMessage: <-- validation success

Detailed History With Tool Injection:

Message History:
- UserMessage: ...
- ToolMessage: <-- validation failure feedback with original message
- AssistantMessage: <-- validation success

Of these choices, it's probably best to consider tool injection so that we can preserve everything! However, I'm biased toward my own use cases, where I'm trying to be token efficient and to support models without tool-calling prowess.
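For concreteness, the tool-injection variant expressed as `ModelMessage` objects (a sketch assuming AI SDK v5 message shapes; the tool name and payloads are invented, and a preceding assistant tool-call is included since tool messages must answer one):

```ts
import type { ModelMessage } from 'ai';

// Hypothetical history that preserves the failed attempt and its feedback
// as a tool round-trip rather than as a plain user feedback message.
const history: ModelMessage[] = [
  { role: 'user', content: 'Draft an SMS reminder.' },
  {
    role: 'assistant',
    content: [
      {
        type: 'tool-call',
        toolCallId: 'review-1',
        toolName: 'reviewMessage', // invented validator tool
        input: { draft: '**Reminder:** appt at 3pm' }, // original failed message
      },
    ],
  },
  {
    role: 'tool',
    content: [
      {
        type: 'tool-result',
        toolCallId: 'review-1',
        toolName: 'reviewMessage',
        output: {
          type: 'text',
          value: 'Failed: markdown characters are not allowed in SMS.',
        },
      },
    ],
  },
  { role: 'assistant', content: 'Reminder: your appointment is at 3pm.' }, // validation success
];
```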
TL;DR in code
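(The original tl;dr snippet didn't survive the page capture; below is a hedged reconstruction based on the API this PR describes - the validator, model, and prompt are illustrative.)

```ts
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

// Illustrative SMS validator: no markdown characters, stay under 160 chars.
const validateSms = (text: string): string | null => {
  if (/[*_`#>\[\]]/.test(text)) return 'Output contained markdown characters.';
  if (text.length > 160) return 'Output exceeded 160 characters.';
  return null;
};

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Draft an SMS reminding the customer of their 3pm appointment.',
  onStepFinish: step => {
    const error = validateSms(step.text);
    if (error !== null) {
      // StepContinueResult: retry, injecting feedback as the next user step.
      return {
        continue: true,
        messages: [
          {
            role: 'user' as const,
            content: `${error} Rewrite as plain text, under 160 characters.`,
          },
        ],
      };
    }
    // Returning undefined keeps the default behavior.
  },
});
```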
Background
My original motivation can be simplified into the above tl;dr snippet. I wanted to generate SMS text messages that don't break basic assumptions, like content length, or contain keywords or special characters you would never see in a real text message. For example, I wrote a regex that never let the model emit markdown characters in the output, since I can reasonably say we shouldn't text that. I even had some validators checking for Chinese characters, since some models occasionally generate them.

The way I implemented it was with a "reviewMessage" tool call that returned pass or fail with reasons. But the model had to generate a tool call for this! Sometimes the tool call's output didn't match the resulting text, so I had a final validator outside the agent loop, which meant I just had to retry the whole function without feedback. It costs more and is more complex to use a tool call for something that should be deterministic. Of course, I could inject a tool call for this, but while implementing it, it smelled wrong.
Detailed Thought Process / Notes
So, I want a custom piece of code that decides whether it's okay to continue during `onStepFinish` and that also carries feedback on why we failed. Why within `onStepFinish`? I considered options like `stopWhen`, where we could stop once we have a successfully validated message, and using `prepareStep` to inject feedback. In the end, it was more intuitive for me to use `onStepFinish` since we're adjusting control flow. Perhaps a new function like `onStepContinue` would've made more sense here rather than extending `onStepFinish`.

Implementing the feature for `generateText` wasn't too challenging, but I realized we had to support streaming too. Then the UI dependencies. Then `generateObject` and `experimental_output`.

The "gotcha" for implementing continuation for `streamText` in the UI was that if a text finishes streaming and then fails the `onStepFinish` validation, we have to restart the request, which looks like the message just disappears after completion. I think this is appropriate, but I wasn't sure how to give developers control over this operation, so I added a "clear step on retry" option, `experimental_clearStep`, so that the default behavior of clearing can be disabled.

The best part of this implementation, in my opinion, is that it sets us up for validators with feedback on objects as well. If an object can't be generated due to complex schema failures - such as zod validations - the developer controls how that information is brought to the model via the messages array on `onStepFinish`'s `StepContinueResult`.
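A sketch of the `experimental_clearStep` control mentioned above, assuming the flag on `StepContinueResult` as this PR describes it (model and prompt are illustrative):

```ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = streamText({
  model: openai('gpt-4o'),
  prompt: 'Draft an SMS confirming the delivery window.',
  onStepFinish: step => {
    if (step.text.length > 160) {
      return {
        continue: true,
        // Default is true: the failed step is cleared from the UI stream
        // before the retry. Set to false to keep the failed attempt visible.
        experimental_clearStep: false,
        messages: [
          {
            role: 'user' as const,
            content: 'Too long. Rewrite under 160 characters.',
          },
        ],
      };
    }
  },
});
```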
Summary

This PR adds support for returning a `StepContinueResult` from the `onStepFinish` callback in `generateText`, `streamText`, and `generateObject`.

Key Changes
- onStepFinish callback update: the callback can now return a `StepContinueResult` object (see the sketch after this list).
  - `{ continue: true, messages: [...] }`: continues the generation loop, injecting the provided messages (feedback/corrections) as the next user step.
  - `{ continue: false }`: stops the generation loop immediately, even if there are pending tool calls.
  - `void`/`undefined`: default behavior (continues if tool calls exist, stops otherwise).
- `generateText` & `streamText`: in `streamText`, when a retry happens (continuation), the previous failed step needs to be handled in the UI. Added `experimental_clearStep` (default: `true`) to `StepContinueResult`, which lets developers control whether the "failed" step is cleared from the UI stream before the retry starts, so the user doesn't confusingly watch the invalid attempt disappear unless that's desired.
- `generateObject`: the generated object can be validated in `onStepFinish`, returning a continuation with a specific error message to guide the model to fix the JSON.
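Compactly, the callback contract described above (a sketch; the type is as this PR presents it, not a published API):

```ts
import type { ModelMessage } from 'ai';

// Possible return values of onStepFinish under this PR:
type StepContinueResult =
  | {
      continue: true;
      // Injected as the next user step's input.
      messages: Array<ModelMessage>;
      // streamText only: clear the failed step from the UI before retrying.
      experimental_clearStep?: boolean; // default: true
    }
  | { continue: false };

// Returning void/undefined keeps the default behavior:
// continue if tool calls exist, stop otherwise.
```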
Manual Verification

Verified locally using the following test apps and test cases:
- `generateText` & `streamText`: confirmed that returning `continue: true` triggers a new generation step with the provided feedback messages.
- `generateObject`: verified that schema validation errors can be caught in `onStepFinish` and used to trigger a retry, successfully correcting the output in subsequent steps.
- `experimental_clearStep` behavior: when `true` (default), the invalid step is cleared from the UI before the new stream starts.
- `stream-text-continuation.test.ts` (10 tests passed), covering retry limits, message injection, and stop conditions.
- `generate-object.test.ts` (33 tests passed), verifying retry-on-validation logic.
- `packages/ai`: 88 test files passed.
- Test apps: `examples/next-openai/app/test-on-step-finish-continuation/` and `examples/next-openai/app/test-object-continuation/`

Checklist
- Changeset added (`pnpm changeset` in the project root)
Future Work

It's worth considering the comment I made below about how we handle message history for messages that fail validation: #10507 (comment)
Screenshots
This is `test-on-step-finish-continuation` with the clear step disabled, so both messages are shown (the highlighted one is the resulting message). If the clear step is enabled (default), only the highlighted message remains.

This is `test-object-continuation/` showing a failed validation step with feedback.