-
Notifications
You must be signed in to change notification settings - Fork 25.3k
[ML] Refactoring streaming error handling #131316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Refactoring streaming error handling #131316
Conversation
toRestStatus(responseStatusCode) | ||
); | ||
} | ||
|
||
protected String errorMessage(String message, Request request, HttpResult result, ErrorResponse errorResponse, int statusCode) { | ||
public static String constructErrorMessage(String message, Request request, ErrorResponse errorResponse, int statusCode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to public static
so that the ChatCompletionErrorResponseHandler
can access it.
@@ -95,7 +95,7 @@ public void validateResponse( | |||
|
|||
protected abstract void checkForFailureStatusCode(Request request, HttpResult result); | |||
|
|||
private void checkForErrorObject(Request request, HttpResult result) { | |||
protected void checkForErrorObject(Request request, HttpResult result) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be able to override this if we want to use a different error class. This PR adds UnifiedChatCompletionErrorResponse
which pulls the common fields for constructing a UnifiedChatCompletionException
into a single place.
this.unifiedChatCompletionErrorParser = Objects.requireNonNull(errorParser); | ||
} | ||
|
||
public void checkForErrorObject(Request request, HttpResult result) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this from BaseResponseHandler
and added different parsing logic. I'm open to other ideas of how to solve this though.
var errorMessage = BaseResponseHandler.constructErrorMessage(message, request, errorResponse, statusCode); | ||
var restStatus = toRestStatus(statusCode); | ||
|
||
if (errorResponse.errorStructureFound()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By using UnifiedChatCompletionErrorResponse
we no longer need the instanceof
checks.
@@ -0,0 +1,171 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we transition all the chat completion response handlers to use this class we can remove the methods that are similar within BaseResponseHandler
.
* @param errorResponse the parsed error response from the HTTP result | ||
* @return an instance of {@link UnifiedChatCompletionException} with details from the error response | ||
*/ | ||
private UnifiedChatCompletionException buildChatCompletionErrorInternal( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functionality here should be very similar to buildError
method of GoogleVertexAiUnifiedChatCompletionResponseHandler.java
except that it doesn't need to do the instanceof
checks.
@@ -22,7 +22,7 @@ public ErrorResponse(String errorMessage) { | |||
this.errorStructureFound = true; | |||
} | |||
|
|||
private ErrorResponse(boolean errorStructureFound) { | |||
protected ErrorResponse(boolean errorStructureFound) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing to protected
because UnifiedChatCompletionErrorResponse
needs to call it.
import java.util.Objects; | ||
|
||
public class UnifiedChatCompletionErrorResponse extends ErrorResponse { | ||
public static final UnifiedChatCompletionErrorResponse UNDEFINED_ERROR = new UnifiedChatCompletionErrorResponse(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you can think of a better to do this let me know. The issue here is that in situations where we fail to parse the error response we return this generic error response. Typically we return ErrorResponse.UNDEFINED_ERROR
but we need it to be a UnifiedChatCompletionErrorResponse
for the typing to be correct.
try ( | ||
XContentParser parser = XContentFactory.xContent(XContentType.JSON) | ||
.createParser(XContentParserConfiguration.EMPTY, response.body()) | ||
) { | ||
return ERROR_PARSER.apply(parser, null).orElse(ErrorResponse.UNDEFINED_ERROR); | ||
return ERROR_PARSER.apply(parser, null).orElse(UnifiedChatCompletionErrorResponse.UNDEFINED_ERROR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where the UnifiedChatCompletionErrorResponse.UNDEFINED_ERROR
is used. We can't use ErrorResponse.UNDEFINED_ERROR
here because it doesn't satisfy the UnifiedChatCompletionErrorResponse
return type (which is needed for the UnifiedChatCompletionErrorParser
interface).
Pinging @elastic/ml-core (Team:ML) |
This PR is an iteration of this one: #128923
Currently when we add new integrations that support chat completion streaming we have to duplicate some of the error handling code. This PR aims to reduce some of the duplication and move the logic to a new class
ChatCompletionErrorResponseHandler
. This solution uses composition. Ideally we'd push the common logic into theBaseResponseHandler
but I feel like it is confusing since that class was originally only handling non-streaming logic. This solution separates the chat completion error handling logic from the other implementations (likecompletion
and other task types).I'm open to other ideas for simplifying things.
This PR only refactors the google vertex ai chat completion implementation to give an example of what we'd need to do for the other providers.
I did some manual testing with these requests to generate some errors:
That should generate an error like: