
extproc: custom processors per path and serve /v1/models #325

Merged: 2 commits, Feb 12, 2025

Conversation

@nacx (Contributor) commented Feb 11, 2025:

Commit Message

extproc: custom processors per path and serve /v1/models

Refactors the server processing to allow registering custom Processors for different request paths,
and adds a custom processor for requests to /v1/models that returns an immediate response based
on the models that are configured in the filter configuration.

Related Issues/PRs (if applicable)

Related discussion: #186

Special notes for reviewers (if applicable)

Opening as a draft to discuss the proposed approach.

In order to return a direct response for some paths, we need to be able to customize the behaviour. Instead of complicating the translator, this PR refactors the Server to allow defining custom processors per request path. This will make it easier to extend the filter's functionality in the future, as more endpoints are supported.

This is used in the current PR to serve the /v1/models endpoint, returning a computed list of models based on the configured routes. Right now we take advantage of exact header matching being the only option, but this should be good enough for a first implementation.

The current PR doesn't add a feature flag to enable/disable this endpoint handling, but it should be easy to add one if we agree the approach proposed in this PR is good.
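As a rough sketch of the per-path registration idea (all names and paths below are illustrative, not the PR's actual API):

```go
package main

import "fmt"

// Processor is a minimal stand-in for the PR's ProcessorIface; each request
// path gets its own implementation.
type Processor interface{ Name() string }

type chatCompletionProcessor struct{}

func (chatCompletionProcessor) Name() string { return "chat-completions" }

type modelsProcessor struct{}

func (modelsProcessor) Name() string { return "models" }

// processorFactories maps request paths to processor constructors. The exact
// registration mechanism in the PR may differ; this shows the shape of the idea.
var processorFactories = map[string]func() Processor{
	"/v1/chat/completions": func() Processor { return chatCompletionProcessor{} },
	"/v1/models":           func() Processor { return modelsProcessor{} },
}

// processorForPath picks the processor registered for a request path.
func processorForPath(path string) (Processor, error) {
	factory, ok := processorFactories[path]
	if !ok {
		return nil, fmt.Errorf("no processor registered for path %q", path)
	}
	return factory(), nil
}

func main() {
	p, err := processorForPath("/v1/models")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Name()) // prints "models"
}
```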

@@ -89,10 +89,12 @@ func Main() {
log.Fatalf("failed to listen: %v", err)
}

server, err := extproc.NewServer[*extproc.Processor](l, extproc.NewProcessor)
server, err := extproc.NewServer(l)
nacx (Contributor, author):

The generics were tying the server to a concrete processor type, but now that custom processors can be defined per request path, they no longer make sense. I've removed them and kept the code using the ProcessorIface.

)

// NewChatCompletionProcessor creates a new processor for the chat completion requests
func NewChatCompletionProcessor(config *processorConfig, logger *slog.Logger) ProcessorIface {
@nacx (Contributor, author) commented Feb 11, 2025:

This is the Processor object that I've renamed to ChatCompletionProcessor, as we can now have different processors per request path. The contents are a verbatim copy (modulo the type name change).

@@ -125,6 +160,17 @@ func (s *Server[P]) process(p P, stream extprocv3.ExternalProcessor_ProcessServe
return status.Errorf(codes.Unknown, "cannot receive stream request: %v", err)
}

// If we're processing the request headers, read the :path header to instantiate the
// right processor.
if headers := req.GetRequestHeaders().GetHeaders(); headers != nil {
nacx (Contributor, author):

This is the main change in this PR.

In order to allow custom processors per request path, we need to access the :path header. We can't instantiate the ProcessorIface object until we've read the first request-headers message. This is why I instantiate the processor here and reuse it for the following messages.
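The pattern described above, with the Envoy stream types simplified to plain structs for illustration, might look roughly like this:

```go
package main

import (
	"errors"
	"fmt"
)

// message is a simplified stand-in for Envoy's ProcessingRequest: only the
// request-headers message on a stream carries headers (and thus :path).
type message struct {
	headers map[string]string // nil for body/trailer messages
}

type processor struct{ path string }

// process mirrors the loop sketched in the PR: the processor is chosen when
// the first request-headers message arrives and reused for the rest of the
// stream. Names and shapes here are assumptions, not the PR's exact code.
func process(stream []message) (*processor, error) {
	var p *processor
	for _, m := range stream {
		// Only the RequestHeaders message carries headers, so this branch
		// runs once per stream and p is reused afterwards.
		if m.headers != nil {
			p = &processor{path: m.headers[":path"]}
		}
		if p == nil {
			return nil, errors.New("first message carried no request headers")
		}
		// ...dispatch m to p's header/body/trailer handlers here...
	}
	return p, nil
}

func main() {
	p, err := process([]message{
		{headers: map[string]string{":path": "/v1/models"}},
		{}, // a body message; the same processor instance is reused
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(p.path) // prints "/v1/models"
}
```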

@nacx nacx changed the title extproc: handle requests to /v1/models based on the declared models in the filter config extproc: custom processors per path and serve /v1/models Feb 11, 2025
@mathetake (Member) left a comment:

I think I like the general direction! One question: then we don't need translator.Factory, but maybe instead need concrete translator.NewChatCompletionTranslator, etc., since translator.Factory currently takes the path as an input?

@nacx (Contributor, author) commented Feb 11, 2025:

I think that's correct. We don't need the factory as it is today, but we still need to let the processor decide which translator to use, depending on the selected backend. I can refactor and clean that up in a follow-up PR, if that makes sense?

@mathetake (Member):
Yeah, let's do a follow-up refactor; that makes sense.

@nacx marked this pull request as ready for review February 11, 2025 23:35
@nacx requested a review from a team as a code owner February 11, 2025 23:35
@nacx force-pushed the models branch 2 times, most recently from 125b6bc to 9fa41ef on February 12, 2025 09:16
@mathetake (Member) left a comment:

Only nits!!!

@@ -251,3 +298,17 @@ func filterSensitiveBody(resp *extprocv3.ProcessingResponse, logger *slog.Logger
}
return filteredResp
}

func getHeader(headers *corev3.HeaderMap, name string) string {
@mathetake (Member) commented Feb 12, 2025:

nit: so basically I see we (unnecessarily) iterate through all headers twice, here and in headersToMap. Would it be possible to use headersToMap before the processor selection and pass the constructed map to processorForPath and eventually to newProcessor?

nacx (Contributor, author):

I thought about this, but was unsure about modifying the processor factory methods to accept a request-headers map. I can change that if you feel it's OK.

@mathetake (Member):

Yeah, let's make that change.
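The change agreed on above (building the header map once, before processor selection) could be sketched as follows, with the Envoy header types simplified for illustration:

```go
package main

import "fmt"

// header mimics the key/value shape of Envoy's corev3.HeaderValue
// for this sketch.
type header struct {
	key, rawValue string
}

// headersToMap builds the lookup map in a single pass, so that later lookups
// (like reading :path during processor selection) don't rescan the slice.
func headersToMap(hs []header) map[string]string {
	m := make(map[string]string, len(hs))
	for _, h := range hs {
		m[h.key] = h.rawValue
	}
	return m
}

func main() {
	m := headersToMap([]header{
		{":path", "/v1/models"},
		{":method", "GET"},
	})
	fmt.Println(m[":path"]) // prints "/v1/models"
}
```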

// configuration.
// Since it returns an immediate response after processing the headers, the rest of the methods of the
// ProcessorIface are not implemented. Those should never be called.
type ModelsProcessor struct {
@mathetake (Member):

nit: does this need to be exported?
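For context, the immediate response such a models processor builds from the configured model names might look like this sketch (assuming the OpenAI-style list shape; not the PR's exact code):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// model follows the OpenAI /v1/models list-entry shape.
type model struct {
	ID     string `json:"id"`
	Object string `json:"object"`
}

// modelsResponse renders the immediate-response body from the model names
// declared in the filter configuration.
func modelsResponse(configured []string) ([]byte, error) {
	list := struct {
		Object string  `json:"object"`
		Data   []model `json:"data"`
	}{Object: "list"}
	for _, name := range configured {
		list.Data = append(list.Data, model{ID: name, Object: "model"})
	}
	return json.Marshal(&list)
}

func main() {
	body, err := modelsResponse([]string{"some-model"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
	// prints {"object":"list","data":[{"id":"some-model","object":"model"}]}
}
```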

@nacx (Contributor, author) commented Feb 12, 2025:

OK, I've addressed all the review comments and rebased onto the latest main. I squashed the commits because otherwise there were too many conflicts to resolve; I hope that doesn't complicate the final review too much.

// If we're processing the request headers, read the :path header to instantiate the
// right processor.
if headers := req.GetRequestHeaders().GetHeaders(); headers != nil {
p, err = s.processorForPath(headersToMap(headers))
@mathetake (Member):

I think this will create a new p each time the gRPC stream receives a new message? Maybe this block should be conditioned on if p != nil { ... }?

@mathetake (Member):

Ah, headers := req.GetRequestHeaders().GetHeaders(); headers != nil implicitly does that. I would at least like a comment on this, though.

@nacx (Contributor, author) commented Feb 12, 2025:

The req.GetRequestHeaders() will only be non-nil when it is a RequestHeaders message, so I think this will be executed only once? Or does ext_proc send N messages of type ProcessingRequest_RequestHeaders for the same request?

func (x *ProcessingRequest) GetRequestHeaders() *HttpHeaders {
	if x, ok := x.GetRequest().(*ProcessingRequest_RequestHeaders); ok {
		return x.RequestHeaders
	}
	return nil
}

nacx (Contributor, author):

I'll add a comment :)

nacx (Contributor, author):

Done

Signed-off-by: Ignasi Barrera <[email protected]>
@mathetake (Member) left a comment:

LGTM!

@mathetake mathetake merged commit f07a7ff into envoyproxy:main Feb 12, 2025
17 checks passed
@nacx nacx deleted the models branch February 12, 2025 22:04
mathetake pushed a commit that referenced this pull request Feb 13, 2025
**Commit Message**

extproc: remove the path from the translator factory

Removes the path from the translator factory, now that there is a
dedicated processor for the chat completion endpoint.

**Related Issues/PRs (if applicable)**

Follow-up for:
#325 (review)

**Special notes for reviewers (if applicable)**

Note that I don't remove the `Factory` type completely so that only the
right translator is instantiated and only when needed.

---------

Signed-off-by: Ignasi Barrera <[email protected]>
mathetake pushed a commit that referenced this pull request Feb 13, 2025
**Commit Message**

e2e: add test for the models endpoint using the openai client

**Related Issues/PRs (if applicable)**

Related to #325

**Special notes for reviewers (if applicable)**

N/A

---------

Signed-off-by: Ignasi Barrera <[email protected]>
mathetake added a commit that referenced this pull request Feb 17, 2025
**Commit Message**

This removes the unnecessary suffix "Iface" from extproc.ProcessorIface
interface.

**Related Issues/PRs (if applicable)**

Follow-up on #325

Signed-off-by: Takeshi Yoneda <[email protected]>
mathetake added a commit that referenced this pull request Feb 18, 2025
**Commit Message**

extproc: decouple router package from api paths

This is a follow-up change to decouple the router package from the API
paths. Now that we have specialized processors per API path it does not
make sense for the router package to have path-related logic.
Now each processor is responsible for instantiating the right body
parser for the path they're processing.

**Related Issues/PRs (if applicable)**

Follow-up for #325 and
#334

**Special notes for reviewers (if applicable)**

N/A

---------

Signed-off-by: Ignasi Barrera <[email protected]>
Co-authored-by: Takeshi Yoneda <[email protected]>