
handle picking multiple destinations in scheduling layer #1059


Merged
merged 4 commits into kubernetes-sigs:main from the multiple-dest branch on Jul 8, 2025

Conversation

nirrozenbaum
Contributor

@nirrozenbaum nirrozenbaum commented Jun 24, 2025

This PR implements the multiple destination handling in the scheduling layer, as defined in the EPP protocol:
https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/docs/proposals/004-endpoint-picker-protocol#destination-endpoint

Please note that this PR updates only the scheduling layer.
As a follow-up, we should update the request-control layer and the handlers to use the multiple returned endpoints according to the protocol.
To keep this PR tightly scoped, the director continues to use a single targetPod, which preserves the current functionality.
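
To illustrate what "picking multiple destinations" can look like in a scheduling layer, here is a minimal sketch of a max-score style picker that returns up to N endpoints instead of one. The `ScoredPod` type, the `pickTopN` helper, and the `maxEndpoints` parameter are hypothetical names chosen for this example; they are not the types or signatures used in this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// ScoredPod is a hypothetical stand-in for a scheduler's scored-pod type.
type ScoredPod struct {
	Name  string
	Score float64
}

// pickTopN returns up to maxEndpoints pods ordered by descending score.
// It is an illustrative helper, not the project's picker implementation.
func pickTopN(pods []ScoredPod, maxEndpoints int) []ScoredPod {
	if maxEndpoints <= 0 || len(pods) == 0 {
		return nil
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := make([]ScoredPod, len(pods))
	copy(sorted, pods)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Score > sorted[j].Score
	})
	if maxEndpoints > len(sorted) {
		maxEndpoints = len(sorted)
	}
	return sorted[:maxEndpoints]
}

func main() {
	pods := []ScoredPod{{"pod-a", 0.4}, {"pod-b", 0.9}, {"pod-c", 0.7}}
	for _, p := range pickTopN(pods, 2) {
		fmt.Println(p.Name, p.Score)
	}
}
```

A caller that still assumes a single target, as the director does in this PR, could simply take the first element of the returned slice.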

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 24, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nirrozenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 24, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 24, 2025

netlify bot commented Jun 24, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 951b5af
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/686a16a377bb4800087e7b43
😎 Deploy Preview https://deploy-preview-1059--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 25, 2025
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 25, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 30, 2025
@nirrozenbaum nirrozenbaum changed the title [WIP] handle picking multiple destinations handle picking multiple destinations Jun 30, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 30, 2025
@nirrozenbaum nirrozenbaum changed the title handle picking multiple destinations handle picking multiple destinations in scheduling layer Jun 30, 2025
@nirrozenbaum
Contributor Author

cc @kfswain @ahg-g

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 3, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 3, 2025
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 5, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 6, 2025
@@ -238,7 +238,8 @@ func (d *Director) prepareRequest(ctx context.Context, reqCtx *handlers.RequestC
return reqCtx, errutil.Error{Code: errutil.Internal, Msg: "results must be greater than zero"}
}
// primary profile is used to set destination
targetPod := result.ProfileResults[result.PrimaryProfileName].TargetPod.GetPod()
// TODO should use multiple destinations according to epp protocol. current code assumes a single target
Collaborator


Can we bind an issue to this TODO?

Contributor Author


We do have #414, which covers this point. The current PR implements the first half (the scheduling part), and the next PR should mark the issue as completed (handling request control + handlers).
A new issue would be a duplicate.
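
As a rough illustration of the follow-up work discussed in this thread, the sketch below turns a list of target pods into a single destination value. It assumes the destination is expressed as a comma-separated list of `ip:port` entries; that format, the `Pod` type, and the `destinationValue` helper are assumptions made for this example, so check the linked protocol document for the exact encoding.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

// Pod is a hypothetical stand-in for the scheduler's pod type.
type Pod struct {
	Address string
	Port    string
}

// destinationValue joins all target pods into one "ip:port,ip:port" string.
// The comma-separated format is an assumption made for this sketch.
func destinationValue(targets []Pod) (string, error) {
	if len(targets) == 0 {
		return "", errors.New("at least one target pod is required")
	}
	endpoints := make([]string, 0, len(targets))
	for _, p := range targets {
		endpoints = append(endpoints, net.JoinHostPort(p.Address, p.Port))
	}
	return strings.Join(endpoints, ","), nil
}

func main() {
	targets := []Pod{{"10.0.0.1", "8000"}, {"10.0.0.2", "8000"}}
	value, err := destinationValue(targets)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(value)
}
```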

 // RandomPickerFactory defines the factory function for RandomPicker.
-func RandomPickerFactory(name string, _ json.RawMessage, _ plugins.Handle) (plugins.Plugin, error) {
-	return NewRandomPicker().WithName(name), nil
+func RandomPickerFactory(name string, rawParameters json.RawMessage, _ plugins.Handle) (plugins.Plugin, error) {
Collaborator


I'm wondering if we want to parse the json before here and just push everything into an unstructured map.

It's out of scope of this PR, definitely, and I do understand the issue that a complex object would still need its type inferred... But it feels strange to do json parsing in every factory. Is there a reason that I missed that it was done this way?

Contributor Author

@nirrozenbaum nirrozenbaum Jul 8, 2025


Same answer as in the other thread: it was initially implemented this way, and this is yet another point that could potentially be improved in the config API.
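
For context on the per-factory JSON parsing discussed in this thread, here is a minimal sketch of how a factory might decode its raw parameters with a default value. The `pickerParameters` struct, the `maxNumOfEndpoints` field name, the default of 1, and the `parsePickerParameters` helper are all assumptions for illustration, not the code from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// defaultMaxNumOfEndpoints is a hypothetical default used when no parameters are given.
const defaultMaxNumOfEndpoints = 1

// pickerParameters is a hypothetical parameter struct for a picker plugin.
type pickerParameters struct {
	MaxNumOfEndpoints int `json:"maxNumOfEndpoints"`
}

// parsePickerParameters decodes raw JSON parameters, falling back to the
// default when raw is empty and rejecting non-positive values.
func parsePickerParameters(raw json.RawMessage) (pickerParameters, error) {
	params := pickerParameters{MaxNumOfEndpoints: defaultMaxNumOfEndpoints}
	if len(raw) == 0 {
		return params, nil
	}
	if err := json.Unmarshal(raw, &params); err != nil {
		return pickerParameters{}, fmt.Errorf("failed to parse picker parameters: %w", err)
	}
	if params.MaxNumOfEndpoints <= 0 {
		return pickerParameters{}, fmt.Errorf("maxNumOfEndpoints must be positive, got %d", params.MaxNumOfEndpoints)
	}
	return params, nil
}

func main() {
	params, err := parsePickerParameters(json.RawMessage(`{"maxNumOfEndpoints": 3}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(params.MaxNumOfEndpoints)
}
```

Each factory decoding its own `json.RawMessage` keeps the parameter types local to the plugin, at the cost of repeating this boilerplate; parsing once into an unstructured map, as suggested above, would centralize it but lose the typed struct.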

@kfswain
Collaborator

kfswain commented Jul 8, 2025

/lgtm
/hold

Holding in case we want to spin up an issue to track the TODO work. Thanks, Nir!

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 8, 2025
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 8, 2025
@nirrozenbaum
Contributor Author

/unhold

Will take care of the next part under issue #414 (the current PR focused on scheduling and left the behavior of the other layers as is).

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 8, 2025
@k8s-ci-robot k8s-ci-robot merged commit 91e3047 into kubernetes-sigs:main Jul 8, 2025
9 checks passed
@kfswain
Collaborator

kfswain commented Jul 8, 2025

SGTM! Thanks Nir!

@nirrozenbaum nirrozenbaum deleted the multiple-dest branch July 9, 2025 05:25
EyalPazz pushed a commit to EyalPazz/gateway-api-inference-extension that referenced this pull request Jul 9, 2025
…sigs#1059)

* implement multiple destination as the output of the scheduler

Signed-off-by: Nir Rozenbaum <[email protected]>

* updated max score picker unit tests to cover multiple pods

Signed-off-by: Nir Rozenbaum <[email protected]>

* imports

Signed-off-by: Nir Rozenbaum <[email protected]>

* unit-test fix

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

3 participants