@LukeAVanDrie
Contributor

@LukeAVanDrie LukeAVanDrie commented Dec 8, 2025

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
This PR refactors the directory structure for the saturation control logic within the Endpoint Picker (EPP). It renames
pkg/epp/saturationdetector to pkg/epp/saturationcontrol and moves the current implementation to
framework/plugins/staticthresholdcontroller.

This change is a preparatory step to establish Saturation Control as a formal extension point within the EPP, allowing
for multiple implementations. The existing logic is now housed as the staticthresholdcontroller plugin, making way
for future additions like ConcurrencyController.

This PR contains only file moves and renames; no functional changes are introduced. Note: it is not aligned with the
EPP plugin system yet. This is a preparatory step only.

Which issue(s) this PR fixes:
Tracks #1405 (cc @nirrozenbaum) and #1793

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Dec 8, 2025
@netlify

netlify bot commented Dec 8, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 539d5c6
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/693899f68397ab00085d8613
😎 Deploy Preview https://deploy-preview-1976--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 8, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 8, 2025
@k8s-ci-robot
Contributor

Hi @LukeAVanDrie. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 8, 2025
@LukeAVanDrie LukeAVanDrie changed the title Refactor: Relocate SaturationDetector Refactor: Prepare EPP SaturationControl as an Extension Point Dec 8, 2025
@ahg-g
Contributor

ahg-g commented Dec 8, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 8, 2025
@LukeAVanDrie LukeAVanDrie force-pushed the refactor/saturation-control-dir-structure branch from a6a78b8 to 06c821f Compare December 9, 2025 01:07
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 9, 2025
@nirrozenbaum
Contributor

/approve

FWIW, I think we should draw on board (or paper :) ) the updated design with the updated extension points and make sure we're aligned on how we think EPP should end up.
This is not a comment about this PR but more of a general comment, mainly because there are multiple different threads that push changes in parallel into our pluggability story (cc: @kfswain @ahg-g).

will stamp with lgtm once the CI tests pass (should fix boilerplate headers).

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LukeAVanDrie, nirrozenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 9, 2025
@kfswain
Collaborator

kfswain commented Dec 9, 2025

/approve

> FWIW, I think we should draw on board (or paper :) ) the updated design with the updated extension points and make sure we're aligned on how we think EPP should end up. This is not a comment about this PR but more of a general comment, mainly because there are multiple different threads that push changes in parallel into our pluggability story (cc: @kfswain @ahg-g).
>
> will stamp with lgtm once the CI tests pass (should fix boilerplate headers).

I agree with this. There is another PR that is proposing to fundamentally change how plugins can operate, and I'm worried about the precedent that sets.

This commit moves the existing `SaturationDetector` implementation to
its new home under the EPP plugin framework structure. This is a
preparatory step towards making `SaturationControl` an official EPP
extension point.

- Renamed `pkg/epp/saturationdetector` to `pkg/epp/saturationcontrol`
- Moved files to `.../framework/plugins/staticthresholdcontroller`
- Renamed `saturationdetector.go` to `controller.go`
- Renamed `saturationdetector_test.go` to `controller_test.go`
- Fixed imports

No functional changes are included in this commit.
@LukeAVanDrie LukeAVanDrie force-pushed the refactor/saturation-control-dir-structure branch from 06c821f to 539d5c6 Compare December 9, 2025 21:51
@nirrozenbaum
Contributor

> I agree with this. There is another PR that is proposing to fundamentally change how plugins can operate, and I'm worried about the precedent that sets.

@kfswain not sure which other PR is the one you mentioned. from my point of view this PR can be stamped (it's just putting the existing saturation detector under a different directory).

leaving to Kellen the final stamp just to be on the safe side.

@LukeAVanDrie
Contributor Author

LukeAVanDrie commented Dec 10, 2025

> @kfswain not sure which other PR is the one you mentioned. from my point of view this PR can be stamped (it's just putting the existing saturation detector under a different directory).

He's referring to my other PR: #1977. I am working on the configuration and extensibility story for Flow Control. Flow Control has two potential extension points: interflow (fairness) and intraflow (ordering) policies. These, however, are scoped to priority bands and flows respectively, and in their current state they are not well supported by the singleton plugin model. I have a draft that supports a transient lifecycle for stateful, scoped plugin instances.

Shmuel recommended a stateless singleton approach that relies on state-passing. I am now exploring that instead, as it is more closely aligned with our existing plugin model and harder to abuse.
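
To make the trade-off concrete, here is a minimal sketch of what a stateless singleton with state-passing could look like. It is an illustration only, not the code in #1977: `BandState`, `statelessRoundRobin`, and the `SelectQueue` signature shown here are all hypothetical. The plugin has no fields, and the caller owns the per-band state and passes it in on every call.

```go
package main

import "fmt"

// BandState is hypothetical per-priority-band state. The caller owns it,
// so the plugin itself can stay a stateless singleton.
type BandState struct {
	Cursor int // index of the queue to try first on the next call
}

// statelessRoundRobin has no fields: one shared instance can serve every band.
type statelessRoundRobin struct{}

// SelectQueue returns the index of the next non-empty queue (or -1) together
// with the updated state, instead of mutating anything on the plugin.
// The signature is an assumption made for this sketch.
func (statelessRoundRobin) SelectQueue(queueLens []int, s BandState) (int, BandState) {
	for n := 0; n < len(queueLens); n++ {
		i := (s.Cursor + n) % len(queueLens)
		if queueLens[i] > 0 {
			return i, BandState{Cursor: (i + 1) % len(queueLens)}
		}
	}
	return -1, s
}

// runDemo drives one plugin instance with two independent band states.
func runDemo() []int {
	plugin := statelessRoundRobin{}
	high, low := BandState{}, BandState{}
	queueLens := []int{1, 1, 1} // three non-empty queues in each band

	var picks []int
	var i int
	i, high = plugin.SelectQueue(queueLens, high)
	picks = append(picks, i)
	i, high = plugin.SelectQueue(queueLens, high)
	picks = append(picks, i)
	// The low band starts from its own cursor, unaffected by the high band.
	i, low = plugin.SelectQueue(queueLens, low)
	picks = append(picks, i)
	return picks
}

func main() {
	fmt.Println(runDemo())
}
```

Because the cursor lives in `BandState` rather than on the plugin, the two bands advance independently even though they share one plugin instance.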


> FWIW, I think we should draw on board (or paper :) ) the updated design with the updated extension points and make sure we're aligned on how we think EPP should end up.

This will be a good exercise. I am currently working toward adding (or rather promoting) the following extension points for our Flow Control story:

  • InterFlowDispatchPolicy (exposing SelectQueue): the fairness policy for picking which flow gets the next dispatch opportunity
  • IntraFlowDispatchPolicy (exposing Compare): the temporal scheduling policy that is responsible for defining the comparator used to sort the requests within a flow's queue
  • SaturationController (exposing ShouldDispatch), formerly SaturationDetector

These then compose nicely into a flow control / scheduling regime.

  • Serve requests with Virtual Token Count (VTC) fairness (InterFlowDispatchPolicy) between flows ordered by Earliest Deadline First (IntraFlowDispatchPolicy) subject to a hard concurrency limit per pod (SaturationController) for pool saturation.
  • Serve requests with Round Robin fairness between flows ordered by FCFS subject to queue depth and kv-cache utilization thresholds for pool saturation.

And so on... These can all be mixed and matched.
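
As a rough sketch of how the three extension points might compose into one dispatch step: the method names (`SelectQueue`, `Compare`, `ShouldDispatch`) come from the list above, but every signature, the `Request` and `FlowQueue` types, and the three toy implementations are assumptions made for illustration, not the EPP's actual API.

```go
package main

import (
	"fmt"
	"sort"
)

// Request and FlowQueue are hypothetical types for this sketch only.
type Request struct {
	ID       string
	Deadline int64 // smaller means more urgent
}

type FlowQueue struct {
	FlowID   string
	Requests []Request
}

// The three extension points. All signatures here are assumptions.
type InterFlowDispatchPolicy interface {
	SelectQueue(queues []FlowQueue) int // index of the flow to serve, or -1
}

type IntraFlowDispatchPolicy interface {
	Compare(a, b Request) bool // true if a should be dispatched before b
}

type SaturationController interface {
	ShouldDispatch(inFlight int) bool
}

// roundRobin is a toy fairness policy that cycles over non-empty flows.
type roundRobin struct{ next int }

func (r *roundRobin) SelectQueue(queues []FlowQueue) int {
	for n := 0; n < len(queues); n++ {
		i := (r.next + n) % len(queues)
		if len(queues[i].Requests) > 0 {
			r.next = i + 1
			return i
		}
	}
	return -1
}

// earliestDeadlineFirst is a toy ordering policy.
type earliestDeadlineFirst struct{}

func (earliestDeadlineFirst) Compare(a, b Request) bool { return a.Deadline < b.Deadline }

// concurrencyLimit is a toy saturation controller with a hard in-flight cap.
type concurrencyLimit struct{ limit int }

func (c concurrencyLimit) ShouldDispatch(inFlight int) bool { return inFlight < c.limit }

// dispatchOne checks saturation, picks a flow, then picks a request within it.
func dispatchOne(queues []FlowQueue, inter InterFlowDispatchPolicy,
	intra IntraFlowDispatchPolicy, sat SaturationController, inFlight int) (Request, bool) {
	if !sat.ShouldDispatch(inFlight) {
		return Request{}, false
	}
	i := inter.SelectQueue(queues)
	if i < 0 {
		return Request{}, false
	}
	q := queues[i].Requests
	sort.SliceStable(q, func(a, b int) bool { return intra.Compare(q[a], q[b]) })
	head := q[0]
	queues[i].Requests = q[1:]
	return head, true
}

// runDemo dispatches until the controller or the queues say stop.
func runDemo() []string {
	queues := []FlowQueue{
		{FlowID: "A", Requests: []Request{{ID: "a1", Deadline: 30}, {ID: "a2", Deadline: 10}}},
		{FlowID: "B", Requests: []Request{{ID: "b1", Deadline: 20}}},
	}
	inter, intra, sat := &roundRobin{}, earliestDeadlineFirst{}, concurrencyLimit{limit: 2}
	var dispatched []string
	for inFlight := 0; ; inFlight++ {
		req, ok := dispatchOne(queues, inter, intra, sat, inFlight)
		if !ok {
			break
		}
		dispatched = append(dispatched, req.ID)
	}
	return dispatched
}

func main() {
	fmt.Println(runDemo())
}
```

With these toy policies the loop dispatches `a2` (the earliest deadline in flow A), then `b1` (flow B's turn), and then stops because the concurrency limit of 2 is reached. Swapping any one implementation leaves the other two untouched, which is the mix-and-match property described above.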


I figured it was generally agreed that we would start this for saturation detection, given #1405. I plan on doing exactly what's in the bug: promoting our simple heuristic-based detector to the default implementation, then adding a new in-tree concurrency-limit-based controller as a follow-up.
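
For illustration, the two controllers could sit behind one interface roughly as follows. This is a sketch under assumptions: `PodMetrics`, both structs, the `ShouldDispatch` argument type, and the rule that the pool is saturated only when no pod has spare capacity are all hypothetical simplifications, not the existing detector's code.

```go
package main

import "fmt"

// PodMetrics is a hypothetical snapshot of one pod's load signals.
type PodMetrics struct {
	QueueDepth  int
	KVCacheUtil float64 // fraction in [0, 1]
	InFlight    int
}

// SaturationController is the extension point; the argument type is an assumption.
type SaturationController interface {
	ShouldDispatch(pods []PodMetrics) bool
}

// staticThreshold allows dispatch if at least one pod is within both thresholds.
type staticThreshold struct {
	MaxQueueDepth  int
	MaxKVCacheUtil float64
}

func (c staticThreshold) ShouldDispatch(pods []PodMetrics) bool {
	for _, p := range pods {
		if p.QueueDepth <= c.MaxQueueDepth && p.KVCacheUtil <= c.MaxKVCacheUtil {
			return true
		}
	}
	return false
}

// concurrencyLimit allows dispatch if at least one pod is under its in-flight cap.
type concurrencyLimit struct {
	MaxInFlightPerPod int
}

func (c concurrencyLimit) ShouldDispatch(pods []PodMetrics) bool {
	for _, p := range pods {
		if p.InFlight < c.MaxInFlightPerPod {
			return true
		}
	}
	return false
}

// runDemo evaluates the same pool snapshot with both controllers.
func runDemo() (static, concurrency bool) {
	pods := []PodMetrics{
		{QueueDepth: 8, KVCacheUtil: 0.90, InFlight: 3},
		{QueueDepth: 2, KVCacheUtil: 0.95, InFlight: 4},
	}
	var a SaturationController = staticThreshold{MaxQueueDepth: 5, MaxKVCacheUtil: 0.80}
	var b SaturationController = concurrencyLimit{MaxInFlightPerPod: 4}
	return a.ShouldDispatch(pods), b.ShouldDispatch(pods)
}

func main() {
	fmt.Println(runDemo())
}
```

On this snapshot the threshold-based controller holds dispatch (every pod exceeds one of the thresholds), while the concurrency-based controller still allows it (the first pod is under its cap), which shows why the choice of implementation is worth making pluggable.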

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 10, 2025
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@nirrozenbaum
Contributor

nirrozenbaum commented Dec 10, 2025

as a side note I must admit that I’ve read the explanation about what is InterFlowPolicy and what is IntraFlowPolicy at least 20 times in different threads, and I still find these names very confusing.

do you think we can rename it to a descriptive name- just by what it does?
e.g., FairnessDispatchPolicy, FlowOrderingPolicy, etc

@LukeAVanDrie
Contributor Author

LukeAVanDrie commented Dec 10, 2025

> as a side note I must admit that I’ve read the explanation about what is InterFlowPolicy and what is IntraFlowPolicy at least 20 times in different threads, and I still find these names very confusing.
>
> do you think we can rename it to a descriptive name- just by what it does? e.g., FairnessDispatchPolicy, FlowOrderingPolicy, etc

Yes, these can be changed. They are not sticky at all yet. I think FairnessPolicy or FairnessStrategy is clear and to the point, and it will be the first type I migrate to the plugin model. When I open that PR (moving the InterFlowDispatchPolicy definition to a plugins.go file and embedding plugins.Plugin for TypedName), we can discuss names there.

These will eventually be reflected in our configuration guide docs and such, so we should pick the best user-facing names.


Edit: mocking up a config example makes the choices clearer. Thoughts?

fairnessPolicy:
  pluginRef: virtual-token-count
orderingPolicy:
  pluginRef: earliest-deadline-first

