-
Notifications
You must be signed in to change notification settings - Fork 148
feat: Extend the text based configuration to include feature flags and the SaturationDetector's configuration #1492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
…tector config Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: shmuelk The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hi @shmuelk. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
// +optional | ||
// SaturationDetector when present specifies the configuration of the | ||
// Saturation detector. If not present, default values are used. | ||
SaturationDetector *SaturationDetector `json:"saturationDetector,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is too specific and should be generalized.
we've talked in the past about having a parameters section in the config API, such that multiple structs in the code can consume. for example, I might want to consume the metricsStalenessThreshold in more places.
This approach is not future proof and we might need to add more and more fields here instead of just generalizing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parts of the code, "just knowing" what parameters they are looking for in a pool of parameters, is as error prone as environment variables. A typo will cause a parameter to be ignored and presumably a default value will be used. I believe this will be hard to "debug" scenarios like this.
The sections need to be explicit.
As for having constants as I did or references to a parameter, that is a different issue. It does however make for a very verbose configuration. With the need for more validation code. Unless you do something like:
saturationDetector:
queueDepthThreshold:
value: 5
kvCacheUtilThreshold:
parameterRef: plover
type FeatureGates struct { | ||
// +optional | ||
// EnableDataLayer If present and true enables the experimental Datalayer APIs. | ||
// The Datalayer APIswill be disabled if EnableDataLayer is not present in the | ||
// configuration or its value is false. | ||
EnableDataLayer bool `json:"enableDataLayer,omitempty"` | ||
|
||
// +optional | ||
// EnableFlowControl If present and true enables the experimental FlowControl feature. | ||
// The FlowControl feature will be disabled if EnableFlowControl is not in the | ||
// configuration or its value is false. | ||
EnableFlowControl bool `json:"enableFlowControl,omitempty"` | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be generalized to a map[string]bool
instead of two boolean fields? what if tomorrow I have another feature? would we need to update the config api code on every new feature?
what if I'd like to use the same config API to enable a feature in llm-d that is not part of the GIE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The disadvantage of a map and environment variables for that matter is that it is difficult to validate. A silly typo causes all sorts of problems.
Validating the keys, will not solve your desire of experimental issues in llm-d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having thought about this some more, what about the following:
- FeatureGates will be a map[string]bool
- There will be an API to register feature gate keys. The EPP will use this to set its own feature gate keys and llm-d can use this as well.
- The map will be validated. Any key that is in the FeatureGates map and not registered will cause the validation of the configuration to fail.
- The epp config will have a FeatureGates map which will contain a value for all of the registered keys. They will be initialized with a value of false and overlayed with the value from the text configuration's FeatureGates field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have pushed changes that implemets the above described changes
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
return featureConfig | ||
} | ||
|
||
func loadSaturationDetectorConfig(sd *configapi.SaturationDetector) saturationdetector.Config { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: As we expand the configuration surface for more components, does it make sense to include all validation and defaulting logic in loader/...
? Or, is it preferable to delegate to the local config.go
files for each component. E.g., saturationdetector/config.go
could export
// ValidateAndApplyDefaults checks the provided configuration for validity and then mutates the receiver to populate any
// empty fields with system defaults.
func (c *Config) ValidateAndApplyDefaults() error { ... }
If I were a developer working on an individual component, I think the latter would be easier to trace as it keeps the default values and validation logic scoped locally to the component. It also keeps the single entry point loader/...
as a pure delegator. Overall, I think this approach has tighter cohesion and looser coupling.
I am wiring up the Flow Control layer this week and am considering the best path forward as it has a much larger configuration surface than the saturation detector.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Eliminates the use of environment variables to enable the new Data Layer and configure the SaturationDetector.
The use of environment variables for these types of things is not scalable. It also much easer to maintain multiple configurations if needed in the form of simple YAML text.
Which issue(s) this PR fixes:
Fixes #1481
Does this PR introduce a user-facing change?: