Prefer kube-scheduler's resource metrics to kube-state-metrics' #815
base: master
Conversation
As much as I like this change, I can see a problem with it in managed solutions (like EKS) where access to kube-scheduler is forbidden. In those cases, alerts and dashboards that are based on kube-scheduler data won't be useful at all. Given that, can we use OR statements instead of deprecating the kube-state-metrics data? I think something like kube_pod_resource_request OR kube_pod_container_resource_request should do the trick.
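A minimal sketch of the fallback being suggested here, with illustrative job selectors standing in for the mixin's configurable %(kubeSchedulerSelector)s and %(kubeStateMetricsSelector)s placeholders:

```promql
# Prefer the scheduler's view of the request; fall back to kube-state-metrics on
# clusters where kube-scheduler cannot be scraped (e.g. managed control planes).
# The job label values are assumptions for illustration only.
kube_pod_resource_request{resource="cpu", job="kube-scheduler"}
  or
kube_pod_container_resource_requests{resource="cpu", job="kube-state-metrics"}
```

Note that the two metrics don't carry identical label sets (kube_pod_container_resource_requests has a container label, for one), which is what the review discussion further down is about.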
Note to self: Revisit this issue once prometheus/prometheus#9624 is implemented.
Just wondering if there's a better (or any) way to format embedded PromQL expressions in
Force-pushed from 8102d0c to eb73d96.
Force-pushed from ce7c3f8 to 817b784.
This PR has been automatically marked as stale because it has not had recent activity. The next time this stale check runs, the stale label will be added. Thank you for your contributions!
Rebasing.
Force-pushed from 6997e03 to 86d83ae.
| kube_pod_container_resource_requests{resource="memory",%(kubeStateMetricsSelector)s} | ||
| ) * on(namespace, pod, %(clusterLabel)s) group_left() max by (namespace, pod, %(clusterLabel)s) ( | ||
| kube_pod_status_phase{phase=~"Pending|Running"} == 1 | ||
| kube_pod_resource_request{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_requests{resource="memory",%(kubeStateMetricsSelector)s} |
I'm not quite sure that it will do the right thing, since kube_pod_resource_request and kube_pod_container_resource_requests don't have exactly the same labels, if I understand correctly.
I could be wrong here, but wouldn't the differing label sets (the kube-scheduler metrics potentially carrying the additional scheduler and priority labels) after the or be reduced to the set of labels specified in the max aggregation anyway, so that adding an ignoring (scheduler, priority) would have no effect on the final result?
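A minimal sketch of that argument, assuming the extra labels on the scheduler series are scheduler and priority and that the surrounding rule aggregates with max by (namespace, pod, cluster) as in the diff above:

```promql
# Both alternatives end up under the same aggregation; `max by (...)` keeps only
# namespace, pod and the cluster label, so labels such as `scheduler` and
# `priority` on the kube-scheduler series are discarded either way.
# Selectors are illustrative; `cluster` stands in for %(clusterLabel)s.
max by (namespace, pod, cluster) (
    kube_pod_resource_request{resource="memory", job="kube-scheduler"}
  or
    kube_pod_container_resource_requests{resource="memory", job="kube-state-metrics"}
)
```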
Since they are more accurate.
I've gone over and tried to address all outstanding reviews from before. PLMK if I missed something!
(bump)
Thanks for your patience! I'm going to create another PR to get the kube-scheduler metrics into the new local dev env; it looks like I'm missing the scrape config there to validate this.
EDIT: Here's the follow-up PR: #1116. Please note this requires a new scrape job, which might affect your job labels?
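For reference, a hypothetical scrape job of the kind referred to here; the port, TLS, and discovery details are assumptions that vary by distribution, and the resulting job label has to line up with whatever %(kubeSchedulerSelector)s expects:

```yaml
# Hypothetical Prometheus scrape job for kube-scheduler. On managed control
# planes (e.g. EKS) the scheduler may not be reachable at all, in which case
# the kube-state-metrics fallback above is what keeps the rules populated.
scrape_configs:
  - job_name: kube-scheduler           # must match %(kubeSchedulerSelector)s
    scheme: https
    tls_config:
      insecure_skip_verify: true
    authorization:
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: [kube-system]
    relabel_configs:
      # keep only pods labelled as the scheduler (assumes component=kube-scheduler)
      - source_labels: [__meta_kubernetes_pod_label_component]
        regex: kube-scheduler
        action: keep
      # point the target at the scheduler's secure metrics port
      - source_labels: [__meta_kubernetes_pod_ip]
        regex: (.+)
        target_label: __address__
        replacement: ${1}:10259
```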
  promql_expr_test:
  - eval_time: 0m
-     expr: namespace_cpu:kube_pod_container_resource_requests:sum
+     expr: namespace_cpu:kube_pod_resource_request_or_kube_pod_container_resource_requests:sum
Now that you are keeping the existing rules, the existing tests probably need to be restored, so that we end up with both the existing and the new tests alongside each other.
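Something like the following is presumably what's meant, keeping the original test and adding one for the new rule next to it (the namespace labels and expected values below are placeholders, not the mixin's real fixtures):

```yaml
promql_expr_test:
  # existing test, restored as-is
  - eval_time: 0m
    expr: namespace_cpu:kube_pod_container_resource_requests:sum
    exp_samples:
      - labels: 'namespace_cpu:kube_pod_container_resource_requests:sum{namespace="kube-system"}'
        value: 0.5
  # new test for the or-based rule, added alongside
  - eval_time: 0m
    expr: namespace_cpu:kube_pod_resource_request_or_kube_pod_container_resource_requests:sum
    exp_samples:
      - labels: 'namespace_cpu:kube_pod_resource_request_or_kube_pod_container_resource_requests:sum{namespace="kube-system"}'
        value: 0.5
```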
{
  record: 'cluster:namespace:pod_cpu:active:kube_pod_resource_request_or_kube_pod_container_resource_requests',
  expr: |||
    (kube_pod_resource_request{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_requests{resource="cpu",%(kubeStateMetricsSelector)s})
| (kube_pod_resource_request{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_requests{resource="cpu",%(kubeStateMetricsSelector)s}) | |
| (kube_pod_resource_request{resource="cpu",%(kubeSchedulerSelector)s} or kube_pod_container_resource_requests{resource="cpu",%(kubeStateMetricsSelector)s}) |
  ||| % $._config,
},
{
  record: 'namespace_memory:kube_pod_resource_request_or_kube_pod_container_resource_requests:sum',
I notice the new rules don't use kube_pod_status_phase; is that intentional/implied somehow in the scheduler version of the metrics?
Use kube-scheduler's metrics instead of kube-state-metrics', as they are more precise. Refer to the links below for more details.
Also, refactor the kube_pod_status_phase usage, since statuses other than "Pending" or "Running" are excluded or deprecated.