@capri-xiyue capri-xiyue commented Nov 26, 2025

What type of PR is this?

/kind cleanup

What this PR does / why we need it:
The `gke` block needs to be removed from the monitoring section.

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

There is no longer an `inferenceExtension.monitoring.gke` setting in the `inferencePool` Helm chart.

@k8s-ci-robot k8s-ci-robot added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Nov 26, 2025

netlify bot commented Nov 26, 2025

Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 13d0ddd |
| 🔍 Latest deploy log | https://app.netlify.com/projects/gateway-api-inference-extension/deploys/693b3ef2334c540008e18c34 |
| 😎 Deploy Preview | https://deploy-preview-1906--gateway-api-inference-extension.netlify.app |

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 26, 2025

@liu-cong liu-cong left a comment

cc @JeffLuoo Did we decide to deprecate monitoring.gke.enabled and use provider:gke to enable GKE monitoring?

@capri-xiyue Note we set monitoring.gke in llm-d user guides today, but I assume deprecating it just means monitoring.gke becomes a noop right?

ahg-g commented Nov 26, 2025

> cc @JeffLuoo Did we decide to deprecate monitoring.gke.enabled and use provider:gke to enable GKE monitoring?

Yes, there is already a deprecation note. We need to update the GKE guides; the previous release already supports both, so we can safely migrate them.

@capri-xiyue capri-xiyue changed the title refactor: refactor monitoring session WIP: refactor: refactor monitoring session Nov 26, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 26, 2025

ahg-g commented Dec 10, 2025

@capri-xiyue can you pls address the comments so we can move forward with this PR?

@capri-xiyue

> @capri-xiyue can you pls address the comments so we can move forward with this PR?

@ahg-g I have returned for a brief three-day window before taking leave again until late December. I will prioritize moving this PR forward immediately.

capri-xiyue commented Dec 11, 2025

> cc @JeffLuoo Did we decide to deprecate monitoring.gke.enabled and use provider:gke to enable GKE monitoring?
>
> @capri-xiyue Note we set monitoring.gke in llm-d user guides today, but I assume deprecating it just means monitoring.gke becomes a noop right?

Yes, my understanding is that monitoring.gke becomes a no-op, and users need to enable both Prometheus monitoring and Prometheus authentication to turn monitoring on, regardless of whether the provider is gke, istio, or kgateway.
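
For illustration, the settings described in the comment above might look like the following values override. This is only a sketch and is not taken from the PR: the key paths follow the `inferenceExtension.monitoring.*` names used in this thread, and the override file itself is hypothetical.

```yaml
# Hypothetical values override (assumption: key paths match the names
# quoted in this thread; verify against the chart's values.yaml).
inferenceExtension:
  monitoring:
    prometheus:
      enabled: true      # reported in review to default to false
      auth:
        enabled: true    # the template gate requires both flags to be true
    # gke:
    #   enabled: true    # deprecated; expected to be a no-op after this PR
```

A file like this would typically be passed to `helm install` or `helm upgrade` with `-f`. Provider selection (gke, istio, kgateway) is configured separately and is not shown here.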

@capri-xiyue

@JeffLuoo Can you help review this PR to see whether it makes sense?

@capri-xiyue capri-xiyue changed the title WIP: refactor: refactor monitoring session refactor: refactor monitoring session Dec 11, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2025
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 11, 2025
@capri-xiyue capri-xiyue requested a review from JeffLuoo December 11, 2025 22:00
@capri-xiyue

/retest

@capri-xiyue

> @capri-xiyue can you pls address the comments so we can move forward with this PR?

@ahg-g I've addressed all the comments

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: capri-xiyue, JeffLuoo
Once this PR has been reviewed and has the lgtm label, please assign ahg-g for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

 enabled: true # log all requests by default
 ---
-{{- if or .Values.inferenceExtension.monitoring.gke.enabled (and .Values.inferenceExtension.monitoring.prometheus.enabled .Values.inferenceExtension.monitoring.prometheus.auth.enabled) }}
+{{- if and .Values.inferenceExtension.monitoring.prometheus.enabled .Values.inferenceExtension.monitoring.prometheus.auth.enabled }}

@JeffLuoo JeffLuoo Dec 12, 2025

Actually, prometheus.enabled defaults to false in values.yaml: https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/v1.2.0-rc.1/config/charts/inferencepool/values.yaml#L55-L56

This implies that, with this change, the monitoring stack on GKE won't have all required objects (such as role bindings) by default, because inferenceExtension.monitoring.prometheus.enabled is false unless the user sets it.

Shall we use inferenceExtension.monitoring.prometheus.enabled to control whether the monitoring stack is enabled, regardless of the provider? If so, we need to update the mentions of inferenceExtension.monitoring.gke.enabled=true, e.g. https://github.com/search?q=repo%3Allm-d%2Fllm-d%20inferenceExtension.monitoring.gke.enable%3Dtrue&type=code

wdyt? @ahg-g

@ahg-g ahg-g Dec 12, 2025

> Actually, the prometheus enabled is by default set to false in the values.yaml: https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/v1.2.0-rc.1/config/charts/inferencepool/values.yaml#L55-L56
>
> This implies that the monitoring stack on GKE won't have all required objects like roles binding by default if inferenceExtension.monitoring.prometheus.enabled is set to false with this change.

Can you elaborate here? The existing gke.enabled flag that we are removing in this PR does not control that either, right?

> Shall we use inferenceExtension.monitoring.prometheus.enabled to control the enablement of monitoring stacks regardless of the provider? If so, we need to update the mentioning of inferenceExtension.monitoring.gke.enable=true, e.g. https://github.com/search?q=repo%3Allm-d%2Fllm-d%20inferenceExtension.monitoring.gke.enable%3Dtrue&type=code
>
> wdyt? @ahg-g

I agree, as long as everything this flag enables is indeed Prometheus-related.

Contributor

Yes. EPP uses Prometheus as the metric source behind the metrics endpoint.

The PR LGTM then; we just need to make sure we later update all references to set inferenceExtension.monitoring.prometheus.enabled in addition to setting the provider to GKE.

@JeffLuoo JeffLuoo Dec 12, 2025

One more place to update:

--set inferenceExtension.monitoring.gke.enabled=true \

With this change, we switch to inferenceExtension.monitoring.prometheus.enabled, so mentions of inferenceExtension.monitoring.gke.enabled need to be updated as well. This is the only mention I found in the EPP repo: https://github.com/search?q=repo%3Akubernetes-sigs%2Fgateway-api-inference-extension+monitoring.gke&type=code
