diff --git a/config/manifests/prometheus/values.yaml b/config/manifests/prometheus/values.yaml
new file mode 100644
index 000000000..780b88993
--- /dev/null
+++ b/config/manifests/prometheus/values.yaml
@@ -0,0 +1,27 @@
+serviceAccounts:
+  server:
+    create: false
+    name: inference-gateway-sa-metrics-reader
+
+extraScrapeConfigs: |
+  - job_name: 'inference-extension-epp'
+    authorization:
+      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
+    scrape_interval: 5s
+    kubernetes_sd_configs:
+    - role: endpoints
+    relabel_configs:
+    - source_labels: [__meta_kubernetes_service_name]
+      action: keep
+      regex: .*-epp$
+    - source_labels: [__meta_kubernetes_pod_container_port_number]
+      action: keep
+      regex: "9090"
+  - job_name: vllm
+    scrape_interval: 5s
+    kubernetes_sd_configs:
+    - role: pod
+    relabel_configs:
+    - source_labels: [__meta_kubernetes_pod_label_app]
+      action: keep
+      regex: vllm-llama3-8b-instruct
diff --git a/site-src/guides/metrics-and-observability.md b/site-src/guides/metrics-and-observability.md
index 07b86a058..ca8dd770d 100644
--- a/site-src/guides/metrics-and-observability.md
+++ b/site-src/guides/metrics-and-observability.md
@@ -126,6 +126,60 @@ PROFILE_NAME=heap
 curl -H "Authorization: Bearer $TOKEN" localhost:9090/debug/pprof/$PROFILE_NAME -o profile.out
 go tool pprof -png profile.out
 ```
+## Setting Up Grafana + Prometheus
+
+### Grafana
+
+A simple Grafana deployment can be installed with the following commands:
+
+```bash
+helm repo add grafana https://grafana.github.io/helm-charts
+helm install grafana grafana/grafana --namespace monitoring --create-namespace
+```
+
+Get the Grafana URL to visit by running these commands in the same shell:
+
+```bash
+export POD_NAME=$(kubectl get pods --namespace monitoring -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=grafana" -o jsonpath="{.items[0].metadata.name}")
+kubectl --namespace monitoring port-forward $POD_NAME 3000
+```
+
+### Prometheus
+
+Two types of Prometheus deployment are currently documented:
+
+1. Self-hosted, using the Prometheus Helm chart
+2. Using Google Managed Prometheus
+
+=== "Self-Hosted"
+
+    Add the prometheus-community Helm repository:
+
+    ```bash
+    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
+    ```
+
+    Deploy the Prometheus Helm chart using this command:
+    ```bash
+    helm install prometheus prometheus-community/prometheus \
+        --namespace monitoring \
+        -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/prometheus/values.yaml
+    ```
+
+    You can add the Prometheus data source to Grafana by following [this guide](https://grafana.com/docs/grafana/latest/administration/data-source-management/).
+    By default, the Prometheus server host is `http://prometheus-server`.
+
+    Note that the given values file is very simple and works as-is after following the [Getting Started Guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/); you might need to modify it for your environment.
+
+=== "Google Managed"
+
+    If you run the inference gateway with [Google Managed Prometheus](https://cloud.google.com/stackdriver/docs/managed-prometheus), please follow the [instructions](https://cloud.google.com/stackdriver/docs/managed-prometheus/query)
+    to configure Google Managed Prometheus as the data source for the Grafana dashboard.
+
+## Load Inference Extension dashboard into Grafana
+
+Please follow the [Grafana instructions](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/) to load the dashboard JSON.
+The dashboard can be found here: [Grafana Dashboard](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/tools/dashboards/inference_gateway.json).
 ## Prometheus Alerts