docs: add prometheus + grafana deployment guide #1019

Open: wants to merge 1 commit into base: main
27 changes: 27 additions & 0 deletions config/manifests/prometheus/values.yaml
@@ -0,0 +1,27 @@
serviceAccounts:
  server:
    create: false
    name: inference-gateway-sa-metrics-reader

extraScrapeConfigs: |
  - job_name: 'inference-extension-epp'
    authorization:
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    scrape_interval: 5s
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_name]
        action: keep
        regex: .*-epp$
      - source_labels: [__meta_kubernetes_pod_container_port_number]
        action: keep
        regex: "9090"
  - job_name: vllm
    scrape_interval: 5s
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: vllm-llama3-8b-instruct
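
As a quick sanity check of the first `keep` rule above, the following shell sketch approximates how the regex `.*-epp$` filters service names. Prometheus anchors relabel regexes on both ends, which `grep -x` imitates here. The function name and the service names are hypothetical examples, not taken from a real cluster.

```bash
# Approximates the `keep` rule on __meta_kubernetes_service_name.
# Prometheus anchors relabel regexes on both ends; `grep -x` mimics that.
keep_epp_services() {
  grep -Ex '.*-epp$'
}

# Hypothetical service names, for illustration only.
printf '%s\n' \
  vllm-llama3-8b-instruct-epp \
  vllm-llama3-8b-instruct \
  epp-metrics \
  my-pool-epp | keep_epp_services
```

Only the names ending in `-epp` are printed, so a service that merely contains `epp` elsewhere in its name would not be scraped by this job.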
54 changes: 54 additions & 0 deletions site-src/guides/metrics-and-observability.md
@@ -126,6 +126,60 @@ PROFILE_NAME=heap
curl -H "Authorization: Bearer $TOKEN" localhost:9090/debug/pprof/$PROFILE_NAME -o profile.out
go tool pprof -png profile.out
```
## Setting Up Grafana + Prometheus

### Grafana

A simple Grafana deployment can be created with the following commands:

```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana --namespace monitoring --create-namespace
```

Forward the Grafana port by running these commands in the same shell, then visit `http://localhost:3000`:

```bash
export POD_NAME=$(kubectl get pods --namespace monitoring -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=grafana" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 3000
```
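
Logging in requires the admin password. The following is a minimal sketch, assuming the chart's default behavior of storing the password in a Secret named after the release (`grafana` here) under the key `admin-password`. The function name is hypothetical.

```bash
# Prints the Grafana admin password.
# Assumes a release named "grafana" in the "monitoring" namespace and that
# the chart stored the password in the Secret key "admin-password".
grafana_admin_password() {
  kubectl get secret --namespace monitoring grafana \
    -o jsonpath="{.data.admin-password}" | base64 --decode
  echo
}
```

Call `grafana_admin_password` from a shell with cluster access and sign in as the `admin` user. If the password was set through custom chart values or an existing Secret, the name and key may differ.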

### Prometheus

Two types of Prometheus deployment are currently documented:

1. Self-hosted, using the Prometheus Helm chart
2. Google Managed Prometheus

=== "Self-Hosted"

    Add the prometheus-community Helm repository:

    ```bash
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    ```

    Deploy the Prometheus Helm chart using this command:

    ```bash
    helm install prometheus prometheus-community/prometheus \
      --namespace monitoring \
      -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/prometheus/values.yaml
    ```

    You can add the Prometheus data source to Grafana by following [this guide](https://grafana.com/docs/grafana/latest/administration/data-source-management/).
    By default, the Prometheus server is reachable at `http://prometheus-server`.

    Note that the provided values file is minimal and works as-is after following the [Getting Started Guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/); you might need to modify it for your own deployment.

=== "Google Managed"

    If you run the inference gateway with [Google Managed Prometheus](https://cloud.google.com/stackdriver/docs/managed-prometheus), please follow the [instructions](https://cloud.google.com/stackdriver/docs/managed-prometheus/query)
    to configure Google Managed Prometheus as the data source for the Grafana dashboard.
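
For the self-hosted setup, the data source can also be registered through Grafana's HTTP API instead of the UI. The sketch below only builds the JSON payload. The function name, the data source name `Prometheus`, and the `GRAFANA_PASSWORD` variable in the commented command are assumptions for illustration.

```bash
# Prints a JSON payload describing a Prometheus data source.
# $1: Prometheus URL (defaults to the in-cluster service name).
prometheus_datasource_json() {
  local url="${1:-http://prometheus-server}"
  printf '{"name":"Prometheus","type":"prometheus","access":"proxy","url":"%s","isDefault":true}\n' "$url"
}

prometheus_datasource_json

# With the Grafana port-forward active, the payload could be posted like this
# (GRAFANA_PASSWORD is a hypothetical variable holding the admin password):
#   prometheus_datasource_json | curl -s -u "admin:$GRAFANA_PASSWORD" \
#     -H "Content-Type: application/json" \
#     -X POST http://localhost:3000/api/datasources -d @-
```

Keeping the payload in a function makes it easy to point Grafana at a different Prometheus URL, for example when Prometheus runs in another namespace.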

## Load Inference Extension dashboard into Grafana

Please follow the [Grafana instructions](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/) to load the dashboard JSON.
The dashboard is available here: [Grafana Dashboard](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/tools/dashboards/inference_gateway.json).
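
If you prefer to script the import, one option is Grafana's `POST /api/dashboards/db` endpoint, which expects the dashboard wrapped in a small envelope. The sketch below shows only the wrapping step. The function name and the demo dashboard are hypothetical, and whether `inference_gateway.json` imports cleanly this way has not been verified here.

```bash
# Wraps a dashboard JSON document (read from stdin) in the envelope that
# Grafana's POST /api/dashboards/db endpoint expects.
dashboard_import_payload() {
  printf '{"dashboard": %s, "overwrite": true}\n' "$(cat)"
}

# Example with a tiny hypothetical dashboard; in practice, pipe in the
# contents of inference_gateway.json instead.
echo '{"title": "demo", "panels": []}' | dashboard_import_payload
```

Dashboards exported with data source placeholders may still need the UI import flow linked above, since that flow prompts for the data source to bind.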

## Prometheus Alerts
