diff --git a/prometheus/README.md b/prometheus/README.md
index 1bec439fb4b77..6b97a8aabc00d 100644
--- a/prometheus/README.md
+++ b/prometheus/README.md
@@ -56,7 +56,7 @@ Note: Bucket data for a given `` Prometheus histogram met
 
 ### Events
 
-Prometheus Alertmanager alerts are automatically sent to your Datadog event stream following the webhook configuration.
+Prometheus Alertmanager alerts are automatically sent to your Datadog event stream following the webhook configuration. See the [Prometheus Alertmanager](#prometheus-alertmanager) section for setup instructions.
 
 ### Service Checks
 
@@ -66,27 +66,101 @@ The Prometheus check does not include any service checks.
 Send Prometheus Alertmanager alerts in the event stream. Natively, Alertmanager sends all alerts simultaneously to the configured webhook. To see alerts in Datadog, you must configure your instance of Alertmanager to send alerts one at a time. You can add a group-by parameter under `route` to have alerts grouped by the actual name of the alert rule.
 
 ### Setup
-1. Edit the Alertmanager configuration file, `alertmanager.yml`, to include the following:
-```
-receivers:
-- name: datadog
-  webhook_configs:
-  - send_resolved: true
-    url: https://app.datadoghq.com/intake/webhook/prometheus?api_key=
-route:
-  group_by: ['alertname']
-  group_wait: 10s
-  group_interval: 5m
-  receiver: datadog
-  repeat_interval: 3h
-```
-
-**Note**: This endpoint accepts only one event in the payload at a time.
+
+
+
+
+1. Edit the `alertmanager.yml` configuration file to include the following:
+
+   ```yaml
+   receivers:
+   - name: datadog
+     webhook_configs:
+     - send_resolved: true
+       url: https://event-management-intake.datadoghq.com/api/v2/events/webhook?dd-api-key=&integration_id=prometheus
+   route:
+     group_by: ['alertname']
+     group_wait: 10s
+     group_interval: 5m
+     receiver: datadog
+     repeat_interval: 3h
+   ```
+
+
+
+  • The `group_by` parameter determines how alerts are grouped together when sent to Datadog. Alerts with matching values for the specified labels are combined into a single notification. For details on routing configuration, see the Prometheus Alertmanager documentation.
+  • This endpoint accepts only one event in the payload at a time.
+
+
+
+2. (Optional) Use matchers to redirect specific alerts to different receivers. Matchers allow routing based on any alert label. For syntax details, see the [Alertmanager matcher documentation][12].
+
+   The V2 webhook supports additional query parameters. For example, use the `oncall_team` parameter to integrate with [Datadog On-Call][11] and redirect pages to different teams:
+
+   ```yaml
+   receivers:
+   - name: datadog-ops
+     webhook_configs:
+     - send_resolved: true
+       url: https://event-management-intake.datadoghq.com/api/v2/events/webhook?dd-api-key=&integration_id=prometheus&oncall_team=ops
+   - name: datadog-db
+     webhook_configs:
+     - send_resolved: true
+       url: https://event-management-intake.datadoghq.com/api/v2/events/webhook?dd-api-key=&integration_id=prometheus&oncall_team=database
+
+   route:
+     group_by: ['alertname']
+     group_wait: 10s
+     group_interval: 5m
+     receiver: datadog-ops
+     repeat_interval: 3h
+     routes:
+     - matchers:
+       - team="database"
+       receiver: datadog-db
+   ```
+
+
+  Setting `send_resolved: true` (the default value) enables Alertmanager to send notifications when alerts are resolved in Prometheus. This is particularly important when using the `oncall_team` parameter, because it ensures that pages are marked as resolved. Resolved notifications may be delayed until the next `group_interval`.
+
+
+3. Restart the Prometheus and Alertmanager services.
+
+   ```shell
+   sudo systemctl restart prometheus.service alertmanager.service
+   ```
+
+
+
+
+1. Edit the `alertmanager.yml` configuration file to include the following:
+
+   ```yaml
+   receivers:
+   - name: datadog
+     webhook_configs:
+     - send_resolved: true
+       url: https://app.datadoghq.com/intake/webhook/prometheus?api_key=
+   route:
+     group_by: ['alertname']
+     group_wait: 10s
+     group_interval: 5m
+     receiver: datadog
+     repeat_interval: 3h
+   ```
+
+
+  This endpoint accepts only one event in the payload at a time.
+
 2. Restart the Prometheus and Alertmanager services.
-```
-sudo systemctl restart prometheus.service alertmanager.service
-```
+
+   ```shell
+   sudo systemctl restart prometheus.service alertmanager.service
+   ```
+
+
+
 
 ## Troubleshooting
 
@@ -108,3 +182,5 @@ Need help? Contact [Datadog support][7].
 [8]: https://www.datadoghq.com/blog/monitor-prometheus-metrics
 [9]: https://docs.datadoghq.com/agent/prometheus/
 [10]: https://docs.datadoghq.com/developers/prometheus/
+[11]: https://docs.datadoghq.com/service_management/on-call/
+[12]: https://prometheus.io/docs/alerting/latest/configuration/#matcher