 Monitoring wire-server using Prometheus and Grafana
 =======================================================
 
-Introduction
--------------
+All wire-server helm charts offering prometheus metrics expose a
+``metrics.serviceMonitor.enabled`` option.
 
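The ``metrics.serviceMonitor.enabled`` option can be set through a Helm values override file. A minimal sketch, assuming the umbrella ``wire-server`` chart with a ``brig`` subchart (the subchart name is illustrative; any wire-server chart exposing prometheus metrics takes the same keys):

```yaml
# Values override sketch: enable ServiceMonitor creation for one service.
# "brig" is an illustrative subchart name, not a required key.
brig:
  metrics:
    serviceMonitor:
      enabled: true
```

Such a file would be passed to helm with its ``-f``/``--values`` flag.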
-The following instructions detail the installation of a monitoring
-system consisting of a Prometheus instance and corresponding Alert
-Manager in addition to a Grafana instance for viewing dashboards related
-to cluster and wire-services health.
+If this option is set to true, the helm charts will install ``ServiceMonitor``
+resources, which can be used to mark services for scraping by
+`Prometheus Operator <https://prometheus-operator.dev/>`__,
+`Grafana Agent Operator <https://grafana.com/docs/grafana-cloud/kubernetes-monitoring/agent-k8s/>`__,
+or similar prometheus-compatible tools.
 
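For reference, a ``ServiceMonitor`` produced by such a chart has roughly the following shape. The ``apiVersion``, ``kind``, and field names follow the Prometheus Operator CRD; the service name, label selector, port name, and metrics path are assumptions for illustration, not values taken from the wire-server charts:

```yaml
# Sketch of a ServiceMonitor resource (Prometheus Operator CRD).
# The name, labels, port, and path below are illustrative assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: brig
spec:
  selector:
    matchLabels:
      app: brig
  endpoints:
    - port: http
      path: /i/metrics
```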
-Prerequisites
--------------
+Refer to their documentation for installation.
 
-You need to have wire-server installed, see either of
-
-* :ref:`helm`
-* :ref:`helm_prod`.
-
-How to install Prometheus and Grafana on Kubernetes using Helm
----------------------------------------------------------------
-
-.. note::
-
-   The following makes use of overrides for helm charts. You may wish to read :ref:`understand-helm-overrides` first.
-
-Create an override file:
-
-.. code:: bash
-
-   mkdir -p wire-server-metrics
-   curl -sSL https://raw.githubusercontent.com/wireapp/wire-server-deploy/master/values/wire-server-metrics/demo-values.example.yaml > wire-server-metrics/values.yaml
-
-Edit this file, uncommenting or adjusting values as needed with respect to the next sections.
-
-The monitoring system requires disk space if you wish to be resilient to
-pod failure. This disk space is given to pods by using a so-called "Storage Class". You have three options:
-
-* (1) If you deploy on a kubernetes cluster hosted on AWS you may install the ``aws-storage`` helm chart, which provides configurations of Storage Classes for AWS's elastic block storage (EBS). For this, install the aws storage classes with ``helm upgrade --install aws-storage wire/aws-storage --wait``.
-* (2) If you're not using AWS, but you still want to have persistent metrics, see :ref:`using-custom-storage-classes`.
-* (3) If you don't want persistence at all, see :ref:`using-no-storage-classes`.
-
-Once you have a storage class configured (or have added the override configuration to not use persistence), we can install the monitoring suite itself.
-
-There are a few known issues surrounding the ``prometheus-operator``
-helm chart.
-
-You will likely have to install the Custom Resource Definitions manually
-before installing the ``wire-server-metrics`` chart:
-
-::
-
-   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/alertmanager.crd.yaml
-   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/prometheus.crd.yaml
-   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/prometheusrule.crd.yaml
-   kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/d34d70de61fe8e23bb21f6948993c510496a0b31/example/prometheus-operator-crd/servicemonitor.crd.yaml
-
-Now we can install the metrics chart; run the following::
-
-   helm upgrade --install wire-server-metrics wire/wire-server-metrics --wait -f wire-server-metrics/values.yaml
-
-See the `Prometheus Operator
-README <https://github.com/helm/charts/tree/master/stable/prometheus-operator#work-arounds-for-known-issues>`__
-for more information and troubleshooting help.
-
-Adding Dashboards
------------------
-
-Grafana dashboard configurations are included as JSON inside the
-``charts/wire-server-metrics/dashboards`` directory. You may import
-these via Grafana's web UI. See `Accessing
-grafana <#accessing-grafana>`__.
-
-Monitoring in a separate namespace
-----------------------------------
-
-It is advisable to separate your monitoring services from your
-application services. To accomplish this you may deploy
-``wire-server-metrics`` into a separate namespace from ``wire-server``.
-Simply provide a different namespace to the ``helm upgrade --install``
-calls with ``--namespace your-desired-namespace``.
-
-The wire-server-metrics chart will monitor all wire services across *all* namespaces.
-
-Accessing grafana
+Dashboards
 -----------------
 
-Forward a port from your localhost to the grafana service running in
-your cluster:
-
-::
-
-   kubectl port-forward service/<release-name>-grafana 3000:80 -n <namespace>
-
-Now you can access grafana at ``http://localhost:3000``.
-
-The username and password are stored in the ``grafana`` secret of your
-namespace.
-
-By default this is:
-
-- username: ``admin``
-- password: ``admin``
-
-Accessing prometheus
---------------------
-
-Forward a port from your localhost to the prometheus service running in
-your cluster:
-
-::
-
-   kubectl port-forward service/<release-name>-prometheus 9090:9090 -n <namespace>
-
-Now you can access prometheus at ``http://localhost:9090``.
-
-Customization
----------------
-
-.. _using-no-storage-classes:
-
-Monitoring without persistent disk
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-If you wish to deploy monitoring without any persistent disk (not
-recommended) you may add the following overrides to your ``values.yaml``
-file.
-
-.. code:: yaml
-
-   # This configuration switches to use memory instead of disk for metrics services
-   # NOTE: If the pods are killed you WILL lose all your metrics history
-   kube-prometheus-stack:
-     grafana:
-       persistence:
-         enabled: false
-     prometheus:
-       prometheusSpec:
-         storageSpec: null
-     alertmanager:
-       alertmanagerSpec:
-         storage: null
-
-.. _using-custom-storage-classes:
-
-Using Custom Storage Classes
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-If you're using a provider other than AWS, please reference the
-`Kubernetes documentation on storage
-classes <https://kubernetes.io/docs/concepts/storage/storage-classes/>`__
-for configuring a storage class for your kubernetes cluster.
-
-If you wish to use a different storage class (for instance if you don't
-run on AWS) you may add the following overrides to your ``values.yaml``
-file.
-
-.. code:: yaml
-
-   kube-prometheus-stack:
-     grafana:
-       persistence:
-         storageClassName: "<my-storage-class>"
-     prometheus:
-       prometheusSpec:
-         storageSpec:
-           volumeClaimTemplate:
-             spec:
-               storageClassName: "<my-storage-class>"
-     alertmanager:
-       alertmanagerSpec:
-         storage:
-           volumeClaimTemplate:
-             spec:
-               storageClassName: "<my-storage-class>"
-
-
-Troubleshooting
----------------
-
-"validation failed"
-^^^^^^^^^^^^^^^^^^^^^
-
-If you receive the following error:
-
-::
-
-   Error: validation failed: [unable to recognize "": no matches for kind "Alertmanager" in version
-   "monitoring.coreos.com/v1", unable to recognize "": no matches for kind "Prometheus" in version
-   "monitoring.coreos.com/v1", unable to recognize "": no matches for kind "PrometheusRule" in version
-
-please run the commands to install the Custom Resource Definitions, which
-are detailed in the installation instructions above.
-
-"object is being deleted"
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-When upgrading you may see the following error:
-
-::
-
-   Error: object is being deleted: customresourcedefinitions.apiextensions.k8s.io "prometheusrules.monitoring.coreos.com" already exists
-
-Helm sometimes has trouble cleaning up or defining Custom Resource
-Definitions. Try manually deleting the resource definitions and then
-retrying your helm install:
-
-::
-
-   kubectl delete customresourcedefinitions \
-     alertmanagers.monitoring.coreos.com \
-     prometheuses.monitoring.coreos.com \
-     servicemonitors.monitoring.coreos.com \
-     prometheusrules.monitoring.coreos.com
+Grafana dashboard configurations are included as JSON inside the ``dashboards``
+directory. You may import these via Grafana's web UI.
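To reach Grafana's web UI for the import, one option (as in the instructions removed above) is to forward a local port to the in-cluster Grafana service; the release and namespace names are placeholders:

```bash
# Forward local port 3000 to the Grafana service in your cluster;
# <release-name> and <namespace> are placeholders for your deployment.
kubectl port-forward service/<release-name>-grafana 3000:80 -n <namespace>
```

Grafana is then reachable at ``http://localhost:3000``.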