Commit df4f15b

Merge pull request #33686 from teskje/heap-metrics
2 parents 56611d3 + 55f7f63 commit df4f15b

12 files changed (+329, -114 lines)

doc/user/content/sql/system-catalog/mz_internal.md

Lines changed: 26 additions & 20 deletions
@@ -162,13 +162,15 @@ for all processes of all extant cluster replicas.
 At this time, we do not make any guarantees about the exactness or freshness of these numbers.

 <!-- RELATION_SPEC mz_internal.mz_cluster_replica_metrics -->
-| Field | Type | Meaning |
-| ------------------- | ------------ | -------- |
-| `replica_id` | [`text`] | The ID of a cluster replica. |
-| `process_id` | [`uint8`] | The ID of a process within the replica. |
-| `cpu_nano_cores` | [`uint8`] | Approximate CPU usage, in billionths of a vCPU core. |
-| `memory_bytes` | [`uint8`] | Approximate RAM usage, in bytes. |
-| `disk_bytes` | [`uint8`] | Approximate disk usage in bytes. |
+| Field | Type | Meaning
+| ------------------- | ------------ | --------
+| `replica_id` | [`text`] | The ID of a cluster replica.
+| `process_id` | [`uint8`] | The ID of a process within the replica.
+| `cpu_nano_cores` | [`uint8`] | Approximate CPU usage, in billionths of a vCPU core.
+| `memory_bytes` | [`uint8`] | Approximate RAM usage, in bytes.
+| `disk_bytes` | [`uint8`] | Approximate disk usage, in bytes.
+| `heap_bytes` | [`uint8`] | Approximate heap (RAM + swap) usage, in bytes.
+| `heap_limit` | [`uint8`] | Available heap (RAM + swap) space, in bytes.

 ## `mz_cluster_replica_metrics_history`

@@ -183,10 +185,12 @@ At this time, we do not make any guarantees about the exactness or freshness of
 | ---------------- | --------- | --------
 | `replica_id` | [`text`] | The ID of a cluster replica.
 | `process_id` | [`uint8`] | The ID of a process within the replica.
-| `cpu_nano_cores` | [`uint8`] | Approximate CPU usage in billionths of a vCPU core.
-| `memory_bytes` | [`uint8`] | Approximate memory usage in bytes.
-| `disk_bytes` | [`uint8`] | Approximate disk usage in bytes.
+| `cpu_nano_cores` | [`uint8`] | Approximate CPU usage, in billionths of a vCPU core.
+| `memory_bytes` | [`uint8`] | Approximate memory usage, in bytes.
+| `disk_bytes` | [`uint8`] | Approximate disk usage, in bytes.
 | `occurred_at` | [`timestamp with time zone`] | Wall-clock timestamp at which the event occurred.
+| `heap_bytes` | [`uint8`] | Approximate heap (RAM + swap) usage, in bytes.
+| `heap_limit` | [`uint8`] | Available heap (RAM + swap) space, in bytes.

 ## `mz_cluster_replica_statuses`

@@ -225,13 +229,14 @@ for all processes of all extant cluster replicas, as a percentage of the total r
 At this time, we do not make any guarantees about the exactness or freshness of these numbers.

 <!-- RELATION_SPEC mz_internal.mz_cluster_replica_utilization -->
-| Field | Type | Meaning |
-|------------------|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `replica_id` | [`text`] | The ID of a cluster replica. |
-| `process_id` | [`uint8`] | The ID of a process within the replica. |
-| `cpu_percent` | [`double precision`] | Approximate CPU usage in percent of the total allocation. |
-| `memory_percent` | [`double precision`] | Approximate RAM usage in percent of the total allocation. |
-| `disk_percent` | [`double precision`] | Approximate disk usage in percent of the total allocation. |
+| Field | Type | Meaning
+|------------------|----------------------|---------
+| `replica_id` | [`text`] | The ID of a cluster replica.
+| `process_id` | [`uint8`] | The ID of a process within the replica.
+| `cpu_percent` | [`double precision`] | Approximate CPU usage, in percent of the total allocation.
+| `memory_percent` | [`double precision`] | Approximate RAM usage, in percent of the total allocation.
+| `disk_percent` | [`double precision`] | Approximate disk usage, in percent of the total allocation.
+| `heap_percent` | [`double precision`] | Approximate heap (RAM + swap) usage, in percent of the total allocation.

 ## `mz_cluster_replica_utilization_history`

@@ -246,9 +251,10 @@ At this time, we do not make any guarantees about the exactness or freshness of
 |------------------|----------------------|--------
 | `replica_id` | [`text`] | The ID of a cluster replica.
 | `process_id` | [`uint8`] | The ID of a process within the replica.
-| `cpu_percent` | [`double precision`] | Approximate CPU usage in percent of the total allocation.
-| `memory_percent` | [`double precision`] | Approximate RAM usage in percent of the total allocation.
-| `disk_percent` | [`double precision`] | Approximate disk usage in percent of the total allocation.
+| `cpu_percent` | [`double precision`] | Approximate CPU usage, in percent of the total allocation.
+| `memory_percent` | [`double precision`] | Approximate RAM usage, in percent of the total allocation.
+| `disk_percent` | [`double precision`] | Approximate disk usage, in percent of the total allocation.
+| `heap_percent` | [`double precision`] | Approximate heap (RAM + swap) usage, in percent of the total allocation.
 | `occurred_at` | [`timestamp with time zone`] | Wall-clock timestamp at which the event occurred.

 ## `mz_cluster_replica_history`

src/catalog/src/builtin.rs

Lines changed: 38 additions & 12 deletions
@@ -4893,14 +4893,19 @@ pub static MZ_CLUSTER_REPLICA_METRICS_HISTORY: LazyLock<BuiltinSource> =
     ("process_id", "The ID of a process within the replica."),
     (
         "cpu_nano_cores",
-        "Approximate CPU usage in billionths of a vCPU core.",
+        "Approximate CPU usage, in billionths of a vCPU core.",
     ),
-    ("memory_bytes", "Approximate memory usage in bytes."),
-    ("disk_bytes", "Approximate disk usage in bytes."),
+    ("memory_bytes", "Approximate memory usage, in bytes."),
+    ("disk_bytes", "Approximate disk usage, in bytes."),
     (
         "occurred_at",
         "Wall-clock timestamp at which the event occurred.",
     ),
+    (
+        "heap_bytes",
+        "Approximate heap (RAM + swap) usage, in bytes.",
+    ),
+    ("heap_limit", "Available heap (RAM + swap) space, in bytes."),
 ]),
 is_retained_metrics_object: false,
 access: vec![PUBLIC_SELECT],
@@ -4934,6 +4939,8 @@ pub static MZ_CLUSTER_REPLICA_METRICS: LazyLock<BuiltinView> = LazyLock::new(||
     .with_column("cpu_nano_cores", SqlScalarType::UInt64.nullable(true))
     .with_column("memory_bytes", SqlScalarType::UInt64.nullable(true))
     .with_column("disk_bytes", SqlScalarType::UInt64.nullable(true))
+    .with_column("heap_bytes", SqlScalarType::UInt64.nullable(true))
+    .with_column("heap_limit", SqlScalarType::UInt64.nullable(true))
     .with_key(vec![0, 1])
     .finish(),
 column_comments: BTreeMap::from_iter([
@@ -4944,7 +4951,12 @@ pub static MZ_CLUSTER_REPLICA_METRICS: LazyLock<BuiltinView> = LazyLock::new(||
         "Approximate CPU usage, in billionths of a vCPU core.",
     ),
     ("memory_bytes", "Approximate RAM usage, in bytes."),
-    ("disk_bytes", "Approximate disk usage in bytes."),
+    ("disk_bytes", "Approximate disk usage, in bytes."),
+    (
+        "heap_bytes",
+        "Approximate heap (RAM + swap) usage, in bytes.",
+    ),
+    ("heap_limit", "Available heap (RAM + swap) space, in bytes."),
 ]),
 sql: "
 SELECT
@@ -4953,7 +4965,9 @@ SELECT
     process_id,
     cpu_nano_cores,
     memory_bytes,
-    disk_bytes
+    disk_bytes,
+    heap_bytes,
+    heap_limit
 FROM mz_internal.mz_cluster_replica_metrics_history
 JOIN mz_cluster_replicas r ON r.id = replica_id
 ORDER BY replica_id, process_id, occurred_at DESC",
@@ -8852,21 +8866,26 @@ pub static MZ_CLUSTER_REPLICA_UTILIZATION: LazyLock<BuiltinView> = LazyLock::new
     .with_column("cpu_percent", SqlScalarType::Float64.nullable(true))
     .with_column("memory_percent", SqlScalarType::Float64.nullable(true))
     .with_column("disk_percent", SqlScalarType::Float64.nullable(true))
+    .with_column("heap_percent", SqlScalarType::Float64.nullable(true))
     .finish(),
 column_comments: BTreeMap::from_iter([
     ("replica_id", "The ID of a cluster replica."),
     ("process_id", "The ID of a process within the replica."),
     (
         "cpu_percent",
-        "Approximate CPU usage in percent of the total allocation.",
+        "Approximate CPU usage, in percent of the total allocation.",
     ),
     (
         "memory_percent",
-        "Approximate RAM usage in percent of the total allocation.",
+        "Approximate RAM usage, in percent of the total allocation.",
     ),
     (
         "disk_percent",
-        "Approximate disk usage in percent of the total allocation.",
+        "Approximate disk usage, in percent of the total allocation.",
+    ),
+    (
+        "heap_percent",
+        "Approximate heap (RAM + swap) usage, in percent of the total allocation.",
     ),
 ]),
 sql: "
@@ -8875,7 +8894,8 @@ SELECT
     m.process_id,
     m.cpu_nano_cores::float8 / NULLIF(s.cpu_nano_cores, 0) * 100 AS cpu_percent,
     m.memory_bytes::float8 / NULLIF(s.memory_bytes, 0) * 100 AS memory_percent,
-    m.disk_bytes::float8 / NULLIF(s.disk_bytes, 0) * 100 AS disk_percent
+    m.disk_bytes::float8 / NULLIF(s.disk_bytes, 0) * 100 AS disk_percent,
+    m.heap_bytes::float8 / NULLIF(m.heap_limit, 0) * 100 AS heap_percent
 FROM
     mz_catalog.mz_cluster_replicas AS r
         JOIN mz_catalog.mz_cluster_replica_sizes AS s ON r.size = s.size
@@ -8894,6 +8914,7 @@ pub static MZ_CLUSTER_REPLICA_UTILIZATION_HISTORY: LazyLock<BuiltinView> =
 .with_column("cpu_percent", SqlScalarType::Float64.nullable(true))
 .with_column("memory_percent", SqlScalarType::Float64.nullable(true))
 .with_column("disk_percent", SqlScalarType::Float64.nullable(true))
+.with_column("heap_percent", SqlScalarType::Float64.nullable(true))
 .with_column(
     "occurred_at",
     SqlScalarType::TimestampTz { precision: None }.nullable(false),
@@ -8904,15 +8925,19 @@ pub static MZ_CLUSTER_REPLICA_UTILIZATION_HISTORY: LazyLock<BuiltinView> =
 ("process_id", "The ID of a process within the replica."),
 (
     "cpu_percent",
-    "Approximate CPU usage in percent of the total allocation.",
+    "Approximate CPU usage, in percent of the total allocation.",
 ),
 (
     "memory_percent",
-    "Approximate RAM usage in percent of the total allocation.",
+    "Approximate RAM usage, in percent of the total allocation.",
 ),
 (
     "disk_percent",
-    "Approximate disk usage in percent of the total allocation.",
+    "Approximate disk usage, in percent of the total allocation.",
+),
+(
+    "heap_percent",
+    "Approximate heap (RAM + swap) usage, in percent of the total allocation.",
 ),
 (
     "occurred_at",
@@ -8926,6 +8951,7 @@ SELECT
     m.cpu_nano_cores::float8 / NULLIF(s.cpu_nano_cores, 0) * 100 AS cpu_percent,
     m.memory_bytes::float8 / NULLIF(s.memory_bytes, 0) * 100 AS memory_percent,
     m.disk_bytes::float8 / NULLIF(s.disk_bytes, 0) * 100 AS disk_percent,
+    m.heap_bytes::float8 / NULLIF(m.heap_limit, 0) * 100 AS heap_percent,
     m.occurred_at
 FROM
     mz_catalog.mz_cluster_replicas AS r
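
Note on the `heap_percent` expression in the two utilization views: unlike `cpu_percent`, `memory_percent`, and `disk_percent`, which divide by the allocation from `mz_cluster_replica_sizes` (`s.*`), it divides by `m.heap_limit`, the limit reported in the metrics row itself. The arithmetic can be sketched in Rust as follows. This is only an illustration of the SQL expression, not code from the commit; the function name `usage_percent` and the use of `Option` to stand in for SQL `NULL` are assumptions made for the example.

```rust
/// Illustrative stand-in for the SQL expression
/// `used::float8 / NULLIF(limit, 0) * 100`.
/// `None` plays the role of SQL NULL (both columns are nullable).
/// The name `usage_percent` is hypothetical and does not appear in the commit.
fn usage_percent(used: Option<u64>, limit: Option<u64>) -> Option<f64> {
    // NULLIF(limit, 0): a zero limit is turned into NULL, so the result is
    // NULL instead of a division-by-zero error.
    let limit = limit.filter(|&l| l != 0)?;
    // A NULL numerator also makes the whole expression NULL.
    let used = used?;
    // Cast to floating point first, divide, then scale to percent,
    // in the same order as the SQL.
    Some(used as f64 / limit as f64 * 100.0)
}
```

Under these assumptions, `usage_percent(Some(512), Some(2048))` yields `Some(25.0)`, while a zero or missing limit yields `None`, which corresponds to a `NULL` `heap_percent` in the view.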
