Metrics reporting #1496

Draft · wants to merge 27 commits into main

Conversation

@DerGut (Contributor) commented Jul 7, 2025

Which issue does this PR close?

What changes are included in this PR?

As mentioned in the issue description, this PR adds an implementation for the Iceberg Metrics Reporting API.

I'll follow up with a more thorough description of its changes.

Are these changes tested?

Yes, with the attached unit test and an example main that is closer to an integration test. The example is available on this branch.

DerGut added 23 commits July 7, 2025 (each signed off by Jannik Steinmann <[email protected]>), including:

- Rename MetricsReporter and MetricsReport
- Remove unnecessary stuff
- Be explicit about spawning
DerGut (Contributor Author):
At this point, the DeleteFileIndex is only used by the scan module. I don't think it will be needed by other modules in the future, so maybe it's a good opportunity to move it there instead?

Contributor:
Makes sense. If I recall correctly, when delete_file_index.rs was added, the scan module didn't exist as a module and was just the scan.rs file. If scan had already been a module at the time, delete_file_index.rs would probably have been created in there.

metrics,
metadata,
} => {
info!(
DerGut (Contributor Author):
I don't think it's a good idea to use debug-formatted values here. I struggled a lot with the tracing API, and this is the best I could come up with so far.
I didn't really want to serialize the struct to JSON, nor did I know how to implement fmt::Display for values in a way that makes sense across tracing subscribers.
Any suggestions welcome!
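
For reference, a minimal sketch of the options weighed here, using tracing's field sigils; the `ScanReport` struct and field names are illustrative, not the PR's actual types:

```rust
use tracing::info;

// Illustrative report type; the real struct in the PR differs.
#[derive(Debug)]
struct ScanReport {
    total_planning_duration_ms: u64,
    result_data_files: u64,
}

fn report(r: &ScanReport) {
    // `?value` records the field with its Debug impl. Easy, but the
    // rendered format is unstable and subscriber-dependent.
    info!(report = ?r, "scan metrics");

    // Recording primitive fields individually is more verbose but lets
    // each subscriber handle the values in a structured way.
    info!(
        total_planning_duration_ms = r.total_planning_duration_ms,
        result_data_files = r.result_data_files,
        "scan metrics"
    );
}
```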

DerGut (Contributor Author):

I could also use some feedback about the degree to which we want to mimic the Java implementation's log records.

@@ -90,6 +90,7 @@ typed-builder = { workspace = true }
 url = { workspace = true }
 uuid = { workspace = true }
 zstd = { workspace = true }
+tracing = { workspace = true }
DerGut (Contributor Author):
So far, tracing has only been used in tests. As far as I could find, the LoggingMetricsReporter is the first use of any logging in the iceberg crate.
I'm not entirely sure whether it's a good idea to include it and commit to a specific logging crate. tracing seems reasonably standard and compatible with other crates, though. I'd also like to include some default reporter; the Java implementation comes with its LoggingMetricsReporter.java, based on SLF4J.

I've also run into some issues using the tracing crate (as outlined in this comment) but they can probably be worked around and shouldn't be a deciding factor.

Contributor:
Regarding the choice of tracing vs other logging crates, you might want to read through the discussion here if you haven't already:

#482

TL;DR: tracing is the logging facade that many of us previously settled on.

@@ -186,16 +187,25 @@ impl PlanContext {
     tx_data: Sender<ManifestEntryContext>,
     delete_file_idx: DeleteFileIndex,
     delete_file_tx: Sender<ManifestEntryContext>,
-) -> Result<Box<impl Iterator<Item = Result<ManifestFileContext>> + 'static>> {
+) -> Result<(Vec<Result<ManifestFileContext>>, ManifestMetrics)> {
DerGut (Contributor Author):
Since we were returning a vector anyway and this function was only called in a single place, I took the liberty of changing the return type. This somewhat simplified passing the result to a spawned task, because Vec is Send + Sync when its items are.

I've also extended the TODO comment below for future reference, because I've added another obstacle to simply using an iterator here: the ManifestMetrics are now continuously mutated in the loop. If we used an iterator instead, we couldn't (I think) pass around the mutable reference as easily.
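
For illustration, a minimal sketch of the Send argument under a tokio runtime; the generic processing function is hypothetical:

```rust
use tokio::task::JoinHandle;

// Vec<T> is Send whenever T is Send, so the collected contexts can be
// moved into a spawned task wholesale. An opaque `impl Iterator` return
// type carries no such guarantee unless explicitly bounded with `+ Send`.
fn spawn_processing<T: Send + 'static>(items: Vec<T>) -> JoinHandle<usize> {
    tokio::spawn(async move {
        // ...process each item; here we just count them.
        items.len()
    })
}
```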

DerGut (Contributor Author):
Because I had to thread the metrics building throughout the planning stage, I heavily refactored this file to make room for it.

delete_file_tx,
result_tx.clone(),
);
delete_manifests_handle.await;
DerGut (Contributor Author):
This one is worth pointing out: during refactoring, I tried to be explicit about where threads are spawned and which JoinHandles we await (vs. ignore). This line corresponds to the equivalent line in the previous implementation.

I left it here for consistency, but at this point I believe it is unintended. The other spawned threads are simply pushed into the background without ever checking their completion (which maybe isn't ideal either, e.g. in case they panic). This thread, however, is awaited, so IIUC we block until all delete manifest entries are processed.
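
For clarity, a small sketch of the two spawning patterns being contrasted, assuming tokio; the task bodies are placeholders:

```rust
use tokio::task::JoinHandle;

async fn plan_files_sketch() {
    // Detached: runs in the background; completion (and any panic) is
    // never observed unless the handle is checked later.
    let _data_handle: JoinHandle<()> = tokio::spawn(async {
        // ...process data manifest entries
    });

    // Awaited: planning blocks here until all delete manifest entries
    // are processed, and a panic surfaces as a JoinError.
    let delete_handle: JoinHandle<()> = tokio::spawn(async {
        // ...process delete manifest entries
    });
    if let Err(e) = delete_handle.await {
        eprintln!("delete manifest processing failed: {e}");
    }
}
```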

DerGut (Contributor Author):
Because the planning stage is parallelized and multiple threads are responsible for fetching manifest files and evaluating their contents, building the metrics reports needs to fit into this framework.

The approach I came up with is that each thread that processes manifest entries is spawned alongside a neighboring thread responsible for metrics aggregation. The processing thread typically iterates over a stream (e.g. of manifest entries) and sends metrics updates to the neighboring thread. The neighboring thread accumulates the stream of metrics updates and returns the finished result through its JoinHandle.

My previous attempt used channels for metrics updates (and no separate threads), but this coupled plan file processing to metrics reporting: I had to iterate over the metrics receivers somewhere to aggregate their updates, and had to do so for multiple processors sequentially. This meant one processor could block while we were still iterating over another processor's metrics updates and its metrics channel buffer was exhausted.

This approach decouples plan file processing from metrics submission.
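
A minimal sketch of this pattern under tokio; the update and metrics types are illustrative, not the PR's actual definitions:

```rust
use tokio::sync::mpsc;
use tokio::task::JoinHandle;

// Illustrative metrics update and aggregate types.
enum MetricsUpdate {
    ManifestScanned,
    EntriesSkipped(u64),
}

#[derive(Default, Debug)]
struct ManifestMetrics {
    scanned_manifests: u64,
    skipped_entries: u64,
}

// Spawn an aggregator next to a processing task. The processor sends
// updates over the channel; the aggregator folds them and hands the
// final result back through its JoinHandle.
fn spawn_aggregator(mut rx: mpsc::Receiver<MetricsUpdate>) -> JoinHandle<ManifestMetrics> {
    tokio::spawn(async move {
        let mut metrics = ManifestMetrics::default();
        while let Some(update) = rx.recv().await {
            match update {
                MetricsUpdate::ManifestScanned => metrics.scanned_manifests += 1,
                MetricsUpdate::EntriesSkipped(n) => metrics.skipped_entries += n,
            }
        }
        metrics // returned once all senders are dropped
    })
}
```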


/// Carries all metrics for a particular scan.
#[derive(Debug)]
pub(crate) struct ScanMetrics {
DerGut (Contributor Author):
Note that the Java implementation uses dedicated types for the metrics (e.g. TimerResult.java and CounterResult.java). They include a value and a unit, but I felt that both the ScanMetrics field names and their types should convey everything we need. The RestMetricReporter will need to emit reports that follow this format, but I omitted it from the general-purpose ScanMetrics.

Happy for any feedback!
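
To make the trade-off concrete, a sketch of what plain Rust types might look like in place of Java's value-plus-unit wrappers; these field names are assumptions, not the PR's actual fields:

```rust
use std::time::Duration;

// Timers as Duration and counters as plain integers: the names and types
// convey the unit without TimerResult/CounterResult-style wrappers.
#[derive(Debug, Default)]
pub(crate) struct ScanMetrics {
    total_planning_duration: Duration,
    result_data_files: u64,
    result_delete_files: u64,
    skipped_data_manifests: u64,
}
```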

@sdd (Contributor) commented Jul 11, 2025

Thanks for this contribution - I can see that you've put a lot of work into this and it is appreciated!

I do have some concerns with this approach, though. I think that by mimicking the pre-existing Java metrics approach so closely, we miss out on many of the advantages of the approaches we commonly see in modern idiomatic Rust codebases, which were not as widely available when the Java Iceberg metrics subsystem was originally created.

By using the metrics facade for metrics, and the tracing library for tracking things like function execution duration and scan file inclusion counts, we get more flexibility and easier integration with common observability platforms. Adding new metrics is trivial with metrics; there's no need to modify rigid report structs. Metrics can easily be exposed to Prometheus, and traces can easily be exported as OpenTelemetry traces, these two being among the most common approaches for each in the wild. (A rough sketch follows below.)

Also, I have another PR open that proposes quite a large refactor of the file plan code in the scan module (#1486). I think that PR and this one will each force a significant rework of whichever lands second, so we should try to coordinate to save each other from having to redo one or the other.
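
For context, a minimal sketch of that direction, assuming the metrics facade (0.22+ API) and tracing; the metric name and function are made up for illustration:

```rust
use metrics::counter;
use tracing::instrument;

// `#[instrument]` opens a span per call; with an OpenTelemetry-compatible
// subscriber, its duration can be exported as a trace. The counter can be
// surfaced to Prometheus via an exporter such as metrics-exporter-prometheus.
#[instrument(skip(entries))]
fn plan_manifest(entries: &[String]) {
    for _entry in entries {
        counter!("iceberg_scan_manifest_entries_total").increment(1);
    }
}
```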

@sdd (Contributor) commented Jul 12, 2025

I put something together off the back of my earlier comment and opened a draft PR. Here's the relevant commit: 2a02e55

DerGut added a commit: This means we can share references via TableScan::column_names()
@DerGut (Contributor Author) commented Jul 13, 2025

Thanks a lot for chiming in @sdd! I think using metrics and tracing directly makes a lot of sense. They are generally about as pluggable as the reporting interface, and I can't think of an implementation that wouldn't work either way.
I have one concern though: emitting metrics straight from the implementation effectively binds us to a choice of metric names. To be able to aggregate metrics across language clients, I think we should agree on a naming scheme that works for the other implementations too.
I've been meaning to start a similar discussion in apache/iceberg-go and apache/iceberg-python too, so let me go ahead and do that now!

At the same time, the Metrics API seems to be part of the catalog spec, and I wonder whether it shouldn't be implemented regardless.

@sdd (Contributor) commented Jul 13, 2025

Sounds good on the metrics naming convention: some cross-implementation standardisation would be great, especially with pyiceberg looking to use iceberg-rust in the core.

Re the metrics / reporting API from the spec: I was thinking about this last night, and I'm wondering whether it would be feasible to implement using a custom tracing subscriber, a metrics reporter, or a combination of both. That way the reports would plug into the same instrumentation as any other means of observability. I've not looked into this properly yet, but it would be interesting to explore!
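
A minimal sketch of the subscriber side of that idea, using tracing-subscriber's Layer trait; the report assembly is left as a stub:

```rust
use tracing::field::{Field, Visit};
use tracing::{span, Subscriber};
use tracing_subscriber::layer::{Context, Layer};

#[derive(Default)]
struct ReportLayer;

// Collects span fields as name/value strings.
struct FieldCollector(Vec<(String, String)>);

impl Visit for FieldCollector {
    fn record_debug(&mut self, field: &Field, value: &dyn std::fmt::Debug) {
        self.0.push((field.name().to_string(), format!("{value:?}")));
    }
}

impl<S: Subscriber> Layer<S> for ReportLayer {
    fn on_new_span(&self, attrs: &span::Attributes<'_>, _id: &span::Id, _ctx: Context<'_, S>) {
        let mut fields = FieldCollector(Vec::new());
        attrs.record(&mut fields);
        // Here one could assemble the collected fields into a spec-shaped
        // metrics report and hand it to a reporter implementation.
    }
}
```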

@DerGut (Contributor Author) commented Jul 13, 2025

I've created apache/iceberg-go#485 and apache/iceberg-python#474 (comment) for Go and Python respectively.

> Re the metrics / reporting api from the spec.

I see the biggest problem in the way the spec's API bundles multiple metric values into a single report. To implement this using a metrics-crate exporter, I assume we would either need to assign individually arriving metrics to their respective report (e.g. via a scan/commit ID), or ignore these semantics and send multiple partial reports instead (which would arguably not follow the spec).

@sdd (Contributor) commented Jul 14, 2025

Yes, you're right. I was thinking that the traces exported by tracing would be more suitable for this than metrics. Not sure why I even suggested metrics now 😅

@sdd (Contributor) commented Jul 16, 2025

I'm in the process of adding an integration test that showcases how to export traces via OTEL OTLP to Jaeger running in a container alongside the rest of the integration test containers.

You get a trace like this:

[screenshot: example trace rendered in Jaeger]

I'm getting these changes into better shape and looking to include them in a commit on my PR, #1486, soon. Just thought I'd share progress :-)

@DerGut (Contributor Author) commented Jul 16, 2025

Thanks for sharing! This is great progress 👏
I'm working on a doc to bring our discussion to a wider audience, as it touches on points that are relevant for all the client implementations. I'll share it here too once it's ready!

I also thought a bit more about retrofitting the Metrics Reporting API. It's probably not too hard to write a custom tracing exporter that extracts span attributes for report building. But in addition to that, I'd like to see independently published metrics too. From my quick research, it doesn't seem uniformly possible to derive values from spans and republish them as metrics, at least not in a way we could control from within Iceberg.
This is probably what you meant earlier, but I've only just realized it.

Successfully merging this pull request may close these issues:

Metrics Reporter API