@tdcmeehan

Description

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... 
* ... 

Hive Connector Changes
* ... 
* ... 

If release note is NOT required, use:

== NO RELEASE NOTE ==

@prestodb-ci added the from:IBM label on Dec 2, 2025

@sourcery-ai (bot) left a comment

Sorry @tdcmeehan, your pull request is larger than the review limit of 150000 diff characters

@steveburnett left a comment

Nice work on the documentation! A few nits and suggestions, nothing major.

In this example:

* ``orders.order_date`` and ``customers.reg_date`` are equivalent due to the equality join condition
* Even though ``reg_date`` is not in the MV's SELECT list, staleness can be tracked through the equivalence to ``order_date``

Suggested change
* Even though ``reg_date`` is not in the MV's SELECT list, staleness can be tracked through the equivalence to ``order_date``
* Even though ``reg_date`` is not in the SELECT list, staleness can be tracked through the equivalence to ``order_date``

The all-caps "MV" was jarring. Suggest removing it, since the context already makes the meaning clear, or spelling it out as "the materialized view's SELECT list".


**How Passthrough Mapping Works**

1. **Equivalence Extraction**: During MV creation, Presto analyzes JOIN conditions to identify

Same comment as above about MV. Suggest deleting, or spelling it out.

* Join must be an INNER JOIN (not LEFT, RIGHT, or FULL OUTER)
* Equality must be direct (``col1 = col2``), not through expressions like ``col1 = col2 + 1``
* Both columns must be partition columns in their respective tables
* At least one column in the equivalence class must be in the MV's output

Same comment as above about MV. Suggest deleting, or spelling it out.
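
The four requirements quoted above can be read as a filter over join equalities followed by a grouping step. The following is a minimal Python sketch of that reading, not Presto's actual implementation. The names `JoinEquality` and `equivalence_classes` are hypothetical, and the column names are borrowed from the `orders` / `customers` example earlier in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinEquality:
    """One equality predicate taken from a JOIN condition (hypothetical model)."""
    join_type: str       # "INNER", "LEFT", "RIGHT", "FULL"
    left: str            # qualified column, e.g. "orders.order_date"
    right: str           # qualified column, e.g. "customers.reg_date"
    direct: bool = True  # False for expressions such as col1 = col2 + 1


def equivalence_classes(equalities, partition_columns, output_columns):
    """Group columns made equivalent by qualifying join equalities.

    An equality qualifies only if the join is INNER, the comparison is a
    direct column-to-column equality, and both columns are partition columns.
    A class is kept only if at least one member is in the view's output.
    """
    parent = {}

    def find(col):
        # Union-find lookup with path halving.
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]
            col = parent[col]
        return col

    for eq in equalities:
        if eq.join_type != "INNER" or not eq.direct:
            continue
        if eq.left not in partition_columns or eq.right not in partition_columns:
            continue
        parent[find(eq.left)] = find(eq.right)

    groups = {}
    for col in list(parent):
        groups.setdefault(find(col), set()).add(col)
    return [g for g in groups.values() if g & set(output_columns)]
```

Under these assumptions, an INNER JOIN on `orders.order_date = customers.reg_date` with only `orders.order_date` in the output yields a single class containing both columns, while the same condition on a LEFT JOIN yields none.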


- All refreshes recompute the entire result set
- REFRESH does not provide snapshot isolation across multiple base tables
- All refreshes recompute the entire result set (incremental refresh not yet supported)

Suggested change
- All refreshes recompute the entire result set (incremental refresh not yet supported)
- All refreshes recompute the entire result set (incremental refresh not supported)

"yet" is an implied promise that should be avoided in documentation.

I considered suggesting that the entire parenthetical "(incremental refresh not yet supported)" be deleted, since it is arguably implied by "All refreshes recompute the entire result set". But I also see value in the explicit statement "incremental refresh not supported", so I am fine with it staying.

3. Partition constraints are built that identify exactly which data is stale

See the connector-specific documentation for details on how staleness is tracked.
For Iceberg tables, see :doc:`/connector/iceberg` (Materialized Views section).

Suggested change
For Iceberg tables, see :doc:`/connector/iceberg` (Materialized Views section).
For Iceberg tables, see :ref:`connector/iceberg:materialized views`.

Tested improved link in local doc build.

@tdcmeehan force-pushed the mv-iceberg-disjuncts branch 4 times, most recently from 05e8c4a to ab3ceea, December 4, 2025 18:43
@tdcmeehan force-pushed the mv-iceberg-disjuncts branch 2 times, most recently from e7c3d3a to 2cc1534, December 6, 2025 03:34
@tdcmeehan force-pushed the mv-iceberg-disjuncts branch from 2cc1534 to 24d1537, December 6, 2025 11:18