Skip to content

Commit 24d1537

Browse files
committed
Support partition stitching in MaterializedViewRewrite
1 parent ee2e169 commit 24d1537

File tree

22 files changed

+7467
-285
lines changed

22 files changed

+7467
-285
lines changed

presto-analyzer/src/main/java/com/facebook/presto/sql/analyzer/Analysis.java

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -867,6 +867,12 @@ public MaterializedViewAnalysisState getMaterializedViewAnalysisState(Table mate
867867
return materializedViewAnalysisStateMap.getOrDefault(materializedView, NOT_VISITED);
868868
}
869869

870+
public void markMaterializedViewDataTableAsVisiting(Table dataTable)
871+
{
872+
requireNonNull(dataTable, "dataTable is null");
873+
materializedViewAnalysisStateMap.put(dataTable, VISITING);
874+
}
875+
870876
public boolean hasTableInView(Table tableReference)
871877
{
872878
return tablesForView.contains(tableReference);

presto-docs/src/main/sphinx/admin/materialized-views.rst

Lines changed: 243 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -72,6 +72,249 @@ The following permissions are required for materialized view operations when
7272
* For DEFINER mode: User needs ``SELECT`` permission on the view itself
7373
* For INVOKER mode: User needs ``SELECT`` permission on all underlying base tables
7474

75+
Data Consistency Modes
76+
----------------------
77+
78+
Materialized views support three data consistency modes that control how queries are optimized
79+
when the view's data may be stale:
80+
81+
**USE_STITCHING** (default)
82+
Reads fresh data from storage, recomputes stale data from base tables,
83+
and combines results via UNION.
84+
85+
**USE_DATA_TABLE**
86+
Reads directly from storage, ignoring staleness. Fastest but may return stale data.
87+
88+
**USE_VIEW_QUERY**
89+
Executes the view query against base tables. Always fresh but highest cost.
90+
91+
Set via session property::
92+
93+
SET SESSION materialized_view_skip_storage = 'USE_STITCHING';
94+
95+
Predicate Stitching (USE_STITCHING Mode)
96+
----------------------------------------
97+
98+
Overview
99+
^^^^^^^^
100+
101+
Predicate stitching recomputes only stale data rather than the entire view. When base
102+
tables change, Presto identifies which data is affected and generates a UNION query
103+
that combines:
104+
105+
* **Storage scan**: Reads unchanged (fresh) data from the materialized view's storage
106+
* **Recompute branch**: Recomputes changed (stale) data from base tables using the view's
107+
defining query
108+
109+
This avoids full recomputation when only a subset of data is stale, though there is
110+
overhead from the UNION operation and predicate-based filtering.
111+
112+
How It Works
113+
^^^^^^^^^^^^
114+
115+
**Staleness Detection**
116+
117+
For each base table referenced in the materialized view, a connector may track which data
118+
has changed since the last refresh and return predicates identifying the stale data. The
119+
specific mechanism depends on the connector:
120+
121+
1. At refresh time, metadata is recorded (implementation varies by connector)
122+
2. When the view is queried, the current state is compared with the recorded state
123+
3. Predicates are built that identify exactly which data is stale
124+
125+
See the connector-specific documentation for details on how staleness is tracked.
126+
For Iceberg tables, see :doc:`/connector/iceberg` (Materialized Views section).
127+
128+
**Query Rewriting**
129+
130+
When a query uses a materialized view with stale data, the optimizer rewrites the query
131+
to use UNION::
132+
133+
-- Original query
134+
SELECT * FROM my_materialized_view WHERE order_date >= '2024-01-01'
135+
136+
-- Rewritten with predicate stitching (example using partition predicates)
137+
SELECT * FROM (
138+
-- Fresh partitions from storage
139+
SELECT * FROM my_materialized_view_storage
140+
WHERE order_date >= '2024-01-01'
141+
AND order_date NOT IN ('2024-01-15', '2024-01-16') -- Exclude stale
142+
UNION ALL
143+
-- Stale partitions recomputed
144+
SELECT o.order_id, c.customer_name, o.order_date
145+
FROM orders o
146+
JOIN customers c ON o.customer_id = c.customer_id
147+
WHERE o.order_date IN ('2024-01-15', '2024-01-16') -- Stale partition filter
148+
AND c.reg_date IN ('2024-01-15', '2024-01-16') -- Propagated via equivalence
149+
AND o.order_date >= '2024-01-01' -- Original filter preserved
150+
)
151+
152+
The partition predicate is propagated to equivalent columns in joined tables (in this case,
153+
``c.reg_date``), allowing partition pruning on the ``customers`` table as well.
154+
155+
Requirements
156+
^^^^^^^^^^^^
157+
158+
For predicate stitching to work effectively, the following requirements must be met:
159+
160+
**Predicate Mapping Requirement**
161+
162+
The connector must be able to express staleness as predicates that can be mapped to the
163+
materialized view's columns. The specific requirements depend on the connector implementation.
164+
For partition-based connectors (like Iceberg), this typically means:
165+
166+
* Base table partition columns must appear in the SELECT list or be equivalent to columns that do
167+
* The materialized view should be partitioned on the same or equivalent columns
168+
* Partition columns must use compatible data types
169+
170+
See connector-specific documentation for details on staleness tracking requirements.
171+
172+
**Unsupported Query Patterns**
173+
174+
Predicate stitching does not work with:
175+
176+
* **Outer joins**: LEFT, RIGHT, and FULL OUTER joins
177+
* **Non-deterministic functions**: ``RANDOM()``, ``NOW()``, ``UUID()``, etc.
178+
179+
**Security Constraints**
180+
181+
For SECURITY INVOKER materialized views, predicate stitching requires that:
182+
183+
* No column masks are defined on base tables (or the view is treated as fully stale)
184+
* No row filters are defined on base tables (or the view is treated as fully stale)
185+
186+
This is because column masks and row filters can vary by user, making it impossible to
187+
determine staleness in a user-independent way.
188+
189+
Column Equivalences and Passthrough Columns
190+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
191+
192+
Predicate stitching supports **passthrough columns** through **column equivalences**,
193+
which allows tracking staleness even when predicate columns from base tables
194+
are not directly in the materialized view's output.
195+
196+
**Column Equivalence**
197+
198+
When tables are joined with equality predicates, those columns become equivalent for
199+
predicate propagation purposes. This applies to any type of staleness predicate
200+
(partition-based, snapshot-based, etc.). For example with partition predicates::
201+
202+
CREATE TABLE orders (order_id BIGINT, customer_id BIGINT, order_date VARCHAR)
203+
WITH (partitioning = ARRAY['order_date']);
204+
205+
CREATE TABLE customers (customer_id BIGINT, name VARCHAR, reg_date VARCHAR)
206+
WITH (partitioning = ARRAY['reg_date']);
207+
208+
-- MV with equivalence: order_date = reg_date
209+
CREATE MATERIALIZED VIEW order_summary
210+
WITH (partitioning = ARRAY['order_date'])
211+
AS
212+
SELECT o.order_id, c.name, o.order_date
213+
FROM orders o
214+
JOIN customers c ON o.customer_id = c.customer_id
215+
AND o.order_date = c.reg_date; -- Creates equivalence
216+
217+
In this example:
218+
219+
* ``orders.order_date`` and ``customers.reg_date`` are equivalent due to the equality join condition
220+
* Even though ``reg_date`` is not in the MV's SELECT list, staleness can be tracked through the equivalence to ``order_date``
221+
* When ``customers`` table changes in partition ``reg_date='2024-01-15'``, this maps to ``order_date='2024-01-15'`` for recomputation
222+
223+
**How Passthrough Mapping Works**
224+
225+
1. **Equivalence Extraction**: During MV creation, Presto analyzes JOIN conditions to identify
226+
column equivalences
227+
228+
2. **Staleness Detection**: When a base table changes:
229+
230+
* The connector detects which data changed in the base table and returns predicates
231+
* For passthrough columns, predicates are mapped through equivalences
232+
* Example: ``customers.reg_date='2024-01-15'`` → ``orders.order_date='2024-01-15'``
233+
234+
3. **Predicate Application**: The mapped predicates are used in:
235+
236+
* Storage scan: Exclude data where equivalent columns match stale values
237+
* Recompute branch: Filter the stale table using the staleness predicate
238+
* Joined tables: Propagate the predicate to equivalent columns in joined
239+
tables, enabling pruning on those tables as well
240+
241+
**Requirements for Passthrough Columns**
242+
243+
* Join must be an INNER JOIN (not LEFT, RIGHT, or FULL OUTER)
244+
* Equality must be direct (``col1 = col2``), not through expressions like ``col1 = col2 + 1``
245+
* At least one column in the equivalence class must be in the MV's output
246+
* Data types must be compatible
247+
248+
**Transitive Equivalences**
249+
250+
Multiple equivalences can be chained together. If ``A.x = B.y`` and ``B.y = C.z``, then
251+
``A.x``, ``B.y``, and ``C.z`` are all equivalent for predicate propagation.
252+
253+
Unsupported Patterns
254+
^^^^^^^^^^^^^^^^^^^^
255+
256+
Predicate stitching is **not** applied in the following cases:
257+
258+
* **No staleness predicates available**: If the connector cannot provide staleness predicates
259+
* **Predicate columns not preserved**: If predicate columns are transformed or not mappable to MV output
260+
* **Outer joins with passthrough**: LEFT, RIGHT, and FULL OUTER joins invalidate passthrough equivalences due to null handling
261+
* **Expression-based equivalences**: ``CAST(col1 AS DATE) = col2`` or ``col1 = col2 + 1``
262+
263+
When predicate stitching cannot be applied, the behavior falls back to the configured consistency mode:
264+
265+
* If ``USE_STITCHING`` is set but stitching is not possible, the query falls back to full
266+
recompute (equivalent to ``USE_VIEW_QUERY``)
267+
* A warning may be logged indicating why stitching was not possible
268+
269+
Performance Considerations
270+
^^^^^^^^^^^^^^^^^^^^^^^^^^^
271+
272+
**When Stitching is Most Effective**
273+
274+
* **Large materialized views**: More benefit from avoiding full recomputation
275+
* **Localized changes**: When only a small fraction of data is stale
276+
* **Frequently refreshed**: When most data remains fresh between queries
277+
* **Well-structured data**: When staleness predicates align with data modification patterns
278+
279+
**Cost Trade-offs**
280+
281+
Predicate stitching introduces a UNION operation, which has overhead:
282+
283+
* **Storage scan overhead**: Reading from storage + filtering fresh data
284+
* **Recompute overhead**: Querying base tables + filtering stale data
285+
* **Union overhead**: Combining results from both branches
286+
287+
However, this is typically much cheaper than:
288+
289+
* **Full recompute**: Reading all base table data
290+
* **Stale data**: Returning incorrect results
291+
292+
**Optimization Tips**
293+
294+
1. **Predicate granularity**: For partition-based connectors, choose partition columns that align
295+
with data modification patterns
296+
297+
* Too coarse (e.g., partitioning by year): Recomputes too much data
298+
* Too fine (e.g., partitioning by second): Too many partitions to manage
299+
300+
2. **Refresh frequency**: Balance freshness needs with refresh costs
301+
302+
* More frequent refreshes: Less recomputation per query, but higher refresh costs
303+
* Less frequent refreshes: More recomputation per query, but lower refresh costs
304+
305+
3. **Query filters**: Include predicate columns in query filters when possible::
306+
307+
-- Good: Limits scan to relevant data
308+
SELECT * FROM mv WHERE order_date >= '2024-01-01'
309+
310+
-- Less optimal: Scans all data
311+
SELECT * FROM mv WHERE customer_id = 12345
312+
313+
4. **Monitor metrics**: Track the ratio of storage scan vs recompute:
314+
315+
* High recompute ratio: Consider more frequent refreshes or better staleness granularity
316+
* High storage scan ratio: Stitching is working efficiently
317+
75318
See Also
76319
--------
77320

presto-docs/src/main/sphinx/connector/iceberg.rst

Lines changed: 14 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2306,18 +2306,27 @@ Property Name Description
23062306

23072307
The storage table inherits standard Iceberg table properties for partitioning, sorting, and file format.
23082308

2309+
Staleness Tracking
2310+
^^^^^^^^^^^^^^^^^^
2311+
2312+
The Iceberg connector tracks materialized view staleness at the partition level, enabling
2313+
partition stitching to recompute only affected partitions rather than the entire view.
2314+
2315+
.. note::
2316+
Partition-level staleness detection only works for append-only changes (INSERT).
2317+
DELETE or UPDATE operations on base tables cause the entire view to be treated
2318+
as stale, requiring full recomputation.
2319+
23092320
Freshness and Refresh
23102321
^^^^^^^^^^^^^^^^^^^^^^
23112322

2312-
Materialized views track the snapshot IDs of their base tables to determine staleness. When base tables are modified, the materialized view becomes stale and returns results by querying the base tables directly. After running ``REFRESH MATERIALIZED VIEW``, queries read from the pre-computed storage table.
2313-
2314-
The refresh operation uses a full refresh strategy, replacing all data in the storage table with the current query results.
2323+
After running ``REFRESH MATERIALIZED VIEW``, queries read from the pre-computed storage table. The refresh operation uses a full refresh strategy, replacing all data in the storage table with the current query results and recording the new snapshot IDs for all base tables.
23152324

23162325
Limitations
23172326
^^^^^^^^^^^
23182327

2319-
- All refreshes recompute the entire result set
2320-
- REFRESH does not provide snapshot isolation across multiple base tables
2328+
- All refreshes recompute the entire result set (incremental refresh not yet supported)
2329+
- REFRESH does not provide snapshot isolation across multiple base tables (each base table's current snapshot is used independently)
23212330
- Querying materialized views at specific snapshots or timestamps is not supported
23222331

23232332
Example

0 commit comments

Comments
 (0)