Add dynamic pruning filters from TopK state #15301

adriangb · 2025-03-19T04:41:14Z

This introduces a general mechanism for arbitrary ExecutionPlans to push down filters to their children.
This mechanism can be used in the future to tackle #7955.
This PR only implements this for the TopK operator.

If filter pushdown on parquet is turned on this is showing a ~3x performance improvement for Q23, overall 10x faster than main.
And I believe there is more juice to squeeze by:

Optimizing order in which files are scanned using file level statistics if they were collected.
Implementing pushdown support for SortPreservingMergeExec which will enable "global" pushdown helping plans with many partitions.

adriangb · 2025-03-19T04:43:30Z

Made a bit of progress on this... I think the general idea of sharing the state is there. The nice thing is that this mechanism can be used to push down other dynamic filters (joins, etc.).

What I'm having trouble with a bit is the wiring... I need to think of:

How the optimizer traverses things, making sure to abort if it passes through any unsupported nodes.
How it recognizes that it's reached a DataSource.
How the DataSource gets modified (in place?) to register the dynamic filter source.

Maybe that's the wrong approach... maybe this should be a method on ExecutionPlan that can recurse but by default is a no-op?

adriangb · 2025-03-19T05:44:19Z

Inspired by discussion in #13054 I went with adding this to ExecutionPlan.

adriangb · 2025-03-19T05:49:32Z

Tomorrow I plan on doing some tracer bullet testing to see if this approach works at all.

adriangb · 2025-03-19T05:53:49Z

cc @alamb

2010YOUY01 · 2025-03-19T07:43:09Z

This is really cool! I have a high-level question:
#15037 suggests pushdown topK to skip certain partitioned file, and #15177 purposed a very similar idea to pushdown topK, however it's using it to skip decoding RowGroup/page for other projected columns.
Is this PR's solution only able to skip file or it's a unified solution?

adriangb · 2025-03-19T12:10:56Z

I think this is just part of the picture. To fully match DuckDB we'd have to do something like the rewrite proposed in #15177 (comment) aka "late materialization" of the projection.

adriangb · 2025-03-19T14:13:33Z

Ok did a tracer bullet test with datafusion-cli and got this to push filters down into parquet filter pushdown!
Ironically it won't push down into stats pruning because of the same thing in #15057 where the pruning predicate is calculated at the planning stage and not when the file is opened.
I can copy the change here, it will be a minor conflict, but I'm somewhat inclined to just tackle one of the two at a time.

adriangb · 2025-03-19T14:18:57Z

I think this is just part of the picture. To fully match DuckDB we'd have to do something like the rewrite proposed in #15177 (comment) aka "late materialization" of the projection.

To expand on this: what I implemented here is just "dump" filter pushdown. To make a query like SELECT * FROM data ORDER BY id DESC LIMIT 10 fast you need the late materialization proposed in that comment or ordering and throttling of file reads (similar to SortPreservingMerge:

You need to order files within each partition so that you read ones "more likely" to produce meaningful filters first. So if you have files with id ranges (1,5) and (3,8) you should read the (3,8) file first. I guess TableProvider's and such need to handle this.
You may want to consider reducing the number of partitions since the fan out may be wasted work: if you do (1) correctly and 1-2 files are enough to fill the TopK then a fan out to 32 partitions means you opened ~30 files for no reason and the whole query would have likely been faster if you focused all effort on those 1-2 files you actually needed.

adriangb · 2025-03-19T14:19:53Z

datafusion/datasource-parquet/src/opener.rs

+        let dynamic_predicate = dynamic_filters.into_iter().reduce(|a, b| {
+            Arc::new(BinaryExpr::new(a, datafusion_expr::Operator::And, b))
+        });
+        // TODO: need to recalculate page and row group pruning predicates to include dynamic filters


See https://github.com/apache/datafusion/pull/15057/files#diff-bbd611d7b35d7f17633eebbf32a07dc9e394f20135754ed949751e8030049e38

adriangb · 2025-03-19T20:30:37Z

@alamb marking this as ready for an initial review. There's still a lot of work to be done I guess (I'd like to see Q23 results!) but I'd like to get some feedback on the approach and missing pieces first. There's a working implementation, including a test; more are needed.

datafusion/common/src/config.rs

adriangb · 2025-03-19T20:31:29Z

datafusion/datasource-parquet/src/mod.rs

 fn should_enable_page_index(
    enable_page_index: bool,
    page_pruning_predicate: &Option<Arc<PagePruningAccessPlanFilter>>,
+    has_dynamic_filters: bool,
 ) -> bool {
    enable_page_index
-        && page_pruning_predicate.is_some()
-        && page_pruning_predicate
-            .as_ref()
-            .map(|p| p.filter_number() > 0)
-            .unwrap_or(false)
+        && (page_pruning_predicate.is_some()
+            && page_pruning_predicate
+                .as_ref()
+                .map(|p| p.filter_number() > 0)
+                .unwrap_or(false))
+        || has_dynamic_filters
 }


Copied from #15057

A comment would be helpful to explain why the presence of dynamic filters should trigger page index enablement.

adriangb · 2025-03-19T20:31:42Z

datafusion/datasource-parquet/src/opener.rs

+            let mut pruning_predicate = pruning_predicate;
+            let mut page_pruning_predicate = page_pruning_predicate;
+
+            if let Some(predicate) = predicate.as_ref() {


Similar to #15057

adriangb · 2025-03-19T20:32:18Z

datafusion/datasource-parquet/src/opener.rs

+
+/// Build a page pruning predicate from an optional predicate expression.
+/// If the predicate is None or the predicate cannot be converted to a page pruning
+/// predicate, return None.
+pub(crate) fn build_page_pruning_predicate(
+    predicate: &Arc<dyn PhysicalExpr>,
+    file_schema: &SchemaRef,
+) -> Arc<PagePruningAccessPlanFilter> {
+    Arc::new(PagePruningAccessPlanFilter::new(
+        predicate,
+        Arc::clone(file_schema),
+    ))
+}
+
+/// A vistor for a PhysicalExpr that collects all column references to determine what columns the expression needs to be evaluated.
+struct FilterSchemaBuilder<'schema> {
+    filter_schema_fields: BTreeSet<Arc<Field>>,
+    file_schema: &'schema Schema,
+    table_schema: &'schema Schema,
+}
+
+impl<'schema> FilterSchemaBuilder<'schema> {
+    fn new(file_schema: &'schema Schema, table_schema: &'schema Schema) -> Self {
+        Self {
+            filter_schema_fields: BTreeSet::new(),
+            file_schema,
+            table_schema,
+        }
+    }
+
+    fn sort_fields(
+        fields: &mut Vec<Arc<Field>>,
+        table_schema: &Schema,
+        file_schema: &Schema,
+    ) {
+        fields.sort_by_key(|f| f.name().to_string());
+        fields.dedup_by_key(|f| f.name().to_string());
+        fields.sort_by_key(|f| {
+            let table_schema_index =
+                table_schema.index_of(f.name()).unwrap_or(usize::MAX);
+            let file_schema_index = file_schema.index_of(f.name()).unwrap_or(usize::MAX);
+            (table_schema_index, file_schema_index)
+        });
+    }
+
+    fn build(self) -> SchemaRef {
+        let mut fields = self.filter_schema_fields.into_iter().collect::<Vec<_>>();
+        FilterSchemaBuilder::sort_fields(
+            &mut fields,
+            self.table_schema,
+            self.file_schema,
+        );
+        Arc::new(Schema::new(fields))
+    }
+}
+
+impl<'node> TreeNodeVisitor<'node> for FilterSchemaBuilder<'_> {
+    type Node = Arc<dyn PhysicalExpr>;
+
+    fn f_down(
+        &mut self,
+        node: &'node Arc<dyn PhysicalExpr>,
+    ) -> Result<TreeNodeRecursion> {
+        if let Some(column) = node.as_any().downcast_ref::<Column>() {
+            if let Ok(field) = self.table_schema.field_with_name(column.name()) {
+                self.filter_schema_fields.insert(Arc::new(field.clone()));
+            } else if let Ok(field) = self.file_schema.field_with_name(column.name()) {
+                self.filter_schema_fields.insert(Arc::new(field.clone()));
+            } else {
+                // valid fields are the table schema's fields + the file schema's fields, preferring the table schema's fields when there is a conflict
+                let mut valid_fields = self
+                    .table_schema
+                    .fields()
+                    .iter()
+                    .chain(self.file_schema.fields().iter())
+                    .cloned()
+                    .collect::<Vec<_>>();
+                FilterSchemaBuilder::sort_fields(
+                    &mut valid_fields,
+                    self.table_schema,
+                    self.file_schema,
+                );
+                let valid_fields = valid_fields
+                    .into_iter()
+                    .map(|f| datafusion_common::Column::new_unqualified(f.name()))
+                    .collect();
+                let field = datafusion_common::Column::new_unqualified(column.name());
+                return Err(datafusion_common::DataFusionError::SchemaError(
+                    SchemaError::FieldNotFound {
+                        field: Box::new(field),
+                        valid_fields,
+                    },
+                    Box::new(None),
+                ));
+            }
+        }
+
+        Ok(TreeNodeRecursion::Continue)
+    }
+}


Copied from #15057

The point I'm trying to make is that I think this is both a useful change and that the real diff here or there will be smaller once the other is merged

adriangb · 2025-03-19T20:32:57Z

datafusion/datasource-parquet/src/source.rs

+    fn supports_dynamic_filter_pushdown(&self) -> bool {
+        true
+    }


I guess I could nix this method and just treat a Ok(None) from push_down_dynamic_filter as not supported

I agree -- that would be a nicer interface -- or return a specific Enum perhaps 🤔

Okay will refactor to do this instead 👍🏻

datafusion/physical-plan/src/topk/mod.rs

adriangb · 2025-03-19T20:34:05Z

datafusion/physical-plan/src/topk/mod.rs

+                lit(threshold.value.clone()),
+            ));
+
+            // TODO: handle nulls first/last?


!!
Not sure exactly how to translate that into a dynamic filter...

This transformation might work:
For nulls-first =>(threshold.value is not null) and (threshold.expr is null or comparison)
For nulls-last => comparison // comparison include (threshold.expr is not null)
That said, if we go with this approach, the following part might no longer be needed:

// Skip null threshold values - can't create a meaningful filter if threshold.value.is_null() { continue; }

WDYT :)

Hurts my brain a little bit but sounds good to me as long as we add good tests 😄
Would you like to make a PR to this PR to add that?

Sure! I'd be happy to give it a try.

adriangb · 2025-03-19T20:34:25Z

datafusion/core/src/datasource/physical_plan/parquet.rs

+        let sql = format!("explain analyze {query}");
+        let batches = ctx.sql(&sql).await.unwrap().collect().await.unwrap();
+        let explain_plan = format!("{}", pretty_format_batches(&batches).unwrap());
+        assert_contains!(explain_plan, "row_groups_pruned_statistics=96");


Proof this works!

Should we also add tests for:

Negative scenarios where dynamic filter pushdown should not be applied.

Edge cases such as empty datasets or cases with only dynamic filters without a static predicate.

Verification that the combined predicate (static AND dynamic) behaves as expected in different configurations.

Yes! More tests! I just tried this in my full system and found a bug w/ hive partition columns. Making a note to add a test and fix.

adriangb · 2025-03-20T00:05:36Z

I ran this against Q23, results look promising! Elapsed 3.173 seconds with datafusion.optimizer.enable_dynamic_filter_pushdown = true vs. 4.696 with false. Both with predicate pushdown turned on.

❯  ../target/release/datafusion-cli -f ../q.sql
DataFusion CLI v46.0.1
0 row(s) fetched.
Elapsed 0.004 seconds.

0 row(s) fetched.
Elapsed 0.000 seconds.

+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | SortPreservingMergeExec: [EventTime@4 ASC NULLS LAST], fetch=10, metrics=[output_rows=10, elapsed_compute=49.708µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                   |   SortExec: TopK(fetch=10), expr=[EventTime@4 ASC NULLS LAST], preserve_partitioning=[true], metrics=[output_rows=120, elapsed_compute=21.495293ms, row_replacements=794]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                   |     DataSourceExec: file_groups={12 groups: [[Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_0.parquet:0..122446530, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_1.parquet:0..174965044, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_10.parquet:0..101513258, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_11.parquet:0..118419888, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_12.parquet:0..149514164, ...], [Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_17.parquet:59860177..116867853, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_18.parquet:0..133119589, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_19.parquet:0..103692598, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_2.parquet:0..230595491, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_20.parquet:0..85766533, ...], [Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_26.parquet:122668027..156510916, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_27.parquet:0..166286210, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_28.parquet:0..162772407, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_29.parquet:0..79213288, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_3.parquet:0..192507052, ...], [Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_35.parquet:54087341..153632381, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_36.parquet:0..92487304, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_37.parquet:0..108247781, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_38.parquet:0..132005180, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_39.parquet:0..103522954, ...], [Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_42.parquet:118278445..288524057, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_43.parquet:0..299692947, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_44.parquet:0..242404750, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_45.parquet:0..148061387, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_46.parquet:0..92407680, ...], ...]}, projection=[WatchID, JavaEnable, Title, GoodEvent, EventTime, EventDate, CounterID, ClientIP, RegionID, UserID, CounterClass, OS, UserAgent, URL, Referer, IsRefresh, RefererCategoryID, RefererRegionID, URLCategoryID, URLRegionID, ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor, FlashMinor, FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor, CookieEnable, JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel, Params, IPNetworkID, TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID, IsArtifical, WindowClientWidth, WindowClientHeight, ClientTimeZone, ClientEventTime, SilverlightVersion1, SilverlightVersion2, SilverlightVersion3, SilverlightVersion4, PageCharset, CodeVersion, IsLink, IsDownload, IsNotBounce, FUniqID, OriginalURL, HID, IsOldCounter, IsEvent, IsParameter, DontCountHits, WithHash, HitColor, LocalEventTime, Age, Sex, Income, Interests, Robotness, RemoteIP, WindowName, OpenerName, HistoryLength, BrowserLanguage, BrowserCountry, SocialNetwork, SocialAction, HTTPError, SendTiming, DNSTiming, ConnectTiming, ResponseStartTiming, ResponseEndTiming, FetchTiming, SocialSourceNetworkID, SocialSourcePage, ParamPrice, ParamOrderID, ParamCurrency, ParamCurrencyID, OpenstatServiceName, OpenstatCampaignID, OpenstatAdID, OpenstatSourceID, UTMSource, UTMMedium, UTMCampaign, UTMContent, UTMTerm, FromTag, HasGCLID, RefererHash, URLHash, CLID], file_type=parquet, predicate=CAST(URL@13 AS Utf8View) LIKE %google%, metrics=[output_rows=6141, elapsed_compute=12ns, bytes_scanned=8336084257, file_open_errors=0, file_scan_errors=0, num_predicate_creation_errors=0, page_index_rows_matched=0, page_index_rows_pruned=0, predicate_evaluation_errors=0, pushdown_rows_matched=11821, pushdown_rows_pruned=65222022, row_groups_matched_bloom_filter=0, row_groups_matched_statistics=136, row_groups_pruned_bloom_filter=0, row_groups_pruned_statistics=117, bloom_filter_eval_time=843.303µs, metadata_load_time=242.295025ms, page_index_eval_time=12.475µs, row_pushdown_eval_time=4.249525318s, statistics_eval_time=2.316302ms, time_elapsed_opening=48.356002ms, time_elapsed_processing=20.809290216s, time_elapsed_scanning_total=25.623879841s, time_elapsed_scanning_until_data=11.100054751s] |
|                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 3.173 seconds.

❯  ../target/release/datafusion-cli -f ../q.sql
DataFusion CLI v46.0.1
0 row(s) fetched.
Elapsed 0.001 seconds.

0 row(s) fetched.
Elapsed 0.000 seconds.

+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | SortPreservingMergeExec: [EventTime@4 ASC NULLS LAST], fetch=10, metrics=[output_rows=10, elapsed_compute=56.167µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
|                   |   SortExec: TopK(fetch=10), expr=[EventTime@4 ASC NULLS LAST], preserve_partitioning=[true], metrics=[output_rows=120, elapsed_compute=49.926081ms, row_replacements=794]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|                   |     DataSourceExec: file_groups={12 groups: [[Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_0.parquet:0..122446530, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_1.parquet:0..174965044, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_10.parquet:0..101513258, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_11.parquet:0..118419888, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_12.parquet:0..149514164, ...], [Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_17.parquet:59860177..116867853, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_18.parquet:0..133119589, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_19.parquet:0..103692598, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_2.parquet:0..230595491, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_20.parquet:0..85766533, ...], [Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_26.parquet:122668027..156510916, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_27.parquet:0..166286210, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_28.parquet:0..162772407, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_29.parquet:0..79213288, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_3.parquet:0..192507052, ...], [Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_35.parquet:54087341..153632381, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_36.parquet:0..92487304, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_37.parquet:0..108247781, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_38.parquet:0..132005180, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_39.parquet:0..103522954, ...], [Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_42.parquet:118278445..288524057, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_43.parquet:0..299692947, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_44.parquet:0..242404750, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_45.parquet:0..148061387, Users/adriangb/GitHub/datafusion/benchmarks/data/hits_partitioned/hits_46.parquet:0..92407680, ...], ...]}, projection=[WatchID, JavaEnable, Title, GoodEvent, EventTime, EventDate, CounterID, ClientIP, RegionID, UserID, CounterClass, OS, UserAgent, URL, Referer, IsRefresh, RefererCategoryID, RefererRegionID, URLCategoryID, URLRegionID, ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor, FlashMinor, FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor, CookieEnable, JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel, Params, IPNetworkID, TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID, IsArtifical, WindowClientWidth, WindowClientHeight, ClientTimeZone, ClientEventTime, SilverlightVersion1, SilverlightVersion2, SilverlightVersion3, SilverlightVersion4, PageCharset, CodeVersion, IsLink, IsDownload, IsNotBounce, FUniqID, OriginalURL, HID, IsOldCounter, IsEvent, IsParameter, DontCountHits, WithHash, HitColor, LocalEventTime, Age, Sex, Income, Interests, Robotness, RemoteIP, WindowName, OpenerName, HistoryLength, BrowserLanguage, BrowserCountry, SocialNetwork, SocialAction, HTTPError, SendTiming, DNSTiming, ConnectTiming, ResponseStartTiming, ResponseEndTiming, FetchTiming, SocialSourceNetworkID, SocialSourcePage, ParamPrice, ParamOrderID, ParamCurrency, ParamCurrencyID, OpenstatServiceName, OpenstatCampaignID, OpenstatAdID, OpenstatSourceID, UTMSource, UTMMedium, UTMCampaign, UTMContent, UTMTerm, FromTag, HasGCLID, RefererHash, URLHash, CLID], file_type=parquet, predicate=CAST(URL@13 AS Utf8View) LIKE %google%, metrics=[output_rows=15911, elapsed_compute=12ns, bytes_scanned=14265851248, file_open_errors=0, file_scan_errors=0, num_predicate_creation_errors=0, page_index_rows_matched=0, page_index_rows_pruned=0, predicate_evaluation_errors=0, pushdown_rows_matched=15911, pushdown_rows_pruned=99981586, row_groups_matched_bloom_filter=0, row_groups_matched_statistics=0, row_groups_pruned_bloom_filter=0, row_groups_pruned_statistics=0, bloom_filter_eval_time=222ns, metadata_load_time=1.494189151s, page_index_eval_time=222ns, row_pushdown_eval_time=7.309380164s, statistics_eval_time=222ns, time_elapsed_opening=16.006422ms, time_elapsed_processing=41.532244256s, time_elapsed_scanning_total=46.85323142s, time_elapsed_scanning_until_data=14.505093042s] |
|                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 4.696 seconds.

alamb · 2025-03-21T00:28:33Z

I ran this against Q23, results look promising! Elapsed 3.173 seconds with datafusion.optimizer.enable_dynamic_filter_pushdown = true vs. 4.696 with false. Both with predicate pushdown turned on.

Nice!

I plan to check this PR out tomorrow carefully

kosiew

Great work!
A lot better than my earlier tinkering effort at this.

I left some comments for your consideration.

datafusion/common/src/config.rs

kosiew · 2025-03-21T01:26:49Z

datafusion/core/src/datasource/physical_plan/parquet.rs

+        let sql = format!("explain analyze {query}");
+        let batches = ctx.sql(&sql).await.unwrap().collect().await.unwrap();
+        let explain_plan = format!("{}", pretty_format_batches(&batches).unwrap());
+        assert_contains!(explain_plan, "row_groups_pruned_statistics=96");


Should we also add tests for:

Negative scenarios where dynamic filter pushdown should not be applied.

Edge cases such as empty datasets or cases with only dynamic filters without a static predicate.

Verification that the combined predicate (static AND dynamic) behaves as expected in different configurations.

kosiew · 2025-03-21T01:29:24Z

datafusion/datasource-parquet/src/mod.rs

 fn should_enable_page_index(
    enable_page_index: bool,
    page_pruning_predicate: &Option<Arc<PagePruningAccessPlanFilter>>,
+    has_dynamic_filters: bool,
 ) -> bool {
    enable_page_index
-        && page_pruning_predicate.is_some()
-        && page_pruning_predicate
-            .as_ref()
-            .map(|p| p.filter_number() > 0)
-            .unwrap_or(false)
+        && (page_pruning_predicate.is_some()
+            && page_pruning_predicate
+                .as_ref()
+                .map(|p| p.filter_number() > 0)
+                .unwrap_or(false))
+        || has_dynamic_filters
 }


A comment would be helpful to explain why the presence of dynamic filters should trigger page index enablement.

kosiew · 2025-03-21T01:41:58Z

datafusion/datasource-parquet/src/opener.rs

+        // Collect dynamic_filters into a single predicate by reducing with AND
+        let dynamic_predicate = dynamic_filters.into_iter().reduce(|a, b| {
+            Arc::new(BinaryExpr::new(a, datafusion_expr::Operator::And, b))


The dynamic filters are reduced using the AND operator when combined with the static predicate. It would help to document the reasoning behind this choice and any assumptions about how these predicates interact.

eg Using the AND operator to combine the dynamic filters with the static predicate means that a row (or a row group) must satisfy both conditions before it's read from disk.

The approach assumes that the static predicate and dynamic filters are independent and complementary. In other words, the dynamic filters are not meant to replace or override the original predicate; they refine the set of rows even further. If they were combined using OR, you might end up with more rows than necessary, which would negate the benefits of dynamic filtering.

Since the dynamic filters are calculated at runtime, they might sometimes be conservative estimates. By combining them with AND, the system errs on the side of safety—only excluding data when it’s reasonably certain that the rows won’t match the overall query conditions.

kosiew · 2025-03-21T02:11:49Z

datafusion/physical-plan/src/topk/mod.rs

+        let mut heap = self.heap.try_write().map_err(|_| {
+            DataFusionError::Internal(
+                "Failed to acquire write lock on TopK heap".to_string(),


The use of try_write() here and
let mut heap = self.heap.write().map_err later at line 235 immediately return an error if the lock cannot be acquired. This means that if another thread holds the lock—even for a brief moment—the current thread will error out and convert the failure into an internal error. In the blocking version using write(), if the lock acquisition fails due to a poisoned lock, it’s also immediately converted into an internal error.

Converting lock acquisition failures into an internal error is a straightforward approach. It flags an unexpected situation but doesn’t provide details on the nature of the contention or any attempt to recover from transient lock conflicts.

Should we consider retry for transient contention or add logging for diagnostic information?

a think a poisoned lock is likely not a scenario that happens often in practice so it is not something that needs a lot of special handing

side issue here - is there a "policy" about using std locks vs parking_lot? a basic search showed they are used roughly equally but it seems like a weird inconsistency (There's also this ~3 year-old PR by @xudong963).

kosiew · 2025-03-21T02:29:37Z

datafusion/datasource/src/file.rs

+
+    fn supports_dynamic_filter_pushdown(&self) -> bool {
+        false
+    }
+
+    fn push_down_dynamic_filter(
+        &self,
+        _dynamic_filter: Arc<dyn DynamicFilterSource>,
+    ) -> datafusion_common::Result<Option<Arc<dyn FileSource>>> {
+        Ok(None)
+    }


This PR adds support for dynamic filter pushdown in multiple modules (e.g., in FileSource, DataSource, ProjectionExec, RepartitionExec, FilterExec, and SortExec).

Common helper functions or traits could reduce code duplication.
For example, a shared trait for dynamic filter pushdown behavior might centralize the logic and reduce maintenance overhead.

datafusion/physical-plan/src/sorts/sort.rs

2010YOUY01 · 2025-03-21T04:57:40Z

I ran this against Q23, results look promising! Elapsed 3.173 seconds with datafusion.optimizer.enable_dynamic_filter_pushdown = true vs. 4.696 with false. Both with predicate pushdown turned on.

Could you share the benchmark script? I tried but the results are similar, perhaps I missed some configurations.

geoffreyclaude · 2025-04-02T10:36:12Z

@adriangb looks super promising, especially as it paves the way for general dynamic filtering!

I tested out your branch to see the overlap with #15529, and I had trouble understanding exactly when and how the filters got applied. I dumped the execution plan after running a topk query and the filters were way broader than what I expected.

Would it be possible to add an end-to-end test as a "demo" of the feature, especially validating the final filters?

geoffreyclaude · 2025-04-02T10:58:13Z

datafusion/core/tests/parquet/mod.rs

@@ -1072,3 +1080,364 @@ async fn make_test_file_page(scenario: Scenario, row_per_page: usize) -> NamedTe
    writer.close().unwrap();
    output_file
 }
+
+struct DynamicFilterTestCase {


Would it make sense to move this test block to a dedicated file? The parquet/mod.rs is already pretty long

Yep moved into parquet/filter_pushdown.rs which is both smaller and more related

2010YOUY01

Great to see the progress! I’ve left a few suggestions that might help improve the code’s clarity—especially around parts that weren’t immediately obvious to me on the first read.

datafusion/physical-optimizer/src/filter_pushdown.rs

datafusion/physical-plan/src/execution_plan.rs

2010YOUY01 · 2025-04-02T12:40:29Z

datafusion/physical-plan/src/sorts/sort.rs

                Ok(Box::pin(RecordBatchStreamAdapter::new(
                    self.schema(),
                    futures::stream::once(async move {
                        while let Some(batch) = input.next().await {
                            let batch = batch?;
                            topk.insert_batch(batch)?;
+                            if enable_dynamic_filter_pushdown {


I think the code to update the current executor's dynamic filter can be extracted to a separate function like self.maybe_update_dynamic_filter(...)

Perhaps as a follow-up PR to make this PR easier to merge.

adriangb · 2025-04-02T06:40:35Z

datafusion/core/src/datasource/physical_plan/arrow_file.rs

-#[derive(Clone, Default)]
+#[derive(Clone, Default, Debug)]


This was just an annoyance during debugging. Can revert.

Looks good to me -- we can also make a separate PR for it too (not needed)

adriangb · 2025-04-02T06:41:39Z

datafusion/core/src/datasource/listing/table.rs

-                filters.as_ref(),
+                None,


We kind of need to do this otherwise you end up with duplicate filters for the case where ListingTable said inexact -> a FilterExec gets created -> we then push down from the filter exec in to the DataSourceExec that already had the filter -> duplicate filter.

Because essentially this PR is having to introduce a generalized way to do filter pushdown instead of the very specific way that ListingTable does it. And we wouldn't want to do both at the same time. What we want is for ListingTable to tell us:

Which filters it can apply just from partitioning (Exact)

Any other filter becomes Inexact

I wonder if we can split out the part of this PR that changes how the filters are pushed down (aka pruning predicate per file rather than one overall) as a separate PR to isolate the changes into smaller PRs?

adriangb · 2025-04-02T06:44:10Z

datafusion/datasource-parquet/src/source.rs

@@ -498,6 +489,7 @@ impl FileSource for ParquetSource {
            reorder_filters: self.reorder_filters(),
            enable_page_index: self.enable_page_index(),
            enable_bloom_filter: self.bloom_filter_on_read(),
+            enable_stats_pruning: self.table_parquet_options.global.pruning,


I had to add this for the case where this is disabled (false): otherwise if we push down filters they end up getting used for row group pruning, which differs from our current behavior.

alamb · 2025-04-02T17:51:32Z

Startingt o check this one out again

adriangb · 2025-04-02T17:58:12Z

@alamb check out ecc89f9#diff-05ace4c36d20453103f49749bad98864aea48680b0a4d5691d7ba5185d8ae4c9, I added a lot of docs / comments

adriangb · 2025-04-02T18:03:20Z

datafusion/physical-optimizer/src/filter_pushdown.rs

+/// ```text
+// ┌──────────────────────┐
+// │ CoalesceBatchesExec  │
+// └──────────────────────┘
+//             │
+//             ▼
+// ┌──────────────────────┐
+// │      FilterExec      │
+// │    filters =         │
+// │     [cost>50,id=1]   │
+// └──────────────────────┘
+//             │
+//             ▼
+// ┌──────────────────────┐
+// │    ProjectionExec    │
+// │ cost = price * 1.2   │
+// └──────────────────────┘
+//             │
+//             ▼
+// ┌──────────────────────┐
+// │    DataSourceExec    │
+// │    projection = *    │
+// └──────────────────────┘
+/// ```


I think this is an interesting example.
It made me realize we need to expand PhysicalExpr::suports_filter_pushdown(&self) -> bool to PhysicalExpr::suports_filter_pushdown(&self, filters: &[&Arc<dyn PhysicalExpr>]) -> Vec<FilterPushdownSupport> or similar so that ProjectionExec can check which filters reference the columns it is creating and block those but allow others

alamb

First of all, thank you so much @adriangb @AdamGS @ctsk @YjyJeff @2010YOUY01 @kosiew @suibianwanwank @geoffreyclaude and @berkaysynnada -- this is a pretty amazing piece of optimization and technology and a great team effort.

In my opinion this is a very important feature and the structure in this PR is a great foundation for more general dynamic filtering as @geoffreyclaude says.

Suggested next steps

Since it is looking good, and we should start working on merging it. However, as you have said given the size of this PR I think it might be easier to do so if we broke it into pieces

I suggest we first make a PR for adding the physical filter pushdown (and associated ExecutionPlan methods) in datafusion/physical-optimizer/src/filter_pushdown.rs

BTW

It is quite cool to see a dynamic filter in this explain plan (predicate=DynamicFilterPhysicalExpr [ SortDynamicFilterSource[ ] ] |)

> explain format indent select * from hits ORDER BY "EventTime" DESC limit 10;
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Sort: hits.EventTime DESC NULLS FIRST, fetch=10                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|               |   TableScan: hits projection=[WatchID, JavaEnable, Title, GoodEvent, EventTime, EventDate, CounterID, ClientIP, RegionID, UserID, CounterClass, OS, UserAgent, URL, Referer, IsRefresh, RefererCategoryID, RefererRegionID, URLCategoryID, URLRegionID, ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor, FlashMinor, FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor, CookieEnable, JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel, Params, IPNetworkID, TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID, IsArtifical, WindowClientWidth, WindowClientHeight, ClientTimeZone, ClientEventTime, SilverlightVersion1, SilverlightVersion2, SilverlightVersion3, SilverlightVersion4, PageCharset, CodeVersion, IsLink, IsDownload, IsNotBounce, FUniqID, OriginalURL, HID, IsOldCounter, IsEvent, IsParameter, DontCountHits, WithHash, HitColor, LocalEventTime, Age, Sex, Income, Interests, Robotness, RemoteIP, WindowName, OpenerName, HistoryLength, BrowserLanguage, BrowserCountry, SocialNetwork, SocialAction, HTTPError, SendTiming, DNSTiming, ConnectTiming, ResponseStartTiming, ResponseEndTiming, FetchTiming, SocialSourceNetworkID, SocialSourcePage, ParamPrice, ParamOrderID, ParamCurrency, ParamCurrencyID, OpenstatServiceName, OpenstatCampaignID, OpenstatAdID, OpenstatSourceID, UTMSource, UTMMedium, UTMCampaign, UTMContent, UTMTerm, FromTag, HasGCLID, RefererHash, URLHash, CLID]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| physical_plan | SortPreservingMergeExec: [EventTime@4 DESC], fetch=10                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|               |   SortExec: TopK(fetch=10), expr=[EventTime@4 DESC], preserve_partitioning=[true]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|               |     DataSourceExec: file_groups={16 groups: [[Users/andrewlamb/Downloads/hits/hits.parquet:0..923748528], [Users/andrewlamb/Downloads/hits/hits.parquet:923748528..1847497056], [Users/andrewlamb/Downloads/hits/hits.parquet:1847497056..2771245584], [Users/andrewlamb/Downloads/hits/hits.parquet:2771245584..3694994112], [Users/andrewlamb/Downloads/hits/hits.parquet:3694994112..4618742640], ...]}, projection=[WatchID, JavaEnable, Title, GoodEvent, EventTime, EventDate, CounterID, ClientIP, RegionID, UserID, CounterClass, OS, UserAgent, URL, Referer, IsRefresh, RefererCategoryID, RefererRegionID, URLCategoryID, URLRegionID, ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor, FlashMinor, FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor, CookieEnable, JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel, Params, IPNetworkID, TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID, IsArtifical, WindowClientWidth, WindowClientHeight, ClientTimeZone, ClientEventTime, SilverlightVersion1, SilverlightVersion2, SilverlightVersion3, SilverlightVersion4, PageCharset, CodeVersion, IsLink, IsDownload, IsNotBounce, FUniqID, OriginalURL, HID, IsOldCounter, IsEvent, IsParameter, DontCountHits, WithHash, HitColor, LocalEventTime, Age, Sex, Income, Interests, Robotness, RemoteIP, WindowName, OpenerName, HistoryLength, BrowserLanguage, BrowserCountry, SocialNetwork, SocialAction, HTTPError, SendTiming, DNSTiming, ConnectTiming, ResponseStartTiming, ResponseEndTiming, FetchTiming, SocialSourceNetworkID, SocialSourcePage, ParamPrice, ParamOrderID, ParamCurrency, ParamCurrencyID, OpenstatServiceName, OpenstatCampaignID, OpenstatAdID, OpenstatSourceID, UTMSource, UTMMedium, UTMCampaign, UTMContent, UTMTerm, FromTag, HasGCLID, RefererHash, URLHash, CLID], file_type=parquet, predicate=DynamicFilterPhysicalExpr [ SortDynamicFilterSource[  ] ] |
|               |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 row(s) fetched.
Elapsed 0.059 seconds.

alamb · 2025-04-02T17:55:27Z

datafusion/core/src/datasource/physical_plan/arrow_file.rs

-#[derive(Clone, Default)]
+#[derive(Clone, Default, Debug)]


Looks good to me -- we can also make a separate PR for it too (not needed)

alamb · 2025-04-02T18:02:41Z

datafusion/datasource-parquet/src/source.rs

@@ -349,11 +337,13 @@ impl ParquetSource {
    }

    /// Optional reference to this parquet scan's pruning predicate
+    #[deprecated(note = "ParquetDataSource no longer constructs a PruningPredicate.")]
    pub fn pruning_predicate(&self) -> Option<&Arc<PruningPredicate>> {
        self.pruning_predicate.as_ref()


For anyone following along

Remove ParquetSource::pruning_predicate #15534

alamb · 2025-04-02T18:26:48Z

datafusion/core/src/datasource/listing/table.rs

-                filters.as_ref(),
+                None,


I wonder if we can split out the part of this PR that changes how the filters are pushed down (aka pruning predicate per file rather than one overall) as a separate PR to isolate the changes into smaller PRs?

alamb · 2025-04-02T18:27:46Z

I am trying to write some tests for filter_pushdown now, btw, as a way to help make that a separate PR

alamb · 2025-04-02T20:10:36Z

I am trying to write some tests for filter_pushdown now, btw, as a way to help make that a separate PR

@adriangb :

here is what I have in mind: Add filter pushdown tests pydantic/datafusion#16

If you like this pattern I recommend:

Merge that PR
Start a new PR for adding the pushdown code and I'll work on adding more tests there

jayzhan211 · 2025-04-03T03:02:02Z

datafusion/physical-optimizer/src/filter_pushdown.rs

+/// There are two pushdowns we can do here:
+/// 1. Push down the `[d.size > 100]` filter through the `HashJoinExec` node to the `DataSourceExec` node for the `departments` table.
+/// 2. Push down the hash table state from the `HashJoinExec` node to the `DataSourceExec` node to avoid reading
+///    rows from teh `users` table that will be eliminated by the join.


Suggested change

/// rows from teh `users` table that will be eliminated by the join.

/// rows from the `users` table that will be eliminated by the join.

jayzhan211 · 2025-04-03T03:41:26Z

datafusion/physical-plan/src/filter.rs

+                    input: Arc::clone(&self.input),
+                    metrics: self.metrics.clone(),
+                    default_selectivity: self.default_selectivity,
+                    cache: self.cache.clone(),


if predicate is updated, should we update the cache too?

jayzhan211 · 2025-04-03T03:47:26Z

datafusion/physical-plan/src/dynamic_filters.rs

+    }
+}
+
+impl PhysicalExpr for DynamicFilterPhysicalExpr {


PhysicalExpr is usually in physical-expr crate

jayzhan211 · 2025-04-03T03:56:44Z

datafusion/physical-plan/src/sorts/sort_filters.rs

+    /// Sort expressions
+    expr: LexOrdering,
+    /// Current threshold values
+    thresholds: Arc<RwLock<Vec<Option<ScalarValue>>>>,


do we need Arc<RwLock<T>>? I think we will only have single instance so it is safe to update the values

jayzhan211 · 2025-04-03T03:57:05Z

datafusion/physical-plan/src/sorts/sort_filters.rs

+    pub fn update_values(&self, new_values: &[ScalarValue]) -> Result<()> {
+        let replace = {
+            let thresholds = self.thresholds.read().map_err(|_| {
+                datafusion_common::DataFusionError::Execution(


exec_err!()

jayzhan211 · 2025-04-03T07:08:04Z

datafusion/physical-expr/src/utils/mod.rs

+            None => Some(predicate),
+            Some(acc) => Some(Arc::new(BinaryExpr::new(acc, Operator::And, predicate))),
+        })
+        .unwrap_or_else(|| crate::expressions::lit(true))


I think we can check the len of predicates before calling this function so we know need to create lit(true) for this case

jayzhan211 · 2025-04-03T07:20:29Z

datafusion/physical-plan/src/dynamic_filters.rs

+///
+/// See `TopKDynamicFilterSource` in datafusion/physical-plan/src/topk/mod.rs for examples.
+pub trait DynamicFilterSource:
+    Send + Sync + std::fmt::Debug + DynEq + DynHash + Display + 'static


Do we need 'static

jayzhan211 · 2025-04-03T07:21:59Z

datafusion/datasource/src/file_scan_config.rs

+
+    fn push_down_filters(
+        &self,
+        filters: &[Arc<dyn PhysicalExpr>],


Suggested change

filters: &[Arc<dyn PhysicalExpr>],

filters: &[PhysicalExprRef],

jayzhan211 · 2025-04-03T07:33:17Z

datafusion/datasource/src/file_scan_config.rs

+        filters: &[Arc<dyn PhysicalExpr>],
+    ) -> Result<Option<DataSourceFilterPushdownResult>> {
+        if let Some(file_source_result) = self.file_source.push_down_filters(filters)? {
+            let mut new_self = self.clone();


We could try to avoid clone if possible

fn push_down_filters( self: Arc<Self>, filters: &[Arc<dyn PhysicalExpr>], ) -> Result<Option<DataSourceFilterPushdownResult>> { if let Some(file_source_result) = self.file_source.push_down_filters(filters)? { let mut inner = Arc::into_inner(self).unwrap(); inner.file_source = file_source_result.inner; Ok(Some(DataSourceFilterPushdownResult { inner: Arc::new(inner) as Arc<dyn DataSource>, support: file_source_result.support, })) } else { Ok(None) } }

adriangb · 2025-04-03T18:57:09Z

I'm closing this now massive PR in favor of splitting it up into units of work #15512 (comment)

Thank you all for the amazing reviews! Lets continue work in the smaller PRs so that it's more tractable and easier to review / diff.

Dandandan · 2025-04-14T04:53:11Z

datafusion/physical-plan/src/sorts/sort_filters.rs

+        let mut filters: Vec<Arc<dyn PhysicalExpr>> =
+            Vec::with_capacity(thresholds.len());
+
+        let mut prev_sort_expr: Option<Arc<dyn PhysicalExpr>> = None;


I think this can be written with fewer expressions, like the following:

col0 < threshold0 || (col0 == threshold0 && col1 < threshold1 || (...)

github-actions bot added optimizer Optimizer rules core Core DataFusion crate common Related to common crate datasource Changes to the datasource crate labels Mar 19, 2025

github-actions bot added sqllogictest SQL Logic Tests (.slt) and removed optimizer Optimizer rules core Core DataFusion crate labels Mar 19, 2025

github-actions bot added the documentation Improvements or additions to documentation label Mar 19, 2025

adriangb commented Mar 19, 2025

View reviewed changes

github-actions bot added the core Core DataFusion crate label Mar 19, 2025

adriangb marked this pull request as ready for review March 19, 2025 20:28

adriangb commented Mar 19, 2025

View reviewed changes

This was referenced Mar 20, 2025

Make ClickBench Q23 Go Faster #15177

Open

Dynamic pruning filters from TopK state (optimize ORDER BY LIMIT queries) #15037

Open

Push Dynamic Join Predicates into Scan ("Sideways Information Passing", etc) #7955

Open

adriangb force-pushed the topk-filter branch from 9e146aa to ea0e012 Compare March 20, 2025 22:01

kosiew reviewed Mar 21, 2025

View reviewed changes

2010YOUY01 reviewed Mar 21, 2025

View reviewed changes

datafusion/physical-plan/src/sorts/sort.rs Outdated Show resolved Hide resolved

geoffreyclaude reviewed Apr 2, 2025

View reviewed changes

2010YOUY01 reviewed Apr 2, 2025

View reviewed changes

adriangb commented Apr 2, 2025

View reviewed changes

adriangb mentioned this pull request Apr 2, 2025

Extend TopK early termination to partially sorted inputs #15529

Closed

a lot of comments

ecc89f9

alamb mentioned this pull request Apr 2, 2025

Remove ParquetSource::pruning_predicate #15545

Closed

adriangb commented Apr 2, 2025

View reviewed changes

alamb mentioned this pull request Apr 2, 2025

Add topk information into tree explain plans #15546

Closed

alamb reviewed Apr 2, 2025

View reviewed changes

alamb mentioned this pull request Apr 2, 2025

Add filter pushdown tests pydantic/datafusion#16

Merged

adriangb and others added 6 commits April 2, 2025 15:17

refactoring

e59aac5

fixes

a7ce3bc

fix additional test

5ebba12

Chore: Add basic filter pushdown tests (#16)

d423866

fix new tests

b683507

fixes

fbf93a2

github-actions bot removed the sqllogictest SQL Logic Tests (.slt) label Apr 3, 2025

jayzhan211 reviewed Apr 3, 2025

View reviewed changes

adriangb mentioned this pull request Apr 3, 2025

parquet reader: move pruning predicate creation from ParquetSource to ParquetOpener #15561

Merged

adriangb closed this Apr 3, 2025

adriangb mentioned this pull request Apr 11, 2025

dynamic filter refactor #15685

Draft

Dandandan reviewed Apr 14, 2025

View reviewed changes

This was referenced Apr 14, 2025

Optimize TopK with threshold filter ~1.4x speedup #15697

Open

[Epic] A collection of dynamic filtering related items #15512

Open

	/// rows from teh `users` table that will be eliminated by the join.
	/// rows from the `users` table that will be eliminated by the join.

	filters: &[Arc<dyn PhysicalExpr>],
	filters: &[PhysicalExprRef],

Add dynamic pruning filters from TopK state #15301

Add dynamic pruning filters from TopK state #15301

Conversation

adriangb commented Mar 19, 2025 • edited Loading

adriangb commented Mar 19, 2025

adriangb commented Mar 19, 2025

adriangb commented Mar 19, 2025

adriangb commented Mar 19, 2025

2010YOUY01 commented Mar 19, 2025

adriangb commented Mar 19, 2025

adriangb commented Mar 19, 2025

adriangb commented Mar 19, 2025

Choose a reason for hiding this comment

adriangb commented Mar 19, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

suibianwanwank Mar 20, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adriangb commented Mar 20, 2025 • edited Loading

alamb commented Mar 21, 2025

kosiew left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AdamGS Mar 27, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

2010YOUY01 commented Mar 21, 2025

geoffreyclaude commented Apr 2, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

2010YOUY01 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adriangb Apr 2, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alamb commented Apr 2, 2025

adriangb commented Apr 2, 2025

Choose a reason for hiding this comment

alamb left a comment

Choose a reason for hiding this comment

Suggested next steps

BTW

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alamb commented Apr 2, 2025

alamb commented Apr 2, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adriangb commented Apr 3, 2025

adriangb commented Mar 19, 2025 •

edited

Loading

suibianwanwank Mar 20, 2025 •

edited

Loading

adriangb commented Mar 20, 2025 •

edited

Loading

AdamGS Mar 27, 2025 •

edited

Loading

geoffreyclaude commented Apr 2, 2025 •

edited

Loading

adriangb Apr 2, 2025 •

edited

Loading