POC: Test DataFusion with experimental Parquet Filter Pushdown (try 4) #16711

Draft
wants to merge 4 commits into
base: main

alamb
Contributor

@alamb alamb commented Jul 7, 2025

Which issue does this PR close?

related to

This is a PR to test the next generation parquet pushdown:

Builds on

It forces filter_pushdown on by default and pins to

@github-actions github-actions bot added the documentation, sql, logical-expr, core, sqllogictest, substrait, common, proto, and datasource labels Jul 7, 2025
@alamb
Contributor Author

alamb commented Jul 7, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_pushdown (f46ed19) to ebb8e95 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@XiangpengHao
Contributor

XiangpengHao commented Jul 7, 2025

I'm taking a look at the test failures.

@zhuqi-lucas
Contributor

Thank you @alamb @XiangpengHao,

I believe we need to compare the performance of apache/arrow-rs#7850 against #16690 instead of against the main branch, because #16690 already gains an improvement from apache/arrow-rs#7802.

So we could:

  1. Compare "DRAFT: Update arrow/parquet to 56.0.0" (#16690) to main
  2. Compare "Parquet filter pushdown v4" (arrow-rs#7850) to main
  3. Compare the two results above

Or can we use the script to compare the two branches directly?

@XiangpengHao
Contributor

XiangpengHao commented Jul 8, 2025

I believe the bugs are fixed in 2cf1a8f82f722e1c7e4857d7b07ba726f67d9f2f

@alamb, can you point to that commit and run the benchmark again?

I believe some tests will still fail, but only because we make filter pushdown true by default, which breaks some testing assumptions.

@alamb
Contributor Author

alamb commented Jul 9, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_pushdown (2ddefd3) to ebb8e95 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb
Contributor Author

alamb commented Jul 9, 2025

🤖: Benchmark completed

Details

Comparing HEAD and alamb_test_pushdown
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_test_pushdown ┃         Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0     │  1892.39 ms │          1970.36 ms │      no change │
│ QQuery 1     │   686.90 ms │           657.67 ms │      no change │
│ QQuery 2     │  1328.20 ms │          1354.20 ms │      no change │
│ QQuery 3     │   673.35 ms │           674.01 ms │      no change │
│ QQuery 4     │  1370.07 ms │          1529.65 ms │   1.12x slower │
│ QQuery 5     │ 14964.35 ms │         15233.16 ms │      no change │
│ QQuery 6     │  2037.13 ms │           128.92 ms │ +15.80x faster │
│ QQuery 7     │  1941.66 ms │          2036.65 ms │      no change │
└──────────────┴─────────────┴─────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 24894.05ms │
│ Total Time (alamb_test_pushdown)   │ 23584.61ms │
│ Average Time (HEAD)                │  3111.76ms │
│ Average Time (alamb_test_pushdown) │  2948.08ms │
│ Queries Faster                     │          1 │
│ Queries Slower                     │          1 │
│ Queries with No Change             │          6 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_test_pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.22 ms │             2.29 ms │     no change │
│ QQuery 1     │    34.01 ms │            34.72 ms │     no change │
│ QQuery 2     │    83.25 ms │            78.03 ms │ +1.07x faster │
│ QQuery 3     │   100.41 ms │            96.41 ms │     no change │
│ QQuery 4     │   600.36 ms │           587.98 ms │     no change │
│ QQuery 5     │   843.23 ms │           840.84 ms │     no change │
│ QQuery 6     │     2.21 ms │             2.22 ms │     no change │
│ QQuery 7     │    37.12 ms │            50.19 ms │  1.35x slower │
│ QQuery 8     │   845.96 ms │           844.78 ms │     no change │
│ QQuery 9     │  1168.23 ms │          1158.68 ms │     no change │
│ QQuery 10    │   258.85 ms │           278.88 ms │  1.08x slower │
│ QQuery 11    │   292.42 ms │           322.27 ms │  1.10x slower │
│ QQuery 12    │   857.00 ms │           924.52 ms │  1.08x slower │
│ QQuery 13    │  1216.58 ms │          1265.34 ms │     no change │
│ QQuery 14    │   784.71 ms │           994.96 ms │  1.27x slower │
│ QQuery 15    │   764.68 ms │           787.23 ms │     no change │
│ QQuery 16    │  1567.26 ms │          1582.51 ms │     no change │
│ QQuery 17    │  1569.23 ms │          1564.78 ms │     no change │
│ QQuery 18    │  2799.36 ms │          2830.81 ms │     no change │
│ QQuery 19    │    87.21 ms │            93.40 ms │  1.07x slower │
│ QQuery 20    │  1126.43 ms │          1138.49 ms │     no change │
│ QQuery 21    │  1269.44 ms │          1301.27 ms │     no change │
│ QQuery 22    │  2121.94 ms │          2426.07 ms │  1.14x slower │
│ QQuery 23    │  7309.29 ms │           864.63 ms │ +8.45x faster │
│ QQuery 24    │   442.75 ms │           196.45 ms │ +2.25x faster │
│ QQuery 25    │   292.53 ms │           353.15 ms │  1.21x slower │
│ QQuery 26    │   432.18 ms │           265.10 ms │ +1.63x faster │
│ QQuery 27    │  1548.48 ms │                FAIL │  incomparable │
│ QQuery 28    │ 12755.57 ms │         12624.05 ms │     no change │
│ QQuery 29    │   525.71 ms │           529.92 ms │     no change │
│ QQuery 30    │   758.48 ms │          1197.40 ms │  1.58x slower │
│ QQuery 31    │   780.68 ms │          1172.13 ms │  1.50x slower │
│ QQuery 32    │  2388.96 ms │          2383.89 ms │     no change │
│ QQuery 33    │  3116.28 ms │          3167.65 ms │     no change │
│ QQuery 34    │  3155.33 ms │          3169.24 ms │     no change │
│ QQuery 35    │  1248.46 ms │          1215.70 ms │     no change │
│ QQuery 36    │   118.35 ms │            26.88 ms │ +4.40x faster │
│ QQuery 37    │    52.03 ms │            26.41 ms │ +1.97x faster │
│ QQuery 38    │   119.13 ms │            26.66 ms │ +4.47x faster │
│ QQuery 39    │   196.81 ms │            26.30 ms │ +7.48x faster │
│ QQuery 40    │    41.05 ms │            27.18 ms │ +1.51x faster │
│ QQuery 41    │    38.40 ms │            26.41 ms │ +1.45x faster │
│ QQuery 42    │    31.92 ms │            26.22 ms │ +1.22x faster │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 52235.99ms │
│ Total Time (alamb_test_pushdown)   │ 46532.02ms │
│ Average Time (HEAD)                │  1243.71ms │
│ Average Time (alamb_test_pushdown) │  1107.91ms │
│ Queries Faster                     │         11 │
│ Queries Slower                     │         10 │
│ Queries with No Change             │         21 │
│ Queries with Failure               │          1 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ alamb_test_pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  99.20 ms │            98.19 ms │     no change │
│ QQuery 2     │  20.52 ms │            21.15 ms │     no change │
│ QQuery 3     │  32.37 ms │            32.17 ms │     no change │
│ QQuery 4     │  18.56 ms │            18.41 ms │     no change │
│ QQuery 5     │  50.10 ms │            49.93 ms │     no change │
│ QQuery 6     │  11.50 ms │            11.48 ms │     no change │
│ QQuery 7     │  90.14 ms │            91.78 ms │     no change │
│ QQuery 8     │  24.03 ms │            24.69 ms │     no change │
│ QQuery 9     │  53.28 ms │            53.50 ms │     no change │
│ QQuery 10    │  42.31 ms │            40.59 ms │     no change │
│ QQuery 11    │  11.36 ms │            11.33 ms │     no change │
│ QQuery 12    │  34.45 ms │            32.44 ms │ +1.06x faster │
│ QQuery 13    │  25.92 ms │            26.46 ms │     no change │
│ QQuery 14    │   9.67 ms │             9.72 ms │     no change │
│ QQuery 15    │  19.92 ms │            18.85 ms │ +1.06x faster │
│ QQuery 16    │  17.84 ms │            18.13 ms │     no change │
│ QQuery 17    │  94.99 ms │            95.22 ms │     no change │
│ QQuery 18    │ 190.36 ms │           189.74 ms │     no change │
│ QQuery 19    │  24.70 ms │            24.10 ms │     no change │
│ QQuery 20    │  31.10 ms │            31.77 ms │     no change │
│ QQuery 21    │ 145.17 ms │           142.60 ms │     no change │
│ QQuery 22    │  14.90 ms │            13.91 ms │ +1.07x faster │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1062.38ms │
│ Total Time (alamb_test_pushdown)   │ 1056.17ms │
│ Average Time (HEAD)                │   48.29ms │
│ Average Time (alamb_test_pushdown) │   48.01ms │
│ Queries Faster                     │         3 │
│ Queries Slower                     │         0 │
│ Queries with No Change             │        19 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘
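
A note on reading the Change column: the labels appear to be plain ratios of the two timings, with small differences reported as "no change". Below is a minimal Rust sketch that reproduces a few rows from the clickbench_partitioned table. The 5% noise band and the `classify` helper are assumptions inferred from the numbers in these tables, not taken from the benchmark script itself.

```rust
/// Classify a timing change the way the "Change" column appears to.
/// ASSUMPTION: the 5% noise band is inferred from the tables above;
/// the real comparison script may compute its threshold differently.
fn classify(head_ms: f64, branch_ms: f64) -> String {
    const NOISE: f64 = 0.05; // hypothetical threshold
    if branch_ms > head_ms * (1.0 + NOISE) {
        // The branch took longer: report how many times slower it is.
        format!("{:.2}x slower", branch_ms / head_ms)
    } else if head_ms > branch_ms * (1.0 + NOISE) {
        // The branch was quicker: report how many times faster it is.
        format!("+{:.2}x faster", head_ms / branch_ms)
    } else {
        "no change".to_string()
    }
}

fn main() {
    println!("Q30: {}", classify(758.48, 1197.40)); // 1.58x slower
    println!("Q31: {}", classify(780.68, 1172.13)); // 1.50x slower
    println!("Q23: {}", classify(7309.29, 864.63)); // +8.45x faster
}
```

Under this reading, a row such as Q3 (100.41 ms vs 96.41 ms, roughly 4% apart) falls inside the band and is reported as "no change".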

@alamb
Contributor Author

alamb commented Jul 9, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_pushdown (2ddefd3) to ebb8e95 diff using: clickbench_1
Results will be posted here when complete

@alamb
Contributor Author

alamb commented Jul 9, 2025

🤖: Benchmark completed

Details

Comparing HEAD and alamb_test_pushdown
--------------------
Benchmark clickbench_1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_test_pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     0.60 ms │             0.58 ms │     no change │
│ QQuery 1     │    78.01 ms │            85.00 ms │  1.09x slower │
│ QQuery 2     │   110.32 ms │           112.05 ms │     no change │
│ QQuery 3     │   133.18 ms │           127.78 ms │     no change │
│ QQuery 4     │   659.60 ms │           681.28 ms │     no change │
│ QQuery 5     │   855.39 ms │           870.18 ms │     no change │
│ QQuery 6     │     0.64 ms │             0.62 ms │     no change │
│ QQuery 7     │    85.89 ms │            94.56 ms │  1.10x slower │
│ QQuery 8     │   891.44 ms │           907.17 ms │     no change │
│ QQuery 9     │  1164.24 ms │          1239.03 ms │  1.06x slower │
│ QQuery 10    │   294.03 ms │           309.87 ms │  1.05x slower │
│ QQuery 11    │   328.92 ms │           347.11 ms │  1.06x slower │
│ QQuery 12    │   887.84 ms │           952.22 ms │  1.07x slower │
│ QQuery 13    │  1246.78 ms │          1427.08 ms │  1.14x slower │
│ QQuery 14    │   811.40 ms │           982.42 ms │  1.21x slower │
│ QQuery 15    │   838.09 ms │           858.86 ms │     no change │
│ QQuery 16    │  1656.53 ms │          1626.01 ms │     no change │
│ QQuery 17    │  1626.10 ms │          1620.52 ms │     no change │
│ QQuery 18    │  2893.48 ms │          2920.65 ms │     no change │
│ QQuery 19    │   129.58 ms │           138.04 ms │  1.07x slower │
│ QQuery 20    │  1173.20 ms │          1185.28 ms │     no change │
│ QQuery 21    │  1381.19 ms │          1435.50 ms │     no change │
│ QQuery 22    │  2356.96 ms │          2591.92 ms │  1.10x slower │
│ QQuery 23    │  7814.54 ms │          1152.97 ms │ +6.78x faster │
│ QQuery 24    │   459.26 ms │           509.46 ms │  1.11x slower │
│ QQuery 25    │   340.85 ms │           410.36 ms │  1.20x slower │
│ QQuery 26    │   459.09 ms │           563.39 ms │  1.23x slower │
│ QQuery 27    │  1669.30 ms │          1843.92 ms │  1.10x slower │
│ QQuery 28    │ 13396.95 ms │         13268.95 ms │     no change │
│ QQuery 29    │   561.08 ms │           561.39 ms │     no change │
│ QQuery 30    │   797.22 ms │          1099.89 ms │  1.38x slower │
│ QQuery 31    │   835.04 ms │          1070.99 ms │  1.28x slower │
│ QQuery 32    │  2475.36 ms │          2563.69 ms │     no change │
│ QQuery 33    │  3300.74 ms │          3300.70 ms │     no change │
│ QQuery 34    │  3485.57 ms │          3343.18 ms │     no change │
│ QQuery 35    │  1337.93 ms │          1311.07 ms │     no change │
│ QQuery 36    │   159.43 ms │            72.59 ms │ +2.20x faster │
│ QQuery 37    │   111.04 ms │            72.40 ms │ +1.53x faster │
│ QQuery 38    │   177.32 ms │            73.73 ms │ +2.40x faster │
│ QQuery 39    │   260.75 ms │            72.40 ms │ +3.60x faster │
│ QQuery 40    │    91.45 ms │            72.91 ms │ +1.25x faster │
│ QQuery 41    │    89.45 ms │            72.89 ms │ +1.23x faster │
│ QQuery 42    │    77.68 ms │            71.68 ms │ +1.08x faster │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 57503.46ms │
│ Total Time (alamb_test_pushdown)   │ 52022.28ms │
│ Average Time (HEAD)                │  1337.29ms │
│ Average Time (alamb_test_pushdown) │  1209.82ms │
│ Queries Faster                     │          8 │
│ Queries Slower                     │         16 │
│ Queries with No Change             │         19 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘

@alamb
Contributor Author

alamb commented Jul 9, 2025

My analysis: these results are very consistent with my last attempt at caching filter results.

The biggest slowdowns are in Q30 and Q31:

│ QQuery 30    │   758.48 ms │          1197.40 ms │  1.58x slower │
│ QQuery 31    │   780.68 ms │          1172.13 ms │  1.50x slower │

I am fairly sure this is due to the overhead of RowSelection (these queries select many small selections). I started analyzing them here: #16562 (comment)

So the TLDR is that I think the caching approach is good, but to avoid some queries getting slower we will need to improve the RowSelection representation too. I will try to think about this and whip up a POC, hopefully over the next few days.
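
To make the "many small selections" point concrete, here is a minimal sketch (not the parquet crate's code) of how a per-row filter mask collapses into skip/select runs. `Run` and `mask_to_runs` are hypothetical names used only for illustration; the real `RowSelection` is likewise run-based, but its internals may differ.

```rust
/// Simplified stand-in for a run-length row selector.
/// ASSUMPTION: illustrative only; this is not the parquet crate's type.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Run {
    Skip(usize),
    Select(usize),
}

/// Collapse a per-row boolean mask into alternating skip/select runs.
fn mask_to_runs(mask: &[bool]) -> Vec<Run> {
    let mut runs = Vec::new();
    let mut i = 0;
    while i < mask.len() {
        let keep = mask[i];
        let start = i;
        // Extend the current run while the mask value stays the same.
        while i < mask.len() && mask[i] == keep {
            i += 1;
        }
        let len = i - start;
        runs.push(if keep { Run::Select(len) } else { Run::Skip(len) });
    }
    runs
}

fn main() {
    // Clustered matches: 8 rows collapse into 2 runs.
    let clustered = [false, false, false, false, true, true, true, true];
    // Scattered matches: the same 4 selected rows need 8 runs.
    let scattered = [false, true, false, true, false, true, false, true];
    println!("clustered: {} runs", mask_to_runs(&clustered).len());
    println!("scattered: {} runs", mask_to_runs(&scattered).len());
}
```

With scattered matches the number of runs approaches the number of rows, so any per-run bookkeeping stops being amortized. That is one plausible source of the overhead described above.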

@zhuqi-lucas
Contributor

Adaptive selection will help Q30 and Q31, based on the previous PR's results:

apache/arrow-rs#7454 (comment)

@alamb
Contributor Author

alamb commented Jul 9, 2025

> Adaptive selection will help Q30 and Q31, based on the previous PR's results:
>
> apache/arrow-rs#7454 (comment)

@zhuqi-lucas my largest concern about all the previous adaptive row selection work was the software engineering / keeping the complexity under control.

I have been dreaming about this -- I am thinking it might be time to create an internal RowSelection representation like

enum InternalRowSelector {
    /// Skip the next n rows
    Skip(usize),
    /// Decode and output the next n rows
    Decode(usize),
    /// Decode the next filter.len() rows and apply the filter before outputting the rows
    DecodeAndFilter(Arc<BooleanArray>),
}

Then the actual RecordReader would take a Vec<InternalRowSelector> or something

To stage this and keep things at a reasonable complexity, we could make one PR that refactors to

enum InternalRowSelector {
    /// Skip the next n rows
    Skip(usize),
    /// Decode and output the next n rows
    Decode(usize),
    // NO DecodeAndFilter variant
}

And then add the DecodeAndFilter variant in a follow-on PR

Maybe I'll try and make a PR
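
As a rough illustration of how such selectors might be consumed, here is a self-contained sketch. It rests on several assumptions: `Arc<Vec<bool>>` stands in for `Arc<BooleanArray>` so that no external crates are needed, `apply` is a hypothetical helper, and the rows are already materialized in a slice. A real reader would avoid decoding the skipped rows at all.

```rust
use std::sync::Arc;

/// Sketch of the proposed enum.
/// ASSUMPTION: `Arc<Vec<bool>>` replaces `Arc<BooleanArray>` for illustration.
enum InternalRowSelector {
    /// Skip the next n rows
    Skip(usize),
    /// Output the next n rows
    Decode(usize),
    /// Take the next mask.len() rows and keep only those whose mask bit is true
    DecodeAndFilter(Arc<Vec<bool>>),
}

/// Hypothetical helper: walk the plan over already-materialized rows.
/// Panics if the plan covers more rows than `rows` contains.
fn apply<T: Clone>(rows: &[T], plan: &[InternalRowSelector]) -> Vec<T> {
    let mut out = Vec::new();
    let mut pos = 0;
    for step in plan {
        match step {
            InternalRowSelector::Skip(n) => pos += n,
            InternalRowSelector::Decode(n) => {
                out.extend_from_slice(&rows[pos..pos + n]);
                pos += n;
            }
            InternalRowSelector::DecodeAndFilter(mask) => {
                let chunk = &rows[pos..pos + mask.len()];
                out.extend(
                    chunk
                        .iter()
                        .zip(mask.iter())
                        .filter(|(_, keep)| **keep)
                        .map(|(row, _)| row.clone()),
                );
                pos += mask.len();
            }
        }
    }
    out
}

fn main() {
    let rows: Vec<u32> = (0..10).collect();
    let plan = vec![
        InternalRowSelector::Skip(2),
        InternalRowSelector::Decode(3),
        InternalRowSelector::DecodeAndFilter(Arc::new(vec![true, false, true, false, true])),
    ];
    println!("{:?}", apply(&rows, &plan)); // prints [2, 3, 4, 5, 7, 9]
}
```

The staging idea maps onto this sketch directly: the first PR would only need the `Skip` and `Decode` arms, and the `DecodeAndFilter` arm could arrive later without changing the other two.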

@alamb
Contributor Author

alamb commented Jul 9, 2025

One thing that I think has caused us problems is judging every improvement to pushdown by whether it regresses performance with pushdown enabled versus disabled.

However, this makes incremental progress really hard. What I think we should start doing is comparing any proposed improvement to pushdown against a baseline where pushdown is already on.

In other words, let's make a benchmark that already has filter pushdown on.

I'll make a PR for this new benchmark later today

@zhuqi-lucas
Contributor

> > Adaptive selection will help Q30 and Q31, based on the previous PR's results:
> > apache/arrow-rs#7454 (comment)
>
> @zhuqi-lucas my largest concern about all the previous adaptive row selection work was the software engineering / keeping the complexity under control.
>
> I have been dreaming about this -- I am thinking it might be time to create an internal RowSelection representation like
>
> enum InternalRowSelector {
>     /// Skip the next n rows
>     Skip(usize),
>     /// Decode and output the next n rows
>     Decode(usize),
>     /// Decode the next filter.len() rows and apply the filter before outputting the rows
>     DecodeAndFilter(Arc<BooleanArray>),
> }
>
> Then the actual RecordReader would take a Vec<InternalRowSelector> or something
>
> To stage this and keep things at a reasonable complexity, we could make one PR that refactors to
>
> enum InternalRowSelector {
>     /// Skip the next n rows
>     Skip(usize),
>     /// Decode and output the next n rows
>     Decode(usize),
>     // NO DecodeAndFilter variant
> }
>
> And then add the DecodeAndFilter variant in a follow-on PR
>
> Maybe I'll try and make a PR

Thank you @alamb, I agree. In my PR, even the adaptive selection ratio was chosen by experimenting with random selections; I couldn't find a clean way to determine it.

@alamb
Contributor Author

alamb commented Jul 9, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_pushdown (2ddefd3) to ebb8e95 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@zhuqi-lucas
Contributor

> One thing that I think has caused us problems is judging every improvement to pushdown by whether it regresses performance with pushdown enabled versus disabled.
>
> However, this makes incremental progress really hard. What I think we should start doing is comparing any proposed improvement to pushdown against a baseline where pushdown is already on.
>
> In other words, let's make a benchmark that already has filter pushdown on.
>
> I'll make a PR for this new benchmark later today

Great point @alamb, I totally agree. We can first improve pushdown itself instead of comparing against no pushdown, since pushdown is not yet on by default. I guess this PR and #16562 will improve it.

@alamb
Contributor Author

alamb commented Jul 9, 2025

> One thing that I think has caused us problems is judging every improvement to pushdown by whether it regresses performance with pushdown enabled versus disabled.
>
> However, this makes incremental progress really hard. What I think we should start doing is comparing any proposed improvement to pushdown against a baseline where pushdown is already on.
>
> In other words, let's make a benchmark that already has filter pushdown on.
>
> I'll make a PR for this new benchmark later today

I actually found a seemingly good one here: https://github.com/apache/datafusion/blob/3ca09a642dac266dfdbf7f57d2a5af82a9c77436/benchmarks/bench.sh#L117-L116

bench.sh run parquet

I started it running and will see what happens.

I need to do some other non-parquet stuff for a few hours. Will be back.

@alamb
Contributor Author

alamb commented Jul 9, 2025

🤖: Benchmark completed

Details

Comparing HEAD and alamb_test_pushdown
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_test_pushdown ┃         Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0     │  1850.20 ms │          2077.01 ms │   1.12x slower │
│ QQuery 1     │   695.48 ms │           707.76 ms │      no change │
│ QQuery 2     │  1348.92 ms │          1355.71 ms │      no change │
│ QQuery 3     │   679.46 ms │           666.18 ms │      no change │
│ QQuery 4     │  1354.18 ms │          1552.67 ms │   1.15x slower │
│ QQuery 5     │ 15130.40 ms │         15323.01 ms │      no change │
│ QQuery 6     │  2050.65 ms │           128.43 ms │ +15.97x faster │
│ QQuery 7     │  1928.41 ms │          1936.78 ms │      no change │
└──────────────┴─────────────┴─────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 25037.71ms │
│ Total Time (alamb_test_pushdown)   │ 23747.55ms │
│ Average Time (HEAD)                │  3129.71ms │
│ Average Time (alamb_test_pushdown) │  2968.44ms │
│ Queries Faster                     │          1 │
│ Queries Slower                     │          2 │
│ Queries with No Change             │          5 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_test_pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.20 ms │             2.67 ms │  1.21x slower │
│ QQuery 1     │    33.62 ms │            35.33 ms │  1.05x slower │
│ QQuery 2     │    81.58 ms │            81.34 ms │     no change │
│ QQuery 3     │    98.54 ms │            98.30 ms │     no change │
│ QQuery 4     │   587.27 ms │           586.80 ms │     no change │
│ QQuery 5     │   847.12 ms │           865.34 ms │     no change │
│ QQuery 6     │     2.34 ms │             2.22 ms │ +1.05x faster │
│ QQuery 7     │    38.71 ms │            50.91 ms │  1.32x slower │
│ QQuery 8     │   850.12 ms │           844.78 ms │     no change │
│ QQuery 9     │  1173.07 ms │          1163.15 ms │     no change │
│ QQuery 10    │   255.12 ms │           282.47 ms │  1.11x slower │
│ QQuery 11    │   293.57 ms │           316.60 ms │  1.08x slower │
│ QQuery 12    │   849.81 ms │           925.56 ms │  1.09x slower │
│ QQuery 13    │  1241.36 ms │          1403.74 ms │  1.13x slower │
│ QQuery 14    │   798.25 ms │           976.66 ms │  1.22x slower │
│ QQuery 15    │   768.25 ms │           776.35 ms │     no change │
│ QQuery 16    │  1579.72 ms │          1585.16 ms │     no change │
│ QQuery 17    │  1598.05 ms │          1588.66 ms │     no change │
│ QQuery 18    │  2827.12 ms │          2815.50 ms │     no change │
│ QQuery 19    │    87.49 ms │            94.53 ms │  1.08x slower │
│ QQuery 20    │  1152.71 ms │          1128.84 ms │     no change │
│ QQuery 21    │  1287.02 ms │          1297.07 ms │     no change │
│ QQuery 22    │  2115.29 ms │          2450.86 ms │  1.16x slower │
│ QQuery 23    │  7359.74 ms │           889.54 ms │ +8.27x faster │
│ QQuery 24    │   427.33 ms │           210.27 ms │ +2.03x faster │
│ QQuery 25    │   295.79 ms │           381.70 ms │  1.29x slower │
│ QQuery 26    │   442.34 ms │           283.31 ms │ +1.56x faster │
│ QQuery 27    │  1544.98 ms │                FAIL │  incomparable │
│ QQuery 28    │ 12858.47 ms │         12774.45 ms │     no change │
│ QQuery 29    │   536.16 ms │           532.45 ms │     no change │
│ QQuery 30    │   777.87 ms │          1193.31 ms │  1.53x slower │
│ QQuery 31    │   782.24 ms │          1184.97 ms │  1.51x slower │
│ QQuery 32    │  2400.57 ms │          2413.43 ms │     no change │
│ QQuery 33    │  3172.31 ms │          3233.48 ms │     no change │
│ QQuery 34    │  3188.58 ms │          3202.95 ms │     no change │
│ QQuery 35    │  1232.52 ms │          1261.15 ms │     no change │
│ QQuery 36    │   122.11 ms │            28.40 ms │ +4.30x faster │
│ QQuery 37    │    49.90 ms │            26.67 ms │ +1.87x faster │
│ QQuery 38    │   116.53 ms │            26.92 ms │ +4.33x faster │
│ QQuery 39    │   191.50 ms │            27.03 ms │ +7.08x faster │
│ QQuery 40    │    41.20 ms │            28.19 ms │ +1.46x faster │
│ QQuery 41    │    39.53 ms │            26.66 ms │ +1.48x faster │
│ QQuery 42    │    33.15 ms │            26.93 ms │ +1.23x faster │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 52636.14ms │
│ Total Time (alamb_test_pushdown)   │ 47124.66ms │
│ Average Time (HEAD)                │  1253.24ms │
│ Average Time (alamb_test_pushdown) │  1122.02ms │
│ Queries Faster                     │         11 │
│ Queries Slower                     │         13 │
│ Queries with No Change             │         18 │
│ Queries with Failure               │          1 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ alamb_test_pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  98.33 ms │            97.17 ms │     no change │
│ QQuery 2     │  21.20 ms │            21.00 ms │     no change │
│ QQuery 3     │  32.47 ms │            32.04 ms │     no change │
│ QQuery 4     │  18.43 ms │            18.58 ms │     no change │
│ QQuery 5     │  49.82 ms │            49.21 ms │     no change │
│ QQuery 6     │  11.62 ms │            11.49 ms │     no change │
│ QQuery 7     │  87.15 ms │            88.51 ms │     no change │
│ QQuery 8     │  24.76 ms │            25.22 ms │     no change │
│ QQuery 9     │  53.80 ms │            53.67 ms │     no change │
│ QQuery 10    │  42.44 ms │            40.89 ms │     no change │
│ QQuery 11    │  11.47 ms │            11.33 ms │     no change │
│ QQuery 12    │  34.56 ms │            31.01 ms │ +1.11x faster │
│ QQuery 13    │  25.78 ms │            25.71 ms │     no change │
│ QQuery 14    │   9.68 ms │             9.50 ms │     no change │
│ QQuery 15    │  18.84 ms │            18.30 ms │     no change │
│ QQuery 16    │  18.09 ms │            17.80 ms │     no change │
│ QQuery 17    │  95.48 ms │            95.75 ms │     no change │
│ QQuery 18    │ 191.23 ms │           186.61 ms │     no change │
│ QQuery 19    │  24.34 ms │            24.15 ms │     no change │
│ QQuery 20    │  31.46 ms │            31.02 ms │     no change │
│ QQuery 21    │ 145.02 ms │           142.75 ms │     no change │
│ QQuery 22    │  15.05 ms │            13.78 ms │ +1.09x faster │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1061.00ms │
│ Total Time (alamb_test_pushdown)   │ 1045.49ms │
│ Average Time (HEAD)                │   48.23ms │
│ Average Time (alamb_test_pushdown) │   47.52ms │
│ Queries Faster                     │         2 │
│ Queries Slower                     │         0 │
│ Queries with No Change             │        20 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

@alamb
Contributor Author

alamb commented Jul 9, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_pushdown (2ddefd3) to ebb8e95 diff using: parquet
Results will be posted here when complete

@alamb
Contributor Author

alamb commented Jul 9, 2025

🤖: Benchmark completed

Details

Comparing HEAD and alamb_test_pushdown
--------------------
Benchmark parquet.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ alamb_test_pushdown ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QSelective-… │ 1914.68 ms │          1873.28 ms │    no change │
│ filter:      │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QSelective-… │ 3248.40 ms │          3345.80 ms │    no change │
│ filter:      │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QSelective-… │ 3471.17 ms │          3443.40 ms │    no change │
│ filter:      │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QNon-select… │ 2458.98 ms │          2419.96 ms │    no change │
│ filter:      │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QNon-select… │ 4410.66 ms │          4279.49 ms │    no change │
│ filter:      │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QNon-select… │ 4786.02 ms │          4680.66 ms │    no change │
│ filter:      │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QBasic       │ 2002.64 ms │          1946.81 ms │    no change │
│ conjunction: │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QBasic       │ 1614.25 ms │          1688.48 ms │    no change │
│ conjunction: │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QBasic       │ 1843.95 ms │          1943.33 ms │ 1.05x slower │
│ conjunction: │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QNested      │ 2013.97 ms │          1984.36 ms │    no change │
│ filters:     │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QNested      │ 2321.05 ms │          2357.72 ms │    no change │
│ filters:     │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QNested      │ 2480.90 ms │          2539.76 ms │    no change │
│ filters:     │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QMany        │ 2404.29 ms │          2370.31 ms │    no change │
│ filters:     │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QMany        │ 4147.79 ms │          3977.40 ms │    no change │
│ filters:     │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QMany        │ 4137.83 ms │          3972.74 ms │    no change │
│ filters:     │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QFilter      │ 1726.27 ms │          1696.03 ms │    no change │
│ everything:  │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QFilter      │   34.08 ms │            36.12 ms │ 1.06x slower │
│ everything:  │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QFilter      │   30.34 ms │            32.07 ms │ 1.06x slower │
│ everything:  │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QFilter      │ 1908.07 ms │          1868.66 ms │    no change │
│ nothing:     │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QFilter      │ 1425.87 ms │          1373.87 ms │    no change │
│ nothing:     │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
│ QFilter      │ 1935.56 ms │          1905.10 ms │    no change │
│ nothing:     │            │                     │              │
│ pushdown_fi… │            │                     │              │
│ reorder_fil… │            │                     │              │
│ page_index=… │            │                     │              │
└──────────────┴────────────┴─────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 50316.77ms │
│ Total Time (alamb_test_pushdown)   │ 49735.34ms │
│ Average Time (HEAD)                │  2396.04ms │
│ Average Time (alamb_test_pushdown) │  2368.35ms │
│ Queries Faster                     │          0 │
│ Queries Slower                     │          3 │
│ Queries with No Change             │         18 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘

@alamb
Contributor Author

alamb commented Jul 9, 2025

Ok, I made a PR to add a filter_pushdown benchmark. Once I get that merged I can test this branch using that.

@alamb
Contributor Author

alamb commented Jul 10, 2025

I ran the new clickbench_pushdown benchmark and the TL;DR is that the new pushdown decoder looks like it makes a measurable difference 🎉

Thus I think we should proceed trying to get apache/arrow-rs#7850 merged.

My next step will be to analyze some of the queries that seem to have gotten slower (like Q19) and see if I can reproduce the slowdown and find anything to improve.

--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃  merge-base ┃ test_pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.37 ms │       2.42 ms │     no change │
│ QQuery 1     │    36.74 ms │      35.98 ms │     no change │
│ QQuery 2     │    80.88 ms │      81.65 ms │     no change │
│ QQuery 3     │    98.74 ms │      96.46 ms │     no change │
│ QQuery 4     │   599.44 ms │     588.92 ms │     no change │
│ QQuery 5     │   875.20 ms │     860.63 ms │     no change │
│ QQuery 6     │     2.36 ms │       2.40 ms │     no change │
│ QQuery 7     │    55.75 ms │      50.98 ms │ +1.09x faster │
│ QQuery 8     │   839.47 ms │     829.48 ms │     no change │
│ QQuery 9     │  1171.43 ms │    1149.42 ms │     no change │
│ QQuery 10    │   281.96 ms │     276.07 ms │     no change │
│ QQuery 11    │   317.52 ms │     318.86 ms │     no change │
│ QQuery 12    │  1084.77 ms │     923.75 ms │ +1.17x faster │
│ QQuery 13    │  1572.79 ms │    1383.83 ms │ +1.14x faster │
│ QQuery 14    │  1126.55 ms │     975.02 ms │ +1.16x faster │
│ QQuery 15    │   766.37 ms │     761.22 ms │     no change │
│ QQuery 16    │  1584.13 ms │    1564.22 ms │     no change │
│ QQuery 17    │  1560.67 ms │    1581.42 ms │     no change │
│ QQuery 18    │  2806.90 ms │    2905.37 ms │     no change │
│ QQuery 19    │    86.80 ms │      99.67 ms │  1.15x slower │
│ QQuery 20    │  1180.12 ms │    1157.74 ms │     no change │
│ QQuery 21    │  1391.28 ms │    1314.41 ms │ +1.06x faster │
│ QQuery 22    │  2580.68 ms │    2496.35 ms │     no change │
│ QQuery 23    │   974.72 ms │     924.11 ms │ +1.05x faster │
│ QQuery 24    │   245.25 ms │     192.79 ms │ +1.27x faster │
│ QQuery 25    │   595.13 ms │     349.02 ms │ +1.71x faster │
│ QQuery 26    │   418.77 ms │     287.61 ms │ +1.46x faster │
│ QQuery 27    │  2328.28 ms │          FAIL │  incomparable │
│ QQuery 28    │ 12913.12 ms │   12728.61 ms │     no change │
│ QQuery 29    │   526.61 ms │     516.02 ms │     no change │
│ QQuery 30    │  1226.74 ms │    1198.28 ms │     no change │
│ QQuery 31    │  1214.66 ms │    1166.91 ms │     no change │
│ QQuery 32    │  2353.52 ms │    2344.64 ms │     no change │
│ QQuery 33    │  3181.67 ms │    3159.55 ms │     no change │
│ QQuery 34    │  3237.27 ms │    3183.64 ms │     no change │
│ QQuery 35    │  1209.08 ms │    1230.69 ms │     no change │
│ QQuery 36    │    26.88 ms │      28.11 ms │     no change │
│ QQuery 37    │    26.97 ms │      28.15 ms │     no change │
│ QQuery 38    │    26.77 ms │      28.11 ms │  1.05x slower │
│ QQuery 39    │    27.37 ms │      27.82 ms │     no change │
│ QQuery 40    │    26.76 ms │      28.66 ms │  1.07x slower │
│ QQuery 41    │    26.32 ms │      27.33 ms │     no change │
│ QQuery 42    │    26.42 ms │      27.77 ms │  1.05x slower │
└──────────────┴─────────────┴───────────────┴───────────────┘

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (merge-base)      │ 48386.94ms │
│ Total Time (test_pushdown)   │ 46934.08ms │
│ Average Time (merge-base)    │  1152.07ms │
│ Average Time (test_pushdown) │  1117.48ms │
│ Queries Faster               │          9 │
│ Queries Slower               │          4 │
│ Queries with No Change       │         29 │
│ Queries with Failure         │          1 │
└──────────────────────────────┴────────────┘
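
For anyone reading the Change column: the labels look consistent with a plain ratio of the two timings plus a noise band of roughly 5%. The sketch below is an assumption inferred from the numbers in this table, not the benchmark script's actual code; `classify` and its `noise` parameter are hypothetical names.

```rust
/// Label a candidate timing relative to a baseline timing.
/// Ratios inside the noise band are reported as "no change".
/// ASSUMPTION: `noise` (0.05 = 5%) is inferred from the table, not confirmed.
fn classify(baseline_ms: f64, candidate_ms: f64, noise: f64) -> String {
    // ratio > 1.0 means the candidate took longer than the baseline
    let ratio = candidate_ms / baseline_ms;
    if ratio >= 1.0 - noise && ratio <= 1.0 + noise {
        "no change".to_string()
    } else if ratio < 1.0 {
        format!("+{:.2}x faster", 1.0 / ratio)
    } else {
        format!("{:.2}x slower", ratio)
    }
}

fn main() {
    // Timings copied from the table above: (query, merge-base, test_pushdown)
    for (query, base, cand) in [
        ("Q19", 86.80, 99.67),
        ("Q25", 595.13, 349.02),
        ("Q36", 26.88, 28.11),
    ] {
        println!("{query}: {}", classify(base, cand, 0.05));
    }
}
```

With a 5% band this reproduces the table's labels for those three rows: Q19 as 1.15x slower, Q25 as +1.71x faster, and Q36 as no change, even though Q36 is about 4.6% slower in absolute terms.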

@zhuqi-lucas
Contributor

Good news!

@XiangpengHao
Contributor

Thus I think we should proceed trying to get apache/arrow-rs#7850 merged.

Great! I plan to take another look in a few days (I have been occupied by other things recently). I have a few ideas to improve it further.

@alamb
Contributor Author

alamb commented Jul 12, 2025

Great -- I will also try and look at it more closely later today / tomorrow when I have some time on airplanes

@alamb force-pushed the alamb/test_pushdown branch from 18763fc to a13fd45 on July 15, 2025 19:17
@alamb
Contributor Author

alamb commented Jul 15, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_pushdown (a13fd45) to 18a30ce diff using: clickbench_pushdown
Results will be posted here when complete

@alamb
Contributor Author

alamb commented Jul 15, 2025

🤖: Benchmark completed

Comparing HEAD and alamb_test_pushdown
--------------------
Benchmark clickbench_pushdown.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_test_pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.72 ms │             2.25 ms │ +1.21x faster │
│ QQuery 1     │    37.39 ms │            35.77 ms │     no change │
│ QQuery 2     │    81.97 ms │            80.16 ms │     no change │
│ QQuery 3     │    99.73 ms │            99.30 ms │     no change │
│ QQuery 4     │   579.96 ms │           593.31 ms │     no change │
│ QQuery 5     │   843.45 ms │           846.07 ms │     no change │
│ QQuery 6     │     2.34 ms │             2.24 ms │     no change │
│ QQuery 7     │    56.62 ms │            52.08 ms │ +1.09x faster │
│ QQuery 8     │   855.11 ms │           854.48 ms │     no change │
│ QQuery 9     │  1165.85 ms │          1175.94 ms │     no change │
│ QQuery 10    │   278.43 ms │           272.55 ms │     no change │
│ QQuery 11    │   321.09 ms │           319.40 ms │     no change │
│ QQuery 12    │  1096.92 ms │           907.08 ms │ +1.21x faster │
│ QQuery 13    │  1521.39 ms │          1278.64 ms │ +1.19x faster │
│ QQuery 14    │  1165.67 ms │           988.43 ms │ +1.18x faster │
│ QQuery 15    │   784.07 ms │           790.20 ms │     no change │
│ QQuery 16    │  1619.79 ms │          1620.71 ms │     no change │
│ QQuery 17    │  1586.62 ms │          1606.06 ms │     no change │
│ QQuery 18    │  2853.16 ms │          3049.52 ms │  1.07x slower │
│ QQuery 19    │    86.23 ms │            97.66 ms │  1.13x slower │
│ QQuery 20    │  1177.67 ms │          1218.80 ms │     no change │
│ QQuery 21    │  1393.37 ms │          1353.34 ms │     no change │
│ QQuery 22    │  2620.54 ms │          2511.87 ms │     no change │
│ QQuery 23    │   890.79 ms │           821.40 ms │ +1.08x faster │
│ QQuery 24    │   243.08 ms │           197.53 ms │ +1.23x faster │
│ QQuery 25    │   607.67 ms │           355.19 ms │ +1.71x faster │
│ QQuery 26    │   389.59 ms │           267.10 ms │ +1.46x faster │
│ QQuery 27    │  2313.22 ms │                FAIL │  incomparable │
│ QQuery 28    │ 12823.63 ms │         13460.08 ms │     no change │
│ QQuery 29    │   519.24 ms │           514.82 ms │     no change │
│ QQuery 30    │  1246.76 ms │          1214.04 ms │     no change │
│ QQuery 31    │  1221.30 ms │          1221.22 ms │     no change │
│ QQuery 32    │  2440.09 ms │          2465.49 ms │     no change │
│ QQuery 33    │  3228.91 ms │          3190.13 ms │     no change │
│ QQuery 34    │  3224.61 ms │          3220.22 ms │     no change │
│ QQuery 35    │  1291.49 ms │          1281.26 ms │     no change │
│ QQuery 36    │    27.83 ms │            28.04 ms │     no change │
│ QQuery 37    │    26.92 ms │            27.99 ms │     no change │
│ QQuery 38    │    27.26 ms │            27.36 ms │     no change │
│ QQuery 39    │    27.13 ms │            27.65 ms │     no change │
│ QQuery 40    │    27.26 ms │            27.89 ms │     no change │
│ QQuery 41    │    26.88 ms │            27.54 ms │     no change │
│ QQuery 42    │    26.97 ms │            27.35 ms │     no change │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 48547.50ms │
│ Total Time (alamb_test_pushdown)   │ 48158.18ms │
│ Average Time (HEAD)                │  1155.89ms │
│ Average Time (alamb_test_pushdown) │  1146.62ms │
│ Queries Faster                     │          9 │
│ Queries Slower                     │          2 │
│ Queries with No Change             │         31 │
│ Queries with Failure               │          1 │
└────────────────────────────────────┴────────────┘
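
As a sanity check on the summary, the reported averages match the totals divided by 42, that is, the 43 queries minus the one failure (Q27). That divisor is an assumption inferred from the figures rather than from the script itself, and `average_ms` and `percent_saved` below are hypothetical helpers.

```rust
/// Mean time per query, given a total and the number of queries counted.
fn average_ms(total_ms: f64, queries: u32) -> f64 {
    total_ms / f64::from(queries)
}

/// Percentage by which `candidate_ms` is lower than `baseline_ms`.
fn percent_saved(baseline_ms: f64, candidate_ms: f64) -> f64 {
    (baseline_ms - candidate_ms) / baseline_ms * 100.0
}

fn main() {
    // Totals copied from the clickbench_pushdown summary above.
    let (head_total, branch_total) = (48547.50, 48158.18);
    // ASSUMPTION: the failed query (Q27) is excluded on both sides.
    let counted = 43 - 1;

    println!("HEAD average:   {:.2} ms", average_ms(head_total, counted));
    println!("branch average: {:.2} ms", average_ms(branch_total, counted));
    println!("total saved:    {:.1} %", percent_saved(head_total, branch_total));
}
```

The overall saving comes out under 1%, so the per-query rows say more about where pushdown helps than the totals do.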

@alamb
Contributor Author

alamb commented Jul 16, 2025

I looked into this failure running clickbench:

│ QQuery 27    │  2328.28 ms │          FAIL │  incomparable │

I ran the q27.sql on this branch and the cached array reader panics:

$ ~/Downloads/datafusion-cli-alamb_test_pushdown -f ~/Software/datafusion/benchmarks/queries/clickbench/queries/q27.sql
DataFusion CLI v48.0.0

thread 'tokio-runtime-worker' panicked at /Users/andrewlamb/.cargo/git/checkouts/arrow-rs-583cca34693b79b8/4d24172/parquet/src/arrow/array_reader/cached_array_reader.rs:118:13:
assertion `left == right` failed
  left: 319484
 right: 319488
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Join Error
caused by
External error: task 441 panicked with message "assertion `left == right` failed\n  left: 319484\n right: 319488"

Update: I ran with a debug build and got this stack trace:

RUST_BACKTRACE=1 cargo run -p datafusion-cli -- -f ~/Downloads/q27.sql
...

thread 'tokio-runtime-worker' panicked at /Users/andrewlamb/Software/arrow-rs/parquet/src/arrow/array_reader/cached_array_reader.rs:215:24:
attempt to subtract with overflow
stack backtrace:
   0: __rustc::rust_begin_unwind
             at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:697:5
   1: core::panicking::panic_fmt
             at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/panicking.rs:75:14
   2: core::panicking::panic_const::panic_const_sub_overflow
             at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/panicking.rs:175:17
   3: <parquet::arrow::array_reader::cached_array_reader::CachedArrayReader as parquet::arrow::array_reader::ArrayReader>::read_records
             at /Users/andrewlamb/Software/arrow-rs/parquet/src/arrow/array_reader/cached_array_reader.rs:215:24
   4: <parquet::arrow::array_reader::struct_array::StructArrayReader as parquet::arrow::array_reader::ArrayReader>::read_records
             at /Users/andrewlamb/Software/arrow-rs/parquet/src/arrow/array_reader/struct_array.rs:68:30
   5: parquet::arrow::arrow_reader::ParquetRecordBatchReader::next_inner
             at /Users/andrewlamb/Software/arrow-rs/parquet/src/arrow/arrow_reader/mod.rs:893:27
   6: <parquet::arrow::arrow_reader::ParquetRecordBatchReader as core::iter::traits::iterator::Iterator>::next
             at /Users/andrewlamb/Software/arrow-rs/parquet/src/arrow/arrow_reader/mod.rs:844:9
   7: <parquet::arrow::async_reader::ParquetRecordBatchStream<T> as futures_core::stream::Stream>::poll_next
             at /Users/andrewlamb/Software/arrow-rs/parquet/src/arrow/async_reader/mod.rs:872:62
   8: <S as futures_core::stream::TryStream>::try_poll_next
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:206:9
   9: <futures_util::stream::try_stream::into_stream::IntoStream<St> as futures_core::stream::Stream>::poll_next
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/try_stream/into_stream.rs:38:9
  10: <futures_util::stream::stream::map::Map<St,F> as futures_core::stream::Stream>::poll_next
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/map.rs:58:26
  11: <futures_util::stream::try_stream::MapErr<St,F> as futures_core::stream::Stream>::poll_next
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/lib.rs:97:13
  12: <futures_util::stream::stream::map::Map<St,F> as futures_core::stream::Stream>::poll_next
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/map.rs:58:26
  13: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130:9
  14: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638:9
  15: datafusion_datasource::file_stream::FileStream::poll_inner
             at ./datafusion/datasource/src/file_stream.rs:221:34
  16: <datafusion_datasource::file_stream::FileStream as futures_core::stream::Stream>::poll_next
             at ./datafusion/datasource/src/file_stream.rs:334:22
  17: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638:9
  18: <datafusion_physical_plan::coop::CooperativeStream<T> as futures_core::stream::Stream>::poll_next
             at ./datafusion/physical-plan/src/coop.rs:160:25
  19: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130:9
  20: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638:9
  21: <datafusion_physical_plan::aggregates::row_hash::GroupedHashAggregateStream as futures_core::stream::Stream>::poll_next
             at ./datafusion/physical-plan/src/aggregates/row_hash.rs:655:34
  22: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130:9
  23: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638:9
  24: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/next.rs:32:9
  25: datafusion_physical_plan::repartition::RepartitionExec::pull_from_input::{{closure}}
             at ./datafusion/physical-plan/src/repartition/mod.rs:939:40
  26: datafusion_common_runtime::trace_utils::trace_future::{{closure}}
             at ./datafusion/common-runtime/src/trace_utils.rs:137:29
  27: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /Users/andrewlamb/.rustup/toolchains/1.88.0-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/future/future.rs:124:9
...
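
A note on why the two runs report different panics. Rust enables integer overflow checks by default only in debug builds, so an unsigned subtraction that would go negative panics there with "attempt to subtract with overflow", while a default release build wraps silently and fails later, if at all. That may be why the release binary got as far as the `assert_eq!`. The two runs used different arrow-rs checkouts and line numbers, so this is only a plausible reading. The sketch below shows just the language behaviour; `rows_left` is a hypothetical helper, not code from `cached_array_reader.rs`.

```rust
/// Hypothetical helper (not from arrow-rs): rows remaining after `consumed`
/// rows are taken from `available`. Returns `None` on underflow instead of
/// panicking (checks on) or wrapping (checks off).
fn rows_left(available: usize, consumed: usize) -> Option<usize> {
    available.checked_sub(consumed)
}

fn main() {
    // The two values reported by the failed assertion in the release build.
    let (left, right) = (319_484usize, 319_488usize);

    // A plain `left - right` panics with "attempt to subtract with overflow"
    // when overflow checks are on (the default for debug builds).
    match rows_left(left, right) {
        Some(n) => println!("{n} rows left"),
        None => println!("underflow: {left} < {right}"),
    }

    // With overflow checks off (the default for release builds) a plain
    // subtraction wraps around; `wrapping_sub` makes that explicit.
    println!("wrapped: {}", left.wrapping_sub(right));
}
```

Using `checked_sub` (or `saturating_sub`, where clamping to zero is the right semantics) makes the two build profiles behave the same way, which tends to make this kind of off-by-a-few-rows bug easier to reproduce.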
