Bad query plan produced when all_horizontal is used with join_where
#21009
Labels: enhancement, needs triage, python
Checks
Reproducible example
Log output
Issue description
When using all_horizontal within join_where, the query plan differs from the plan produced for the bare comparison.
When I run the second expression/join example on larger dataframes, it is much slower than the first expression/join example, in some cases by 100x.
My specific use case is when I have multiple boolean comparisons I want to chain together with .all_horizontal((<first-comparison>).or_(<second-comparison>)). I have not included the use of .or_ in this minimal example of the change in query plan.
Expected behavior
Both expressions should have the same evaluation plan because there is a single boolean comparison. Both should then take the same time and consume the same amount of memory.
Installed versions