SNOW-1984396: Snowpark Local Testing `minus` filters out rows that match values across multiple rows in the subtracted set #3163
Labels: bug (Something isn't working), status-triage_done (Initial triage done, will be further handled by the driver team)
Python 3.11.9 (main, Jun 24 2024, 14:49:51) [Clang 15.0.0 (clang-1500.3.9.4)]
macOS-15.3.1-arm64-arm-64bit
What are the component versions in the environment (`pip freeze`)?
There are a lot, but relevant for the test example:
I was trying to use `subtract`/`minus`/`except_` and write some unit tests for my code using Snowpark but ran into some odd behavior. I've created a toy example that illustrates the problem below.

Expected:
Got:
As you can see, the row `[1, 2]` is getting filtered out despite not existing in the dataframe being subtracted. This is because both `1` and `2` show up as values among the rows. The bug is on this line of code, as it is checking if all of the values in each row in `df1` show up in rows in `df2`, but not necessarily the same row. This is due to smushing all the `df2` values together via `cur_df.values.ravel()`, so we lose the row distinctions.

In Snowflake itself, an equivalent query does what you'd expect:
N/A
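
For anyone who wants to see the flattening problem described above in isolation, here is a minimal sketch in plain Python, with lists of tuples standing in for the dataframes. It is not the Snowpark implementation: the function names and the sample rows are hypothetical, and the first function only mimics the effect of pooling all `df2` values the way `cur_df.values.ravel()` does. NULL handling is ignored.

```python
# Hypothetical rows: 1 and 2 both occur in the subtrahend, but never in the same row.
minuend = [(1, 2), (3, 4)]
subtrahend = [(1, 5), (6, 2), (3, 4)]


def subtract_flattened(left, right):
    """Buggy variant: pools every value of `right`, so row boundaries are lost."""
    pooled = {value for row in right for value in row}
    # A row is dropped as soon as each of its values appears anywhere in `right`.
    return [row for row in left if not all(value in pooled for value in row)]


def subtract_by_row(left, right):
    """Row-wise variant: a row is dropped only if the identical row is in `right`."""
    right_rows = set(right)
    seen = set()
    result = []
    for row in left:
        # Also de-duplicate, assuming set semantics like SQL MINUS/EXCEPT.
        if row not in right_rows and row not in seen:
            seen.add(row)
            result.append(row)
    return result


flattened_result = subtract_flattened(minuend, subtrahend)
row_wise_result = subtract_by_row(minuend, subtrahend)

print(flattened_result)  # (1, 2) is wrongly removed along with (3, 4)
print(row_wise_result)   # only the exact match (3, 4) is removed
```

Comparing whole rows as tuples keeps the association between column values, which is what the flattened membership check throws away.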