Consider removing ParquetFilters #36

Open
sunchao opened this issue Feb 16, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@sunchao
Member

sunchao commented Feb 16, 2024

What is the problem the feature request solves?

ParquetFilters was originally added so that we could shade Parquet in Comet. However, the shading was later removed because it caused a lot of trouble on the Iceberg side. In addition, we have ported several Parquet classes from parquet-mr into Comet, so the boundary between Comet and parquet-mr is now fairly thin. We could therefore consider removing Comet's copy of ParquetFilters and using the one from Spark directly.

The advantage is that Comet would pick up changes and improvements from newer versions of Spark. For instance, apache/spark#36696 added Parquet In/NotIn pushdown on the Spark side, which is only available since Spark 3.4. Because Comet currently keeps its own copy of Spark's ParquetFilters, that feature has not been added, in order to stay backward compatible with Spark 3.2 and 3.3.

Describe the potential solution

Evaluate whether we can remove ParquetFilters from Comet.

Additional context

No response

@sunchao sunchao added the enhancement New feature or request label Feb 16, 2024
@huaxingao
Contributor

I will work on this

@huaxingao
Contributor

I have tried this out. Everything works fine except that it needs the fix for https://issues.apache.org/jira/browse/SPARK-46092, which is in Spark 3.4.3. I will wait until Spark 3.4.3 is released and then remove ParquetFilters.
