What is the problem the feature request solves?
ParquetFilters was originally added to Comet so that we could shade Parquet. However, the shading was later removed because it caused a lot of trouble on the Iceberg side. In addition, we have ported several Parquet classes from parquet-mr into Comet, so the boundary between Comet and parquet-mr is now fairly thin. Therefore, we could consider removing Comet's copy of ParquetFilters and using the one from Spark directly.
The advantage is that we would be able to absorb changes and improvements from newer versions of Spark. For instance, apache/spark#36696 added Parquet In/NotIn pushdown on the Spark side, which is only available since Spark 3.4. At the moment, because Comet keeps its own copy of Spark's ParquetFilters, that feature has not been added, in order to stay backward compatible with Spark 3.2 and 3.3.
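For readers unfamiliar with what an In/NotIn pushdown buys, here is a minimal conceptual sketch in Java. It is not Spark's or Comet's code and does not use the Parquet API: the class `InPushdownSketch`, the `RowGroupStats` type, and both methods are hypothetical names made up for illustration, and nulls are ignored. It only shows the general idea that a reader can use a row group's min/max statistics to decide whether the row group can be skipped for an IN or NOT IN predicate.

```java
import java.util.List;

// Conceptual sketch only: all names here are hypothetical, not Spark/Comet/Parquet APIs.
class InPushdownSketch {

    // Hypothetical stand-in for the min/max statistics of one column in one row group.
    static final class RowGroupStats {
        final long min;
        final long max;

        RowGroupStats(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    // `col IN (values)`: the row group may contain a match only if at least
    // one listed value falls inside [min, max]. Otherwise it can be skipped.
    static boolean mightMatchIn(RowGroupStats stats, List<Long> values) {
        for (long v : values) {
            if (v >= stats.min && v <= stats.max) {
                return true;
            }
        }
        return false;
    }

    // `col NOT IN (values)`: the row group can be skipped only when every row
    // holds the same value (min == max) and that value is in the list.
    // Assumption: the column has no nulls.
    static boolean mightMatchNotIn(RowGroupStats stats, List<Long> values) {
        boolean singleValue = stats.min == stats.max;
        return !(singleValue && values.contains(stats.min));
    }

    public static void main(String[] args) {
        RowGroupStats stats = new RowGroupStats(10, 20);
        System.out.println(mightMatchIn(stats, List.of(5L, 15L)));
        System.out.println(mightMatchIn(stats, List.of(1L, 30L)));
        System.out.println(mightMatchNotIn(new RowGroupStats(7, 7), List.of(7L)));
    }
}
```

With Comet's own copy of ParquetFilters, any such improvement made on the Spark side has to be ported by hand, which is the maintenance cost this issue is about.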
Describe the potential solution
Evaluate whether we can remove ParquetFilters from Comet.
Additional context
No response
I have tried this out. Everything works fine except for the fix in https://issues.apache.org/jira/browse/SPARK-46092, which is included in Spark 3.4.3. I will wait until Spark 3.4.3 is released and then remove ParquetFilters.