-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: support array_compact function #1321
base: main
Are you sure you want to change the base?
Conversation
# Conflicts: # native/core/src/execution/planner.rs # native/proto/src/proto/expr.proto # spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala # spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala
# Conflicts: # native/core/src/execution/planner.rs # native/proto/src/proto/expr.proto # spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala # spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala
# Conflicts: # spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala
spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala
Outdated
Show resolved
Hide resolved
spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala
Outdated
Show resolved
Hide resolved
@@ -2428,6 +2428,22 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde with CometExprShim | |||
withInfo(expr, "unsupported arguments for ArrayJoin", exprs: _*) | |||
None | |||
} | |||
case expr @ ArrayFilter(child, _) if ArrayCompact(child).replacement.sql == expr.sql => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you add a flag to enable and disable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the PR only contains basic tests, could you add a check to enable this expression only if CometConf.COMET_CAST_ALLOW_INCOMPATIBLE
is enabled? We can remove this check in a future PR that adds comprehensive tests and demonstrates that we have Spark-compatible behavior for all supported data types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
supported the latest Interface
# Conflicts: # native/core/src/execution/planner.rs # native/proto/src/proto/expr.proto # spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala # spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala
Which issue does this PR close?
Related to Epic: #1042
array_compact: SELECT array_compact(array(1, 2, 3, null)) => array(1, 2, 3)
DataFusion' s array_compact has same behavior with Spark 's array_compact function
Spark: https://docs.databricks.com/en/sql/language-manual/functions/array_compact.html
DataFusion: https://datafusion.apache.org/user-guide/sql/scalar_functions.html#array-remove-all
Rationale for this change
Defined under Epic: #1042
What changes are included in this PR?
planner.rs: Maps Spark 's arrays_compact function to DataFusion array_remove_all_udf physical expression from Spark physical expression
expr.proto: arrays_compact message has been added,
QueryPlanSerde.scala: arrays_compact pattern matching case has been added,
CometExpressionSuite.scala: A new UT has been added for arrays_compact function.
How are these changes tested?
A new UT has been added.