-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Move all array_* serde to new framework, use correct INCOMPAT config #1349
Conversation
spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala
Outdated
Show resolved
Hide resolved
case _: ArrayRemove => toProto(CometArrayRemove) | ||
case _: ArrayContains => toProto(CometArrayContains) | ||
case _: ArrayAppend => toProto(CometArrayAppend) | ||
case _: ArrayIntersect => toProto(CometArrayIntersect) | ||
case _: ArrayJoin => toProto(CometArrayJoin) | ||
case _: ArraysOverlap => toProto(CometArraysOverlap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually I would like to move this to a Map[SparkExpr, CometExpr]
@@ -3490,3 +3437,6 @@ trait CometExpressionSerde { | |||
inputs: Seq[Attribute], | |||
binding: Boolean): Option[ExprOuterClass.Expr] | |||
} | |||
|
|||
/** Marker trait for an expression that is not guaranteed to be 100% compatible with Spark */ | |||
trait IncompatExpr {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Later, we could add a method to this trait to provide the reason why an expression is incompatible and this could be used to generate documentation.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1349 +/- ##
=============================================
- Coverage 56.12% 39.16% -16.96%
- Complexity 976 2065 +1089
=============================================
Files 119 262 +143
Lines 11743 60323 +48580
Branches 2251 12836 +10585
=============================================
+ Hits 6591 23627 +17036
- Misses 4012 32223 +28211
- Partials 1140 4473 +3333 ☔ View full report in Codecov by Sentry. |
conf("spark.comet.expression.allowIncompatible") | ||
.doc( | ||
"Comet is not currently fully compatible with Spark for all expressions. " + | ||
"Set this config to true to allow them anyway. See compatibility guide " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Set this config to true to allow them anyway. See compatibility guide " + | |
"Set this config to true to allow them anyway. See Compatibility Guide " + |
.doc( | ||
"Comet is not currently fully compatible with Spark for all expressions. " + | ||
"Set this config to true to allow them anyway. See compatibility guide " + | ||
"for more information.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"for more information.") | |
"for more information. (docs/user-guide/compatibility.md)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit to update this by referencing a new CometConf.COMPAT_GUIDE
variable with the link to the published documentation.
I regenerated the compatibility guide as well.
docs/source/user-guide/configs.md
Outdated
@@ -64,6 +64,7 @@ Comet provides the following configuration settings. | |||
| spark.comet.explain.native.enabled | When this setting is enabled, Comet will provide a tree representation of the native query plan before execution and again after execution, with metrics. | false | | |||
| spark.comet.explain.verbose.enabled | When this setting is enabled, Comet will provide a verbose tree representation of the extended information. | false | | |||
| spark.comet.explainFallback.enabled | When this setting is enabled, Comet will provide logging explaining the reason(s) why a query stage cannot be executed natively. Set this to false to reduce the amount of logging. | false | | |||
| spark.comet.expression.allowIncompatible | Comet is not currently fully compatible with Spark for all expressions. Set this config to true to allow them anyway. See compatibility guide for more information. | false | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| spark.comet.expression.allowIncompatible | Comet is not currently fully compatible with Spark for all expressions. Set this config to true to allow them anyway. See compatibility guide for more information. | false | | |
| spark.comet.expression.allowIncompatible | Comet is not currently fully compatible with Spark for all expressions. Set this config to true to allow them anyway. See Compatibility Guide for more information. (docs/user-guide/compatibility.md) | false | |
case _: IncompatExpr if !CometConf.COMET_EXPR_ALLOW_INCOMPATIBLE.get() => | ||
withInfo( | ||
expr, | ||
s"$expr is not fully compatible with Spark. To enable it anyway, set " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s"$expr is not fully compatible with Spark. To enable it anyway, set " + | |
s"$expr is not fully compatible with Spark, see details in Compatibility Guide. To enable it anyway, set " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit to update this by referencing a new CometConf.COMPAT_GUIDE
variable with the link to the published documentation.
Co-authored-by: Oleks V <[email protected]>
…t into array-incompat
@@ -64,6 +64,7 @@ Comet provides the following configuration settings. | |||
| spark.comet.explain.native.enabled | When this setting is enabled, Comet will provide a tree representation of the native query plan before execution and again after execution, with metrics. | false | | |||
| spark.comet.explain.verbose.enabled | When this setting is enabled, Comet will provide a verbose tree representation of the extended information. | false | | |||
| spark.comet.explainFallback.enabled | When this setting is enabled, Comet will provide logging explaining the reason(s) why a query stage cannot be executed natively. Set this to false to reduce the amount of logging. | false | | |||
| spark.comet.expression.allowIncompatible | Comet is not currently fully compatible with Spark for all expressions. Set this config to true to allow them anyway. For more information, refer to the Comet Compatibility Guide (https://datafusion.apache.org/comet/user-guide/compatibility.html). | false | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker but looks like the compatibility guide does not mention about the array related issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I had updated this locally but forgot that these changes have to go in compatibility-template.md
instead of compatibility,md
. 🤦♂️
I have added them now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pending ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Which issue does this PR close?
Part of #1345
Rationale for this change
This PR updates the recently added
array_*
expression handling to mark them all as incompatible for now because we do not have comprehensive testing or validation for which data types are supportedThe check is now based on a new
COMET_EXPR_ALLOW_INCOMPATIBLE
config instead ofCOMET_CAST_ALLOW_INCOMPATIBLE
, which I had previously (mistakenly) suggested.Rather than duplicate the
if INCOMPAT
check and the associated error message, there is now a new marker traitIncompatExpr
.What changes are included in this PR?
IncompExpr
traitarray_*
serde into separate classesCometExpressionSuite
toCometArrayExpressionSuite
How are these changes tested?
Existing tests