[SPARK-48353][SQL] Introduction of Exception Handling mechanism in SQL Scripting #49427
Conversation
…4-condition-handlers

# Conflicts:
#   common/utils/src/main/resources/error/error-conditions.json
#   sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecution.scala
#   sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerWithSparkContextSuite.scala
Option(ctx.multipartIdentifier())
  .foreach { conditionContext =>
    val conditionNameParts = visitMultipartIdentifier(conditionContext)
    val conditionNameString = conditionNameParts.mkString(".").toUpperCase(Locale.ROOT)
nit: we only allow one name part, so this can be moved after the length check and do conditionNameParts.head.toUpperCase...
Here we also allow multipart names because of builtin conditions. I agree that we can move `toUpperCase` after the length check.
Oh, I see. Shall we check that the user-defined error condition name does not contain special characters (only allow A-Z, digits and underscores), in case it conflicts with a qualified builtin error condition name?
I will add an `assertConditionName` check in `visitDeclareConditionStatementImpl` to only allow A-Z, digits and underscores. Anyway, we check whether the condition name in a handler declaration is either a user-defined or a builtin condition name, so for user-defined conditions it is enough to validate the name during condition declaration.
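For illustration, here is a minimal sketch of the kind of validation being discussed; the helper names (`assertConditionName`, `normalizeConditionName`) and the error type are assumptions for the sketch, not the exact code added in this PR:

```scala
import java.util.Locale

object ConditionNameChecks {
  // Hypothetical helper: accept only A-Z, digits and underscores, so a user-defined
  // condition name can never look like a dot-qualified builtin condition name.
  def assertConditionName(name: String): Unit = {
    val valid = name.nonEmpty &&
      name.forall(c => (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c == '_')
    if (!valid) {
      // Placeholder error; the real implementation would raise a proper error condition.
      throw new IllegalArgumentException(s"Invalid condition name: $name")
    }
  }

  // Hypothetical declaration-time flow: enforce a single name part first,
  // then upper-case and validate it.
  def normalizeConditionName(conditionNameParts: Seq[String]): String = {
    require(conditionNameParts.length == 1, "User-defined condition names must have one part.")
    val conditionName = conditionNameParts.head.toUpperCase(Locale.ROOT)
    assertConditionName(conditionName)
    conditionName
  }
}
```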
 */
case class ErrorCondition(
    conditionName: String,
    sqlState: String) extends CompoundPlanStatement {
Does this need to be a `CompoundPlanStatement`? It seems like an intermediate data structure we use in the parser, and it will end up as a `Map[String, String]` in `CompoundBody`.
Yes, there is no need for this to be a `CompoundPlanStatement`. I will update this.
When I think about this, `ErrorCondition` is in a way a `CompoundPlanStatement`, because we use it to declare a condition and store its details. In `visitCompoundStatementImpl` we return `CompoundPlanStatement`, and `ErrorCondition` is one of them (also in the parser rule). Looking at this, `ErrorCondition` (or `DeclareConditionStatement`) can either extend `SingleStatement`, or we can leave it as it is now. What's your opinion @cloud-fan?
I'm fine to keep it as it is
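For context, here is a simplified sketch of the shape being discussed; the field names and the `conditions` map layout are assumptions based on this thread, not the actual Spark definitions:

```scala
// Simplified sketch: the parser produces one ErrorCondition per
// DECLARE ... CONDITION statement, and the enclosing CompoundBody keeps the
// declared conditions as a name -> SQLSTATE map.
sealed trait CompoundPlanStatement

case class ErrorCondition(
    conditionName: String,
    sqlState: String) extends CompoundPlanStatement

case class CompoundBody(
    collection: Seq[CompoundPlanStatement],
    // Upper-cased condition name -> SQLSTATE it was declared for.
    conditions: Map[String, String]) extends CompoundPlanStatement

object CompoundBody {
  // Collapse the ErrorCondition nodes of a body into the lookup map.
  def buildConditions(statements: Seq[CompoundPlanStatement]): Map[String, String] =
    statements.collect { case ErrorCondition(name, sqlState) => name -> sqlState }.toMap
}
```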
LGTM except for some nits
@cloud-fan Thank you for the review(s)!
@cloud-fan Please don't merge until I take one last look. I was doing something locally last night and I just need to check first whether the comments I had are applicable to this PR. I'll let you know once I'm done.
Minor comments, otherwise LGTM.
I agree that a more centralized approach should be used, but with the current tests I think we covered all possible checks in the code to be case-insensitive. As a follow-up, I will try to think of a more clever way of ensuring case-insensitivity, because this can be a problem if somebody changes or adds new code without case sensitivity in mind. Thanks for the comment!
thanks, merging to master/4.0!
…L Scripting

### What changes were proposed in this pull request?
This pull request introduces the logic of error handling inside the SQL Scripting language. Now, it is possible to:
- declare conditions for specific SQL states (currently only valid in the scope where they are defined)
- declare handlers to catch and process errors that are raised during statement execution

Rules for selecting the most appropriate handler:
- Named condition handlers are most specific.
- SQLSTATE handlers are next in specificity.
- Generic NOT FOUND and SQLEXCEPTION handlers are least specific.

Note: Handlers defined in the innermost compound statement where the exception was raised are considered.

### Why are the changes needed?
The intent is to add the possibility for users to handle SQL errors in a custom-defined way.

### Limitations
- Currently, only the `EXIT` handler is supported. Support for `CONTINUE` handlers will come in the future.
- It is only possible to declare a condition in its full form by specifying SQLSTATE. The short form with a default SQLSTATE is not yet supported.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
There are already existing test suites for SQL scripting that have been improved to test the new functionalities:
- `SqlScriptingParserSuite`
- `SqlScriptingExecutionSuite`
- `SqlScriptingE2eSuite`

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #49427 from miland-db/milan-dankovic_data/refactor-execution-4-condition-handlers.

Authored-by: Milan Dankovic <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 0f163a5)
Signed-off-by: Wenchen Fan <[email protected]>
      parameters = Map("sqlState" -> "12345"))
  }

  test("Specific condition takes precedence over sqlState") {
There are 13 new tests in `SqlScriptingExecutionSuite` and `SqlScriptingE2eSuite` that will fail to execute in non-ANSI mode, and the NON-ANSI daily test failed. We can reproduce it using the following command: `SPARK_ANSI_SQL_MODE=false build/sbt "sql/testOnly org.apache.spark.sql.scripting.SqlScriptingE2eSuite org.apache.spark.sql.scripting.SqlScriptingExecutionSuite"`. Do you have time to fix this issue? @miland-db
I will fix this. The issue is that zero division is not throwing an error when `SPARK_ANSI_SQL_MODE=false`. The solution can be to override this config for this suite, or to find some other error that has the same behavior regardless of ANSI mode. Any thoughts @LuciferYang?
Let's just enable ANSI for the entire suite by overriding `def sparkConf`. Example:

    class AvroV1Suite extends AvroSuite {
      override protected def sparkConf: SparkConf =
        super
          .sparkConf
          .set(SQLConf.USE_V1_SOURCE_LIST, "avro")
    }
> Let's just enable ANSI for the entire suite by overriding `def sparkConf`.

If the 13 failed cases are all due to the same reason, I think it's ok.
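Applied to the scripting suites, the fix could look roughly like the sketch below; the suite name and base traits here are assumptions for illustration, and the follow-up PR referenced next is the authoritative change:

```scala
package org.apache.spark.sql.scripting

import org.apache.spark.SparkConf
import org.apache.spark.sql.QueryTest
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.test.SharedSparkSession

// Hypothetical sketch: keep ANSI mode on for the whole suite so errors such as
// division by zero are actually raised and the handlers under test have
// something to catch, regardless of the SPARK_ANSI_SQL_MODE environment variable.
class AnsiEnabledScriptingSuite extends QueryTest with SharedSparkSession {
  override protected def sparkConf: SparkConf =
    super.sparkConf
      .set(SQLConf.ANSI_ENABLED.key, "true")
}
```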
…suites

### What changes were proposed in this pull request?
Keep ANSI enabled for SQL Scripting execution suites to enable proper testing of exception handling.

### Why are the changes needed?
13 new tests in `SqlScriptingExecutionSuite` and `SqlScriptingE2eSuite` introduced in #49427 will fail to execute in non-ANSI mode, and the NON-ANSI daily test failed. If ANSI is off, code that should trigger an exception fails silently (zero division does not throw). We want to keep ANSI behavior so that we can test exception handling properly.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #49852 from miland-db/milan-dankovic_data/exception-handlers-followup-fix-test.

Authored-by: Milan Dankovic <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
What changes were proposed in this pull request?
This pull request introduces the logic of error handling inside the SQL Scripting language. Now, it is possible to:
- declare conditions for specific SQL states (currently only valid in the scope where they are defined)
- declare handlers to catch and process errors that are raised during statement execution

Rules for selecting the most appropriate handler:
- Named condition handlers are most specific.
- SQLSTATE handlers are next in specificity.
- Generic NOT FOUND and SQLEXCEPTION handlers are least specific.

Note: Handlers defined in the innermost compound statement where the exception was raised are considered. (A sketch of a script using these constructs follows below.)
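To make this concrete, here is a hedged illustration of a script that declares a condition in its full form and attaches an `EXIT` handler to it. The statement syntax follows the SQL/PSM-style grammar this PR adds, but the script text, the SQLSTATE value and the `spark.sql` invocation (which may require SQL scripting to be enabled in the session) are illustrative assumptions rather than code taken from the PR:

```scala
import org.apache.spark.sql.SparkSession

object HandlerPrecedenceExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("sql-scripting-handlers").getOrCreate()

    // DIVIDE_BY_ZERO is a user-declared condition bound to SQLSTATE '22012'.
    // When SELECT 1 / 0 raises that SQLSTATE (under ANSI mode), the named-condition
    // handler is the most specific match; a plain SQLSTATE '22012' handler would be
    // next in specificity, and a generic SQLEXCEPTION handler least specific.
    val script =
      """BEGIN
        |  DECLARE DIVIDE_BY_ZERO CONDITION FOR SQLSTATE '22012';
        |  DECLARE EXIT HANDLER FOR DIVIDE_BY_ZERO
        |  BEGIN
        |    SELECT 'handled divide by zero';
        |  END;
        |  SELECT 1 / 0;
        |END""".stripMargin

    spark.sql(script).show()
  }
}
```

With `EXIT` semantics, the handler body runs and execution then leaves the compound statement in which the handler is declared.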
Why are the changes needed?
The intent is to add the possibility for users to handle SQL errors in a custom-defined way.
Limitations
- Currently, only the `EXIT` handler is supported. Support for `CONTINUE` handlers will come in the future.
- It is only possible to declare a condition in its full form by specifying SQLSTATE. The short form with a default SQLSTATE is not yet supported.

Does this PR introduce any user-facing change?
No.
How was this patch tested?
There are already existing test suites for SQL scripting that have been improved to test new functionalities:
- `SqlScriptingParserSuite`
- `SqlScriptingExecutionSuite`
- `SqlScriptingE2eSuite`
Was this patch authored or co-authored using generative AI tooling?
No.