@szehon-ho szehon-ho (Member) commented Sep 8, 2025

What changes were proposed in this pull request?

#49962 added a fallback for catalogs that already contain broken (i.e., unresolved) persisted default values. A broken value is one containing an unresolved expression such as `current_database`, `current_user`, or `current_timestamp`; these are non-deterministic and produce wrong results in EXISTS_DEFAULT, where the user expects the value to have been resolved at the time they set the default.

This PR adds yet another fallback for broken default values, in this case ones that contain nested function calls.

Why are the changes needed?

Take the case where the EXISTS_DEFAULT is:

```
CONCAT(YEAR(CURRENT_DATE), LPAD(WEEKOFYEAR(CURRENT_DATE), 2, '0'))
```

The current code `Literal.fromSQL(defaultSQL)` throws the following exception before reaching the fallback:

```
Caused by: java.lang.AssertionError: assertion failed: function arguments must be resolved.
	at scala.Predef$.assert(Predef.scala:279)
	at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$.$anonfun$expressionBuilder$1(FunctionRegistry.scala:1278)
	at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistryBase.lookupFunction(FunctionRegistry.scala:251)
	at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistryBase.lookupFunction$(FunctionRegistry.scala:245)
	at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:317)
	at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$fromSQL$1.applyOrElse(literals.scala:325)
	at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$fromSQL$1.applyOrElse(literals.scala:317)
	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$4(TreeNode.scala:586)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:121)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:586)
	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:579)
	at scala.collection.immutable.List.map(List.scala:251)
	at scala.collection.immutable.List.map(List.scala:79)
	at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:768)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:579)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:556)
	at org.apache.spark.sql.catalyst.expressions.Literal$.fromSQL(literals.scala:317)
	at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.analyzeExistenceDefaultValue(ResolveDefaultColumnsUtil.scala:393)
	at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.getExistenceDefaultValue(ResolveDefaultColumnsUtil.scala:529)
	at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.$anonfun$getExistenceDefaultValues$1(ResolveDefaultColumnsUtil.scala:524)
	at scala.collection.ArrayOps$.map$extension(ArrayOps.scala:936)
	at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.getExistenceDefaultValues(ResolveDefaultColumnsUtil.scala:524)
	at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.$anonfun$existenceDefaultValues$2(ResolveDefaultColumnsUtil.scala:594)
	at scala.Option.getOrElse(Option.scala:201)
	at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.existenceDefaultValues(ResolveDefaultColumnsUtil.scala:592)
```
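To make the shape of the fix concrete, here is a minimal sketch (in Python, not Spark's actual Scala code; all function names here are hypothetical stand-ins) of the guarded-fallback pattern the change applies: try the fast literal-parsing path first, and if it fails for any reason, including an `AssertionError`-style failure on unresolved nested function calls, fall back to fully analyzing the stored SQL text.

```python
def parse_literal(sql: str):
    """Fast path: accepts only simple literals (hypothetical stand-in
    for Literal.fromSQL)."""
    try:
        return int(sql)
    except ValueError:
        raise ValueError(f"not a simple literal: {sql}")

def full_analysis(sql: str):
    """Slow path: stand-in for parsing, resolving, and constant-folding
    the stored expression text."""
    return f"analyzed({sql})"

def analyze_existence_default(sql: str):
    # The key point of the fix: any failure in the fast path must trigger
    # the fallback, not just a known error class -- in the reported bug an
    # AssertionError escaped before the fallback was reached.
    try:
        return parse_literal(sql)
    except Exception:
        return full_analysis(sql)

print(analyze_existence_default("42"))
print(analyze_existence_default(
    "CONCAT(YEAR(CURRENT_DATE), LPAD(WEEKOFYEAR(CURRENT_DATE), 2, '0'))"))
```

The design point is that the fallback boundary must catch every failure mode of the fast path, since corrupt persisted values can fail in ways the fast path never anticipated.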

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added a unit test in StructTypeSuite.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Sep 8, 2025
@dongjoon-hyun dongjoon-hyun (Member) left a comment


Hi, @szehon-ho and @dtenedor .

SPARK-51119 is part of Apache Spark 4.0.0. We cannot make a follow-up for a released JIRA issue, because the Fixed Version of this PR should be 4.1.0 and 4.0.2.

Please file a new JIRA issue to proceed with this kind of bug fix.

@dongjoon-hyun

BTW, it would be great if you could prefer a new JIRA issue over too many follow-ups like SPARK-51119.

```
$ git log --oneline | grep SPARK-51119
4731c85c556 [SPARK-51119][SQL][FOLLOW-UP] Fix missing fallback case for parsing corrupt exists_default value
2a075956dc4 [SPARK-51119][SQL][FOLLOW-UP] Add fallback to ResolveDefaultColumnsUtil existenceDefaultValues
3455b822188 [SPARK-51119][SQL][FOLLOW-UP] ColumnDefinition.toV1Column should preserve EXISTS_DEFAULT resolution
63cb5204079 [SPARK-51119][SQL][FOLLOW-UP] Readers on executors resolving EXISTS_DEFAULT should not call catalogs
f959716556d [SPARK-51119][SQL] Readers on executors resolving EXISTS_DEFAULT should not call catalogs
```

@szehon-ho szehon-ho changed the title [SPARK-51119][SQL][FOLLOW-UP] Fix yet another missing fallback case for parsing corrupt exists_default value [SPARK-53527][SQL] Improve fallback of analyzeExistenceDefaultValue Sep 8, 2025
@szehon-ho szehon-ho (Member, Author) commented Sep 8, 2025

Yes, sure! I made a new JIRA: https://issues.apache.org/jira/browse/SPARK-53527. Thanks for the suggestion!

@cloud-fan cloud-fan (Contributor)

thanks, merging to master/4.0!

@cloud-fan cloud-fan closed this in 2f305b6 Sep 9, 2025
cloud-fan pushed a commit that referenced this pull request Sep 9, 2025
Closes #52274 from szehon-ho/more_default_value_fallback.

Authored-by: Szehon Ho <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 2f305b6)
Signed-off-by: Wenchen Fan <[email protected]>