-
Notifications
You must be signed in to change notification settings - Fork 162
Support casting date literal to timestamp #3831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
date str -> timestamp timestamp str -> date timestamp str -> time Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
rows("New York"), | ||
rows("Ontario"), | ||
rows("Quebec"), | ||
rows((Object) null), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job to support passing null
in verifyDataRows
"source=%s | eval a = cast('09:07:42' as DATE) | fields a", | ||
TEST_INDEX_DATE_FORMATS))); | ||
|
||
verifyErrorMessageContains(t, "date:09:07:42 in unsupported format, please use 'yyyy-MM-dd'"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems we are not aligning with Spark, so this is the behviour of v2 or mysql?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the behavior of v2.
In MySQL, cast a time string to date will return NULL; in some other databases like Postgres, it raises an error.
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling UDFs for casting may render it impossible to e.g. pushdown
Is this still a concern if expression pushdown is supported?
Meanwhile in V2, even though expression pushdown is available, I recall there is performance degradation when CAST is involved—queries may fall back from DSL search to expression script evaluation. Probably we can double check and see if there’s a possible workaround in V3.
There happens to be a new issue #3842 related. |
Signed-off-by: Yuanchun Shen <[email protected]>
It should not be a issue when script pushdown is supported. Besides, if it is a literal that has to be cast, it can be pushed-down as a range query with #3798. I added an example with physical plan in the PR description to demonstrate this.
#3798 circumvents script push-down by converting the string literal to another DSL-recognizable literal then pushing down. |
.toInstant(); | ||
} catch (DateTimeParseException e) { | ||
DateTimeParser.parseDateOrTimestamp(timestamp).atZone(ZoneOffset.UTC).toInstant(); | ||
} catch (SemanticCheckException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think SemanticCheckException
is inappropriate here. This issue isn't related to semantic at all. Should be something like IllegalArgumentException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to ExpressionEvaluationException
when failed parsing datetime literals/strings.
.atZone(ZoneOffset.UTC) | ||
.toInstant(); | ||
} catch (DateTimeParseException e) { | ||
DateTimeParser.parseDateOrTimestamp(timestamp).atZone(ZoneOffset.UTC).toInstant(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we support cast time str
-> timestamp
in the future? Maybe replace DATE_TIME_FORMATTER_VARIABLE_NANOS
with DATE_TIME_FORMATTER_VARIABLE_NANOS_OPTIONAL
directly if that's the case.
And I wonder if using DateTimeFormatter
with multiple patterns should be more efficient than DateTimeParser.parseDateOrTimestamp
which try to parseDate and fallback to parseTimeStamp when throwing exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by replacing parse*Or*
calls with new DateTimeFormatter
s
Not related to this issue, I found there are some inefficient implementation for the date related functions, like branch selection at runtime for |
Signed-off-by: Yuanchun Shen <[email protected]>
…ate/time strings Signed-off-by: Yuanchun Shen <[email protected]>
One problem I discovered during testing: Errors thrown during executing PPL with Calcite will often result in 500 error with Calcite. For example query {
"error": {
"reason": "There was internal problem at backend",
"details": "java.sql.SQLException: exception while executing query: timestamp:12:00 in unsupported format, please use 'yyyy-MM-dd HH:mm:ss[.SSSSSSSSS]'",
"type": "RuntimeException"
},
"status": 500
} But in v2, it's: {
"error": {
"reason": "Invalid Query",
"details": "timestamp:12:00 in unsupported format, please use 'yyyy-MM-dd HH:mm:ss[.SSSSSSSSS]'",
"type": "ExpressionEvaluationException"
},
"status": 400
} This is because exceptions thrown during try (PreparedStatement statement = OpenSearchRelRunners.run(context, rel)) {
ResultSet result = statement.executeQuery();
buildResultSet(result, rel.getRowType(), context.querySizeLimit, listener);
} catch (SQLException e) {
throw new RuntimeException(e);
} A simple fix: retrieving the wrapped exception from the example stacktrace
|
I spotted a few for functions under Besides, almost all datetime functions convert inputs to |
Description
Before this PR, only the following casts are supported
With this PR:
The castings that remain impossible are in place to avoid swallowing errors silently.
An example case it solves
... | where date_time > '1950-10-11' | fields date_time
Callout: The exception thrown when failed to parse a date/time string is changed to
ExpressionEvaluationException
(fromSemanticCheckException
). Users relying on the exception type will be affected.Implementation Notes
EXPR_TIMESTAMP
. The values inside are not validated: https://github.com/apache/calcite/blob/a6973bde8107a7c2a605a82c779a64fe28bdb5d1/core/src/main/java/org/apache/calcite/rex/RexBuilder.java#L835TIMESTAMP
,TIME
, orDATE
functions to cast string values.Concerns
Related Issues
Resolves #3728
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.