-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Make SUM
and AVG
Aggregate Type Coercion Explicit
#7369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make SUM
and AVG
Aggregate Type Coercion Explicit
#7369
Conversation
@@ -1010,40 +1010,7 @@ fn aggregate_expressions( | |||
| AggregateMode::SinglePartitioned => Ok(aggr_expr | |||
.iter() | |||
.map(|agg| { | |||
let pre_cast_type = if let Some(Sum { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We no longer need this custom logic, it is handled automatically by the optimizer
SUM
and AVG
Aggregate Type Coercion Explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @tustvold
I think this logic was left over from an earlier time when DataFusion did the type coercion at physical plan time (rather than during logical planning)
I went through it carefully and it makes much more sense to me than what is on master
DataType::Float64 | DataType::Float32 => Ok(DataType::Float64), | ||
DataType::Int64 => Ok(DataType::Int64), | ||
DataType::UInt64 => Ok(DataType::UInt64), | ||
DataType::Float64 => Ok(DataType::Float64), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looked like we lost support for Float32 -- but then I see the test below for it, so 👍 (presumably what happens is the arguments are coerced first and arg_type
refers to the post coerced type)
@@ -70,7 +70,7 @@ fn subquery_filter_with_cast() -> Result<()> { | |||
\n Inner Join: Filter: CAST(test.col_int32 AS Float64) > __scalar_sq_1.AVG(test.col_int32)\ | |||
\n TableScan: test projection=[col_int32]\ | |||
\n SubqueryAlias: __scalar_sq_1\ | |||
\n Aggregate: groupBy=[[]], aggr=[[AVG(test.col_int32)]]\ | |||
\n Aggregate: groupBy=[[]], aggr=[[AVG(CAST(test.col_int32 AS Float64))]]\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is actually nice to see the explicit cast I think
Which issue does this PR close?
Closes #.
Rationale for this change
Currently various accumulators contain internal logic to coerce inputs to supported types, this has a few issues:
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?