Skip to content

Remove AggregateExpr::create_accumulator #7112

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tustvold opened this issue Jul 27, 2023 · 2 comments
Closed

Remove AggregateExpr::create_accumulator #7112

tustvold opened this issue Jul 27, 2023 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@tustvold
Copy link
Contributor

Is your feature request related to a problem or challenge?

AggregateExpr currently provides both the ScalarValue based Accumulator and the new GroupsAccumulator interface. With an additional groups_accumulator_supported to indicate if GroupsAccumulator is supported.

This is unfortunate as it forces us to keep two accumulator implementations for each expression, which is not only wasteful but introduces the possibility of inconsistency between the two implementations

Describe the solution you'd like

Change AggregateExpr::create_accumulator to return GroupsAccumulator, and remove groups_accumulator_supported and create_groups_accumulator. Implementations that currently only have an Accumulator based implementation can return a GroupsAccumulatorAdapter, which is ultimately what GroupedHashAggregateStream will do anyway

Describe alternatives you've considered

No response

Additional context

This blocks #6842

@tustvold tustvold added the enhancement New feature or request label Jul 27, 2023
@tustvold tustvold self-assigned this Jul 27, 2023
@alamb
Copy link
Contributor

alamb commented Jul 27, 2023

I think updating AggregateExpr which is largely an internal implementation detail of DataFusion, to better communicate what is supported and what is not makes a lot of sense to me.

I think there are two accumulators now because Accumulator has an API to do retractable updates (incremental updates)
Accumulator is part of the public API so I don't think it can / should be removed

@alamb
Copy link
Contributor

alamb commented Jul 27, 2023

The only thing that might be tricky with this proposal is making sure the right implementation is provided for accumulators that can do retractable updates (like sum and count)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants