Refactor the SQL implementation to include the SQLTable trait and add support for parameterized views. #117
…eterized views

This PR primarily aims to refactor the SQL implementation and additionally extends the traits and their default implementation to support storing function arguments for tables.

- Adds a SQLTable trait to be used within SQLTableSource. The SQLTable trait abstracts information about the remote table and allows implementors of the trait to hook into the final stages, where they can change the logical plan and the final AST for the SQL query that is being federated via VirtualExecutionPlan.
- Adds RemoteTable, a default implementation of the SQLTable trait, capable of handling tables and parameterized views.
- Adds RemoteTableRef, an extension to the default TableReference capable of storing function args.
- Provides a default AST analyzer for rewriting the Statement for tables that contain a RemoteTableRef with function args.
- Extends the SQLExecutor trait with a logical_optimizer method. This allows an executor to hook into federation planning, rewriting the LogicalPlan and even the placement of the FederationPlanNode. This is useful for avoiding federating nodes that exist only in DataFusion, e.g. UDFs, UDAFs, etc.
- Refactoring and tests related to the usage of this feature.
Overall, this makes sense to me. I do want to mention the following:

We seem to be building out "mini optimizer tooling" inside federation with this PR. It is still predominantly used for rewriting table names (I don't mean the AST optimiser, I think that makes sense). It does make me wonder whether it would be more appropriate to use DataFusion's optimizer tooling for this purpose. This was briefly discussed in #61 but was abandoned at the time. However, it now seems to be leading to additional wiring (E.g.:

If there is no agreement/commitment to the above, I'd be in favor of landing this first and revisiting the table rewrite topic at a later time.
Adding an alias is needed for some of the older database versions.
Any optimization process that is done via For now I am okay with discarding logical optimizer hooks from
I moved,
I suggest we ticket out the optimiser discussion again. Let's not hold up this PR on it.

RE partial federation: that indeed makes sense as a use case. The way I originally intended to support this is with the FederationProvider.Optimizer. The idea is to let remotes that can't execute everything self-select, in there, which part of the plan to federate.
Required for this issue
This PR primarily aims to refactor the SQL implementation and additionally define/extend types and traits to support parameterized views.

The idea behind the change is essentially to separate `SQLExecutor` behavior from the table, allowing tables to define their own state and make changes to the logical plan and AST accordingly. Things like handling four-token multi-part identifiers, attaching WITH context, etc. can be done by creating a custom `SQLTable` implementation in datafusion-table-provider.

Changes
- Adds a `SQLTable` trait, which is used by `SQLTableSource`. The `SQLTable` trait abstracts information about the remote table and allows implementors of the trait to hook into the final stages of federation, where they can change the logical plan and the final AST for the SQL query that will be used within `VirtualExecutionPlan`.
- Adds `RemoteTable`, a default implementation of the `SQLTable` trait, capable of handling tables and parameterized views.
- Adds `RemoteTableRef`, an extension to the default `TableReference` capable of storing function args.
- Provides a default AST analyzer for rewriting the `Statement` for tables that contain a `RemoteTableRef` with function args.
- Extends the `SQLExecutor` trait with a `logical_optimizer` method. This allows an executor to hook into federation planning, rewriting the `LogicalPlan` and even the placement of the `FederationPlanNode`. This is useful for avoiding federating nodes that exist only in DataFusion, e.g. UDFs, UDAFs, etc.
- Refactoring and tests related to the usage of this feature.
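To make the shape of the change easier to picture, here is a minimal, standard-library-only sketch of the idea. It is not the code in this PR: `SqlTableSketch`, `RemoteTableSketch`, `rewrite_sql`, and the fields on this `RemoteTableRef` are hypothetical stand-ins, and the rewrite works on a plain SQL string, whereas the PR describes rewriting the AST `Statement`.

```rust
use std::fmt;

/// Hypothetical stand-in for a table reference that can also carry
/// function arguments (the role the PR describes for `RemoteTableRef`).
#[derive(Debug, Clone, PartialEq)]
pub struct RemoteTableRef {
    pub name: String,
    pub args: Option<Vec<String>>,
}

impl RemoteTableRef {
    /// Parses `name` or `name(arg1, arg2)`. Deliberately naive: arguments
    /// are split on commas, so commas inside quoted literals are not handled.
    pub fn parse(input: &str) -> Result<Self, String> {
        let input = input.trim();
        match input.find('(') {
            None if input.is_empty() => Err("empty table reference".to_string()),
            None => Ok(Self { name: input.to_string(), args: None }),
            Some(open) => {
                if !input.ends_with(')') {
                    return Err(format!("unbalanced parentheses in {}", input));
                }
                let name = input[..open].trim();
                if name.is_empty() {
                    return Err(format!("missing table name in {}", input));
                }
                let inner = &input[open + 1..input.len() - 1];
                let args = inner
                    .split(',')
                    .map(|arg| arg.trim().to_string())
                    .filter(|arg| !arg.is_empty())
                    .collect();
                Ok(Self { name: name.to_string(), args: Some(args) })
            }
        }
    }
}

impl fmt::Display for RemoteTableRef {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.args {
            Some(args) => write!(f, "{}({})", self.name, args.join(", ")),
            None => write!(f, "{}", self.name),
        }
    }
}

/// Hypothetical, simplified analogue of a per-table hook: the table owns
/// its state and may adjust the final query before it is sent to the remote.
pub trait SqlTableSketch {
    fn table_ref(&self) -> &RemoteTableRef;

    /// Default behavior: leave the query unchanged.
    fn rewrite_sql(&self, sql: String) -> String {
        sql
    }
}

pub struct RemoteTableSketch {
    pub table_ref: RemoteTableRef,
}

impl SqlTableSketch for RemoteTableSketch {
    fn table_ref(&self) -> &RemoteTableRef {
        &self.table_ref
    }

    /// Re-attaches the stored arguments to the table name in the FROM clause.
    /// Plain text replacement is an assumption made for brevity; it would
    /// also match longer names that share this prefix.
    fn rewrite_sql(&self, sql: String) -> String {
        match &self.table_ref.args {
            Some(_) => {
                let plain = format!("FROM {}", self.table_ref.name);
                let with_args = format!("FROM {}", self.table_ref);
                sql.replace(&plain, &with_args)
            }
            None => sql,
        }
    }
}
```

In this sketch, a table built from `RemoteTableRef::parse("sales_by_region('EU', 2024)")` keeps the arguments as table state, and `rewrite_sql` puts them back into a query that only mentions `sales_by_region`. A plain table such as `orders` has no arguments and passes the query through untouched.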