Proposal: Support for Parameterized Views #301
Comments
Since support for parameterized views is merged into Awaiting this to be merged first #324
We've already started going down this route (see MySQL: https://github.com/datafusion-contrib/datafusion-table-providers/blob/main/core/src/mysql/federation.rs). In general I think the
This crate currently supports registering remote tables from different providers. However, it lacks support for registering remote views and procedures, which makes it difficult to query views and parameterized views. This feature would be particularly useful for people who want DataFusion to have only partial access to a table (for example, in multi-tenancy) or who want to hide complicated vendor-specific logic behind a view.
Problems
SqlTable or any of the vendor-specific implementations.
datafusion-federation crate to support this functionality.
Proposed Solution:
Based on the experimenting I have done so far, I want to propose the following changes to bring in this feature.
Either extend SqlTable (and any similar vendor-specific table provider) to store optional arguments as sqlparser::ast::TableFunctionArgs, or create a new type similar to SqlTable called SqlView.
Extend SyncDbConnection and AsyncDbConnection with a new method, get_view_schema, for retrieving the schema of a view.
When creating the SQL string for SqlExec, first create a basic logical plan with an alias set and unparse the logical plan to a Statement. Then use a visitor to append the arguments, if any.
A similar approach can be taken in the datafusion-federation sql module.
Other Benefits
This will also allow for a sort of curried table function: essentially, one could define a function like read_from(path), which can then internally create a new table provider that calls a remote view read_from(path, cluster).
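To make the curried idea concrete, here is a minimal std-only Rust sketch. It models just enough of a SQL statement to show the "append the arguments, if any" step from the proposed solution, plus a curried read_from(path) that fixes the cluster argument up front. Every name in it (TableFactor, Select, append_view_args, sql_str, read_from_on) is a hypothetical stand-in for illustration, not the sqlparser, DataFusion, or datafusion-table-providers API, and the argument syntax a real remote engine expects may differ.

```rust
use std::fmt;

/// Hypothetical stand-in for one `FROM` item in a SQL AST
/// (illustration only; this is not the sqlparser type).
#[derive(Debug, Clone, PartialEq)]
struct TableFactor {
    name: String,
    alias: Option<String>,
    /// `None` for a plain table, `Some(args)` for a parameterized view.
    args: Option<Vec<String>>,
}

impl fmt::Display for TableFactor {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.name)?;
        if let Some(args) = &self.args {
            write!(f, "({})", args.join(", "))?;
        }
        if let Some(alias) = &self.alias {
            write!(f, " AS {}", alias)?;
        }
        Ok(())
    }
}

/// Hypothetical stand-in for the statement produced by unparsing a plan.
#[derive(Debug, Clone)]
struct Select {
    projection: Vec<String>,
    from: Vec<TableFactor>,
}

impl fmt::Display for Select {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let from: Vec<String> = self.from.iter().map(|t| t.to_string()).collect();
        write!(f, "SELECT {} FROM {}", self.projection.join(", "), from.join(", "))
    }
}

/// Visitor-like pass: attach `args` to every table factor named `view`.
/// With no arguments the statement is left untouched.
fn append_view_args(stmt: &mut Select, view: &str, args: &[String]) {
    if args.is_empty() {
        return;
    }
    for table in stmt.from.iter_mut().filter(|t| t.name == view) {
        table.args = Some(args.to_vec());
    }
}

/// Quote a value as a SQL string literal, doubling embedded single quotes.
fn sql_str(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

/// Curried table function: fix `cluster` once and get back a
/// `read_from(path)` that targets the remote `read_from(path, cluster)`.
fn read_from_on(cluster: &str) -> impl Fn(&str) -> Select {
    let cluster = cluster.to_string();
    move |path: &str| {
        // Start from a plain aliased table reference, as unparsing would give.
        let mut stmt = Select {
            projection: vec!["*".to_string()],
            from: vec![TableFactor {
                name: "read_from".to_string(),
                alias: Some("t".to_string()),
                args: None,
            }],
        };
        // Then append the view arguments in a separate pass.
        append_view_args(&mut stmt, "read_from", &[sql_str(path), sql_str(&cluster)]);
        stmt
    }
}
```

Calling read_from_on("eu-1") returns a closure; applying it to a path yields a statement whose FROM item carries both arguments. In a real implementation the same pass would presumably run over the sqlparser Statement, and the values would need whatever escaping or parameter binding the remote engine requires.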
Open Questions:
SqlTable or a new type will be better for it?
We’d love feedback from the community on this proposal. Are there use cases we might be overlooking? Would this fit well within the scope of this project?
Note
I will update this thread and provide a PR or reference to my forked branch that people can look at and see if this seems okay.