Skip to content

Add try_new for LogicalPlan::Join Join and others #14363

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
phisn opened this issue Jan 29, 2025 · 9 comments · Fixed by #15757
Closed

Add try_new for LogicalPlan::Join Join and others #14363

phisn opened this issue Jan 29, 2025 · 9 comments · Fixed by #15757
Assignees
Labels
enhancement New feature or request

Comments

@phisn
Copy link
Contributor

phisn commented Jan 29, 2025

Is your feature request related to a problem or challenge?

Currently one has to manually add the schema when creating a join or give an empty one and call recompute_schema. It would be nice to have a try_new which handles schema creation as well as does maybe some sanity checking.

Describe the solution you'd like

Add try_new to Join (and to all other plans currently missing it).

Describe alternatives you've considered

N/A

Additional context

Quality of life

@phisn phisn added the enhancement New feature or request label Jan 29, 2025
@phisn phisn changed the title Add try_new for LogicalPlan::Join Join Add try_new for LogicalPlan::Join Join and others Jan 29, 2025
@findepi
Copy link
Member

findepi commented Jan 30, 2025

Join output schema is very difficult topic, especially when using USING.
It might be there is no one size fits all answer, but some (well defined) answer may be packaged as try_new (or differently named method), as long as it's not enforced as the only blessed way to construct Join.

I see how recompute_schema may be useful in the case where it's a newly constructed Join and doesn't have good schema yet. Fixing schema at construction time would be better -- ie creating truth is better than creating falsehood and fixing it later. Especially that recompute_schema has its own problems (#14357).

@alamb
Copy link
Contributor

alamb commented Jan 30, 2025

Add try_new to Join (and to all other plans currently missing it).

FWIW I think this would be a very nice API to add. Thank you for proposing it

@alamb
Copy link
Contributor

alamb commented Jan 30, 2025

Additional context
Quality of life

LOL

@Spaarsh
Copy link
Contributor

Spaarsh commented Feb 7, 2025

I would like to work on this. I will be taking a look at a few try_new implementations in benchmarks/src/clickbench.rs datafusion/catalog/src/streaming.rs and datafusion/execution/src/disk_manager.rs for understanding this better.

@Spaarsh
Copy link
Contributor

Spaarsh commented Feb 7, 2025

take

@Spaarsh
Copy link
Contributor

Spaarsh commented Feb 18, 2025

I won't be able to work on this issue. Anyone willing to work on this is free to get it assigned to themselves. I don't see any unassign button here.

@alamb
Copy link
Contributor

alamb commented Feb 19, 2025

Thanks @Spaarsh -- I unassigned you

@kumarlokesh
Copy link
Contributor

take

@kumarlokesh
Copy link
Contributor

@phisn @alamb made an attempt to address this requirement here: #15757. Please have a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants