Add files to add existing Parquet files to a table #932
Comments
I would like to try working on this.
Thanks @jonathanc-n! Feel free to send the PR for this.
@ZENOTME When appending existing data files, should the system load file metadata by reading the current snapshot’s manifest lists from an existing Iceberg table, or would you prefer to specify a file path from which the system scans and infers metadata? I'm looking to just perform a
Hi @jonathanc-n, I think we can refer to the pyiceberg implementation: https://github.com/apache/iceberg-python/blob/main/pyiceberg/table/__init__.py#L669C9-L669C18.
I think the user will add files through the transaction API, so we will know which table the files are appended to and the related metadata.
In #345, we added support for writing new data files and appending them to the table. However, we do not yet support appending existing data files, which requires reading those files and generating the corresponding DataFile metadata.
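To make the "read an existing file, then generate metadata" step concrete, here is a minimal sketch in Python using only the standard library. It is not the pyiceberg or iceberg-rust implementation. `DataFileStub` and `describe_parquet_file` are hypothetical names invented for illustration. The sketch relies only on the Parquet file framing (a leading `PAR1` magic, then a 4-byte little-endian footer length followed by a trailing `PAR1`), and it assumes an unencrypted footer. Record counts, column statistics, and partition values live in the Thrift-encoded footer and would need a real Parquet reader, which this sketch does not attempt.

```python
import os
import struct
from dataclasses import dataclass

PARQUET_MAGIC = b"PAR1"
# Leading magic (4) + footer length (4) + trailing magic (4).
MIN_PARQUET_SIZE = 12


@dataclass
class DataFileStub:
    """Hypothetical stand-in for a DataFile entry; not a real library type."""

    file_path: str
    file_format: str
    file_size_in_bytes: int
    footer_length: int


def describe_parquet_file(path: str) -> DataFileStub:
    """Check the Parquet framing of an existing file and collect basic metadata."""
    size = os.path.getsize(path)
    if size < MIN_PARQUET_SIZE:
        raise ValueError(f"{path} is too small to be a Parquet file")

    with open(path, "rb") as f:
        head = f.read(4)
        # The last 8 bytes hold the footer length followed by the magic.
        f.seek(-8, os.SEEK_END)
        tail = f.read(8)

    if head != PARQUET_MAGIC or tail[4:] != PARQUET_MAGIC:
        raise ValueError(f"{path} does not have Parquet magic bytes")

    (footer_length,) = struct.unpack("<I", tail[:4])
    if footer_length > size - MIN_PARQUET_SIZE:
        raise ValueError(f"{path} declares a footer larger than the file")

    return DataFileStub(
        file_path=path,
        file_format="PARQUET",
        file_size_in_bytes=size,
        footer_length=footer_length,
    )
```

A real add-files path would go further: decode the footer, check that the file schema is compatible with the table schema, and derive the per-column metrics before committing the entries through a transaction.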