Add implementation status of javascript hyparquet #102
Thanks for your update! Really appreciate it.
I left a couple of cells blank because I was unclear what the row was indicating exactly. For example, for bloom filters, does this mean that the bloom filters are used for optimized querying, or just that the metadata is available for clients to use? Same with Size statistics.
I think it means that the implementation has read and/or write support for the metadata.
Also happy to update if there's a better way to indicate a read-only implementation.
What about using 🇼 and 🇷 respectively? I found them on https://emojipedia.org.
@@ -22,91 +22,91 @@ Implementations:
 * `Go`: [parquet-go](https://github.com/apache/arrow-go/tree/main/parquet)
 * `Rust`: [parquet-rs](https://github.com/apache/arrow-rs/blob/main/parquet/README.md)
 * `cuDF`: [cudf](https://github.com/rapidsai/cudf)
+* `JavaScript`: [hyparquet](https://github.com/hyparam/hyparquet)
-* `JavaScript`: [hyparquet](https://github.com/hyparam/hyparquet)
+* `hyparquet`: [hyparquet](https://github.com/hyparam/hyparquet)
There was a similar discussion about the name: #99 (comment)
All the list items besides `cuDF` put the language before the colon, not the library name. I'm happy to make changes, but I think it would be nice if the list mentioned the language somewhere, so that users working in a particular language know which implementations to look at. Open to suggestions.
I would personally advocate for having the language listed in the top "Implementations" list, and then in the tables put the specific implementation name (since there could be multiple libraries in the same language).
Another option would be to remove the pre-colon and put the language in parens after?
Also, one could argue that the existing link names are wrong, since the implementation names are actually `arrow`, `arrow-go`, and `arrow-rs`. Should those names also be in the table headers below instead of the languages?
This format would better support the case of multiple implementations per language:
-* `JavaScript`: [hyparquet](https://github.com/hyparam/hyparquet)
+* [arrow](https://github.com/apache/arrow/tree/main/cpp/src/parquet) (C++)
+* [parquet-java](https://github.com/apache/parquet-java) (Java)
+* [arrow-go](https://github.com/apache/arrow-go/tree/main/parquet) (Go)
+* [arrow-rs](https://github.com/apache/arrow-rs/blob/main/parquet/README.md) (Rust)
+* [cudf](https://github.com/rapidsai/cudf) (CUDA)
+* [hyparquet](https://github.com/hyparam/hyparquet) (JavaScript)
I prefer the 2nd option
For hyparquet there is a function |
I was following the |
IMHO, if hyparquet is able to return a deserialized bloom filter from the file then it can be marked as (R).
Agreed.
I've merged this. Thanks @platypii, and I look forward to future updates!
Adds the JavaScript parquet implementation hyparquet to the support matrix. This library is read-only but has support for almost every parquet encoding and compression format.
I left a couple of cells blank because I was unclear what the row was indicating exactly. For example, for bloom filters, does this mean that the bloom filters are used for optimized querying, or just that the metadata is available for clients to use? Same with `Size statistics`.
Also happy to update if there's a better way to indicate a read-only implementation.
Disclosure: I am the primary author of hyparquet