v0.2.0
We are pleased to announce the release of GraphAr version 0.2.0! This release adds new features and performance improvements to the GraphAr libraries and addresses several project-related issues.
The main improvements included in this release are:
- Optimizing the GraphAr Spark library:
  - Implement a custom data source for GraphAr to improve the performance of the GraphAr Spark Reader/Writer.
  - Fix the Spark Writer bug that occurs when a column name contains a dot (.).
- Including more methods in the libraries:
  - Provide more reading methods in the Spark library for convenience.
  - Add auxiliary functions in the C++ library to get the vertex chunk number or edge chunk number from the corresponding infos.
- Addressing some issues when using GraphAr in external projects:
  - Handle yaml-cpp correctly when GraphAr is required by external projects.
  - Use gar-related names for the Arrow project and ccache to avoid duplicated project names.
  - Use linker flags to suppress the clang warnings.
  - Add a prefix to Arrow definitions to avoid conflicts.
  - Cast StringArray to LargeStringArray in Arrow for generality.
- Improving the project configurations and other bug fixes:
  - Add pre-commit configuration and instructions.
  - Handle comments correctly for preview PR docs.
  - Write CSV payload files with a header.
  - Update the source code URL of the GraphScope fragment builder and writer.
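
As a rough illustration of what a "chunk number" means here: GraphAr stores vertices and edges in fixed-size chunks, so the number of chunks is the item count divided by the chunk size, rounded up. The sketch below shows only that arithmetic. The function name `chunk_count` and its parameters are hypothetical and are not the C++ library's API. The actual auxiliary functions take the vertex or edge infos, and their signatures are not reproduced here.

```python
def chunk_count(item_count: int, chunk_size: int) -> int:
    """Return how many fixed-size chunks are needed to hold item_count items.

    Hypothetical helper for illustration only; not part of GraphAr.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if item_count < 0:
        raise ValueError("item_count must be non-negative")
    # Integer ceiling division: the last chunk may be only partially filled.
    return (item_count + chunk_size - 1) // chunk_size


# Example: 903 vertices with a chunk size of 100 need 10 chunks
# (nine full chunks plus one chunk holding the remaining 3 vertices).
vertex_chunks = chunk_count(903, 100)
```

For edges, the same calculation would apply to the edges that fall within one vertex chunk, assuming a separate edge chunk size.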
Commits
- d223858: [Improve][Spark] Improve the performance of GraphAr Spark Reader (#84) (lixueclaire) #84
- 24b2446: [Improvement][FileFormat] Write CSV payload files with header (#85) (Weibin Zeng) #85
- b14c2f2: [Improvement] [Spark] Add methods for Spark Reader and improve the performance (#87) (lixueclaire) #87
- cdf0440: Address issues in handling yaml-cpp correctly when requires GraphAr in external projects (#91) (Tao He) #91
- 5ae1e7c: Add pre-commit configuration and instructions (#93) (Tao He) #93
- c31a88b: Handle comments correctly for preview PR docs (#94) (Tao He) #94
- 789fe48: [Improve] Add auxiliary functions to get vertex chunk num or edge chunk num with infos (#95) (Weibin Zeng) #95
- c2c5ec6: [BugFix] Fix the Spark Writer bug when the column name contains a dot(.) (#101) (lixueclaire) #101
- 84a3fde: [Improve] Use gar-related names for arrow project and ccache to avoid duplicated project name (#102) (Weibin Zeng) #102
- d2c0818: It should be linker flags, suppressing the clang warnings (#104) (Tao He) #104
- 023bd3b: Cast StringArray to LargeStringArray otherwise we will fail when we need to concatenate chunks (#105) (Tao He) #105
- 5c58589: Update the source code url of GraphScope fragment builder and writer (#103) (Weibin Zeng) #103
- d26d3b8: Add prefix to arrow definitions to avoid conflicts (#106) (Tao He) #106
- ad30121: [Improvement] Improve GraphAr spark writer performance and implement custom writer builder to bypass spark's write behavior (#92) (Weibin Zeng) #92