Closed
Labels
bug
next-major-release: the PR has API changes and is waiting on the next major version
parquet: Changes to the parquet crate
performance
Description
Describe the bug
While testing the tpchgen-rs upgrade to arrow 57 in
@clflushopt, @kevinjqliu and I found that arrow-rs 57 seems to write data around 10% slower than arrow 56: clflushopt/tpchgen-rs#200 (review)
Specifically, running this command is around 10% slower:
tpchgen-cli --scale-factor=100 --tables=lineitem --parts=10 --format=parquet
56.0.0 takes 0m27.122s
57.0.0 takes 0m28.776s
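For reference, the relative slowdown can be computed directly from the two `time` readings quoted above. This is only a small sketch: the helper names `to_seconds` and `slowdown_pct` are made up for illustration and are not part of tpchgen-cli or arrow-rs, and the input is assumed to be in the `XmY.YYYs` form that `time` prints. For this particular pair of runs it prints 6.1, somewhat below the rough 10% estimate.

```shell
# Hypothetical helper: convert a `time`-style duration such as 0m27.122s to seconds.
to_seconds() {
  # Split on "m" and "s": field 1 is minutes, field 2 is seconds.
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Hypothetical helper: percentage by which the second duration exceeds the first.
slowdown_pct() {
  awk -v o="$(to_seconds "$1")" -v n="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", (n - o) / o * 100 }'
}

# Timings from the two runs above (56.0.0 first, 57.0.0 second).
slowdown_pct 0m27.122s 0m28.776s   # prints 6.1
```

A single pair of wall-clock timings is noisy, so repeating each run a few times before comparing would give a more reliable figure.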
To Reproduce
rm -rf lineitem && cargo build --release && time ./target/release/tpchgen-cli --scale-factor=100 --tables=lineitem --parts=10 --format=parquet
Expected behavior
57 should perform the same as or better than 56
Additional context
I am doing some git bisecting to see if I can find some more data