-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
script to export benchmark information as Line Protocol format #14662
base: main
Are you sure you want to change the base?
Conversation
@alamb Check this out. do we have some documentation on json format? that would be helpful if we want to show more arguments in |
I believe the JSON is a serialized version of this structure: datafusion/benchmarks/src/util/run.rs Lines 47 to 59 in 68306ac
This looks amazing @logan-keede -- I'll check it out later today |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @logan-keede -- this is a great. I tried it on some of my own data and it works well ❤️
I am thinking through to the next steps (of trying to gather historical performance measurements, but I think I just need to try and gather that information and see what additional data is needed
benchmarks/lineformat.py
Outdated
compare_parser.add_argument( | ||
"baseline_path", | ||
type=Path, | ||
help="Path to the baseline summary file.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this is just a conversion, I don't think there is any idea of baseline? As in maybe this comment needs to be adjusted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remnants of compare.py 😅
benchmarks/lineformat.py
Outdated
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
from __future__ import annotations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add some comments here explaining what this script does along with the example you have in your PR description?
It would also be great to add some documentation to https://github.com/apache/datafusion/tree/main/benchmarks#comparing-performance-of-main-and-a-branch explaining how to use this script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added some, check it out.
benchmarks/lineformat.py
Outdated
@@ -0,0 +1,130 @@ | |||
#!/usr/bin/env python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can rename this file from benchmarks/lineformat.py
to benchmarks/lineprotocol.py
as a way to have its name be a bit more self describing
Which issue does this PR close?
Rationale for this change
a step towards #5504
What changes are included in this PR?
addition of python script that converts some
**benchmark**.json
to lineformatAre these changes tested?
on some files in
result.zip
Are there any user-facing changes?
No