Add histogram parsing in runner v2 #34017

Merged
johnjcasey merged 2 commits into apache:master from new_hist_parse_logic
Feb 20, 2025

Conversation

Naireen
Contributor

@Naireen Naireen commented Feb 18, 2025

This is a forward fix for #33761, which added histogram parsing in java:core with a dependency on the dataflow-runner. This PR adds it back without that dependency.

Addresses #33093

@Naireen
Contributor Author

Naireen commented Feb 19, 2025

Run Java_IOs_Direct PreCommit

@Naireen
Contributor Author

Naireen commented Feb 19, 2025

Run Java PreCommit

@Naireen
Contributor Author

Naireen commented Feb 19, 2025

Run Java_Pulsar_IO_Direct PreCommit

@Naireen
Contributor Author

Naireen commented Feb 19, 2025

Run Python_Runners PreCommit 3.12

@Naireen
Contributor Author

Naireen commented Feb 19, 2025

Run Java PreCommit

@Naireen
Contributor Author

Naireen commented Feb 19, 2025

Run Java_Pulsar_IO_Direct PreCommit

@Naireen Naireen marked this pull request as ready for review February 19, 2025 06:11
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @Abacn for label java.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@Naireen
Contributor Author

Naireen commented Feb 19, 2025

R: @kennknowles

Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment assign set of reviewers

Member

@kennknowles kennknowles left a comment

This seems OK, and most of it was already approved. I did have a question about whether the proto for reporting histograms is too heavyweight or could introduce incompatibilities.

repeated int64 bucket_counts = 3;

// Statistics for the underflow and overflow bucket.
message OutlierStats {
Member

nit: it is harder to read the fields that exist when nested message definitions are interspersed. Put all the nested messages first or all the nested messages last.

Contributor Author

Moved the nested messages to the end.

Contributor Author

I have removed the outlier stats, since we don't currently use them either for encoding or decoding, so it simplifies this a bit.
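
To make the simplified shape concrete, here is a minimal Python sketch of a histogram that carries only a bucket layout and per-bucket counts, with no separate outlier statistics. The class names, the linear layout fields (`start`, `width`, `num_buckets`), the dictionary encoding, and the choice to leave out-of-range values uncounted are all assumptions made for illustration; they are not the fields or behaviour of the proto in this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class LinearBuckets:
    """Hypothetical layout: num_buckets equal-width buckets beginning at start."""
    start: float
    width: float
    num_buckets: int


@dataclass
class Histogram:
    """Hypothetical histogram: a bucket layout plus one count per bucket."""
    options: LinearBuckets
    bucket_counts: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start with all-zero counts when none are supplied.
        if not self.bucket_counts:
            self.bucket_counts = [0] * self.options.num_buckets

    def record(self, value: float) -> bool:
        """Count value in its bucket; return False if it lies outside the layout."""
        index = int((value - self.options.start) // self.options.width)
        if 0 <= index < self.options.num_buckets:
            self.bucket_counts[index] += 1
            return True
        # Assumption: out-of-range values are simply not counted.
        return False


def encode(hist: Histogram) -> Dict[str, Any]:
    """Flatten the histogram into a plain dictionary (illustrative wire form)."""
    opts = hist.options
    return {
        "bucket_options": {
            "start": opts.start,
            "width": opts.width,
            "num_buckets": opts.num_buckets,
        },
        "bucket_counts": list(hist.bucket_counts),
    }


def decode(payload: Dict[str, Any]) -> Histogram:
    """Rebuild a histogram from the dictionary produced by encode."""
    options = LinearBuckets(**payload["bucket_options"])
    return Histogram(options, list(payload["bucket_counts"]))
```

In this sketch, decoding an encoded histogram yields an equal object, because everything needed to interpret the counts travels with them in the bucket layout.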

}

// Describes the bucket boundaries used in the histogram.
optional BucketOptions bucket_options = 2;
Member

Is there any issue where the options might be incompatible across two metrics reports?

Contributor Author

This is solely for encoding and decoding, and each message in a metric report (keyed by the metric name) is separate. For clients consuming the metrics downstream, two different histogram types can be problematic, and it would be on that client to determine how to handle incompatibilities or how to aggregate the two.
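
As an illustration of what handling this on the client could look like, here is a minimal Python sketch of a downstream consumer that sums histogram reports sharing a metric name only when their bucket layouts match, and raises otherwise. The report shape (a hashable layout description paired with a list of counts), the exception, and the function names are hypothetical; a real client might instead re-bucket or keep the reports separate.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

# Hypothetical report shape: (bucket_options, bucket_counts), where
# bucket_options is any hashable description of the bucket layout.
Report = Tuple[Hashable, List[int]]


class IncompatibleBuckets(ValueError):
    """Raised when two reports use different bucket layouts."""


def merge_reports(a: Report, b: Report) -> Report:
    """Sum two reports bucket by bucket, refusing mismatched layouts."""
    options_a, counts_a = a
    options_b, counts_b = b
    if options_a != options_b or len(counts_a) != len(counts_b):
        raise IncompatibleBuckets(
            f"cannot merge layouts {options_a!r} and {options_b!r}"
        )
    return options_a, [x + y for x, y in zip(counts_a, counts_b)]


def aggregate(reports: Iterable[Tuple[str, Report]]) -> Dict[str, Report]:
    """Combine reports by metric name; different names never interact."""
    merged: Dict[str, Report] = {}
    for name, (options, counts) in reports:
        if name in merged:
            merged[name] = merge_reports(merged[name], (options, counts))
        else:
            # Copy the counts so the caller's list is not shared.
            merged[name] = (options, list(counts))
    return merged
```

Raising on a mismatch is only one possible policy; it keeps the sketch short and makes the incompatibility visible rather than silently producing misleading counts.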

@Naireen Naireen force-pushed the new_hist_parse_logic branch from 13e8ce1 to e532162 Compare February 19, 2025 22:50
@Naireen
Contributor Author

Naireen commented Feb 19, 2025

Run Go PreCommit

@Naireen
Contributor Author

Naireen commented Feb 19, 2025

Run Prism_Python PreCommit 3.9

@Naireen
Contributor Author

Naireen commented Feb 19, 2025

Run Java_IOs_Direct PreCommit

@Naireen
Contributor Author

Naireen commented Feb 20, 2025

Run Go PreCommit

@Naireen Naireen force-pushed the new_hist_parse_logic branch from e532162 to 46ae1e3 Compare February 20, 2025 17:16
@Naireen
Contributor Author

Naireen commented Feb 20, 2025

Run Kotlin_Examples PreCommit

@Naireen
Contributor Author

Naireen commented Feb 20, 2025

Run Java PreCommit

@Naireen
Contributor Author

Naireen commented Feb 20, 2025

Run Java PreCommit

@johnjcasey johnjcasey merged commit f5ed586 into apache:master Feb 20, 2025
111 checks passed
3 participants