Improve Otlp Delta Aggregation with support for max and Histogram. #3749
@shakuzen / @jonatan-ivanov Could you take a look at this? I'd really appreciate any feedback. It would be good to get these changes in for 1.11.0.
The scope of this pull request feels a bit wider than necessary. See some previous discussion in #3144. I think we should not change that behavior here. It's generally part of the contract of a timer/summary that max be supported in some form. That we don't currently export it is a somewhat tangential issue. I would probably opt for publishing our TimeWindowMax as a separate gauge in the case of cumulative temporality. A cumulative max as specified by OTLP is not generally useful, as far as I can tell. But let's tackle any such change separately so we don't block other things and get distracted.
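To illustrate why a cumulative max tends to be less useful than a windowed one, here is a rough sketch in plain Java. It is not Micrometer's TimeWindowMax implementation; the class name `MaxSketch`, both methods, and the sample values are made up for illustration. It only shows that a cumulative max never recovers after a single spike, while a max over a recent window does.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Micrometer code.
final class MaxSketch {

    // Max since the start of the series: once a spike is seen, it sticks forever.
    static long[] cumulativeMax(long[] perIntervalMax) {
        long[] out = new long[perIntervalMax.length];
        long running = Long.MIN_VALUE;
        for (int i = 0; i < perIntervalMax.length; i++) {
            running = Math.max(running, perIntervalMax[i]);
            out[i] = running;
        }
        return out;
    }

    // Max over the last `window` intervals: the spike ages out of the window.
    static long[] windowedMax(long[] perIntervalMax, int window) {
        long[] out = new long[perIntervalMax.length];
        for (int i = 0; i < perIntervalMax.length; i++) {
            long max = Long.MIN_VALUE;
            for (int j = Math.max(0, i - window + 1); j <= i; j++) {
                max = Math.max(max, perIntervalMax[j]);
            }
            out[i] = max;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] observed = {2, 50, 3, 4, 2};
        System.out.println(Arrays.toString(cumulativeMax(observed)));  // [2, 50, 50, 50, 50]
        System.out.println(Arrays.toString(windowedMax(observed, 2))); // [2, 50, 50, 4, 4]
    }
}
```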
I'm not immediately sure what we should do about the tests, but the |
Thanks for the pull request, as always. I'm leaving some initial thoughts. I'll do a more thorough review tomorrow.
StepMaxTest, OtlpCumulativeMeterRegistryCompatibilityTest, and OtlpDeltaMeterRegistryCompatibilityTest have test failures now. I guess we still need to update the TCK code for the last one, but the first two should be passing, right?
I fixed StepMaxTest. OtlpCumulativeMeterRegistryCompatibilityTest fails because max is unavailable on the meter when it is cumulative, which I am going to fix. Before that, though, I wanted to settle which approach we take: a separate CumulativeTimer and DeltaTimer, or a single abstract timer that behaves according to the aggregation temporality.
I updated the TCK code with a bit of a hack so it works with both time window and step histograms.
@Override
protected CountAtBucket[] noValue() {
    if (buckets == null)
buckets might be empty but it is never null. I wonder if it would be worth storing the zeroed CountAtBucket array in an instance field rather than creating it each time noValue() is called. It depends on how often noValue() will be called in practice.
In an ideal world, I don't expect noValue() to be called during the app's lifecycle except at start-up. That's why I decided against adding another long-lived (and mostly idle) object there. It could quickly become a concern when there are thousands of timers with ~50 buckets each.
Another option I considered is for noValue() to return an empty histogram, which is already a static variable, but that might not be good since the bucket information gets dropped in that case.
"buckets might be empty but it is never null"
That's true, except that StepValue calls noValue() during object creation, at which point buckets is not yet initialized.
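The initialization-order issue described here can be reproduced in plain Java. The following is a minimal sketch, not the actual Micrometer classes: `StepValueSketch` and `StepBucketsSketch` are hypothetical stand-ins, assuming only that the base class constructor calls an overridable noValue(). Because the superclass constructor runs before the subclass field is assigned, the field is still null on that first call.

```java
// Hypothetical stand-in for a step value base class that asks the subclass
// for its "no value" during construction.
abstract class StepValueSketch<V> {
    private final V initial;

    StepValueSketch() {
        // Runs before any subclass field has been assigned.
        this.initial = noValue();
    }

    protected abstract V noValue();

    V initial() {
        return initial;
    }
}

// Hypothetical subclass whose noValue() depends on a field set in its own constructor.
class StepBucketsSketch extends StepValueSketch<long[]> {
    private final double[] buckets;

    StepBucketsSketch(double[] buckets) {
        super();                // noValue() is invoked here...
        this.buckets = buckets; // ...before this assignment happens
    }

    @Override
    protected long[] noValue() {
        if (buckets == null) {
            // Only reachable during construction; bucket boundaries are unknown here.
            return new long[0];
        }
        return new long[buckets.length];
    }

    long[] freshNoValue() {
        return noValue();
    }
}

class InitOrderDemo {
    public static void main(String[] args) {
        StepBucketsSketch s = new StepBucketsSketch(new double[] {1.0, 5.0, 10.0});
        System.out.println(s.initial().length);      // 0: computed while buckets was still null
        System.out.println(s.freshNoValue().length); // 3: computed after construction
    }
}
```

This is why a null check can be needed even though the field is never null once the object is fully constructed.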
Thanks for pointing that out. This isn't ideal. I left // TODO comments in the tests, since we can't check the buckets we expect to be there until a step has passed. We can try to figure this out post-merge if it is worth fixing and we can come up with a solution.
I added some tests for the StepHistogram. There are some things I will probably polish post-merge, but I think this is functionally in a good state. Thank you for all of the work on this.
Regarding this comment and the unresolved part of the comment thread, I've left it as something to consider outside of this pull request so we can get this merged and at least make progress to a better state overall.
@shakuzen Also, it is important that OtlpStepTimer rotates count, total, max, and histogram together when any of them is read. This should be cheap, since the work happens only on rotation and repeated calls within the same step have almost no effect.
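A rough sketch of the "rotate everything together" idea, in plain Java. This is not the OtlpStepTimer implementation: `StepTimerSketch` is a hypothetical class, time is passed in explicitly instead of using a clock, and the histogram is left out to keep it short. The point is that every read goes through the same rotation check, so count, total, and max always describe the same completed step, and the check is a single comparison when the step has not changed.

```java
// Hypothetical sketch; names and structure are assumptions, not Micrometer code.
final class StepTimerSketch {
    private final long stepMillis;
    private long lastStep;

    // Values accumulating in the current (uncompleted) step.
    private long curCount, curTotal, curMax;
    // Values of the most recently completed step; these are what readers see.
    private long prevCount, prevTotal, prevMax;

    StepTimerSketch(long stepMillis, long nowMillis) {
        this.stepMillis = stepMillis;
        this.lastStep = nowMillis / stepMillis;
    }

    synchronized void record(long amount, long nowMillis) {
        rotateIfNeeded(nowMillis);
        curCount++;
        curTotal += amount;
        curMax = Math.max(curMax, amount);
    }

    synchronized long count(long nowMillis) { rotateIfNeeded(nowMillis); return prevCount; }
    synchronized long total(long nowMillis) { rotateIfNeeded(nowMillis); return prevTotal; }
    synchronized long max(long nowMillis)   { rotateIfNeeded(nowMillis); return prevMax; }

    // Rotates all values at once. Within the same step this is one comparison.
    private void rotateIfNeeded(long nowMillis) {
        long step = nowMillis / stepMillis;
        if (step == lastStep) {
            return;
        }
        if (step == lastStep + 1) {
            prevCount = curCount;
            prevTotal = curTotal;
            prevMax = curMax;
        } else {
            // More than one step elapsed: the last completed step saw no recordings.
            prevCount = 0;
            prevTotal = 0;
            prevMax = 0;
        }
        curCount = 0;
        curTotal = 0;
        curMax = 0;
        lastStep = step;
    }

    public static void main(String[] args) {
        StepTimerSketch timer = new StepTimerSketch(1000, 0);
        timer.record(5, 100);
        timer.record(9, 200);
        System.out.println(timer.count(500));  // 0: the first step is not complete yet
        System.out.println(timer.max(1000));   // 9: reading max rotated everything
        System.out.println(timer.count(1000)); // 2: count comes from the same step as max
    }
}
```

Note that reads before the first step completes return 0, which matches the step behavior described elsewhere in this discussion.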
This PR is a follow-up to #3625, which introduced delta aggregation temporality. It changes some behaviors of the OTLP delta registry so that the meters adhere more closely to the OpenTelemetry metrics data model (https://opentelemetry.io/docs/reference/specification/metrics/data-model/#metric-points).
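For readers less familiar with the two temporalities, here is a small plain-Java sketch of how they relate for a simple sum. `TemporalitySketch` and its methods are hypothetical and only illustrate the arithmetic: a cumulative point reports the total since a fixed start, while a delta point reports the change since the previous report.

```java
import java.util.Arrays;

// Hypothetical illustration of the arithmetic only; not registry code.
final class TemporalitySketch {

    // Each delta point is the difference between consecutive cumulative points.
    static long[] toDelta(long[] cumulative) {
        long[] delta = new long[cumulative.length];
        long previous = 0;
        for (int i = 0; i < cumulative.length; i++) {
            delta[i] = cumulative[i] - previous;
            previous = cumulative[i];
        }
        return delta;
    }

    // Each cumulative point is the running sum of all delta points so far.
    static long[] toCumulative(long[] delta) {
        long[] cumulative = new long[delta.length];
        long running = 0;
        for (int i = 0; i < delta.length; i++) {
            running += delta[i];
            cumulative[i] = running;
        }
        return cumulative;
    }

    public static void main(String[] args) {
        long[] cumulative = {3, 8, 10};
        System.out.println(Arrays.toString(toDelta(cumulative)));               // [3, 5, 2]
        System.out.println(Arrays.toString(toCumulative(toDelta(cumulative)))); // [3, 8, 10]
    }
}
```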
Changes introduced in this PR
Core:
OTLP:
Known Issues
MeterRegistryCompatibilityKit tests have some known failures that I want to resolve in this PR discussion. There is a deprecated test in MeterRegistryCompatibilityKit that validates histogram counts. Since OTLP uses a step histogram, it returns 0 for the uncompleted step, so the test fails.
TODO
Notes:
Closes gh-3772
Closes gh-3771