Improved Meter.Id#getTags() performance #6182
Open
Hey. This is a very small and simple PR aimed at a tiny performance improvement. Currently, the .getTags() call always allocates an empty array list and then populates it through a .forEach call, which allocates another object for the lambda and, in the worst case, results in a non-inlined call. This PR checks whether a list is needed at all and, if so, requests a list with room for 32 elements and populates it in a classic loop. The number 32 is only a guess with no real grounds.
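To make the difference concrete, here is a minimal sketch of the two shapes described above. It is not the actual Micrometer code: the class `GetTagsSketch`, the `String` tags, and the array-plus-length storage are hypothetical stand-ins, and returning `Collections.emptyList()` for the empty case is an assumption about what "checks whether a list is needed at all" means.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration only; not Micrometer's Meter.Id.
public class GetTagsSketch {

    // Shape described as the current behaviour: start from an empty list
    // and fill it through forEach with a method reference.
    static List<String> viaForEach(Iterable<String> tags) {
        List<String> copy = new ArrayList<>();
        tags.forEach(copy::add);
        return Collections.unmodifiableList(copy);
    }

    // Shape described as the proposed behaviour: skip the allocation when
    // there are no tags, otherwise presize the list and fill it in a loop.
    static List<String> viaLoop(String[] tags, int length) {
        if (length == 0) {
            return Collections.emptyList(); // assumption: no list needed
        }
        List<String> copy = new ArrayList<>(32); // 32 is the PR's own guess
        for (int i = 0; i < length; i++) {
            copy.add(tags[i]);
        }
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        String[] tags = {"region", "host"};
        System.out.println(viaForEach(Arrays.asList(tags)));
        System.out.println(viaLoop(tags, tags.length));
    }
}
```

Both methods return the same contents; the second only avoids the lambda object and the repeated growth of the backing array for identifiers with up to 32 tags.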
This is definitely not the biggest problem around, so the change is perceptible, yet not a game changer.
Benchmarks from #6174, run on an Intel N100 fixed at 2 GHz, were used to estimate the impact. The benchmark uses identifiers with 0-64 tags, where the number of tags in the mode column occurs 90% of the time, and the other values are distributed evenly over the remaining range.
43.379 ± 0.056 | 44.352 ± 0.782
71.544 ± 0.188 | 65.575 ± 0.480
81.493 ± 0.704 | 72.908 ± 1.048
92.932 ± 0.451 | 82.054 ± 1.028
120.026 ± 1.511 | 104.113 ± 1.794
209.666 ± 2.159 | 143.756 ± 3.422
345.139 ± 3.130 | 244.574 ± 7.125
618.301 ± 6.336 | 496.224 ± 14.074
Please note that some improvements suggested in #6113 (using Collections.unmodifiableList(Arrays.asList(...).subList(...)) or even creating a custom List implementation that would combine all three) will likely bring these numbers down to tens of ns per call on arrays of any length and make this PR completely redundant.
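For reference, a rough sketch of the wrapping approach mentioned above, assuming the tags live in a backing array of which only the first `length` slots are used. The class `TagViewSketch`, the method `view`, and the `String` element type are hypothetical and are not taken from #6113.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not the code proposed in #6113.
public class TagViewSketch {

    // Exposes the first `length` elements of `backing` as a read-only list
    // without copying them. Only small wrapper objects are allocated,
    // regardless of how many tags there are.
    static List<String> view(String[] backing, int length) {
        return Collections.unmodifiableList(Arrays.asList(backing).subList(0, length));
    }

    public static void main(String[] args) {
        String[] backing = {"region", "host", null, null};
        System.out.println(view(backing, 2));
    }
}
```

Because `Arrays.asList` is backed by the array it wraps, the returned list reflects any later writes to that array, so this sketch is only safe if the backing array is never mutated after construction.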