Add benchmarks for basic prebuilt MeterFilter implementations #6174
What is this?
This PR contains only benchmarks for some basic API: MeterFilter implementations and a couple of Tags methods. There are no new user-visible symbols and no changes to production code. It goes a bit further by exercising different inputs, so that different usage patterns are evaluated and "better at this count, yet worse at that count" situations have less chance of going unnoticed (assuming that whoever works on improving the performance of a specific code path runs these benchmarks before and after the changes).
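For context, a benchmark in the spirit described above might look roughly like the following JMH sketch (class, parameter, and method names here are illustrative, not the actual benchmarks in this PR). It varies both the number of tags on a Meter.Id and the number of keys handled by a prebuilt MeterFilter, so the whole matrix of combinations gets measured:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import io.micrometer.core.instrument.Meter;
import io.micrometer.core.instrument.Tag;
import io.micrometer.core.instrument.Tags;
import io.micrometer.core.instrument.config.MeterFilter;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class MeterFilterMapBenchmark {

    // Both axes are varied so that "better at this count, worse at that count"
    // situations show up in the result matrix.
    @Param({ "1", "4", "16", "64" })
    public int tagCount;

    @Param({ "1", "4", "16", "64" })
    public int ignoredKeyCount;

    private Meter.Id id;

    private MeterFilter ignoreTagsFilter;

    @Setup
    public void setup() {
        List<Tag> tags = new ArrayList<>(tagCount);
        for (int i = 0; i < tagCount; i++) {
            tags.add(Tag.of("key" + i, "value" + i));
        }
        id = new Meter.Id("benchmark.meter", Tags.of(tags), null, null, Meter.Type.COUNTER);

        String[] ignoredKeys = new String[ignoredKeyCount];
        for (int i = 0; i < ignoredKeyCount; i++) {
            ignoredKeys[i] = "key" + i;
        }
        ignoreTagsFilter = MeterFilter.ignoreTags(ignoredKeys);
    }

    @Benchmark
    public Meter.Id mapIgnoreTags() {
        // Measures one map() call of a prebuilt filter against the prepared id.
        return ignoreTagsFilter.map(id);
    }
}
```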
Is it really needed?
I'm prepping other PRs with some non-dramatic updates to the filters and tags code (also no new symbols or new functionality, just changes to the existing implementation while preserving the current behavior), so these benchmarks simply had to emerge. It's not up to me to decide whether they are needed in the project, but they came up naturally and they shouldn't take any resources from anyone except the time of people explicitly running them, so I don't see an obstacle here.
But at the same time I'm just having fun, so I don't have a problem with any requested changes or removals.
There are already benchmarks for tags!
Yes, but: 1) I only noticed them while preparing this PR, and 2) they use static input (which also comes with a fixed size). I don't have the energy to check the assembly, but the JIT could theoretically infer wrong assumptions from that; I also think a dynamic input size is a must here, as the whole sort-the-array story is non-linear from the very beginning. I'm not sure what to do with the existing benchmarks, but you can guide me; a sketch of what I mean by dynamic input follows below.
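To make the concern concrete, the shape I have in mind builds the input in a @Setup method from a @Param-controlled size instead of using a fixed constant array, so the size actually varies across runs and the JIT never sees one compile-time-constant input. A minimal sketch, with illustrative names rather than the existing benchmark:

```java
import java.util.concurrent.TimeUnit;

import io.micrometer.core.instrument.Tag;
import io.micrometer.core.instrument.Tags;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class TagsMergeBenchmark {

    // The size is a benchmark parameter rather than a compile-time constant,
    // so the sort/dedup cost is measured across several input sizes.
    @Param({ "1", "2", "8", "32", "64" })
    public int tagCount;

    private Tags left;

    private Tag[] right;

    @Setup
    public void setup() {
        // Inputs are built here at setup time, not declared as static constants,
        // so the JIT cannot specialize the benchmark body on one fixed array.
        Tag[] leftTags = new Tag[tagCount];
        right = new Tag[tagCount];
        for (int i = 0; i < tagCount; i++) {
            leftTags[i] = Tag.of("left" + i, "v" + i);
            right[i] = Tag.of("right" + i, "v" + i);
        }
        left = Tags.of(leftTags);
    }

    @Benchmark
    public Tags and() {
        // Measures merging two tag collections of the parameterized size.
        return left.and(right);
    }
}
```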
Aren't these benchmarks too sophisticated?
They are sophisticated, but not "too" sophisticated for me; they aren't strictly necessary, but since they come for free and perform a more thorough matrix evaluation, I see that as a benefit. That said, I'm farming my XP here, so if there's a concern about checking these in, I'm OK with it.
Any interesting results?
These are just the baseline measurements, i.e. they exist to compare two different revisions. There are some hints at situations that can be improved. Selecting the biggest offender (measured on an Intel N100 @ 2 GHz) for dramatic effect:
What's happening here is just taking ~64 elements (there are many samples, most of which have 64 tags) and filtering out the ones that match a set of 64 keys. While it currently comes in at about 8.4 µs, this could be an O(n) task executing in a couple of hundred nanoseconds. My forthcoming PRs will somewhat address things like that (to be honest, this is the biggest offender; other places are not as impressive), but getting to the best position would require #6113.
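To illustrate the complexity argument: if each tag is checked against the ignored keys with a linear scan, the work is O(n * m) with n tags and m keys (roughly 64 * 64 comparisons here), while a set-based lookup brings it down to a single pass. A rough sketch of the two shapes, assuming this kind of filtering logic; this is not the actual Micrometer code, just an illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import io.micrometer.core.instrument.Tag;
import io.micrometer.core.instrument.Tags;

class TagFilteringSketch {

    // O(n * m) shape: for each of the n tags, scan all m ignored keys.
    static Tags dropByLinearScan(Tags tags, String[] ignoredKeys) {
        List<Tag> kept = new ArrayList<>();
        for (Tag tag : tags) {
            boolean ignored = false;
            for (String key : ignoredKeys) {
                if (key.equals(tag.getKey())) {
                    ignored = true;
                    break;
                }
            }
            if (!ignored) {
                kept.add(tag);
            }
        }
        return Tags.of(kept);
    }

    // O(n + m) shape: build a set of ignored keys once, then make a single
    // pass over the tags with constant-time lookups.
    static Tags dropBySetLookup(Tags tags, String[] ignoredKeys) {
        Set<String> ignored = new HashSet<>(Arrays.asList(ignoredKeys));
        List<Tag> kept = new ArrayList<>();
        for (Tag tag : tags) {
            if (!ignored.contains(tag.getKey())) {
                kept.add(tag);
            }
        }
        return Tags.of(kept);
    }
}
```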