Skip to content

Conversation

alsoba13
Copy link
Contributor

@alsoba13 alsoba13 commented Sep 23, 2025

The idea is to easily find rules that contribute the most to an eventual alert on tenants hitting cardinality limits.

We can make this more efficient in a previous layer of the code, but I avoided changing interfaces too much, while this code runs in a separate thread I think we can deal with it

@alsoba13 alsoba13 requested review from bryanhuhta and a team as code owners September 23, 2025 12:15
@alsoba13 alsoba13 self-assigned this Sep 25, 2025
seriesByRuleID[ruleID]++
}
for ruleID, count := range seriesByRuleID {
e.metrics.seriesSent.WithLabelValues(tenantId, ruleID).Add(float64(count))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you estimate the extra cardinality that this will add?

Copy link
Contributor Author

@alsoba13 alsoba13 Sep 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cardinality could be described as "total number of tenants using recording rules times the number of instances of compaction workers in that cell". After the change, we would describe as "total number of rules defined by all the tenants times the number of instances of compaction workers in that cell".

So it's safe to talk about a x10-x50 increase in cardinality. In the case of ops we would go from 32 to 320 series

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it is okay with current usage, it easy to see how this could go through the roof in the future, with this multiplier.

My gut feel is this is better kept in logs. (I am also not super firm with my position here, decide yourself).

seriesByRuleID[ruleID]++
}
for ruleID, count := range seriesByRuleID {
e.metrics.seriesSent.WithLabelValues(tenantId, ruleID).Add(float64(count))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it is okay with current usage, it easy to see how this could go through the roof in the future, with this multiplier.

My gut feel is this is better kept in logs. (I am also not super firm with my position here, decide yourself).

@alsoba13
Copy link
Contributor Author

alsoba13 commented Oct 8, 2025

I've decided to merge as is. As I expect most of the tenants to have <5 rules. Happy to get rid of this if recording rules gets a lot of adoption

@alsoba13 alsoba13 merged commit 6a4ebf3 into main Oct 8, 2025
20 checks passed
@alsoba13 alsoba13 deleted the alsoba13/metrics-recording-rules-per-rule-id branch October 8, 2025 08:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants