feat: improve ledger metrics gathering + updated ledger size management #410
Open
thlorenz wants to merge 26 commits into master from thlorenz/faster-metrics
LGTM
5 files reviewed, no comments
…te-using-compaction-filter
20 files reviewed, 7 comments
Summary
Ledger metrics no longer rely on counting entries in ledger columns.
Ledger size management was redone to not rely on column counting either.
Details
Metrics
Previously, metrics relied on counting entries in ledger columns, which is very slow for larger
ledgers.
We had also tried to improve performance by caching counts, but that proved insufficient.
Instead, we now update Prometheus counters directly when an entry is added. We no longer use any
methods that count entries in ledger columns in order to update metrics.
The only piece that still relies on these methods is the `ledger_stats` command. However, this
only runs during inspection/diagnostics, so performance is less of a concern. For
that reason I also removed all counter caching in order to not incur any extra overhead while
the validator is running.
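To illustrate the difference between the two approaches, here is a minimal sketch. It is not the code from this PR: `Counter` is a std-only stand-in for a Prometheus counter, and `BlocktimeColumn`, `put`, and `count_entries_slow` are hypothetical names.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};

/// Std-only stand-in for a Prometheus counter (assumption, for illustration).
pub struct Counter(AtomicU64);

impl Counter {
    pub const fn new() -> Self {
        Counter(AtomicU64::new(0))
    }

    pub fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Hypothetical ledger column mapping slot -> block time.
pub struct BlocktimeColumn {
    entries: BTreeMap<u64, i64>,
    blocktimes_count: Counter,
}

impl BlocktimeColumn {
    pub fn new() -> Self {
        BlocktimeColumn {
            entries: BTreeMap::new(),
            blocktimes_count: Counter::new(),
        }
    }

    /// Stores the entry and bumps the counter in the same call,
    /// so updating the metric never requires scanning the column.
    pub fn put(&mut self, slot: u64, blocktime: i64) {
        self.entries.insert(slot, blocktime);
        self.blocktimes_count.inc();
    }

    /// Full scan over the column; only acceptable for diagnostics.
    pub fn count_entries_slow(&self) -> usize {
        self.entries.iter().count()
    }

    /// Current value of the counter metric.
    pub fn metric(&self) -> u64 {
        self.blocktimes_count.get()
    }
}
```

Note that in this sketch the counter tracks writes rather than distinct entries, so overwriting an existing key would still increment it. That is one reason a counter is not a drop-in replacement for a gauge with the same name.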
Renaming Metrics
Since ledger metrics were previously tracked as gauges and are now counters, the metric names
have changed as follows:
Ledger Size Management
We now use watermarks to keep track of the ledger size at particular slots and thus predict
more accurately how many slots to truncate to bring the ledger size below max size.
The default strategy truncates to 75% max size whenever we go above max size.
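The watermark idea can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the implementation in this PR: `Watermark` and `truncation_slot` are hypothetical names, and the sketch assumes each watermark records the total ledger size at its slot, measured from a ledger that has not been truncated before.

```rust
/// A ledger size sample taken when a given slot was reached (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Watermark {
    pub slot: u64,
    /// Total ledger size in bytes at the time `slot` was reached.
    pub size_bytes: u64,
}

/// Returns the slot up to which the ledger should be truncated (inclusive),
/// or `None` if the ledger is within `max_size` or no watermark frees enough.
///
/// `watermarks` must be ordered by ascending slot.
pub fn truncation_slot(
    watermarks: &[Watermark],
    current_size: u64,
    max_size: u64,
) -> Option<u64> {
    if current_size <= max_size {
        return None;
    }
    // Default strategy from the description: shrink to 75% of max size.
    let target = (max_size as u128 * 3 / 4) as u64;
    let must_free = current_size - target;
    // Under the stated assumption, removing everything up to a watermark
    // frees roughly `size_bytes` bytes, so pick the first one that suffices.
    watermarks
        .iter()
        .find(|w| w.size_bytes >= must_free)
        .map(|w| w.slot)
}
```

With watermarks at slots 100, 200, and 300 recording 300, 600, and 900 bytes, a current size of 1100 and a max size of 1000, at least 350 bytes have to go, so the sketch picks slot 200. The restart and finality-slot edge cases mentioned in this PR are deliberately left out.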
The code ended up more complex than the original truncator, partly because it handles many
edge cases related to restarts with a lower max size and the finality slot.
Unit tests for each of those cases ensure that they are all handled correctly.
The original truncator and its tests were removed.
NOTE: we track the accounts mod id at the watermark boundaries, which we could use to truncate
these columns correctly as well; however, we do not do that yet.
Truncation via Compaction
The updated truncation implementation was cherry-picked from the
`fix/ledger/delete-using-compaction-filter` branch.
The main difference is that we don't create tombstones anymore since we use a filter in order
to delete rows during manual (triggered by the ledger size manager) or automatic compaction.
After cherry picking I added a guard to avoid deleting account mod ids since they don't have a
key starting with a slot. Otherwise we'd potentially delete account mod ids that are actually
still needed.
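The filter decision, including the guard for account mod ids, could look roughly like the sketch below. It is an illustration only: `Decision`, `Column`, and `filter_row` are hypothetical stand-ins rather than the RocksDB API or the code in this PR, and the key layout (an 8-byte big-endian slot prefix) is an assumption.

```rust
/// What the compaction filter does with a row (stand-in type).
#[derive(Debug, PartialEq)]
pub enum Decision {
    Keep,
    Remove,
}

/// Hypothetical column kinds: keys either start with a slot or they do not.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Column {
    SlotKeyed,
    AccountModIds,
}

/// Decides whether a row is dropped during compaction.
/// Rows with a slot below `oldest_kept_slot` are removed, with no tombstone written.
pub fn filter_row(column: Column, key: &[u8], oldest_kept_slot: u64) -> Decision {
    // Guard: account mod id keys do not start with a slot, so interpreting
    // their first bytes as one could delete entries that are still needed.
    if column == Column::AccountModIds {
        return Decision::Keep;
    }
    // Assumed layout: the first 8 bytes of the key are the big-endian slot.
    let prefix = match key.get(..8) {
        Some(p) => p,
        None => return Decision::Keep, // malformed key: keep it to be safe
    };
    let mut buf = [0u8; 8];
    buf.copy_from_slice(prefix);
    let slot = u64::from_be_bytes(buf);
    if slot < oldest_kept_slot {
        Decision::Remove
    } else {
        Decision::Keep
    }
}
```

Because the decision is made per row while compaction rewrites the data anyway, no separate delete pass is needed. The same function would apply whether compaction is triggered manually by the ledger size manager or runs automatically.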
Greptile Summary
Major overhaul of ledger metrics and size management system, replacing slow column entry counting with direct Prometheus counter updates and introducing a watermark-based ledger size management system.
- Metrics changed from gauges to counters (e.g. `ledger_blocktimes_gauge` → `ledger_blocktimes_count`) for more efficient tracking
- Count caching removed from `LedgerColumn` to eliminate memory overhead