-
Notifications
You must be signed in to change notification settings - Fork 374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CELEBORN-1831] Add ratis commitIndex metrics #3063
Conversation
@zaynt4606, is it better to introduce ratis metrics to cover ha metrics? |
Are there ratis metrics that already exist?I want to add each master's commitIndex metrics to observe the metadata synchronization of the raft cluster in master panels like this. |
@@ -60,6 +60,10 @@ object MasterSource { | |||
|
|||
val OFFER_SLOTS_TIME = "OfferSlotsTime" | |||
|
|||
val MASTER_COMMIT_INDEX = "MasterCommitIndex" | |||
|
|||
val MASTER_COMMIT_INDEX_DIFF = "MasterCommitIndexDiff" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RatisCommitIndex
and RatisCommitIndexDiff
are more straight forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done~
@@ -285,6 +286,10 @@ private[celeborn] class Master( | |||
statusSystem.decommissionWorkers.size() | |||
} | |||
|
|||
masterSource.addGauge(MasterSource.MASTER_COMMIT_INDEX) { () => getMasterRaftCommitIndex._1 } | |||
|
|||
masterSource.addGauge(MasterSource.MASTER_COMMIT_INDEX_DIFF) { () => getMasterRaftCommitIndex._2 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only addGauge
if haEnabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, has been updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@zaynt4606, ratis supports metrics which refers to https://github.com/apache/ratis/blob/master/ratis-docs/src/site/markdown/metrics.md. |
Replace commitIndex with applyCompletedIndex in Ratis~ |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3063 +/- ##
==========================================
+ Coverage 32.52% 32.54% +0.02%
==========================================
Files 336 336
Lines 20053 20055 +2
Branches 1796 1796
==========================================
+ Hits 6520 6524 +4
+ Misses 13168 13167 -1
+ Partials 365 364 -1 ☔ View full report in Codecov by Sentry. |
private def getRatisApplyCompletedIndex: Long = { | ||
if (conf.haEnabled) { | ||
val ratisServer = statusSystem.asInstanceOf[HAMasterMetaManager].getRatisServer | ||
if (ratisServer != null) { | ||
val stateMachine = ratisServer.getMasterStateMachine | ||
val lastAppliedIndex = stateMachine.getLastAppliedTermIndex.getIndex | ||
lastAppliedIndex | ||
} else { | ||
0 | ||
} | ||
} else { | ||
0 | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private def getRatisApplyCompletedIndex: Long = { | |
if (conf.haEnabled) { | |
val ratisServer = statusSystem.asInstanceOf[HAMasterMetaManager].getRatisServer | |
if (ratisServer != null) { | |
val stateMachine = ratisServer.getMasterStateMachine | |
val lastAppliedIndex = stateMachine.getLastAppliedTermIndex.getIndex | |
lastAppliedIndex | |
} else { | |
0 | |
} | |
} else { | |
0 | |
} | |
} | |
private def getRatisApplyCompletedIndex: Long = { | |
if (conf.haEnabled) { | |
val ratisServer = statusSystem.asInstanceOf[HAMasterMetaManager].getRatisServer | |
if (ratisServer != null) { | |
ratisServer.getMasterStateMachine.getLastAppliedTermIndex.getIndex | |
} | |
} | |
0 | |
} | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks, merge to main(v0.6.0) |
What changes were proposed in this pull request?
Add two metrics (raft commitIndex of each master and maxCommitIndex - minCommitIndex value).
Why are the changes needed?
To observe the metadata synchronization of the raft cluster.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Cluster test.