-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(l1): metrics for syncing #2080
Conversation
|
let throughput = | ||
self.current_cycle.throughput / self.current_cycle.executed_blocks_count as f64; | ||
|
||
tracing::info!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, but the metric we care the most is gigagas per second
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This metrics is indeed gigagas per second, I've added the unit in 322ab25
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
.get_block_header_by_hash(current_head)? | ||
.unwrap_or_default() | ||
.number; | ||
self.metrics = SyncMetrics::new(current_head_block_num, current_head); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to persist this struct as part of the SyncManager if we will be overwriting it on each cycle? Why not have it as a separate temporary entity?
// Full sync stores and executes blocks as it asks for the headers | ||
// Full sync stores and executes blocks as it ask for the headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert
This reverts commit fc7a98d.
"[SYNCING PERF] Last {} blocks performance:\n\ | ||
\tTotal time (takes into account network requests): {} seconds\n\ | ||
\tTime spent adding blocks: {} seconds ~= {:.2}% of total time\n\ | ||
\tAverage block in total time: {:.3} seconds\n\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rephrase this? It is not clear what the number represents
\tBlocks per second over total time: {:.3}\n\ | ||
\tBlocks per second add block: {:.3}\n\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, It is not clear enough
\tThroughput: {:.3} Gigagas/s\n\ | ||
\tStarted at block: {} (hash: {:?})\n\ | ||
\tFinished at block: {} (hash: {:?})\n\ | ||
\tExecution count: {}\n\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe Total blocks executed
?
\tStarted at block: {} (hash: {:?})\n\ | ||
\tFinished at block: {} (hash: {:?})\n\ | ||
\tExecution count: {}\n\ | ||
\t======= Overall, this cycle took {} seconds with respect to the previous one =======", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not clear, is this the time this cycle took? is it the time difference between this cycle and the previous one? the wording is confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I looked correctly it's the time the cycle took, as it was the same as Total time (takes into account network requests)
, agree the wording seems confusing mentioning the previous cycle.
Ok(AddBlockResultStats { | ||
throughput, | ||
time_spent_as_ms: interval, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a question given I'm coming from outside rust, is this an idiomatic way to handle metrics? Usually I try to avoid modifying the type of functions to accommodate to metrics and instead have a way to send the metrics inside it without affecting the outside world, something like:
struct SyncManager {
metrics: Arc<dyn SomeKindOfMetricsCollector>,
// other fields
}
impl SyncManager {
fn new(metrics: Arc<dyn SomeKindOfMetricsCollector>) -> Self {
Self {
metrics,
// initialize other fields
}
}
[...]
fn add_block(&self, block: &Block) -> Result<(), ChainError> {
let start = Instant::now();
// block processing logic
let duration = start.elapsed().as_millis();
let throughput = calculate_throughput(block, duration);
self.metrics.record("throughput", throughput);
self.metrics.record("duration", duration);
Ok(())
}
}
This is just an example, but wanted to show what I mean. Anyway, maybe it's more idiomatic to just return this kind of info in rust, just wanted to be sure.
Closing this pr as it is getting replaced by #2174 |
Motivation
Full syncing on Holesky.
Description
Introduce
SyncMetrics
for full-syncing that logs performance metrics, including:add_block
and its percentage over the total sync time.Logging intervals are configured at 100, 1,000, 10,000, 100,000, and 1,000,000 blocks.
Example
Here is an example of what the logs look like:
Closes None