-
Notifications
You must be signed in to change notification settings - Fork 19
feat: add metrics for intent execution time #607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Manual Deploy AvailableYou can trigger a manual deploy of this PR branch to testnet: Alternative: Comment
Comment updated automatically when the PR is synchronized. |
WalkthroughAdds metrics instrumentation across the committor service: per-intent execution timing (histogram), labeled failed-intent counters, CU usage gauge, and LabelValue trait implementations; moves metrics handling into dedicated flows and adjusts some error variants/conversions to support labeled metrics. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Engine as IntentExecutionEngine
participant Executor as IntentExecutor
participant Metrics as metrics module
participant RPC as MagicBlock RPC
Engine->>Engine: record Instant t0
Engine->>Executor: execute(intent).await
Executor-->>Engine: Result<ExecutionOutput, IntentExecutorError>
Engine->>Metrics: observe_committor_intent_execution_time_histogram(duration, intent_kind, outcome)
alt result is Err
Engine->>Metrics: inc_committor_failed_intents_count(intent_kind, error_kind)
end
Engine->>Engine: spawn intent_metrics(...) -- async CU fetch
Engine->>RPC: fetch compute units (inside intent_metrics)
RPC-->>Metrics: CU value (or error logged)
Metrics-->>Engine: set_committor_intent_cu_usage(value)
Engine-->>Caller: mapped public Result
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
test-integration/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (6)
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs(2 hunks)magicblock-committor-service/src/intent_executor/error.rs(2 hunks)magicblock-committor-service/src/intent_executor/mod.rs(2 hunks)magicblock-committor-service/src/types.rs(2 hunks)magicblock-metrics/src/metrics/mod.rs(3 hunks)magicblock-metrics/src/metrics/types.rs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
magicblock-committor-service/src/intent_executor/mod.rs (3)
magicblock-committor-service/src/intent_executor/error.rs (1)
value(101-108)magicblock-committor-service/src/types.rs (1)
value(23-29)magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)
magicblock-committor-service/src/types.rs (3)
magicblock-committor-service/src/intent_executor/error.rs (1)
value(101-108)magicblock-committor-service/src/intent_executor/mod.rs (1)
value(70-78)magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)
magicblock-metrics/src/metrics/mod.rs (4)
magicblock-committor-service/src/intent_executor/mod.rs (1)
value(70-78)magicblock-committor-service/src/intent_executor/error.rs (1)
value(101-108)magicblock-committor-service/src/types.rs (1)
value(23-29)magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs (1)
magicblock-metrics/src/metrics/mod.rs (2)
observe_committor_intent_execution_time_histogram(347-355)inc_committor_failed_intents_count(337-341)
magicblock-committor-service/src/intent_executor/error.rs (3)
magicblock-committor-service/src/intent_executor/mod.rs (1)
value(70-78)magicblock-committor-service/src/types.rs (1)
value(23-29)magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)
magicblock-metrics/src/metrics/types.rs (3)
magicblock-committor-service/src/intent_executor/error.rs (1)
value(101-108)magicblock-committor-service/src/intent_executor/mod.rs (1)
value(70-78)magicblock-committor-service/src/types.rs (1)
value(23-29)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: run_make_ci_lint
🔇 Additional comments (9)
magicblock-committor-service/src/intent_executor/mod.rs (1)
69-77: Execution outcome labels wired correctly. The explicit strings forSingleStageandTwoStagegive the histogram stable identifiers without touching execution flow.magicblock-committor-service/src/types.rs (1)
22-29: Intent kind labeling aligned. Mapping the wrapper straight to its underlying intent type keeps metric labels in lockstep with scheduler behavior.magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs (1)
248-285: Instrumentation looks solid. Timing the executor, emitting the histogram with intent/result labels, and counting failures all slot neatly into the existing execute workflow.magicblock-metrics/src/metrics/types.rs (1)
72-87: LabelValue abstraction fits. The trait plusResultblanket impl give downstream code a uniform way to surface labels without boilerplate.magicblock-committor-service/src/intent_executor/error.rs (1)
100-108: Error labeling is comprehensive. Variant-specific identifiers (actions/cpi_limit/commit_nonce) plus a fallback keep failure metrics informative.magicblock-metrics/src/metrics/mod.rs (4)
8-9: LGTM!Adding
LabelValueto the public exports is necessary for consumers to implement the trait and use the labeled metrics functions.
185-188: LGTM!Converting the failed intents counter to an
IntCounterVecwitherror_kindlabel enables proper categorization of failures, which aligns with the PR objectives.
337-341: LGTM!The updated signature correctly accepts labeled error kinds, enabling fine-grained tracking of failure categories. The implementation properly uses the
LabelValuetrait.
347-355: LGTM with dependency on registration fix.The implementation correctly observes execution times with both
intent_kindandoutcome_kindlabels. However, this function will only work correctly afterCOMMITTOR_INTENT_EXECUTION_TIME_HISTOGRAMis registered (see comment on lines 194-207).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
aac6a45 to
93d9814
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (6)
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs(2 hunks)magicblock-committor-service/src/intent_executor/error.rs(2 hunks)magicblock-committor-service/src/intent_executor/mod.rs(2 hunks)magicblock-committor-service/src/types.rs(2 hunks)magicblock-metrics/src/metrics/mod.rs(4 hunks)magicblock-metrics/src/metrics/types.rs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
magicblock-metrics/src/metrics/types.rs (3)
magicblock-committor-service/src/intent_executor/error.rs (1)
value(105-112)magicblock-committor-service/src/intent_executor/mod.rs (1)
value(73-81)magicblock-committor-service/src/types.rs (1)
value(23-29)
magicblock-committor-service/src/types.rs (3)
magicblock-committor-service/src/intent_executor/error.rs (1)
value(105-112)magicblock-committor-service/src/intent_executor/mod.rs (1)
value(73-81)magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)
magicblock-committor-service/src/intent_executor/error.rs (3)
magicblock-committor-service/src/intent_executor/mod.rs (1)
value(73-81)magicblock-committor-service/src/types.rs (1)
value(23-29)magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)
magicblock-metrics/src/metrics/mod.rs (4)
magicblock-committor-service/src/intent_executor/mod.rs (1)
value(73-81)magicblock-committor-service/src/intent_executor/error.rs (1)
value(105-112)magicblock-committor-service/src/types.rs (1)
value(23-29)magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)
magicblock-committor-service/src/intent_executor/mod.rs (3)
magicblock-committor-service/src/intent_executor/error.rs (1)
value(105-112)magicblock-committor-service/src/types.rs (1)
value(23-29)magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs (1)
magicblock-metrics/src/metrics/mod.rs (2)
observe_committor_intent_execution_time_histogram(348-356)inc_committor_failed_intents_count(338-342)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: run_make_ci_test
- GitHub Check: Build Project
- GitHub Check: run_make_ci_lint
🔇 Additional comments (5)
magicblock-metrics/src/metrics/mod.rs (5)
8-8: LGTM: LabelValue trait export.Exposing the
LabelValuetrait enables consumers to implement it for custom types used in metric labels, which is essential for the new labeled metrics.
242-242: LGTM: Histogram registration is in place.The histogram is properly registered, addressing the critical issue noted in the previous review.
338-342: LGTM: Labeled counter increment follows best practices.The use of
&impl LabelValueprovides good extensibility, and the implementation correctly retrieves the labeled counter before incrementing it.
348-356: LGTM: Histogram observation function is well-structured.The function signature clearly indicates the unit (seconds), and the implementation correctly applies both labels before recording the observation.
185-188: Internal metric change is properly implemented; external monitoring infrastructure assessment is beyond repo scope.The wrapper function at line 338 already accepts
error_kindand correctly uses.with_label_values()to work withIntCounterVec. The calling code at line 283 passes an error object that should implementLabelValue. This is not an internal breaking change—the abstraction layer handles the migration seamlessly.However, external monitoring systems (Prometheus, Grafana) that reference this metric will see the structural change from
committor_failed_intents_counttocommittor_failed_intents_count{error_kind="..."}. If monitoring infrastructure exists outside this repository, verify those queries and dashboards are compatible with the labeled format.
| static ref COMMITTOR_INTENT_EXECUTION_TIME_HISTOGRAM: HistogramVec = HistogramVec::new( | ||
| HistogramOpts::new( | ||
| "committor_intent_execution_time_histogram", | ||
| "Time in seconds spent on intent execution" | ||
| ) | ||
| .buckets( | ||
| MILLIS_1_9.iter() | ||
| .chain(MILLIS_10_90.iter()) | ||
| .chain(MILLIS_100_900.iter()) | ||
| .chain(SECONDS_1_9.iter()).cloned().collect(), | ||
| ), | ||
| &["intent_kind", "outcome_kind"], | ||
| ).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Consider adding microsecond buckets for sub-millisecond executions.
The histogram starts at 1ms buckets, which may not capture fast-executing intents that complete in microseconds. If intent execution can be faster than 1ms, consider adding MICROS_10_90 and MICROS_100_900 buckets to the chain.
If you're unsure whether intents can execute in sub-millisecond time, you could:
- Check existing logs or traces for execution times
- Monitor the new histogram initially to see if values cluster below 1ms
- Add the microsecond buckets preemptively if the overhead is acceptable
🤖 Prompt for AI Agents
In magicblock-metrics/src/metrics/mod.rs around lines 194 to 206, the histogram
buckets start at 1ms and may miss sub-millisecond executions; update the bucket
chain to include MICROS_10_90 and MICROS_100_900 before the millisecond buckets
(and add the necessary imports/constants if not present) so the chain becomes
micros then millis, ensure .cloned().collect() still applies, and run
tests/build to verify no import or type errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small nit, but otherwise LGTM
| &intent, | ||
| &result, | ||
| ); | ||
| if intent_execution_secs >= 2.0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the 2s threshold come from? It could also maybe be part of a config or at least on const somewher
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If intent executed more than 2 seconds that is a peculiar case we need to dig deeper.
Now the threshold came from empirical observations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs(3 hunks)magicblock-committor-service/src/intent_executor/mod.rs(5 hunks)magicblock-metrics/src/metrics/mod.rs(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs (1)
magicblock-metrics/src/metrics/mod.rs (2)
observe_committor_intent_execution_time_histogram(355-363)inc_committor_failed_intents_count(342-349)
magicblock-committor-service/src/intent_executor/mod.rs (2)
magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)magicblock-metrics/src/metrics/mod.rs (1)
set_commmittor_intent_cu_usage(365-367)
magicblock-metrics/src/metrics/mod.rs (4)
magicblock-committor-service/src/intent_executor/mod.rs (1)
value(73-81)magicblock-committor-service/src/types.rs (1)
value(23-29)magicblock-metrics/src/metrics/types.rs (2)
value(73-73)value(81-86)magicblock-committor-service/src/intent_executor/error.rs (1)
value(105-112)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build Project
- GitHub Check: run_make_ci_lint
| register!(COMMITTOR_FAILED_INTENTS_COUNT); | ||
| register!(COMMITTOR_EXECUTORS_BUSY_COUNT); | ||
| register!(COMMITTOR_INTENT_EXECUTION_TIME_HISTOGRAM); | ||
| register!(ENSURE_ACCOUNTS_TIME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Register the CU usage gauge.
COMMITTOR_INTENT_CU_USAGE is never registered, so Prometheus will never expose the values you’re setting. Please register it alongside the other committor collectors.
register!(COMMITTOR_FAILED_INTENTS_COUNT);
register!(COMMITTOR_EXECUTORS_BUSY_COUNT);
+ register!(COMMITTOR_INTENT_CU_USAGE);
register!(COMMITTOR_INTENT_EXECUTION_TIME_HISTOGRAM);🤖 Prompt for AI Agents
In magicblock-metrics/src/metrics/mod.rs around lines 244 to 247, the
COMMITTOR_INTENT_CU_USAGE gauge is not being registered with Prometheus; add a
register!(COMMITTOR_INTENT_CU_USAGE); call alongside the other committor
collectors (e.g., after COMMITTOR_INTENT_EXECUTION_TIME_HISTOGRAM) so the gauge
is exposed and its values are scraped.
| pub fn set_commmittor_intent_cu_usage(value: i64) { | ||
| COMMITTOR_INTENT_CU_USAGE.set(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Fix the typo in the public setter.
The new API is exposed as set_commmittor_intent_cu_usage (three m’s); everything else—metric name, other helpers, call sites—uses committor. Please rename it now (and update the call site in intent_executor/mod.rs) before we publish the misspelt API.
-pub fn set_commmittor_intent_cu_usage(value: i64) {
+pub fn set_committor_intent_cu_usage(value: i64) {
COMMITTOR_INTENT_CU_USAGE.set(value)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub fn set_commmittor_intent_cu_usage(value: i64) { | |
| COMMITTOR_INTENT_CU_USAGE.set(value) | |
| pub fn set_committor_intent_cu_usage(value: i64) { | |
| COMMITTOR_INTENT_CU_USAGE.set(value) | |
| } |
🤖 Prompt for AI Agents
In magicblock-metrics/src/metrics/mod.rs around lines 365-366 the public setter
is misspelt as set_committor_intent_cu_usage (it contains an extra 'm'); rename
the function to the correct set_committor_intent_cu_usage, update the function
declaration and all references (notably the call site in intent_executor/mod.rs)
to the corrected name, then run cargo check/tests to ensure no remaining
references to the old name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs (1)
magicblock-metrics/src/metrics/mod.rs (2)
observe_committor_intent_execution_time_histogram(355-363)inc_committor_failed_intents_count(342-349)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build Project
- GitHub Check: run_make_ci_test
- GitHub Check: run_make_ci_lint
🔇 Additional comments (2)
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs (2)
1-4: LGTM: Timing imports added correctly.The addition of
DurationandInstantfor execution timing instrumentation is appropriate and necessary for the new metrics functionality.
291-330: The execution_metrics function is correctly implemented.Verification confirms that all types passed to metrics functions have proper
LabelValueimplementations:
ScheduledBaseIntentWrapperimplementsLabelValueIntentExecutorErrorimplementsLabelValueExecutionOutputimplementsLabelValueResult<ExecutionOutput, IntentExecutorError>automatically implementsLabelValuevia the generic impl that requires both type parameters to implement the traitThe metrics calls at lines 300–304 and 306 will compile and execute correctly.
| let instant = Instant::now(); | ||
| let result = executor | ||
| .execute(intent.inner.clone(), persister) | ||
| .await | ||
| .inspect_err(|err| { | ||
| error!( | ||
| "Failed to execute BaseIntent. id: {}. {:?}", | ||
| "Failed to execute BaseIntent. id: {}. {}", | ||
| intent.id, err | ||
| ) | ||
| }) | ||
| }); | ||
|
|
||
| // Metrics | ||
| Self::execution_metrics(instant.elapsed(), &intent, &result); | ||
|
|
||
| let result = result | ||
| .map(|output| ExecutionOutputWrapper { | ||
| id: intent.id, | ||
| trigger_type: intent.trigger_type, | ||
| output, | ||
| }) | ||
| .map_err(|err| { | ||
| // Increase failed intents metric as well | ||
| metrics::inc_committor_failed_intents_count(); | ||
| (intent.inner.id, intent.trigger_type, Arc::new(err)) | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
LGTM: Timing instrumentation properly integrated.
The execution timing is correctly captured with Instant::now() before the async operation and metrics collection occurs after execution completes. The metrics are collected before result mapping, which correctly preserves the original error types for labeling.
Minor observation: Line 254 changed error logging from {:?} (Debug) to {} (Display). This simplifies the log output but may reduce detail if the error's Display implementation is less verbose than Debug.
🤖 Prompt for AI Agents
In
magicblock-committor-service/src/intent_execution_manager/intent_execution_engine.rs
around lines 248 to 270, the error logging was changed from using the Debug
formatter ({:?}) to the Display formatter ({}), which can lose valuable
diagnostic detail; update the error logging to use Debug formatting (e.g.,
"{:?}") or explicitly include both Display and Debug to ensure full error
details are logged for troubleshooting.
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added per-intent execution timing histogram and async CU usage reporting. * Failure counters now include intent and error labels for richer metrics. * **Chores** * Introduced labeled-value support for intent/outcome results so metrics carry context. * Metrics collection refactored to a dedicated flow invoked after execution. * **Bug Fixes** * Improved execution failure log formatting and adjusted logging level based on duration. <!-- end of auto-generated comment: release notes by coderabbit.ai --> # Conflicts: # magicblock-metrics/src/metrics/mod.rs
Summary by CodeRabbit
New Features
Chores
Bug Fixes