Skip to content

fix: update lean_head_slot metric on head slot update#163

Merged
pablodeymo merged 2 commits intomainfrom
fix-head-slot-metric
Feb 27, 2026
Merged

fix: update lean_head_slot metric on head slot update#163
pablodeymo merged 2 commits intomainfrom
fix-head-slot-metric

Conversation

@MegaRedHand
Copy link
Copy Markdown
Collaborator

Summary

  • Report actual head slot: lean_head_slot was set to the incoming block's slot instead of the fork choice head slot, reporting wrong values during reorgs or late block arrivals.
  • Update on tick: The metric was only updated during block processing, missing head changes caused by attestation promotion at tick intervals 0 and 3. Now updated every tick alongside the existing safe_target_slot metric.
  • Add Store::head_slot() helper: Convenience accessor following the existing safe_target_slot() pattern.

Test plan

  • make lint passes (no unused variable warnings)
  • make test passes (all workspace tests)
  • Forkchoice spec tests pass (26/26)
  • In a devnet: lean_head_slot should never jump backwards when old blocks arrive, and should advance even during slots with no new blocks when attestation promotion changes the head

@github-actions
Copy link
Copy Markdown

🤖 Kimi Code Review

Review Summary

The changes look correct and improve metric accuracy. The new head_slot() accessor is well-implemented. No critical issues found.

Positive Points

  1. Metric Accuracy: The change from using the received block's slot to store.head_slot() in on_block (line 281) correctly reflects that the head may not change to the newly received block due to fork choice rules.

  2. Proper Abstraction: The new head_slot() method (lines 861-867) provides a clean accessor that encapsulates the "head exists" invariant.

  3. Tick Metric Fix: Adding update_head_slot in on_tick (line 144) correctly handles cases where attestations promoted at intervals 0/3 change the head without new blocks.

Minor Suggestions

  • Error Handling: The expect in head_slot() (line 863) is acceptable given the invariant, but consider adding a comment explaining why the head block is guaranteed to exist (e.g., "Store always has at least genesis block").

  • Consistency: Both head_slot() and safe_target_slot() use the same pattern of get_block_header().expect(...).slot, which is good for consistency.

The PR correctly addresses the issue where head slot metrics could become stale when the fork choice changes due to attestation processing rather than block reception.


Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

  1. Medium: possible node panic in non-critical metrics path
    Store::head_slot() uses .expect("head block exists"), and it is now called from periodic ticking and block processing. If store invariants are ever broken (e.g., corruption/partial write/recovery edge case), metrics update can crash the process. For observability code, that is a risky failure mode in a consensus client.
    Refs: store.rs:862, store.rs:864, lib.rs:144, lib.rs:281
    Suggested change: make head_slot() fallible (Option<u64>/Result<u64, _>) and skip/log metric update on failure.

  2. Low: extra hot-path read for metrics
    metrics::update_head_slot(self.store.head_slot()) adds a header lookup on every tick and block process. This is likely small, but in a tight consensus loop it’s avoidable overhead.
    Refs: lib.rs:144, lib.rs:281, store.rs:863
    Suggested change: cache head slot alongside head root in store state, or return it from existing in-memory fork-choice state when possible.

The metric semantics change itself looks correct: updating head slot from actual store head (instead of incoming block slot) is an improvement and avoids misleading values on non-head blocks. No direct fork-choice/attestation/finalization/XMSS/SSZ correctness regressions are visible in this diff.


Automated review by OpenAI Codex · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

The diff is small and focused. The changes are clear and the context is sufficient for a thorough review.


PR 163 Review — Fix lean_head_slot metric to report actual fork choice head

Summary

This is a correctness fix for the lean_head_slot Prometheus metric. The change is small, well-scoped, and follows existing patterns in the codebase. The logic is sound.


What's Fixed

Before: update_head_slot in process_block used signed_block.message.block.slot — the incoming block's slot — regardless of whether that block actually became the fork choice head (e.g., during a reorg or late arrival on a non-canonical chain).

After: Both call sites now use self.store.head_slot(), which reads the actual fork choice head from the store post-update. Additionally, the metric is now updated on every tick, so head changes driven by attestation promotion (intervals 0 and 3) are reflected promptly.


Code Analysis

store.rshead_slot() accessor (lines 862–866)

pub fn head_slot(&self) -> u64 {
    self.get_block_header(&self.head())
        .expect("head block exists")
        .slot
}

The expect invariant is sound: the head is always a block root that exists in storage (genesis is inserted at initialization, and set_head is only called after a block is successfully stored). This mirrors the identical pattern in safe_target_slot() and head_state(), so it's consistent. The doc comment is adequate.

lib.rson_tick (line 144)

The metric update is placed after store::on_tick and after propose_block. This ordering is correct: attestation promotion happens inside store::on_tick, so the head may change there. Reading head_slot() after that call captures the most current value. The update_head_slot call at tick interval doesn't add a second storage lookup on the hot path — head() reads from metadata, get_block_header does one storage read, both are cheap.

lib.rsprocess_block (line 281)

Removing the stale let slot = signed_block.message.block.slot; binding and replacing it with self.store.head_slot() is correct. Fork choice runs inside store::on_block, so by the time control returns, the head is already updated. The metric now reflects the actual post-fork-choice head, which may differ from the received block's slot in case the received block doesn't win fork choice.


Potential Concerns

Double storage read in head_slot(): head_slot() calls self.head() (one metadata read) followed by get_block_header(...) (one block header read). This happens every tick (~4 reads/slot). This is negligible, but noting it for completeness since safe_target_slot() has the same cost.

No regression on update_head_slot being called redundantly: After process_block, on_block returns and the tick continues to fire. The tick will then also call update_head_slot. This redundancy is harmless (idempotent gauge set), and the tick-based update was already doing the same for safe_target_slot.

on_tick update for all intervals, including interval 1 (attestation production) and interval 2 (safe target update): The head doesn't change at intervals 1 or 2 in normal operation (no attestation promotion happens there per the architecture docs). Calling update_head_slot unconditionally at every tick is slightly over-eager but not incorrect — it's the same approach used for safe_target_slot, and a gauge set to the same value is a no-op from the Prometheus client's perspective.


Minor Observations

  • The comment on line 141 says "updated by store.on_tick at interval 2" for safe_target_slot. A symmetric comment for head_slot explaining when the head can change (intervals 0/3 via attestation promotion, or when process_block is called) would make the code self-documenting, but this is already partially covered by the new comment on line 143.
  • The PR description's devnet test is still unchecked — expected given it requires a live environment, and the automated tests are sufficient for correctness here.

Verdict

The fix is correct, minimal, and follows established patterns. No bugs or security issues. Approve.


Automated review by Claude (Anthropic) · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Feb 27, 2026

Greptile Summary

Fixed lean_head_slot metric to report the actual fork choice head slot instead of the incoming block's slot.

Key Changes:

  • Added Store::head_slot() helper to access the slot of the current fork choice head, following the existing safe_target_slot() pattern
  • Updated process_block() to call self.store.head_slot() instead of using the incoming block's slot
  • Added metric update in on_tick() to capture head changes from attestation promotion at intervals 0 and 3

Impact:

  • Metric now correctly reflects fork choice head during reorgs and late block arrivals
  • Captures head changes caused by attestation promotion, even when no new blocks arrive
  • Prevents the metric from reporting incorrect values or jumping backwards when old blocks are processed

Confidence Score: 5/5

  • This PR is safe to merge with no issues found
  • Clean implementation that correctly fixes a metric reporting bug. The changes follow existing patterns (safe_target_slot()), use safe accessor methods with appropriate expects, and properly address both the incorrect value source (incoming block vs fork choice head) and missing update timing (only on block vs every tick). All tests pass including forkchoice spec tests.
  • No files require special attention

Important Files Changed

Filename Overview
crates/blockchain/src/lib.rs Updates lean_head_slot metric to use actual fork choice head slot instead of incoming block slot, and adds metric update on every tick to capture attestation-driven head changes
crates/storage/src/store.rs Adds head_slot() convenience accessor following existing safe_target_slot() pattern to retrieve the slot of the current fork choice head

Last reviewed commit: 3911c58

@MegaRedHand MegaRedHand changed the title Fix lean_head_slot metric to report actual fork choice head fix: update lean_head_slot metric on head slot update Feb 27, 2026
@pablodeymo pablodeymo merged commit a93c980 into main Feb 27, 2026
2 checks passed
@pablodeymo pablodeymo deleted the fix-head-slot-metric branch February 27, 2026 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants