
Guard against out-of-bounds indexing on attacker-controlled attestation data #228

Merged
pablodeymo merged 3 commits into main from fix/aggregation-bits-oob-index on Mar 16, 2026

@pablodeymo (Collaborator) commented Mar 13, 2026

Motivation

Security audit findings #6, #7, #8: three sites in process_attestations / try_finalize use direct indexing on data structures whose keys or indices come from network-controlled attestations. A malicious peer can craft attestations that trigger panics, crashing the node.

Description

Fix 1 — Reject attestations with OOB aggregation_bits (finding #6)

File: crates/blockchain/state_transition/src/lib.rs

Problem: aggregation_bits is a BitList<ValidatorRegistryLimit> (max 4096 bits) deserialized from the network. votes is a Vec<bool> of length validator_count (actual validators in state). If a malicious attestation carries aggregation_bits with bits set beyond validator_count, direct indexing panics.

Spec and cross-client behavior:

| Implementation | Behavior |
| --- | --- |
| Spec (state.py:503-505) | Direct list access justifications[target.root][validator_id]; crashes on OOB (IndexError) |
| Zeam (Zig) | Rejects attestation with error |
| Lantern (C) | Rejects attestation with return -1 |
| gean (Go) | Bitlist.Get() returns false for OOB (silent skip) |

Fix: Pre-loop length check that rejects the entire attestation when aggregation_bits.len() > validator_count, matching the spec (crash = implicit reject) and the majority of other clients (Zeam, Lantern). Direct indexing votes[validator_id] = true is safe after the bounds check.
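
A minimal sketch of the pre-loop length check described above, under stated assumptions: `aggregation_bits` is modeled as a plain `&[bool]` rather than the crate's `BitList`, and `AttestationError` and `apply_votes` are hypothetical names that do not appear in the PR. The merged code may skip the attestation and log instead of returning an error.

```rust
/// Hypothetical error type for this sketch; the crate's real error enum is not shown in the PR.
#[derive(Debug, PartialEq)]
pub enum AttestationError {
    AggregationBitsTooLong { bits: usize, validators: usize },
}

/// Marks voters for one attestation, rejecting it up front when the bitfield
/// is longer than the validator set (`votes.len()`).
pub fn apply_votes(
    aggregation_bits: &[bool],
    votes: &mut [bool],
) -> Result<(), AttestationError> {
    let validator_count = votes.len();
    if aggregation_bits.len() > validator_count {
        // Reject the whole attestation before touching `votes`.
        return Err(AttestationError::AggregationBitsTooLong {
            bits: aggregation_bits.len(),
            validators: validator_count,
        });
    }
    // After the length check every index is in bounds, so direct indexing is safe.
    for (validator_id, voted) in aggregation_bits.iter().enumerate() {
        if *voted {
            votes[validator_id] = true;
        }
    }
    Ok(())
}
```

Because the check runs before the loop, a rejected attestation leaves `votes` untouched, which is the difference from silently skipping individual out-of-range bits.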


Fix 2 — Prune justification roots missing from root_to_slot (finding #7)

File: crates/blockchain/state_transition/src/lib.rs (in try_finalize)

Problem: root_to_slot is built from historical_block_hashes for slots after finalization. But justifications can contain roots carried over from a previous finalization window that no longer appear in historical_block_hashes. Direct HashMap index panics on missing key.

Spec and cross-client behavior:

| Implementation | Behavior |
| --- | --- |
| Spec (state.py:560-562) | assert all(root in root_to_slot for root in justifications); crashes (invariant violation) |
| Lantern (C) | Conservative retention |

The spec treats missing roots as an invariant violation. A root absent from root_to_slot means its slot is at or below finalized_slot (already finalized). Conservatively retaining such roots (previous approach with is_none_or) would cause them to accumulate forever — they never reappear in root_to_slot, and is_valid_vote rejects new votes for them.

Fix: Prune missing roots (return false in retain), matching the spec's filter intent: root_to_slot[root] > finalized_slot would be false for such roots. A warning is logged when this happens.
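
The pruning rule can be sketched as follows. This is an illustration, not the merged code: `Root` stands in for `H256`, `prune_justifications` is a hypothetical helper, and the returned count replaces the warning log so the sketch needs no logging crate.

```rust
use std::collections::HashMap;

/// 32-byte root, standing in for the crate's `H256` (assumption of this sketch).
type Root = [u8; 32];

/// Keeps only justifications whose root maps to a slot above `finalized_slot`.
/// Roots missing from `root_to_slot` are pruned; the number pruned for that
/// reason is returned so a caller could emit a warning.
pub fn prune_justifications(
    justifications: &mut HashMap<Root, Vec<bool>>,
    root_to_slot: &HashMap<Root, u64>,
    finalized_slot: u64,
) -> usize {
    let mut missing = 0;
    justifications.retain(|root, _| match root_to_slot.get(root) {
        // Known slot: keep only if it is still above the finalization boundary.
        Some(&slot) => slot > finalized_slot,
        // Unknown root: its slot is at or below the boundary, so drop it.
        None => {
            missing += 1;
            false
        }
    });
    missing
}
```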


Finding #8: justifications[root] (no change needed)

File: crates/blockchain/state_transition/src/lib.rs (in serialize_justifications)

let justification_roots: Vec<H256> = justifications.keys().cloned().collect();
// ...
.flat_map(|root| justifications[root].iter())

Reviewed and confirmed safe: justification_roots is built from justifications.keys(), so every root used as index is guaranteed to be present in the map. No change needed.
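
To illustrate why this indexing cannot panic, here is a small self-contained sketch. `flatten_votes` and the `Root` alias are hypothetical, and the sort is an assumption added only to make the output order deterministic.

```rust
use std::collections::HashMap;

/// 32-byte root, standing in for the crate's `H256` (assumption of this sketch).
type Root = [u8; 32];

/// Flattens every vote vector in sorted-root order.
/// `justifications[root]` cannot panic here: every `root` was taken from
/// `justifications.keys()` and the map is not mutated in between.
pub fn flatten_votes(justifications: &HashMap<Root, Vec<bool>>) -> (Vec<Root>, Vec<bool>) {
    let mut roots: Vec<Root> = justifications.keys().cloned().collect();
    roots.sort(); // ordering choice is an assumption of this sketch
    let votes = roots
        .iter()
        .flat_map(|root| justifications[root].iter().copied())
        .collect();
    (roots, votes)
}
```

The guarantee holds only while the key list and the map stay in sync; inserting a removal between the two steps would reintroduce the panic path.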

How to Test

make fmt
make lint
cargo test --workspace --release

All 120 tests pass unchanged — valid attestations always have aggregation_bits.len() <= validator_count and all justification roots are in root_to_slot, so these guards only trigger on malformed/adversarial data not present in test fixtures.

…ttestation processing

Three defensive fixes in process_attestations / try_finalize:

1. aggregation_bits OOB (finding #6): A malicious attestation could carry
   aggregation_bits longer than the validator set, causing votes[validator_id]
   to panic on out-of-bounds access. Now filter bits >= validator_count before
   indexing.

2. root_to_slot missing key (finding #7): try_finalize used direct HashMap
   index root_to_slot[root] which panics if a justification root is absent
   from historical_block_hashes (e.g. carried over from a prior finalization
   window). Replaced with .get() — missing roots are conservatively retained.

3. justifications[root] (finding #8): Reviewed and confirmed safe — the roots
   come from justifications.keys() so the key is always present. No change
   needed.
@github-actions

🤖 Kimi Code Review

Security & Correctness Review

1. Out-of-bounds panic prevention (lines 287-295)

The fix for aggregation_bits length mismatch is correct and critical. The original code could panic on malicious attestations with longer bitfields than the validator set. The new bounds check *id < validator_count prevents this.

2. Missing root handling (lines 421-428)

The change from direct indexing to .get() with .is_none_or() is correct. This prevents panics when justification roots don't exist in root_to_slot, which can happen during state transitions or with malicious data.

3. Minor optimization opportunity

In the attestation processing loop (lines 290-298), consider using zip with the validator count to avoid the bounds check entirely:

for (_, validator_id) in attestation
    .aggregation_bits
    .iter()
    // zip stops at validator_count, so validator_id is always in bounds
    .zip(0..validator_count)
    .filter(|(voted, _)| *voted)
{
    votes[validator_id] = true;
}

However, the current fix is safer as it handles potentially malformed bitfields more defensively.

4. Consensus correctness

Both changes are consensus-preserving:

  • The attestation fix only affects invalid/malicious attestations
  • The justification pruning change conservatively retains unknown roots, which is the safe choice

Summary

The PR correctly fixes two potential panic conditions that could be triggered by malicious network participants. The changes are minimal, focused, and maintain consensus correctness.


Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions

🤖 Codex Code Review

  1. Low: try_finalize now retains roots missing from root_to_slot, which can preserve stale justifications indefinitely and increase state size over time (lib.rs:425).
    Recommendation: consider pruning missing roots (.is_some_and(...)) or explicitly tagging/cleaning “orphaned” roots, otherwise justifications_roots can accumulate dead entries.

  2. Low: process_attestations silently truncates out-of-range aggregation_bits instead of rejecting malformed attestations (lib.rs:289).
    This prevents panic (good), but it can hide invalid input in paths that skip signature verification and may mask consensus/data-integrity issues in tests. Consider emitting a metric/log or hard-failing the attestation/block when any set bit index is >= validator_count.
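
The hard-fail variant suggested here, which triggers only when a set bit lies at or beyond `validator_count`, might look like the sketch below. `first_out_of_range_vote` is a hypothetical helper and the bitfield is modeled as `&[bool]`.

```rust
/// Returns the index of the first set bit at or beyond `validator_count`, if any.
/// A caller could reject the attestation or block when this returns `Some`.
pub fn first_out_of_range_vote(
    aggregation_bits: &[bool],
    validator_count: usize,
) -> Option<usize> {
    aggregation_bits
        .iter()
        .enumerate()
        // Only positions past the validator set can be out of range.
        .skip(validator_count)
        .find(|(_, voted)| **voted)
        .map(|(index, _)| index)
}
```

Unlike a pure length check, this variant tolerates trailing unset bits, so it accepts a bitfield that is longer than the validator set as long as the excess bits are all zero.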

No critical security or memory-safety regressions found in this diff; both changes are defensive and remove panic paths.

I could not run tests in this environment due to rustup temp-dir permission errors (/home/runner/.rustup/tmp/...).


Automated review by OpenAI Codex · custom prompt

@github-actions

🤖 Claude Code Review

Here is my review of PR #228:


PR Review: Guard against OOB panics on attacker-controlled attestation data

The PR is a focused security patch for two crash vulnerabilities. The analysis and fixes are correct. A few points follow.


Fix 1 — aggregation_bits OOB (process_attestations, line ~293)

Correctness: Sound. votes is sized to validator_count (via repeat_n), so any validator_id >= validator_count would panic. The added filter predicate *id < validator_count is the correct guard.

Consensus safety: All honest nodes process the same state, so silently skipping bits beyond the validator set is deterministic and consensus-preserving. That said, an attestation with bits set beyond validator_count is technically malformed per the spec. An argument could be made for rejecting the whole attestation rather than silently ignoring the excess bits, which would be more faithful to the protocol. The current approach is safe and arguably more robust in practice, but worth documenting as an explicit spec deviation if it is one.

Minor style nit: The filter closure names the bit index id while the for loop binds it as validator_id, although both refer to the same element:

// filter uses `id`
.filter(|(id, voted)| *voted && *id < validator_count)
// for uses `validator_id`
for (validator_id, _) in ...

Not a bug, but using a consistent name (e.g. validator_id throughout, or id throughout) would remove the momentary confusion.


Fix 2 — root_to_slot[root] missing key (try_finalize, line ~425)

Correctness: The fix is correct and the is_none_or API is available since Rust 1.82 (project uses 1.92).

Conservative-retention semantics: The PR retains roots absent from root_to_slot rather than pruning them. This avoids incorrectly discarding votes for roots that are genuinely still live. However, retained "orphan" roots accumulate in justifications and are written back into state.justifications_roots / state.justifications_validators. New votes for these roots cannot be added (is_valid_vote / checkpoint_exists will reject them), but the stale entries grow state. This matters if orphaned roots build up over time — they can only be pruned once they appear in root_to_slot at a finalized-or-earlier slot. Since state.justifications_roots is an SSZ list with a hard limit, unbounded accumulation would eventually cause a panic in serialize_justifications (try_into().expect("justifications_roots limit exceeded")).

This is probably a rare edge case in practice, and it is a pre-existing architectural concern rather than something introduced by this PR. Still, it is worth a follow-up issue to track whether orphaned roots can accumulate indefinitely.
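
The accumulation concern can be reproduced with a toy simulation. Everything here is an assumption made for illustration: `Root` stands in for `H256`, `retain_conservatively` and `orphans_after` are hypothetical, and `map_or(true, ..)` is used as an equivalent of `is_none_or` so the sketch also builds on toolchains older than 1.82.

```rust
use std::collections::HashMap;

/// 32-byte root, standing in for the crate's `H256` (assumption of this sketch).
type Root = [u8; 32];

/// Retention predicate from the first revision of the fix: a root missing
/// from `root_to_slot` is kept.
pub fn retain_conservatively(
    justifications: &mut HashMap<Root, Vec<bool>>,
    root_to_slot: &HashMap<Root, u64>,
    finalized_slot: u64,
) {
    justifications.retain(|root, _| {
        root_to_slot
            .get(root)
            .map_or(true, |&slot| slot > finalized_slot)
    });
}

/// Simulates `rounds` finalizations and returns how many entries survive.
/// Each round rebuilds `root_to_slot` for slots above the new finalized slot
/// only, so the orphaned root is never listed again.
pub fn orphans_after(rounds: u64) -> usize {
    let orphan: Root = [0xAA; 32];
    let mut justifications = HashMap::from([(orphan, vec![false; 4])]);
    for finalized_slot in 1..=rounds {
        let live: Root = [finalized_slot as u8; 32];
        let root_to_slot = HashMap::from([(live, finalized_slot + 1)]);
        retain_conservatively(&mut justifications, &root_to_slot, finalized_slot);
    }
    justifications.len()
}
```

In this model the orphaned entry survives every round regardless of how far finalization advances, which is the behavior the follow-up issue would need to rule out.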


Finding #8: justifications[root] in serialize_justifications (no change)

The analysis is correct. justification_roots is constructed from justifications.keys(), so every key used in the index at line 452 is guaranteed present. No change needed.


Overall

Both fixes are minimal, correct, and well-commented. The PR achieves its goal of eliminating the two crash vectors. The main follow-up item is whether conservative retention in fix 2 can lead to stale-root accumulation saturating the justifications_roots SSZ list limit in adversarial conditions.


Automated review by Claude (Anthropic) · custom prompt

…cate

The bounds guard is now at the point of access rather than in the iterator
filter. get_mut returns None for out-of-bounds indices, making the safety
invariant self-contained in a single expression.
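
A short sketch of the point-of-access guard described in this commit message, using the standard slice method `get_mut`. `mark_vote` is a hypothetical wrapper; its boolean return value is added here only to make the skipped case observable.

```rust
/// Sets `votes[validator_id]` if the index exists and reports whether it did.
/// `get_mut` returns `None` for an out-of-bounds index instead of panicking.
pub fn mark_vote(votes: &mut [bool], validator_id: usize) -> bool {
    match votes.get_mut(validator_id) {
        Some(vote) => {
            *vote = true;
            true
        }
        // Out of range: nothing is written.
        None => false,
    }
}
```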
@greptile-apps (Contributor, bot) commented Mar 13, 2026

Greptile Summary

This PR addresses three security-audit findings (6, 7, 8) in process_attestations / try_finalize by hardening two direct-index operations against attacker-controlled attestation data.

  • Fix 1 (aggregation_bits OOB, finding #6): The enumerate().filter predicate now adds *id < validator_count alongside the *voted check. This correctly prevents out-of-bounds writes into the votes vector when a malicious peer sends an AggregationBits bitfield with bits set at indices at or beyond the number of validators in the current state. The fix is sound and does not affect valid attestations.

  • Fix 2 (root_to_slot missing key, finding #7): Direct HashMap indexing root_to_slot[root] is replaced with .get(root).is_none_or(|&slot| slot > state.latest_finalized.slot). This prevents a panic when a justification root that was valid in a previous finalization window is no longer present in the root_to_slot map (which is built only for slots strictly after the current finalized boundary). However, the None branch retains such entries indefinitely: their slots are already at or below latest_finalized.slot, so they will never appear in any future root_to_slot map, and is_slot_justified implicitly marks them as justified (preventing new votes), meaning they can never be removed through the normal justification path either. The inline comment ("they'll be pruned naturally once their slot finalizes") is therefore inaccurate: these entries accumulate permanently in state.justifications_roots and state.justifications_validators.

  • Finding #8: Correctly assessed as safe. justification_roots is built directly from justifications.keys(), so all subsequent indexing is guaranteed to succeed.

Key concern: the conservative None-retain strategy in Fix 2 risks unbounded state growth in state.justifications_roots under adversarial conditions. The only backstop is the expect("justifications_roots limit exceeded") panic in serialize_justifications.

Confidence Score: 3/5

  • Safe to merge for panic prevention, but the is_none_or / None-retain path introduces a permanent stale-entry accumulation in state.justifications_roots that should be addressed before this pattern is relied upon long-term.
  • Fix 1 (aggregation_bits bounds check) is clean and correct with no side-effects. Fix 2 (HashMap get instead of direct index) correctly prevents the panic but the None branch silently retains entries whose slots are already past the finalization boundary — contrary to the comment — with no cleanup path. While is_valid_vote prevents those entries from accumulating further votes, they persist indefinitely in state.justifications_roots and state.justifications_validators, introducing a state-growth vector under adversarial conditions. The fix does not regress valid-attestation behaviour and all 120 existing tests pass, but the retention logic warrants a follow-up correction.
  • crates/blockchain/state_transition/src/lib.rs — specifically the try_finalize justifications.retain closure (lines 425–429) and its interaction with the serialize_justifications capacity checks.

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/blockchain/state_transition/src/lib.rs | Two OOB panic guards added: (1) aggregation_bits filter correctly bounds-checks validator index against validator_count; (2) root_to_slot lookup switched to .get().is_none_or(...) to avoid missing-key panic, but the None branch conservatively retains stale entries that are already past the finalization boundary and will never be pruned, risking indefinite accumulation in state.justifications_roots. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[process_attestations] --> B{ZeroHashInJustificationRoots?}
    B -- yes --> C[Return Err]
    B -- no --> D[Build root_to_slot from\nhistorical_block_hashes\n after latest_finalized.slot]
    D --> E[For each attestation]
    E --> F{is_valid_vote?}
    F -- no --> E
    F -- yes --> G[entry target.root in justifications]
    G --> H[Filter aggregation_bits\n *voted AND id < validator_count\n Fix 1: OOB guard]
    H --> I[votes validator_id = true]
    I --> J{3 × vote_count\n ≥ 2 × validator_count?}
    J -- no --> E
    J -- yes --> K[latest_justified = target\nremove target.root from justifications]
    K --> L[try_finalize]
    L --> M{Gap between source\nand target justifiable?}
    M -- yes --> N[metrics::inc finalizations error\nreturn]
    M -- no --> O[Advance latest_finalized = source\nshift justified_slots window]
    O --> P[justifications.retain\nroot_to_slot.get root .is_none_or\n Fix 2: missing-key guard]
    P --> Q{root in root_to_slot?}
    Q -- None --> R[Conservatively retain\n stale entry never pruned]
    Q -- Some slot > finalized --> S[Retain]
    Q -- Some slot ≤ finalized --> T[Prune]
    L --> E
    E --> U[serialize_justifications\nstate.justifications_roots updated]

Last reviewed commit: 364eaa8

Comment on lines +421 to +429
  justified_slots_ops::shift_window(&mut state.justified_slots, delta);

- // Prune justifications whose roots only appear at now-finalized slots
+ // Prune justifications whose roots only appear at now-finalized slots.
+ // Use .get() instead of direct index — a root may be absent from root_to_slot
+ // if it was never in historical_block_hashes (e.g. carried over from a previous
+ // finalization window). Missing roots are conservatively retained.
  justifications.retain(|root, _| {
-     let slot = root_to_slot[root];
-     slot > state.latest_finalized.slot
+     root_to_slot
+         .get(root)
Contributor

Stale justification entries retained forever, not pruned naturally

The comment states retained entries "will be pruned naturally once their slot finalizes", but these entries are in the None branch precisely because their slot is already at or below state.latest_finalized.slot — their slot has already been finalized. They will never appear in any future root_to_slot map (which always starts at latest_finalized.slot + 1), so the retain predicate will continue to return true for them indefinitely.

The only other removal path is justifications.remove(&target.root) at line 319, which fires only when a root reaches supermajority justification. For these stale roots that's prevented: is_slot_justified returns true for finalized-or-earlier slots (unwrap_or(true) in justified_slots_ops), so is_valid_vote rejects new attestations targeting them. They effectively can never be removed.

The practical consequence is that every such stale entry accumulates permanently in state.justifications_roots and state.justifications_validators across all subsequent blocks. While bounded by HistoricalRootsLimit / JustificationValidators capacity, a sustained adversarial workload that repeatedly triggers this path could grow state without a cleanup mechanism. The expect("justifications_roots limit exceeded") in serialize_justifications is then the only backstop, and it terminates the node rather than pruning gracefully.

Consider replacing the conservative approach with an explicit prune: keep a root only when its slot is known and still above the finalization boundary, and remove it when its slot is finalized or when it is missing from root_to_slot. That would resolve the inaccurate comment and eliminate the accumulation concern:

justifications.retain(|root, _| {
    match root_to_slot.get(root) {
        // Root's slot is known and still above the finalization boundary → keep
        Some(&slot) => slot > state.latest_finalized.slot,
        // Root not in root_to_slot: already finalized or unknown → prune
        None => false,
    }
});

Alternatively, update the comment to accurately describe the behaviour: these entries are permanently retained and will never be pruned by this path.


Comment on lines +295 to +297
if let Some(vote) = votes.get_mut(validator_id) {
    *vote = true;
}
Collaborator

We need to check what the spec does in this case

Comment on lines +428 to +430
root_to_slot
    .get(root)
    .is_none_or(|&slot| slot > state.latest_finalized.slot)
Collaborator

Same

Collaborator Author

addressed

…behavior

Reject entire attestations when aggregation_bits exceeds validator count
(matching spec crash, Zeam reject, Lantern reject) instead of silently
skipping individual OOB bits. Prune justification roots missing from
root_to_slot (their slot is at or below finalized_slot) instead of
conservatively retaining them forever.
@pablodeymo (Collaborator, Author)

@greptile-apps wdyt?

@greptile-apps (Contributor, bot) commented Mar 16, 2026

The approach looks solid. Both fixes are clean and well-reasoned:

Fix 1 (aggregation_bits): The pre-loop length check is the right move. Simple, catches the issue early, and aligns with Zeam/Lantern's reject behavior.

Fix 2 (justification roots): Using .get() instead of direct indexing and pruning missing roots makes sense given the analysis — they'd never reappear in root_to_slot anyway, so conservative retention would just leak memory.

One consideration: The warnings could get noisy if someone floods malicious attestations. Not a blocker, but worth thinking about whether you want any rate limiting on these log messages or if you're comfortable with the spam during an active attack. Could also consider downgrading to debug! level if you find it too chatty in practice.
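
One way to limit log noise, sketched under assumptions: `SampledWarn` is a hypothetical helper that emits the first warning and then every `every`-th one. It counts events rather than elapsed time, which is a simplification, and nothing like it is part of this PR.

```rust
/// Event-count based sampler for a noisy warning site.
pub struct SampledWarn {
    seen: u64,
    every: u64,
}

impl SampledWarn {
    /// `every` is clamped to at least 1 so the modulo below never divides by zero.
    pub fn new(every: u64) -> Self {
        Self { seen: 0, every: every.max(1) }
    }

    /// Returns true when the caller should emit the warning for this event.
    pub fn should_log(&mut self) -> bool {
        let emit = self.seen % self.every == 0;
        self.seen += 1;
        emit
    }
}
```

A caller would wrap the existing warning in `if sampler.should_log() { ... }`, keeping the first occurrence visible while bounding output during a flood.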

The cross-client comparison table is really helpful context. Nice work on the thorough analysis.

.entry(target.root)
.or_insert_with(|| std::iter::repeat_n(false, validator_count).collect());
// Reject attestations with aggregation_bits longer than the validator set.
// The spec would crash (IndexError) on OOB access; Zeam and Lantern reject.
Collaborator

An exception in the spec means a rejection. That's why it uses assert

@pablodeymo pablodeymo merged commit e4973aa into main Mar 16, 2026
2 checks passed
@pablodeymo pablodeymo deleted the fix/aggregation-bits-oob-index branch March 16, 2026 21:12