Add safety-net pruning to prevent OOM when finalization is stalled (#175)

Closed
pablodeymo wants to merge 1 commit into main from safety-net-pruning
Conversation

Collaborator

@pablodeymo pablodeymo commented Mar 2, 2026

Motivation

When the chain runs for an extended period without finalization (e.g., due to insufficient aggregators or network issues), all pruning is effectively disabled — every prune function gates on finalized_slot advancing. The States table has no pruning at all, and each state is 100+ MB (contains historical_block_hashes up to 8 MiB, justifications_roots up to 8 MiB, validators ~245 KB).

After ~12 hours without finalization: 10,800 states × 100+ MB = potential terabytes of data, causing OOM.

Closes #166
Relates to #103

Description

Adds a safety-net pruning mechanism that activates only when finalization is stalled, using a 1024-slot window (~68 minutes of chain history).
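
As a sanity check on the figures above, the window and stall arithmetic can be verified directly. This is a sketch only; the 4-second slot time is an assumption inferred from "1024 slots ≈ 68 minutes", not taken from the code:

```rust
// Assumed slot duration; inferred from "1024 slots ≈ 68 minutes" above.
const SLOT_SECONDS: u64 = 4;

/// Minutes of chain history covered by a window of `slots` slots.
fn window_minutes(slots: u64) -> u64 {
    slots * SLOT_SECONDS / 60
}

/// Number of slots produced in `hours` hours of wall time.
fn slots_in_hours(hours: u64) -> u64 {
    hours * 3600 / SLOT_SECONDS
}
```

These reproduce both numbers used in the Motivation section: a 1024-slot window spans about 68 minutes, and ~12 hours without finalization accumulates 10,800 states.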

Cutoff calculation

cutoff_slot = max(finalized_slot, head_slot.saturating_sub(1024))
  • Finalization healthy: cutoff_slot == finalized_slot → no-op (existing finalization-triggered pruning already handles it)
  • Finalization stalled: cutoff_slot == head_slot - 1024 → prunes old unfinalized data
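
A minimal sketch of this computation (the constant name matches the PR description; the surrounding Store API is omitted):

```rust
const MAX_UNFINALIZED_SLOTS: u64 = 1024;

/// Cutoff below which unfinalized data may be pruned. Saturating subtraction
/// keeps early slots (head_slot < 1024) from underflowing.
fn cutoff_slot(finalized_slot: u64, head_slot: u64) -> u64 {
    finalized_slot.max(head_slot.saturating_sub(MAX_UNFINALIZED_SLOTS))
}
```

When finalization is healthy the max is dominated by finalized_slot, so the safety net is a no-op; when it stalls, the head-anchored window takes over.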

Protected roots (never pruned)

  • head root
  • latest_finalized root
  • latest_justified root
  • safe_target root
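
Combined with the cutoff, the prune decision reduces to a two-part predicate. This sketch mirrors the `slot <= cutoff_slot && !protected_roots.contains(&root)` filter described later in the reviews, with the root type simplified to a plain byte array:

```rust
use std::collections::HashSet;

type Root = [u8; 32];

/// A block's data is prunable only when it is at or below the cutoff
/// AND its root is not one of the four protected roots.
fn is_prunable(slot: u64, root: &Root, cutoff_slot: u64, protected: &HashSet<Root>) -> bool {
    slot <= cutoff_slot && !protected.contains(root)
}
```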

What gets pruned

All prunable tables, using cutoff_slot instead of finalized_slot:

| Table | Status before this PR |
| --- | --- |
| States | No pruning at all — primary OOM vector |
| BlockHeaders / BlockBodies / BlockSignatures | No pruning at all — accumulate indefinitely |
| LiveChain | Already pruned on finalization, extended to cutoff |
| GossipSignatures | Already pruned on finalization, extended to cutoff |
| AttestationDataByRoot | Already pruned on finalization, extended to cutoff |
| LatestNewAggregatedPayloads | Already pruned on finalization, extended to cutoff |
| LatestKnownAggregatedPayloads | Already pruned on finalization, extended to cutoff |

When it runs

Once per slot at interval 0 in BlockChainServer::on_tick, after tick processing (attestation acceptance) but before block proposal.

New methods in Store

  • safety_net_prune() — Public entry point. Computes cutoff, builds protected set, calls individual prune methods, logs summary at info level.
  • prune_states(cutoff_slot, protected_roots) — Iterates BlockHeaders (small, ~100 bytes each) to find slots, deletes matching States entries.
  • prune_old_blocks(cutoff_slot, protected_roots) — Deletes from BlockHeaders, BlockBodies, and BlockSignatures for non-protected old blocks.
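
An in-memory model of the entry-point flow, for illustration only: the types and table shapes are hypothetical stand-ins (roots abbreviated to u64 keys), while the real methods iterate DB tables and delete via batched writes.

```rust
use std::collections::{HashMap, HashSet};

const MAX_UNFINALIZED_SLOTS: u64 = 1024;

/// Simplified stand-in for Store: two of the tables, keyed by block root.
struct MockStore {
    /// block root -> slot (stands in for BlockHeaders)
    headers: HashMap<u64, u64>,
    /// block root -> opaque state bytes (stands in for States)
    states: HashMap<u64, Vec<u8>>,
}

impl MockStore {
    /// Compute the cutoff, early-return when finalization is healthy,
    /// then prune everything old and unprotected; returns the pruned count.
    fn safety_net_prune(
        &mut self,
        finalized_slot: u64,
        head_slot: u64,
        protected: &HashSet<u64>,
    ) -> usize {
        let cutoff = finalized_slot.max(head_slot.saturating_sub(MAX_UNFINALIZED_SLOTS));
        if cutoff <= finalized_slot {
            return 0; // healthy: finalization-triggered pruning already covers this
        }
        let doomed: Vec<u64> = self
            .headers
            .iter()
            .filter(|&(root, &slot)| slot <= cutoff && !protected.contains(root))
            .map(|(&root, _)| root)
            .collect();
        for root in &doomed {
            self.headers.remove(root);
            self.states.remove(root);
        }
        doomed.len()
    }
}
```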

Files changed

| File | Change |
| --- | --- |
| crates/storage/src/store.rs | +MAX_UNFINALIZED_SLOTS constant, +safety_net_prune(), +prune_states(), +prune_old_blocks() |
| crates/blockchain/src/lib.rs | Call self.store.safety_net_prune() at interval 0 in on_tick |

How to test

Safety-net pruning only activates when head_slot - finalized_slot > 1024, which doesn't occur in any test fixture. All existing tests pass unchanged:

make fmt    # clean
make lint   # clean (clippy -D warnings)
make test   # all 62 tests pass (11 unit + 26 forkchoice + 8 signature + 14 STF + 3 storage)

To test the actual pruning behavior in a live environment:

  1. Start a devnet without the --is-aggregator flag (finalization will stall)
  2. Let it run for >1024 slots (~68 minutes)
  3. Observe Safety-net pruning: finalization stalled log messages with pruned counts
  4. Verify the node does not OOM (previously it would accumulate unbounded state data)

  When the chain runs without finalization (e.g., insufficient aggregators),
  all pruning is disabled since every prune function gates on finalized_slot
  advancing. The States table has no pruning at all, and each state is 100+ MB.
  After ~12 hours without finalization this can reach terabytes of data.

  Add a safety-net that computes cutoff_slot = max(finalized_slot,
  head_slot - 1024) and prunes states, blocks, live chain, signatures,
  attestation data, and aggregated payloads older than the cutoff. Protected
  roots (head, finalized, justified, safe_target) are never pruned. When
  finalization is healthy, cutoff equals finalized_slot and this is a no-op.

  Runs once per slot at interval 0 in on_tick, after tick processing but
  before block proposal.

github-actions bot commented Mar 2, 2026

🤖 Kimi Code Review

Review Summary

The safety-net pruning implementation is a solid defensive mechanism against OOM when finalization stalls. However, several issues need attention:

Critical Issues

  1. Race condition in prune_states and prune_old_blocks (lines 1027-1081 and 1108-1162):

    • Both functions iterate over BlockHeaders while potentially deleting from other tables, but they don't ensure atomicity between the read and write phases
    • A block could be added between the read and write, causing the pruning to delete data for a block that should be protected
    • Fix: Use a single transaction for both read and write operations
  2. Missing pruning functions (lines 995-1003):

    • prune_live_chain, prune_gossip_signatures, prune_attestation_data_by_root, and prune_aggregated_payload_table are called but not implemented
    • This will cause compilation failures
  3. Potential panic in H256::from_ssz_bytes (lines 1042, 1085, 1125):

    • Using expect("valid root") assumes all keys are valid 32-byte roots, but the database might contain corrupted data
    • Fix: Use proper error handling instead of expect

Security & Consensus Issues

  1. Protected roots calculation (lines 982-990):

    • The safe_target() might not be the best choice for protection - consider if this aligns with fork choice safety requirements
    • Missing protection for blocks in the current justified chain that aren't yet finalized
  2. Slot boundary handling (line 976):

    • Using <= cutoff_slot is correct, but ensure this doesn't accidentally prune the genesis block (slot 0) if it becomes relevant

Performance & Implementation Issues

  1. Inefficient iteration pattern (lines 1030-1044, 1113-1127):

    • Both prune_states and prune_old_blocks iterate the entire BlockHeaders table
    • Consider adding a slot-based index or using a more targeted approach
  2. Batch operation inefficiency (lines 1071-1079, 1154-1162):

    • Creating separate batches for each table type could be more efficient as a single batch
    • The clone operations on keys_to_delete are unnecessary overhead

Minor Issues

  1. Warning suppression (line 1040):

    • The warning for failed header decoding might hide important corruption issues
    • Consider logging the actual error or the problematic key
  2. Magic number documentation (line 117):

    • MAX_UNFINALIZED_SLOTS = 1024 should be documented in terms of epochs (32 slots/epoch = 32 epochs)

Suggested Fixes

// 1. Use a single transaction for atomicity
let mut batch = self.backend.begin_write().expect("write batch");
let view = batch.read_view().expect("read view");

// 2. Handle potential decode errors gracefully
let root = match H256::from_ssz_bytes(&key_bytes) {
    Ok(r) => r,
    Err(_) => {
        warn!("Invalid root bytes in database");
        continue;
    }
};

// 3. Implement missing pruning functions
fn prune_live_chain(&mut self, cutoff_slot: u64) -> usize {
    // Implementation needed
    0
}

The core concept is sound, but these issues need resolution before merging.


Automated review by Kimi (Moonshot AI) · custom prompt


github-actions bot commented Mar 2, 2026

🤖 Codex Code Review

  1. High: safety-net pruning can invalidate fork-choice inputs during non-finality.
    self.prune_live_chain(cutoff_slot) is called in safety-net mode without protecting justified/head roots or required ancestry (crates/storage/src/store.rs:1003, crates/storage/src/store.rs:438).
    In prolonged non-finality, latest_justified can lag; pruning LiveChain by slot can drop the justified checkpoint/path that LMD GHOST depends on, causing incorrect or unstable head computation.

  2. High: off-by-one inconsistency leaves dangling LiveChain entries at cutoff_slot.
    State/block pruning uses <= cutoff_slot (crates/storage/src/store.rs:1042, crates/storage/src/store.rs:1077), but prune_live_chain removes only < finalized_slot (crates/storage/src/store.rs:449).
    With safety-net call-site passing cutoff_slot, entries at exactly cutoff_slot remain in LiveChain while corresponding block/state may be deleted, creating internal inconsistency.

  3. Medium: pruning is heavy and runs synchronously once per slot before proposing.
    self.store.safety_net_prune() runs at interval 0 (crates/blockchain/src/lib.rs:171) and does multiple full-table scans, including two scans over BlockHeaders (crates/storage/src/store.rs:1032, crates/storage/src/store.rs:1068).
    On large DBs this can delay slot duties (proposal/attestation timing), which is a liveness/perf risk.

  4. Low: panic-on-decode in pruning path can crash node on malformed DB entries.
    H256::from_ssz_bytes(&key_bytes).expect("valid root") appears in pruning loops (crates/storage/src/store.rs:1043, crates/storage/src/store.rs:1078).
    Even if corruption is rare, this makes a maintenance operation a hard crash path; prefer fallible handling with warning + skip.

Overall direction (adding an OOM safety net) is reasonable, but the current LiveChain pruning semantics look consensus-risky unless bounded by justified/fork-choice requirements and made boundary-consistent.


Automated review by OpenAI Codex · custom prompt


github-actions bot commented Mar 2, 2026

🤖 Claude Code Review



PR #175 Review: Safety-net pruning to prevent OOM

The PR addresses a real and critical problem — unbounded state accumulation causing OOM. The overall approach (cutoff-based pruning with protected roots) is sound. However, there are several correctness and robustness issues worth addressing.


Correctness Issues

Off-by-one inconsistency between pruning methods

prune_states and prune_old_blocks delete entries where header.slot <= cutoff_slot (inclusive), but prune_live_chain — called with the same cutoff_slot — uses slot < cutoff_slot (exclusive), because it was designed around finalized_slot semantics where the boundary slot is retained.

Result: at exactly cutoff_slot, the LiveChain entries survive while the corresponding BlockHeaders and States are deleted. Fork choice traversal through LiveChain could encounter these zombie entries pointing to blocks/states that no longer exist.

Fix: either change prune_states/prune_old_blocks to use < cutoff_slot (keeping cutoff_slot data), or call prune_live_chain(cutoff_slot + 1) to make it delete up to and including cutoff_slot. Given the intent is to retain 1024 slots, < cutoff_slot is semantically cleaner.
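
Both options can be stated as predicates; this sketch (names are illustrative) checks that option B's shifted bound deletes exactly the same slots as option A would with an inclusive `<= cutoff`:

```rust
/// Option A: use `<` everywhere, so slot == cutoff survives in every table.
fn delete_a(slot: u64, cutoff: u64) -> bool {
    slot < cutoff
}

/// Option B: keep `<=` in block/state pruning...
fn delete_blocks_b(slot: u64, cutoff: u64) -> bool {
    slot <= cutoff
}

/// ...and call the `<`-based live-chain pruning with `cutoff + 1`,
/// so both paths delete through `cutoff` with no dangling entries.
fn delete_live_chain_b(slot: u64, cutoff: u64) -> bool {
    slot < cutoff + 1
}
```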

Non-atomic state and block pruning

prune_states commits its write batch, then prune_old_blocks opens a new write batch. If the process crashes between those two commits, the node will have States deleted for blocks that still exist in BlockHeaders. On restart, fork choice would call get_state(block_root) for those blocks and get None, which would likely panic or cause hard errors.

These two should share a single write batch committed together, ideally merged with the single scan noted below.


Performance: Duplicated Full Table Scan

prune_states and prune_old_blocks both do a full iteration over all of BlockHeaders with identical filtering logic (slot <= cutoff_slot && !protected_roots.contains(&root)). This is two separate read transactions and two full scans of what could be a large table.

They should be merged into a single scan that collects keys once and dispatches to a single write batch covering all four tables (States, BlockHeaders, BlockBodies, BlockSignatures). This would also fix the non-atomic issue above.

// Conceptual merge:
fn prune_states_and_blocks(&mut self, cutoff_slot: u64, protected: &HashSet<H256>) -> usize {
    let view = self.backend.begin_read().expect("read view");
    let mut keys_to_delete = vec![];
    for (key_bytes, value_bytes) in view.prefix_iterator(Table::BlockHeaders, &[])
        .expect("iterator").filter_map(|r| r.ok())
    {
        let Some(header) = BlockHeader::from_ssz_bytes(&value_bytes).ok() else { continue; };
        if header.slot < cutoff_slot {  // use < for consistency with prune_live_chain
            match H256::from_ssz_bytes(&key_bytes) {
                Ok(root) if !protected.contains(&root) => keys_to_delete.push(key_bytes.to_vec()),
                _ => continue,
            }
        }
    }
    drop(view);
    let count = keys_to_delete.len();
    if count > 0 {
        let mut batch = self.backend.begin_write().expect("write batch");
        batch.delete_batch(Table::States, keys_to_delete.clone()).expect("delete states");
        batch.delete_batch(Table::BlockHeaders, keys_to_delete.clone()).expect("delete headers");
        batch.delete_batch(Table::BlockBodies, keys_to_delete.clone()).expect("delete bodies");
        batch.delete_batch(Table::BlockSignatures, keys_to_delete).expect("delete sigs");
        batch.commit().expect("commit");
    }
    count
}

Robustness: Panic on Malformed DB Key

In both prune_states and prune_old_blocks:

let root = H256::from_ssz_bytes(&key_bytes).expect("valid root");

If any BlockHeaders key is not exactly 32 bytes (DB corruption, schema change, etc.), this panics the process. The header decode just above it correctly uses .ok() to skip errors, but the key decode does not. This should be changed to:

let Ok(root) = H256::from_ssz_bytes(&key_bytes) else {
    warn!("Skipping malformed key in BlockHeaders during safety-net pruning");
    continue;
};

Minor: Incomplete Log Condition

The log trigger condition:

if pruned_states > 0 || pruned_blocks > 0 || pruned_chain > 0 {
    info!(..., pruned_sigs, pruned_att_data, "Safety-net pruning: finalization stalled");
}

pruned_sigs and pruned_att_data are included in the log fields but not in the trigger condition. If only signatures or attestation data are pruned, no log is emitted. Also, prune_aggregated_payload_table returns nothing and contributes no count to the log. For observability, the condition should include all tables or at minimum be documented as intentional.
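
A minimal sketch of the suggested fix: derive the trigger from every per-table count so partial pruning is never silent (the counts order is illustrative, not the PR's field layout):

```rust
/// Emit the summary log iff any table pruned at least one entry.
fn should_log(pruned_counts: &[usize]) -> bool {
    pruned_counts.iter().any(|&c| c > 0)
}
```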


What Looks Good

  • The cutoff formula finalized_slot.max(head_slot.saturating_sub(MAX_UNFINALIZED_SLOTS)) is correct — no underflow, correct no-op behavior when finalization is healthy.
  • The early-return if cutoff_slot <= finalized_slot correctly avoids redundant work in the healthy case.
  • Protected-roots set (head, latest_finalized, latest_justified, safe_target) correctly prevents pruning data needed for active consensus.
  • Placement in on_tick at interval 0, before block proposal, is safe — the head is protected.
  • States confirmed to be keyed by block root (not state root), so the key reuse between BlockHeaders and States in prune_states is valid.
  • Three-table block deletion in prune_old_blocks is atomic (single batch, one commit).

Summary of Action Items

| Priority | Item |
| --- | --- |
| High | Merge prune_states + prune_old_blocks into a single scan + single atomic write batch |
| High | Fix off-by-one: align prune_live_chain boundary with prune_states/prune_old_blocks (use < throughout, or pass cutoff_slot + 1 to prune_live_chain) |
| Medium | Replace .expect("valid root") on key decode with graceful continue + warn! |
| Low | Include all tables in log trigger condition, or add prune_aggregated_payload_table count to log |

Automated review by Claude (Anthropic) · custom prompt

Contributor

greptile-apps bot commented Mar 2, 2026

Greptile Summary

This PR implements safety-net pruning to prevent OOM when finalization is stalled by limiting unfinalized data retention to a 1024-slot window (~68 minutes). The implementation adds pruning for States, BlockHeaders, BlockBodies, and BlockSignatures—tables that previously had no pruning mechanism at all.

Key changes:

  • Computes cutoff_slot = max(finalized_slot, head_slot - 1024) to determine pruning threshold
  • Protects 4 critical roots: head, latest_finalized, latest_justified, and safe_target
  • Prunes once per slot at interval 0 in BlockChainServer::on_tick
  • No-op when finalization is healthy (cutoff equals finalized_slot)

Critical issue identified:
prune_live_chain doesn't respect protected roots, which could break fork choice if justification stalls for >1024 slots. Fork choice starts from the justified checkpoint and requires its LiveChain entry to look up parent relationships. Without this entry, fork choice fails to compute the head.

Minor issue:
Inconsistent error handling between prune_states (warns on decode failure) and prune_old_blocks (silent).

Confidence Score: 3/5

  • Safe to merge with moderate risk - solves OOM for common finalization stalls, but has edge case bug
  • The implementation correctly handles the common case (finalization stalled, justification advancing) and includes proper protected root checks for block/state pruning. However, LiveChain pruning doesn't respect protected roots, creating a bug when both finalization and justification stall for >1024 slots. This edge case would break fork choice completely. Given the severity but low likelihood of the edge case, score is 3/5.
  • Pay close attention to crates/storage/src/store.rs - specifically the prune_live_chain call on line 1003 needs protection for justified/safe checkpoints

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/storage/src/store.rs | Adds safety-net pruning methods (safety_net_prune, prune_states, prune_old_blocks) to prevent OOM when finalization stalls. Potential issue: prune_live_chain doesn't respect protected roots, which could break fork choice if justified_slot < cutoff_slot. |
| crates/blockchain/src/lib.rs | Calls safety_net_prune() once per slot at interval 0 to prevent OOM during stalled finalization. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[BlockChainServer::on_tick<br/>interval=0] --> B[safety_net_prune]
    B --> C{Calculate cutoff_slot<br/>max finalized_slot,<br/>head_slot - 1024}
    C --> D{cutoff_slot <= finalized_slot?}
    D -->|Yes| E[Return early<br/>Finalization healthy]
    D -->|No| F[Build protected_roots set<br/>head, finalized, justified, safe]
    F --> G[prune_states cutoff_slot, protected_roots]
    F --> H[prune_old_blocks cutoff_slot, protected_roots]
    F --> I[prune_live_chain cutoff_slot]
    F --> J[prune_gossip_signatures cutoff_slot]
    F --> K[prune_attestation_data cutoff_slot]
    F --> L[prune_aggregated_payloads cutoff_slot]
    G --> M[Iterate BlockHeaders<br/>Find slot <= cutoff_slot<br/>Not in protected_roots]
    M --> N[Delete from States table]
    H --> O[Iterate BlockHeaders<br/>Find slot <= cutoff_slot<br/>Not in protected_roots]
    O --> P[Delete from BlockHeaders<br/>BlockBodies, BlockSignatures]
    I --> Q[Delete LiveChain entries<br/>slot < cutoff_slot<br/>⚠️ No protected_roots check]
    N --> R[Log pruned counts]
    P --> R
    Q --> R
    J --> R
    K --> R
    L --> R

Last reviewed commit: be95754


@greptile-apps greptile-apps bot left a comment


2 files reviewed, 2 comments



let pruned_states = self.prune_states(cutoff_slot, &protected_roots);
let pruned_blocks = self.prune_old_blocks(cutoff_slot, &protected_roots);
let pruned_chain = self.prune_live_chain(cutoff_slot);

prune_live_chain doesn't respect protected_roots, which could break fork choice if justified_slot < cutoff_slot.

When justified_slot < cutoff_slot (e.g., justified at slot 500, cutoff at 976), the LiveChain entry for the justified checkpoint gets pruned. Fork choice starts from the justified root and requires its LiveChain entry to look up (slot, parent_root) (see fork_choice/src/lib.rs:52-54). Without this entry, blocks.get(&start_root) returns None, causing fork choice to fail.

This scenario occurs when both finalization and justification are stalled for >1024 slots (~68 minutes). While less common than finalization-only stalls, it's possible with severe network issues or insufficient validator participation.

Protected roots should either:

  1. Be checked in prune_live_chain (skip pruning LiveChain entries for protected roots), or
  2. Use a separate cutoff that ensures justified/safe always remain in LiveChain (e.g., max(finalized_slot, justified_slot.saturating_sub(1024)))
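
Option 2 as a sketch, with the formula taken verbatim from the comment (function name hypothetical):

```rust
/// Live-chain cutoff that always keeps the justified checkpoint's entry:
/// the retention window is anchored at the justified slot instead of the head,
/// so the cutoff can never pass the justified checkpoint.
fn live_chain_cutoff(finalized_slot: u64, justified_slot: u64) -> u64 {
    finalized_slot.max(justified_slot.saturating_sub(1024))
}
```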

Comment on lines +1073 to +1075
let Some(header) = BlockHeader::from_ssz_bytes(&value_bytes).ok() else {
continue;
};

Silently continues on decode failure, inconsistent with prune_states:1038 which warns. Add logging for consistency:

Suggested change
let Some(header) = BlockHeader::from_ssz_bytes(&value_bytes).ok() else {
continue;
};
let Some(header) = BlockHeader::from_ssz_bytes(&value_bytes).ok() else {
warn!("Failed to decode block header during safety-net pruning");
continue;
};

@MegaRedHand
Collaborator

The safety net should only prune the LiveChain, and I think we should review other solutions before choosing this one

@MegaRedHand MegaRedHand closed this Mar 2, 2026
@MegaRedHand MegaRedHand deleted the safety-net-pruning branch March 2, 2026 20:24

Development

Successfully merging this pull request may close these issues.

Prune finalized states and old blocks
