feat: verify and repair evodb diffs automatically at node startup #6999

UdjinM6 · 2025-11-21T09:58:53Z

Issue being fixed or feature implemented

Automatically verify and repair deterministic masternode list diffs in evodb during node startup. Helps detect and fix database corruption without manual intervention.

What was done?

Add DB_LIST_REPAIRED marker to track when repair is complete
Skip repair during reindex (fresh rebuild)
Add -forceevodbrepair flag to force re-verification
Shut down gracefully on critical errors with user instructions
Run in background thread with minimal cs_main locking
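
The items above imply a three-way decision at startup. A minimal sketch of that decision follows, assuming (as the review summary later in this thread suggests) that a reindex takes precedence over the force flag. The enum and function names are hypothetical and do not come from the PR.

```cpp
#include <cassert>

// Hypothetical names for illustration only; they do not appear in the PR.
enum class RepairAction {
    MarkRepairedWithoutCheck, // reindex rebuilds everything, so only set the marker
    Skip,                     // marker already present and no force flag given
    VerifyAndRepair,          // first run, or -forceevodbrepair was passed
};

// Decide what the startup code should do with the evodb diffs.
RepairAction DecideRepairAction(bool reindexing, bool already_repaired, bool force_repair)
{
    if (reindexing) return RepairAction::MarkRepairedWithoutCheck;
    if (already_repaired && !force_repair) return RepairAction::Skip;
    return RepairAction::VerifyAndRepair;
}
```

Under these assumptions, a node that already carries the marker only re-runs verification when -forceevodbrepair is set.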

How Has This Been Tested?

Run a node, check logs

Breaking Changes

n/a

Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have added or updated relevant unit/integration/functional/e2e tests
I have made corresponding changes to the documentation
I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

github-actions · 2025-11-21T09:59:34Z

✅ No Merge Conflicts Detected

This PR currently has no conflicts with other open PRs.

Automatically verify and repair deterministic masternode list diffs in evodb during node startup. Helps detect and fix database corruption without manual intervention.

- Add DB_LIST_REPAIRED marker to track when repair is complete
- Skip repair during reindex (fresh rebuild)
- Add -forceevodbrepair flag to force re-verification
- Shutdown gracefully on critical errors with user instructions
- Run in background thread with minimal cs_main locking

Co-Authored-By: Claude Code (Anthropic) <[email protected]>

coderabbitai · 2025-11-21T18:11:57Z

Walkthrough

Three files are modified to introduce a repair-tracking mechanism for the Evo masternode list database. A new persistent repair flag (DB_LIST_REPAIRED) is added alongside two public methods—IsRepaired() and CompleteRepair()—that query and persist repair status. The startup initialization flow is restructured to conditionally perform repair based on reindexing state, prior repair completion, and a new -forceevodbrepair command-line argument. When repair is needed, a callback-based workflow rebuilds the list from special transactions, recalculates diffs, handles verification and repair errors, and marks completion upon success.
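
The persistent flag described above can be pictured as a single key whose presence means "repair completed". The sketch below uses an in-memory map as a stand-in for the Evo database, so FakeEvoDb and RepairTracker are hypothetical; only the IsRepaired()/CompleteRepair() names and the "dmn_R1" key are taken from this thread.

```cpp
#include <cassert>
#include <map>
#include <string>

// In-memory stand-in for the Evo database; the real CEvoDB API is not reproduced here.
class FakeEvoDb {
public:
    bool Exists(const std::string& key) const { return m_data.count(key) > 0; }
    void Write(const std::string& key, bool value) { m_data[key] = value; }

private:
    std::map<std::string, bool> m_data;
};

// Key name as quoted in the review; its presence is the whole signal.
static const std::string DB_LIST_REPAIRED = "dmn_R1";

// Hypothetical wrapper exposing the two methods named in the walkthrough.
class RepairTracker {
public:
    explicit RepairTracker(FakeEvoDb& db) : m_db(db) {}

    // Pure flag lookup; it does not re-check the stored diffs.
    bool IsRepaired() const { return m_db.Exists(DB_LIST_REPAIRED); }

    // Persist the marker so later startups can skip the repair pass.
    void CompleteRepair() { m_db.Write(DB_LIST_REPAIRED, true); }

private:
    FakeEvoDb& m_db;
};
```

Because the marker lives in the database rather than in the tracker object, a second tracker built over the same database sees the flag as set.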

Sequence Diagram

sequenceDiagram
    participant Startup as Startup Flow
    participant MNMgr as CDeterministicMNManager
    participant EvoDB as EvoDB (Database)
    participant ChainHelper as ChainHelper
    participant SpecialTx as SpecialTxMan

    Startup->>MNMgr: IsRepaired()?
    MNMgr->>EvoDB: Check DB_LIST_REPAIRED flag
    EvoDB-->>MNMgr: Flag status
    MNMgr-->>Startup: Repaired status

    alt Already Repaired & !forceevodbrepair
        Startup->>Startup: Skip repair, log message
    else Reindexing in Progress
        Startup->>MNMgr: CompleteRepair()
        MNMgr->>EvoDB: Write DB_LIST_REPAIRED
        MNMgr->>EvoDB: Commit & flush
    else Repair Needed
        Startup->>ChainHelper: Build list via callback
        ChainHelper->>SpecialTx: RebuildListFromBlock
        SpecialTx-->>ChainHelper: Rebuilt list
        ChainHelper-->>Startup: List built

        Startup->>MNMgr: RecalculateAndRepairDiffs(callback)
        MNMgr->>MNMgr: Verify & repair diffs

        alt Verification Errors
            MNMgr-->>Startup: verification_errors
            Startup->>Startup: Log errors
        end

        alt Repair Errors
            MNMgr-->>Startup: repair_errors
            Startup->>Startup: Log critical, suggest reindex
            Startup->>Startup: Initiate shutdown
        else Success
            Startup->>MNMgr: CompleteRepair()
            MNMgr->>EvoDB: Write DB_LIST_REPAIRED
            MNMgr->>EvoDB: Commit & flush
            Startup->>Startup: Log summary
        end
    end
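
The two error branches in the diagram lead to different outcomes: verification errors are only logged, while repair errors stop the node. A minimal sketch of that split follows; RepairResult, Outcome, HandleRepairResult and the log strings are hypothetical, since the PR's actual types and messages are not shown in this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type; the PR's actual return type is not shown here.
struct RepairResult {
    std::vector<std::string> verification_errors; // mismatches found while checking diffs
    std::vector<std::string> repair_errors;       // diffs that could not be fixed
};

enum class Outcome { MarkRepaired, Shutdown };

// Log everything, then decide whether startup may continue.
Outcome HandleRepairResult(const RepairResult& result, std::vector<std::string>& log)
{
    for (const auto& err : result.verification_errors) {
        log.push_back("warning: " + err);
    }
    if (!result.repair_errors.empty()) {
        for (const auto& err : result.repair_errors) {
            log.push_back("critical: " + err);
        }
        // Illustrative message; the real wording may differ.
        log.push_back("critical: evodb repair failed, restart with -reindex");
        return Outcome::Shutdown;
    }
    return Outcome::MarkRepaired;
}
```

In this sketch a run with only verification errors still ends in MarkRepaired, which matches the diagram's Success branch being reachable after the first alt block.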

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

src/init.cpp: The restructured repair flow contains multiple conditional branches, error-handling paths, and new integrations with chain helper and special transaction managers. The logic density and control flow changes warrant careful tracing.
src/evo/deterministicmns.cpp: Implementation of transaction-based database persistence with error handling and assertions requires verification of DB state management and flush semantics.
Error handling paths: Multiple failure scenarios (verification errors, repair errors) lead to different outcomes, including shutdown; these require thorough review to ensure correctness and user-facing messaging.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main feature: automatic verification and repair of evodb diffs at node startup.
Description check	✅ Passed	The description is clearly related to the changeset, explaining the issue, implementation details, and testing approach.


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

src/evo/deterministicmns.h (1)

743-749: New repair-status API on CDeterministicMNManager looks reasonable

The addition of IsRepaired() and CompleteRepair() is consistent with the cpp implementation and the startup repair workflow; public exposure here is appropriate.

You may want to document (brief comment) that these are intended to be called only from startup/maintenance paths and that IsRepaired() is purely a DB flag check, not a live consistency check, but that's optional.

src/evo/deterministicmns.cpp (1)

1694-1706: IsRepaired/CompleteRepair semantics and error handling

IsRepaired() as a thin m_evoDb.Exists(DB_LIST_REPAIRED) wrapper is fine for a durable “have we ever completed repair” marker and matches how it’s used at startup.

In CompleteRepair():

The transactional write + CommitRootTransaction() pattern is consistent with other Evo DB upgrade flows.

Using assert(false) on commit failure is stricter than the migration path (which returns false). Given this is only called from startup/reindex paths, this is probably acceptable, but you might consider returning a bool and letting the caller log and initiate shutdown instead of hard-asserting, for consistency with MigrateLegacyDiffs() and to keep behavior uniform across DB-maintenance routines.
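
The bool-returning alternative mentioned above could look roughly like the following sketch. FakeEvoDb and FinishRepair are hypothetical stand-ins, and only the CompleteRepair and CommitRootTransaction names come from this thread; this is not the code in the PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the Evo database transaction layer (hypothetical, not CEvoDB).
struct FakeEvoDb {
    bool commit_should_succeed = true;
    bool marker_pending = false;
    bool marker_committed = false;

    void WriteMarker() { marker_pending = true; }

    bool CommitRootTransaction()
    {
        if (!commit_should_succeed) {
            marker_pending = false; // discard the uncommitted write
            return false;
        }
        marker_committed = marker_pending;
        marker_pending = false;
        return true;
    }
};

// Reports a commit failure to the caller instead of asserting.
[[nodiscard]] bool CompleteRepair(FakeEvoDb& db)
{
    db.WriteMarker();
    return db.CommitRootTransaction();
}

// The caller decides how to react; here it logs and requests shutdown.
bool FinishRepair(FakeEvoDb& db, std::vector<std::string>& log, bool& shutdown_requested)
{
    if (!CompleteRepair(db)) {
        log.push_back("failed to commit evodb repair marker");
        shutdown_requested = true;
        return false;
    }
    return true;
}
```

The trade-off is the one the comment describes: an assert stops the process at the point of failure, while a return value lets the startup code log and shut down in the same way as other DB-maintenance routines.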

src/init.cpp (1)

2414-2460: Startup EvoDB repair flow is well-placed; consider a couple of small robustness tweaks

The overall flow in the loadblk thread looks solid:

Reindex/-reindex-chainstate: you explicitly skip repair and mark the DB as repaired via CompleteRepair(), which avoids doing heavy work while the entire history is being rebuilt.

Normal startup:

If IsRepaired() and -forceevodbrepair=0, you skip repair and just log once.

Otherwise you:

Snapshot start_index at DIP0003Height and stop_index at tip under cs_main.

Run RecalculateAndRepairDiffs with a build_list_func that delegates to chain_helper->special_tx->RebuildListFromBlock, reusing the already-validated special-tx logic.

Log verification errors as warnings, treat any repair_errors as fatal (log + StartShutdown()), and on success call CompleteRepair() and log a concise summary with timing via Ticks().

This keeps cs_main locking minimal and runs before network connections are started, so there is no concurrent chain mutation during the repair.
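
The "snapshot under the lock, work outside it" pattern described above can be sketched as follows. ChainState, HeightRange and both functions are hypothetical; the real code works with block index pointers under cs_main rather than plain heights.

```cpp
#include <cassert>
#include <mutex>
#include <optional>

// Hypothetical stand-in for chain state guarded by cs_main.
struct ChainState {
    std::mutex cs_main;
    int tip_height = 0;
};

struct HeightRange {
    int start; // activation height (DIP0003Height in the review text)
    int stop;  // tip height at the time of the snapshot
};

// Holds the lock only long enough to copy the two heights.
std::optional<HeightRange> SnapshotRepairRange(ChainState& chain, int activation_height)
{
    std::lock_guard<std::mutex> lock(chain.cs_main);
    if (activation_height >= chain.tip_height) return std::nullopt; // nothing to verify
    return HeightRange{activation_height, chain.tip_height};
}

// Runs without the lock; returns how many block steps lie between start and stop.
int CountHeightsToVerify(const HeightRange& range)
{
    return range.stop - range.start;
}
```

Returning an empty optional when the tip is at or below the activation height gives the caller an explicit "no work" signal.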

Two optional improvements you might consider:

Handle the “no work to do” case explicitly

If start_index is null or start_index->nHeight >= stop_index->nHeight (e.g. chain height is below DIP0003 activation on a fresh dev/regtest setup), the code silently does nothing and never sets the repaired flag. That’s not incorrect, but it means the check will rerun on every startup.

You could treat this as a trivially successful repair, e.g.:

             if (start_index && stop_index && start_index->nHeight < stop_index->nHeight) {
                 // existing verification/repair logic...
-            }
+            } else {
+                LogPrintf("No masternode list diffs to verify in current height range; marking EvoDB as repaired\n");
+                node.dmnman->CompleteRepair();
+            }

Defensive check for chain_helper/special_tx

The lambda assumes node.chain_helper and node.chain_helper->special_tx are always initialized. That’s true in the normal node path after LoadChainstate, but if any future refactoring ever makes this optional, an assert or explicit check with a clear log message here would make failures easier to diagnose.
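
One way to express such a check is sketched below, using simplified stand-ins for the node context. The struct definitions, CheckRepairDependencies and the message strings are all hypothetical; only the chain_helper and special_tx member names come from this thread.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-ins; the real types carry far more state.
struct SpecialTxMan {};
struct ChainHelper {
    std::unique_ptr<SpecialTxMan> special_tx;
};
struct NodeContext {
    std::unique_ptr<ChainHelper> chain_helper;
};

// Returns an empty string when both dependencies are present,
// otherwise a diagnostic naming the missing one.
std::string CheckRepairDependencies(const NodeContext& node)
{
    if (!node.chain_helper) return "chain_helper is not initialized";
    if (!node.chain_helper->special_tx) return "chain_helper->special_tx is not initialized";
    return "";
}
```

A caller could log the returned message and skip or abort the repair rather than dereferencing a null pointer inside the lambda.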

Overall, the repair logic, error handling, and placement in the startup sequence look correct and low-risk.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9faf42 and 2e5fdd8.

📒 Files selected for processing (3)

src/evo/deterministicmns.cpp (2 hunks)
src/evo/deterministicmns.h (1 hunks)
src/init.cpp (3 hunks)

🧰 Additional context used

🧠 Learnings (3)

📚 Learning: 2025-01-07T18:50:44.838Z

Learnt from: knst
Repo: dashpay/dash PR: 6511
File: src/evo/deterministicmns.cpp:1369-1373
Timestamp: 2025-01-07T18:50:44.838Z
Learning: The functions `MigrateDBIfNeeded` and `MigrateDBIfNeeded2` in `src/evo/deterministicmns.cpp` are temporary and will be removed in a future version. Refactoring suggestions for these functions should be avoided.

Applied to files:

src/evo/deterministicmns.h
src/evo/deterministicmns.cpp

📚 Learning: 2025-11-13T20:02:55.480Z

Learnt from: UdjinM6
Repo: dashpay/dash PR: 6969
File: src/evo/deterministicmns.h:441-479
Timestamp: 2025-11-13T20:02:55.480Z
Learning: In `src/evo/deterministicmns.h`, the `internalId` field in `CDeterministicMN` and the `mnInternalIdMap` in `CDeterministicMNList` are non-deterministic and used only for internal bookkeeping and efficient lookups. Different nodes can assign different internalIds to the same masternode depending on their sync history. Methods like `IsEqual()` intentionally ignore internalId mappings and only compare consensus-critical deterministic fields (proTxHash, collateral, state, etc.).

Applied to files:

src/evo/deterministicmns.h
src/evo/deterministicmns.cpp

📚 Learning: 2025-10-28T18:36:40.263Z

Learnt from: kwvg
Repo: dashpay/dash PR: 6923
File: src/test/util/setup_common.cpp:235-251
Timestamp: 2025-10-28T18:36:40.263Z
Learning: In `src/test/util/setup_common.cpp`, the `CEvoDB` instance in `BasicTestingSetup` is constructed with `.memory = true` flag (memory-only mode), so it does not create file handles on disk. This makes the destructor teardown order safe even if `fs::remove_all(m_path_root)` is called before `m_node.evodb.reset()`.

Applied to files:

src/evo/deterministicmns.cpp

🧬 Code graph analysis (2)

src/evo/deterministicmns.h (1)

src/evo/deterministicmns.cpp (4)

IsRepaired (1694-1694)

IsRepaired (1694-1694)

CompleteRepair (1696-1706)

CompleteRepair (1696-1696)

src/init.cpp (2)

src/node/interfaces.cpp (1)

fReindex (623-623)

src/util/time.h (1)

Ticks (81-84)

🔇 Additional comments (3)

src/evo/deterministicmns.cpp (1)

32-36: DB_LIST_REPAIRED key is consistent with existing Evo DB keying

The new "dmn_R1" key for DB_LIST_REPAIRED follows the existing naming/versioning pattern (dmn_S*, dmn_D*) and is scoped to this module; no issues here.

src/init.cpp (2)

79-89: New Evo includes are scoped and appropriate

Adding evo/chainhelper.h and evo/specialtxman.h here matches the new startup repair usage of node.chain_helper->special_tx and doesn’t broaden dependencies beyond init-time wiring.

756-756: -forceevodbrepair argument wiring is correct

The new -forceevodbrepair flag is registered consistently with other debug/test options and its help text clearly indicates behavior (“Force evodb masternode list diff verification and repair on startup, even if already repaired”).

No issues with the definition; it lines up with the logic later in AppInitMain.

PastaPastaPasta

utACK 2e5fdd8

DashCoreAutoGuix · 2025-11-25T14:22:05Z

Guix Automation has begun to build this PR tagged as v23.0.0-devpr6999.2e5fdd83. A new comment will be made when the image is pushed.

DashCoreAutoGuix · 2025-11-25T14:22:54Z

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6999.2e5fdd83. The image should be on dockerhub soon.

UdjinM6 added this to the 23.0.1 milestone Nov 21, 2025

UdjinM6 added the backport-candidate-23.0.x label Nov 21, 2025

UdjinM6 force-pushed the fu_6969_2 branch from 44d6c84 to 2e5fdd8 Compare November 21, 2025 17:59

UdjinM6 marked this pull request as ready for review November 21, 2025 18:04

coderabbitai bot reviewed Nov 21, 2025

UdjinM6 requested review from PastaPastaPasta and knst November 22, 2025 21:06

PastaPastaPasta approved these changes Nov 24, 2025

PastaPastaPasta added the guix-build label Nov 24, 2025
