Skip to content

Conversation

@UdjinM6
Copy link

@UdjinM6 UdjinM6 commented Nov 21, 2025

Issue being fixed or feature implemented

Automatically verify and repair deterministic masternode list diffs in evodb during node startup. Helps detect and fix database corruption without manual intervention.

What was done?

  • Add DB_LIST_REPAIRED marker to track when repair is complete
  • Skip repair during reindex (fresh rebuild)
  • Add -forceevodbrepair flag to force re-verification
  • Shutdown gracefully on critical errors with user instructions
  • Run in background thread with minimal cs_main locking

How Has This Been Tested?

Run a node, check logs

Breaking Changes

n/a

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

@UdjinM6 UdjinM6 added this to the 23.0.1 milestone Nov 21, 2025
@github-actions
Copy link

github-actions bot commented Nov 21, 2025

✅ No Merge Conflicts Detected

This PR currently has no conflicts with other open PRs.

Automatically verify and repair deterministic masternode list diffs
in evodb during node startup. Helps detect and fix database corruption
without manual intervention.

- Add DB_LIST_REPAIRED marker to track when repair is complete
- Skip repair during reindex (fresh rebuild)
- Add -forceevodbrepair flag to force re-verification
- Shutdown gracefully on critical errors with user instructions
- Run in background thread with minimal cs_main locking

Co-Authored-By: Claude Code (Anthropic) <[email protected]>
@UdjinM6 UdjinM6 marked this pull request as ready for review November 21, 2025 18:04
@coderabbitai
Copy link

coderabbitai bot commented Nov 21, 2025

Walkthrough

Three files are modified to introduce a repair-tracking mechanism for the Evo masternode list database. A new persistent repair flag (DB_LIST_REPAIRED) is added alongside two public methods—IsRepaired() and CompleteRepair()—that query and persist repair status. The startup initialization flow is restructured to conditionally perform repair based on reindexing state, prior repair completion, and a new -forceevodbrepair command-line argument. When repair is needed, a callback-based workflow rebuilds the list from special transactions, recalculates diffs, handles verification and repair errors, and marks completion upon success.

Sequence Diagram

sequenceDiagram
    participant Startup as Startup Flow
    participant MNMgr as CDeterministicMNManager
    participant EvoDB as EvoDB (Database)
    participant ChainHelper as ChainHelper
    participant SpecialTx as SpecialTxMan

    Startup->>MNMgr: IsRepaired()?
    MNMgr->>EvoDB: Check DB_LIST_REPAIRED flag
    EvoDB-->>MNMgr: Flag status
    MNMgr-->>Startup: Repaired status

    alt Already Repaired & !forceevodbrepair
        Startup->>Startup: Skip repair, log message
    else Reindexing in Progress
        Startup->>MNMgr: CompleteRepair()
        MNMgr->>EvoDB: Write DB_LIST_REPAIRED
        MNMgr->>EvoDB: Commit & flush
    else Repair Needed
        Startup->>ChainHelper: Build list via callback
        ChainHelper->>SpecialTx: RebuildListFromBlock
        SpecialTx-->>ChainHelper: Rebuilt list
        ChainHelper-->>Startup: List built

        Startup->>MNMgr: RecalculateAndRepairDiffs(callback)
        MNMgr->>MNMgr: Verify & repair diffs

        alt Verification Errors
            MNMgr-->>Startup: verification_errors
            Startup->>Startup: Log errors
        end

        alt Repair Errors
            MNMgr-->>Startup: repair_errors
            Startup->>Startup: Log critical, suggest reindex
            Startup->>Startup: Initiate shutdown
        else Success
            Startup->>MNMgr: CompleteRepair()
            MNMgr->>EvoDB: Write DB_LIST_REPAIRED
            MNMgr->>EvoDB: Commit & flush
            Startup->>Startup: Log summary
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • src/init.cpp: The restructured repair flow contains multiple conditional branches, error-handling paths, and new integrations with chain helper and special transaction managers. The logic density and control flow changes warrant careful tracing.
  • src/evo/deterministicmns.cpp: Implementation of transaction-based database persistence with error handling and assertions requires verification of DB state management and flush semantics.
  • Error handling paths: Multiple failure scenarios (verification errors, repair errors) lead to different outcomes, including shutdown; these require thorough review to ensure correctness and user-facing messaging.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main feature: automatic verification and repair of evodb diffs at node startup.
Description check ✅ Passed The description is clearly related to the changeset, explaining the issue, implementation details, and testing approach.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/evo/deterministicmns.h (1)

743-749: New repair-status API on CDeterministicMNManager looks reasonable

The addition of IsRepaired() and CompleteRepair() is consistent with the cpp implementation and the startup repair workflow; public exposure here is appropriate.

You may want to document (brief comment) that these are intended to be called only from startup/maintenance paths and that IsRepaired() is purely a DB flag check, not a live consistency check, but that's optional.

src/evo/deterministicmns.cpp (1)

1694-1706: IsRepaired/CompleteRepair semantics and error handling

IsRepaired() as a thin m_evoDb.Exists(DB_LIST_REPAIRED) wrapper is fine for a durable “have we ever completed repair” marker and matches how it’s used at startup.

In CompleteRepair():

  • The transactional write + CommitRootTransaction() pattern is consistent with other Evo DB upgrade flows.
  • Using assert(false) on commit failure is stricter than the migration path (which returns false). Given this is only called from startup/reindex paths, this is probably acceptable, but you might consider returning a bool and letting the caller log and initiate shutdown instead of hard-asserting, for consistency with MigrateLegacyDiffs() and to keep behavior uniform across DB-maintenance routines.
src/init.cpp (1)

2414-2460: Startup EvoDB repair flow is well-placed; consider a couple of small robustness tweaks

The overall flow in the loadblk thread looks solid:

  • Reindex/-reindex-chainstate: you explicitly skip repair and mark the DB as repaired via CompleteRepair(), which avoids doing heavy work while the entire history is being rebuilt.
  • Normal startup:
    • If IsRepaired() and -forceevodbrepair=0, you skip repair and just log once.
    • Otherwise you:
      • Snapshot start_index at DIP0003Height and stop_index at tip under cs_main.
      • Run RecalculateAndRepairDiffs with a build_list_func that delegates to chain_helper->special_tx->RebuildListFromBlock, reusing the already-validated special-tx logic.
      • Log verification errors as warnings, treat any repair_errors as fatal (log + StartShutdown()), and on success call CompleteRepair() and log a concise summary with timing via Ticks().

This keeps cs_main locking minimal and runs before network connections are started, so there is no concurrent chain mutation during the repair.

Two optional improvements you might consider:

  1. Handle the “no work to do” case explicitly

If start_index is null or start_index->nHeight >= stop_index->nHeight (e.g. chain height is below DIP0003 activation on a fresh dev/regtest setup), the code silently does nothing and never sets the repaired flag. That’s not incorrect, but it means the check will rerun on every startup.

You could treat this as a trivially-successful repair, e.g.:

-            if (start_index && stop_index && start_index->nHeight < stop_index->nHeight) {
+            if (start_index && stop_index && start_index->nHeight < stop_index->nHeight) {
                 // existing verification/repair logic...
-            }
+            } else {
+                LogPrintf("No masternode list diffs to verify in current height range; marking EvoDB as repaired\n");
+                node.dmnman->CompleteRepair();
+            }
  1. Defensive check for chain_helper/special_tx

The lambda assumes node.chain_helper and node.chain_helper->special_tx are always initialized. That’s true in the normal node path after LoadChainstate, but if any future refactoring ever makes this optional, an assert or explicit check with a clear log message here would make failures easier to diagnose.

Overall, the repair logic, error handling, and placement in the startup sequence look correct and low-risk.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9faf42 and 2e5fdd8.

📒 Files selected for processing (3)
  • src/evo/deterministicmns.cpp (2 hunks)
  • src/evo/deterministicmns.h (1 hunks)
  • src/init.cpp (3 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-01-07T18:50:44.838Z
Learnt from: knst
Repo: dashpay/dash PR: 6511
File: src/evo/deterministicmns.cpp:1369-1373
Timestamp: 2025-01-07T18:50:44.838Z
Learning: The functions `MigrateDBIfNeeded` and `MigrateDBIfNeeded2` in `src/evo/deterministicmns.cpp` are temporary and will be removed in a future version. Refactoring suggestions for these functions should be avoided.

Applied to files:

  • src/evo/deterministicmns.h
  • src/evo/deterministicmns.cpp
📚 Learning: 2025-11-13T20:02:55.480Z
Learnt from: UdjinM6
Repo: dashpay/dash PR: 6969
File: src/evo/deterministicmns.h:441-479
Timestamp: 2025-11-13T20:02:55.480Z
Learning: In `src/evo/deterministicmns.h`, the `internalId` field in `CDeterministicMN` and the `mnInternalIdMap` in `CDeterministicMNList` are non-deterministic and used only for internal bookkeeping and efficient lookups. Different nodes can assign different internalIds to the same masternode depending on their sync history. Methods like `IsEqual()` intentionally ignore internalId mappings and only compare consensus-critical deterministic fields (proTxHash, collateral, state, etc.).

Applied to files:

  • src/evo/deterministicmns.h
  • src/evo/deterministicmns.cpp
📚 Learning: 2025-10-28T18:36:40.263Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6923
File: src/test/util/setup_common.cpp:235-251
Timestamp: 2025-10-28T18:36:40.263Z
Learning: In `src/test/util/setup_common.cpp`, the `CEvoDB` instance in `BasicTestingSetup` is constructed with `.memory = true` flag (memory-only mode), so it does not create file handles on disk. This makes the destructor teardown order safe even if `fs::remove_all(m_path_root)` is called before `m_node.evodb.reset()`.

Applied to files:

  • src/evo/deterministicmns.cpp
🧬 Code graph analysis (2)
src/evo/deterministicmns.h (1)
src/evo/deterministicmns.cpp (4)
  • IsRepaired (1694-1694)
  • IsRepaired (1694-1694)
  • CompleteRepair (1696-1706)
  • CompleteRepair (1696-1696)
src/init.cpp (2)
src/node/interfaces.cpp (1)
  • fReindex (623-623)
src/util/time.h (1)
  • Ticks (81-84)
🔇 Additional comments (3)
src/evo/deterministicmns.cpp (1)

32-36: DB_LIST_REPAIRED key is consistent with existing Evo DB keying

The new "dmn_R1" key for DB_LIST_REPAIRED follows the existing naming/versioning pattern (dmn_S*, dmn_D*) and is scoped to this module; no issues here.

src/init.cpp (2)

79-89: New Evo includes are scoped and appropriate

Adding evo/chainhelper.h and evo/specialtxman.h here matches the new startup repair usage of node.chain_helper->special_tx and doesn’t broaden dependencies beyond init-time wiring.


756-756: -forceevodbrepair argument wiring is correct

The new -forceevodbrepair flag is registered consistently with other debug/test options and its help text clearly indicates behavior (“Force evodb masternode list diff verification and repair on startup, even if already repaired”).

No issues with the definition; it lines up with the logic later in AppInitMain.

Copy link
Member

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 2e5fdd8

@DashCoreAutoGuix
Copy link

Guix Automation has began to build this PR tagged as v23.0.0-devpr6999.2e5fdd83. A new comment will be made when the image is pushed.

@DashCoreAutoGuix
Copy link

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6999.2e5fdd83. The image should be on dockerhub soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants