-
Notifications
You must be signed in to change notification settings - Fork 1.2k
refactor: separate network and consensus logic 2a/N [signing] #6992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
|
WalkthroughThis PR extracts recovered-signature network handling into a new NetSigning NetHandler and moves worker-thread processing out of CSigningManager. CSigningManager API is simplified: it no longer depends on chainstate, drops internal worker-thread lifecycle methods, adds VerifyAndProcessRecoveredSig, FetchPendingReconstructed, GetListeners, and changes CollectPendingRecoveredSigsToVerify to return quorum public keys. CSigSharesManager::TryRecoverSig now returns std::shared_ptr. Build files, peer-manager interfaces, tests, and lint dependency graphs were updated to expose and compile the new net_signing implementation. Sequence Diagram(s)sequenceDiagram
participant PM as PeerManager
participant NS as NetSigning
participant SM as CSigningManager
participant QM as CQuorumManager
Note over PM,NS: Incoming network recovered sig
PM->>NS: ProcessMessage(QSIGREC)
NS->>NS: Deserialize CRecoveredSig\nErase pending requests
NS->>SM: VerifyAndProcessRecoveredSig(from, recoveredSig)
SM->>QM: Lookup quorum / pubkey
alt quorum active & valid
SM->>SM: ProcessRecoveredSig(recoveredSig)
SM->>NS: Notify listeners / emit signals
else invalid/missing quorum
SM->>SM: Early return
end
Note over NS: Background processing loop
NS->>NS: WorkThreadMain()
loop periodic
NS->>SM: FetchPendingReconstructed()
NS->>SM: CollectPendingRecoveredSigsToVerify(...)
SM->>SM: Provide pubkeys + sig shares
NS->>NS: Remote verify batch, handle misbehaviour
NS->>SM: ProcessRecoveredSig(validRecoveredSig)
end
sequenceDiagram
participant SS as SigSharesManager
participant SM as CSigningManager
participant NS as NetSigning
Note over SS,SM: Recovery result flow
SS->>SS: TryRecoverSig(...)
alt recovery succeeded
SS-->>SM: returns shared_ptr<CRecoveredSig>
SM->>SM: ProcessRecoveredSig(shared_ptr)
SM->>NS: Notify listeners / emit signals
else recovery failed
SS-->>SS: returns nullptr
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (3)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
src/net_processing.cpp (2)
660-661: Document semantics ofPeerPostProcessMessagefor external callersThe new API is straightforward, but its contract isn’t obvious: calling it currently uses a sentinel node id (
-1in the implementation), which means:
- any
m_errorwill not actually punish a specific peer, and- any
m_to_erasewill not clear per‑peer object download state.It would help future users to add a brief comment here (and/or in the class header) stating that this entry point is intended only for generic post‑processing (relays, DSQs, etc.) where a concrete
NodeIdis not available, and that peer‑scoped effects are effectively ignored.
6556-6559: Avoid magic-1node id literal inPeerPostProcessMessageUsing
-1directly as the “no peer”NodeIdworks (callers likeMisbehaving/EraseObjectRequestsafely no‑op), but it’s a magic value and already appears elsewhere in this file for the same purpose.Consider introducing a named sentinel (e.g.
constexpr NodeId NO_PEER_ID{-1};) or reusing an existing one if present, and calling:- PostProcessMessage(std::move(ret), -1); + PostProcessMessage(std::move(ret), NO_PEER_ID);This keeps the intent clear and avoids repeating the raw literal.
src/llmq/net_signing.h (1)
15-45: MakeNetSigningheader self-contained (forward declareCRecoveredSig, include<memory>)The
NetSigninginterface looks aligned with the new design (a dedicated net handler aroundCSigningManager, with its own worker thread and interrupt), but the header currently depends on transitive declarations:
ProcessRecoveredSigusesstd::shared_ptr<const llmq::CRecoveredSig>without a visible declaration ofllmq::CRecoveredSig.<memory>is not included even thoughstd::shared_ptrappears in the public API.To avoid brittle include-order dependencies and keep the header self-contained, consider:
#include <net_processing.h> #include <util/threadinterrupt.h> #include <util/time.h> +#include <memory> #include <thread> -namespace llmq { -class CSigningManager; -} // namespace llmq +namespace llmq { +class CSigningManager; +class CRecoveredSig; +} // namespace llmqThis should not change behavior but will make future usage of
llmq/net_signing.hsafer regardless of inclusion order.src/llmq/signing_shares.cpp (2)
783-795: Code duplication acknowledged in TODO.The recovered signature processing logic (lines 785-793) duplicates code from
NetSigning::ProcessRecoveredSig(net_signing.cpp lines 62-67). The TODO comment on line 786 already acknowledges this.Consider extracting this common logic to a shared helper method in a follow-up to reduce duplication.
798-879: Verify single-member recovery logic flow.The single-member quorum recovery path has a two-stage check:
- Lines 817-832: Prepares
singleMemberRecoveredSigifis_single_member()- Lines 846-848: Returns
singleMemberRecoveredSigifis_single_member()While logically correct (both checks should match), this pattern is fragile because
singleMemberRecoveredSigis only initialized inside the firstifblock. If the checks ever diverge due to future changes, line 847 could return an uninitialized shared_ptr.Consider restructuring for clarity, such as an early return after line 831 for single-member quorums.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
src/Makefile.am(2 hunks)src/evo/mnhftx.cpp(0 hunks)src/init.cpp(2 hunks)src/llmq/context.cpp(1 hunks)src/llmq/net_signing.cpp(1 hunks)src/llmq/net_signing.h(1 hunks)src/llmq/signing.cpp(9 hunks)src/llmq/signing.h(7 hunks)src/llmq/signing_shares.cpp(4 hunks)src/llmq/signing_shares.h(2 hunks)src/net_processing.cpp(2 hunks)src/net_processing.h(1 hunks)src/test/evo_islock_tests.cpp(0 hunks)src/test/fuzz/process_message.cpp(0 hunks)test/lint/lint-circular-dependencies.py(2 hunks)
💤 Files with no reviewable changes (3)
- src/test/fuzz/process_message.cpp
- src/evo/mnhftx.cpp
- src/test/evo_islock_tests.cpp
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: kwvg
Repo: dashpay/dash PR: 6761
File: src/chainlock/signing.cpp:247-250
Timestamp: 2025-07-29T14:32:48.369Z
Learning: In PR #6761, kwvg acknowledged a null pointer check issue in ChainLockSigner::Cleanup() method but deferred it to follow-up, consistent with the pattern of avoiding scope creep in refactoring PRs.
Learnt from: kwvg
Repo: dashpay/dash PR: 6543
File: src/wallet/receive.cpp:240-251
Timestamp: 2025-02-06T14:34:30.466Z
Learning: Pull request #6543 is focused on move-only changes and refactoring, specifically backporting from Bitcoin. Behavior changes should be proposed in separate PRs.
Learnt from: UdjinM6
Repo: dashpay/dash PR: 6933
File: src/llmq/utils.cpp:284-298
Timestamp: 2025-11-04T18:24:27.241Z
Learning: In consensus-critical code (such as quorum formation, block validation, or deployment activation logic), do not suggest changes to the logic itself even if the implementation appears theoretically incorrect or off-by-one. Consensus rules, once deployed on the Dash network, must be preserved exactly to avoid network forks. Refactoring PRs should maintain perfect behavioral equivalence. Only suggest logic changes if explicitly accompanied by a DIP (Dash Improvement Proposal) or if the maintainer indicates the consensus rule needs to be changed with appropriate activation logic.
Learnt from: kwvg
Repo: dashpay/dash PR: 6718
File: test/functional/test_framework/test_framework.py:2102-2102
Timestamp: 2025-06-09T16:43:20.996Z
Learning: In the test framework consolidation PR (#6718), user kwvg prefers to limit functional changes to those directly related to MasternodeInfo, avoiding scope creep even for minor improvements like error handling consistency.
📚 Learning: 2024-12-29T17:43:41.755Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6504
File: src/llmq/quorums.cpp:224-224
Timestamp: 2024-12-29T17:43:41.755Z
Learning: The `CQuorumManager` is fully initialized by `LLMQContext`, addressing any concerns about the manager’s initialization sequence.
Applied to files:
src/llmq/context.cpp
📚 Learning: 2025-01-02T21:50:00.967Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6504
File: src/llmq/context.cpp:42-43
Timestamp: 2025-01-02T21:50:00.967Z
Learning: LLMQContext manages concurrency for the `CInstantSendManager`. Previously, this was handled globally; now it's handled as a class member in `LLMQContext`, but the concurrency control remains consistent.
Applied to files:
src/llmq/context.cpp
📚 Learning: 2025-10-02T18:29:54.756Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6840
File: src/net_processing.cpp:2882-2886
Timestamp: 2025-10-02T18:29:54.756Z
Learning: Across net_processing.cpp, once LLMQContext (m_llmq_ctx) is asserted non-null, its subcomponents (e.g., isman, qdkgsman, quorum_block_processor) are treated as initialized and used without extra null checks.
Applied to files:
src/llmq/context.cppsrc/init.cppsrc/llmq/net_signing.hsrc/llmq/net_signing.cppsrc/net_processing.cppsrc/Makefile.amsrc/llmq/signing.cppsrc/llmq/signing.h
📚 Learning: 2025-07-15T14:53:04.819Z
Learnt from: knst
Repo: dashpay/dash PR: 6691
File: src/test/llmq_params_tests.cpp:148-151
Timestamp: 2025-07-15T14:53:04.819Z
Learning: In the Dash Core LLMQ implementation, signingActiveQuorumCount is never 0 in the actual parameters defined in params.h, making division by zero scenarios unrealistic in the max_cycles() function.
Applied to files:
src/llmq/context.cppsrc/llmq/signing.h
📚 Learning: 2025-07-29T14:32:48.369Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6761
File: src/chainlock/signing.cpp:247-250
Timestamp: 2025-07-29T14:32:48.369Z
Learning: In PR #6761, kwvg acknowledged a null pointer check issue in ChainLockSigner::Cleanup() method but deferred it to follow-up, consistent with the pattern of avoiding scope creep in refactoring PRs.
Applied to files:
src/llmq/signing.cpp
🧬 Code graph analysis (7)
src/net_processing.h (1)
src/net_processing.cpp (4)
PeerPostProcessMessage(6556-6559)PeerPostProcessMessage(6556-6556)ret(660-660)ret(683-683)
src/llmq/signing_shares.h (2)
test/functional/test_framework/messages.py (1)
CRecoveredSig(1520-1544)src/llmq/signing_shares.cpp (2)
TryRecoverSig(798-879)TryRecoverSig(798-799)
src/llmq/net_signing.h (4)
src/llmq/signing.cpp (4)
CSigningManager(327-331)CSigningManager(333-333)ProcessRecoveredSig(475-514)ProcessRecoveredSig(475-475)src/llmq/signing.h (2)
CSigningManager(158-233)nodiscard(69-91)src/net_processing.h (3)
NetHandler(70-85)NetHandler(74-77)PeerManagerInternal(58-68)src/llmq/net_signing.cpp (12)
ProcessMessage(19-34)ProcessMessage(19-19)ProcessPendingRecoveredSigs(70-131)ProcessPendingRecoveredSigs(70-70)ProcessRecoveredSig(58-68)ProcessRecoveredSig(58-58)Start(36-44)Start(36-36)Stop(46-56)Stop(46-46)WorkThreadMain(133-148)WorkThreadMain(133-133)
src/llmq/net_signing.cpp (3)
src/net_processing.cpp (15)
ProcessMessage(3667-5457)ProcessMessage(3667-3672)pfrom(618-618)pfrom(639-640)pfrom(751-753)pfrom(762-762)pfrom(770-770)pfrom(773-773)pfrom(775-775)pfrom(777-777)pfrom(873-873)pfrom(1055-1055)WITH_LOCK(331-334)WITH_LOCK(3232-3232)WITH_LOCK(3254-3254)src/llmq/net_signing.h (1)
Start(33-35)src/llmq/signing.cpp (2)
ProcessRecoveredSig(475-514)ProcessRecoveredSig(475-475)
src/llmq/signing_shares.cpp (2)
src/llmq/signhash.cpp (1)
SignHash(14-22)src/llmq/signhash.h (1)
SignHash(24-46)
src/llmq/signing.cpp (2)
src/llmq/signing.h (1)
CSigningManager(158-233)src/llmq/net_signing.cpp (2)
ProcessRecoveredSig(58-68)ProcessRecoveredSig(58-58)
src/llmq/signing.h (3)
src/llmq/signhash.cpp (1)
SignHash(14-22)src/llmq/signhash.h (2)
SignHash(24-46)hash(64-73)src/llmq/signing.cpp (18)
CSigningManager(327-331)CSigningManager(333-333)AlreadyHave(335-348)AlreadyHave(335-335)GetRecoveredSigForGetData(350-360)GetRecoveredSigForGetData(350-350)VerifyAndProcessRecoveredSig(362-394)VerifyAndProcessRecoveredSig(362-362)FetchPendingReconstructed(467-472)FetchPendingReconstructed(467-467)CollectPendingRecoveredSigsToVerify(396-465)CollectPendingRecoveredSigsToVerify(396-398)GetListeners(516-520)GetListeners(516-516)ProcessRecoveredSig(475-514)ProcessRecoveredSig(475-475)Cleanup(533-539)Cleanup(533-533)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: mac-build / Build source
- GitHub Check: linux64_ubsan-build / Build source
- GitHub Check: linux64_sqlite-build / Build source
- GitHub Check: linux64_fuzz-build / Build source
- GitHub Check: linux64-build / Build source
- GitHub Check: linux64_tsan-build / Build source
- GitHub Check: win64-build / Build source
- GitHub Check: arm-linux-build / Build source
- GitHub Check: linux64_nowallet-build / Build source
- GitHub Check: Lint / Run linters
🔇 Additional comments (21)
src/net_processing.h (1)
67-67: LGTM! Clean interface extension for post-processing.The addition of
PeerPostProcessMessageto thePeerManagerInternalinterface is well-structured. The use of an rvalue reference forMessageProcessingResultenables efficient move semantics for post-processing results. This aligns with the broader refactoring to decouple network and consensus logic.src/init.cpp (2)
94-94: LGTM! Header change reflects architectural separation.The switch from
llmq/signing.htollmq/net_signing.haligns with the PR objective to separate network-handling code from core signing logic. This supports the broader refactoring to decouple consensus/chain code from network/node code.
2203-2203: LGTM! NetSigning handler follows established patterns.The addition of the
NetSigningextra handler is consistent with the existingNetInstantSendhandler pattern (line 2202). Both dereferencenode.llmq_ctxsubcomponents (sigmanandismanrespectively) without explicit null checks, which is the established pattern for LLMQContext usage in this codebase. Based on learnings, this is acceptable as LLMQContext subcomponents are treated as initialized once the context is asserted non-null.src/Makefile.am (1)
274-275: NetSigning build integration looks consistentAdding
llmq/net_signing.htoBITCOIN_CORE_Handllmq/net_signing.cpptolibbitcoin_node_a_SOURCESmatches how other LLMQ components are wired intolibbitcoin_nodeand should build cleanly across targets.Also applies to: 539-539
src/llmq/context.cpp (1)
18-40: CSigningManager wiring matches new API; consider markingpeermanunusedConstructing
sigmanfrom*qmananddb_paramsaligns with the refactoredCSigningManagerinterface and cleanly removes chainstate/peer dependencies fromLLMQContext. With signing’s worker-thread handling moved out, keepingInterrupt()/Start()/Stop()focused onqmanandclhandlerlooks appropriate.The
PeerManager& peermanargument inStartis currently unused, which can trigger-Wunused-parameterunder stricter builds; consider(void)peerman;or[[maybe_unused]]to keep this future-proof.src/llmq/signing_shares.h (1)
10-11: SignHash include andTryRecoverSigreturn type are consistent with new flowPulling in
llmq/signhash.his appropriate given the directSignHashmembers inCSigSharesNodeState, and havingTryRecoverSigreturnstd::shared_ptr<CRecoveredSig>(possiblynullptr) cleanly exposes whether recovery succeeded to callers without altering locking annotations.Also applies to: 480-482
test/lint/lint-circular-dependencies.py (1)
24-64: Circular-dependency expectations updated in line with new LLMQ wiringThe updated
EXPECTED_CIRCULAR_DEPENDENCIESentries that now traversellmq/signing_shares -> net_processing -> …instead ofllmq/signingmatch the refactored dependency graph and keep the lint test aligned with the new architecture.src/llmq/net_signing.cpp (5)
36-44: LGTM!The thread lifecycle checks are appropriate. The assert on line 40 ensures Start() is not called multiple times, which is a programming error rather than a runtime condition.
46-56: LGTM!The shutdown sequence check is appropriate, ensuring
Interrupt()is called beforeStop().
58-68: LGTM!The recovered signature processing flow is correct. Early return when
ProcessRecoveredSigreturns false (indicating the signature was already processed) prevents redundant listener notifications.
70-131: LGTM!The batched verification flow is well-structured:
- Fetches pending reconstructed signatures and processes them immediately
- Collects pending signatures to verify (up to 32 unique sessions)
- Performs batched verification using quorum public keys
- Bans peers with invalid signatures
- Processes valid signatures
The use of
.at()on line 100 is safe becauseCollectPendingRecoveredSigsToVerifyensures only signatures with valid quorums are included inrecSigsByNode.
133-148: LGTM!The worker thread main loop follows the expected pattern: process pending work, perform periodic cleanup, and sleep when idle. The interrupt handling on line 144 ensures the thread can be cleanly shut down.
src/llmq/signing.cpp (6)
327-330: LGTM!The constructor signature change successfully removes the CChainState dependency, aligning with the PR objective to separate consensus/chain code from network/node code.
362-394: LGTM!The new entry point properly validates recovered signatures before queueing them for verification. The early returns for missing/inactive quorums and duplicate signatures are appropriate, and the cs_pending lock protects the pending map correctly.
396-465: LGTM!The refactored method now populates a map of BLS public keys (
ret_pubkeys) instead of full quorum objects. This is a cleaner separation of concerns, as the verification logic only needs the public keys, not the entire quorum state.The structured binding on line 436 and the key-based lookup pattern are appropriate for the new data flow.
467-472: LGTM!The method efficiently fetches and clears pending reconstructed signatures using atomic swap under lock. Clean implementation.
475-514: LGTM!The boolean return type appropriately indicates whether the recovered signature was newly processed (true) or already existed (false). This enables callers to skip redundant listener notifications, as seen in
NetSigning::ProcessRecoveredSig.
516-520: LGTM!The method returns a copy of the listeners vector under lock, allowing callers to iterate without holding cs_listeners. This prevents potential deadlocks.
src/llmq/signing.h (3)
37-37: LGTM!The forward declaration of
SignHashis appropriate for use in thebuildSignHash()return type, avoiding unnecessary header inclusion.
55-55: LGTM!Making the return type explicit improves API clarity and prevents potential issues with auto deduction.
179-209: LGTM!The API changes are well-documented and consistent with the implementations:
- Constructor removes CChainState dependency
- New
VerifyAndProcessRecoveredSigentry point for network-based processing- New methods (
FetchPendingReconstructed,CollectPendingRecoveredSigsToVerify,GetListeners) clearly documented as used by NetSigningThe lock annotations are appropriate for all methods.
… inside TryRecoverSig
…der llmq/signing.h
…rocessRecoveredSig ex-ProcessRecoveredSig
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
1fd5c47 to
069797f
Compare
Issue being fixed or feature implemented
Separation of consensus & chain code and network & node in Dash Core is blocked by tight connection of network and consensus code; in components such as
llmq::CSigningManager,llmq::CSigSharesManager,coinjoin::client,coinjoin::server,governance/,llmq::CInstantSendManager, etc.It blocks backport's of bitcoin related to 'kernel' project.
What was done?
This PR addresses a dependency of
llmq::CSigningManageronPeerManager(llmq::CSigSharesManageris out of scope; will be done in the PR 2b/N).It is a split from proof-of-concept PR #6934
Prior work: #6959
How Has This Been Tested?
Run unit & functional tests.
Run a linter
test/lint/lint-circular-dependencies.pyOverall conception is tested in #6934 as potentially successful to finish separation of network & consensus code.
Breaking Changes
N/A
Checklist: