
Fix multisig registry VCP Tevery escrow bug by explicitly processing leader/initiator signature, mirroring kli multisig join initiator evt replay #426

Closed
kentbull wants to merge 2 commits into WebOfTrust:main from kentbull:multisig-registry-incept-bug-needs-leader-anc-ixn-replay

Conversation

@kentbull
Collaborator

For behavior consistent with kli multisig join, and to fix the Tevery ANC unescrow errors (#403), KERIA needs to explicitly replay multisig events from leaders or other group members. This explicit replay ensures that joiners of a multisig operation pull in the multisig leader's anchor for events such as multisig registry inception (/multisig/vcp) and multisig issuance (/multisig/iss), among others.

Currently KERIA does not replay these events explicitly. It stores the multisig anchor correctly only by accident, and only in multisig groups of 3 or more members.

This PR fixes intermittent errors in processing interaction events and their anchors by closing a replay gap between the KLI's and KERIA's multisig processing. The KLI uses the JoinDoer in join.py and the Multiplexor.

KERIA’s multisig follower approval path (the counterpart of kli multisig join) is less explicit than the KLI JoinDoer about replaying events from multisig leaders or other members for

  • /multisig/vcp
  • /multisig/iss
  • /multisig/rev and
  • /multisig/rpy

In KLI, the multisig follower

  • first clones the stored multisig EXN (from the group leader or other member when group size is 3+),
  • parses the initiator’s embedded anchoring event plus its pathed attachments (for example anc.raw + pathed["anc"]), and
  • only then parses its own approval of that same event (keripy join.py).
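The ordering above can be sketched as follows. This is a minimal illustration, not KERIpy code: RecordingParser and join_as_follower are hypothetical names, and the parser only records the byte streams it receives instead of validating KERI events. The one detail taken from the description is that the initiator's stream is the anchoring event's raw bytes followed by pathed["anc"].

```python
from dataclasses import dataclass, field


@dataclass
class RecordingParser:
    """Hypothetical stand-in for a KERI stream parser; it only records its input."""
    seen: list = field(default_factory=list)

    def parse_one(self, ims: bytes) -> None:
        self.seen.append(bytes(ims))


def join_as_follower(parser, anchor_raw: bytes, pathed: dict, local_approval: bytes) -> None:
    # Rebuild the initiator's stream: the embedded anchoring event
    # followed by its pathed attachments (the initiator's signatures).
    anc = bytearray(anchor_raw) + pathed["anc"]
    parser.parse_one(ims=bytes(anc))
    # Only after the initiator's stream is parsed, parse the local approval.
    parser.parse_one(ims=local_approval)
```

With placeholder byte strings, the parser sees the initiator's event and signatures first and the follower's approval second.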

In KERIA, follower approval:

  • goes through IdentifierResourceEnd.interact(...), which processes the locally submitted ixn and local sigs,
  • and if "group" is present, queues that local event into agent.groups (keria aiding.py).
  • GroupRequester then starts counselor tracking for that local group event (keria agenting.py),
    • but this generic group path does not itself
      • fetch the stored EXN,
      • read paths["anc"],
      • or replay the initiator’s attached signature stream. The initiator’s signature is therefore never processed on the anchoring ixn, which leaves the Tevery event in escrow until it times out.
        • This is where the bug occurs and what triggers the seemingly intermittent failure.

That makes the current behavior indirect and timing-sensitive:

  • generic "group" processing and later Multiplexor.add(...) duplicate parsing can sometimes still get to the correct multisig state,
  • 2-of-2 follower flows are especially fragile and can surface as repeated ANC Missing escrowed anchor errors or stalled TEL multisig state convergence (keripy grouping.py).
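A toy model of the 2-of-2 case, as this description characterizes it. Everything here is an assumption made for illustration: ToyAnchorEscrow and follower_flow are hypothetical, and the real escrow logic in KERIpy is considerably more involved. The sketch only shows why an anchoring event that needs two signers cannot leave escrow if the initiator's signature is never processed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAnchorEscrow:
    """Toy model: the anchor is accepted once `threshold` distinct signers are seen."""
    threshold: int
    signers: set = field(default_factory=set)

    def process(self, signer: str) -> None:
        self.signers.add(signer)

    @property
    def accepted(self) -> bool:
        return len(self.signers) >= self.threshold


def follower_flow(replay_initiator: bool) -> bool:
    escrow = ToyAnchorEscrow(threshold=2)
    if replay_initiator:
        # Explicit replay: the initiator's signature is processed first.
        escrow.process("initiator")
    # The follower's own approval is always processed locally.
    escrow.process("follower")
    return escrow.accepted
```

In this model follower_flow(True) reaches the threshold, while follower_flow(False) leaves the anchor waiting for a second signer that never arrives.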

Essentials of this PR

Current state:

  • For multisig join the KLI does:
    • anc = bytearray(aserder.raw) + pathed["anc"]
    • self.psr.parseOne(ims=bytes(anc))
    • then local approval is signed and parsed
  • KERIA follower path currently processes:
    • local ixn
    • local sigs
    • "group" metadata
    • but not the stored EXN attachment stream such as paths["anc"]
  • False positive: generic "group" handling exists, but it is narrower than the KLI replay:
    • IdentifierResourceEnd.interact(...) queues the local group event
    • GroupRequester starts counselor tracking for that local event
    • neither step clones the stored EXN from the initiator or replays pathed attachments, so the initiator's signature is ignored
  • Multiplexor.add(...) can sometimes rescue multisig state, but only when a later, equivalent peer EXN arrives after local approval, which happens only in multisig groups of 3 or more members.

Solution:

  • Mirror the KLI multisig join follower path by explicitly replaying stored embedded event streams from the initiator before local follower approval.
    • This ensures the initiator's signature is ALWAYS replayed before the local multisig member's (follower's) event and local signature are processed.

The gap is not that “multisig follower approval is absent,” but that KERIA does not make the KLI-style replay of the initiator's stored attachments and signature explicit in its follower approval routes.

  • This PR's fix: add explicit replay of stored, non-local (initiator) embedded events/attachments (like signatures) in the follower approval endpoints for
    • /multisig/vcp
    • /multisig/iss
    • /multisig/rev and
    • /multisig/rpy

mirroring KLI semantics.
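One way to picture the route-gated replay is the sketch below. It is not the code in this PR: REPLAY_ROUTES and replay_streams are hypothetical names, and embeds and pathed are assumed to be plain dicts mapping an embed label (such as "anc") to raw event bytes and attachment bytes respectively.

```python
REPLAY_ROUTES = frozenset(
    {"/multisig/vcp", "/multisig/iss", "/multisig/rev", "/multisig/rpy"}
)


def replay_streams(route: str, embeds: dict, pathed: dict) -> list:
    """Return the byte streams to parse before local approval (hypothetical helper)."""
    if route not in REPLAY_ROUTES:
        return []
    streams = []
    for label, raw in embeds.items():
        atc = pathed.get(label)
        if atc:
            # Replay only embedded events that arrived with attachments,
            # e.g. the anchoring event plus the initiator's signatures.
            streams.append(bytes(raw) + bytes(atc))
    return streams
```

A follower approval endpoint would feed each returned stream to its parser first, and only then process the locally submitted event and signatures.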

@kentbull kentbull changed the title from "Fix multisig reg icp Tevery escrow bug by explicitly processing leader/initiator signature, mirroring kli multisig join initiator evt replay" to "Fix multisig registry VCP Tevery escrow bug by explicitly processing leader/initiator signature, mirroring kli multisig join initiator evt replay" on Mar 23, 2026
@codecov

codecov bot commented Mar 23, 2026

Codecov Report

❌ Patch coverage is 92.30769% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.69%. Comparing base (9e24615) to head (5b6ddc6).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/keria/app/multisig.py | 92.30% | 2 Missing ⚠️ |
| src/keria/app/credentialing.py | 90.90% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #426      +/-   ##
==========================================
- Coverage   87.74%   87.69%   -0.05%     
==========================================
  Files          26       27       +1     
  Lines        5808     5844      +36     
==========================================
+ Hits         5096     5125      +29     
- Misses        712      719       +7     


source={},
status=registry["regk"],
)
agent_deeds = doist.enter(doers=[agent])
Collaborator Author


Moved code into the try/except so that the finally block would close out the Doist and release ports once done.
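The cleanup pattern described in this comment looks roughly like the sketch below. ToyDoist is a hypothetical stand-in, not hio's Doist, and run_test_body is an illustrative helper. The point is only that entering inside the try block guarantees the finally clause shuts the scheduler down even when the test body raises.

```python
class ToyDoist:
    """Hypothetical stand-in for a scheduler that holds resources while running."""

    def __init__(self):
        self.running = False

    def enter(self, doers):
        self.running = True
        return list(doers)

    def exit(self, deeds):
        self.running = False


def run_test_body(doist, doers, body):
    deeds = None
    try:
        # Entering inside the try block means the finally clause below
        # runs whether or not the body succeeds.
        deeds = doist.enter(doers=doers)
        body()
    finally:
        # Always shut the scheduler down so held resources (e.g. ports) are released.
        doist.exit(deeds=deeds)
```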

@kentbull
Collaborator Author

I was incorrect and misunderstood a comment in KERIpy's Multiplexor.add. Closing the PR.

@kentbull kentbull closed this Mar 24, 2026