Skip to content

feat: devnet4 #686

Open
GrapeBaBa wants to merge 15 commits intomainfrom
gr/devnet4-on-656
Open

feat: devnet4 #686
GrapeBaBa wants to merge 15 commits intomainfrom
gr/devnet4-on-656

Conversation

@GrapeBaBa
Copy link
Copy Markdown
Member

No description provided.

Copilot AI review requested due to automatic review settings March 20, 2026 14:38
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the devnet4 data model and signature flow by splitting validator keys into separate attestation vs proposal keys, removing proposer-attestation-in-block, and reworking how gossip signatures / aggregated payloads are stored and consumed across forkchoice, block production, networking, and tests.

Changes:

  • Split validator keys into attestation_pubkey + proposal_pubkey, propagate through genesis config/state, key-manager, and JSON/YAML/spec fixture parsing.
  • Simplify SignedBlock to wrap BeamBlock directly and verify proposer signatures over hash_tree_root(block) using the proposal key.
  • Refactor attestation signature aggregation storage to be keyed by AttestationData, add recursive-aggregation plumbing, and adjust forkchoice/chain selection logic accordingly.

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
pkgs/types/src/zk.zig Update STF prover input test to use SignedBlock and new message layout.
pkgs/types/src/validator.zig Split validator pubkey into attestation/proposal keys; update JSON helpers.
pkgs/types/src/utils.zig Update GenesisSpec to store dual pubkey arrays and adjust lifecycle helpers.
pkgs/types/src/state.zig Build validators from dual pubkey arrays; update genesis-related tests/hashes.
pkgs/types/src/lib.zig Re-export new SignedBlock and expose aggregateInnerMap.
pkgs/types/src/block_signatures_testing.zig Remove prior aggregation algorithm test suite.
pkgs/types/src/block.zig Replace (validator_id,data_root) maps with AttestationData-keyed structures; add inner-map aggregation helper; refactor computeAggregatedSignatures.
pkgs/types/src/aggregation.zig Add recursive-aggregation API shape and INVERSE_PROOF_SIZE constant.
pkgs/state-transition/src/transition.zig Verify proposer signature over block root using proposal pubkey; update attestation pubkey usage.
pkgs/state-transition/src/mock.zig Generate mock chain using block-root proposer signing and new signature map layout.
pkgs/state-transition/src/lib.zig Update tests to apply transitions with SignedBlock.message (no nested .block).
pkgs/spectest/src/runner/state_transition_runner.zig Allow fixtures to provide dual pubkeys while retaining fallback for legacy pubkey.
pkgs/spectest/src/runner/fork_choice_runner.zig Adjust aggregated payload storage API and validator parsing for dual pubkeys.
pkgs/node/src/validator_client.zig Produce SignedBlock and sign block root with proposal key; stop skipping proposer attestations.
pkgs/node/src/testing.zig Update node test context for dual pubkeys and proposer block-root signing.
pkgs/node/src/node.zig Update gossip block handling for new SignedBlock layout.
pkgs/node/src/network.zig Update fetched-block cache types and parent-root accessors for SignedBlock.
pkgs/node/src/forkchoice.zig Replace old gossip-signature/root-indexing with AttestationData-keyed signature maps and new aggregation entrypoints.
pkgs/node/src/chain.zig Change block production to select pre-aggregated proofs (greedy) from known payloads; update gossip/onBlock flows for new types.
pkgs/network/src/mock.zig Update mock network cloning/tests for SignedBlock.
pkgs/network/src/interface.zig Update gossip/req-resp message types to carry SignedBlock.
pkgs/network/src/ethlibp2p.zig Update deserialization and logging for new block message layout.
pkgs/key-manager/src/lib.zig Store per-validator dual keypairs; add block-root signing and dual pubkey extraction APIs.
pkgs/database/src/test_helpers.zig Update helper block/state constructors for new validator and block types.
pkgs/database/src/rocksdb.zig Store/load SignedBlock instead of SignedBlockWithAttestation; update tests accordingly.
pkgs/configs/src/lib.zig Parse genesis YAML validators as structured entries containing both pubkeys.
pkgs/cli/test/fixtures/config.yaml Update fixture genesis validators to structured dual-pubkey format.
pkgs/cli/src/node.zig Wire dual pubkeys into chain options; load/store dual keypairs; update checkpoint verification + tests.
pkgs/cli/src/main.zig Update CLI paths that build genesis specs from key-manager to use dual pubkeys.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +119 to +126
// FFI call — only processes raw XMSS signatures (children not passed to Rust yet)
try xmss.aggregateSignatures(
raw_xmss_pks,
raw_xmss_sigs,
message_hash,
@intCast(epoch),
&result.proof_data,
);
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AggregatedSignatureProof.aggregate always calls xmss.aggregateSignatures(raw_xmss_pks, raw_xmss_sigs, ...), even when the aggregation is intended to be “children-only” (no raw XMSS signatures). The underlying XMSS FFI returns AggregationFailed when called with empty key/signature arrays, so any recursive aggregation step with xmss_participants == null will fail. Additionally, the function currently merges child participants into result.participants while proof_data only attests to the raw signatures, making the resulting proof unverifiable if child participants are included. Until the Rust bindings support recursive proof composition, consider explicitly rejecting any call with children.len > 0 (or keep participants limited to the raw set) rather than producing an inconsistent proof.

Copilot uses AI. Check for mistakes.
Comment on lines +463 to +466
/// Compute aggregated signatures using recursive aggregation:
/// Step 1: Collect individual gossip signatures from signatures_map
/// Step 2: Greedy child proof selection — new_payloads first, then known_payloads as helpers
/// Step 3: Recursive aggregate — combine selected children + remaining gossip sigs
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

computeAggregatedSignatures was substantially refactored (new signature map shape + optional recursive child proof selection), but the dedicated aggregation test suite (block_signatures_testing.zig) was removed in this PR. The remaining tests in block.zig cover SSZ encode/decode but not the aggregation selection/coverage logic. Please add updated tests for the new algorithm (including payload selection and the gossip/payload interaction) to prevent regressions.

Copilot uses AI. Check for mistakes.
Comment on lines +154 to +156
var participants = try attestation.AggregationBits.init(allocator);
var participants_cleanup = true;
errdefer if (participants_cleanup) participants.deinit();
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AggregateInnerMap allocates participants but never deinitializes it on the success path. participants_cleanup is set to false after calling AggregatedSignatureProof.aggregate, but aggregate no longer takes ownership of the passed-in bitfield, so this leaks the participants allocation. Consider replacing the participants_cleanup pattern with an unconditional defer participants.deinit(); (or otherwise explicitly deinit after aggregate succeeds).

Copilot uses AI. Check for mistakes.
Comment on lines +418 to +425
var payload_it = self.forkChoice.latest_known_aggregated_payloads.iterator();
while (payload_it.next()) |entry| {
// if (!std.mem.eql(u8, &latest_justified.root, &entry.key_ptr.source.root)) continue;
try entries.append(self.allocator, .{
.att_data = entry.key_ptr.*,
.payloads = entry.value_ptr,
});
}
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

produceBlock iterates over forkChoice.latest_known_aggregated_payloads without holding forkChoice.signatures_mutex. Those payload maps are mutated under signatures_mutex elsewhere (e.g., promotion new→known, pruning, deinit), so this read-side iteration can race and potentially use invalid pointers. Please take signatures_mutex (or snapshot/copy the entries under the lock) while building entries.

Copilot uses AI. Check for mistakes.
Comment on lines +467 to +468
var att_bits = try types.AggregationBits.init(self.allocator);
errdefer att_bits.deinit();
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the greedy selection loop, att_bits has errdefer att_bits.deinit(); and is then appended into agg_attestations before the (fallible) attestation_signatures.append(cloned_proof). If that later append fails, both the errdefer and the outer agg_att_cleanup will deinit the same att_bits, risking a double-free. A common pattern used elsewhere in this PR is to wrap in ?AggregationBits and set it to null immediately after transferring ownership to the list.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings March 21, 2026 01:34
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 31 out of 31 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +95 to 102
const validators_node = root.get("GENESIS_VALIDATORS") orelse root.get("genesis_validators") orelse return GenesisConfigError.MissingValidatorConfig;
const result = try parseValidatorEntriesFromYaml(allocator, validators_node);

return types.GenesisSpec{
.genesis_time = genesis_time,
.validator_pubkeys = pubkeys,
.validator_attestation_pubkeys = result.attestation_pubkeys,
.validator_proposal_pubkeys = result.proposal_pubkeys,
};
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The YAML parser now requires GENESIS_VALIDATORS entries to be maps containing attestation_pubkey/proposal_pubkey. That is a breaking change for existing configs that used a simple list of scalar pubkeys (and the function still accepts genesis_validators key but not its legacy scalar shape). If backwards compatibility is intended, extend parseValidatorEntriesFromYaml to accept list items that are either (a) scalar pubkey (treated as both attestation+proposal) or (b) map with both fields; otherwise, update the surrounding docs/comments and error names to explicitly indicate the legacy scalar form is no longer supported.

Copilot uses AI. Check for mistakes.
Comment on lines +461 to +471
/// Compute aggregated signatures using recursive aggregation:
/// Step 1: Collect individual gossip signatures from signatures_map
/// Step 2: Greedy child proof selection — new_payloads first, then known_payloads as helpers
/// Step 3: Recursive aggregate — combine selected children + remaining gossip sigs
pub fn computeAggregatedSignatures(
self: *Self,
attestations_list: []const Attestation,
validators: *const Validators,
signatures_map: *const SignaturesMap,
aggregated_payloads: ?*const AggregatedPayloadsMap,
new_payloads: ?*const AggregatedPayloadsMap,
known_payloads: ?*const AggregatedPayloadsMap,
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR significantly changes the aggregation algorithm and inputs (new/known payloads + recursive merge). The previous dedicated aggregation test suite was removed (block_signatures_testing.zig), but an equivalent set of tests for the new behavior isn’t shown here. Please add/update unit tests covering: (1) gossip-only aggregation, (2) child-only aggregation (including multi-child), (3) overlap handling (children vs gossip), and (4) greedy selection priority (new_payloads preferred over known_payloads).

Copilot uses AI. Check for mistakes.
Comment on lines +547 to +552
// Collect gossip signatures for all validators in the group
const inner_map = signatures_map.get(group.data);
var validator_it = group.validator_bits.iterator(.{});
while (validator_it.next()) |validator_id| {
const vid: ValidatorIndex = @intCast(validator_id);
if (signatures_map.get(.{ .validator_id = vid, .data_root = data_root })) |sig_entry| {
// Check if it's not a zero signature
const sig_entry_opt = if (inner_map) |im| im.get(vid) else null;
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SignaturesMap.get() returns an InnerMap by value, which makes it easy to accidentally copy the hash map struct and obscures the fact you’re reading shared storage. Prefer using getPtr() here (e.g., const inner_map = signatures_map.getPtr(group.data);) and then query via the pointer; it makes ownership clearer and avoids unnecessary copying of the AutoHashMap wrapper.

Copilot uses AI. Check for mistakes.
Port 14 files that don't conflict with #656 store refactor:
- Dual keys: validator.zig, utils.zig, state.zig, configs/lib.zig
- AggregatedSignatureProof.aggregate(): aggregation.zig
- SignedBlock refactor: transition.zig, lib.zig, zk.zig
- Proposer signing: validator_client.zig
- Testing/infra: testing.zig, network.zig, database, state_transition_runner
- ValidatorKeys struct with attestation_keypair + proposal_keypair
- getAllPubkeys returns AllPubkeys{attestation_pubkeys, proposal_pubkeys}
- signBlockRoot for proposer block signing
- getAttestationPubkeyBytes / getProposalPubkeyBytes
- Retains #656 owned_keys tracking
- Replace SignatureKey/flat SignaturesMap with #656 wrapper
  (AttestationData -> InnerMap(ValidatorIndex -> StoredSignature))
- AggregatedPayloadsMap keyed by AttestationData (not Root)
- Retain devnet4 AggregatedAttestationsResult + extendProofsGreedily
- Add AggregateInnerMap helper from #656 (adapted for devnet4 aggregate)
- SignedBlock refactoring (flattened, no BlockWithAttestation)
- block_signatures_testing.zig needs further adaptation (TODO)
…ckages

Propagate the #656 store refactor naming changes to all consumer files:
- SignedBlockWithAttestation → SignedBlock, BlockWithAttestation removed
- .message.block.field → .message.field (flattened block structure)
- Validator.pubkey → attestation_pubkey + proposal_pubkey (dual keys)
- GenesisSpec.validator_pubkeys → validator_attestation_pubkeys
- key_manager.getAllPubkeys() returns AllPubkeys{attestation, proposal}
- signAttestation → signBlockRoot for proposer signatures
- Removed proposer_attestation from block processing
- aggregateCommitteeSignatures → aggregate(state, false)
- Update expected state root hash in genesis root comparison test
- Update config.yaml fixture to use attestation_pubkey/proposal_pubkey format
- Fix attestation_signatures field access in chain.zig test
- Fix gossip_signatures → attestation_signatures in forkchoice.zig
- Fix lean_pq_sig_attestation_signatures → lean_pq_sig_aggregated_signatures metric name in chain.zig
…aggregate()

aggregate() reads xmss_participants but does not consume them (it creates
its own merged bitfield internally). The errdefer + participants_cleanup=false
pattern prevented cleanup on the success path, leaking the AggregationBits.

Fix: use unconditional defer instead of guarded errdefer.
Update leanSpec to a5df895 (includes devnet4 dual-key fixtures) and
lean-quickstart to match the devnet4 branch requirements.
- Rename aggregateCommitteeSignatures → aggregate, maybeAggregateCommitteeSignaturesOnInterval → maybeAggregateOnInterval
- Rename validator_pubkeys → validator_attestation_pubkeys in comments
- Remove stale "signed proposer attestation" references in logs
- Fix signed_block.message.block → signed_block.message (SignedBlock flattening)
Copilot AI review requested due to automatic review settings March 26, 2026 09:19
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

pkgs/node/src/forkchoice.zig:1547

  • aggregateUnlocked holds signatures_mutex for the entire function (via defer unlock), but the comment says the metrics update happens outside the lock scope. Either unlock before updating gauges (if that’s the intent) or adjust the comment to avoid misleading future changes—especially since aggregation work here can be relatively expensive.
        // Capture counts before lock is released
        new_payloads_count = self.latest_new_aggregated_payloads.count();
        gossip_sigs_count = self.attestation_signatures.count();

        // Update fork-choice store gauges after aggregation (outside lock scope)
        zeam_metrics.metrics.lean_latest_new_aggregated_payloads.set(@intCast(new_payloads_count));
        zeam_metrics.metrics.lean_gossip_signatures.set(@intCast(gossip_sigs_count));

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +58 to +72
const all_pubkeys = try key_manager.getAllPubkeys(allocator, num_validators);
errdefer allocator.free(all_pubkeys.attestation_pubkeys);
errdefer allocator.free(all_pubkeys.proposal_pubkeys);

genesis_config = types.GenesisSpec{
.genesis_time = 1234,
.validator_pubkeys = pubkeys,
.validator_attestation_pubkeys = all_pubkeys.attestation_pubkeys,
.validator_proposal_pubkeys = all_pubkeys.proposal_pubkeys,
};
should_free_genesis = true;
}
defer if (should_free_genesis) allocator.free(genesis_config.validator_pubkeys);
defer if (should_free_genesis) {
allocator.free(genesis_config.validator_attestation_pubkeys);
allocator.free(genesis_config.validator_proposal_pubkeys);
};
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In genMockChain, all_pubkeys.* is freed via errdefer immediately after allocation, and the same slices are also freed again by the later defer if (should_free_genesis) once they’re moved into genesis_config. If an error occurs after should_free_genesis is set, this can double-free those slices. Fix by removing the errdefer after ownership transfer (or guard it with a flag), and let a single cleanup path own the frees.

Copilot uses AI. Check for mistakes.
// Populate validators from genesis pubkeys
for (genesis.validator_pubkeys, 0..) |pubkey, i| {
const val = validator.Validator{ .pubkey = pubkey, .index = i };
std.debug.assert(genesis.validator_attestation_pubkeys.len == genesis.validator_proposal_pubkeys.len);
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

genGenesisState uses std.debug.assert to require validator_attestation_pubkeys.len == validator_proposal_pubkeys.len. In release builds assertions may be disabled, and the subsequent multi-slice for will silently iterate to the shorter length, producing a state with mismatched/ignored validators. Prefer returning a real error when the lengths differ (e.g., error.InvalidGenesisSpec) so this is enforced in all build modes.

Suggested change
std.debug.assert(genesis.validator_attestation_pubkeys.len == genesis.validator_proposal_pubkeys.len);
if (genesis.validator_attestation_pubkeys.len != genesis.validator_proposal_pubkeys.len) {
return error.InvalidGenesisSpec;
}

Copilot uses AI. Check for mistakes.
- Rename SignedBlock.message → SignedBlock.block (#465)
- Add genesis special case for current_justified_root in block production (#464)
- Update leanSpec submodule to d0c5030
@GrapeBaBa GrapeBaBa marked this pull request as ready for review March 26, 2026 11:05
Copilot AI review requested due to automatic review settings March 26, 2026 11:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +570 to +574
};

vid_to_sigmap_idx[validator_id] = sigmap_sigs.items.len;
try sigmap_sigs.append(allocator, sig);
try sigmap_pks.append(allocator, pk);
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In computeAggregatedSignatures, pk is created and then appended into sigmap_pks, but there is no errdefer pk.deinit() before the append. If sigmap_pks.append(...) fails, the public key handle will leak. Add an errdefer (and disable it after a successful append) to keep error paths leak-free.

Suggested change
};
vid_to_sigmap_idx[validator_id] = sigmap_sigs.items.len;
try sigmap_sigs.append(allocator, sig);
try sigmap_pks.append(allocator, pk);
};
var pk_added = false;
errdefer if (!pk_added) pk.deinit();
vid_to_sigmap_idx[validator_id] = sigmap_sigs.items.len;
try sigmap_sigs.append(allocator, sig);
try sigmap_pks.append(allocator, pk);
pk_added = true;

Copilot uses AI. Check for mistakes.
Comment on lines 74 to 78
try cache.put(validator_id, CachedKeyPair{
.keypair = keypair,
.attestation_keypair = attestation_keypair,
.proposal_keypair = proposal_keypair,
.num_active_epochs = num_active_epochs,
});
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOrCreateCachedKeyPair has errdefer cleanup for attestation_keypair but not for proposal_keypair. If cache.put(...) fails (OOM) after generating both keypairs, the proposal keypair will leak. Add symmetric error cleanup for proposal_keypair (or generate it as var with errdefer) before the cache.put call.

Copilot uses AI. Check for mistakes.
Comment on lines +94 to +96
if (!has_raw and !has_children) return error.AggregationInvalidInput;
if (!has_raw and children.len < 2) return error.AggregationInvalidInput;

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AggregatedSignatureProof.aggregate allows the case xmss_participants == null with children.len >= 2, but it still unconditionally calls xmss.aggregateSignatures(...). The XMSS FFI aggregator fails on zero raw signatures, so this path will always error even though it passes the input validation. Either reject the no-raw case for now, or implement a non-FFI fallback (e.g., dummy proof_data / purely merging participants) until recursive aggregation is supported end-to-end.

Copilot uses AI. Check for mistakes.
Comment on lines +120 to +123
try xmss.aggregateSignatures(
raw_xmss_pks,
raw_xmss_sigs,
message_hash,
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xmss.aggregateSignatures appends into result.proof_data and does not clear existing content (see xmss aggregation implementation). AggregatedSignatureProof.aggregate currently calls it without resetting result.proof_data, so re-aggregating into an existing result will produce an invalid proof by concatenating bytes. Consider clearing/reinitializing result.proof_data before the FFI call (the function already has an allocator parameter now, so it can safely reset the list).

Copilot uses AI. Check for mistakes.
Comment on lines 181 to +185
var sig = try xmss.Signature.fromBytes(ve.stored_sig.signature[0..]);
errdefer sig.deinit();

const val = try validators.get(ve.validator_id);
var pk = try xmss.PublicKey.fromBytes(&val.pubkey);
var pk = try xmss.PublicKey.fromBytes(&val.attestation_pubkey);
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In AggregateInnerMap, sig/pk are created with errdefer ...deinit() and then appended into sigs/pks. If an error occurs after the append, the errdefer will still run and deinit the same handle that will later also be deinited by the arraylist cleanup, causing a double-deinit. Use a cleanup flag (set to false after successful append) or move the deinit responsibility exclusively to the list (no errdefer once appended).

Copilot uses AI. Check for mistakes.
Comment on lines +555 to +557
var sig = xmss.Signature.fromBytes(&sig_entry.signature) catch continue;
errdefer sig.deinit();

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In computeAggregatedSignatures, sig is created with errdefer sig.deinit(); but then appended into sigmap_sigs. If a later try in this scope fails, the errdefer will deinit the signature even though the list cleanup will also deinit it, leading to a double-deinit. Use a cleanup flag (disable after successful append) or restructure to only deinit via the list.

Copilot uses AI. Check for mistakes.
- Register block body attestations with fork choice via onAttestation()
  to mirror chain.zig onBlock behavior for correct LMD-GHOST weights
- Remove fabricated proposer gossip attestation that injected phantom
  votes not present in leanSpec fixtures, causing head slot mismatches
- Remove safe target regression guard that incorrectly errored on
  legitimate safe target decreases during reorg scenarios
- Add latestJustifiedRoot and latestJustifiedRootLabel check handlers
… devnet4 scheme

- Update multisig-glue to use rec_aggregation from leanMultisig with
  children support (child pub keys, child proofs, log_inv_rate)
- Update hashsig-glue to leansig_wrapper with devnet4 aborting hypercube
  scheme (V=46, Poseidon1)
- Update Zig FFI layer (xmss/aggregation.zig) with children + log_inv_rate
- Pass children pub keys and INVERSE_PROOF_SIZE=2 through aggregation.zig
- Wire recursive child resolution in block.zig for block building
- Fix SIGSIZE from 3112 to 2536 to match V=46 signature size
Copilot AI review requested due to automatic review settings March 30, 2026 04:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 35 out of 36 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +126 to +136
// FFI call — passes children proofs + their public keys for true recursive aggregation
try xmss.aggregateSignatures(
raw_xmss_pks,
raw_xmss_sigs,
children_pub_keys,
children_proof_refs,
message_hash,
@intCast(epoch),
INVERSE_PROOF_SIZE,
&result.proof_data,
);
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xmss.aggregateSignatures appends into the provided result.proof_data list. If result is reused (or proof_data is non-empty), this will produce an invalid proof by concatenating multiple serializations. Clear/reset result.proof_data before the FFI call (and ideally document that the output list must start empty).

Copilot uses AI. Check for mistakes.
Comment on lines +156 to 159
// Copy the bytes to the output
for (buffer[0..bytes_written]) |byte| {
try multisig_aggregated_signature.append(byte);
}
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function appends serialized bytes into multisig_aggregated_signature without clearing it first. If the caller passes a non-empty list (or reuses the same list across calls), the output will contain concatenated proofs and verification will fail. Consider clearing the list at the start (or change the API to return a fresh list) and document the required precondition explicitly.

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +122
let proof_lens = slice::from_raw_parts(child_proof_lens, num_children);

let total_child_pks: usize = num_keys_arr.iter().sum();
let all_pk_ptrs = slice::from_raw_parts(child_all_pub_keys, total_child_pks);

Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

total_child_pks is computed with sum() on an untrusted child_num_keys array from FFI. usize overflow can wrap in release builds, which can lead to constructing undersized slices and out-of-bounds reads/panics. Use checked accumulation (e.g., try_fold + checked_add) and return null on overflow/inconsistent inputs.

Copilot uses AI. Check for mistakes.
Comment on lines +449 to 456
// Serialize secret key using postcard (serde-based)
let sk_bytes = match postcard::to_allocvec(&private_key_ref.inner) {
Ok(bytes) => bytes,
Err(_) => return 0,
};

if ssz_bytes.len() > buffer_len {
if sk_bytes.len() > buffer_len {
return 0;
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The surrounding comment still says “SSZ encoding”, but this function now serializes the secret key using postcard (postcard::to_allocvec). Update the docstring/API naming so FFI callers don’t assume SSZ for private key bytes (and so it’s clear which encoding hashsig_keypair_from_ssz expects for the secret key).

Copilot uses AI. Check for mistakes.
Comment on lines +73 to 143
pub fn aggregateSignatures(
public_keys: []*const hashsig.HashSigPublicKey,
signatures: []*const hashsig.HashSigSignature,
children_pub_keys: []const []*const hashsig.HashSigPublicKey,
children_proofs: []const *const ByteListMiB,
message_hash: *const [32]u8,
epoch: u32,
log_inv_rate: usize,
multisig_aggregated_signature: *ByteListMiB,
) !void {
if (public_keys.len != signatures.len) {
return AggregationError.PublicKeysSignatureLengthMismatch;
}
if (children_pub_keys.len != children_proofs.len) {
return AggregationError.AggregationFailed;
}

setupProver();

const num_children = children_pub_keys.len;
const allocator = std.heap.c_allocator;

// Build flat child pub key array and per-child key counts
var total_child_pks: usize = 0;
for (children_pub_keys) |cpks| {
total_child_pks += cpks.len;
}

const child_all_pks = allocator.alloc(*const hashsig.HashSigPublicKey, total_child_pks) catch
return AggregationError.AggregationFailed;
defer allocator.free(child_all_pks);

const child_num_keys = allocator.alloc(usize, num_children) catch
return AggregationError.AggregationFailed;
defer allocator.free(child_num_keys);

const child_proof_ptrs = allocator.alloc([*]const u8, num_children) catch
return AggregationError.AggregationFailed;
defer allocator.free(child_proof_ptrs);

const child_proof_lens = allocator.alloc(usize, num_children) catch
return AggregationError.AggregationFailed;
defer allocator.free(child_proof_lens);

var pk_offset: usize = 0;
for (0..num_children) |i| {
const cpks = children_pub_keys[i];
child_num_keys[i] = cpks.len;
for (cpks, 0..) |pk, j| {
child_all_pks[pk_offset + j] = pk;
}
pk_offset += cpks.len;

const proof_slice = children_proofs[i].constSlice();
child_proof_ptrs[i] = proof_slice.ptr;
child_proof_lens[i] = proof_slice.len;
}

const agg_sig = xmss_aggregate(
public_keys.ptr,
public_keys.len,
signatures.ptr,
signatures.len,
public_keys.len,
num_children,
child_all_pks.ptr,
child_num_keys.ptr,
child_proof_ptrs.ptr,
child_proof_lens.ptr,
message_hash,
epoch,
log_inv_rate,
) orelse return AggregationError.AggregationFailed;
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The epoch parameter is now forwarded to the Rust FFI as slot, and the underlying aggregation/verification APIs are slot-based. Renaming this parameter (and related comments/tests) to slot would avoid callers accidentally passing an epoch number and producing unverifiable signatures/proofs.

Copilot uses AI. Check for mistakes.
Comment on lines +124 to +133
for i in 0..num_children {
// Collect pub keys for this child
let n = num_keys_arr[i];
let mut pks = Vec::with_capacity(n);
for j in 0..n {
let pk_ptr = all_pk_ptrs[pk_offset + j];
if pk_ptr.is_null() {
return std::ptr::null();
}
pks.push((*pk_ptr).inner.clone());
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indexing all_pk_ptrs[pk_offset + j] can panic if child_num_keys is inconsistent with the provided child_all_pub_keys backing array (or if the sum/offset bookkeeping is wrong). Panics across extern "C" are highly undesirable; add explicit bounds checks (e.g., verify pk_offset + n <= all_pk_ptrs.len() before the inner loop) and return null on invalid inputs.

Copilot uses AI. Check for mistakes.
Comment on lines 132 to 152
/// Reconstruct a key pair from SSZ-encoded secret and public keys
/// Returns a pointer to the KeyPair or null on error
/// # Safety
/// This is meant to be called from zig, so the pointers will always dereference correctly
#[no_mangle]
pub unsafe extern "C" fn hashsig_keypair_from_ssz(
private_key_ptr: *const u8,
private_key_len: usize,
public_key_ptr: *const u8,
public_key_len: usize,
) -> *mut KeyPair {
if private_key_ptr.is_null() || public_key_ptr.is_null() {
return ptr::null_mut();
}

unsafe {
let sk_slice = slice::from_raw_parts(private_key_ptr, private_key_len);
let pk_slice = slice::from_raw_parts(public_key_ptr, public_key_len);

let private_key: HashSigPrivateKey = match HashSigPrivateKey::from_ssz_bytes(sk_slice) {
let private_key: HashSigPrivateKey = match postcard::from_bytes(sk_slice) {
Ok(key) => key,
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring says the secret key is SSZ-encoded, but the implementation now deserializes the secret key using postcard::from_bytes. This is an API contract mismatch for FFI callers; update the docs (and ideally the function name) to reflect the actual encoding used for the private key bytes.

Copilot uses AI. Check for mistakes.
Comment on lines 47 to +50
// Not enough epochs, remove old key pair and regenerate
var old = cache.fetchRemove(validator_id).?.value;
old.keypair.deinit();
old.attestation_keypair.deinit();
old.proposal_keypair.deinit();
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOrCreateCachedKeyPair removes and deinitializes the cached keypairs when num_active_epochs increases. Because ValidatorKeys returned from the cache are bitwise copies into each KeyManager (and cached keys are intentionally not deinit’d on KeyManager.deinit), this can invalidate previously returned/borrowed key handles and lead to use-after-free if multiple test key managers exist concurrently or if a cached key is reused after the cache grows. Safer options are to never deinit cached entries (leak in tests), key the cache by (validator_id, num_active_epochs), or use reference-counted ownership so old handles remain valid.

Copilot uses AI. Check for mistakes.
leanMultisig's mt-koala-bear crate uses compile-time #[cfg(target_feature)]
for SIMD dispatch (AVX2/AVX512/NEON). The previous .cargo/config.toml approach
was ineffective because CI sets RUSTFLAGS=-D warnings via environment, which
causes Cargo to ignore config.toml rustflags entirely.

Switch to CARGO_ENCODED_RUSTFLAGS which has highest precedence and is not
clobbered by the RUSTFLAGS environment variable.
Copilot AI review requested due to automatic review settings March 30, 2026 06:21
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (1)

rust/hashsig-glue/src/lib.rs:165

  • hashsig_keypair_from_ssz now deserializes the secret key with postcard::from_bytes, but the API name/docs (and Zig bindings) indicate SSZ for both secret and public keys. This will break loading pre-generated *_sk.ssz files and any callers expecting SSZ. Either restore SSZ decoding for the secret key, or rename/introduce new FFI entrypoints (e.g., *_from_postcard / *_from_bytes) and update the Zig side accordingly.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

build.zig Outdated
Comment on lines +727 to +730
// (AVX2/AVX512 on x86_64, NEON on aarch64). It requires target-cpu=native so the
// compiler enables the right feature flags. We use CARGO_ENCODED_RUSTFLAGS so we
// don't clobber any RUSTFLAGS already set in the environment (e.g. CI's -D warnings).
cargo_build.setEnvironmentVariable("CARGO_ENCODED_RUSTFLAGS", "-Ctarget-cpu=native");
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting CARGO_ENCODED_RUSTFLAGS unconditionally here will override any existing CARGO_ENCODED_RUSTFLAGS provided by the environment, despite the comment about not clobbering flags. Also, -Ctarget-cpu=native makes the produced staticlibs CPU-specific (non-portable) and can hurt reproducibility. Consider appending to any existing encoded rustflags and/or gating target-cpu=native behind an opt-in env var or a build option.

Suggested change
// (AVX2/AVX512 on x86_64, NEON on aarch64). It requires target-cpu=native so the
// compiler enables the right feature flags. We use CARGO_ENCODED_RUSTFLAGS so we
// don't clobber any RUSTFLAGS already set in the environment (e.g. CI's -D warnings).
cargo_build.setEnvironmentVariable("CARGO_ENCODED_RUSTFLAGS", "-Ctarget-cpu=native");
// (AVX2/AVX512 on x86_64, NEON on aarch64). It can use target-cpu=native so the
// compiler enables the right feature flags, but this makes builds non-portable and
// less reproducible. Allow callers to opt in via a Zig build option instead of
// forcing it unconditionally.
const enable_native_rustflags =
b.option(bool, "enable-native-rustflags", "Enable -Ctarget-cpu=native for Rust builds") orelse false;
if (enable_native_rustflags) {
cargo_build.setEnvironmentVariable("CARGO_ENCODED_RUSTFLAGS", "-Ctarget-cpu=native");
}

Copilot uses AI. Check for mistakes.
Comment on lines +120 to +136
let total_child_pks: usize = num_keys_arr.iter().sum();
let all_pk_ptrs = slice::from_raw_parts(child_all_pub_keys, total_child_pks);

let mut pk_offset: usize = 0;
for i in 0..num_children {
// Collect pub keys for this child
let n = num_keys_arr[i];
let mut pks = Vec::with_capacity(n);
for j in 0..n {
let pk_ptr = all_pk_ptrs[pk_offset + j];
if pk_ptr.is_null() {
return std::ptr::null();
}
pks.push((*pk_ptr).inner.clone());
}
pk_offset += n;
children_pks.push(pks);
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FFI robustness: xmss_aggregate derives total_child_pks via num_keys_arr.iter().sum() and then performs unchecked pk_offset += n arithmetic. With adversarial inputs, usize overflow can wrap and lead to panics during indexing, which is especially problematic across an extern "C" boundary. Consider using checked addition (and early null return) when summing/advancing offsets to guarantee no panics/UB.

Copilot uses AI. Check for mistakes.
Comment on lines 39 to 44

#[derive(Debug, thiserror::Error)]
pub enum SigningError {
#[error("Signing failed: {0:?}")]
SigningFailed(leansig::signature::SigningError),
#[error("Signing failed")]
SigningFailed,
}
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SigningError::SigningFailed no longer preserves the underlying signing error details. If this error is surfaced in logs/tests, it may be harder to debug failures. Consider keeping the original error as a source (or at least capturing it as a string) while still presenting a stable top-level error message.

Copilot uses AI. Check for mistakes.
Comment on lines 432 to 462
/// Serialize a private key pointer to bytes using SSZ encoding
/// Returns number of bytes written, or 0 on error
/// # Safety
/// buffer must point to a valid buffer of sufficient size
#[no_mangle]
pub unsafe extern "C" fn hashsig_private_key_to_bytes(
private_key: *const PrivateKey,
buffer: *mut u8,
buffer_len: usize,
) -> usize {
if private_key.is_null() || buffer.is_null() {
return 0;
}

unsafe {
let private_key_ref = &*private_key;

// Directly SSZ encode the private key (leansig has SSZ support built-in)
let ssz_bytes = private_key_ref.inner.as_ssz_bytes();
// Serialize secret key using postcard (serde-based)
let sk_bytes = match postcard::to_allocvec(&private_key_ref.inner) {
Ok(bytes) => bytes,
Err(_) => return 0,
};

if ssz_bytes.len() > buffer_len {
if sk_bytes.len() > buffer_len {
return 0;
}

let output_slice = slice::from_raw_parts_mut(buffer, buffer_len);
output_slice[..ssz_bytes.len()].copy_from_slice(&ssz_bytes);
ssz_bytes.len()
output_slice[..sk_bytes.len()].copy_from_slice(&sk_bytes);
sk_bytes.len()
}
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hashsig_private_key_to_bytes now serializes the secret key with postcard, but the function comment says “SSZ encoding” and Zig wrappers call it “SSZ format”. This mismatch is likely to cause silent interoperability failures; please align the docs + naming with the actual encoding, or switch back to SSZ for the private key if SSZ is still the intended wire/file format.

Copilot uses AI. Check for mistakes.
Comment on lines 116 to 124
let seed = hasher.finalize().into();

let (public_key, private_key) = PrivateKey::generate(
&mut <ChaCha20Rng as SeedableRng>::from_seed(seed),
activation_epoch,
num_active_epochs,
&mut StdRng::from_seed(seed),
activation_epoch as u32,
num_active_epochs as u32,
);

let keypair = Box::new(KeyPair {
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FFI safety: hashsig_keypair_generate should defensively handle seed_phrase == NULL before calling CStr::from_ptr (which is UB on null). Even though this hunk changes the RNG, it would be good to add that null check in this function and return a null pointer on invalid input.

Copilot uses AI. Check for mistakes.
Comment on lines 39 to 51
if (cache.get(validator_id)) |cached| {
if (cached.num_active_epochs >= num_active_epochs) {
std.debug.print("CACHE HIT: validator {d}\n", .{validator_id});
return cached.keypair;
return ValidatorKeys{
.attestation_keypair = cached.attestation_keypair,
.proposal_keypair = cached.proposal_keypair,
};
}
// Not enough epochs, remove old key pair and regenerate
var old = cache.fetchRemove(validator_id).?.value;
old.keypair.deinit();
old.attestation_keypair.deinit();
old.proposal_keypair.deinit();
}
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The global cached keypairs are returned by value and then stored in KeyManager. If getOrCreateCachedKeyPair later regenerates (removes + deinit) a cached entry for the same validator, any previously returned copies become dangling (use-after-free) because they share the same underlying Rust handle pointers. Consider storing cached keypairs behind stable ownership (e.g., pointers/refs) or making the cache monotonic (never deinit/replace entries; only grow num_active_epochs).

Copilot uses AI. Check for mistakes.
Comment on lines +1403 to +1412
self.signatures_mutex.lock();
defer self.signatures_mutex.unlock();

// Build attestation list — recursive includes new payloads, plain does not
var attestations = if (recursive)
try self.attestationsFromGossipAndNewPayloads()
else
try self.attestationsFromAttestationSignatures();
defer attestations.deinit(self.allocator);

Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aggregateUnlocked holds signatures_mutex for the entire aggregation, including hashing, proof selection, cloning, and allocations. This is a potentially long critical section and contradicts the comment about updating metrics outside the lock scope. Consider copying the minimal required data structures under the lock, releasing the lock for the expensive work, then re-locking only to commit map updates/removals.

Copilot uses AI. Check for mistakes.
Comment on lines 162 to 176
pub fn verifyAggregatedPayload(public_keys: []*const hashsig.HashSigPublicKey, message_hash: *const [32]u8, epoch: u32, agg_sig: *const ByteListMiB) !void {
// Get bytes from MultisigAggregatedSignature
// Get bytes from aggregated signature
const sig_bytes = agg_sig.constSlice();

setupVerifier();

// Deserialize to Devnet2XmssAggregateSignature
const devnet_sig = xmss_aggregate_signature_from_bytes(sig_bytes.ptr, sig_bytes.len) orelse {
return AggregationError.DeserializationFailed;
};
defer xmss_free_aggregate_signature(devnet_sig);

// Verify
// Verify directly from bytes (Rust deserializes internally)
const result = xmss_verify_aggregated(
public_keys.ptr,
public_keys.len,
message_hash,
devnet_sig,
sig_bytes.ptr,
sig_bytes.len,
epoch,
);
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In aggregateSignatures/verifyAggregatedPayload, the parameter is still named epoch, but it is now passed to the Rust FFI as slot (and used as such in devnet4 aggregation). Renaming this parameter (and related docs) to slot would reduce confusion and prevent accidental misuse when epoch/slot semantics diverge.

Copilot uses AI. Check for mistakes.
ring 0.17 fails compile-time feature assertions on aarch64-apple-darwin
when target-cpu=native is set. The SIMD dispatch fix is only needed on
x86_64 for leanMultisig's AVX2/AVX512 codepaths. Also include -Dwarnings
in the encoded flags so CI warning-as-error behavior is preserved.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants