Conversation
Pull request overview
This PR updates the devnet4 data model and signature flow by splitting validator keys into separate attestation vs proposal keys, removing proposer-attestation-in-block, and reworking how gossip signatures / aggregated payloads are stored and consumed across forkchoice, block production, networking, and tests.
Changes:
- Split validator keys into attestation_pubkey + proposal_pubkey, propagate through genesis config/state, key-manager, and JSON/YAML/spec fixture parsing.
- Simplify SignedBlock to wrap BeamBlock directly and verify proposer signatures over hash_tree_root(block) using the proposal key.
- Refactor attestation signature aggregation storage to be keyed by AttestationData, add recursive-aggregation plumbing, and adjust forkchoice/chain selection logic accordingly.
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| pkgs/types/src/zk.zig | Update STF prover input test to use SignedBlock and new message layout. |
| pkgs/types/src/validator.zig | Split validator pubkey into attestation/proposal keys; update JSON helpers. |
| pkgs/types/src/utils.zig | Update GenesisSpec to store dual pubkey arrays and adjust lifecycle helpers. |
| pkgs/types/src/state.zig | Build validators from dual pubkey arrays; update genesis-related tests/hashes. |
| pkgs/types/src/lib.zig | Re-export new SignedBlock and expose aggregateInnerMap. |
| pkgs/types/src/block_signatures_testing.zig | Remove prior aggregation algorithm test suite. |
| pkgs/types/src/block.zig | Replace (validator_id,data_root) maps with AttestationData-keyed structures; add inner-map aggregation helper; refactor computeAggregatedSignatures. |
| pkgs/types/src/aggregation.zig | Add recursive-aggregation API shape and INVERSE_PROOF_SIZE constant. |
| pkgs/state-transition/src/transition.zig | Verify proposer signature over block root using proposal pubkey; update attestation pubkey usage. |
| pkgs/state-transition/src/mock.zig | Generate mock chain using block-root proposer signing and new signature map layout. |
| pkgs/state-transition/src/lib.zig | Update tests to apply transitions with SignedBlock.message (no nested .block). |
| pkgs/spectest/src/runner/state_transition_runner.zig | Allow fixtures to provide dual pubkeys while retaining fallback for legacy pubkey. |
| pkgs/spectest/src/runner/fork_choice_runner.zig | Adjust aggregated payload storage API and validator parsing for dual pubkeys. |
| pkgs/node/src/validator_client.zig | Produce SignedBlock and sign block root with proposal key; stop skipping proposer attestations. |
| pkgs/node/src/testing.zig | Update node test context for dual pubkeys and proposer block-root signing. |
| pkgs/node/src/node.zig | Update gossip block handling for new SignedBlock layout. |
| pkgs/node/src/network.zig | Update fetched-block cache types and parent-root accessors for SignedBlock. |
| pkgs/node/src/forkchoice.zig | Replace old gossip-signature/root-indexing with AttestationData-keyed signature maps and new aggregation entrypoints. |
| pkgs/node/src/chain.zig | Change block production to select pre-aggregated proofs (greedy) from known payloads; update gossip/onBlock flows for new types. |
| pkgs/network/src/mock.zig | Update mock network cloning/tests for SignedBlock. |
| pkgs/network/src/interface.zig | Update gossip/req-resp message types to carry SignedBlock. |
| pkgs/network/src/ethlibp2p.zig | Update deserialization and logging for new block message layout. |
| pkgs/key-manager/src/lib.zig | Store per-validator dual keypairs; add block-root signing and dual pubkey extraction APIs. |
| pkgs/database/src/test_helpers.zig | Update helper block/state constructors for new validator and block types. |
| pkgs/database/src/rocksdb.zig | Store/load SignedBlock instead of SignedBlockWithAttestation; update tests accordingly. |
| pkgs/configs/src/lib.zig | Parse genesis YAML validators as structured entries containing both pubkeys. |
| pkgs/cli/test/fixtures/config.yaml | Update fixture genesis validators to structured dual-pubkey format. |
| pkgs/cli/src/node.zig | Wire dual pubkeys into chain options; load/store dual keypairs; update checkpoint verification + tests. |
| pkgs/cli/src/main.zig | Update CLI paths that build genesis specs from key-manager to use dual pubkeys. |
pkgs/types/src/aggregation.zig (outdated)
```zig
// FFI call — only processes raw XMSS signatures (children not passed to Rust yet)
try xmss.aggregateSignatures(
    raw_xmss_pks,
    raw_xmss_sigs,
    message_hash,
    @intCast(epoch),
    &result.proof_data,
);
```
AggregatedSignatureProof.aggregate always calls xmss.aggregateSignatures(raw_xmss_pks, raw_xmss_sigs, ...), even when the aggregation is intended to be “children-only” (no raw XMSS signatures). The underlying XMSS FFI returns AggregationFailed when called with empty key/signature arrays, so any recursive aggregation step with xmss_participants == null will fail. Additionally, the function currently merges child participants into result.participants while proof_data only attests to the raw signatures, making the resulting proof unverifiable if child participants are included. Until the Rust bindings support recursive proof composition, consider explicitly rejecting any call with children.len > 0 (or keep participants limited to the raw set) rather than producing an inconsistent proof.
```zig
/// Compute aggregated signatures using recursive aggregation:
/// Step 1: Collect individual gossip signatures from signatures_map
/// Step 2: Greedy child proof selection — new_payloads first, then known_payloads as helpers
/// Step 3: Recursive aggregate — combine selected children + remaining gossip sigs
```
computeAggregatedSignatures was substantially refactored (new signature map shape + optional recursive child proof selection), but the dedicated aggregation test suite (block_signatures_testing.zig) was removed in this PR. The remaining tests in block.zig cover SSZ encode/decode but not the aggregation selection/coverage logic. Please add updated tests for the new algorithm (including payload selection and the gossip/payload interaction) to prevent regressions.
pkgs/types/src/block.zig (outdated)
```zig
var participants = try attestation.AggregationBits.init(allocator);
var participants_cleanup = true;
errdefer if (participants_cleanup) participants.deinit();
```
AggregateInnerMap allocates participants but never deinitializes it on the success path. participants_cleanup is set to false after calling AggregatedSignatureProof.aggregate, but aggregate no longer takes ownership of the passed-in bitfield, so this leaks the participants allocation. Consider replacing the participants_cleanup pattern with an unconditional defer participants.deinit(); (or otherwise explicitly deinit after aggregate succeeds).
pkgs/node/src/chain.zig (outdated)
```zig
var payload_it = self.forkChoice.latest_known_aggregated_payloads.iterator();
while (payload_it.next()) |entry| {
    // if (!std.mem.eql(u8, &latest_justified.root, &entry.key_ptr.source.root)) continue;
    try entries.append(self.allocator, .{
        .att_data = entry.key_ptr.*,
        .payloads = entry.value_ptr,
    });
}
```
produceBlock iterates over forkChoice.latest_known_aggregated_payloads without holding forkChoice.signatures_mutex. Those payload maps are mutated under signatures_mutex elsewhere (e.g., promotion new→known, pruning, deinit), so this read-side iteration can race and potentially use invalid pointers. Please take signatures_mutex (or snapshot/copy the entries under the lock) while building entries.
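The snapshot-under-lock option can be sketched as follows. This is an illustrative Rust analogue, not zeam's actual Zig code: `AttData`, `Payloads`, and `snapshot_payloads` are made-up stand-ins for the AttestationData-keyed payload maps that are mutated under `signatures_mutex` elsewhere.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins for the fork-choice types in this sketch.
type AttData = u64;
type Payloads = Vec<u8>;

struct ForkChoice {
    signatures_mutex: Mutex<HashMap<AttData, Payloads>>,
}

impl ForkChoice {
    /// Snapshot the entries while holding the lock, so block production can
    /// iterate a private copy and no shared pointers escape the lock scope.
    fn snapshot_payloads(&self) -> Vec<(AttData, Payloads)> {
        let guard = self.signatures_mutex.lock().unwrap();
        guard.iter().map(|(k, v)| (*k, v.clone())).collect()
    }
}
```

Copying costs memory proportional to the payload set, but it keeps the lock hold time short and removes the read-side race entirely.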
pkgs/node/src/chain.zig (outdated)
```zig
var att_bits = try types.AggregationBits.init(self.allocator);
errdefer att_bits.deinit();
```
In the greedy selection loop, att_bits has errdefer att_bits.deinit(); and is then appended into agg_attestations before the (fallible) attestation_signatures.append(cloned_proof). If that later append fails, both the errdefer and the outer agg_att_cleanup will deinit the same att_bits, risking a double-free. A common pattern used elsewhere in this PR is to wrap in ?AggregationBits and set it to null immediately after transferring ownership to the list.
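The optional-wrapper ownership-transfer pattern mentioned here can be sketched in Rust terms. `Bits` and `push_with_transfer` are made-up names for the sketch; the point is that `Option::take` plays the role of setting the Zig optional to `null` after the transfer.

```rust
// Illustrative analogue of the "?AggregationBits + set to null" pattern:
// wrap the resource in Option and take() it the moment ownership moves
// into the list, so no later failure path can release it twice.
struct Bits(Vec<u8>);

fn push_with_transfer(
    list: &mut Vec<Bits>,
    bits: Bits,
    fail_later: bool,
) -> Result<(), &'static str> {
    let mut pending = Some(bits);
    // ownership moves into the list; `pending` becomes None
    list.push(pending.take().expect("just set"));
    if fail_later {
        // error path: `pending` is None, so only the list owns the bits
        return Err("later append failed");
    }
    Ok(())
}
```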
Pull request overview
Copilot reviewed 31 out of 31 changed files in this pull request and generated 3 comments.
```diff
 const validators_node = root.get("GENESIS_VALIDATORS") orelse root.get("genesis_validators") orelse return GenesisConfigError.MissingValidatorConfig;
 const result = try parseValidatorEntriesFromYaml(allocator, validators_node);

 return types.GenesisSpec{
     .genesis_time = genesis_time,
-    .validator_pubkeys = pubkeys,
+    .validator_attestation_pubkeys = result.attestation_pubkeys,
+    .validator_proposal_pubkeys = result.proposal_pubkeys,
 };
```
The YAML parser now requires GENESIS_VALIDATORS entries to be maps containing attestation_pubkey/proposal_pubkey. That is a breaking change for existing configs that used a simple list of scalar pubkeys (and the function still accepts genesis_validators key but not its legacy scalar shape). If backwards compatibility is intended, extend parseValidatorEntriesFromYaml to accept list items that are either (a) scalar pubkey (treated as both attestation+proposal) or (b) map with both fields; otherwise, update the surrounding docs/comments and error names to explicitly indicate the legacy scalar form is no longer supported.
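The either-scalar-or-map acceptance the comment suggests can be sketched as below. This models the parsed YAML node as a plain enum rather than using a real YAML library, and `Node`, `Keys`, and `parse_entry` are illustrative names, not zeam's actual API.

```rust
// Parsed-node stand-in: a validator entry is either a bare pubkey string
// (legacy) or a map carrying both keys (new structured form).
enum Node {
    Scalar(String),
    Map {
        attestation_pubkey: String,
        proposal_pubkey: String,
    },
}

struct Keys {
    attestation: String,
    proposal: String,
}

fn parse_entry(node: &Node) -> Result<Keys, &'static str> {
    match node {
        // legacy scalar form: one pubkey serves as both attestation and proposal key
        Node::Scalar(pk) => Ok(Keys {
            attestation: pk.clone(),
            proposal: pk.clone(),
        }),
        // new structured form with distinct keys
        Node::Map {
            attestation_pubkey,
            proposal_pubkey,
        } => Ok(Keys {
            attestation: attestation_pubkey.clone(),
            proposal: proposal_pubkey.clone(),
        }),
    }
}
```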
```diff
 /// Compute aggregated signatures using recursive aggregation:
 /// Step 1: Collect individual gossip signatures from signatures_map
 /// Step 2: Greedy child proof selection — new_payloads first, then known_payloads as helpers
 /// Step 3: Recursive aggregate — combine selected children + remaining gossip sigs
 pub fn computeAggregatedSignatures(
     self: *Self,
     attestations_list: []const Attestation,
     validators: *const Validators,
     signatures_map: *const SignaturesMap,
-    aggregated_payloads: ?*const AggregatedPayloadsMap,
+    new_payloads: ?*const AggregatedPayloadsMap,
+    known_payloads: ?*const AggregatedPayloadsMap,
```
This PR significantly changes the aggregation algorithm and inputs (new/known payloads + recursive merge). The previous dedicated aggregation test suite was removed (block_signatures_testing.zig), but an equivalent set of tests for the new behavior isn’t shown here. Please add/update unit tests covering: (1) gossip-only aggregation, (2) child-only aggregation (including multi-child), (3) overlap handling (children vs gossip), and (4) greedy selection priority (new_payloads preferred over known_payloads).
```diff
 // Collect gossip signatures for all validators in the group
 const inner_map = signatures_map.get(group.data);
 var validator_it = group.validator_bits.iterator(.{});
 while (validator_it.next()) |validator_id| {
     const vid: ValidatorIndex = @intCast(validator_id);
-    if (signatures_map.get(.{ .validator_id = vid, .data_root = data_root })) |sig_entry| {
         // Check if it's not a zero signature
+        const sig_entry_opt = if (inner_map) |im| im.get(vid) else null;
```
SignaturesMap.get() returns an InnerMap by value, which makes it easy to accidentally copy the hash map struct and obscures the fact you’re reading shared storage. Prefer using getPtr() here (e.g., const inner_map = signatures_map.getPtr(group.data);) and then query via the pointer; it makes ownership clearer and avoids unnecessary copying of the AutoHashMap wrapper.
Force-pushed: 78ca450 to 0c66a2e
Port 14 files that don't conflict with #656 store refactor:
- Dual keys: validator.zig, utils.zig, state.zig, configs/lib.zig
- AggregatedSignatureProof.aggregate(): aggregation.zig
- SignedBlock refactor: transition.zig, lib.zig, zk.zig
- Proposer signing: validator_client.zig
- Testing/infra: testing.zig, network.zig, database, state_transition_runner
- ValidatorKeys struct with attestation_keypair + proposal_keypair
- getAllPubkeys returns AllPubkeys{attestation_pubkeys, proposal_pubkeys}
- signBlockRoot for proposer block signing
- getAttestationPubkeyBytes / getProposalPubkeyBytes
- Retains #656 owned_keys tracking
- Replace SignatureKey/flat SignaturesMap with #656 wrapper (AttestationData -> InnerMap(ValidatorIndex -> StoredSignature))
- AggregatedPayloadsMap keyed by AttestationData (not Root)
- Retain devnet4 AggregatedAttestationsResult + extendProofsGreedily
- Add AggregateInnerMap helper from #656 (adapted for devnet4 aggregate)
- SignedBlock refactoring (flattened, no BlockWithAttestation)
- block_signatures_testing.zig needs further adaptation (TODO)
…ckages

Propagate the #656 store refactor naming changes to all consumer files:
- SignedBlockWithAttestation → SignedBlock, BlockWithAttestation removed
- .message.block.field → .message.field (flattened block structure)
- Validator.pubkey → attestation_pubkey + proposal_pubkey (dual keys)
- GenesisSpec.validator_pubkeys → validator_attestation_pubkeys
- key_manager.getAllPubkeys() returns AllPubkeys{attestation, proposal}
- signAttestation → signBlockRoot for proposer signatures
- Removed proposer_attestation from block processing
- aggregateCommitteeSignatures → aggregate(state, false)
- Update expected state root hash in genesis root comparison test
- Update config.yaml fixture to use attestation_pubkey/proposal_pubkey format
- Fix attestation_signatures field access in chain.zig test
- Fix gossip_signatures → attestation_signatures in forkchoice.zig
- Fix lean_pq_sig_attestation_signatures → lean_pq_sig_aggregated_signatures metric name in chain.zig
…aggregate()

aggregate() reads xmss_participants but does not consume them (it creates its own merged bitfield internally). The errdefer + participants_cleanup=false pattern prevented cleanup on the success path, leaking the AggregationBits. Fix: use an unconditional defer instead of the guarded errdefer.
Update leanSpec to a5df895 (includes devnet4 dual-key fixtures) and lean-quickstart to match the devnet4 branch requirements.
- Rename aggregateCommitteeSignatures → aggregate, maybeAggregateCommitteeSignaturesOnInterval → maybeAggregateOnInterval
- Rename validator_pubkeys → validator_attestation_pubkeys in comments
- Remove stale "signed proposer attestation" references in logs
- Fix signed_block.message.block → signed_block.message (SignedBlock flattening)
Force-pushed: 0c66a2e to 3896cdb
Pull request overview
Copilot reviewed 29 out of 29 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
pkgs/node/src/forkchoice.zig:1547
aggregateUnlocked holds signatures_mutex for the entire function (via defer unlock), but the comment says the metrics update happens outside the lock scope. Either unlock before updating gauges (if that's the intent) or adjust the comment to avoid misleading future changes, especially since aggregation work here can be relatively expensive.
```zig
// Capture counts before lock is released
new_payloads_count = self.latest_new_aggregated_payloads.count();
gossip_sigs_count = self.attestation_signatures.count();
// Update fork-choice store gauges after aggregation (outside lock scope)
zeam_metrics.metrics.lean_latest_new_aggregated_payloads.set(@intCast(new_payloads_count));
zeam_metrics.metrics.lean_gossip_signatures.set(@intCast(gossip_sigs_count));
```
```diff
 const all_pubkeys = try key_manager.getAllPubkeys(allocator, num_validators);
 errdefer allocator.free(all_pubkeys.attestation_pubkeys);
 errdefer allocator.free(all_pubkeys.proposal_pubkeys);

     genesis_config = types.GenesisSpec{
         .genesis_time = 1234,
-        .validator_pubkeys = pubkeys,
+        .validator_attestation_pubkeys = all_pubkeys.attestation_pubkeys,
+        .validator_proposal_pubkeys = all_pubkeys.proposal_pubkeys,
     };
     should_free_genesis = true;
 }
-defer if (should_free_genesis) allocator.free(genesis_config.validator_pubkeys);
+defer if (should_free_genesis) {
+    allocator.free(genesis_config.validator_attestation_pubkeys);
+    allocator.free(genesis_config.validator_proposal_pubkeys);
+};
```
In genMockChain, all_pubkeys.* is freed via errdefer immediately after allocation, and the same slices are also freed again by the later defer if (should_free_genesis) once they’re moved into genesis_config. If an error occurs after should_free_genesis is set, this can double-free those slices. Fix by removing the errdefer after ownership transfer (or guard it with a flag), and let a single cleanup path own the frees.
```diff
 // Populate validators from genesis pubkeys
-for (genesis.validator_pubkeys, 0..) |pubkey, i| {
-    const val = validator.Validator{ .pubkey = pubkey, .index = i };
+std.debug.assert(genesis.validator_attestation_pubkeys.len == genesis.validator_proposal_pubkeys.len);
```
genGenesisState uses std.debug.assert to require validator_attestation_pubkeys.len == validator_proposal_pubkeys.len. In release builds assertions may be disabled, and the subsequent multi-slice for will silently iterate to the shorter length, producing a state with mismatched/ignored validators. Prefer returning a real error when the lengths differ (e.g., error.InvalidGenesisSpec) so this is enforced in all build modes.
Suggested change:
```diff
-std.debug.assert(genesis.validator_attestation_pubkeys.len == genesis.validator_proposal_pubkeys.len);
+if (genesis.validator_attestation_pubkeys.len != genesis.validator_proposal_pubkeys.len) {
+    return error.InvalidGenesisSpec;
+}
```
Pull request overview
Copilot reviewed 29 out of 29 changed files in this pull request and generated 6 comments.
```zig
};

vid_to_sigmap_idx[validator_id] = sigmap_sigs.items.len;
try sigmap_sigs.append(allocator, sig);
try sigmap_pks.append(allocator, pk);
```
In computeAggregatedSignatures, pk is created and then appended into sigmap_pks, but there is no errdefer pk.deinit() before the append. If sigmap_pks.append(...) fails, the public key handle will leak. Add an errdefer (and disable it after a successful append) to keep error paths leak-free.
Suggested change:
```diff
 };
+var pk_added = false;
+errdefer if (!pk_added) pk.deinit();
 vid_to_sigmap_idx[validator_id] = sigmap_sigs.items.len;
 try sigmap_sigs.append(allocator, sig);
 try sigmap_pks.append(allocator, pk);
+pk_added = true;
```
```zig
try cache.put(validator_id, CachedKeyPair{
    .keypair = keypair,
    .attestation_keypair = attestation_keypair,
    .proposal_keypair = proposal_keypair,
    .num_active_epochs = num_active_epochs,
});
```
getOrCreateCachedKeyPair has errdefer cleanup for attestation_keypair but not for proposal_keypair. If cache.put(...) fails (OOM) after generating both keypairs, the proposal keypair will leak. Add symmetric error cleanup for proposal_keypair (or generate it as var with errdefer) before the cache.put call.
```zig
if (!has_raw and !has_children) return error.AggregationInvalidInput;
if (!has_raw and children.len < 2) return error.AggregationInvalidInput;
```
AggregatedSignatureProof.aggregate allows the case xmss_participants == null with children.len >= 2, but it still unconditionally calls xmss.aggregateSignatures(...). The XMSS FFI aggregator fails on zero raw signatures, so this path will always error even though it passes the input validation. Either reject the no-raw case for now, or implement a non-FFI fallback (e.g., dummy proof_data / purely merging participants) until recursive aggregation is supported end-to-end.
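The fail-fast option can be sketched as a small validation helper. This is an illustrative Rust sketch of the logic, not zeam's Zig code; the error strings stand in for the Zig error values.

```rust
// Reject inputs the current FFI cannot aggregate: anything with no raw
// signatures, including the children-only path that today always fails
// inside xmss.aggregateSignatures.
fn validate_inputs(num_raw: usize, num_children: usize) -> Result<(), &'static str> {
    if num_raw == 0 && num_children == 0 {
        return Err("AggregationInvalidInput: nothing to aggregate");
    }
    if num_raw == 0 {
        // children-only aggregation is unsupported until recursive
        // proof composition lands in the Rust bindings
        return Err("AggregationInvalidInput: children-only unsupported");
    }
    Ok(())
}
```

Failing here, before the prover is set up, gives callers a precise error instead of a generic AggregationFailed from deep inside the FFI.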
```zig
try xmss.aggregateSignatures(
    raw_xmss_pks,
    raw_xmss_sigs,
    message_hash,
```
xmss.aggregateSignatures appends into result.proof_data and does not clear existing content (see xmss aggregation implementation). AggregatedSignatureProof.aggregate currently calls it without resetting result.proof_data, so re-aggregating into an existing result will produce an invalid proof by concatenating bytes. Consider clearing/reinitializing result.proof_data before the FFI call (the function already has an allocator parameter now, so it can safely reset the list).
```diff
 var sig = try xmss.Signature.fromBytes(ve.stored_sig.signature[0..]);
 errdefer sig.deinit();

 const val = try validators.get(ve.validator_id);
-var pk = try xmss.PublicKey.fromBytes(&val.pubkey);
+var pk = try xmss.PublicKey.fromBytes(&val.attestation_pubkey);
```
In AggregateInnerMap, sig/pk are created with errdefer ...deinit() and then appended into sigs/pks. If an error occurs after the append, the errdefer will still run and deinit the same handle that will later also be deinited by the arraylist cleanup, causing a double-deinit. Use a cleanup flag (set to false after successful append) or move the deinit responsibility exclusively to the list (no errdefer once appended).
```zig
var sig = xmss.Signature.fromBytes(&sig_entry.signature) catch continue;
errdefer sig.deinit();
```
In computeAggregatedSignatures, sig is created with errdefer sig.deinit(); but then appended into sigmap_sigs. If a later try in this scope fails, the errdefer will deinit the signature even though the list cleanup will also deinit it, leading to a double-deinit. Use a cleanup flag (disable after successful append) or restructure to only deinit via the list.
- Register block body attestations with fork choice via onAttestation() to mirror chain.zig onBlock behavior for correct LMD-GHOST weights
- Remove fabricated proposer gossip attestation that injected phantom votes not present in leanSpec fixtures, causing head slot mismatches
- Remove safe target regression guard that incorrectly errored on legitimate safe target decreases during reorg scenarios
- Add latestJustifiedRoot and latestJustifiedRootLabel check handlers
… devnet4 scheme

- Update multisig-glue to use rec_aggregation from leanMultisig with children support (child pub keys, child proofs, log_inv_rate)
- Update hashsig-glue to leansig_wrapper with devnet4 aborting hypercube scheme (V=46, Poseidon1)
- Update Zig FFI layer (xmss/aggregation.zig) with children + log_inv_rate
- Pass children pub keys and INVERSE_PROOF_SIZE=2 through aggregation.zig
- Wire recursive child resolution in block.zig for block building
- Fix SIGSIZE from 3112 to 2536 to match V=46 signature size
Pull request overview
Copilot reviewed 35 out of 36 changed files in this pull request and generated 8 comments.
```zig
// FFI call — passes children proofs + their public keys for true recursive aggregation
try xmss.aggregateSignatures(
    raw_xmss_pks,
    raw_xmss_sigs,
    children_pub_keys,
    children_proof_refs,
    message_hash,
    @intCast(epoch),
    INVERSE_PROOF_SIZE,
    &result.proof_data,
);
```
xmss.aggregateSignatures appends into the provided result.proof_data list. If result is reused (or proof_data is non-empty), this will produce an invalid proof by concatenating multiple serializations. Clear/reset result.proof_data before the FFI call (and ideally document that the output list must start empty).
```zig
// Copy the bytes to the output
for (buffer[0..bytes_written]) |byte| {
    try multisig_aggregated_signature.append(byte);
}
```
This function appends serialized bytes into multisig_aggregated_signature without clearing it first. If the caller passes a non-empty list (or reuses the same list across calls), the output will contain concatenated proofs and verification will fail. Consider clearing the list at the start (or change the API to return a fresh list) and document the required precondition explicitly.
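The clear-before-write precondition can be sketched in a few lines; `write_proof` is a made-up helper name for the sketch, standing in for the serialization step of the real function.

```rust
// Reset the output buffer before appending the serialized proof, so reusing
// the same buffer across aggregations cannot concatenate two serializations.
fn write_proof(out: &mut Vec<u8>, serialized: &[u8]) {
    out.clear(); // discard bytes from any previous aggregation
    out.extend_from_slice(serialized);
}
```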
```rust
let proof_lens = slice::from_raw_parts(child_proof_lens, num_children);

let total_child_pks: usize = num_keys_arr.iter().sum();
let all_pk_ptrs = slice::from_raw_parts(child_all_pub_keys, total_child_pks);
```
total_child_pks is computed with sum() on an untrusted child_num_keys array from FFI. usize overflow can wrap in release builds, which can lead to constructing undersized slices and out-of-bounds reads/panics. Use checked accumulation (e.g., try_fold + checked_add) and return null on overflow/inconsistent inputs.
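The checked accumulation suggested here is a one-liner with `try_fold` + `checked_add`; the function name is illustrative.

```rust
// Sum the untrusted child_num_keys array with overflow detection: a usize
// wrap-around (silent in release builds) becomes an explicit None that the
// FFI layer can map to a null return instead of an out-of-bounds slice.
fn total_child_pks(num_keys: &[usize]) -> Option<usize> {
    num_keys.iter().try_fold(0usize, |acc, &n| acc.checked_add(n))
}
```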
```diff
 // Serialize secret key using postcard (serde-based)
 let sk_bytes = match postcard::to_allocvec(&private_key_ref.inner) {
     Ok(bytes) => bytes,
     Err(_) => return 0,
 };

-if ssz_bytes.len() > buffer_len {
+if sk_bytes.len() > buffer_len {
     return 0;
```
The surrounding comment still says “SSZ encoding”, but this function now serializes the secret key using postcard (postcard::to_allocvec). Update the docstring/API naming so FFI callers don’t assume SSZ for private key bytes (and so it’s clear which encoding hashsig_keypair_from_ssz expects for the secret key).
```zig
pub fn aggregateSignatures(
    public_keys: []*const hashsig.HashSigPublicKey,
    signatures: []*const hashsig.HashSigSignature,
    children_pub_keys: []const []*const hashsig.HashSigPublicKey,
    children_proofs: []const *const ByteListMiB,
    message_hash: *const [32]u8,
    epoch: u32,
    log_inv_rate: usize,
    multisig_aggregated_signature: *ByteListMiB,
) !void {
    if (public_keys.len != signatures.len) {
        return AggregationError.PublicKeysSignatureLengthMismatch;
    }
    if (children_pub_keys.len != children_proofs.len) {
        return AggregationError.AggregationFailed;
    }

    setupProver();

    const num_children = children_pub_keys.len;
    const allocator = std.heap.c_allocator;

    // Build flat child pub key array and per-child key counts
    var total_child_pks: usize = 0;
    for (children_pub_keys) |cpks| {
        total_child_pks += cpks.len;
    }

    const child_all_pks = allocator.alloc(*const hashsig.HashSigPublicKey, total_child_pks) catch
        return AggregationError.AggregationFailed;
    defer allocator.free(child_all_pks);

    const child_num_keys = allocator.alloc(usize, num_children) catch
        return AggregationError.AggregationFailed;
    defer allocator.free(child_num_keys);

    const child_proof_ptrs = allocator.alloc([*]const u8, num_children) catch
        return AggregationError.AggregationFailed;
    defer allocator.free(child_proof_ptrs);

    const child_proof_lens = allocator.alloc(usize, num_children) catch
        return AggregationError.AggregationFailed;
    defer allocator.free(child_proof_lens);

    var pk_offset: usize = 0;
    for (0..num_children) |i| {
        const cpks = children_pub_keys[i];
        child_num_keys[i] = cpks.len;
        for (cpks, 0..) |pk, j| {
            child_all_pks[pk_offset + j] = pk;
        }
        pk_offset += cpks.len;

        const proof_slice = children_proofs[i].constSlice();
        child_proof_ptrs[i] = proof_slice.ptr;
        child_proof_lens[i] = proof_slice.len;
    }

    const agg_sig = xmss_aggregate(
        public_keys.ptr,
        public_keys.len,
        signatures.ptr,
        signatures.len,
        public_keys.len,
        num_children,
        child_all_pks.ptr,
        child_num_keys.ptr,
        child_proof_ptrs.ptr,
        child_proof_lens.ptr,
        message_hash,
        epoch,
        log_inv_rate,
    ) orelse return AggregationError.AggregationFailed;
```
The epoch parameter is now forwarded to the Rust FFI as slot, and the underlying aggregation/verification APIs are slot-based. Renaming this parameter (and related comments/tests) to slot would avoid callers accidentally passing an epoch number and producing unverifiable signatures/proofs.
```rust
for i in 0..num_children {
    // Collect pub keys for this child
    let n = num_keys_arr[i];
    let mut pks = Vec::with_capacity(n);
    for j in 0..n {
        let pk_ptr = all_pk_ptrs[pk_offset + j];
        if pk_ptr.is_null() {
            return std::ptr::null();
        }
        pks.push((*pk_ptr).inner.clone());
```
Indexing all_pk_ptrs[pk_offset + j] can panic if child_num_keys is inconsistent with the provided child_all_pub_keys backing array (or if the sum/offset bookkeeping is wrong). Panics across extern "C" are highly undesirable; add explicit bounds checks (e.g., verify pk_offset + n <= all_pk_ptrs.len() before the inner loop) and return null on invalid inputs.
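The explicit bounds check can be factored into a small helper; `child_window` is an illustrative name, not part of the actual glue crate.

```rust
use std::ops::Range;

// Validate that the per-child window fits inside the flat pointer array
// before indexing, and report failure instead of panicking across the
// extern "C" boundary. checked_add also guards the offset arithmetic.
fn child_window(all_len: usize, pk_offset: usize, n: usize) -> Option<Range<usize>> {
    let end = pk_offset.checked_add(n)?;
    if end > all_len {
        return None; // child_num_keys inconsistent with the backing array
    }
    Some(pk_offset..end)
}
```

The FFI loop would then slice `all_pk_ptrs[window]` only when `child_window` returns `Some`, and return null otherwise.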
```diff
 /// Reconstruct a key pair from SSZ-encoded secret and public keys
 /// Returns a pointer to the KeyPair or null on error
 /// # Safety
 /// This is meant to be called from zig, so the pointers will always dereference correctly
 #[no_mangle]
 pub unsafe extern "C" fn hashsig_keypair_from_ssz(
     private_key_ptr: *const u8,
     private_key_len: usize,
     public_key_ptr: *const u8,
     public_key_len: usize,
 ) -> *mut KeyPair {
     if private_key_ptr.is_null() || public_key_ptr.is_null() {
         return ptr::null_mut();
     }

     unsafe {
         let sk_slice = slice::from_raw_parts(private_key_ptr, private_key_len);
         let pk_slice = slice::from_raw_parts(public_key_ptr, public_key_len);

-        let private_key: HashSigPrivateKey = match HashSigPrivateKey::from_ssz_bytes(sk_slice) {
+        let private_key: HashSigPrivateKey = match postcard::from_bytes(sk_slice) {
             Ok(key) => key,
```
Docstring says the secret key is SSZ-encoded, but the implementation now deserializes the secret key using postcard::from_bytes. This is an API contract mismatch for FFI callers; update the docs (and ideally the function name) to reflect the actual encoding used for the private key bytes.
```zig
// Not enough epochs, remove old key pair and regenerate
var old = cache.fetchRemove(validator_id).?.value;
old.keypair.deinit();
old.attestation_keypair.deinit();
old.proposal_keypair.deinit();
```
getOrCreateCachedKeyPair removes and deinitializes the cached keypairs when num_active_epochs increases. Because ValidatorKeys returned from the cache are bitwise copies into each KeyManager (and cached keys are intentionally not deinit’d on KeyManager.deinit), this can invalidate previously returned/borrowed key handles and lead to use-after-free if multiple test key managers exist concurrently or if a cached key is reused after the cache grows. Safer options are to never deinit cached entries (leak in tests), key the cache by (validator_id, num_active_epochs), or use reference-counted ownership so old handles remain valid.
leanMultisig's mt-koala-bear crate uses compile-time #[cfg(target_feature)] for SIMD dispatch (AVX2/AVX512/NEON). The previous .cargo/config.toml approach was ineffective because CI sets RUSTFLAGS=-D warnings via environment, which causes Cargo to ignore config.toml rustflags entirely. Switch to CARGO_ENCODED_RUSTFLAGS which has highest precedence and is not clobbered by the RUSTFLAGS environment variable.
Pull request overview
Copilot reviewed 36 out of 37 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (1)
rust/hashsig-glue/src/lib.rs:165
`hashsig_keypair_from_ssz` now deserializes the secret key with `postcard::from_bytes`, but the API name/docs (and Zig bindings) indicate SSZ for both secret and public keys. This will break loading pre-generated `*_sk.ssz` files and any callers expecting SSZ. Either restore SSZ decoding for the secret key, or rename/introduce new FFI entrypoints (e.g., `*_from_postcard`/`*_from_bytes`) and update the Zig side accordingly.
build.zig (outdated)

```zig
// (AVX2/AVX512 on x86_64, NEON on aarch64). It requires target-cpu=native so the
// compiler enables the right feature flags. We use CARGO_ENCODED_RUSTFLAGS so we
// don't clobber any RUSTFLAGS already set in the environment (e.g. CI's -D warnings).
cargo_build.setEnvironmentVariable("CARGO_ENCODED_RUSTFLAGS", "-Ctarget-cpu=native");
```
Setting CARGO_ENCODED_RUSTFLAGS unconditionally here will override any existing CARGO_ENCODED_RUSTFLAGS provided by the environment, despite the comment about not clobbering flags. Also, -Ctarget-cpu=native makes the produced staticlibs CPU-specific (non-portable) and can hurt reproducibility. Consider appending to any existing encoded rustflags and/or gating target-cpu=native behind an opt-in env var or a build option.
Suggested change:

```diff
-// (AVX2/AVX512 on x86_64, NEON on aarch64). It requires target-cpu=native so the
-// compiler enables the right feature flags. We use CARGO_ENCODED_RUSTFLAGS so we
-// don't clobber any RUSTFLAGS already set in the environment (e.g. CI's -D warnings).
-cargo_build.setEnvironmentVariable("CARGO_ENCODED_RUSTFLAGS", "-Ctarget-cpu=native");
+// (AVX2/AVX512 on x86_64, NEON on aarch64). It can use target-cpu=native so the
+// compiler enables the right feature flags, but this makes builds non-portable and
+// less reproducible. Allow callers to opt in via a Zig build option instead of
+// forcing it unconditionally.
+const enable_native_rustflags =
+    b.option(bool, "enable-native-rustflags", "Enable -Ctarget-cpu=native for Rust builds") orelse false;
+if (enable_native_rustflags) {
+    cargo_build.setEnvironmentVariable("CARGO_ENCODED_RUSTFLAGS", "-Ctarget-cpu=native");
+}
```
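If the native flags are kept, the appending concern can also be addressed directly. The sketch below is a hedged illustration, not the build's API: cargo documents that `CARGO_ENCODED_RUSTFLAGS` joins individual flags with the ASCII 0x1F (unit separator) character, so a new flag can be appended to any value already in the environment instead of overwriting it. The helper name is hypothetical.

```rust
use std::env;

// Append a flag to CARGO_ENCODED_RUSTFLAGS without clobbering existing flags.
// Cargo separates individual flags in this variable with ASCII 0x1F.
fn append_encoded_rustflags(existing: Option<String>, extra: &str) -> String {
    match existing {
        Some(prev) if !prev.is_empty() => format!("{prev}\u{1f}{extra}"),
        _ => extra.to_string(),
    }
}

fn main() {
    // Merge with whatever the environment (e.g. CI) already set.
    let merged = append_encoded_rustflags(
        env::var("CARGO_ENCODED_RUSTFLAGS").ok(),
        "-Ctarget-cpu=native",
    );
    // The merged value would then be set on the cargo invocation.
    println!("{merged}");
}
```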
```rust
let total_child_pks: usize = num_keys_arr.iter().sum();
let all_pk_ptrs = slice::from_raw_parts(child_all_pub_keys, total_child_pks);

let mut pk_offset: usize = 0;
for i in 0..num_children {
    // Collect pub keys for this child
    let n = num_keys_arr[i];
    let mut pks = Vec::with_capacity(n);
    for j in 0..n {
        let pk_ptr = all_pk_ptrs[pk_offset + j];
        if pk_ptr.is_null() {
            return std::ptr::null();
        }
        pks.push((*pk_ptr).inner.clone());
    }
    pk_offset += n;
    children_pks.push(pks);
```
FFI robustness: xmss_aggregate derives total_child_pks via num_keys_arr.iter().sum() and then performs unchecked pk_offset += n arithmetic. With adversarial inputs, usize overflow can wrap and lead to panics during indexing, which is especially problematic across an extern "C" boundary. Consider using checked addition (and early null return) when summing/advancing offsets to guarantee no panics/UB.
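A minimal sketch of the checked summation, under the assumption that the counts arrive as a `usize` slice mirroring `num_keys_arr` (the function name is illustrative). `checked_add` returns `None` on overflow, so the caller can bail out with a null pointer instead of wrapping or panicking across the `extern "C"` boundary.

```rust
// Overflow-safe total of per-child public-key counts. Returns None on
// usize overflow so the FFI caller can return null instead of panicking.
fn total_child_pks(num_keys_arr: &[usize]) -> Option<usize> {
    num_keys_arr
        .iter()
        .try_fold(0usize, |acc, &n| acc.checked_add(n))
}

fn main() {
    assert_eq!(total_child_pks(&[2, 3, 4]), Some(9));
    // Adversarial input that a plain `.sum()` would wrap (or panic on, in debug builds):
    assert_eq!(total_child_pks(&[usize::MAX, 1]), None);
}
```

The same `checked_add` pattern applies to the `pk_offset += n` advance inside the loop.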
```diff
 #[derive(Debug, thiserror::Error)]
 pub enum SigningError {
-    #[error("Signing failed: {0:?}")]
-    SigningFailed(leansig::signature::SigningError),
+    #[error("Signing failed")]
+    SigningFailed,
 }
```
SigningError::SigningFailed no longer preserves the underlying signing error details. If this error is surfaced in logs/tests, it may be harder to debug failures. Consider keeping the original error as a source (or at least capturing it as a string) while still presenting a stable top-level error message.
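One way to keep a stable top-level message while preserving the cause, sketched here with only the standard library (with `thiserror`, the equivalent is a `#[source]`/`#[from]` field). `InnerSigningError` is a hypothetical stand-in for leansig's signing error type.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical stand-in for the underlying library's signing error.
#[derive(Debug)]
struct InnerSigningError(String);

impl fmt::Display for InnerSigningError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}
impl Error for InnerSigningError {}

#[derive(Debug)]
enum SigningError {
    SigningFailed(InnerSigningError),
}

impl fmt::Display for SigningError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Stable top-level message, independent of the cause.
        write!(f, "Signing failed")
    }
}
impl Error for SigningError {
    // The underlying detail remains reachable via source().
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            SigningError::SigningFailed(inner) => Some(inner),
        }
    }
}

fn main() {
    let err = SigningError::SigningFailed(InnerSigningError("bad epoch".into()));
    assert_eq!(err.to_string(), "Signing failed");
    assert_eq!(err.source().unwrap().to_string(), "bad epoch");
}
```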
```diff
 /// Serialize a private key pointer to bytes using SSZ encoding
 /// Returns number of bytes written, or 0 on error
 /// # Safety
 /// buffer must point to a valid buffer of sufficient size
 #[no_mangle]
 pub unsafe extern "C" fn hashsig_private_key_to_bytes(
     private_key: *const PrivateKey,
     buffer: *mut u8,
     buffer_len: usize,
 ) -> usize {
     if private_key.is_null() || buffer.is_null() {
         return 0;
     }

     unsafe {
         let private_key_ref = &*private_key;

-        // Directly SSZ encode the private key (leansig has SSZ support built-in)
-        let ssz_bytes = private_key_ref.inner.as_ssz_bytes();
+        // Serialize secret key using postcard (serde-based)
+        let sk_bytes = match postcard::to_allocvec(&private_key_ref.inner) {
+            Ok(bytes) => bytes,
+            Err(_) => return 0,
+        };

-        if ssz_bytes.len() > buffer_len {
+        if sk_bytes.len() > buffer_len {
             return 0;
         }

         let output_slice = slice::from_raw_parts_mut(buffer, buffer_len);
-        output_slice[..ssz_bytes.len()].copy_from_slice(&ssz_bytes);
-        ssz_bytes.len()
+        output_slice[..sk_bytes.len()].copy_from_slice(&sk_bytes);
+        sk_bytes.len()
     }
```
hashsig_private_key_to_bytes now serializes the secret key with postcard, but the function comment says “SSZ encoding” and Zig wrappers call it “SSZ format”. This mismatch is likely to cause silent interoperability failures; please align the docs + naming with the actual encoding, or switch back to SSZ for the private key if SSZ is still the intended wire/file format.
```diff
     let seed = hasher.finalize().into();

     let (public_key, private_key) = PrivateKey::generate(
-        &mut <ChaCha20Rng as SeedableRng>::from_seed(seed),
-        activation_epoch,
-        num_active_epochs,
+        &mut StdRng::from_seed(seed),
+        activation_epoch as u32,
+        num_active_epochs as u32,
     );

     let keypair = Box::new(KeyPair {
```
FFI safety: hashsig_keypair_generate should defensively handle seed_phrase == NULL before calling CStr::from_ptr (which is UB on null). Even though this hunk changes the RNG, it would be good to add that null check in this function and return a null pointer on invalid input.
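A sketch of that defensive pattern, with a hypothetical helper name (the real entrypoint takes more arguments): check for null before `CStr::from_ptr`, which is undefined behavior on a null pointer, and signal failure to the caller instead of dereferencing.

```rust
use std::ffi::{c_char, CStr};

// Illustrative helper: convert a possibly-null C string into an owned String.
// Returns None on null or invalid UTF-8 so the FFI entrypoint can return a
// null pointer instead of hitting UB.
unsafe fn seed_phrase_from_ptr(seed_phrase: *const c_char) -> Option<String> {
    if seed_phrase.is_null() {
        return None;
    }
    // SAFETY: non-null was checked above; the caller must still guarantee the
    // pointer references a valid NUL-terminated string.
    unsafe { CStr::from_ptr(seed_phrase) }
        .to_str()
        .ok()
        .map(str::to_owned)
}

fn main() {
    let phrase = std::ffi::CString::new("test seed").unwrap();
    assert_eq!(
        unsafe { seed_phrase_from_ptr(phrase.as_ptr()) },
        Some("test seed".to_string())
    );
    assert_eq!(unsafe { seed_phrase_from_ptr(std::ptr::null()) }, None);
}
```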
```diff
 if (cache.get(validator_id)) |cached| {
     if (cached.num_active_epochs >= num_active_epochs) {
         std.debug.print("CACHE HIT: validator {d}\n", .{validator_id});
-        return cached.keypair;
+        return ValidatorKeys{
+            .attestation_keypair = cached.attestation_keypair,
+            .proposal_keypair = cached.proposal_keypair,
+        };
     }
     // Not enough epochs, remove old key pair and regenerate
     var old = cache.fetchRemove(validator_id).?.value;
-    old.keypair.deinit();
+    old.attestation_keypair.deinit();
+    old.proposal_keypair.deinit();
 }
```
The global cached keypairs are returned by value and then stored in KeyManager. If getOrCreateCachedKeyPair later regenerates (removes + deinit) a cached entry for the same validator, any previously returned copies become dangling (use-after-free) because they share the same underlying Rust handle pointers. Consider storing cached keypairs behind stable ownership (e.g., pointers/refs) or making the cache monotonic (never deinit/replace entries; only grow num_active_epochs).
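The reference-counted option can be sketched as follows (in Rust for brevity; the Zig analogue would be an explicit refcount or arena). `KeyPair` here is a hypothetical stand-in for the FFI handle; the point is that regenerating a cache entry never invalidates handles handed out earlier.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-in for the FFI keypair handle.
struct KeyPair {
    num_active_epochs: u64,
}

struct KeyCache {
    entries: HashMap<u64, Rc<KeyPair>>,
}

impl KeyCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    fn get_or_create(&mut self, validator_id: u64, num_active_epochs: u64) -> Rc<KeyPair> {
        if let Some(existing) = self.entries.get(&validator_id) {
            if existing.num_active_epochs >= num_active_epochs {
                return Rc::clone(existing);
            }
        }
        // Regenerate: the map drops its Rc to the old entry, but any clone a
        // KeyManager still holds keeps the old keypair alive until dropped.
        let fresh = Rc::new(KeyPair { num_active_epochs });
        self.entries.insert(validator_id, Rc::clone(&fresh));
        fresh
    }
}

fn main() {
    let mut cache = KeyCache::new();
    let old = cache.get_or_create(7, 4);
    let new = cache.get_or_create(7, 8); // forces regeneration
    assert_eq!(old.num_active_epochs, 4); // old handle is still valid
    assert_eq!(new.num_active_epochs, 8);
}
```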
```zig
self.signatures_mutex.lock();
defer self.signatures_mutex.unlock();

// Build attestation list — recursive includes new payloads, plain does not
var attestations = if (recursive)
    try self.attestationsFromGossipAndNewPayloads()
else
    try self.attestationsFromAttestationSignatures();
defer attestations.deinit(self.allocator);
```
aggregateUnlocked holds signatures_mutex for the entire aggregation, including hashing, proof selection, cloning, and allocations. This is a potentially long critical section and contradicts the comment about updating metrics outside the lock scope. Consider copying the minimal required data structures under the lock, releasing the lock for the expensive work, then re-locking only to commit map updates/removals.
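The snapshot-then-commit shape can be sketched as follows (Rust `Mutex` standing in for the Zig mutex; names and the `u64` payload are illustrative, not the codebase's API):

```rust
use std::sync::Mutex;

struct Aggregator {
    // Stand-in for the signature map guarded by signatures_mutex.
    signatures: Mutex<Vec<u64>>,
}

impl Aggregator {
    fn aggregate(&self) -> u64 {
        // 1) Copy the minimal required data under the lock, then release it.
        let snapshot: Vec<u64> = self.signatures.lock().unwrap().clone();

        // 2) Expensive work (hashing, proof selection, cloning) happens
        //    without holding the lock.
        let aggregated: u64 = snapshot.iter().sum();

        // 3) Re-lock briefly only to commit map updates/removals.
        self.signatures.lock().unwrap().clear();
        aggregated
    }
}

fn main() {
    let agg = Aggregator { signatures: Mutex::new(vec![1, 2, 3]) };
    assert_eq!(agg.aggregate(), 6);
    assert!(agg.signatures.lock().unwrap().is_empty());
}
```

Note that with this shape, entries added between steps 1 and 3 would be dropped by the `clear()`; a real commit step would remove only the snapshotted keys.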
```diff
 pub fn verifyAggregatedPayload(public_keys: []*const hashsig.HashSigPublicKey, message_hash: *const [32]u8, epoch: u32, agg_sig: *const ByteListMiB) !void {
-    // Get bytes from MultisigAggregatedSignature
+    // Get bytes from aggregated signature
     const sig_bytes = agg_sig.constSlice();

     setupVerifier();

-    // Deserialize to Devnet2XmssAggregateSignature
-    const devnet_sig = xmss_aggregate_signature_from_bytes(sig_bytes.ptr, sig_bytes.len) orelse {
-        return AggregationError.DeserializationFailed;
-    };
-    defer xmss_free_aggregate_signature(devnet_sig);

-    // Verify
+    // Verify directly from bytes (Rust deserializes internally)
     const result = xmss_verify_aggregated(
         public_keys.ptr,
         public_keys.len,
         message_hash,
-        devnet_sig,
+        sig_bytes.ptr,
+        sig_bytes.len,
         epoch,
     );
```
In aggregateSignatures/verifyAggregatedPayload, the parameter is still named epoch, but it is now passed to the Rust FFI as slot (and used as such in devnet4 aggregation). Renaming this parameter (and related docs) to slot would reduce confusion and prevent accidental misuse when epoch/slot semantics diverge.
ring 0.17 fails compile-time feature assertions on aarch64-apple-darwin when target-cpu=native is set. The SIMD dispatch fix is only needed on x86_64 for leanMultisig's AVX2/AVX512 codepaths. Also include -Dwarnings in the encoded flags so CI warning-as-error behavior is preserved.