Improve BFT-shard throughput and proof readiness by jait91 · Pull Request #151 · unicitynetwork/aggregator-go

jait91 · 2026-05-11T14:59:32Z

This PR improves BFT-shard throughput and proof-readiness under load.

Main changes:

- Add finalization/proof-readiness timing metrics.
- Chunk and parallelize Mongo finalization inserts.
- Remove unused write-heavy Mongo indexes from the hot path.
- Add configurable BFT-shard precollection with Redis-backed replay safety.
- Default async v2 submit behavior by skipping finalized duplicate lookup.
- Add cheap in-memory proof-not-ready handling to reduce Mongo pressure from early proof polling.
- Improve performance-test polling/worker behavior.
- Align BFT-sharding compose defaults with the tested perf configuration.
- Add measured performance results in docs/aggregator-performance.md.

Notes

The index changes assume a fresh DB for this branch’s tested path. Existing Mongo databases will keep previously created indexes until they are dropped manually.

Unused indexes removed by this PR:

- aggregator_records.leafIndex
- aggregator_records.finalizedAt
- aggregator_records.blockNumber_1_leafIndex_1
- block_records.stateIds
- block_records.createdAt
- smt_nodes.hash
- smt_nodes.createdAt

Clean DBs need no migration.

gemini-code-assist

Code Review

This pull request introduces significant performance enhancements, including sharding support, optimized MongoDB batch inserts, and a precollection mechanism to improve throughput and latency. It also adds a /health/leader endpoint for HAProxy and updates the performance test suite. Review feedback identified a critical compilation error regarding sync.WaitGroup usage and suggested improving error handling for JSON marshaling.

Copilot

Pull request overview

This PR is a substantial throughput/proof-readiness improvement pass for the BFT-shard configuration. It introduces parallel chunked Mongo finalization writes, removes write-heavy indexes from the hot path, adds an active precollector with Redis-backed replay safety for standalone/bft-shard rounds, defaults v2 submit to skip the finalized duplicate lookup, adds a cheap in-memory "proof not ready" short-circuit to reduce Mongo polling pressure, refactors the perf test polling/scheduler, adds a leader-only health endpoint, and aligns compose defaults and docs with the tested perf configuration.

Changes:

- New finalization insert chunking + parallel workers, removal of cold indexes, and finalize timing breakdown logging.
- Active precollector + grace-period handoff, configurable collect window, classified leaf-add (added/duplicate/rejected), and an in-memory proofPending cache for early get_inclusion_proof.v2 requests.
- Perf test: per-job proof scheduling, startup-probe wait, X-State-ID propagation; compose/Makefile updates aligning bft-sharding stack with tested perf settings; new /health/leader endpoint for HAProxy.

Reviewed changes

Copilot reviewed 45 out of 45 changed files in this pull request and generated 7 comments.

Show a summary per file

| File | Description |
| --- | --- |
| internal/config/config.go(_test.go) | Adds new processing/database knobs and validation; defaults `SKIP_DUPLICATE_CHECK=true`. |
| internal/storage/mongodb/batch_insert.go | New helper for chunked, optionally parallel `InsertMany` with duplicate-key tolerance. |
| internal/storage/mongodb/{aggregator_record,smt,connection,block_records}.go(_test.go) | Wire chunked finalization inserts; in-memory leaf-index sort; trim cold indexes; remove `BlockRecordsStorage.GetByStateID`. |
| internal/storage/mongodb/index_test.go | Asserts production index set after `CreateIndexes`. |
| internal/storage/redis/commitment.go(_test.go) | Move pending-sweep into stream loop so live `ResetPendingSweep` is honored without restart; new tests. |
| internal/smt/thread_safe_smt_snapshot.go(_test.go) | Adds `AddLeavesClassified` returning added/duplicate/rejected indexes. |
| internal/round/{leaf_add,batch_processor,precollector,round_manager,parent_round_manager,factory}.go | Active precollector lifecycle, grace-period handoff, classified leaf adds, finalize timing breakdown, proof-pending cache, recovery reconciliation. |
| internal/round/*_test.go | Tests for new precollector lifecycle, recovery reconciliation, classified adds, signature update for `processMiniBatch`. |
| internal/service/service.go(_test.go) | Optional duplicate-check skip; in-memory not-ready short-circuit using `GetKnownNotReadyBlock`. |
| internal/bft/{client,client_stub,client_stub_test}.go | New `StartNextRoundFromPrecollector` interface used after UC handling and in stub. |
| internal/gateway/{server,handlers_rest,handlers_rest_test}.go | New `/health/leader` endpoint and role check. |
| cmd/performance-test/{main,types}.go | Per-job proof scheduling, startup probe with retries, finalize-breakdown log parsing, new metrics. |
| Makefile, compose.yml, scripts/haproxy.cfg, scripts/mongo-init.js | Compose/Makefile alignment, leader healthcheck wiring, separate per-shard Mongo, dropped indexes. |
| docs/aggregator-performance.md | New measured perf results document. |
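
For readers unfamiliar with leader-only health checks, here is a minimal sketch of what an endpoint like `/health/leader` could look like. The handler body is not shown in this thread, so the status codes (200 for the leader, 503 otherwise), the `isLeader` callback, and the `probe` helper are all assumptions made for illustration, not the PR's actual implementation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// leaderHealthHandler is a hypothetical handler: it answers 200 only while
// isLeader reports true and 503 otherwise, so a load balancer health check
// would route traffic to the current leader only.
func leaderHealthHandler(isLeader func() bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if !isLeader() {
			http.Error(w, "not leader", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
		fmt.Fprintln(w, "leader")
	}
}

// probe issues one in-process GET against the handler and returns the status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health/leader", nil)
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	var leader atomic.Bool // assumed role flag; the real role check is not shown here
	h := leaderHealthHandler(leader.Load)

	fmt.Println(probe(h)) // 503 while the node is a follower
	leader.Store(true)
	fmt.Println(probe(h)) // 200 once the node is the leader
}
```

Returning a non-2xx status for followers lets the proxy treat them as down without any extra routing logic, which is presumably why the endpoint is paired with the HAProxy config change.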

jait91 · 2026-05-15T08:29:24Z

+			MaxCommitmentsPerRound:     getEnvIntOrDefault("MAX_COMMITMENTS_PER_ROUND", 20000),
+			CollectPhaseDuration:       getEnvDurationOrDefault("COLLECT_PHASE_DURATION", "200ms"),
+			CommitmentStreamBufferSize: getEnvIntOrDefault("COMMITMENT_STREAM_BUFFER_SIZE", 50000),
+			SkipDuplicateCheck:         getEnvBoolOrDefault("SKIP_DUPLICATE_CHECK", true),


Intentional. This matches v2 async behavior: submit success only means the request was accepted for processing; the proof result is authoritative. Duplicate/idempotent submits are allowed, and double-spend detection happens via the eventual proof outcome. SKIP_DUPLICATE_CHECK=false keeps the old submit-time check available.
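
As a rough illustration of how the `SKIP_DUPLICATE_CHECK` default above resolves, here is a sketch of an env-bool helper. The body of the repo's `getEnvBoolOrDefault` is not shown in this thread; the fallback rules below (unset, empty, or unparsable values all yield the default) and the `parseBoolOrDefault` helper are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseBoolOrDefault is a hypothetical pure helper: it returns def when the
// variable is unset, empty, or not a valid boolean literal.
func parseBoolOrDefault(raw string, isSet bool, def bool) bool {
	if !isSet || raw == "" {
		return def
	}
	parsed, err := strconv.ParseBool(raw)
	if err != nil {
		return def
	}
	return parsed
}

// getEnvBoolOrDefault mirrors the name used in the diff, but this body is an
// assumption about its behavior, not the repository's code.
func getEnvBoolOrDefault(key string, def bool) bool {
	raw, isSet := os.LookupEnv(key)
	return parseBoolOrDefault(raw, isSet, def)
}

func main() {
	os.Unsetenv("SKIP_DUPLICATE_CHECK")
	fmt.Println(getEnvBoolOrDefault("SKIP_DUPLICATE_CHECK", true)) // true: new default

	os.Setenv("SKIP_DUPLICATE_CHECK", "false")
	fmt.Println(getEnvBoolOrDefault("SKIP_DUPLICATE_CHECK", true)) // false: old submit-time check
}
```

Under these assumptions, operators who need the previous submit-time rejection only have to set `SKIP_DUPLICATE_CHECK=false` in the environment.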

jait91 · 2026-05-15T08:30:19Z

@@ -33,14 +30,10 @@ db.blocks.createIndex({ chainId: 1 });
 // SMT nodes collection
 db.createCollection('smt_nodes');
 db.smt_nodes.createIndex({ key: 1 }, { unique: true });
-db.smt_nodes.createIndex({ hash: 1 });
-db.smt_nodes.createIndex({ createdAt: -1 });

 // Block records collection
 db.createCollection('block_records');
 db.block_records.createIndex({ blockNumber: 1 }, { unique: true });


Added a comment about the dropped indexes to the PR description.

jait91 · 2026-05-15T08:31:18Z

+	result := snapshot.AddLeavesClassified(leaves)
+
+	addedCommitments := make([]*models.CertificationRequest, 0, len(result.AddedIndexes))
+	addedLeaves := make([]*smt.Leaf, 0, len(result.AddedIndexes))
+	for _, idx := range result.AddedIndexes {
+		addedCommitments = append(addedCommitments, commitments[idx])
+		addedLeaves = append(addedLeaves, leaves[idx])
+	}
+
+	dropped := make([]interfaces.CertificationRequestAck, 0, len(result.DuplicateIndexes)+len(result.Rejected))
+	for _, idx := range result.DuplicateIndexes {
+		dropped = append(dropped, interfaces.CertificationRequestAck{
+			StateID:  commitments[idx].StateID,
+			StreamID: commitments[idx].StreamID,
+		})
+	}
+	for _, rejected := range result.Rejected {
+		log.WithContext(ctx).Warn("Rejected commitment leaf",
+			"path", leaves[rejected.Index].Path.String(),
+			"error", rejected.Err.Error())
+		dropped = append(dropped, interfaces.CertificationRequestAck{
+			StateID:  commitments[rejected.Index].StateID,
+			StreamID: commitments[rejected.Index].StreamID,
+		})
+	}
+
+	return addedCommitments, addedLeaves, dropped


Intentional. Duplicate leaves are idempotent re-submissions and should not be treated as newly added commitments. ACKing/dropping them during collection avoids stale proof-pending entries and unnecessary Mongo writes; the original accepted commitment remains authoritative.
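
To make the added/duplicate/rejected split concrete, here is a simplified sketch of a classifying add. It uses a plain map instead of the SMT snapshot, and the rule it applies (same path and same value is a duplicate, same path with a different value is rejected) is an assumption; how `AddLeavesClassified` actually decides is not shown in this thread. All type and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// leaf is a simplified stand-in for the real SMT leaf type.
type leaf struct {
	Path  string
	Value string
}

type rejectedLeaf struct {
	Index int
	Err   error
}

// classifyResult mirrors the field names visible in the diff above.
type classifyResult struct {
	AddedIndexes     []int
	DuplicateIndexes []int
	Rejected         []rejectedLeaf
}

var errConflict = errors.New("path already holds a different value")

// classify inserts leaves into tree and reports, by input index, which ones
// were newly added, which were idempotent duplicates, and which were rejected.
func classify(tree map[string]string, leaves []leaf) classifyResult {
	var res classifyResult
	for i, l := range leaves {
		existing, found := tree[l.Path]
		switch {
		case !found:
			tree[l.Path] = l.Value
			res.AddedIndexes = append(res.AddedIndexes, i)
		case existing == l.Value:
			res.DuplicateIndexes = append(res.DuplicateIndexes, i)
		default:
			res.Rejected = append(res.Rejected, rejectedLeaf{Index: i, Err: errConflict})
		}
	}
	return res
}

func main() {
	tree := map[string]string{"a": "1"}
	res := classify(tree, []leaf{
		{Path: "a", Value: "1"}, // duplicate of an existing leaf
		{Path: "b", Value: "2"}, // new leaf
		{Path: "a", Value: "9"}, // conflicting value
	})
	fmt.Println(res.AddedIndexes, res.DuplicateIndexes, len(res.Rejected)) // [1] [0] 1
}
```

Only index 1 would flow on as an added commitment here; indexes 0 and 2 would be acknowledged and dropped, which matches the intent described above.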

jait91 · 2026-05-15T08:31:49Z

 networks:
  default:
+    name: aggregator-go_default
+    external: true


Added a comment to the compose file.

jait91 · 2026-05-15T08:33:33Z

      - ./data/genesis:/genesis
    healthcheck:
-      test: ["CMD", "nc", "-zv", "bft-root", "8000"]
+      test: ["CMD", "nc", "-zv", "bft-root", "8002"]


Verified. All affected compose files start bft-root with --rpc-server-address ...:8002, so the healthcheck is checking the configured RPC/readiness port rather than the libp2p transport port.
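
For reference, `nc -z host port` only checks that a TCP connection can be opened. A rough Go equivalent is sketched below; the `portOpen` and `probeLocalListener` helpers are hypothetical and are not part of this PR. The demo probes a throwaway local listener rather than `bft-root:8002`, which only resolves inside the compose network.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// portOpen reports whether a TCP connection to addr can be established
// within the timeout, which is roughly what `nc -z` checks.
func portOpen(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// probeLocalListener opens a temporary listener on loopback and probes it,
// so the sketch can be run without the compose stack.
func probeLocalListener() (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer ln.Close()
	return portOpen(ln.Addr().String(), time.Second), nil
}

func main() {
	ok, err := probeLocalListener()
	fmt.Println(ok, err)
	// Inside the compose network the target would be "bft-root:8002".
}
```

A probe like this only proves the port accepts connections, so pointing it at the RPC port rather than the libp2p port is what makes it a readiness signal for RPC clients.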

jait91 · 2026-05-15T08:35:06Z

+	for range workers {
+		wg.Go(func() {
+			for job := range jobs {
+				if err := ctx.Err(); err != nil {
+					setFirstErr(err)
+					continue
+				}
+				err := ignoreDuplicateInsertError(collection.InsertMany(ctx, docs[job.start:job.end], options.InsertMany().SetOrdered(false)))
+				setFirstErr(err)
+			}
+		})
+	}
+
+queue:
+	for start := 0; start < len(docs); start += opts.chunkSize {
+		if err := ctx.Err(); err != nil {
+			setFirstErr(err)
+			break
+		}
+		if getFirstErr() != nil {
+			break
+		}
+		select {
+		case jobs <- chunk{start: start, end: min(start+opts.chunkSize, len(docs))}:
+		case <-ctx.Done():
+			setFirstErr(ctx.Err())
+			break queue
+		}
+	}
+	close(jobs)
+	wg.Wait()
+
+	return getFirstErr()
+}


Documented. This helper is only used for idempotent finalization writes; partial chunk writes are safe because retry/recovery can replay them and duplicate-key errors are ignored.
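
The pattern in the diff above can be tried in isolation with the sketch below. It swaps the Mongo `InsertMany` call for an injected `insert` callback and uses `wg.Add`/`wg.Done` instead of `wg.Go`, so it builds on older Go toolchains; duplicate-key tolerance is left out. The names `insertChunked` and `demoInsert` are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

type chunk struct{ start, end int }

// insertChunked splits n documents into chunks of chunkSize, hands them to a
// fixed pool of workers, and returns the first error observed (if any).
func insertChunked(ctx context.Context, n, chunkSize, workers int,
	insert func(ctx context.Context, start, end int) error) error {
	if chunkSize <= 0 || workers <= 0 {
		return errors.New("chunkSize and workers must be positive")
	}
	var (
		mu       sync.Mutex
		firstErr error
		wg       sync.WaitGroup
	)
	setFirstErr := func(err error) {
		if err == nil {
			return
		}
		mu.Lock()
		defer mu.Unlock()
		if firstErr == nil {
			firstErr = err
		}
	}
	getFirstErr := func() error {
		mu.Lock()
		defer mu.Unlock()
		return firstErr
	}

	jobs := make(chan chunk)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobs {
				if err := ctx.Err(); err != nil {
					setFirstErr(err)
					continue // keep draining so the producer never blocks
				}
				setFirstErr(insert(ctx, job.start, job.end))
			}
		}()
	}

queue:
	for start := 0; start < n; start += chunkSize {
		if getFirstErr() != nil {
			break // stop queueing new chunks after the first failure
		}
		end := start + chunkSize
		if end > n {
			end = n
		}
		select {
		case jobs <- chunk{start: start, end: end}:
		case <-ctx.Done():
			setFirstErr(ctx.Err())
			break queue
		}
	}
	close(jobs)
	wg.Wait()
	return getFirstErr()
}

// demoInsert runs insertChunked with a fake insert that only counts documents.
func demoInsert(n, chunkSize, workers int) (int64, error) {
	var inserted atomic.Int64
	err := insertChunked(context.Background(), n, chunkSize, workers,
		func(_ context.Context, start, end int) error {
			inserted.Add(int64(end - start))
			return nil
		})
	return inserted.Load(), err
}

func main() {
	got, err := demoInsert(10, 3, 2)
	fmt.Println(got, err) // 10 <nil>
}
```

Note that chunks already in flight still complete after a failure, so a caller can see a partial write together with an error. That is acceptable only because, as stated above, the writes are idempotent and can be replayed.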

jait91 · 2026-05-15T08:39:08Z

+	if block, ok := as.roundManager.GetKnownNotReadyBlock(req.StateID); ok {
+		responseBlockNumber, err := proofBundleBlockNumber(as.config.Sharding.Mode, block)
+		if err != nil {
+			return nil, err
+		}
+		return emptyInclusionProofResponse(responseBlockNumber, block), nil


Intentional, and confirmed against the SDK behavior. Empty proof responses are treated as “proof not ready yet” and retried; the temporary block number in that response is not used for proof verification.
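
A minimal sketch of the in-memory short-circuit idea follows, assuming a map from state ID to the block number the commitment is expected to land in. The real `GetKnownNotReadyBlock` returns a block value and its cache layout is not shown here, so the `proofPendingCache` type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// proofPendingCache remembers state IDs that were accepted into a round but
// whose proofs are not yet available, so early polls can skip the database.
type proofPendingCache struct {
	mu      sync.RWMutex
	pending map[string]uint64 // state ID -> expected block number
}

func newProofPendingCache() *proofPendingCache {
	return &proofPendingCache{pending: make(map[string]uint64)}
}

// markPending records that stateID is part of a round that has not finalized yet.
func (c *proofPendingCache) markPending(stateID string, blockNumber uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.pending[stateID] = blockNumber
}

// knownNotReady reports whether stateID is known to be waiting for finalization.
func (c *proofPendingCache) knownNotReady(stateID string) (uint64, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	block, ok := c.pending[stateID]
	return block, ok
}

// markFinalized removes entries once their round is finalized, so later
// requests fall through to the normal storage-backed proof lookup.
func (c *proofPendingCache) markFinalized(stateIDs ...string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, id := range stateIDs {
		delete(c.pending, id)
	}
}

func main() {
	c := newProofPendingCache()
	c.markPending("state-1", 42)

	block, ok := c.knownNotReady("state-1")
	fmt.Println(block, ok) // 42 true: answer "not ready" without touching Mongo

	c.markFinalized("state-1")
	_, ok = c.knownNotReady("state-1")
	fmt.Println(ok) // false: fall through to the real proof lookup
}
```

Entries must be removed on finalization (and on any path that drops a commitment), otherwise a stale entry would keep answering "not ready" forever. That appears to be the concern behind the "avoid stale proof readiness cache" commit.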

MastaP

A few smaller comments on top of the existing reviews — each one is a small follow-up rather than a blocker. Skipping points already covered by Copilot's review (SKIP_DUPLICATE_CHECK default, index migration, GetKnownNotReadyBlock UC source).

b3y0urs3lf · 2026-05-21T10:54:00Z

Java SDK reproduction (state-transition-sdk-java)

Confirming the same behavioral difference the TS SDK reported, traced to this PR's change:

Default async v2 submit behavior by skipping finalized duplicate lookup.

Re-spend (double-spend) detection moves from the submit layer to the proof layer. Scenarios asserting submit-time STATE_ID_EXISTS pass on main and fail on this branch.

Behavior observed

main: Java suite ✅ passes
- finalized-dup lookup at submit: performed
- re-spend submit status: STATE_ID_EXISTS
- re-spend caught at: submit

this PR (perf/bft-shard-throughput): Java suite ❌ 14 scenarios fail
- finalized-dup lookup at submit: skipped
- re-spend submit status: SUCCESS
- re-spend caught at: inclusion proof (TRANSACTION_HASH_MISMATCH)

Reproduced on both the single-aggregator subscription deployment and the bft-shard (MSB, 2- and 16-shard) deployments built from this branch — it's the build, not the topology.
Double-spend safety is intact; the re-spend never yields a valid token, it's just rejected one layer later.

Status type: org.unicitylabs.sdk.api.CertificationStatus (returned by CertificationResponse.getStatus()). Java uses STATE_ID_EXISTS (not REQUEST_ID_EXISTS). Re-spends are built with a
fresh recipient predicate + random 32-byte stateMask (renamed from nonce in SDK PR #61) → distinct stateId → submit returns SUCCESS on this branch.

Affected scenarios (14)

- token-4level-owner-actions.feature: Scenario Outline “Double-spend detected when <user> reuses pre-transfer token <token>”, all 8 rows (T1a_pre … T4b_pre). Glue: TreeSteps.theAggregatorRespondsWith.
- token-transfer-edge-cases.feature: “Stale token object cannot be reused after transfer” (1). Glue: TokenLifecycleSteps.userTriesToSubmitATransferOfTheStaleTokenTo.
- token-split-transfer.feature: “Original token cannot be used after split burn”, “Double-spend of a split token is prevented”, “Double-spend after multi-level split is prevented”, “Cannot spend a token after it has been split” (4).
- token-split-advanced.feature: “Cannot transfer original token after split” (1). Glue for split files: SplitSteps (:355, :763, :970, :989).

Java fails 14 vs the TS suite's 9 because the Java suite adds split-path re-spend scenarios. Same model mismatch, broader surface.

Expected vs actual

Expected (matches main):
Submit a NEW transfer spending an already-finalized state
→ CertificationStatus = STATE_ID_EXISTS (rejected at submit)

Actual (this branch):
Submit a NEW transfer spending an already-finalized state
→ CertificationStatus = SUCCESS (accepted at submit; dup lookup skipped)
→ request inclusion proof for that transfer
→ proof for the state carries the FIRST committed tx's hash
→ verification fails with TRANSACTION_HASH_MISMATCH (rejected at proof)

Assertion failure (TreeSteps.theAggregatorRespondsWith):
Then the aggregator responds with "STATE_ID_EXISTS"
org.opentest4j.AssertionFailedError: expected: <STATE_ID_EXISTS> but was: <SUCCESS>

For reference, double-spend-prevention.feature (both submits SUCCESS, second proof rejects with TRANSACTION_HASH_MISMATCH) passes on both builds — its glue already encodes the proof-time
model.

Run config

./gradlew bddTest --rerun-tasks
env: AGGREGATOR_URL=<aggregator/proxy endpoint>
AGGREGATOR_API_KEY= # subscription deployment only
TRUST_BASE_PATH=

i.e. On branch bdd-phase-0 state-transition-sdk-java$ AGGREGATOR_URL=http://localhost:8080 AGGREGATOR_API_KEY=sk_d04b15de50ad485b925a48500d01aab2 TRUST_BASE_PATH=/home/dmytro/Documents/Unicity/state-transition-sdk/tests/e2e/trust-base.json ./gradlew bddTest --rerun-tasks -Dcucumber.execution.parallel.enabled=true -Dcucumber.execution.parallel.config.strategy=fixed -Dcucumber.execution.parallel.config.fixed.parallelism=4 -Dcucumber.filter.tags="not @slow and not @wip and not @ignore and not @bft-shard-only and not @multi-shard-only and not @pending-src-cleanup and not @stateful and not @fresh-aggregator and not @stress"

Deterministic; independent of -Dcucumber.execution.parallel.enabled (each scenario uses its own random tokenId → no cross-scenario stateId collision). Not a transport error, not a
warm-vs-fresh-aggregator boundary case.

Questions (same as TS, both SDKs want one answer)

1. Is skip-finalized-dup-lookup intended as the default once this lands, or config-gated (e.g. a skipFinalizedDuplicateLookup / async-v2 toggle in internal/config)? If gated, both SDK suites can branch on the mode.
2. Which contract should SDK suites standardize on: submit-time STATE_ID_EXISTS, or SUCCESS + proof-time TRANSACTION_HASH_MISMATCH? Java SDK's vote: the latter. It's the only assertion that holds on every build/topology, and it strengthens coverage (the strict submit-status assertions never exercised the proof layer).

We'll hold the Java test change until Q1 is confirmed.

TS SDK reproduction (state-transition-sdk)

Confirming the same behavioral difference, traced to this PR's change:

Default async v2 submit behavior by skipping finalized duplicate lookup.

Re-spend (double-spend) detection moves from the submit layer to the proof layer. Scenarios asserting submit-time STATE_ID_EXISTS pass on main and fail on this branch.

Behavior observed

main: TS suite ✅ passes
- finalized-dup lookup at submit: performed
- re-spend submit status: STATE_ID_EXISTS
- re-spend caught at: submit

this PR (perf/bft-shard-throughput): TS suite ❌ 9 scenarios fail
- finalized-dup lookup at submit: skipped
- re-spend submit status: SUCCESS
- re-spend caught at: inclusion proof (TRANSACTION_HASH_MISMATCH)

Reproduced on both a single-aggregator deployment and a 2-shard bft-shard (MSB) deployment built from this branch — it's the build, not the topology. Double-spend safety is intact; the
re-spend never yields a valid token, it's just rejected one layer later.

Status type: CertificationStatus (TS enum, src/api/CertificationResponse.ts), read from CertificationResponse.status. TS uses STATE_ID_EXISTS (not REQUEST_ID_EXISTS). Re-spends
are built with a fresh recipient predicate + random 32-byte stateMask (renamed from nonce in SDK PR #110/#112) → distinct stateId → submit returns SUCCESS on this branch.

Affected scenarios (9)

- token-4level-owner-actions.feature: Scenario Outline “Double-spend detected when <user> reuses pre-transfer token <token>”, all 8 rows (T1a_pre … T4b_pre). Glue: tree-owner-actions.steps.ts → the aggregator responds with "…".
- token-transfer-edge-cases.feature: “Stale token object cannot be reused after transfer” (1). Glue: transfer-edge-cases.steps.ts + minting.steps.ts → the certification response status is "…".

TS fails 9 vs Java's 14 because the TS split-path double-spend scenarios (token-split-transfer.feature, token-split-advanced.feature: “Original token cannot be used after split burn”,
“Double-spend of a split token is prevented”, “Cannot spend a token after it has been split”, etc.) assert an SDK-side failure — TransferTransaction.create/unlock rejects because the
source token is burned (transferError !== null) — not the aggregator submit status, so they don't touch the dup-lookup path and pass on both builds. Same model mismatch, narrower
submit-status surface.

Expected vs actual

Expected (matches main):
Submit a NEW transfer spending an already-finalized state
→ CertificationStatus = STATE_ID_EXISTS (rejected at submit)

Actual (this branch):
Submit a NEW transfer spending an already-finalized state
→ CertificationStatus = SUCCESS (accepted at submit; dup lookup skipped)
→ request inclusion proof for that transfer
→ proof for the state carries the FIRST committed tx's hash
→ verification fails with TRANSACTION_HASH_MISMATCH (rejected at proof)

Assertion failure (tree-owner-actions.steps.ts / minting.steps.ts):
Then the aggregator responds with "STATE_ID_EXISTS"
AssertionError [ERR_ASSERTION]: expected 'STATE_ID_EXISTS', actual 'SUCCESS'

For reference, double-spend-prevention.feature (both submits SUCCESS, second proof rejects with TRANSACTION_HASH_MISMATCH) passes on both builds — its glue already encodes the
proof-time model.

Run config On branch feature/test-infrastructure

NODE_OPTIONS='--import tsx/esm'
AGGREGATOR_URL=<aggregator/proxy endpoint>
AGGREGATOR_API_KEY= # subscription deployment only
TRUST_BASE_PATH=
./node_modules/.bin/cucumber-js --config ''
--import 'tests/bdd/functional/support/World.ts'
--import 'tests/bdd/functional/steps/**/.steps.ts'
--format summary --parallel 4
--tags 'not @shard-load and not @stateful and not @stress and not @bft-shard-only and not @fresh-aggregator'
'tests/bdd/functional/features/.feature'

Deterministic; independent of --parallel (each scenario uses its own random tokenId → no cross-scenario stateId collision). Not a transport error, not a warm-vs-fresh-aggregator boundary
case.

Questions (same as Java — both SDKs want one answer)

1. Is skip-finalized-dup-lookup intended as the default once this lands, or config-gated (e.g. a skipFinalizedDuplicateLookup / async-v2 toggle in internal/config)? If gated, both SDK suites can branch on the mode.
2. Which contract should SDK suites standardize on: submit-time STATE_ID_EXISTS, or SUCCESS + proof-time TRANSACTION_HASH_MISMATCH? TS SDK's vote: the latter. It's the only assertion that holds on every build/topology, and it strengthens coverage (the strict submit-status assertions never exercised the proof layer).

We'll hold the TS test change until Q1 is confirmed.

jait91 · 2026-05-21T12:00:00Z

@b3y0urs3lf
Confirmed: this is the intended async/no-finalized-duplicate-lookup behavior for this PR. Double-spend safety is still enforced, but the rejection moves from submit-time STATE_ID_EXISTS to proof-time TRANSACTION_HASH_MISMATCH.

I also checked both SDKs: this appears to affect BDD/e2e test expectations rather than core SDK handling logic. Created follow-up tasks to update the suites:

JS SDK: Update BDD tests for proof-time duplicate/re-spend rejection state-transition-sdk-js#118
Java SDK: Update BDD tests for proof-time duplicate/re-spend rejection state-transition-sdk-java#67
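
For suite authors updating expectations, the two contracts discussed above can be contrasted with a toy in-memory model. This is not the aggregator's code: finalization is instantaneous here, the `aggregator` type and `respend` helper are hypothetical, and the `OK` and `PROOF_NOT_READY` labels are placeholders. Only `SUCCESS`, `STATE_ID_EXISTS`, and `TRANSACTION_HASH_MISMATCH` come from this thread.

```go
package main

import "fmt"

// aggregator is a toy model: finalized maps a state ID to the hash of the
// first transaction committed for it.
type aggregator struct {
	skipDuplicateCheck bool
	finalized          map[string]string
}

// submit models the submit-time contract. With skipDuplicateCheck set, a
// re-spend of a finalized state is still accepted for processing.
func (a *aggregator) submit(stateID, txHash string) string {
	if _, exists := a.finalized[stateID]; exists {
		if a.skipDuplicateCheck {
			return "SUCCESS"
		}
		return "STATE_ID_EXISTS"
	}
	a.finalized[stateID] = txHash
	return "SUCCESS"
}

// verifyProof models the proof-time contract: the proof for a state carries
// the first committed transaction hash, so a re-spend fails verification.
func (a *aggregator) verifyProof(stateID, txHash string) string {
	committed, ok := a.finalized[stateID]
	switch {
	case !ok:
		return "PROOF_NOT_READY" // placeholder label
	case committed != txHash:
		return "TRANSACTION_HASH_MISMATCH"
	default:
		return "OK" // placeholder label
	}
}

// respend commits a first transaction, then tries to spend the same state again.
func respend(skipDuplicateCheck bool) (submitStatus, proofStatus string) {
	a := &aggregator{skipDuplicateCheck: skipDuplicateCheck, finalized: map[string]string{}}
	a.submit("state-1", "tx-first")
	submitStatus = a.submit("state-1", "tx-respend")
	proofStatus = a.verifyProof("state-1", "tx-respend")
	return submitStatus, proofStatus
}

func main() {
	s, p := respend(false)
	fmt.Println(s, p) // STATE_ID_EXISTS TRANSACTION_HASH_MISMATCH
	s, p = respend(true)
	fmt.Println(s, p) // SUCCESS TRANSACTION_HASH_MISMATCH
}
```

In this model the proof-time assertion holds in both modes while the submit-time status differs, which is the reason both SDK reports give for standardizing on the proof-time check.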

jait91 added 10 commits May 5, 2026 16:34

perf: improve round finalization metrics

3bfae70

perf: chunk Mongo finalization inserts

0d3e4a0

remove unused indexes

52f862a

perf: improve performance test gateway support

6e77bb5

perf: improve bft-shard proof readiness under load

b2edccd

update compose files

048f4d7

docs: add aggregator performance results

06632e4

chore: remove unused block record state lookup

f0da959

feat: add leader-only health check

929aea4

fix: avoid stale proof readiness cache

7e4a0d7

jait91 requested a review from MastaP May 11, 2026 14:59

gemini-code-assist Bot reviewed May 11, 2026

Comment thread internal/storage/mongodb/batch_insert.go

Comment thread cmd/performance-test/main.go Outdated

jait91 self-assigned this May 11, 2026

jait91 added this to Unicity May 11, 2026

jait91 moved this to In Dev in Unicity May 11, 2026

jait91 added 3 commits May 11, 2026 19:16

Merge remote-tracking branch 'origin/main' into perf/bft-shard-throug…

483c9f4

…hput # Conflicts: # internal/config/config_test.go

test: handle perf response marshal errors

84389bf

update docs

1ac68dc

MastaP requested a review from Copilot May 13, 2026 16:37

Copilot started reviewing on behalf of MastaP May 13, 2026 16:38

Copilot AI reviewed May 13, 2026

MastaP reviewed May 13, 2026

Comment thread internal/round/round_manager.go Outdated

Comment thread internal/storage/mongodb/aggregator_record.go

Comment thread internal/gateway/handlers_rest.go Outdated

Comment thread bft-sharding-compose.yml

PR fixes

b6bc68c

jait91 requested a review from MastaP May 15, 2026 08:39

MastaP approved these changes May 18, 2026

MastaP assigned b3y0urs3lf and unassigned jait91 May 18, 2026

MastaP moved this from In Dev to Test in Unicity May 18, 2026

b3y0urs3lf moved this from Test to Todo in Unicity May 21, 2026

This was referenced May 21, 2026

Update BDD tests for proof-time duplicate/re-spend rejection unicitynetwork/state-transition-sdk-java#67

Closed

Update BDD tests for proof-time duplicate/re-spend rejection unicitynetwork/state-transition-sdk-js#118

Closed

jait91 moved this from Todo to Test in Unicity May 21, 2026

b3y0urs3lf merged commit cb5cf20 into main May 21, 2026
2 checks passed

b3y0urs3lf deleted the perf/bft-shard-throughput branch May 21, 2026 12:05

github-project-automation Bot moved this from Test to Done in Unicity May 21, 2026

4 participants
