
test: MPC + stream component test #739

Draft: jakmeier wants to merge 7 commits into sig-net:develop from jakmeier:stream_component_test

Conversation

@jakmeier commented Apr 2, 2026

Create a MockStream that connects to the MPC fixture setup. This lets us test the indexer stream <-> MPC glue.

@volovyks left a comment:

Nice! test_channel_contention should be very useful in our work on backlog and stability.

Comment thread: chain-signatures/node/src/cli.rs (Outdated)
    crate::metrics::nodes::CONFIGURATION_DIGEST.set(digest);

-   let (sign_tx, sign_rx) = mpsc::channel(16384);
+   let (sign_tx, sign_rx) = mpsc::channel(if cfg!(test) { 1 } else { 16384 });
Contributor:

I guess this is more of an experiment, but we can make it configurable in tests.
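One way to make the capacity configurable, as suggested, is to thread it through as a parameter instead of branching on cfg!(test). A minimal sketch, assuming a hypothetical sign_channel helper and SIGN_CHANNEL_CAPACITY constant (illustrative names, not the PR's actual API), using a std bounded channel in place of tokio's:

```rust
use std::sync::mpsc;

/// Production default for the sign request channel (illustrative constant).
const SIGN_CHANNEL_CAPACITY: usize = 16384;

/// Tests can pass Some(1) to force contention; production passes None.
fn sign_channel(capacity: Option<usize>) -> (mpsc::SyncSender<u64>, mpsc::Receiver<u64>) {
    mpsc::sync_channel(capacity.unwrap_or(SIGN_CHANNEL_CAPACITY))
}

fn main() {
    // A test exercising the tiny-buffer configuration.
    let (tx, rx) = sign_channel(Some(1));
    tx.send(7).unwrap();
    assert_eq!(rx.recv().unwrap(), 7);
}
```

This keeps the production path free of cfg!(test) branches and lets each test pick the buffer size it needs.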

Contributor:

not related to this PR, but it definitely deserves a constant

jakmeier (Author):

yeah it's just an experiment that I will clean up before removing the draft label

guard.sign_requests(requests)
}

pub async fn rpc_actions(&self, actions: &[RpcAction]) {
Contributor:

It is not clear what we are doing with these actions. Are we queuing them? Processing them? Adding them to a specific block? I'm OK with somewhat longer names; that makes the code more readable IMO.

network[1].mock_streams[0].progress_block_height(1).await;
network[2].mock_streams[0].progress_block_height(1).await;

let timeout = Duration::from_secs(10);
Contributor:

nit: Such a pattern significantly slows down our tests; I would add a helper that actually waits for N events or specific events with a timeout.

jakmeier (Author):

Wait, doesn't this already do exactly what you ask?

let timeout = Duration::from_secs(10);
let actions = network.assert_actions(1, timeout).await;
    pub async fn assert_actions(
        &self,
        threshold_per_node: usize,
        timeout: Duration,
    ) -> HashSet<String> {
        let result = tokio::time::timeout(timeout, self.wait_for_actions(threshold_per_node)).await;
        if result.is_err() {
            self.print_actions().await;
        }
        result.expect("should produce enough signatures")
    }

Contributor:

Yes, but why do we have the timeout? What if it happens faster?

jakmeier (Author):

tokio::time::timeout returns early in that case.

wait_for_actions keeps polling at a 100 ms interval to check whether there are enough actions.
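The pattern being discussed can be sketched with std primitives in place of tokio (function names here are illustrative, not the PR's API): the timeout only bounds the wait, and the function returns as soon as the threshold is met.

```rust
use std::time::{Duration, Instant};

/// Poll `count` every 100 ms until it reaches `threshold`, giving up
/// after `timeout`. Returns early, well before the timeout, on success.
fn wait_for_count(count: impl Fn() -> usize, threshold: usize, timeout: Duration) -> Option<usize> {
    let start = Instant::now();
    loop {
        let n = count();
        if n >= threshold {
            return Some(n); // threshold met: no need to wait out the timeout
        }
        if start.elapsed() >= timeout {
            return None; // timed out without reaching the threshold
        }
        std::thread::sleep(Duration::from_millis(100)); // 100 ms poll interval
    }
}

fn main() {
    // A counter that reaches the threshold on the third poll.
    let calls = std::cell::Cell::new(0usize);
    let result = wait_for_count(|| { calls.set(calls.get() + 1); calls.get() }, 3, Duration::from_secs(5));
    assert_eq!(result, Some(3)); // returns after ~200 ms, not 5 s
}
```

This mirrors the assert_actions/wait_for_actions split: the outer timeout is an upper bound for slow CI runs, not a fixed sleep.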

@jakmeier commented Apr 9, 2026

I'm still experimenting here. Getting some interesting results.

  1. There is some luck involved in whether nodes can handle 50 signature requests arriving at once from the Solana stream. However, channel buffer sizes have no influence on that, so that is probably not the problem.
  2. Meanwhile, 50 requests seems to be a hard threshold for successful signatures. To get more (e.g. 51), I have to send far more requests. I think they then have better chances, with more requests running in parallel.
  3. I often observed 25, 50, or 75 successful signatures. I don't know why yet; it could be an artifact of the test setup.

I will take a closer look again next week.

@jakmeier:

Still investigating, but this is interesting.

[image: signature progress graph]

I see that 25 signatures get through in the first round. Then they wait for a timeout (3 s here, for faster iteration in the graph, but it is the same with longer timeouts). One more is accepted in the third round. This is exactly the same for all participants.

After the third round, no more signatures are produced. They always fail in the posit phase with "proposer timeout waiting for presignature, reorganizing". It could be they are simply out of presignatures. Still, it is interesting that they need timeouts to make progress.

Retry round alignment seems to work fine, too. There is only a short time window where different nodes are on different rounds.

@jakmeier force-pushed the stream_component_test branch from b24050f to 5fd34dc on April 16, 2026 13:25
Commits:
- Create a MockStream that connects to the MPC fixture setup. This lets us test the indexer stream <-> MPC glue.
- but not with channel capacity?
- otherwise it is always the same participant in round 0
- adding useful cases and checking if they run in ci
@jakmeier force-pushed the stream_component_test branch from 8cd5c90 to b883f27 on April 16, 2026 15:32
@jakmeier:

The issue from my previous comment was that I didn't use good entropy in the requests. Hence, at round 0, all requests (the first 25) went to participant 0, and after one timeout, in round 2, all requests went to participant 2.
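A toy illustration of the effect described above (an assumed scheme, not the node's actual proposer-selection algorithm): if every request carries the same entropy value, every request maps to the same participant in a given round, and they all shift together when the round advances.

```rust
/// Hypothetical proposer selection: mix request entropy with the retry
/// round, modulo the participant count.
fn proposer_for(entropy: u64, round: u64, participants: u64) -> u64 {
    entropy.wrapping_add(round) % participants
}

fn main() {
    // All 25 requests share entropy 0 -> all go to participant 0 in
    // round 0, and all shift together to participant 2 in round 2,
    // matching the behavior observed in the test.
    let round0: Vec<u64> = (0..25).map(|_| proposer_for(0, 0, 3)).collect();
    let round2: Vec<u64> = (0..25).map(|_| proposer_for(0, 2, 3)).collect();
    assert!(round0.iter().all(|&p| p == 0));
    assert!(round2.iter().all(|&p| p == 2));
}
```

With proper per-request entropy, the modulo spreads requests roughly evenly across participants in every round.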

That's fixed now, and it actually works rather stably, even when adding 1M requests all at once.

@jakmeier commented Apr 16, 2026

The new problem I was able to identify is with delayed requests. With a delay of 1.5 times the posit timeout between when nodes see incoming blocks, they don't get a single request signed. It looks like they fail to achieve round alignment.

[image: full delay]

Change that to only 0.5 times the posit timeout, and all signatures are generated as expected in round 0.

[image: half delay]

This is reproducible with the pushed test mpc_with_stream::test_channel_contention_multiple_blocks_at_once_delayed.

#[test(tokio::test(flavor = "multi_thread"))]
async fn test_channel_contention_multiple_blocks_at_once_delayed() {
// delay should be > ORGANIZE_POSIT_TIMEOUT
let delay = mpc_node::protocol::signature::organize_posit_timeout() * 3 / 2;
jakmeier (Author):

Changing this to less than a full posit delay makes the test pass. Hence we know we only run into the issue when observations are more than a posit timeout (20 s) apart.

Suggested change:
-let delay = mpc_node::protocol::signature::organize_posit_timeout() * 3 / 2;
+let delay = mpc_node::protocol::signature::organize_posit_timeout() / 2;
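The arithmetic behind the two delay variants can be checked directly with std Duration math (the 20 s value is hard-coded here for illustration; the test itself reads the timeout from mpc_node):

```rust
use std::time::Duration;

fn main() {
    let posit_timeout = Duration::from_secs(20);
    let failing_delay = posit_timeout * 3 / 2; // 30 s: exceeds the timeout, no signatures
    let passing_delay = posit_timeout / 2;     // 10 s: within the timeout, all signatures succeed
    assert_eq!(failing_delay, Duration::from_secs(30));
    assert_eq!(passing_delay, Duration::from_secs(10));
    assert!(failing_delay > posit_timeout && passing_delay < posit_timeout);
}
```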

@jakmeier:

Somehow test_sign_no_presignature_waste now fails, as only 72 out of 75 signatures are produced.

I only changed the entropy of the signature requests. This suggests that "unlucky" proposer assignment can cause long delays, too. I tried increasing the timeout and using different seeds.

But nothing works except the entropy currently on develop. That one assigned all requests to the same participant in any given round (round 0: proposer = Participant(0); round 1: proposer = Participant(1); round 2: proposer = Participant(2)). Apparently that's the only way we get no presignature waste 😕

@jakmeier:

test_sign_no_presignature_waste fails because we eagerly start the signature generation when the threshold is reached. A node may then become proposer for signatures that are already taken care of, and we waste presignatures.

Fix in #765
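The race described above can be modeled as a toy (assumptions only, not the node's code): signing starts eagerly once enough acks arrive, so a node that acts late can propose a request that is already in flight, consuming an extra presignature for it.

```rust
use std::collections::HashSet;

fn main() {
    let mut in_flight: HashSet<u64> = HashSet::new();
    let mut presignatures_used = 0u32;

    let request_id = 42u64;

    // Threshold reached -> first proposer starts eagerly.
    in_flight.insert(request_id);
    presignatures_used += 1;

    // A late node, unaware the request is in flight, proposes it again
    // and burns a second presignature on the same request.
    let duplicate = !in_flight.insert(request_id); // insert returns false if already present
    if duplicate {
        presignatures_used += 1;
    }

    // Two presignatures spent for one request: the waste the test catches.
    assert_eq!(presignatures_used, 2);
}
```

The fix direction would be coordinating proposer assignment so a request is claimed at most once per round, rather than starting on a local threshold alone.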
