Skip to content

Conversation

@joostjager
Copy link
Contributor

@joostjager joostjager commented Oct 15, 2025

This PR adds a new async payment benchmark test to integration_tests_rust.rs. The test sets up two nodes with a channel and measures performance while sending 1,000 concurrent payments. It uses Tokio’s multi-threaded runtime to spawn payments as separate async tasks and tracks completion through LDK events. The benchmark provides a simple way to gauge throughput and identify performance bottlenecks in payment handling.

Related: lightningdevkit/rust-lightning#4161

@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Oct 15, 2025

👋 I see @tnull was un-assigned.
If you'd like another reviewer assignment, please click here.

@joostjager
Copy link
Contributor Author

Converted test to criterion benchmark and added it to CI. Also removed the payment throttling, now that it has been established that there is a bug in lightning-net-tokio that causes this. For the benchmark, the 1000 payments can now simply be launched at the same time.

Criterion reports outliers, so this should be able to detect to occassional 30 sec hangs in the payments test.

@joostjager joostjager marked this pull request as ready for review October 21, 2025 13:04
@joostjager joostjager requested a review from tnull October 21, 2025 13:04
@ldk-reviews-bot
Copy link

🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link

🔔 2nd Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link

🔔 3rd Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@joostjager joostjager removed the request for review from tnull October 28, 2025 09:33
To enable more realistic testing with sqlite as a backend.
Introduces a criterion-based benchmark that sends 1000 concurrent
payments between two LDK nodes to measure total duration. Also adds a
CI job to automatically run the benchmark.
@joostjager
Copy link
Contributor Author

Rebased onto async test framework changes.

@tnull
Copy link
Collaborator

tnull commented Oct 28, 2025

Mind rebasing once more to fix CI?

Copy link
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The test store changes don't make sense to me, I believe you intended something else there. Otherwise some comments.

let node = builder.build_with_store(test_sync_store).unwrap();
let kv_store: Arc<DynStore> = match config.store_type {
TestStoreType::InMemory => {
Arc::new(TestSyncStore::new(config.node_config.storage_dir_path.into()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't do what you think it does. This is TestSyncStore, a store that writes to multiple backends to ensure they are kept in-sync, not the in-memory TestStore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed that indeed. Renamed InMemory to TestSyncStore. The intention of the commit is to allow a test/benchmark to run with sqlite (only).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.


// Spawn each payment as a separate async task
task::spawn(async move {
println!("{}: Starting payment", payment_hash.0.as_hex());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we positive all the printing IO doesn't impact the performance? Should we omit this for 'production' benchmarking?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did wonder the same. Maybe I should indeed remove it to be sure. Would that be a cfg flag?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be, but do we really need them in the first place?

// Pre-check the HTLC slots to try to avoid the performance impact of a failed payment.
while node_a.list_channels()[0].next_outbound_htlc_limit_msat == 0 {
println!("{}: Waiting for HTLC slots to free up", payment_hash.0.as_hex());
tokio::time::sleep(std::time::Duration::from_millis(100)).await;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you determine 100ms here? Do we have an intuition if this could starve any of the waiting payers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't great indeed. Basically something that looked 'reasonable' to me. I think it is a weakness in the API that a user even needs to do this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

Arc::new(TestSyncStore::new(config.node_config.storage_dir_path.into()))
},
TestStoreType::Sqlite => Arc::new(
SqliteStore::new(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just use the default build() here, which defaults to SQLite.

If you prefer to have it explicit, feel free to rename build to build_with_sqlite_store and have a new build method on Builder call it. Then you can explicitly set build_with_sqlite_store here.

// Pre-check the HTLC slots to try to avoid the performance impact of a failed payment.
while node_a.list_channels()[0].next_outbound_htlc_limit_msat == 0 {
println!("{}: Waiting for HTLC slots to free up", payment_hash.0.as_hex());
tokio::time::sleep(std::time::Duration::from_millis(100)).await;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.


vss-client = "0.3"
prost = { version = "0.11.6", default-features = false}
criterion = { version = "0.7.0", features = ["async_tokio"] }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not a dev-dependency?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants