Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: add store seed for stress test #621

Open
wants to merge 36 commits into
base: next
Choose a base branch
from

Conversation

TomasArrachea
Copy link
Collaborator

@TomasArrachea TomasArrachea commented Jan 15, 2025

This PR adds a binary program that seeds the store. The program creates new accounts, creates notes to consume assets on those accounts, builds the resulting blocks (using the BlockBuilder) and sends them to the store. Each block is built with 16 batches, where each batch contains 16 transactions.

The program also measures and prints the average insertion time of blocks into the store. For inserting 100k accounts, the resulting avg insertion time was ~1000 ms and finished in 7 minutes.

@TomasArrachea TomasArrachea force-pushed the tomasarrachea-stress-test branch 2 times, most recently from 6d62ab8 to 7a561c6 Compare January 15, 2025 15:19
@TomasArrachea TomasArrachea force-pushed the tomasarrachea-stress-test branch from 7a561c6 to a7d4130 Compare January 15, 2025 15:30
@TomasArrachea
Copy link
Collaborator Author

This current implementation is mimicking the default values of BlockBudget (with SERVER_MAX_BATCHES_PER_BLOCK = 4 and SERVER_MAX_TXS_PER_BATCH = 2) to build the transactions into the blocks. This means that to generate and store 1M accounts, 250k blocks are needed, which seems like a lot. Maybe using 16 tx per batch and 16 batches per block would make more sense? cc @bobbinth

@bobbinth
Copy link
Contributor

bobbinth commented Jan 15, 2025

This means that to generate and store 1M accounts, 250k blocks are needed, which seems like a lot. Maybe using 16 tx per batch and 16 batches per block would make more sense?

Yes, this is how I was thinking about it as well: 16 batches per block and 16 transactions per batch. Assuming we can insert ~5 blocks/second (just a guess, not sure how close this is to reality), we should be able to "fill" 1M blocks in about 15 mins.

To do this though, we can't go via the block producer (but should be fine if we build Block structs directly).

@TomasArrachea
Copy link
Collaborator Author

TomasArrachea commented Jan 15, 2025

This means that to generate and store 1M accounts, 250k blocks are needed, which seems like a lot. Maybe using 16 tx per batch and 16 batches per block would make more sense?

Yes, this is how I was thinking about it as well: 16 batches per block and 16 transactions per batch. Assuming we can insert ~5 blocks/second (just a guess, not sure how close this is to reality), we should be able to "fill" 1M blocks in about 15 mins.

To do this though, we can't go via the block producer (but should be fine if we build Block structs directly).

I just ran it locally with those settings (using the BlockBuilder) and got ~1.5 blocks/second, so storing the 7814 blocks would take about 85 mins. Will try to implement the Block building manually and check the difference.

Edit: actually inserting the blocks is much faster, main delay is grinding the accounts

@bobbinth
Copy link
Contributor

Edit: actually inserting the blocks is much faster, main delay is grinding the accounts

Are you using the latest next branch (which relies on the latest next from miden-base). Account grinding should be pretty minimal there.

@bobbinth
Copy link
Contributor

Are you using the latest next branch (which relies on the latest next from miden-base). Account grinding should be pretty minimal there.

btw, if account creation is a bottleneck with the next branch as well, we can always parallelize it (e.g., using rayon) to get a decent speed up. not sure if possible, but would be great to get to the point where insertions into the store are the bottleneck rather than data generation.

@TomasArrachea
Copy link
Collaborator Author

TomasArrachea commented Jan 16, 2025

Yes, even with latest version of miden-base, most of the binary runtime is spend on account grinding:
image

Using rayon sound good!

Edit: implemented this and got 40s to load all 1M accounts (storing 7814 blocks).

@TomasArrachea TomasArrachea force-pushed the tomasarrachea-stress-test branch from c303955 to 4395203 Compare January 16, 2025 20:54
@TomasArrachea
Copy link
Collaborator Author

Just pushed an PoC of a simple test: send a sync request to the loaded store and print the elapsed time. We were discussing with @igamigo about how much we would want this stress test to be configurable. We could load a config json file, specifying the db dump file (created with the binary) and the requests to send and measure. How were you thinking about it @bobbinth ?

@igamigo
Copy link
Collaborator

igamigo commented Jan 17, 2025

I think a good first approach could be to make a simple CLI for generating snapshots and then executing a benchmark against a snapshot.

@bobbinth
Copy link
Contributor

bobbinth commented Jan 17, 2025

implemented this and got 40s to load all 1M accounts (storing 7814 blocks).

Is 40 second correct here? it sounds unbelievably fast (about 200 blocks/second, right?).

Just pushed an PoC of a simple test: send a sync request to the loaded store and print the elapsed time. We were discussing with @igamigo about how much we would want this stress test to be configurable. We could load a config json file, specifying the db dump file (created with the binary) and the requests to send and measure. How were you thinking about it @bobbinth ?

I think as the first step, would be good to be able to set the following:

  • Number of accounts (or blocks) to be created - e.g., we may want to run this at some point with 1M, 10M, 100M etc. accounts.
  • Location of the DB dump file so that we could pick it up for further stress testing.

In the future, we could add more config options (e.g., number of transactions per block, mix of public vs. private accounts and notes) - but that would be in follow-up PRs.

In terms of metrics, let's also start simple - I'm fine with tracking just block insertion times for now. We can also expand this in the future PRs.

@TomasArrachea
Copy link
Collaborator Author

Is 40 second correct here? it sounds unbelievably fast (about 200 blocks/second, right?).

Sorry, I don't think that number was correct. I recorded measurements again and got:

  • 7814 block insertions in 700 seconds (about 10 blocks/second)
  • avg insertion time (block building and storing): 26 ms

@TomasArrachea TomasArrachea force-pushed the tomasarrachea-stress-test branch 5 times, most recently from 16e6098 to f435e68 Compare January 17, 2025 21:55
@TomasArrachea TomasArrachea force-pushed the tomasarrachea-stress-test branch from f435e68 to 5c39a4c Compare January 17, 2025 22:07
@TomasArrachea TomasArrachea marked this pull request as ready for review January 17, 2025 22:09
Comment on lines +62 to +65
/// consumes a note created from a faucet. The cli accepts the following parameters:
/// - `dump_file`: Path to the store database file.
/// - `num_accounts`: Number of accounts to create.
/// - `genesis_file`: Path to the genesis file of the store.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think these could also be doc comments in the SeedStore command as well to make Clap show help accordingly

new_faucet.id()
}

/// Create a new note with a single fungible asset.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
/// Create a new note with a single fungible asset.
/// Create a new note containing 10 tokens of the fungible asset associated with the specified `faucet_id`.


/// Create a new account with a given public key and anchor block. Generates the seed from the given
/// index.
fn create_account(anchor_block: &BlockHeader, public_key: PublicKey, index: usize) -> AccountId {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of usize, I think this could be u64 directly, unless there's something I'm missing

}

/// Seed the store with a given number of accounts.
async fn seed_store(dump_file: &Path, num_accounts: usize, genesis_file: &Path) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this could be higher in the file to make the code a bit easier to follow

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! done in e545d92

Makefile Outdated
@@ -73,6 +73,10 @@ install-node: ## Installs node
install-faucet: ## Installs faucet
${BUILD_PROTO} cargo install --path bin/faucet --locked

.PHONY: install-stress-test
install-stress-test: ## Installs faucet
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
install-stress-test: ## Installs faucet
install-stress-test: ## Installs stress test binary

Makefile Outdated
@@ -73,6 +73,10 @@ install-node: ## Installs node
install-faucet: ## Installs faucet
${BUILD_PROTO} cargo install --path bin/faucet --locked

.PHONY: install-stress-test
install-stress-test: ## Installs faucet
${BUILD_PROTO} cargo install --path bin/stress-test --locked
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the crate doesn't have a build.rs I'm not sure that we need ${BUILD_PROTO} here.

Comment on lines 35 to 41
miden-air = { workspace = true }
winterfell = { version = "0.11" }
miden-objects = { workspace = true }
rand_chacha = { version = "0.3", default-features = false }
miden-lib = { workspace = true, features = ["testing"] }
miden-node-test-macro = { path = "../test-macro" }
miden-objects = { workspace = true, features = ["testing"] }
miden-tx = { workspace = true, features = ["testing"] }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should alphabetize the deps.

Also, we are using "testing" for some dependencies, maybe we need to use it only in [dev-dependencies]?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! didn't realize this was no longer needed

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Looks good. I left some comments inline - but a more general comment is that I find the current structure a bit difficult to follow. Would a more "linear" approach not work? For example, something like this:

  1. Create a genesis block with the faucet account.
  2. Create two sets of accounts $s_0$ and $s_1$, initially empty.
  3. Create two sets of notes $p_0$ and $p_1$, initially empty.
  4. Start a loop to create $n$ blocks (the loop would have $n$ iterations). In this loop:
    a. Create accounts to populate $s_0$ but do not record them on chain.
    b. Create a transaction that outputs a set of notes $p_0$ directed from the faucet to the accounts in $s_0$.
    c. Create transactions to consume notes in $p_1$ against accounts in $s_1$ (this will be a noop on the first iteration).
    d. Build a block from the above transactions and insert it into the store.
    e. Assing $s_1 \leftarrow s_0$ and $p_1 \leftarrow p_0$.

@@ -0,0 +1,26 @@
[package]
name = "miden-node-stress-test"
description = "Miden stress test binary"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would write the description as "A binary for running stress tests against the Miden node".

## Crate Features

- `tracing-forest`: sets logging using tracing-forest. Disabled by default.
- `testing`: includes testing util functions to mock block-producer behaviour, meant to be used during development and not on production. Disabled by default.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "not on production" -> "not in production".

Comment on lines 22 to 41
async-trait = { version = "0.1" }
itertools = { version = "0.13" }
miden-air = { workspace = true }
miden-lib = { workspace = true }
miden-node-proto = { workspace = true }
miden-node-test-macro = { path = "../test-macro" }
miden-node-utils = { workspace = true }
miden-objects = { workspace = true }
miden-processor = { workspace = true }
miden-stdlib = { workspace = true }
miden-tx = { workspace = true }
rand = { version = "0.8" }
rand_chacha = { version = "0.3", default-features = false }
serde = { version = "1.0", features = ["derive"] }
thiserror = { workspace = true }
tokio = { workspace = true, features = ["rt-multi-thread", "net", "macros", "sync", "time"] }
tokio-stream = { workspace = true, features = ["net"] }
tonic = { workspace = true }
tracing = { workspace = true }
winterfell = { version = "0.11" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding miden-node-test-macro as a dependency would make it impossible to publish this crates on crates.io (since all dependencies must be published).

Also, if we do need to move some dependencies from dev-dependencies here, we should make them optional and enabled only when the testing feature is enabled.

Comment on lines +165 to +167
/// Create a new note containing 10 tokens of the fungible asset associated with the specified
/// `faucet_id`.
fn create_note(faucet_id: AccountId) -> Note {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: are we creating a P2ID notes here (I think we should be)? If so, do we not need to specify the note recipient?

Comment on lines +224 to +230
/// Create a given number of notes and group them into transactions and batches.
/// The batches are sent to the block builder.
fn generate_note_batches(
num_notes: usize,
faucet_id: AccountId,
batch_sender: UnboundedSender<TransactionBatch>,
) -> Vec<Note> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully understood how this works. Wouldn't we need to create just 1 transaction per block to create notes? That is, this transaction would be executed against the faucet account and would output 256 notes directed to 256 accounts that were created in the previous block. Or does this cause issues somehow?

Comment on lines 135 to 139
println!(
"Consumed notes: inserted {} blocks with avg insertion time {} ms",
num_insertions,
(insertion_time / num_insertions).as_millis()
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be cool to print out average block insertion time per some number of blocks (i.e., for every 1000 blocks). This way, we'd see how the insertion time changes the the number of blocks in the store increases.

Also, would be cool to print out the store database file size, ideally for each block interval as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants