Skip to content

Conversation

@ArielElp
Copy link

No description provided.

@reviewable-StarkWare
Copy link

This change is Reviewable

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dorimedini-starkware reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @nimrod-starkware)


crates/starknet_committer_cli/README line 46 at r1 (raw file):

2. Run the committer_benchmark_plotting script (check the different output options of the script).

python scripts/committer/committer_benchmark_combine_csv.py <LOCAL_CSV_DIR> -o <OUTPUT_CSV>

thanks!

Code quote:

committer_benchmark_combine_csv

crates/starknet_patricia_storage/Cargo.toml line 17 at r1 (raw file):

[dependencies]
hex.workspace = true
rust-rocksdb.workspace = true

alphabetize

Code quote:

rust-rocksdb.workspace = true

crates/starknet_patricia_storage/Cargo.toml line 25 at r1 (raw file):

starknet-types-core.workspace = true
starknet_api.workspace = true
thiserror.workspace = true

another side quest for you:
I think different storage solutions should be behind different features, so we don't always have to compile mdbx+rocksdb when using patricia storage.

  1. define features for short key storage (not merged yet?), mdbx, rocksdb (and maybe map storage...? though that seems overkill, no extra deps with map storage)
  2. find out which dependencies are relevant for which feature. for example, libmdbx and page_size are only required in mdbx, and blake2 is only required for short key storage (not merged yet, or you are not rebased). these dependencies should be marked as optional, and in the feature definition they should be activated (we have examples of this in our code base)
  3. the starknet_committer_cli crate should activate all these features (duh)
  4. feature-gate the relevant modules

Code quote:

[features]
testing = []

[lints]
workspace = true

[dependencies]
hex.workspace = true
rust-rocksdb.workspace = true
libmdbx.workspace = true
lru.workspace = true
page_size.workspace = true
serde = { workspace = true, features = ["derive"] }
serde_json.workspace = true
starknet-types-core.workspace = true
starknet_api.workspace = true
thiserror.workspace = true

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 7 at r1 (raw file):

use crate::storage_trait::{DbHashMap, DbKey, DbValue, PatriciaStorageResult, Storage};

pub struct RocksdbStorage {

this is the convension we use

Suggestion:

RocksDbStorage {

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 12 at r1 (raw file):

impl RocksdbStorage {
    pub fn open(path: &Path) -> PatriciaStorageResult<Self> {

I don't see where this function can fail, why return a result?

Suggestion:

pub fn open(path: &Path) -> Self

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 13 at r1 (raw file):

impl RocksdbStorage {
    pub fn open(path: &Path) -> PatriciaStorageResult<Self> {
        use rust_rocksdb::Options;

move to top of module

Code quote:

use rust_rocksdb::Options;

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 45 at r1 (raw file):

        let db = DB::open(&opts, path).unwrap();
        Ok(Self { db })
    }

I am thinking we would benefit from defining our own config struct.

  1. there are many magic numbers here, I prefer we have control on a high level
  2. give default values a name
  3. comments should start with uppercase and end in punctuation
const DEFAULT_BYTES_PER_SYNC: usize = 1 << 20; // 1MB.
..

struct RocksDbOptions {
    // Short comment explaining what this is.
    pub bytes_per_sync: usize,
    ..
}

impl Default for RocksDbOptions {
    fn default() -> Self {
        Self {
            bytes_per_sync: DEFAULT_BYTES_PER_SYNC,
            ..
        }
    }
}

impl RocksDbStorage {
    fn new(path: &Path, options: &RocksDbOptions) -> Self { .. }
}

Code quote:

    pub fn open(path: &Path) -> PatriciaStorageResult<Self> {
        use rust_rocksdb::Options;
        let mut opts = Options::default();
        opts.create_if_missing(true);

        opts.set_bytes_per_sync(1024 * 1024);
        // collect more data before writing to disk
        opts.set_write_buffer_size(128 * 1024 * 1024);
        opts.increase_parallelism(8);
        opts.set_max_background_jobs(8);
        // Add more memory buffers over the default of 2
        opts.set_max_write_buffer_number(4);

        let mut block = BlockBasedOptions::default();
        let cache = Cache::new_lru_cache(2 * 1024 * 1024 * 1024); // 2 GiB
        block.set_block_cache(&cache);

        // With a single level filter blocks become too large to sit in cache
        block.set_index_type(BlockBasedIndexType::TwoLevelIndexSearch);
        block.set_partition_filters(true);

        // default block size, consider increasing
        block.set_block_size(4 * 1024);
        block.set_cache_index_and_filter_blocks(true);
        // make sure filter blocks are cached
        block.set_pin_l0_filter_and_index_blocks_in_cache(true);

        block.set_bloom_filter(10.0, false);

        opts.set_block_based_table_factory(&block);

        let db = DB::open(&opts, path).unwrap();
        Ok(Self { db })
    }

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 51 at r1 (raw file):

    fn get(&mut self, key: &DbKey) -> PatriciaStorageResult<Option<DbValue>> {
        let value = self.db.get(&key.0)?;
        Ok(value.map(DbValue))

or, if error conversion is required, do this:

Ok(self.db.get(&key.0).map(DbValue)?)

Suggestion:

        self.db.get(&key.0).map(DbValue)

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 57 at r1 (raw file):

        &mut self,
        key: DbKey,
        value: crate::storage_trait::DbValue,

Suggestion:

DbValue,

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 61 at r1 (raw file):

        let prev_val = self.db.get(&key.0)?;
        self.db.put(&key.0, &value.0)?;
        Ok(prev_val.map(DbValue))

hmm.. shame that we need an extra read to comply with the API..
sending you on a side quest here:
my guess is that we rarely, possibly never, use the returned value of set. see if you can (in a separate PR) change the API in the Storage trait to refrain from returning a value at all.

  1. if we don't ever use the return value - this will be relatively easy, although still a broad change.
  2. if we do sometimes use the returned value (not in tests! in business logic), add another API for get_prev_and_set in the Storage trait. I guess some concrete impls return the value on set by default, so that's easy, and in RocksDb-based storage you can use get and then set as you do here.

Code quote:

        let prev_val = self.db.get(&key.0)?;
        self.db.put(&key.0, &value.0)?;
        Ok(prev_val.map(DbValue))

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 68 at r1 (raw file):

        for key in keys {
            res.push(self.db.get(&key.0)?.map(DbValue));
        }

you sure rocks DB doesn't support multi-get in a single operation?

Code quote:

        for key in keys {
            res.push(self.db.get(&key.0)?.map(DbValue));
        }

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 77 at r1 (raw file):

        write_options.set_sync(false);
        // no write ahead log at all
        write_options.disable_wal(true);

see above: these should be controlled by the input RocksDbOptions.
you can add extra fields to the RocksDbStorage struct to support this

Code quote:

        // disable fsync after each write
        write_options.set_sync(false);
        // no write ahead log at all
        write_options.disable_wal(true);

crates/starknet_patricia_storage/src/rocksdb_storage.rs line 93 at r1 (raw file):

        }
        Ok(prev_val.map(DbValue))
    }

see above - shame to get the value before deleting if we don't actually use the functionality.
same side quest request :)

Code quote:

    fn delete(&mut self, key: &DbKey) -> PatriciaStorageResult<Option<DbValue>> {
        let prev_val = self.db.get(&key.0)?;
        if prev_val.is_some() {
            self.db.delete(&key.0)?;
        }
        Ok(prev_val.map(DbValue))
    }

Cargo.toml line 278 at r1 (raw file):

lazy_static = "1.5.0"
libmdbx = "0.3.5"
rust-rocksdb = "0.44.1"

we alphabetize the toml, you shouldn't be passing CI ATM

Code quote:

rust-rocksdb = "0.44.1"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants