feat: SmtStorageReader and SparseMerkleTreeReader traits#967
Conversation
iamrecursion
left a comment
There was a problem hiding this comment.
This seems reasonable at first glance, but I am concerned about the code that has been manually inlined from the SparseMerkleTree trait.
Also, what is your plan with regards to the fact that this PR is targeted for main? These changes should also be included on next in my view.
Will create a PR into next after this is merged. I'm working off of v0.23.0 because that is what miden-node / protocol are currently using. |
iamrecursion
left a comment
There was a problem hiding this comment.
This looks mostly good to me. I've left a few comments inline and I'd like to take another look before merge.
Souds good to me. |
huitseeker
left a comment
There was a problem hiding this comment.
re: #967 (comment)
I have ported the RocksDB changes to the node.
I opened #980 as one way of addressing the issues raised in threads here.
huitseeker
left a comment
There was a problem hiding this comment.
Checkpointing my partial re-review. Overall LGTM except for the line comment, will come back a bit later to stamp.
| } | ||
| } | ||
|
|
||
| Ok(MemoryStorage { leaves, subtrees }.into_snapshot()) |
There was a problem hiding this comment.
Come to think of it: could this snapshot keep the depth-24 cache too?
LargeSmt::load will accept this SmtStorageSnapshot, and because it is backed by MemoryStorage, get_depth24() returns an empty vec even when the snapshot has leaves/subtrees — so a non-empty snapshot panics in debug in initialize_from_storage and reconstructs the wrong top in release.
There was a problem hiding this comment.
I opted to add an implementation to get_depth24() for MemoryStorage and added a test. Let me know if you would have approached it differently.
There was a problem hiding this comment.
LGTM with one caveat, @sergerad : this looks like a breaking public API change. SmtStorage is re-exported from miden_crypto::merkle::smt, and every downstream impl SmtStorage now needs a new SmtStorageReader impl, type Reader, and reader().
I'd at least call that out as breaking in the PR description and changelog, even if we do expect external backends to adapt quickly. If only to help with versioning.
Ended up pointing to next branch because this PR has made breaking changes. |
bobbinth
left a comment
There was a problem hiding this comment.
Looks good! Thank you! I left some comments/questions inline.
| /// Type alias for boxed storage with boxed reader | ||
| pub type BoxedSmtStorage = Box<dyn SmtStorage<Reader = Box<dyn SmtStorageReader>>>; |
There was a problem hiding this comment.
If these are for benchmarks only, should we put them behind the testing feature? (not sure if that will work - but would be good to check)
| /// Implementations may return either a point-in-time snapshot or a live view. | ||
| pub trait SmtStorageReader: 'static + fmt::Debug + Send + Sync { |
There was a problem hiding this comment.
Related to the previous comment: I think returning point-in-time snapshots is fine - but can we really return readers with "live views"? At the storage level this is possible, but since we clone the in-memory portion of the SMT, the in-memory portion would become inconsistent with the storage portion as soon as the data in the storage changes.
Unless the above reasoning is incorrect, I think we should make it an explicit requirement that storage readers must always be against point-in-time snapshots.
| /// Implementations may return either a point-in-time snapshot or a live view. Either way, the | ||
| /// view must be of consistent / committed state (not partial). Holding the reader must not | ||
| /// block writes in any way. | ||
| fn reader(&self) -> Result<Self::Reader, StorageError>; |
There was a problem hiding this comment.
Same comment as above re "live view".
This is all already behind the executable feature #[cfg(feature = "executable")]
mod boxed_storage; |
bobbinth
left a comment
There was a problem hiding this comment.
Looks good! Thank you!
I didn't review RocksDbSnapshot and related structs in depth because I'm assuming this will serve more like a guiding implementation for the structs that will live in miden-node.
| pub struct RocksDbSnapshotStorage { | ||
| inner: Arc<RocksDbSnapshotInner>, | ||
| } |
There was a problem hiding this comment.
When we move RocksDB storage into miden-node completely, it would be good to add some integration tests to make sure this all works as expected.
Description
Extract a new
SmtStorageReadertrait fromSmtStorage, enablingLargeSmt<S>to work with read-only storage for read operations, and only requireSmtStoragefor mutations.Extract a new
SparseMerkleTreeReadertrait fromSparseMerkleTree, enablingLargeSmt<S>to implement read-only trait methods with only anS: SmtStorageReaderbound.Motivation
Previously,
LargeSmt<S: SmtStorage>required full read+write storage for all operations, including purely read-only ones likeget_leaf,get_value,open, and tree loading. This made it impossible to useLargeSmtwith a read-only storage backend.The
SparseMerkleTreetrait bundled read and write methods together, soimpl SparseMerkleTree for LargeSmt<S>requiredS: SmtStorage. Splitting the trait allowsLargeSmtto implement read-only methods with only theSmtStorageReaderbound.In miden-node (PR), we are working towards removing synchronization mechanisms by providing read-only types which can be cloned.
Changes
Storage trait hierarchy (
storage/mod.rs):SmtStorageReader—&selfmethods (leaf/subtree reads, iterators,get_depth24)SmtStorage: SmtStorageReader—&mut selfmethods (inserts, removes,apply)SMT trait hierarchy (
smt/mod.rs):SparseMerkleTreeReader— read-only methods (root,get_inner_node,get_value,get_leaf,open,get_path,compute_mutations, etc.)SparseMerkleTree: SparseMerkleTreeReader— write methods (set_root,insert_inner_node,remove_inner_node,insert_value,insert,apply_mutations, etc.)from_raw_partsremoved from the trait and inlined into the inherent methods onSmtandSimpleSmt, sinceLargeSmtcould never meaningfully implement it.LargeSmt<S>struct bound relaxed (large/mod.rs):S: SmtStoragetoS: SmtStorageReaderimpl<S: SmtStorageReader>— accessors (root,get_leaf,get_value,open,is_empty), iterators, construction from existing storage (new,load,load_with_root)impl<S: SmtStorage>—insert,with_entries, batch construction methodsLargeSmt<S>trait impls split (large/smt_trait.rs):impl<S: SmtStorageReader> SparseMerkleTreeReader for LargeSmt<S>— all read-only trait methodsimpl<S: SmtStorage> SparseMerkleTree for LargeSmt<S>— write methods onlyOther files updated:
SmtandSimpleSmt— trait impls split intoSparseMerkleTreeReader+SparseMerkleTreebatch_ops.rs— read-only qualified calls updated toSparseMerkleTreeReaderconstruction.rs— split into Reader (load from storage) and Storage (build with entries) blocksiter.rs— bounded bySmtStorageReadermemory.rs,rocksdb.rs— splitimpl SmtStorageintoimpl SmtStorageReader+impl SmtStoragesmt/mod.rs— re-exportsSmtStorageReaderandSparseMerkleTreeReaderBackwards compatibility
SmtStorageretains all original methods (read + write) via its supertraitSmtStorageReader, so existing code usingS: SmtStoragecontinues to compile unchanged.Both
SparseMerkleTreeandSparseMerkleTreeReaderarepub(crate), so the trait split has no public API impact.from_raw_partswas only ever called through the inherent methods onSmtandSimpleSmt, which retain the same signature.