Manifest Splitting #767

Draft: wants to merge 58 commits into main

Conversation

@dcherian (Contributor) commented on Feb 21, 2025:

  • does the config get serialized properly?
  • real-world benchmark; test with ERA5
  • add ndim based condition (3D vs 4D)

pub struct ManifestShards(Vec<ManifestExtents>);

impl ManifestShards {
    pub fn default(ndim: usize) -> Self {
dcherian (Contributor Author):

I don't like this, but it is certainly tied to ndim.

Collaborator:

Maybe ManifestSplits is an enum to avoid this?

enum ManifestSplits {
    Single,
    Multiple(Vec<ManifestExtents>),
}

What I don't like is the empty vector. I wonder if Rust has a NonEmptyVec type; otherwise, a trick people use is:

...
    Multiple { first: ManifestExtents, rest: Vec<ManifestExtents> }
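As a sketch, the non-empty `first`/`rest` variant could be consumed like this. ManifestExtents is stubbed here as a vector of per-axis ranges purely for illustration; the real type lives in this PR.

```rust
use std::ops::Range;

// `ManifestExtents` stubbed for illustration; the real type is in the PR.
#[derive(Clone, Debug, PartialEq)]
pub struct ManifestExtents(Vec<Range<u32>>);

pub enum ManifestSplits {
    Single,
    // The first/rest split makes "at least one extent" hold by construction,
    // so no empty vector can ever be observed.
    Multiple { first: ManifestExtents, rest: Vec<ManifestExtents> },
}

impl ManifestSplits {
    // Number of manifest shards.
    pub fn len(&self) -> usize {
        match self {
            ManifestSplits::Single => 1,
            ManifestSplits::Multiple { rest, .. } => 1 + rest.len(),
        }
    }
}
```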

dcherian (Contributor Author):

OK, I don't need the default any more. It was an artifact that appeared because I implemented the core logic before wiring up the config. Now the default gets set when parsing the config using the array metadata.

@@ -37,9 +33,77 @@ impl ManifestExtents {
        Self(v)
    }

    pub fn contains(&self, coord: &[u32]) -> bool {
        self.iter().zip(coord.iter()).all(|(range, that)| range.contains(that))
Collaborator:

We need to start checking on writes that indices have the proper size for the metadata.
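A minimal sketch of what such a write-time check could look like, with ManifestExtents again stubbed as a vector of per-axis ranges and a hypothetical checked_contains name (not the PR's actual API):

```rust
use std::ops::Range;

// Stubbed `ManifestExtents` for illustration.
pub struct ManifestExtents(Vec<Range<u32>>);

impl ManifestExtents {
    // Write-time guard: reject coordinates whose dimensionality does not
    // match the extents instead of silently zipping the shorter of the two.
    pub fn checked_contains(&self, coord: &[u32]) -> Result<bool, String> {
        if coord.len() != self.0.len() {
            return Err(format!(
                "coordinate has {} axes, expected {}",
                coord.len(),
                self.0.len()
            ));
        }
        Ok(self.0.iter().zip(coord).all(|(range, c)| range.contains(c)))
    }
}
```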

@dcherian force-pushed the split-manifests branch 2 times, most recently from e7d9221 to 09476a4 on March 6, 2025, 23:02
Comment on lines 1394 to 1400
for chunk in chunks {
    let shard_index = shards.which(&chunk.coord)?;
    sharded_refs
        .entry(shard_index)
        .or_insert_with(|| Vec::with_capacity(ref_capacity))
        .push(chunk);
}
dcherian (Contributor Author):

I am attempting to convert this method to accept an impl Stream<Item = SessionResult<ChunkInfo>>, but I don't see how to convert this group-by logic.
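One way to express the group-by over a fallible sequence is a try_fold that threads the accumulator through each step; the same shape ports to a stream via the futures crate's TryStreamExt::try_fold. All types below (Shards, ChunkInfo, the which rule) are simplified stand-ins invented for illustration, not the PR's actual API:

```rust
use std::collections::HashMap;

// Stand-ins for the PR's types, invented for illustration.
type SessionResult<T> = Result<T, String>;

#[derive(Debug)]
pub struct ChunkInfo {
    pub coord: Vec<u32>,
}

pub struct Shards {
    pub shard_size: u32,
}

impl Shards {
    // Hypothetical shard lookup: first coordinate divided by shard size.
    pub fn which(&self, coord: &[u32]) -> SessionResult<usize> {
        coord
            .first()
            .map(|c| (c / self.shard_size) as usize)
            .ok_or_else(|| "empty coordinate".to_string())
    }
}

// The `for` loop expressed as a fallible fold: instead of mutating a map in
// the loop body, the accumulator is passed through each step and returned.
pub fn group_by_shard(
    shards: &Shards,
    chunks: impl IntoIterator<Item = SessionResult<ChunkInfo>>,
) -> SessionResult<HashMap<usize, Vec<ChunkInfo>>> {
    chunks.into_iter().try_fold(HashMap::new(), |mut acc, chunk| {
        let chunk = chunk?;
        let shard_index = shards.which(&chunk.coord)?;
        acc.entry(shard_index).or_insert_with(Vec::new).push(chunk);
        Ok(acc)
    })
}
```

With a stream, the fold body stays the same; only the combinator changes to an async TryStreamExt::try_fold.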

// - 0: 120
// - path: ./temperature # 4D variable: (time, level, latitude, longitude)
// manifest-split-sizes:
// - "level": 1 # alternatively 0: 1
dcherian (Contributor Author):

Needs validation, e.g.: do these dimensions exist? Does that axis number make sense?
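A hedged sketch of what that validation could look like; the function name and input shapes are invented for illustration and do not reflect the PR's actual config types:

```rust
// Hypothetical validation pass for `manifest-split-sizes`.
pub fn validate_split_sizes(
    dim_names: &[&str],          // e.g. ["time", "level", "latitude", "longitude"]
    split_sizes: &[(&str, u32)], // (dimension name or axis number, split size)
) -> Result<(), String> {
    let ndim = dim_names.len();
    for (key, size) in split_sizes {
        if *size == 0 {
            return Err(format!("split size for {key:?} must be at least 1"));
        }
        match key.parse::<usize>() {
            // Axis numbers must fall inside the array's dimensionality.
            Ok(axis) if axis >= ndim => {
                return Err(format!(
                    "axis {axis} out of range for a {ndim}-dimensional array"
                ));
            }
            Ok(_) => {}
            // Otherwise the key must be a known dimension name.
            Err(_) if !dim_names.contains(key) => {
                return Err(format!("unknown dimension {key:?}"));
            }
            Err(_) => {}
        }
    }
    Ok(())
}
```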

}

static DEFAULT_MANIFEST_PRELOAD_CONFIG: OnceLock<ManifestPreloadConfig> = OnceLock::new();
static DEFAULT_MANIFEST_SHARDING_CONFIG: OnceLock<ManifestShardingConfig> =
    OnceLock::new();

impl ManifestConfig {
    pub fn merge(&self, other: Self) -> Self {
dcherian (Contributor Author):

Merging combines a user-defined value with the library's default value.

        Self {
            preload: other.preload.or(self.preload.clone()),
            // FIXME: why prioritize one over the other?
            sharding: other.sharding.or(self.sharding.clone()),
dcherian (Contributor Author):

TODO: this could be overwrite instead. We need to be careful about ordering after merge.
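For reference, Option::or keeps the receiver when it is Some, so other.preload.or(self.preload.clone()) lets the caller-supplied other win and only falls back to self; a tiny sanity check:

```rust
fn main() {
    let user: Option<u32> = Some(5);
    let default: Option<u32> = Some(10);
    // The receiver (left side) wins when it is `Some`...
    assert_eq!(user.or(default), Some(5));
    // ...and the argument is used only as a fallback.
    assert_eq!(None.or(default), Some(10));
}
```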

@dcherian (Contributor Author) commented on Apr 2, 2025:

Local benchmarks as of late last night:

Read: 4x speedup

python benchmarks/runner.py --pytest '-m read_benchmark' main 137f2834

Benchmark setup: one array with shape 500_000_000 and chunks=1000 (500_000 chunk refs), shard_size=100_000, so 5 shards with 100_000 chunk refs each. Reading one element (so really one chunk request).

    def fn():
        repo = ic.Repository.open(
            storage=synth_dataset.storage,
            config=ic.RepositoryConfig(manifest=ic.ManifestConfig(preload=preload)),
        )
        ds = xr.open_zarr(
            repo.readonly_session("main").store,
            group=synth_dataset.group,
            chunks=None,
            consolidated=False,
        )
        subset = ds.isel(synth_dataset.chunk_selector)
        subset[synth_dataset.load_variables].compute()

Parameterized over:

  • sharding : either unsharded or sharded (5 shards)
  • preload: off or default (makes no difference)
------------------ benchmark 'xarray-read test_time_xarray_read_chunks_cold_cache': 6 tests ------------------
Name (time in ms)                                                                       Median
--------------------------------------------------------------------------------------------------------------
test_time_xarray_read_chunks_cold_cache[large-manifest-sharded-default] (this PR)       10.1482 (1.0)
test_time_xarray_read_chunks_cold_cache[large-manifest-sharded-off] (this PR)           10.2042 (1.01)
test_time_xarray_read_chunks_cold_cache[large-manifest-unsharded-default] (main)        43.5519 (4.29)
test_time_xarray_read_chunks_cold_cache[large-manifest-unsharded-default] (this PR)     45.1441 (4.45)
test_time_xarray_read_chunks_cold_cache[large-manifest-unsharded-off] (main)            43.3678 (4.27)
test_time_xarray_read_chunks_cold_cache[large-manifest-unsharded-off] (this PR)         45.2304 (4.46)
--------------------------------------------------------------------------------------------------------------

Write: 10% slowdown on commit.

python benchmarks/runner.py --pytest '-k write_sharded_refs' main 137f2834

Benchmark setup: this measures only session.commit after setting 500_000 virtual chunk refs.

sharding: None (default) or shard size = 10_000 refs, so 50 shards in total.

The slowdown shows up even for the default case of writing a single shard. But this is only about 0.5 s, so I don't think it matters much.

---------------- benchmark 'refs-write test_write_sharded_refs': 3 tests ----------------
Name (time in ms)                                                    Median
-----------------------------------------------------------------------------------------
test_write_sharded_refs[no-sharding-large-1d] (main)                 491.2220 (1.0)
test_write_sharded_refs[no-sharding-large-1d] (this PR)              545.9192 (1.11)
test_write_sharded_refs[shard-size-10_000-large-1d] (this PR)        555.0974 (1.13)
-----------------------------------------------------------------------------------------

dcherian added 6 commits April 2, 2025 13:43
Local
-----

S3
--
* main:
  Fix `Diff` python typing (#890)
  Fail when creating Storage for Tigris using s3_compatible (#889)
  Disallow tag/branch creation with non-existing snapshot (#888)
  Log errors during listing and deleting of objects (#886)
  Rust integration tests can run in more object stores. (#884)
  Update pyo3. (#885)
  Add expiration to stateful test (#868)
* main:
  Better `Debug` instances and __repr__ methods. (#891)
  Add chunk container repr, fix test dataset (#893)
dcherian added 2 commits April 4, 2025 13:59
* main:
  Release version v0.2.12 (#894)
  Use dask array native reduction (#864)
  Update sample-datasets page (#887)
@dcherian changed the title from "Manifest Sharding" to "Manifest Splitting" on Apr 4, 2025

/// ]
/// );
/// assert_eq!(actual, expected);
/// ```
Collaborator:

would you be willing to write some property tests for this function?

@@ -87,6 +100,15 @@ impl ArrayShape {
        }
    }

// Implement indexing for immutable access
impl Index<usize> for ArrayShape {
Collaborator:

What I don't like about this: panics. Maybe it should return an Option? Not sure if that's something people do.

dcherian (Contributor Author):

I think that's what the trait requires, so our hands are tied, no?

I could add a get method instead that returns an Option.

Collaborator:

Yes, that is what I was trying to say; a get method returning an Option would be safer.
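A sketch of that safer accessor, with ArrayShape stubbed as a plain wrapper for illustration:

```rust
// Stubbed `ArrayShape` for illustration; the real type is in the PR.
pub struct ArrayShape(Vec<u64>);

impl ArrayShape {
    // Panic-free counterpart to `Index<usize>`: out-of-range axes yield `None`
    // instead of aborting, mirroring `slice::get`.
    pub fn get(&self, axis: usize) -> Option<u64> {
        self.0.get(axis).copied()
    }
}
```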

@@ -1708,6 +1899,7 @@ async fn fetch_manifest(
/// available in `from` and `to` arguments.
///
/// Yes, this is horrible.
#[allow(dead_code)]
Collaborator:

we no longer use this shit?

dcherian (Contributor Author):

Yeah, I do a pass to group the references into a manifest shard, so I just accumulate in that pass. I can delete it.

* main:
  Bump the rust-dependencies group with 2 updates (#909)
  Release version 0.2.13 (#907)
  Skip bytes logging in object_store (#906)
  More randomness for test repo prefixes (#905)
  S3 Storage supports setting storage class (#903)
  Update configuration.md (#899)
  Add example to exercise high read concurrency (#896)
  Bump the rust-dependencies group with 2 updates (#897)