Add registry for chunk key encodings for extensibility #3436

RFLeijenaar · 2025-09-05T10:50:39Z

Following #3433 (@d-v-b), I have implemented a registry for chunk key encodings. This allows users to subclass ChunkKeyEncoding and create their own implementation.

The scope of ChunkKeyEncoding.from_dict() is reduced to what you would expect it to do: build an instance from a dict. I placed the parsing function in chunk_key_encodings.py because it is used in both array and metadata v3 construction. I wasn't sure where else to put it.
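As a rough illustration of that split, here is a minimal sketch in which `from_dict` only builds from a dict and a separate parsing helper decides what to do with its input. `SketchEncoding` and `parse_encoding` are hypothetical names made up for this example; they are not the classes or functions in this PR.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SketchEncoding:
    """Hypothetical stand-in for a ChunkKeyEncoding subclass."""

    separator: str = "/"

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> SketchEncoding:
        # Only builds from a dict; other input types are not handled here.
        return cls(**data.get("configuration", {}))


def parse_encoding(data: SketchEncoding | dict[str, Any]) -> SketchEncoding:
    """Hypothetical shared parser usable from several construction paths."""
    if isinstance(data, SketchEncoding):
        # Already an instance: pass it through unchanged.
        return data
    if isinstance(data, dict):
        return SketchEncoding.from_dict(data)
    raise TypeError(f"Expected an encoding or a dict, got {type(data).__name__}")
```

Keeping the input dispatch in one helper means both callers share the same behaviour, while `from_dict` stays a plain constructor.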

TODO:

Add unit tests and/or doctests in docstrings
Add docstrings and API docs for any new/modified user-facing classes and functions
New/modified features documented in docs/user-guide/*.rst
Changes documented as a new file in changes/
GitHub Actions have all passed
Test coverage is 100% (Codecov passes)

codecov · 2025-09-05T11:11:17Z

Codecov Report

❌ Patch coverage is 83.01887% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.86%. Comparing base (3d0e40e) to head (fe11bf8).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/zarr/registry.py | 65.38% | 9 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3436      +/-   ##
==========================================
- Coverage   94.92%   94.86%   -0.06%     
==========================================
  Files          79       79              
  Lines        9491     9518      +27     
==========================================
+ Hits         9009     9029      +20     
- Misses        482      489       +7

| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/zarr/core/array.py | `97.44% <100.00%> (-0.01%)` | ⬇️ |
| src/zarr/core/chunk_key_encodings.py | `89.70% <100.00%> (+3.55%)` | ⬆️ |
| src/zarr/core/config.py | `83.33% <ø> (ø)` | |
| src/zarr/core/metadata/v3.py | `90.15% <100.00%> (ø)` | |
| src/zarr/registry.py | `85.20% <65.38%> (-3.61%)` | ⬇️ |


RFLeijenaar · 2025-09-05T11:50:46Z

Note that the current implementation does not use the global config. This means that it only supports a single implementation (registration) for each chunk key encoding 'type' as indicated by the name field. Like codecs (parse_codecs), it uses the name field to retrieve the correct class from the registry. This does mean that the chunk key encoding should always be registered with a qualname that matches the name field.

With the entrypoint plugin mechanism this currently doesn't work correctly. It uses the full class path rather than the entrypoint name e.name for the key.
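To make the mismatch concrete, the sketch below builds a standard-library `EntryPoint` by hand and compares the two candidate registry keys. The package, class, and group names are hypothetical, and the lookup is simplified to a set membership test rather than the actual registry code.

```python
from importlib.metadata import EntryPoint

# A hypothetical entry point, as a plugin package might declare it.
ep = EntryPoint(
    name="my_encoding",
    value="my_package.encodings:MyChunkKeyEncoding",
    group="example.chunk_key_encodings",  # assumed group name, for illustration only
)

# Key derived from the full class path (the behaviour described as problematic).
module, _, attr = ep.value.partition(":")
key_by_path = f"{module}.{attr}"

# Key taken from the entry point name (matches the metadata `name` field).
key_by_name = ep.name

# Array metadata refers to the encoding by its `name` field.
metadata_name = "my_encoding"
found_by_path = metadata_name in {key_by_path}
found_by_name = metadata_name in {key_by_name}
```

Under these assumptions only the `e.name` key can be found from the metadata `name` field, which is why a lookup keyed by class path fails.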

I will update the registry to bring it in line with how codecs are implemented, with a dict of registries, that map a chunk key encoding type (name) to a registry that contains possibly multiple implementations for that type.

d-v-b · 2025-09-05T13:20:51Z

> I will update the registry to bring it in line with how codecs are implemented, with a dict of registries, that map a chunk key encoding type (name) to a registry that contains possibly multiple implementations for that type.

You don't need to copy the codec registry. A simple {name: class} mapping is fine for chunk key encodings.
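A minimal sketch of such a `{name: class}` mapping follows. The function names `register_encoding` and `get_encoding_class`, and the `SlashEncoding` class, are invented for this example and are not the API added in this PR.

```python
from __future__ import annotations

# One class per encoding name; registering the same name again replaces the entry.
_registry: dict[str, type] = {}


def register_encoding(name: str, cls: type) -> None:
    """Register `cls` under the metadata `name` it should be looked up by."""
    _registry[name] = cls


def get_encoding_class(name: str) -> type:
    """Return the class registered for `name`, or raise a descriptive KeyError."""
    try:
        return _registry[name]
    except KeyError:
        known = sorted(_registry)
        raise KeyError(f"Unknown chunk key encoding {name!r}; registered: {known}") from None


class SlashEncoding:
    """Hypothetical encoding that joins chunk coordinates with '/'."""

    def encode(self, coords: tuple[int, ...]) -> str:
        return "/".join(map(str, coords))


register_encoding("slash", SlashEncoding)
```

With a single class per name there is nothing to select between, so no configuration entry is needed to resolve a lookup.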

d-v-b · 2025-09-05T13:22:30Z

For context, the codec registry associates multiple codec classes with a single codec identifier because of a need specific to codecs (running the same codec algorithm on a CPU vs. a GPU). Chunk key encodings will never be run on specialized hardware, so we can use a simpler mapping.
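For comparison, a registry that allows several implementations per identifier could look roughly like the sketch below. This is an illustration of the general idea only, not the codec registry's actual code; `register`, `resolve`, and the two codec classes are hypothetical, and falling back to the most recently registered class is an assumption made for this example.

```python
from __future__ import annotations

from collections import defaultdict

# identifier -> {fully qualified class name -> class}
_implementations: dict[str, dict[str, type]] = defaultdict(dict)


def register(name: str, cls: type) -> None:
    """Add `cls` as one of possibly several implementations of `name`."""
    qualname = f"{cls.__module__}.{cls.__qualname__}"
    _implementations[name][qualname] = cls


def resolve(name: str, preferred: str | None = None) -> type:
    """Pick an implementation; `preferred` plays the role of a config setting."""
    candidates = _implementations.get(name, {})
    if not candidates:
        raise KeyError(f"No implementation registered for {name!r}")
    if preferred is not None:
        return candidates[preferred]
    # Assumed fallback: the most recently registered implementation.
    return list(candidates.values())[-1]


class CpuCodec:
    """Hypothetical CPU implementation."""


class GpuCodec:
    """Hypothetical GPU implementation of the same algorithm."""


register("example", CpuCodec)
register("example", GpuCodec)
```

The extra level of indirection only pays off when one identifier really has several interchangeable implementations.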

RFLeijenaar · 2025-09-05T13:37:17Z

> For context, the codec registry associates multiple codec classes with a single codec identifier because of a need specific to codecs (running the same codec algorithm on a CPU vs. a GPU). Chunk key encodings will never be run on specialized hardware, so we can use a simpler mapping.

Originally, I implemented it such that it would only allow one implementation, but there were some issues as noted in my other comment. That said, these can be resolved by always using the name field as the key, and for any entrypoint e setting the key to e.name.

Regarding different implementations, it doesn't need to be on special hardware, right? One could also make their own implementation, possibly with bindings in a faster language. Not that I think that this would be common for chunk key encodings.

Let me know if you want me to revert to the simpler design that allows only one implementation.

RFLeijenaar added 2 commits September 5, 2025 12:26

Add registry for chunk key encodings.

0379429

Fix error message for unknown chunk key encoding in create_array test

904df22

github-actions bot added the needs release notes label Sep 5, 2025

Removed unnecessary type ignore

f3e4275

RFLeijenaar added 3 commits September 5, 2025 14:57

Separate chunk key encoding 'name' (key) from its implementation indicated by qualname. Set default chunk key encoding implementations for `default` and `v2` in global config.

5efb587

Remove comments and update test_config.

b0091d9

Update output of zarr.config.pprint() in docs.

281e87d

Move parsing of init args in CKE to __post_init__.

fe11bf8

This enables users to add additional fields to a custom ChunkKeyEncoding without having to override __init__ and taking care of immutability of the attrs.
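The pattern described in this commit message can be sketched with a frozen dataclass whose `__post_init__` normalises its fields. `BaseEncoding`, `PrefixedEncoding`, and `parse_separator` are hypothetical names for this example, and the set of allowed separators is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


def parse_separator(value: object) -> str:
    """Validate a separator; the allowed values are assumed for this sketch."""
    if value not in (".", "/"):
        raise ValueError(f"Expected '.' or '/', got {value!r}")
    return str(value)


@dataclass(frozen=True)
class BaseEncoding:
    separator: str = "/"

    def __post_init__(self) -> None:
        # Frozen dataclasses block normal attribute assignment,
        # so parsed values are written back with object.__setattr__.
        object.__setattr__(self, "separator", parse_separator(self.separator))


@dataclass(frozen=True)
class PrefixedEncoding(BaseEncoding):
    # A subclass adds a field without overriding __init__;
    # the inherited __post_init__ still validates `separator`.
    prefix: str = "chunk"

    def encode(self, coords: tuple[int, ...]) -> str:
        return self.separator.join((self.prefix, *map(str, coords)))
```

Because the generated `__init__` is left untouched, the subclass gets its extra field and the base validation for free.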
