Contributor

@dstansby dstansby commented Jul 28, 2025

It has been a longstanding bugbear of mine that there's no easy way to specify a v2 or v3 array type. This especially came up in the context of #3257, which deals specifically with v2/v3 arrays.

This PR adds type parametrization to the Array class.

After this PR, there are lots of improvements that could be made by adding overloads to functions and methods, but to keep review easier I'd like to leave that for a follow-up PR.
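For readers unfamiliar with the idea, here is a minimal sketch of what parametrizing an array class by its metadata type can look like. The metadata classes below are simplified stand-ins, and the TypeVar name and its bound are assumptions for illustration, not the definitions used in this PR.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Generic, TypeVar, Union


@dataclass(frozen=True)
class ArrayV2Metadata:
    """Simplified stand-in for the real v2 metadata class."""

    zarr_format: int = 2


@dataclass(frozen=True)
class ArrayV3Metadata:
    """Simplified stand-in for the real v3 metadata class."""

    zarr_format: int = 3


# Assumed TypeVar: any metadata type that is either v2 or v3.
T_Meta = TypeVar("T_Meta", bound=Union[ArrayV2Metadata, ArrayV3Metadata])


class Array(Generic[T_Meta]):
    """Toy array class whose type parameter records the metadata flavour."""

    def __init__(self, metadata: T_Meta) -> None:
        self.metadata = metadata


# Aliases in the spirit of the ones discussed in the review.
ArrayV2 = Array[ArrayV2Metadata]
ArrayV3 = Array[ArrayV3Metadata]
AnyArray = Array[Any]


def v3_only(arr: ArrayV3) -> int:
    # A type checker can reject Array[ArrayV2Metadata] at the call site.
    return arr.metadata.zarr_format
```

With annotations like these, a function can state that it only accepts format 3 arrays, which is the kind of signature that is hard to express with an unparametrized class.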

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Jul 28, 2025

codecov bot commented Jul 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.93%. Comparing base (e738e2f) to head (2a34f73).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3304   +/-   ##
=======================================
  Coverage   94.92%   94.93%           
=======================================
  Files          79       80    +1     
  Lines        9491     9507   +16     
=======================================
+ Hits         9009     9025   +16     
  Misses        482      482           
Files with missing lines Coverage Δ
src/zarr/api/asynchronous.py 90.87% <100.00%> (ø)
src/zarr/api/synchronous.py 92.95% <100.00%> (ø)
src/zarr/core/array.py 97.44% <100.00%> (-0.01%) ⬇️
src/zarr/core/attributes.py 96.15% <100.00%> (ø)
src/zarr/core/group.py 95.06% <100.00%> (ø)
src/zarr/core/indexing.py 96.26% <100.00%> (ø)
src/zarr/core/metadata/__init__.py 100.00% <100.00%> (ø)
src/zarr/core/sync_group.py 100.00% <100.00%> (ø)
src/zarr/testing/strategies.py 97.82% <100.00%> (+0.01%) ⬆️
src/zarr/types.py 100.00% <100.00%> (ø)

@dstansby dstansby force-pushed the array-param branch 3 times, most recently from 53da48a to d9e50d2 Compare July 28, 2025 14:51
@dstansby dstansby marked this pull request as ready for review July 28, 2025 15:46
@github-actions github-actions bot removed the needs release notes Automatically applied to PRs which haven't added release notes label Jul 28, 2025
AsyncArrayV3: TypeAlias = AsyncArray[ArrayV3Metadata]
"""A Zarr format 3 `AsyncArray`"""

AnyArray: TypeAlias = Array[Any]
Contributor

Should this be Array[ArrayV2Metadata] | Array[ArrayV3Metadata]?

Contributor Author

Yep, I guess that's safer 👍

Contributor Author

I couldn't get that to work, because of an error that I only partly understand ☹️. I think the current state with Any is fine though - .metadata still gets inferred correctly:

import zarr
import zarr.storage

arr = zarr.ones(shape=(1,))
meta = arr.metadata
reveal_locals()
"""
test.py:6: note: Revealed local types are:
test.py:6: note:     arr: zarr.core.array.Array[Any]
test.py:6: note:     meta: zarr.core.metadata.v2.ArrayV2Metadata | zarr.core.metadata.v3.ArrayV3Metadata
"""

@dstansby dstansby added this to the 3.1.2 milestone Jul 30, 2025
@dstansby dstansby force-pushed the array-param branch 3 times, most recently from bbabb9b to a168b6c Compare August 5, 2025 12:00
-async def empty_like(
-    a: ArrayLike, **kwargs: Any
-) -> AsyncArray[ArrayV2Metadata] | AsyncArray[ArrayV3Metadata]:
+async def empty_like(a: ArrayLike, **kwargs: Any) -> AnyAsyncArray:
Contributor

if you have the energy for it, I think we could type **kwargs in one place with a TypedDict and use Unpack in the signatures here.

this would allow us to use an overload to declare that empty_like(..., zarr_format=2) -> AsyncArray[ArrayV2Metadata]
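A rough sketch of that suggestion, under several assumptions: the classes are simplified stand-ins, `CreateKwargs` and its fields are hypothetical, the function is synchronous for brevity, and `zarr_format` is pulled out as an explicit keyword so the overloads can dispatch on it. Treating format 3 as the default is also an assumption here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import TYPE_CHECKING, Any, Generic, Literal, TypedDict, TypeVar, Union, overload

if TYPE_CHECKING:
    # Only the type checker needs Unpack; annotations are not evaluated at runtime.
    from typing_extensions import Unpack


@dataclass
class ArrayV2Metadata:  # simplified stand-in
    shape: tuple[int, ...]


@dataclass
class ArrayV3Metadata:  # simplified stand-in
    shape: tuple[int, ...]


T_Meta = TypeVar("T_Meta", bound=Union[ArrayV2Metadata, ArrayV3Metadata])


class AsyncArray(Generic[T_Meta]):
    def __init__(self, metadata: T_Meta) -> None:
        self.metadata = metadata

    @property
    def shape(self) -> tuple[int, ...]:
        return self.metadata.shape


class CreateKwargs(TypedDict, total=False):
    """Hypothetical shared keyword arguments, declared once."""

    shape: tuple[int, ...]
    fill_value: float


@overload
def empty_like(
    a: Any, *, zarr_format: Literal[2], **kwargs: Unpack[CreateKwargs]
) -> AsyncArray[ArrayV2Metadata]: ...
@overload
def empty_like(
    a: Any, *, zarr_format: Literal[3] = ..., **kwargs: Unpack[CreateKwargs]
) -> AsyncArray[ArrayV3Metadata]: ...
def empty_like(a: Any, *, zarr_format: int = 3, **kwargs: Any) -> Any:
    # Toy implementation: copy the shape and pick the metadata flavour.
    shape = tuple(kwargs.get("shape", a.shape))
    if zarr_format == 2:
        return AsyncArray(ArrayV2Metadata(shape=shape))
    return AsyncArray(ArrayV3Metadata(shape=shape))
```

With overloads like these, a type checker can infer `AsyncArray[ArrayV2Metadata]` for `empty_like(a, zarr_format=2)` and flag misspelled keyword arguments, since `**kwargs` is no longer `Any`.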

Contributor Author

I would like to leave anything beyond the new parametrization and using the new Any* types to follow-up PRs if that's okay, to keep this PR tightly scoped.

@d-v-b d-v-b mentioned this pull request Aug 13, 2025
@dstansby dstansby force-pushed the array-param branch 3 times, most recently from 9c3938f to 2807003 Compare August 26, 2025 14:44
@@ -2091,7 +2092,7 @@ def open(
             Array opened from the store.
         """
         async_array = sync(AsyncArray.open(store))
-        return cls(async_array)
+        return Array(async_array)
Contributor

why this change? I think using cls here ensures that subclasses can re-use this method

Contributor Author

Ah yes, the correct fix here was to keep cls but change the return type to Self. Fixed, and fixed in a couple of other places in this file too.
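To illustrate why keeping cls and annotating the return type as Self helps subclasses, here is a minimal sketch. The class body is a toy stand-in, `LoggingArray` is a hypothetical subclass, and the dictionary stands in for the result of `sync(AsyncArray.open(store))`.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Self is only needed by the type checker; annotations stay unevaluated at runtime.
    from typing_extensions import Self


class Array:
    def __init__(self, async_array: object) -> None:
        self._async_array = async_array

    @classmethod
    def open(cls, store: str) -> Self:
        # Stand-in for `sync(AsyncArray.open(store))`.
        async_array = {"store": store}
        # Using cls (not Array) means subclasses get an instance of themselves.
        return cls(async_array)


class LoggingArray(Array):
    """Hypothetical subclass that reuses the inherited constructor."""


opened = LoggingArray.open("memory://example")
```

Here `opened` is a `LoggingArray` at runtime, and a type checker infers the same thing from the Self annotation. Returning `Array(async_array)` would have silently downgraded it to the base class.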

Comment on lines +3 to +5
from zarr.core.array import Array, AsyncArray
from zarr.core.metadata.v2 import ArrayV2Metadata
from zarr.core.metadata.v3 import ArrayV3Metadata
Contributor

these imports will prevent zarr.core.array from using anything defined in zarr.types, is that intentional?

Contributor

(outside of a type-checking context, I mean)

Contributor

thinking more about this, I think as long as everything in zarr.types is imported exclusively in if TYPE_CHECKING blocks, then there's no risk of circular imports. we would only have circular imports if we tried to work with types as values outside of type-checking, for example using get_args(ZarrFormat). We can always avoid this by defining a typed constant value somewhere else in the codebase.
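A small sketch of the pattern described above, with assumed names: `my_pkg.types` is a hypothetical types-only module, and `ZARR_FORMATS` and `parse_zarr_format` are illustrative helpers rather than existing zarr APIs.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Final

if TYPE_CHECKING:
    # Seen only by type checkers, so this import cannot create a cycle at runtime.
    # `my_pkg.types` is a hypothetical module that defines ZarrFormat = Literal[2, 3].
    from my_pkg.types import ZarrFormat

# Runtime code that needs the allowed values uses a typed constant defined here,
# instead of calling get_args(ZarrFormat) on a name that does not exist at runtime.
ZARR_FORMATS: Final[tuple[ZarrFormat, ...]] = (2, 3)


def parse_zarr_format(value: int) -> ZarrFormat:
    """Validate an integer against the allowed formats."""
    for fmt in ZARR_FORMATS:
        if value == fmt:
            return fmt
    raise ValueError(f"zarr_format must be one of {ZARR_FORMATS}, got {value}")
```

Because `from __future__ import annotations` keeps annotations unevaluated, the module runs even though `ZarrFormat` is never imported at runtime.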

Contributor Author

I certainly didn't think about this! So neither intentional nor unintentional. I'm not sure I see a better solution, but I'm open to suggestions.

Contributor

d-v-b commented Sep 3, 2025

I think these changes are good, as they make it easier to track the v2-ness or v3-ness of the Array class.

But since this PR changes the Array class, and it will also require anyone who annotated variables with the Array type to update that annotation, I think it's important that we get a lot of eyes on this PR.

cc @zarr-developers/core-devs
