-
-
Notifications
You must be signed in to change notification settings - Fork 358
Parametrize Array with v2/v3 metadata #3304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3304 +/- ##
=======================================
Coverage 94.92% 94.93%
=======================================
Files 79 80 +1
Lines 9491 9507 +16
=======================================
+ Hits 9009 9025 +16
Misses 482 482
🚀 New features to boost your workflow:
|
53da48a
to
d9e50d2
Compare
AsyncArrayV3: TypeAlias = AsyncArray[ArrayV3Metadata] | ||
"""A Zarr format 3 `AsyncArray`""" | ||
|
||
AnyArray: TypeAlias = Array[Any] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be Array[ArrayV2Metadata] | Array[ArrayV3Metadata]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I guess that's safer 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't get that to work, beacuse of an error that I sort of but not really understand Any
is fine though - .metadata
still gets inferred correctly:
import zarr
import zarr.storage
arr = zarr.ones(shape=(1,))
meta = arr.metadata
reveal_locals()
"""
test.py:6: note: Revealed local types are:
test.py:6: note: arr: zarr.core.array.Array[Any]
test.py:6: note: meta: zarr.core.metadata.v2.ArrayV2Metadata | zarr.core.metadata.v3.ArrayV3Metadata
"""
bbabb9b
to
a168b6c
Compare
async def empty_like( | ||
a: ArrayLike, **kwargs: Any | ||
) -> AsyncArray[ArrayV2Metadata] | AsyncArray[ArrayV3Metadata]: | ||
async def empty_like(a: ArrayLike, **kwargs: Any) -> AnyAsyncArray: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you have the energy for it, I think we could type **kwargs on one place with a typeddict and use Unpack
in the signatures here.
this would allow us to use an overload
to declare that empty_like(..., zarr_format=2) -> AsyncArray[ArrayV2Metadata]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to leave anything beyond the new parametrization and using the new Any*
types to follow up PRs if that's okay, to keep this PR tightly scoped.
9c3938f
to
2807003
Compare
2807003
to
5e56f56
Compare
src/zarr/core/array.py
Outdated
@@ -2091,7 +2092,7 @@ def open( | |||
Array opened from the store. | |||
""" | |||
async_array = sync(AsyncArray.open(store)) | |||
return cls(async_array) | |||
return Array(async_array) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this change? I think using cls
here ensures that subclasses can re-use this method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, the correct fix here was to keep cls
but change the return type to Self
. Fixed, and fixed in a couple of other places in this file too.
from zarr.core.array import Array, AsyncArray | ||
from zarr.core.metadata.v2 import ArrayV2Metadata | ||
from zarr.core.metadata.v3 import ArrayV3Metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these imports will prevent zarr.core.array
from using anything defined in zarr.types
, is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(outside of a type-checking context, I mean)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking more about this, I think as long as everything in zarr.types
is imported exclusively in if TYPE_CHECKING
blocks, then there's no risk of circular imports. we would only have circular imports if we tried to work with types as values outside of type-checking, for example using get_args(ZarrFormat)
. We can always avoid this by defining a typed constant value somewhere else in the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I certinaly didn't think about this! So neither intentional or unintentional. I'm not sure I see a better solution, but open to suggestions.
I think these changes are good, as they make it easier to track the v2-ness of v3-ness of the But since this PR changes the cc @zarr-developers/core-devs |
It has been a longstanding bugbear of mine that there's no easy way to specify a v2 or v3 array type. This especially came up in the context of #3257, which deals specifically with v2/v3 arrays.
This PR ads type parametrization to the
Array
class.After this PR, there is lots of improvements to adding overloads to functions and methods that could be made, but to keep review easier I'd like to leave that for a follow up PR.