Commit 0e54a6e

Merge branch 'main' into PYL-W0404
2 parents 05ba029 + b267ff6 commit 0e54a6e

12 files changed: +75 -153 lines changed

.github/workflows/hypothesis.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -87,7 +87,7 @@ jobs:
8787
&& steps.status.outcome == 'failure'
8888
&& github.event_name == 'schedule'
8989
&& github.repository_owner == 'zarr-developers'
90-
uses: xarray-contrib/issue-from-pytest-log@v1
90+
uses: scientific-python/issue-from-pytest-log-action@v1
9191
with:
9292
log-path: output-${{ matrix.python-version }}-log.jsonl
9393
issue-title: "Nightly Hypothesis tests failed"

changes/2819.chore.rst

Lines changed: 17 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,18 @@
11
Ensure that invocations of ``create_array`` use consistent keyword arguments, with consistent defaults.
2-
Specifically, ``zarr.api.synchronous.create_array`` now takes a ``write_data`` keyword argument; The
3-
``create_array`` method on ``zarr.Group`` takes ``data`` and ``write_data`` keyword arguments. The ``fill_value``
4-
keyword argument of the various invocations of ``create_array`` has been consistently set to ``None``, where previously it was either ``None`` or ``0``.
2+
3+
- ``zarr.api.synchronous.create_array`` now takes a ``write_data`` keyword argument.
4+
- The ``Group.create_array`` method takes ``data`` and ``write_data`` keyword arguments.
5+
- The default ``fill_value`` of the functions ``api.asynchronous.create`` and ``api.asynchronous.create_array``
6+
and of the methods ``Group.create_array`` and ``Group.array`` changed from ``0`` to
7+
``DEFAULT_FILL_VALUE``, which instructs Zarr to
8+
use the default scalar value associated with the array's data type as the fill value. These are
9+
all array-creation functions or methods that mirror, wrap, or are wrapped by another function
10+
that already defaults ``fill_value`` to ``DEFAULT_FILL_VALUE``. This change is necessary
11+
to make these functions consistent across the entire codebase, but because it changes default values,
12+
newly created data might have a different fill value than expected.
13+
14+
For data types where 0 is meaningful, like integers or floats, the default scalar is 0, so this
15+
change should not be noticeable. For data types where 0 is ambiguous, like fixed-length unicode
16+
strings, the default fill value might be different after this change. Users who were relying on how
17+
Zarr interpreted ``0`` as a non-numeric scalar value should set their desired fill value explicitly
18+
after this change.
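
The behaviour described in this changelog entry can be illustrated without Zarr itself. The sketch below is a stand-in, not Zarr's implementation: ``DEFAULT_FILL_VALUE`` here is a local sentinel object, ``default_scalar`` and ``resolve_fill_value`` are hypothetical helpers, and the table of default scalars is an assumption made for illustration only.

```python
# Hypothetical stand-in for a "use the data type's default" sentinel.
# None of these names are Zarr's real implementation.
DEFAULT_FILL_VALUE = object()

# Assumed default scalars for a few illustrative kinds of data type.
_DEFAULT_SCALARS = {"int": 0, "float": 0.0, "bool": False, "str": ""}


def default_scalar(kind):
    """Return the assumed default scalar for a kind of data type."""
    return _DEFAULT_SCALARS[kind]


def resolve_fill_value(kind, fill_value=DEFAULT_FILL_VALUE):
    """Return the caller's fill value, or the data type's default if none was given."""
    if fill_value is DEFAULT_FILL_VALUE:
        # The sentinel means "ask the data type", rather than "use 0".
        return default_scalar(kind)
    return fill_value
```

In this sketch, numeric kinds still resolve to zero, while a string kind resolves to its own default unless the caller passes a fill value explicitly, for example ``resolve_fill_value("str", fill_value="0")``.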

changes/2874.feature.rst

Lines changed: 18 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,20 @@
1-
Adds zarr-specific data type classes. This replaces the internal use of numpy data types for zarr
2-
v2 and a fixed set of string enums for zarr v3. This change is largely internal, but it does
3-
change the type of the ``dtype`` and ``data_type`` fields on the ``ArrayV2Metadata`` and
4-
``ArrayV3Metadata`` classes. It also changes the JSON metadata representation of the
5-
variable-length string data type, but the old metadata representation can still be
6-
used when reading arrays. The logic for automatically choosing the chunk encoding for a given data
7-
type has also changed, and this necessitated changes to the ``config`` API.
1+
Adds zarr-specific data type classes.
2+
3+
This change adds a ``ZDType`` base class for Zarr V2 and Zarr V3 data types. Child classes are
4+
defined for each NumPy data type. Each child class defines routines for ``JSON`` serialization.
5+
New data types can be created and registered dynamically.
6+
7+
Prior to this change, Zarr Python had two streams for handling data types. For Zarr V2 arrays,
8+
we used NumPy data type identifiers. For Zarr V3 arrays, we used a fixed set of string enums. Both
9+
of these systems proved hard to extend.
10+
11+
This change is largely internal, but it does change the type of the ``dtype`` and ``data_type``
12+
fields on the ``ArrayV2Metadata`` and ``ArrayV3Metadata`` classes. Previously, ``ArrayV2Metadata.dtype``
13+
was a NumPy ``dtype`` object, and ``ArrayV3Metadata.data_type`` was an internally-defined ``enum``.
14+
After this change, both ``ArrayV2Metadata.dtype`` and ``ArrayV3Metadata.data_type`` are instances of
15+
``ZDType``. A NumPy data type can be generated from a ``ZDType`` via the ``ZDType.to_native_dtype()``
16+
method. The internally-defined Zarr V3 ``enum`` class is gone entirely, but the ``ZDType.to_json(zarr_format=3)``
17+
method can be used to generate either a string or a dictionary with a string ``name`` field, either of which
18+
represents the string value previously associated with that ``enum``.
819

920
For more on this new feature, see the `documentation </user-guide/data_types.html>`_
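
Code that previously read the string value of the Zarr V3 ``enum`` may want a single string back. A minimal sketch follows, assuming the V3 JSON form has exactly the two shapes described in this entry (a plain string, or a dictionary with a string ``name`` field). The helper name ``v3_type_name`` is hypothetical and is not part of Zarr.

```python
from typing import Any, Mapping, Union


def v3_type_name(encoded: Union[str, Mapping[str, Any]]) -> str:
    """Extract the data type name from a V3 JSON encoding.

    Hypothetical helper: accepts either a bare string or a mapping
    that carries the name under the "name" key.
    """
    if isinstance(encoded, str):
        return encoded
    name = encoded["name"]
    if not isinstance(name, str):
        raise TypeError(f"expected a string 'name' field, got {type(name).__name__}")
    return name
```

With a ``ZDType`` instance ``dtype``, the call would presumably look like ``v3_type_name(dtype.to_json(zarr_format=3))``.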

changes/3233.feature.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Add an alternate ``from_array_metadata_and_store`` constructor to ``CodecPipeline``.
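
In this commit the new constructor is optional: the base implementation raises ``NotImplementedError``, and ``create_codec_pipeline`` falls back to ``from_codecs`` when that happens. The sketch below reproduces only that control flow. ``BasePipeline``, ``StoreAwarePipeline`` and ``create_pipeline`` are hypothetical stand-ins, and the metadata is a plain dictionary rather than an ``ArrayMetadata`` object.

```python
class BasePipeline:
    """Hypothetical pipeline that only knows how to build itself from codecs."""

    def __init__(self, codecs, store=None):
        self.codecs = tuple(codecs)
        self.store = store

    @classmethod
    def from_codecs(cls, codecs):
        return cls(codecs)

    @classmethod
    def from_array_metadata_and_store(cls, array_metadata, store):
        # Default: signal that callers should fall back to from_codecs.
        raise NotImplementedError(
            f"'{cls.__name__}' does not implement from_array_metadata_and_store."
        )


class StoreAwarePipeline(BasePipeline):
    """Hypothetical subclass that opts in to the store-aware constructor."""

    @classmethod
    def from_array_metadata_and_store(cls, array_metadata, store):
        return cls(array_metadata["codecs"], store=store)


def create_pipeline(pipeline_cls, metadata, store=None):
    """Try the store-aware constructor first, then fall back to from_codecs."""
    if store is not None:
        try:
            return pipeline_cls.from_array_metadata_and_store(metadata, store)
        except NotImplementedError:
            pass
    return pipeline_cls.from_codecs(metadata["codecs"])
```

A pipeline class that does not override the constructor therefore behaves as before, and only classes that opt in receive the store.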

src/zarr/abc/codec.py

Lines changed: 21 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,11 +12,12 @@
1212
from collections.abc import Awaitable, Callable, Iterable
1313
from typing import Self
1414

15-
from zarr.abc.store import ByteGetter, ByteSetter
15+
from zarr.abc.store import ByteGetter, ByteSetter, Store
1616
from zarr.core.array_spec import ArraySpec
1717
from zarr.core.chunk_grids import ChunkGrid
1818
from zarr.core.dtype.wrapper import TBaseDType, TBaseScalar, ZDType
1919
from zarr.core.indexing import SelectorTuple
20+
from zarr.core.metadata import ArrayMetadata
2021

2122
__all__ = [
2223
"ArrayArrayCodec",
@@ -281,6 +282,25 @@ def from_codecs(cls, codecs: Iterable[Codec]) -> Self:
281282
"""
282283
...
283284

285+
@classmethod
286+
def from_array_metadata_and_store(cls, array_metadata: ArrayMetadata, store: Store) -> Self:
287+
"""Creates a codec pipeline from array metadata and a store.
288+
289+
Raises NotImplementedError by default, indicating the CodecPipeline must be created with from_codecs instead.
290+
291+
Parameters
292+
----------
293+
array_metadata : ArrayMetadata
294+
store : Store
295+
296+
Returns
297+
-------
298+
Self
299+
"""
300+
raise NotImplementedError(
301+
f"'{cls.__name__}' does not implement CodecPipeline.from_array_metadata_and_store."
302+
)
303+
284304
@property
285305
@abstractmethod
286306
def supports_partial_decode(self) -> bool: ...

src/zarr/api/synchronous.py

Lines changed: 0 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,6 @@
66

77
import zarr.api.asynchronous as async_api
88
import zarr.core.array
9-
from zarr._compat import _deprecate_positional_args
109
from zarr.core.array import DEFAULT_FILL_VALUE, Array, AsyncArray, CompressorLike
1110
from zarr.core.group import Group
1211
from zarr.core.sync import sync
@@ -160,7 +159,6 @@ def load(
160159
)
161160

162161

163-
@_deprecate_positional_args
164162
def open(
165163
store: StoreLike | None = None,
166164
*,
@@ -255,7 +253,6 @@ def save(
255253
)
256254

257255

258-
@_deprecate_positional_args
259256
def save_array(
260257
store: StoreLike,
261258
arr: NDArrayLike,
@@ -387,7 +384,6 @@ def array(data: npt.ArrayLike | Array, **kwargs: Any) -> Array:
387384
return Array(sync(async_api.array(data=data, **kwargs)))
388385

389386

390-
@_deprecate_positional_args
391387
def group(
392388
store: StoreLike | None = None,
393389
*,
@@ -455,7 +451,6 @@ def group(
455451
)
456452

457453

458-
@_deprecate_positional_args
459454
def open_group(
460455
store: StoreLike | None = None,
461456
*,
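
Several of the functions that lose `@_deprecate_positional_args` in this file, such as `open`, `group` and `open_group`, declare a bare `*` after `store`. As a plain-Python illustration (not Zarr code; `open_sketch` and its `mode` and `path` parameters are hypothetical), parameters after the `*` can only be passed by keyword, and the interpreter raises `TypeError` otherwise:

```python
def open_sketch(store=None, *, mode="r", path=None):
    """Hypothetical function whose options after `*` are keyword-only."""
    return {"store": store, "mode": mode, "path": path}


def call_positionally():
    """Pass a keyword-only option positionally and report what happens."""
    try:
        open_sketch("data.zarr", "w")  # "w" is not accepted positionally
    except TypeError as exc:
        return type(exc).__name__
    return "no error"
```

Calling `open_sketch("data.zarr", mode="w")` succeeds, while `call_positionally()` reports the `TypeError` raised by the interpreter.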

src/zarr/core/array.py

Lines changed: 14 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,6 @@
2525
from typing_extensions import deprecated
2626

2727
import zarr
28-
from zarr._compat import _deprecate_positional_args
2928
from zarr.abc.codec import ArrayArrayCodec, ArrayBytesCodec, BytesBytesCodec, Codec
3029
from zarr.abc.store import Store, set_or_delete
3130
from zarr.codecs._v2 import V2Codec
@@ -192,7 +191,15 @@ def parse_array_metadata(data: Any) -> ArrayMetadata:
192191
raise TypeError # pragma: no cover
193192

194193

195-
def create_codec_pipeline(metadata: ArrayMetadata) -> CodecPipeline:
194+
def create_codec_pipeline(metadata: ArrayMetadata, *, store: Store | None = None) -> CodecPipeline:
195+
if store is not None:
196+
try:
197+
return get_pipeline_class().from_array_metadata_and_store(
198+
array_metadata=metadata, store=store
199+
)
200+
except NotImplementedError:
201+
pass
202+
196203
if isinstance(metadata, ArrayV3Metadata):
197204
return get_pipeline_class().from_codecs(metadata.codecs)
198205
elif isinstance(metadata, ArrayV2Metadata):
@@ -311,7 +318,11 @@ def __init__(
311318
object.__setattr__(self, "metadata", metadata_parsed)
312319
object.__setattr__(self, "store_path", store_path)
313320
object.__setattr__(self, "_config", config_parsed)
314-
object.__setattr__(self, "codec_pipeline", create_codec_pipeline(metadata=metadata_parsed))
321+
object.__setattr__(
322+
self,
323+
"codec_pipeline",
324+
create_codec_pipeline(metadata=metadata_parsed, store=store_path.store),
325+
)
315326

316327
# this overload defines the function signature when zarr_format is 2
317328
@overload
@@ -430,7 +441,6 @@ async def create(
430441

431442
@classmethod
432443
@deprecated("Use zarr.api.asynchronous.create_array instead.")
433-
@_deprecate_positional_args
434444
async def create(
435445
cls,
436446
store: StoreLike,
@@ -1782,7 +1792,6 @@ class Array:
17821792

17831793
@classmethod
17841794
@deprecated("Use zarr.create_array instead.")
1785-
@_deprecate_positional_args
17861795
def create(
17871796
cls,
17881797
store: StoreLike,
@@ -2595,7 +2604,6 @@ def __setitem__(self, selection: Selection, value: npt.ArrayLike) -> None:
25952604
else:
25962605
self.set_basic_selection(cast("BasicSelection", pure_selection), value, fields=fields)
25972606

2598-
@_deprecate_positional_args
25992607
def get_basic_selection(
26002608
self,
26012609
selection: BasicSelection = Ellipsis,
@@ -2719,7 +2727,6 @@ def get_basic_selection(
27192727
)
27202728
)
27212729

2722-
@_deprecate_positional_args
27232730
def set_basic_selection(
27242731
self,
27252732
selection: BasicSelection,
@@ -2815,7 +2822,6 @@ def set_basic_selection(
28152822
indexer = BasicIndexer(selection, self.shape, self.metadata.chunk_grid)
28162823
sync(self._async_array._set_selection(indexer, value, fields=fields, prototype=prototype))
28172824

2818-
@_deprecate_positional_args
28192825
def get_orthogonal_selection(
28202826
self,
28212827
selection: OrthogonalSelection,
@@ -2940,7 +2946,6 @@ def get_orthogonal_selection(
29402946
)
29412947
)
29422948

2943-
@_deprecate_positional_args
29442949
def set_orthogonal_selection(
29452950
self,
29462951
selection: OrthogonalSelection,
@@ -3051,7 +3056,6 @@ def set_orthogonal_selection(
30513056
self._async_array._set_selection(indexer, value, fields=fields, prototype=prototype)
30523057
)
30533058

3054-
@_deprecate_positional_args
30553059
def get_mask_selection(
30563060
self,
30573061
mask: MaskSelection,
@@ -3134,7 +3138,6 @@ def get_mask_selection(
31343138
)
31353139
)
31363140

3137-
@_deprecate_positional_args
31383141
def set_mask_selection(
31393142
self,
31403143
mask: MaskSelection,
@@ -3213,7 +3216,6 @@ def set_mask_selection(
32133216
indexer = MaskIndexer(mask, self.shape, self.metadata.chunk_grid)
32143217
sync(self._async_array._set_selection(indexer, value, fields=fields, prototype=prototype))
32153218

3216-
@_deprecate_positional_args
32173219
def get_coordinate_selection(
32183220
self,
32193221
selection: CoordinateSelection,
@@ -3303,7 +3305,6 @@ def get_coordinate_selection(
33033305
out_array = np.array(out_array).reshape(indexer.sel_shape)
33043306
return out_array
33053307

3306-
@_deprecate_positional_args
33073308
def set_coordinate_selection(
33083309
self,
33093310
selection: CoordinateSelection,
@@ -3401,7 +3402,6 @@ def set_coordinate_selection(
34013402

34023403
sync(self._async_array._set_selection(indexer, value, fields=fields, prototype=prototype))
34033404

3404-
@_deprecate_positional_args
34053405
def get_block_selection(
34063406
self,
34073407
selection: BasicSelection,
@@ -3500,7 +3500,6 @@ def get_block_selection(
35003500
)
35013501
)
35023502

3503-
@_deprecate_positional_args
35043503
def set_block_selection(
35053504
self,
35063505
selection: BasicSelection,

src/zarr/core/group.py

Lines changed: 0 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,6 @@
1515
from typing_extensions import deprecated
1616

1717
import zarr.api.asynchronous as async_api
18-
from zarr._compat import _deprecate_positional_args
1918
from zarr.abc.metadata import Metadata
2019
from zarr.abc.store import Store, set_or_delete
2120
from zarr.core._info import GroupInfo
@@ -2417,7 +2416,6 @@ def create(self, *args: Any, **kwargs: Any) -> Array:
24172416
# Backwards compatibility for 2.x
24182417
return self.create_array(*args, **kwargs)
24192418

2420-
@_deprecate_positional_args
24212419
def create_array(
24222420
self,
24232421
name: str,
@@ -2635,7 +2633,6 @@ def require_array(self, name: str, *, shape: ShapeLike, **kwargs: Any) -> Array:
26352633
"""
26362634
return Array(self._sync(self._async_group.require_array(name, shape=shape, **kwargs)))
26372635

2638-
@_deprecate_positional_args
26392636
def empty(self, *, name: str, shape: ChunkCoords, **kwargs: Any) -> Array:
26402637
"""Create an empty array with the specified shape in this Group. The contents will be filled with
26412638
the array's fill value or zeros if no fill value is provided.
@@ -2657,7 +2654,6 @@ def empty(self, *, name: str, shape: ChunkCoords, **kwargs: Any) -> Array:
26572654
"""
26582655
return Array(self._sync(self._async_group.empty(name=name, shape=shape, **kwargs)))
26592656

2660-
@_deprecate_positional_args
26612657
def zeros(self, *, name: str, shape: ChunkCoords, **kwargs: Any) -> Array:
26622658
"""Create an array, with zero being used as the default value for uninitialized portions of the array.
26632659
@@ -2677,7 +2673,6 @@ def zeros(self, *, name: str, shape: ChunkCoords, **kwargs: Any) -> Array:
26772673
"""
26782674
return Array(self._sync(self._async_group.zeros(name=name, shape=shape, **kwargs)))
26792675

2680-
@_deprecate_positional_args
26812676
def ones(self, *, name: str, shape: ChunkCoords, **kwargs: Any) -> Array:
26822677
"""Create an array, with one being used as the default value for uninitialized portions of the array.
26832678
@@ -2697,7 +2692,6 @@ def ones(self, *, name: str, shape: ChunkCoords, **kwargs: Any) -> Array:
26972692
"""
26982693
return Array(self._sync(self._async_group.ones(name=name, shape=shape, **kwargs)))
26992694

2700-
@_deprecate_positional_args
27012695
def full(
27022696
self, *, name: str, shape: ChunkCoords, fill_value: Any | None, **kwargs: Any
27032697
) -> Array:
@@ -2725,7 +2719,6 @@ def full(
27252719
)
27262720
)
27272721

2728-
@_deprecate_positional_args
27292722
def empty_like(self, *, name: str, data: async_api.ArrayLike, **kwargs: Any) -> Array:
27302723
"""Create an empty sub-array like `data`. The contents will be filled
27312724
with the array's fill value or zeros if no fill value is provided.
@@ -2752,7 +2745,6 @@ def empty_like(self, *, name: str, data: async_api.ArrayLike, **kwargs: Any) ->
27522745
"""
27532746
return Array(self._sync(self._async_group.empty_like(name=name, data=data, **kwargs)))
27542747

2755-
@_deprecate_positional_args
27562748
def zeros_like(self, *, name: str, data: async_api.ArrayLike, **kwargs: Any) -> Array:
27572749
"""Create a sub-array of zeros like `data`.
27582750
@@ -2773,7 +2765,6 @@ def zeros_like(self, *, name: str, data: async_api.ArrayLike, **kwargs: Any) ->
27732765

27742766
return Array(self._sync(self._async_group.zeros_like(name=name, data=data, **kwargs)))
27752767

2776-
@_deprecate_positional_args
27772768
def ones_like(self, *, name: str, data: async_api.ArrayLike, **kwargs: Any) -> Array:
27782769
"""Create a sub-array of ones like `data`.
27792770
@@ -2793,7 +2784,6 @@ def ones_like(self, *, name: str, data: async_api.ArrayLike, **kwargs: Any) -> A
27932784
"""
27942785
return Array(self._sync(self._async_group.ones_like(name=name, data=data, **kwargs)))
27952786

2796-
@_deprecate_positional_args
27972787
def full_like(self, *, name: str, data: async_api.ArrayLike, **kwargs: Any) -> Array:
27982788
"""Create a sub-array like `data` filled with the `fill_value` of `data` .
27992789
@@ -2823,7 +2813,6 @@ def move(self, source: str, dest: str) -> None:
28232813
return self._sync(self._async_group.move(source, dest))
28242814

28252815
@deprecated("Use Group.create_array instead.")
2826-
@_deprecate_positional_args
28272816
def array(
28282817
self,
28292818
name: str,
