TypeError: refs_as_store() got an unexpected keyword argument 'target_protocol' with engine='kerchunk' #10388

Closed
@scottyhq

What happened?

I'm getting a traceback when running the documentation example for opening a local reference file with engine='kerchunk':
https://docs.xarray.dev/en/stable/user-guide/io.html#kerchunk

What did you expect to happen?

I expected the same output as in the docs when this section was introduced (https://docs.xarray.dev/en/v2024.09.0/user-guide/io.html#kerchunk):

ds1
<xarray.Dataset> Size: 264B
Dimensions:  (x: 4, y: 5)
Coordinates:
  * x        (x) int64 32B 10 20 30 40
  * y        (y) datetime64[ns] 40B 2000-01-01 2000-01-02 ... 2000-01-05
    z        (x) object 32B ...
Data variables:
    foo      (x, y) float64 160B ...

Minimal Complete Verifiable Example

import xarray as xr
import numpy as np
import pandas as pd

ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.rand(4, 5))},
    coords={
        "x": [10, 20, 30, 40],
        "y": pd.date_range("2000-01-01", periods=5),
        "z": ("x", list("abcd")),
    },
)

ds.to_netcdf("saved_on_disk.nc")

storage_options = {
    "target_protocol": "file",
}
ds1 = xr.open_dataset(
    "./combined.json",
    engine="kerchunk",
    storage_options=storage_options,
)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[6], line 4
      1 storage_options = {
      2     "target_protocol": "file",
      3 }
----> 4 ds1 = xr.open_dataset(
      5     "./combined.json",
      6     engine="kerchunk",
      7     storage_options=storage_options,
      8 )

File ~/GitHub/xarray/xarray/xarray/backends/api.py:687, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
    675 decoders = _resolve_decoders_kwargs(
    676     decode_cf,
    677     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)    683     decode_coords=decode_coords,
    684 )
    686 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 687 backend_ds = backend.open_dataset(
    688     filename_or_obj,
    689     drop_variables=drop_variables,
    690     **decoders,
    691     **kwargs,
    692 )
    693 ds = _dataset_from_backend_dataset(
    694     backend_ds,
    695     filename_or_obj,
   (...)    705     **kwargs,
    706 )
    707 return ds

File ~/miniforge3/envs/xarray-docs/lib/python3.13/site-packages/kerchunk/xarray_backend.py:13, in KerchunkBackend.open_dataset(self, filename_or_obj, storage_options, open_dataset_options, **kw)
      9 def open_dataset(
     10     self, filename_or_obj, *, storage_options=None, open_dataset_options=None, **kw
     11 ):
     12     open_dataset_options = (open_dataset_options or {}) | kw
---> 13     ref_ds = open_reference_dataset(
     14         filename_or_obj,
     15         storage_options=storage_options,
     16         open_dataset_options=open_dataset_options,
     17     )
     18     return ref_ds

File ~/miniforge3/envs/xarray-docs/lib/python3.13/site-packages/kerchunk/xarray_backend.py:45, in open_reference_dataset(filename_or_obj, storage_options, open_dataset_options)
     42 if open_dataset_options is None:
     43     open_dataset_options = {}
---> 45 store = refs_as_store(filename_or_obj, **storage_options)
     47 return xr.open_zarr(
     48     store, zarr_format=2, consolidated=False, **open_dataset_options
     49 )

TypeError: refs_as_store() got an unexpected keyword argument 'target_protocol'
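
The last frame shows the mechanism: `open_reference_dataset` unpacks `storage_options` directly into `refs_as_store(filename_or_obj, **storage_options)`, so any key that function does not declare raises a TypeError. Below is a minimal sketch of that behaviour using a hypothetical stand-in function. Its parameter names are assumptions for illustration only, not kerchunk's actual signature.

```python
import inspect


def refs_as_store_stub(refs, read_only=False, fs=None):
    """Hypothetical stand-in for illustration; NOT kerchunk's real signature."""
    return {"refs": refs, "read_only": read_only, "fs": fs}


storage_options = {"target_protocol": "file"}

# Unpacking a dict that contains an undeclared key fails in the same way
# as the call in the traceback.
try:
    refs_as_store_stub("./combined.json", **storage_options)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Comparing the dict keys against the declared parameters shows which
# options the callee would reject.
accepted = set(inspect.signature(refs_as_store_stub).parameters)
unexpected = sorted(set(storage_options) - accepted)
print(unexpected)
```

Running the same `inspect.signature` check against the installed `refs_as_store` would show which keyword names that kerchunk version accepts.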

Anything else we need to know?

I came across this while working on #10383.

Maybe @martindurant can shed some light on the traceback and the best solution.

Note that combined.json looks like this:

{
  "version": 1,
  "refs": {
    ".zgroup": "{\"zarr_format\":2}",
    "foo/.zarray": "{\"chunks\":[4,5],\"compressor\":null,\"dtype\":\"<f8\",\"fill_value\":\"NaN\",\"filters\":null,\"order\":\"C\",\"shape\":[4,5],\"zarr_format\":2}",
    "foo/.zattrs": "{\"_ARRAY_DIMENSIONS\":[\"x\",\"y\"],\"coordinates\":\"z\"}",
    "foo/0.0": ["saved_on_disk.h5", 8192, 160],
    "x/.zarray": "{\"chunks\":[4],\"compressor\":null,\"dtype\":\"<i8\",\"fill_value\":null,\"filters\":null,\"order\":\"C\",\"shape\":[4],\"zarr_format\":2}",
    "x/.zattrs": "{\"_ARRAY_DIMENSIONS\":[\"x\"]}",
    "x/0": ["saved_on_disk.h5", 8352, 32],
    "y/.zarray": "{\"chunks\":[5],\"compressor\":null,\"dtype\":\"<i8\",\"fill_value\":null,\"filters\":null,\"order\":\"C\",\"shape\":[5],\"zarr_format\":2}",
    "y/.zattrs": "{\"_ARRAY_DIMENSIONS\":[\"y\"],\"calendar\":\"proleptic_gregorian\",\"units\":\"days since 2000-01-01 00:00:00\"}",
    "y/0": ["saved_on_disk.h5", 8384, 40],
    "z/.zarray": "{\"chunks\":[4],\"compressor\":null,\"dtype\":\"|O\",\"fill_value\":null,\"filters\":[{\"allow_nan\":true,\"check_circular\":true,\"encoding\":\"utf-8\",\"ensure_ascii\":true,\"id\":\"json2\",\"indent\":null,\"separators\":[\",\",\":\"],\"skipkeys\":false,\"sort_keys\":true,\"strict\":true}],\"order\":\"C\",\"shape\":[4],\"zarr_format\":2}",
    "z/0": "[\"a\",\"b\",\"c\",\"d\",\"|O\",[4]]",
    "z/.zattrs": "{\"_ARRAY_DIMENSIONS\":[\"x\"]}"
  }
}
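
In this reference file, each entry under `refs` is either an inline string (metadata, or small data such as `z/0`) or a `[path, offset, length]` triple pointing at a byte range in another file. As a rough sanity check, the sketch below takes a trimmed copy of the entries above and confirms that each uncompressed chunk's length equals the number of elements in the chunk times the item size. The helper name `expected_nbytes` is hypothetical, and the item-size parsing assumes simple dtype strings such as `<f8`.

```python
import json
from math import prod

# Trimmed copy of the "refs" mapping from combined.json (numeric arrays only).
refs = {
    "foo/.zarray": '{"chunks":[4,5],"compressor":null,"dtype":"<f8","shape":[4,5]}',
    "foo/0.0": ["saved_on_disk.h5", 8192, 160],
    "x/.zarray": '{"chunks":[4],"compressor":null,"dtype":"<i8","shape":[4]}',
    "x/0": ["saved_on_disk.h5", 8352, 32],
    "y/.zarray": '{"chunks":[5],"compressor":null,"dtype":"<i8","shape":[5]}',
    "y/0": ["saved_on_disk.h5", 8384, 40],
}


def expected_nbytes(zarray_json):
    """Bytes in one uncompressed chunk; assumes dtype strings like '<f8'."""
    meta = json.loads(zarray_json)
    itemsize = int(meta["dtype"][2:])
    return prod(meta["chunks"]) * itemsize


for key, value in refs.items():
    if isinstance(value, list):
        # A list entry is a [path, offset, length] byte-range reference.
        path, offset, length = value
        array_name = key.split("/")[0]
        assert length == expected_nbytes(refs[f"{array_name}/.zarray"])
        print(f"{key}: {length} bytes at offset {offset} in {path}")
```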

Environment

INSTALLED VERSIONS

commit: None
python: 3.13.3 | packaged by conda-forge | (main, Apr 14 2025, 20:44:30) [Clang 18.1.8 ]
python-bits: 64
OS: Darwin
OS-release: 24.5.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.14.4
libnetcdf: 4.9.2

xarray: 2024.5.1.dev524+g0c254da2.d20250421
pandas: 2.2.3
numpy: 2.2.5
scipy: 1.15.2
netCDF4: 1.7.2
pydap: None
h5netcdf: 1.6.1
h5py: 3.12.1
zarr: 3.0.8
cftime: 1.6.4
nc_time_axis: None
iris: 3.12.0
bottleneck: 1.4.2
dask: 2025.3.0
distributed: 2025.3.0
matplotlib: 3.10.1
cartopy: 0.24.0
seaborn: 0.13.2
numbagg: None
fsspec: 2025.5.1
cupy: None
pint: None
sparse: 0.16.0
flox: None
numpy_groupies: None
setuptools: 78.1.1
pip: 25.0.1
conda: None
pytest: None
mypy: None
IPython: 9.1.0
sphinx: 8.2.3
