Description
Is your feature request related to a problem?
I want to write a `DataTree` as a netCDF file to a remote filesystem, either directly or by writing to in-memory bytes first. Right now this seems to be impossible:
- `datatree.to_netcdf` requires a `filepath` argument (there is no option for it to return bytes like `dataset.to_netcdf()`).
- Passing a file-like object (e.g. `io.BytesIO()`) as the `filepath` results in `ValueError: cannot save to a group with the scipy.io.netcdf backend`. I believe it selects the scipy backend because it thinks (incorrectly) that it's the only one capable of writing to a file-like object, but this backend then can't handle the groups that are needed for DataTree.
- `datatree.to_netcdf(bytes_io, engine='h5netcdf')` doesn't help (same error; the specified engine is ignored by `_dataset_to_netcdf` and scipy is selected instead).
- If I hack around the above (`xr.backends.api.WRITEABLE_STORES['scipy'] = xr.backends.api.WRITEABLE_STORES['h5netcdf']`), the next obstacle is that `H5NetCDFStore.open` tries to read a magic number from the file-like object even when opening it in write mode: https://github.com/pydata/xarray/blob/main/xarray/backends/h5netcdf_.py#L166C9-L167C65
- After hacking around that (`xarray.backends.h5netcdf_.read_magic_number_from_file = lambda *a,**kw: b"\211HDF\r\n\032\n"`), I'm finally able to get `datatree.to_netcdf(bytes_io)` to work.
- Even then it still won't work with a file-like object that doesn't implement `seek`, but I believe this is a fundamental limitation of `h5py`.
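For the `seek` limitation in the last point, one possible way around it is to buffer the output in memory and copy it to the non-seekable target afterwards. The sketch below uses only the standard library; `seekable_proxy` is a hypothetical helper name, not part of xarray or h5py, and it assumes that writing a DataTree to a seekable file-like object works in the first place (which, per the points above, currently needs hacks). The whole file is held in memory, so it only suits data that fits in RAM.

```python
import contextlib
import io


@contextlib.contextmanager
def seekable_proxy(target):
    """Yield a seekable binary file object whose contents end up in `target`.

    If `target` already reports itself as seekable, it is yielded unchanged.
    Otherwise an in-memory buffer is yielded, and its contents are copied to
    `target` once the `with` body finishes without raising.
    """
    seekable = getattr(target, "seekable", None)
    if callable(seekable) and seekable():
        yield target
        return

    buffer = io.BytesIO()
    yield buffer
    # Only reached if the body completed normally; on an exception the
    # partially written buffer is discarded.
    target.write(buffer.getvalue())


# Hypothetical usage, assuming to_netcdf accepted file-like objects:
# with seekable_proxy(non_seekable_stream) as f:
#     datatree.to_netcdf(f)
```

Copying only on a clean exit avoids sending a truncated file to the remote target if the write fails partway through.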
Describe the solution you'd like
datatree.to_netcdf() # Returns bytes like dataset.to_netcdf()
or
bytes_io = io.BytesIO()
datatree.to_netcdf(bytes_io)
or
with fsspec.open('remote-filesystem://path/to/file.nc', 'wb') as file_like_object:
    datatree.to_netcdf(file_like_object)
(although the latter may not be possible for file-like objects that don't implement `seek`.)
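Until something like the above is supported, a stopgap that relies only on the existing path-based API is to write to a temporary local file and stream its bytes to the destination. This is a minimal sketch, not a proposed implementation: `write_via_tempfile` is a hypothetical helper, and it assumes that `datatree.to_netcdf(path)` accepts a local path as its first argument and that there is enough local disk space for the file.

```python
import os
import shutil
import tempfile


def write_via_tempfile(write_to_path, dest, filename="tree.nc"):
    """Write a file locally, then copy its bytes into `dest`.

    `write_to_path` is any callable that takes a filesystem path and writes
    a file there (for example a bound `to_netcdf` method). `dest` is a
    writable binary file-like object; it does not need to support `seek`.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, filename)
        write_to_path(path)
        with open(path, "rb") as src:
            # Stream in chunks so the file is never fully loaded into memory.
            shutil.copyfileobj(src, dest)


# Hypothetical usage with the fsspec example above:
# with fsspec.open('remote-filesystem://path/to/file.nc', 'wb') as file_like_object:
#     write_via_tempfile(datatree.to_netcdf, file_like_object)
```

The cost is an extra round trip through local disk, which is the overhead a native bytes or file-like return path would avoid.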
Describe alternatives you've considered
No response
Additional context
No response