Append along time axis with write_nc #33
Comments
Hi. I wrote a small extension to nc.py; the netCDF time dimension is now unlimited as well. There is no check for whether the time already exists, so any number of problems can occur. Maybe you can use it as a starting point to incorporate the feature.
diff --git a/dimarray/io/nc.py b/dimarray/io/nc.py
index fcef873..5494480 100644
--- a/dimarray/io/nc.py
+++ b/dimarray/io/nc.py
@@ -560,7 +560,7 @@ def _write_dataset(f, obj, mode='w-', indices=None, axis=0, format=FORMAT, verbo
@format_doc(netCDF4=_doc_write_nc, indexing=_doc_indexing_write, write_modes=_doc_write_modes)
-def _write_variable(f, obj=None, name=None, mode='a+', format=FORMAT, indices=None, axis=0, verbose=False, share_grid_mapping=False, **kwargs):
+def _write_variable(f, obj=None, name=None, mode='a+', format=FORMAT, indices=None, axis=0, verbose=False, share_grid_mapping=False, append=False, **kwargs):
""" Write DimArray instance to file
Parameters
@@ -578,6 +578,8 @@ def _write_variable(f, obj=None, name=None, mode='a+', format=FORMAT, indices=No
separate variable in the dataset, accordingly to CF-conventions
in order to share that information across several variables.
Default is False.
+ append : bool, optional
+ if True, try to append along time axis (unlimited dimension)
See Also
--------
@@ -594,6 +596,7 @@ def _write_variable(f, obj=None, name=None, mode='a+', format=FORMAT, indices=No
if name not in f.variables:
assert isinstance(obj, DimArray), "expected a DimArray instance, got {}".format(type(obj))
v = _createVariable(f, name, obj.axes, dtype=obj.dtype, **kwargs)
+ append = False
else:
v = f.variables[name]
@@ -615,7 +618,20 @@ def _write_variable(f, obj=None, name=None, mode='a+', format=FORMAT, indices=No
raise IndexError(msg)
# Write Variable
- v[ix] = np.asarray(obj)
+ if append:
+ # Read dimensions
+ axes = read_dimensions(f, name)
+ # Check if time is in there
+ if 'time' in [ax.name for ax in axes]:
+ ix = len(axes['time'].values)
+ nx = ix+len(obj.axes['time'])
+ # append
+ v[ix:nx,::] = np.asarray(obj)
+ f.variables['time'][ix:nx] = obj.axes['time'].values
+
+ else:
+ # Normal
+ v[ix] = np.asarray(obj)
# add metadata if any
if not isinstance(obj, DimArray):
@@ -794,7 +810,11 @@ def _check_dimensions(f, axes, **verb):
for ax in axes:
dim = ax.name
if not dim in f.dimensions:
- f.createDimension(dim, ax.size)
+ # time dimension is unlimited > append
+ if dim == 'time':
+ f.createDimension(dim,size=None)
+ else:
+ f.createDimension(dim, ax.size)
# strings are given "object" type in Axis object
# ==> assume all objects are actually strings
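The patch above notes that nothing checks whether a time value already exists in the file. A minimal sketch of such a check, in plain Python and independent of dimarray and netCDF4, might look like the following. The helper name `append_slice` is hypothetical and not part of dimarray; it only computes the index range that the `v[ix:nx, ...]` assignment in the patch would use.

```python
def append_slice(existing_times, new_times):
    """Return the (start, stop) index range for appending new_times.

    Hypothetical helper: raises ValueError if there is nothing to append
    or if any new time value is already present on the existing axis.
    """
    existing = list(existing_times)
    new = list(new_times)
    if not new:
        raise ValueError("nothing to append")
    duplicates = sorted(set(existing) & set(new))
    if duplicates:
        raise ValueError("time values already present: {}".format(duplicates))
    start = len(existing)          # first free index along the time axis
    return start, start + len(new)


# Three time steps already on file, two new ones to append.
start, stop = append_slice([0, 1, 2], [3, 4])
print(start, stop)
```

Calling the helper before the write would turn a silent overwrite or duplicated time stamp into an explicit error.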
Hi MBlaschek, Sorry for the late answer, busy time. Good that you found a solution in your case. You are not the first person to raise this issue, and I agree this would be a nice feature, but dimarray should not prefer any particular dimension, unless absolutely needed (time could arguably be one of these special cases). A few ideas in that direction:
I believe this would only be a minor change to what you have already implemented. Even better would be one test case in tests/test_nc.py. You are welcome to give it a go; otherwise I will soon enough.
@MBlaschek, @vnoel Just wanted to let you know that I have changed the way write_nc/read_nc work quite a bit; they are now simply wrappers around:
Where DatasetOnDisk behaves similarly to a Dataset. It has an To create a new unlimited dimension, you can just do:
The Since the feature is still new, there might be a few bugs / incompatibilities with previous versions; please report if you notice something unexpected.
If most of your work consists of working with netCDF datasets, you might also check out this cool package, which I learned about recently: http://xray.readthedocs.org/. It is similar to dimarray in many respects. The major conceptual difference is that they base the basic data structure on netCDF: the basic object is a Dataset, and their DataArray (equivalent of DimArray) is actually a pointer (the variable name) to a Dataset. In dimarray, DimArray is the basic object, and a Dataset is more of a convenience for working with netCDF that also speeds up a number of methods when several arrays share some dimensions (e.g. reindex_axis, interp_axis). xray's underlying code looks very professional to me. The only drawbacks are that relying on the netCDF structure probably imposes a few more constraints than just assuming an array + axes, and that it relies on pandas. The rest is a matter of preference. Anyway, it is worth a look. I already adopted some of its API (e.g. an
Hi, great module!
I'm trying to write out a GeoArray in a loop, increasing the time coordinate at every step.
Dimensions are [1 x 3 x 360 x 720] = time x lev x lat x lon
I write this array to netCDF and then try to append to it
I had a look at the indexing, but I'm not sure whether I can grow the time dimension like this to write out the next time step.
Would be great if you could share some thoughts on how to do that.
Thanks