Releases: opendatacube/datacube-core
1.7 (16 May 2019)
Not a lot of changes since rc1.
- Early exit from `dc.load` on `KeyboardInterrupt`, allowing partial loads inside a notebook.
- Some bug fixes in geometry-related code
- Some cleanups in tests
- Pre-commit hooks configuration for easier testing
- Re-enable multi-threaded reads for the `s3aio` driver (set `use_threads=True` in `dc.load(..)`)
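A minimal sketch of opting back into threaded reads; the product name and spatial extents here are hypothetical placeholders:

```python
from datacube import Datacube

dc = Datacube()

# use_threads=True re-enables multi-threaded reads when the s3aio driver is in use
data = dc.load(product='ls8_nbar_albers',          # hypothetical product name
               x=(148.0, 148.2), y=(-35.2, -35.0),
               use_threads=True)
```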
1.7rc1 (18 April 2019)
Virtual Products
Add Virtual Products for multi-product loading.
(#522, #597, #601, #612, #644, #677, #699, #700)
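As a minimal sketch, a virtual product can be built from a YAML recipe and then loaded like a regular product; the product and measurement names below are hypothetical:

```python
from datacube import Datacube
from datacube.virtual import construct_from_yaml

# 'collate' stitches observations from several products along the time axis
rgb = construct_from_yaml("""
    collate:
      - product: ls7_nbar_albers        # hypothetical product names
        measurements: [red, green, blue]
      - product: ls8_nbar_albers
        measurements: [red, green, blue]
""")

dc = Datacube()
data = rgb.load(dc, x=(148.0, 148.2), y=(-35.2, -35.0), time='2018-01')
```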
Changes to Data Loading
The internal machinery used when loading and reprojecting data has been
completely rewritten. The new code has been tested, but this is a
complicated and fundamental part of the codebase, so there is potential for
breakage.
When loading reprojected data, the new code will produce slightly
different results. We don't believe that it is any less accurate than
the old code, but you cannot expect exactly the same numeric results.
Non-reprojected loads should be identical.
This change has been made for two reasons:
- Reprojection is now handled by core Data Cube and is no longer the responsibility of the IO driver.
- When loading lower-resolution data, Data Cube can now take advantage of available overviews.
- New futures-based IO driver interface (#686)
Other Changes
- Allow specifying different resampling methods for different data variables of the same Product. (#551) See the sketch after this list.
- Allow all resampling methods supported by rasterio. (#622)
- Bug fix (Index out of bounds causing ingestion failures)
- Support indexing data directly from HTTP/HTTPS/S3 URLs (#607)
- Renamed the command line tool `datacube metadata_type` to `datacube metadata` (#692)
- More useful output from the command line `datacube {product|metadata} {show|list}`
- Add optional `progress_cbk` to `dc.load(_data)` (#702), allowing the user to monitor data loading progress; see the sketch after this list.
- Thread-safe netCDF access within `dc.load` (#705)
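A combined minimal sketch of the per-band resampling and progress callback features above; the product and measurement names are hypothetical, and the callback is assumed to receive counts of completed and total work units:

```python
from datacube import Datacube

dc = Datacube()

def report_progress(n_done, n_total):
    # assumed signature: called repeatedly as loading proceeds
    print('loaded {}/{}'.format(n_done, n_total))

data = dc.load(product='ls8_nbar_albers',                     # hypothetical product name
               measurements=['red', 'nir'],
               output_crs='EPSG:3577', resolution=(-25, 25),  # resampling applies when reprojecting
               resampling={'red': 'cubic', '*': 'nearest'},   # per-band methods, '*' as fallback
               progress_cbk=report_progress)
```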
Performance Improvements
- Use single pass over datasets when computing bounds (#660)
- Bugfixes and improved performance of `dask`-backed arrays (#547, #664)
Documentation Improvements
- Improve API Reference documentation.
Deprecations
- From the command line, the old query syntax for searching within vague time ranges, e.g. `2018-03 < time < 2018-04`, has been removed. It was unclear exactly what that syntax should mean: whether to include or exclude the months specified. It is replaced by `time in [2018-01, 2018-02]`, which has the same semantics as `dc.load` time queries. (#709)
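For reference, a minimal sketch of the equivalent `dc.load` query; both endpoints are inclusive of the unspecified units, so this covers all of January and February 2018 (the product name is hypothetical):

```python
from datacube import Datacube

dc = Datacube()
data = dc.load(product='ls8_nbar_albers',    # hypothetical product name
               time=('2018-01', '2018-02'))  # inclusive of all of February
```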
1.6.2 (24 January 2019)
Patch release to build a new Docker container, to resolve an upstream security bug.
See #631 for more details.
1.6.1 (28 August 2018)
The real 1.6 release, not an accidental duplicate of the release candidate.
1.6.0 (23 August 2018)
- Enable use of aliases when specifying band names
- Fix ingestion failing after the first run #510
- Docker images now know which version of ODC they contain #523
- Fix data loading when `nodata` is `NaN` #531
- Allow querying based on Python `datetime.datetime` objects. #499
- Require rasterio 1.0.2 or higher, which fixes several critical bugs when loading and reprojecting from multi-band files.
- Assume fixed paths for `id` and `sources` metadata fields #482
- `datacube.model.Measurement` was put to use for loading in attributes and made to inherit from `dict` to preserve current behaviour. #502 See the sketch at the end of this list.
- Updates when indexing data with `datacube dataset add` (See #485, #451 and #480)
  - Allow indexing without lineage: `datacube dataset add --ignore-lineage`
  - Removed `--sources-policy=skip|verify|ensure`. Instead use `--[no-]auto-add-lineage` and `--[no-]verify-lineage`
  - New option `datacube dataset add --exclude-product <name>` allows excluding some products from auto-matching
- Preliminary API for indexing datasets #511
- Enable creation of MetadataTypes without having an active database connection #535
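A small sketch of the `Measurement` behaviour mentioned above, assuming the standard required fields; since it inherits from `dict`, existing dict-style access keeps working:

```python
from datacube.model import Measurement

m = Measurement(name='red', dtype='int16', nodata=-999, units='1')
assert isinstance(m, dict)     # now a dict subclass
assert m['name'] == 'red'      # dict-style access is preserved
```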
v1.6rc2 (29 June 2018)
Backwards Incompatible Changes
- The `helpers.write_geotiff()` function has been updated to support files smaller than 256x256. It also no longer supports specifying the time index. Before passing data in, use `xarray_data.isel(time=<my_time_index>)`. (#277) See the sketch after this list.
- Removed product matching options from `datacube dataset update` (#445). No matching is needed in this case, as all datasets are already in the database and are associated with products.
- Removed the `--match-rules` option from `datacube dataset add` (#447)
- The seldom-used `stack` keyword argument has been removed from `Datacube.load`. (#461)
- The behaviour of time range queries has changed to be compatible with standard Python searches (e.g. time slicing an xarray). The time range selection is now inclusive of any unspecified time units. (#440)
  - Example 1: `time=('2008-01', '2008-03')` previously would have returned all data from the start of 1st January, 2008 to the end of 1st March, 2008. Now, this query will return all data from the start of 1st January, 2008 to 23:59:59.999 on 31st March, 2008.
  - Example 2: To specify a search time between 1st January and 29th February, 2008 (inclusive), use a search query like `time=('2008-01', '2008-02')`. This query is equivalent to using any of the following in the second time element:
    - `('2008-02-29')`
    - `('2008-02-29 23')`
    - `('2008-02-29 23:59')`
    - `('2008-02-29 23:59:59')`
    - `('2008-02-29 23:59:59.999')`
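A minimal sketch of the new `write_geotiff()` usage, selecting the time slice up front; the product name and output path are placeholders:

```python
from datacube import Datacube
from datacube.helpers import write_geotiff

dc = Datacube()
xarray_data = dc.load(product='ls8_nbar_albers',  # hypothetical product name
                      x=(148.0, 148.1), y=(-35.1, -35.0))

# Select a single time slice before writing; write_geotiff() no longer
# accepts a time index itself.
write_geotiff('output.tif', xarray_data.isel(time=0))
```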
Changes
- A `--location-policy` option has been added to the `datacube dataset update` command. Previously this command would always add a new location to the list of URIs associated with a dataset. It is now possible to specify the `archive` and `forget` options, which will mark previous locations as archived or remove them from the index altogether. The default behaviour is unchanged. (#469)
- The masking-related function `describe_variable_flags()` now returns a pandas DataFrame by default. This will display as a table in Jupyter Notebooks. (#422) See the sketch after this list.
- Usability improvements in the `datacube dataset [add|update]` commands (#447, #448, #398)
  - Embedded documentation updates
  - Deprecated `--auto-match` (it was always on anyway)
  - Renamed `--dtype` to `--product` (the old name will still work, but with a warning)
  - Added an option to skip lineage data when indexing (useful for saving time when testing) (#473)
- Enable compression for metadata documents stored in NetCDFs generated by `stacker` and `ingestor` (#452)
- Implement better handling of stacked NetCDF files (#415)
  - Record the slice index as part of the dataset location URI, using `#part=<int>` syntax; the index is 0-based
  - Use this index when loading data instead of fuzzy searching by timestamp
  - Fall back to the old behaviour when `#part=<int>` is missing and the file is more than one time slice deep
- Expose the following dataset fields and make them searchable (see #432 for more details):
  - `indexed_time` (when the dataset was indexed)
  - `indexed_by` (user who indexed the dataset)
  - `creation_time` (creation of dataset: when it was processed)
  - `label` (the label for a dataset)
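A minimal sketch of the DataFrame-returning behaviour of `describe_variable_flags()`, assuming a pixel-quality product with a `pixelquality` band (both names hypothetical):

```python
from datacube import Datacube
from datacube.storage.masking import describe_variable_flags

dc = Datacube()
pq = dc.load(product='ls8_pq_albers',             # hypothetical product name
             x=(148.0, 148.1), y=(-35.1, -35.0))

# Returns a pandas DataFrame by default, which renders as a table in Jupyter
print(describe_variable_flags(pq.pixelquality))
```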
Bug Fixes
- The `.dimensions` property of a product no longer crashes when the product is missing a `grid_spec`. It instead defaults to `time,y,x`
- Fix a regression in `v1.6rc1` which made it impossible to run `datacube ingest` to create products which were defined in `1.5.5` and earlier versions of ODC. (#423, #436)
- Allow specifying the chunking for string variables when writing NetCDFs (#453)
v1.6rc1 Easter Bilby (10 April 2018)
This is the first release in a while, so there are a lot of changes, including
some significant refactoring, with the potential for issues when upgrading.
Backwards Incompatible Fixes
- Drop support for Python 2. Python 3.5 is now the earliest supported Python version.
- Removed the old `ndexpr`, `analytics` and `execution engine` code. There is work underway in the execution engine branch to replace these features.
Enhancements
- Support for third party drivers, for custom data storage and custom index implementations
- The correct way to get an Index connection in code is to use `datacube.index.index_connect()`. See the sketch after this list.
- Changes in ingestion configuration
  - You must now specify the Data Write Plug-ins to use. For s3 ingestion there was a top level `container` specified, which has been renamed and moved under `storage`. The entire `storage` section is passed through to the Data Write Plug-ins, so drivers requiring other configuration can include it here, e.g.:

    ```yaml
    ...
    storage:
      ...
      driver: s3aio
      bucket: my_s3_bucket
      ...
    ```

- Added a `Dockerfile` to enable automated builds for a reference Docker image.
- Multiple environments can now be specified in one datacube config. See PR 298 and the Runtime Config documentation.
  - Allow specifying which `index_driver` should be used for an environment.
- Command line tools can now output CSV or YAML. (Issue 206, PR 390)
- Support for saving data to NetCDF using a Lambert Conformal Conic Projection (PR 329)
- Lots of documentation updates:
  - Information about Bit Masking.
  - A description of how data is loaded.
  - Some higher level architecture documentation.
  - Updates on how to index new data.
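A minimal sketch of obtaining an Index connection directly; the application name is an arbitrary label:

```python
from datacube.index import index_connect

# Connects using the default environment from the datacube config
index = index_connect(application_name='my-analysis')

for product in index.products.get_all():
    print(product.name)
```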
Bug Fixes
- Allow creation of `datacube.utils.geometry.Geometry` objects from 3D representations. The Z axis is simply thrown away.
- The `datacube --config_file` option has been renamed to `datacube --config`, which is shorter and more consistent with the other options. The old name can still be used for now.
- Fix a severe performance regression when extracting and reprojecting a small region of data. (PR 393)
- Fix for a somewhat rare bug causing read failures by attempting to read data from a negative index into a file. (PR 376)
- Make `CRS` equality comparisons a little bit looser. Trust either a Proj.4 based comparison or a GDAL based comparison. (Closed issue 243)
New Data Support
1.5.5
- Fixes to package dependencies. No code changes.
1.5.4
- Minor features backported from 2.0:
  - Support for `limit` in searches
  - Alternative lazy search method `find_lazy`
- Fixes:
  - Improve native field descriptions
  - Connection should not be held open between multi-product searches
  - Disable prefetch for celery workers
  - Support jsonify-ing decimals
1.5.3
- Use `cloudpickle` as the `celery` serialiser
- Allow `celery` tests to run without installing it
- Move `datacube-worker` inside the main datacube package
- Write `metadata_type` from the ingest configuration if available
- Support config parsing limitations of Python 2
- Fix #303: resolve GDAL build dependencies on Travis
- Upgrade `rasterio` to a newer version