Merge branch 'main' into use_xsdba
Zeitsperre authored Feb 27, 2025
2 parents 24f6a44 + fca4054 commit 6696efb
Showing 11 changed files with 63 additions and 34 deletions.
2 changes: 1 addition & 1 deletion .cruft.json
@@ -11,7 +11,7 @@
"project_slug": "xscen",
"project_short_description": "A climate change scenario-building analysis framework, built with xclim/xarray.",
"pypi_username": "RondeauG",
- "version": "0.11.1-dev.1",
+ "version": "0.11.1-dev.4",
"use_pytest": "y",
"use_black": "y",
"use_conda": "y",
2 changes: 1 addition & 1 deletion .github/workflows/upstream.yml
@@ -34,7 +34,7 @@ jobs:
fail-fast: false
matrix:
python-version:
- - "3.10"
+ - "3.12"
defaults:
run:
shell: bash -l {0}
7 changes: 6 additions & 1 deletion CHANGELOG.rst
@@ -4,13 +4,18 @@ Changelog

v0.12.0 (unreleased)
--------------------
- Contributors to this version: Trevor James Smith (:user:`Zeitsperre`), Pascal Bourgault (:user:`aulemahal`).
+ Contributors to this version: Trevor James Smith (:user:`Zeitsperre`), Pascal Bourgault (:user:`aulemahal`), Juliette Lavoie (:user:`juliettelavoie`), Sarah Gammon (:user:`SarahG-579462`).

Breaking changes
^^^^^^^^^^^^^^^^
* `xscen` now uses `flit` as its build-engine and no longer uses `setuptools`, `setuptools-scm`, or `wheel`. (:pull:`519`).
* Update to support Python3.13 and `xclim` v0.55.0 (:pull:`532`).

+ New features and enhancements
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ * Include station-obs and forecasts in the derived schema for `build_path`. (:pull:`534`).
+ * Project catalog now allows `check_valid` and `drop_duplicates` keyword arguments. (:pull:`536`, :issue:`535`).

Bug fixes
^^^^^^^^^
* Docstrings and documentation have been updated to remove several small grammatical errors. (:pull:`527`).
3 changes: 1 addition & 2 deletions docs/notebooks/6_config.ipynb
@@ -278,8 +278,7 @@
"\n",
"# Create a dummy dataset\n",
"time = pd.date_range(\"1951-01-01\", \"2100-01-01\", freq=\"YS-JAN\")\n",
- "da = xr.DataArray([0] * len(time), coords={\"time\": time})\n",
- "da.name = \"test\"\n",
+ "da = xr.DataArray([0] * len(time), coords={\"time\": time}, name=\"test\")\n",
"ds = da.to_dataset()\n",
"\n",
"# Call climatological_op using no argument other than what's in CONFIG\n",
8 changes: 4 additions & 4 deletions environment-dev.yml
@@ -2,7 +2,7 @@ name: xscen-dev
channels:
- conda-forge
dependencies:
- - python >=3.10,<3.13
+ - python >=3.10,<3.14
- pip >=25.0
# Don't forget to sync changes between environment.yml, environment-dev.yml, and pyproject.toml!
# Also consider updating the list in xs.utils.show_versions if you add a new package.
@@ -11,7 +11,7 @@ dependencies:
- cftime
- cf_xarray >=0.7.6
- clisops >=0.15
- - dask >=2024.8.1,<2024.11 # FIXME: https://github.com/Ouranosinc/xclim/issues/1992
+ - dask >=2024.8.1,<2024.12 # FIXME: Remove upper pin when https://github.com/pangeo-data/rechunker/pull/156 is merged
- flox !=0.9.14 # FIXME: 0.9.14 is a broken version. This pin could be removed eventually.
- fsspec
- geopandas
@@ -31,10 +31,10 @@ dependencies:
- shapely >=2.0
- sparse
- toolz
- - xarray >=2023.11.0, !=2024.6.0, <2024.10.0 # FIXME: 2024.10.0 breaks rechunker with zarr, https://github.com/pangeo-data/rechunker/issues/154
+ - xarray >=2023.11.0, !=2024.6.0
- xclim >=0.55, <0.56
- xesmf >=0.7, !=0.8.8
- - zarr >=2.13, <3.0 # FIXME: xarray is compatible with zarr 3.0 from 2025.01.1, but we pin xarray below that version
+ - zarr >=2.13, <3
# Opt
- nc-time-axis >=1.3.1
- pyarrow >=10.0.1
8 changes: 4 additions & 4 deletions environment.yml
@@ -2,7 +2,7 @@ name: xscen
channels:
- conda-forge
dependencies:
- - python >=3.10,<3.13
+ - python >=3.10,<3.14
- pip >=25.0
# Don't forget to sync changes between environment.yml, environment-dev.yml, and pyproject.toml!
# Also consider updating the list in xs.utils.show_versions if you add a new package.
@@ -11,7 +11,7 @@ dependencies:
- cftime
- cf_xarray >=0.7.6
- clisops >=0.15
- - dask >=2024.8.1,<2024.11 # FIXME: https://github.com/Ouranosinc/xclim/issues/1992
+ - dask >=2024.8.1,<2024.12 # FIXME: Remove upper pin when https://github.com/pangeo-data/rechunker/pull/156 is merged
- flox !=0.9.14 # FIXME: 0.9.14 is a broken version. This pin could be removed eventually.
- fsspec
- geopandas
@@ -31,10 +31,10 @@ dependencies:
- shapely >=2.0
- sparse
- toolz
- - xarray >=2023.11.0, !=2024.6.0, <2024.10.0 # FIXME: 2024.10.0 breaks rechunker with zarr
+ - xarray >=2023.11.0, !=2024.6.0
- xclim >=0.55, <0.56
- xesmf >=0.7, !=0.8.8
- - zarr >=2.13, <3.0 # FIXME: xarray is compatible with zarr 3.0 from 2025.01.1, but we pin xarray below that version
+ - zarr >=2.13, <3
# To install from source
- flit >=3.10.1,<4.0
# Opt
10 changes: 5 additions & 5 deletions pyproject.toml
@@ -39,7 +39,7 @@ dependencies = [
"cftime",
"cf_xarray >=0.7.6",
"clisops >=0.15",
- "dask >=2024.8.1,<2024.11", # FIXME: https://github.com/Ouranosinc/xclim/issues/1992
+ "dask >=2024.8.1,<2024.12.0", # FIXME: Remove upper pin when https://github.com/pangeo-data/rechunker/pull/156 is merged
"flox !=0.9.14", # FIXME: 0.9.14 is a broken version. This pin could be removed eventually.
"fsspec",
"geopandas",
@@ -61,10 +61,10 @@ dependencies = [
"shapely >=2.0",
"sparse",
"toolz",
- "xarray >=2023.11.0, !=2024.6.0, <2024.10.0", # FIXME: 2024.10.0 breaks rechunker with zarr
+ "xarray >=2023.11.0, !=2024.6.0",
"xsdba@git+https://[email protected]/Ouranosinc/xsdba.git@naming_conventions",
"xclim >=0.55, <0.56",
- "zarr >=2.13, <3.0" # FIXME: xarray is compatible with zarr 3.0 from 2025.01.1, but we pin xarray below that version"
+ "zarr >=2.13,<3"
]

[project.optional-dependencies]
@@ -111,7 +111,7 @@ docs = [
"sphinxcontrib-napoleon"
]
extra = [
- "xesmf >=0.7, !=0.8.8"
+ "xesmf >=0.7, !=0.8.8" # FIXME: 0.8.8 currently creates segfaults on ReadTheDocs.
]
all = ["xscen[dev]", "xscen[docs]", "xscen[extra]"]

@@ -133,7 +133,7 @@ target-version = [
]

[tool.bumpversion]
- current_version = "0.11.1-dev.1"
+ current_version = "0.11.1-dev.4"
commit = true
commit_args = "--no-verify"
tag = false
2 changes: 1 addition & 1 deletion src/xscen/__init__.py
@@ -71,7 +71,7 @@

__author__ = """Gabriel Rondeau-Genesse"""
__email__ = "[email protected]"
- __version__ = "0.11.1-dev.1"
+ __version__ = "0.11.1-dev.4"


# FIXME: file and line are unused
46 changes: 33 additions & 13 deletions src/xscen/catalog.py
@@ -719,9 +719,11 @@ def __init__(
create: bool = False,
overwrite: bool = False,
project: dict | None = None,
+ check_valid: bool = True,
+ drop_duplicates: bool = True,
**kwargs,
):
- """
+ r"""
Open or create a project catalog.
Parameters
@@ -735,6 +737,13 @@ def __init__(
If this and 'create' are True, this will overwrite any existing JSON and CSV file with an empty catalog.
project : dict, optional
Metadata to create the catalog, if required.
+ check_valid : bool
+ If True (default), will check that all files in the catalog exist on disk and remove those that don't.
+ drop_duplicates : bool
+ If True (default), will drop duplicates in the catalog based on the 'id' and 'path' columns.
+ \**kwargs : dict
+ Any other arguments are passed to xscen.catalog.DataCatalog.
Notes
-----
@@ -746,9 +755,15 @@ def __init__(
if create:
if isinstance(df, str | Path) and (not Path(df).is_file() or overwrite):
self.create(df, project=project, overwrite=overwrite)
- super().__init__(df, *args, **kwargs)
- self.check_valid()
- self.drop_duplicates()
+ super().__init__(
+ df,
+ *args,
+ check_valid=check_valid,
+ drop_duplicates=drop_duplicates,
+ **kwargs,
+ )
+ self.check_valid_flag = check_valid
+ self.drop_duplicates_flag = drop_duplicates
self.meta_file = df if not isinstance(df, dict) else None

# TODO: Implement a way to easily destroy part of the catalog to "reset" some steps
@@ -772,7 +787,7 @@ def update(
Warnings
--------
If a file was deleted between the parsing of the catalog and this call,
- it will be removed from the csv when `check_valid` is called.
+ it will be removed from the csv if `check_valid` is called.
Parameters
----------
@@ -786,9 +801,10 @@ def update(
if isinstance(df, pd.Series):
df = pd.DataFrame(df).transpose()
self.esmcat._df = pd.concat([self.df, df])

- self.check_valid()
- self.drop_duplicates()
+ if self.check_valid_flag:
+ self.check_valid()
+ if self.drop_duplicates_flag:
+ self.drop_duplicates()

# make sure year really has 4 digits
if "date_start" in self.df:
@@ -834,8 +850,10 @@ def update(
}
)
disk_cat.esmcat._df = pd.concat([disk_cat.df, df_str])
- disk_cat.check_valid()
- disk_cat.drop_duplicates()
+ if self.check_valid_flag:
+ disk_cat.check_valid()
+ if self.drop_duplicates_flag:
+ disk_cat.drop_duplicates()
with fs.open(disk_cat.esmcat.catalog_file, "wb") as csv_outfile:
disk_cat.df.to_csv(csv_outfile, index=False, compression=None)

@@ -858,7 +876,7 @@ def update_from_ds(
Warnings
--------
If a file was deleted between the parsing of the catalog and this call,
- it will be removed from the csv when `check_valid` is called.
+ it will be removed from the csv if `check_valid` is called.
Parameters
----------
@@ -910,8 +928,10 @@ def refresh(self):
self.meta_file, read_csv_kwargs=self.read_csv_kwargs
)
initlen = len(self.esmcat.df)
- self.check_valid()
- self.drop_duplicates()
+ if self.check_valid_flag:
+ self.check_valid()
+ if self.drop_duplicates_flag:
+ self.drop_duplicates()
if len(self.df) != initlen:
self.update()
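
The pattern these catalog.py hunks introduce, storing the constructor's `check_valid` and `drop_duplicates` choices and consulting them on every later update, can be sketched in isolation. The `MiniCatalog` class below is a hypothetical, standard-library stand-in (plain dict rows instead of a DataFrame). Keeping the first duplicate is an assumption made for the sketch, and none of this is xscen's actual implementation.

```python
from pathlib import Path


class MiniCatalog:
    """Hypothetical stand-in for a project catalog (not xscen's class)."""

    def __init__(self, rows=None, check_valid=True, drop_duplicates=True):
        self.rows = []
        # Remember the constructor's choices so later updates honour them.
        self.check_valid_flag = check_valid
        self.drop_duplicates_flag = drop_duplicates
        self.update(rows or [])

    def check_valid(self):
        """Remove rows whose 'path' does not exist on disk."""
        self.rows = [row for row in self.rows if Path(row["path"]).exists()]

    def drop_duplicates(self):
        """Keep the first row for each ('id', 'path') pair (assumed policy)."""
        seen = set()
        unique = []
        for row in self.rows:
            key = (row["id"], row["path"])
            if key not in seen:
                seen.add(key)
                unique.append(row)
        self.rows = unique

    def update(self, new_rows):
        """Append rows, then clean up only if the stored flags allow it."""
        self.rows.extend(new_rows)
        if self.check_valid_flag:
            self.check_valid()
        if self.drop_duplicates_flag:
            self.drop_duplicates()


# "." always exists; the second path is assumed to be missing on this machine.
rows = [
    {"id": "a", "path": "."},
    {"id": "a", "path": "."},
    {"id": "b", "path": "/no/such/dir/file.nc"},
]
strict = MiniCatalog(rows)
lenient = MiniCatalog(rows, check_valid=False, drop_duplicates=False)
print(len(strict.rows), len(lenient.rows))
```

With both flags disabled, the lenient catalog keeps every row it is given, which is the behaviour the new keyword arguments make possible.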

4 changes: 2 additions & 2 deletions src/xscen/data/file_schema.yml
@@ -192,10 +192,10 @@ derived-sims-raw:
- xrfreq
- variable
filename: [ variable, xrfreq, bias_adjust_project, version, mip_era, activity, institution, source, driving_model, experiment, member, domain, processing_level, DATES ]
- derived-reconstruction:
+ derived-non-sims:
with:
- facet: type
- value: reconstruction
+ value: [ station-obs, reconstruction, forecast ]
folders:
- type
- institution
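
The renamed `derived-non-sims` entry now accepts several values for the `type` facet instead of a single one. A minimal sketch of how such a scalar-or-list condition could be evaluated follows. `facet_matches` is a hypothetical helper written for illustration; xscen's actual matching logic is not shown in this diff.

```python
def facet_matches(facets: dict, condition: dict) -> bool:
    """Return True if the facet named in `condition` has an accepted value.

    `condition["value"]` may be a single value or a list of values, mirroring
    the two forms that appear in the schema before and after this change.
    """
    expected = condition["value"]
    # Normalise a scalar to a one-element list so both forms share one check.
    allowed = expected if isinstance(expected, (list, tuple, set)) else [expected]
    return facets.get(condition["facet"]) in allowed


# Condition as written in the new schema entry.
non_sims = {"facet": "type", "value": ["station-obs", "reconstruction", "forecast"]}
# Condition as written in the old schema entry.
old_rule = {"facet": "type", "value": "reconstruction"}

print(facet_matches({"type": "forecast"}, non_sims))
print(facet_matches({"type": "forecast"}, old_rule))
```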
5 changes: 5 additions & 0 deletions src/xscen/io.py
@@ -1052,6 +1052,11 @@ def rechunk(
ds = xr.open_dataset(path_in)
else:
ds = path_in
+ # Remove all input chunks information, avoids an error with rechunker and xarray >= 2024.10
+ # TODO: Remove this and pin rechunker when https://github.com/pangeo-data/rechunker/pull/156 is merged and released
+ for var in ds.variables.values():
+ var.encoding.pop("chunks", None)

variables = list(ds.data_vars)
if chunks_over_var:
chunks = chunks_over_var
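
The workaround added to `rechunk` can be exercised on its own. The sketch below wraps the same loop in a hypothetical helper, `drop_chunk_encoding`, and runs it on `SimpleNamespace` stand-ins so that it needs only the standard library. It assumes nothing about the input beyond a `.variables` mapping whose values carry an `.encoding` dict, which is the only part of the dataset interface the diff's loop touches.

```python
from types import SimpleNamespace


def drop_chunk_encoding(ds):
    """Remove the stored 'chunks' entry from every variable's encoding.

    Works on any object exposing `.variables` (a mapping) whose values have
    an `.encoding` dict. Other encoding keys are left untouched.
    """
    for var in ds.variables.values():
        # pop with a default never raises when the key is absent.
        var.encoding.pop("chunks", None)
    return ds


# Stand-in dataset (hypothetical): one variable with chunk info, one without.
fake_ds = SimpleNamespace(
    variables={
        "tas": SimpleNamespace(encoding={"chunks": (10, 5), "dtype": "float32"}),
        "time": SimpleNamespace(encoding={}),
    }
)
drop_chunk_encoding(fake_ds)
print(fake_ds.variables["tas"].encoding)
```

Using `pop("chunks", None)` rather than `del` keeps the loop safe for variables that never had chunk information recorded.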