
Commit

fix spelling errors, configure codespell
Zeitsperre committed Feb 10, 2025
1 parent 7307add commit 5dc9b20
Showing 17 changed files with 80 additions and 70 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -71,7 +71,7 @@ lint: lint/flake8 lint/black ## check style
test: ## run tests quickly with the default Python
python -m pytest

test-distributed: ## run tests quickly with the default Python and distibuted workers
test-distributed: ## run tests quickly with the default Python and distributed workers
python -m pytest --num-processes=logical

test-notebooks: ## run tests on notebooks and compare outputs
2 changes: 1 addition & 1 deletion docs/installation.rst
@@ -2,7 +2,7 @@
Installation
============

We strongly recommend installing `xhydro` in an Anaconda Python environment. Futhermore, due to the complexity of some packages, the default dependency solver can take a long time to resolve the environment. If `mamba` is not already your default solver, consider running the following commands in order to speed up the process:
We strongly recommend installing `xhydro` in an Anaconda Python environment. Furthermore, due to the complexity of some packages, the default dependency solver can take a long time to resolve the environment. If `mamba` is not already your default solver, consider running the following commands in order to speed up the process:

.. code-block:: console
@@ -319,8 +319,8 @@ msgstr "Incertitudes de l'analyse fréquentielle locale"

#: ../../notebooks/regional_frequency_analysis.ipynb:485
msgid ""
"To add some uncertainities, we will work with only one catchment and two "
"distributions as uncertainities can be intensive in computation. We "
"To add some uncertainties, we will work with only one catchment and two "
"distributions as uncertainties can be intensive in computation. We "
"select the station 023401, and distribution 'genextreme' and 'pearson3'."
msgstr ""
"Pour ajouter l'incertitude, nous travaillerons avec un seul "
@@ -342,7 +342,7 @@ msgid "Bootstraping the observations"
msgstr "Rééchantillonnage des observations"

#: ../../notebooks/regional_frequency_analysis.ipynb:514
msgid "A way to get uncertainities is to bootstrap the observations 200 times."
msgid "A way to get uncertainties is to bootstrap the observations 200 times."
msgstr ""
"Une façon d’obtenir des incertitudes est de rééchantillonner les observations "
"200 fois."
@@ -365,10 +365,10 @@ msgstr "Incertitudes de l'analyse fréquentielle régionale"

#: ../../notebooks/regional_frequency_analysis.ipynb:642
msgid ""
"For the regional analysis, we again use ``boostrap_obs`` to resample the "
"For the regional analysis, we again use ``bootstrap_obs`` to resample the "
"observations, but, this time, it's much faster as no fit is involved."
msgstr ""
"Pour l'analyse régionale, nous utilisons à nouveau ``boostrap_obs`` pour "
"Pour l'analyse régionale, nous utilisons à nouveau ``bootstrap_obs`` pour "
"rééchantillonner les observations, mais, cette fois, c'est beaucoup plus "
"rapide, car aucun ajustement n'est impliqué."

4 changes: 2 additions & 2 deletions docs/notebooks/extreme_value_analysis.ipynb
@@ -167,7 +167,7 @@
"source": [
"### Parameter estimation for non-stationary model\n",
"\n",
"For this example the location paramerter vary as linear funcion of the year. To do this, a new dimension containing the year is created."
"For this example the location parameter varies as a linear function of the year. To do this, a new dimension containing the year is created."
]
},
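The cell above only describes the covariate in prose, so here is a minimal, hedged sketch of one way to attach a year covariate to annual-maximum data with xarray. The synthetic Gumbel data, the variable names, and the choice of `assign_coords` are assumptions for illustration; the notebook's actual mechanism is not visible in this diff.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic annual maxima; only meant to show the shape of the operation.
time = pd.date_range("1950-01-01", periods=70, freq="YS")
rng = np.random.default_rng(1)
da = xr.DataArray(
    rng.gumbel(loc=100.0, scale=20.0, size=time.size),
    coords={"time": time},
    dims="time",
    name="streamflow_max_annual",
)

# Attach the year as a covariate so the location parameter can be modelled
# as a linear function of it in the non-stationary fit.
da = da.assign_coords(year=("time", da["time"].dt.year.data))
```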
{
@@ -217,7 +217,7 @@
"source": [
"### Return level estimation for non-stationary model\n",
"\n",
"100-year return level with the location paramerter vary as linear funcion of the year."
"100-year return level with the location parameter varies as a linear function of the year."
]
},
{
10 changes: 5 additions & 5 deletions docs/notebooks/gis.ipynb
@@ -889,7 +889,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also extract the surface properties for the same `gpd.GeoDataFrame` : "
"We can also extract the surface properties for the same `gpd.GeoDataFrame` :"
]
},
{
@@ -1019,7 +1019,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, for convenience, we can output the results in `xarray.Dataset` format : "
"Again, for convenience, we can output the results in `xarray.Dataset` format :"
]
},
{
@@ -1464,7 +1464,7 @@
"metadata": {},
"source": [
"### b) Land-use classification\n",
"Land use classification is powered by the Planetary Computer's STAC catalog. It uses the `10m Annual Land Use Land Cover` dataset by default (\"io-lulc-annual-v02\"), but other collections can be specified by using the collection argument. "
"Land use classification is powered by the Planetary Computer's STAC catalog. It uses the `10m Annual Land Use Land Cover` dataset by default (\"io-lulc-annual-v02\"), but other collections can be specified by using the collection argument."
]
},
{
@@ -1977,7 +1977,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the next few steps use [xclim](https://xclim.readthedocs.io/en/stable/index.html) under the hood, the dataset is required to be [CF-compliant](http://cfconventions.org/cf-conventions/cf-conventions.html). At a minimum, the `xarray.DataArray` used must follow these principles:\n",
"Because the next few steps use [xclim](https://xclim.readthedocs.io/en/stable/index.html) under the hood, the dataset is required to be [CF-compliant](https://cfconventions.org/cf-conventions/cf-conventions.html). At a minimum, the `xarray.DataArray` used must follow these principles:\n",
"\n",
"- The dataset needs a time dimension, usually at a daily frequency with no missing timesteps (NaNs are supported). If your data differs from that, you'll need to be extra careful on the results provided.\n",
"- If there is a spatial dimension, such as \"``Station``\" in the example below, it needs an attribute ``cf_role`` with ``timeseries_id`` as its value.\n",
@@ -3965,7 +3965,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The same data can also be visualized as a `pd.DataFrame` as well : "
"The same data can also be visualized as a `pd.DataFrame` as well :"
]
},
{
2 changes: 1 addition & 1 deletion docs/notebooks/hydrological_modelling.ipynb
@@ -91,7 +91,7 @@
"Hydrological models can differ from one another in terms of required inputs and available functions, but an effort will be made to homogenize them as much as possible as new models get added. Currently, all models have these 3 functions:\n",
"- `.run()` which will execute the model, reformat the outputs to be compatible with analysis tools in `xhydro`, then return the simulated streamflows as a `xarray.Dataset`.\n",
" - The streamflow will be called `streamflow` and have units in `m3 s-1`.\n",
" - In the case of 1D data (such as hydrometric stations), that dimension in the dataset will be identified trough a `cf_role: timeseries_id` attribute.\n",
" - In the case of 1D data (such as hydrometric stations), that dimension in the dataset will be identified through a `cf_role: timeseries_id` attribute.\n",
"- `.get_inputs()` to retrieve the meteorological inputs.\n",
"- `.get_streamflow()` to retrieve the simulated streamflow.\n",
"\n",
8 changes: 4 additions & 4 deletions docs/notebooks/optimal_interpolation.ipynb
@@ -13,7 +13,7 @@
"source": [
"Optimal interpolation is a tool that allows combining a spatially distributed field (i.e. the \"background field\") with point observations in such a way that the entire field can be adjusted according to deviations between the observations and the field at the point of observations. For example, it can be used to combine a field of reanalysis precipitation (e.g. ERA5) with observation records, and thus adjust the reanalysis precipitation over the entire domain in a statistically optimal manner.\n",
"\n",
"This page demonstrates how to use `xhydro` to perform optimal interpolation using field-like simulations and point observations for hydrological modelling. In this case, the background field is a set of outputs from a distributed hydrological model and the observations correspond to real hydrometric stations. The aim is to correct the background field (i.e. the distributed hydrological simulations) using optimal interpolation, as in Lachance-Cloutier et al (2017).\n",
"This page demonstrates how to use `xhydro` to perform optimal interpolation using field-like simulations and point observations for hydrological modelling. In this case, the background field is a set of outputs from a distributed hydrological model and the observations correspond to real hydrometric stations. The aim is to correct the background field (i.e. the distributed hydrological simulations) using optimal interpolation, as in Lachance-Cloutier et al. (2017).\n",
"\n",
"*Lachance-Cloutier, S., Turcotte, R. and Cyr, J.F., 2017. Combining streamflow observations and hydrologic simulations for the retrospective estimation of daily streamflow for ungauged rivers in southern Quebec (Canada). Journal of hydrology, 550, pp.294-306.*"
]
@@ -99,7 +99,7 @@
"\n",
"We now have the basic data required to start processing using optimal interpolation. However, before doing so, we must provide some hyperparameters. Some are more complex than others, so let's break down the main steps.\n",
"\n",
"The first is the need to compute differences (also referred to as \"departures\" between observations and simulations where they both occur simultaneously. We also need to scale the data by the catchment area to ensure errors are relative and can then be interpolated. We then take the logarithm of these values to ensure extrapolation does not cause negative streamflow. We will reverse the transformation later."
"The first is the need to compute differences (also referred to as \"departures\") between observations and simulations where they both occur simultaneously. We also need to scale the data by the catchment area to ensure errors are relative and can then be interpolated. We then take the logarithm of these values to ensure extrapolation does not cause negative streamflow. We will reverse the transformation later."
]
},
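A tiny numerical sketch of the transformation described above, using made-up numbers rather than the notebook's datasets; it only illustrates the scale-by-area / log / exponentiate-back round trip, not the optimal interpolation itself.

```python
import numpy as np

# Invented values: 2 days x 2 stations, drainage areas in km2.
area = np.array([250.0, 1200.0])
qobs = np.array([[12.0, 150.0], [9.0, 130.0]])   # observed streamflow (m3/s)
qsim = np.array([[10.0, 170.0], [11.0, 120.0]])  # simulated streamflow at the same points

# Scale by drainage area so the errors are relative, then take the logarithm so
# that interpolated/extrapolated corrections can never yield negative streamflow.
departures = np.log(qobs / area) - np.log(qsim / area)

# After the departures have been interpolated over the domain, the transformation
# is reversed with an exponential; at the observation points this recovers qobs.
qsim_corrected = qsim * np.exp(departures)
print(np.allclose(qsim_corrected, qobs))  # True
```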
{
@@ -429,7 +429,7 @@
"\n",
"Notice that there are again 274 stations, like in the \"qobs\" dataset. This is because this specific dataset was used to perform leave-one-out cross validation to assess the optimal interpolation performance, and as such, only simulations at gauged sites is of interest. In an operational setting, there is no limit on the number of stations for \"qsim\".\n",
"\n",
"Now let's take a look at the correspondance tables and the observed station dataset."
"Now let's take a look at the correspondence tables and the observed station dataset."
]
},
{
@@ -521,7 +521,7 @@
"# If we do a leave-one-out cross-validation over the 96 catchments, the entire optimal interpolation process is repeated 96 times but\n",
"# only over the observation sites, each time leaving one station out and kept independent for validation. This is time-consuming and\n",
"# can be parallelized by adjusting this flag and setting an appropriate number of CPU cores according to your computer. By default,\n",
"# the code will only use 1 core. However, if increased, the maximum number tht will be actually used is ([number-of-available-cores / 2] - 1)\n",
"# the code will only use 1 core. However, if increased, the maximum number that will be actually used is ([number-of-available-cores / 2] - 1)\n",
"# CPU cores as to not overexert the computer.\n",
"parallelize = False\n",
"max_cores = 1\n",
42 changes: 21 additions & 21 deletions docs/notebooks/regional_frequency_analysis.ipynb
@@ -178,7 +178,7 @@
"metadata": {},
"source": [
"### b) Clustering\n",
"In this example we'll use `AgglomerativeClustering`, but other methods would also provide valid results. The regional clustering itself is performed using xhfa.regional.get_group_from_fit, which can take the arguments of the skleanr functions as a dictionnary."
"In this example we'll use `AgglomerativeClustering`, but other methods would also provide valid results. The regional clustering itself is performed using xhfa.regional.get_group_from_fit, which can take the arguments of the skleanr functions as a dictionary."
]
},
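For readers unfamiliar with the underlying call, here is a hedged sketch of the scikit-learn step that the wrapper forwards its dictionary of arguments to. The feature matrix is random and `n_clusters=3` is an arbitrary choice; the exact signature of `xhfa.regional.get_group_from_fit` is not reproduced here.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Toy catchment descriptors (30 catchments x 4 explanatory variables).
rng = np.random.default_rng(42)
features = rng.normal(size=(30, 4))

# The keyword arguments passed as a dictionary to get_group_from_fit end up
# as constructor arguments of the chosen scikit-learn estimator.
labels = AgglomerativeClustering(n_clusters=3).fit_predict(features)
print(np.bincount(labels))  # catchments per homogeneous region
```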
{
@@ -261,15 +261,15 @@
"- **Interpretation**:\n",
"\n",
" - **Low Z-Score**: A good fit of the model to the observed data. Typically, an absolute value of the Z-Score below 1.64 suggests that the model is appropriate and the fit is statistically acceptable.\n",
" \n",
"\n",
" - **High Z-Score**: Indicates significant discrepancies between the observed and expected values. An absolute value above 1.64 suggests that the model may not fit the data well, and adjustments might be necessary.\n"
]
},
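The acceptance rule stated above boils down to a single comparison; the helper below is just a restatement of that threshold (1.64 is roughly the 5% critical value of the standard normal), not a function from xhydro.

```python
def z_fit_is_acceptable(z_score: float, threshold: float = 1.64) -> bool:
    """Return True when |Z-Score| indicates a statistically acceptable fit."""
    return abs(z_score) < threshold
```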
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To calculate H and Z, we also need a `KappaGen` object from the lmoment3 library. This library is not part of the default xhydro environment and will need to be installed seperately."
"aTo calculate H and Z, we also need a `KappaGen` object from the lmoments3 library. This library is not part of the default xhydro environment and will need to be installed separately."
]
},
{
@@ -392,8 +392,8 @@
"source": [
"# Uncertainties\n",
"## Local frequency analysis uncertainties\n",
"To add some uncertainities, we will work with only one catchment and two distributions, as uncertainities can be intensive in computation.\n",
"We select the station 023401, and distribution 'genextreme' and 'pearson3'. \n",
"To add some uncertainties, we will work with only one catchment and two distributions, as uncertainties can be intensive in computation.\n",
"We select the station 023401, and distribution 'genextreme' and 'pearson3'.\n",
"\n",
"For the local frequency analysis, we need to fit the distribution so the calulting time can be long."
]
@@ -414,8 +414,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Bootstraping the observations\n",
"A way to get uncertainities is to bootstrap the observations. For this example, we will boostrap the observations a low amount of times, although a higher number (e.g. 5000) would be preferable in practice."
"### Bootstrapping the observations\n",
"A way to get uncertainties is to bootstrap the observations. For this example, we will bootstrap the observations a low amount of times, although a higher number (e.g. 5000) would be preferable in practice."
]
},
{
@@ -424,7 +424,7 @@
"metadata": {},
"outputs": [],
"source": [
"ds_4fa_iter = xhfa.uncertainities.boostrap_obs(ds_4fa_one_station, 35)\n",
"ds_4fa_iter = xhfa.uncertainties.bootstrap_obs(ds_4fa_one_station, 35)\n",
"params_boot_obs = xhfa.local.fit(ds_4fa_iter, distributions=[\"genextreme\", \"pearson3\"])"
]
},
@@ -454,10 +454,10 @@
"metadata": {},
"outputs": [],
"source": [
"values = xhfa.uncertainities.boostrap_dist(\n",
"values = xhfa.uncertainties.bootstrap_dist(\n",
" ds_4fa_one_station, params_loc_one_station, 35\n",
")\n",
"params_boot_dist = xhfa.uncertainities.fit_boot_dist(values)"
"params_boot_dist = xhfa.uncertainties.fit_boot_dist(values)"
]
},
{
@@ -523,9 +523,9 @@
"metadata": {},
"source": [
"## Regional frequency analysis uncertainties\n",
"### Bootstraping the observations\n",
"### Bootstrapping the observations\n",
"\n",
"For the regional analysis, we again use `boostrap_obs` to resample the observations, but, this time, it's much faster as no fit is involved."
"For the regional analysis, we again use `bootstrap_obs` to resample the observations, but, this time, it's much faster as no fit is involved."
]
},
{
@@ -534,8 +534,8 @@
"metadata": {},
"outputs": [],
"source": [
"ds_reg_samples = xhfa.uncertainities.boostrap_obs(ds_4fa, 35)\n",
"ds_moments_iter = xhfa.uncertainities.calc_moments_iter(ds_reg_samples).load()"
"ds_reg_samples = xhfa.uncertainties.bootstrap_obs(ds_4fa, 35)\n",
"ds_moments_iter = xhfa.uncertainties.calc_moments_iter(ds_reg_samples).load()"
]
},
{
@@ -544,7 +544,7 @@
"metadata": {},
"outputs": [],
"source": [
"Q_reg_boot = xhfa.uncertainities.calc_q_iter(\n",
"Q_reg_boot = xhfa.uncertainties.calc_q_iter(\n",
" \"023401\", \"streamflow_max_annual\", ds_groups_H1, ds_moments_iter, return_periods\n",
")"
]
@@ -562,7 +562,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Since we'll do a few plots to illustrate the results, let's make a function to somplify things a litle."
"Since we'll do a few plots to illustrate the results, let's make a function to simplify things a little."
]
},
{
@@ -620,7 +620,7 @@
"metadata": {},
"source": [
"### Multiple regions\n",
"Another way to get the uncertainty is to have many regions for one catchement of interest. We can achive this by trying different clustering methods. Or by performing a jackknife on the station list. We dont do too many tests here since it can take quite a while to run and the goal is just to illustrate the possibilities"
"Another way to get the uncertainty is to have many regions for one catchment of interest. We can achieve this by trying different clustering methods. Or by performing a jackknife on the station list. It can take quite a while to run, so we show here a simplified example; The goal is just to illustrate the possibilities."
]
},
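A hedged sketch of the jackknife-style resampling mentioned above: build every station subset obtained by dropping at most n stations. Only station 023401 comes from the notebook; the other IDs are invented, and this illustrates the combinatorics rather than xhydro's `generate_combinations` implementation.

```python
from itertools import combinations

stations = ["023401", "023402", "023403", "023422"]  # only 023401 is from the notebook
n = 2  # drop at most 2 stations

subsets = [
    list(kept)
    for k in range(len(stations) - n, len(stations) + 1)
    for kept in combinations(stations, k)
]
print(len(subsets))  # 1 (drop none) + 4 (drop one) + 6 (drop two) = 11
```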
{
@@ -647,7 +647,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We now generaste stations combination by removing 0-n stations. "
"We now generaste stations combination by removing 0-n stations."
]
},
{
@@ -657,7 +657,7 @@
"outputs": [],
"source": [
"n = 2\n",
"combinations_list = xhfa.uncertainities.generate_combinations(data, n)"
"combinations_list = xhfa.uncertainties.generate_combinations(data, n)"
]
},
{
@@ -692,7 +692,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The following steps are similar to the previous one, just with more regions. "
"The following steps are similar to the previous one, just with more regions."
]
},
{
@@ -781,7 +781,7 @@
"metadata": {},
"outputs": [],
"source": [
"Q_reg_boot = xhfa.uncertainities.calc_q_iter(\n",
"Q_reg_boot = xhfa.uncertainties.calc_q_iter(\n",
" \"023401\", \"streamflow_max_annual\", ds_groups_H1, ds_moments_iter, return_periods\n",
")\n",
"Q_reg_boot"
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -166,7 +166,7 @@ values = [
]

[tool.codespell]
ignore-words-list = "astroid,socio-economic"
ignore-words-list = "ans,astroid,nd,parametre,projet,socio-economic"
skip = "*.po"

[tool.coverage.paths]
source = ["src/xhydro/", "*/site-packages/xhydro/"]
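With the `[tool.codespell]` table added above, recent codespell releases (2.2+) pick the settings up automatically when run from the repository root. The sketch below simply invokes the CLI from Python and mirrors the two settings as explicit flags for older versions; how this project actually wires codespell into its Makefile or CI is not shown in this diff.

```python
import subprocess

# Equivalent explicit invocation; plain `codespell` suffices when the
# [tool.codespell] table in pyproject.toml is being read.
subprocess.run(
    [
        "codespell",
        "--ignore-words-list=ans,astroid,nd,parametre,projet,socio-economic",
        "--skip=*.po",
    ],
    check=False,
)
```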
12 changes: 6 additions & 6 deletions src/xhydro/extreme_value_analysis/julia_import.py
@@ -1,4 +1,4 @@
"""Load and install Julia dependancies into python environment."""
"""Load and install Julia dependencies into python environment."""

import contextlib
import io
@@ -80,14 +80,14 @@ def check_function_output(func, expected_output, *args, **kwargs) -> bool:
return expected_output in output


# It was not necessary to add a dependancy dictionary as we only need Extremes.jl, however this mechanism is more
# scalable in case we need to add many other julia dependancies in the future
# It was not necessary to add a dependency dictionary as we only need Extremes.jl, however this mechanism is more
# scalable in case we need to add many other julia dependencies in the future
deps = {
"Extremes": "fe3fe864-1b39-11e9-20b8-1f96fa57382d",
}
for dependancy, uuid in deps.items():
if not check_function_output(juliapkg.deps.status, dependancy):
juliapkg.add(dependancy, uuid)
for dependency, uuid in deps.items():
if not check_function_output(juliapkg.deps.status, dependency):
juliapkg.add(dependency, uuid)
juliapkg.resolve()
jl = cast(ModuleType, jl)
jl_version = (
2 changes: 1 addition & 1 deletion src/xhydro/extreme_value_analysis/structures/util.py
@@ -143,7 +143,7 @@ def return_level_cint(
nobsperblock_pareto: int | None = None,
) -> dict[str, list[float]]:
r"""
Return a list of retun level and confidence intervals for a given Julia fitted model.
Return a list of return levels and confidence intervals for a given Julia fitted model.
Parameters
----------