Library gardening #3

Merged
merged 3 commits into from
Jun 6, 2024
6 changes: 6 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,6 @@
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
Binary file added docs/source/_static/ArviZ.png
Binary file added docs/source/_static/ArviZ_white.png
68 changes: 0 additions & 68 deletions docs/source/_static/custom.css

This file was deleted.

191 changes: 183 additions & 8 deletions docs/source/api/index.md
@@ -35,6 +35,13 @@ More coming soon...


## Example datasets
The behaviour of the functions in this section is partially controlled by the
following environment variable:

:::{envvar} ARVIZ_DATA
If set, remote datasets are stored in the location it indicates after downloading.
Otherwise, datasets are stored at `~/arviz_data/`.
:::
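
As a rough illustration of the rule described above, the lookup could be sketched as follows. The helper name `resolve_data_home` is hypothetical and not part of `arviz_base`; it only mirrors the documented behaviour.

```python
import os


def resolve_data_home(environ=None):
    """Return the dataset directory implied by the documented rule (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # Use ARVIZ_DATA when it is set, otherwise fall back to ~/arviz_data
    location = environ.get("ARVIZ_DATA", os.path.join("~", "arviz_data"))
    return os.path.expanduser(location)


# A mapping is passed explicitly so the example does not depend on the real environment
print(resolve_data_home({"ARVIZ_DATA": "/tmp/my_arviz_data"}))
print(resolve_data_home({}))
```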

```{eval-rst}
.. autosummary::
@@ -46,24 +53,192 @@ More coming soon...
arviz_base.clear_data_home
```

## Configuration
## Conversion utilities

```{eval-rst}
.. autosummary::
:toctree: generated/

arviz_base.rc_context
arviz_base.convert_to_dataset
arviz_base.dict_to_dataset
arviz_base.generate_dims_coords
arviz_base.make_attrs
arviz_base.ndarray_to_dataarray
```

## Conversion utilities
## Configuration
Most ArviZ default values are controlled by {class}`arviz_base.rcParams`, a dictionary-like
class that stores key-value pairs, inspired by the one in matplotlib.
It is not a plain dictionary because all keys are fixed and each key has an associated
validation function that helps prevent setting nonsensical defaults.
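
The idea of a fixed-key mapping with per-key validation can be sketched in plain Python. This is a minimal illustration, not the `arviz_base` implementation: the class name `FixedKeyParams` and both validator functions are hypothetical, and the validation rules are assumptions made for the example.

```python
from collections.abc import MutableMapping


def _validate_probability(value):
    """Accept values strictly between 0 and 1 (assumed rule for this sketch)."""
    value = float(value)
    if not 0 < value < 1:
        raise ValueError("expected a probability strictly between 0 and 1")
    return value


def _validate_ci_kind(value):
    """Accept only the two interval kinds named in the docs."""
    if value not in {"eti", "hdi"}:
        raise ValueError('expected "eti" or "hdi"')
    return value


class FixedKeyParams(MutableMapping):
    """Dictionary-like container with a fixed key set and per-key validation."""

    def __init__(self, defaults, validators):
        self._validators = validators
        self._data = {}
        for key, value in defaults.items():
            self[key] = value  # defaults go through validation too

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if key not in self._validators:
            raise KeyError(f"{key!r} is not a valid parameter name")
        self._data[key] = self._validators[key](value)

    def __delitem__(self, key):
        raise TypeError("keys are fixed and cannot be removed")

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


params = FixedKeyParams(
    defaults={"stats.ci_kind": "eti", "stats.ci_prob": 0.83},
    validators={"stats.ci_kind": _validate_ci_kind, "stats.ci_prob": _validate_probability},
)
params["stats.ci_prob"] = 0.9  # valid value, accepted
try:
    params["stats.ci_prob"] = 1.5  # invalid value, rejected by the validator
except ValueError as err:
    print(f"rejected: {err}")
```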

### ArviZ configuration file

The `rcParams` class is generated and populated at import time. ArviZ checks several
locations for a file named `arvizrc` and, if found, prefers those settings over the library ones.

The locations checked are the following:

1. Current working directory, {func}`os.getcwd`
1. Location indicated by the {envvar}`ARVIZ_DATA` environment variable
1. The third and last location checked is OS dependent:
   * On Linux: `$XDG_CONFIG_HOME/arviz` if it exists, otherwise `~/.config/arviz/`
   * Elsewhere: `~/.arviz/`
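
The search order above can be sketched as a small function. The name `candidate_arvizrc_dirs` is hypothetical, and the sketch assumes that "if it exists" refers to the `XDG_CONFIG_HOME` variable being set; the library may instead check whether the directory exists.

```python
import os
import sys


def candidate_arvizrc_dirs(environ=None, platform=None):
    """List directories to search for `arvizrc`, in the documented order (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform

    # 1. Current working directory
    candidates = [os.getcwd()]

    # 2. Location indicated by ARVIZ_DATA, when set
    if "ARVIZ_DATA" in environ:
        candidates.append(environ["ARVIZ_DATA"])

    # 3. OS-dependent configuration directory
    if platform.startswith("linux"):
        xdg_home = environ.get("XDG_CONFIG_HOME")
        if xdg_home:  # assumption: the variable being set counts as "exists"
            candidates.append(os.path.join(xdg_home, "arviz"))
        else:
            candidates.append(os.path.expanduser(os.path.join("~", ".config", "arviz")))
    else:
        candidates.append(os.path.expanduser(os.path.join("~", ".arviz")))
    return candidates


for directory in candidate_arvizrc_dirs():
    print(os.path.join(directory, "arvizrc"))
```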

:::{dropdown} Example `arvizrc` file
:name: arvizrc
:open:

```none
data.index_origin : 1
plot.backend : bokeh
stats.ci_kind : hdi
stats.ci_prob : 0.95
```

All available keys are listed below. The `arvizrc` file can contain any subset of the keys;
it isn't necessary to include them all. For keys without a user-defined default,
the library default is used.
:::
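
To make the `key : value` layout of the example file concrete, here is a minimal parsing sketch. The function `parse_arvizrc` is hypothetical and is not how `arviz_base` reads the file; it keeps every value as a string and leaves validation to a later step.

```python
def parse_arvizrc(text):
    """Parse `key : value` lines into a dict of strings (hypothetical helper)."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        key, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"missing ':' in line {raw_line!r}")
        settings[key.strip()] = value.strip()
    return settings


example = """
data.index_origin : 1
plot.backend : bokeh
stats.ci_kind : hdi
stats.ci_prob : 0.95
"""
user_settings = parse_arvizrc(example)
print(user_settings)
```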

### Context manager
A context manager is also available to temporarily change the default settings.

```{eval-rst}
.. autosummary::
:toctree: generated/

arviz_base.convert_to_dataset
arviz_base.dict_to_dataset
arviz_base.generate_dims_coords
arviz_base.make_attrs
arviz_base.ndarray_to_dataarray
arviz_base.rc_context
```
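
The pattern behind a settings context manager can be sketched with the standard library alone. The `settings` dictionary and the `temporary_settings` function below are hypothetical stand-ins, not `arviz_base.rcParams` or `arviz_base.rc_context`; they only show how values are overridden inside the block and restored afterwards.

```python
from contextlib import contextmanager

# Stand-in for a settings mapping; not the real arviz_base.rcParams
settings = {"stats.ci_prob": 0.83, "plot.backend": "matplotlib"}


@contextmanager
def temporary_settings(overrides):
    """Apply `overrides` inside the `with` block, then restore the previous values."""
    original = dict(settings)
    settings.update(overrides)
    try:
        yield settings
    finally:
        # Restore even if the body of the `with` block raised an exception
        settings.clear()
        settings.update(original)


with temporary_settings({"stats.ci_prob": 0.9}):
    inside = settings["stats.ci_prob"]
after = settings["stats.ci_prob"]
print(inside, after)
```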

## rcParams
Below, all keys available within `rcParams` are listed, along with their library default.

Keys can be accessed or modified via ``arviz_base.rcParams[key]``, for example,
``arviz_base.rcParams["data.sample_dims"]``.

:::{important}
These defaults are subject to change. If you rely on a specific default value,
you should create an {ref}`arvizrc <arvizrc>` file with the key-value pairs you rely on.

The goal of the ArviZ team is to try and adapt to best practices as they evolve,
which sometimes requires updating default values, for example to use new algorithms.
:::

### data


```{eval-rst}
.. py:data:: data.http_protocol
:type: str
:value: "https"

Protocol for loading remote datasets. Can be "https" or "http".

.. py:data:: data.index_origin
:type: int
:value: 0

Index origin. By default, ArviZ adds coordinate values to all dimensions.
If no coordinate values were provided, ArviZ generates integer indices
as coordinate values starting at `index_origin`.

.. py:data:: data.log_likelihood
:type: bool
:value: True

Whether to save pointwise log likelihood values.

.. py:data:: data.sample_dims
:type: list
:value: ["chain", "draw"]

Names of the sampling dimensions. These are the dimensions that are
reduced by default when computing or plotting, so they should always be present.

.. py:data:: data.save_warmup
:type: bool
:value: False

Whether to save warmup iterations.
```
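
As a small illustration of the `data.index_origin` description, the generated integer coordinates could look like the output of the function below. The name `default_coord_values` is hypothetical and the sketch does not call into `arviz_base`.

```python
def default_coord_values(length, index_origin=0):
    """Integer coordinate values for a dimension without user-provided coordinates."""
    return list(range(index_origin, index_origin + length))


print(default_coord_values(3))                  # library default origin of 0
print(default_coord_values(3, index_origin=1))  # origin set to 1, as in the example arvizrc
```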

### stats

```{eval-rst}
.. py:data:: stats.module
:type: str
:value: "arviz_stats.base"

Preferred module for stats computations.

.. py:data:: stats.ci_kind
:type: str
:value: "eti"

Type of credible interval to compute by default, one of "eti" or "hdi".

.. py:data:: stats.ci_prob
:type: float
:value: 0.83
Contributor: When did we switch to 0.83 vs 0.94?

Member Author: Opened issue #4 to discuss all rcParams changes and updates.

The default probability of computed credible intervals. Its default value here
is also a friendly reminder of the arbitrary nature of commonly used values like 95%.

.. py:data:: stats.information_criterion
:type: str
:value: "loo"

Default algorithm for predictive performance quantification, one of "loo" or "waic".

.. py:data:: stats.ic_compare_method
:type: str
:value: "stacking"

Method for comparing multiple models using their information criteria values,
one of "stacking", "bb-pseudo-bma" or "pseudo-bma".

.. py:data:: stats.ic_pointwise
:type: bool
:value: True

Whether to return pointwise values when computing the
:data:`information criterion <stats.information_criterion>`.

.. py:data:: stats.ic_scale
:type: str
:value: "log"

The scale in which to return
:data:`information criterion <stats.information_criterion>` values,
one of "deviance" (common in the past and the reason for the information criterion naming),
"log" or "negative_log".
```
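
To show how `stats.ci_kind` and `stats.ci_prob` interact, here is a minimal sketch of an equal-tailed interval ("eti") in plain Python. It is not the `arviz_stats` implementation; the function name is hypothetical, and quantiles use linear interpolation between order statistics, which is one common convention among several.

```python
def equal_tailed_interval(samples, prob):
    """Return the (lower, upper) bounds that leave (1 - prob) / 2 in each tail."""
    if not 0 < prob < 1:
        raise ValueError("prob must be strictly between 0 and 1")
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("samples must not be empty")

    def quantile(q):
        # Linear interpolation between the two closest order statistics
        position = q * (len(ordered) - 1)
        lower = int(position)
        upper = min(lower + 1, len(ordered) - 1)
        fraction = position - lower
        return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

    tail = (1 - prob) / 2
    return quantile(tail), quantile(1 - tail)


draws = list(range(101))
print(equal_tailed_interval(draws, 0.83))
```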

### plots

```{eval-rst}
.. py:data:: plot.backend
:type: str
:value: "matplotlib"

Default plotting backend for :mod:`arviz_plots`, one of "matplotlib", "bokeh" or "none".

.. py:data:: plot.density_kind
:type: str
:value: "kde"

Default representation for 1D marginal densities, one of "kde", "hist", "ecdf" or "dot".

.. py:data:: plot.max_subplots
:type: int
:value: 40

Maximum number of :term:`arviz_plots:plots` that can be generated at once.

.. py:data:: plot.point_estimate
:type: str
:value: "mean"

Default statistical summary for centrality, one of "mean", "median" or "mode".
```
13 changes: 11 additions & 2 deletions docs/source/conf.py
@@ -59,6 +59,9 @@
    "pull": ("https://github.com/arviz-devs/arviz-base/pull/%s", "PR#%s"),
}

copybutton_prompt_text = r">>> |\.\.\. |\$ |In \[\d*\]: | {2,5}\.\.\.: | {5,8}: "
copybutton_prompt_is_regexp = True

nb_execution_mode = "auto"
nb_execution_excludepatterns = ["*.ipynb"]
nb_kernel_rgx_aliases = {".*": "python3"}
@@ -91,10 +94,16 @@
    "numpy": ("https://numpy.org/doc/stable/", None),
    "python": ("https://docs.python.org/3/", None),
    "xarray": ("https://docs.xarray.dev/en/stable/", None),
    "arviz_plots": ("https://arviz-plots.readthedocs.io/en/latest", None),
}

# -- Options for HTML output

html_theme = "furo"
html_theme = "sphinx_book_theme"
html_theme_options = {
    "logo": {
        "image_light": "_static/ArviZ.png",
        "image_dark": "_static/ArviZ_white.png",
    }
}
html_static_path = ["_static"]
html_css_files = ["custom.css"]
6 changes: 4 additions & 2 deletions external_tests/helpers.py
@@ -34,7 +34,8 @@ def _emcee_lnprob(theta, y, sigma):
    prior = _emcee_lnprior(theta)
    like_vect = -(((mu + tau * eta - y) / sigma) ** 2)
    like = np.sum(like_vect)
    return like + prior, (like_vect, np.random.normal((mu + tau * eta), sigma))
    rng = np.random.default_rng()
    return like + prior, (like_vect, rng.normal((mu + tau * eta), sigma))


def emcee_schools_model(data, draws, chains):
@@ -47,7 +48,8 @@ def emcee_schools_model(data, draws, chains):
    J = data["J"]  # pylint: disable=invalid-name
    ndim = J + 2

    pos = np.random.normal(size=(chains, ndim))
    rng = np.random.default_rng()
    pos = rng.normal(size=(chains, ndim))
    pos[:, 1] = np.absolute(pos[:, 1])  # pylint: disable=unsupported-assignment-operation

    here = os.path.dirname(os.path.abspath(__file__))
11 changes: 7 additions & 4 deletions external_tests/test_emcee.py
@@ -116,13 +116,15 @@ def test_slices_warning(self, data, slices):

    def test_no_blobs_error(self):
        sampler = emcee.EnsembleSampler(6, 1, lambda x: -(x**2))
        sampler.run_mcmc(np.random.normal(size=(6, 1)), 20)
        rng = np.random.default_rng()
        sampler.run_mcmc(rng.normal(size=(6, 1)), 20)
        with pytest.raises(ValueError):
            from_emcee(sampler, blob_names=["inexistent"])

    def test_peculiar_blobs(self, data):
        sampler = emcee.EnsembleSampler(6, 1, lambda x: (-np.sum(x**2), (np.random.normal(x), 3)))
        sampler.run_mcmc(np.random.normal(size=(6, 1)), 20)
        rng = np.random.default_rng()
        sampler = emcee.EnsembleSampler(6, 1, lambda x: (-np.sum(x**2), (rng.normal(x), 3)))
        sampler.run_mcmc(rng.normal(size=(6, 1)), 20)
        inference_data = from_emcee(sampler, blob_names=["normal", "threes"])
        fails = check_multiple_attrs({"log_likelihood": ["normal", "threes"]}, inference_data)
        assert not fails
@@ -131,8 +133,9 @@ def test_peculiar_blobs(self, data):
        assert not fails

    def test_single_blob(self):
        rng = np.random.default_rng()
        sampler = emcee.EnsembleSampler(6, 1, lambda x: (-np.sum(x**2), 3))
        sampler.run_mcmc(np.random.normal(size=(6, 1)), 20)
        sampler.run_mcmc(rng.normal(size=(6, 1)), 20)
        inference_data = from_emcee(sampler, blob_names=["blob"], blob_groups=["blob_group"])
        fails = check_multiple_attrs({"blob_group": ["blob"]}, inference_data)
        assert not fails
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -51,7 +51,7 @@ ci = [
    "cloudpickle"
]
doc = [
    "furo",
    "sphinx-book-theme",
    "myst-parser[linkify]",
    "myst-nb",
    "sphinx-copybutton",
@@ -82,6 +82,7 @@ select = [
    "E",  # Pycodestyle
    "W",  # Pycodestyle
    "D",  # pydocstyle
    "NPY",  # numpy specific rules
    "UP",  # pyupgrade
    "I",  # isort
    "PL",  # Pylint