Fixes and updates to the documentation and website (#676)
* first docs fix commit

* fix repo url

* fix bad relative imports

* fix api docs using implicit namespaces

* misc link fixes in docs

* fix model names

* remove papers_using_models from second mention

* small fixes to api links

* more link fixes

* more link fixes

* try to deploy to external repo

* try to deploy to external repo

* force execute notebooks

* small fix for ruff, and fix docs dir

* push to gh-pages

* add inits, change gitignore

* continue to rename to fairchem in docs; feel free to roll this back

* misc docs fixes

* more small fixes

* fix main.py location in tutorial utils

* gotchas checkpoint fix

* fairchem_root

* fix build by removing __init__.py

* add env to conda command

* edit install docs

* update sphinx targets

* execute auto

* execute force

* publish for the moment to gh-pages branch as well

* quiet install ocpapi

* fix finetuning from notebook tutorial

* try to remove scale file from tutorial

* download scaling file in tutorial

* edit conf.py

* fix some links

* wget all scale files

* move configs to root

* temp links to build docs

* update index page

* update docs logo

* set scale file links to main branch

* set scale file links to main branch blob

---------

Co-authored-by: Misko <[email protected]>
Co-authored-by: lbluque <[email protected]>
Former-commit-id: 5dc6bcbc53704e404f214da0afde1ecc83e06f08
3 people authored May 11, 2024
1 parent d9d3962 commit 2ca6fd2
Showing 133 changed files with 4,413 additions and 532 deletions.
13 changes: 11 additions & 2 deletions .github/workflows/docs.yml
@@ -35,9 +35,18 @@ jobs:
run: |
jupyter-book build docs
# Deploy the book's HTML to gh-pages branch
- name: Deploy
# Deploy the book's HTML to gh-pages branch # TODO remove once ODAC link updated
- name: Deploy to ghpages branch
uses: peaceiris/actions-gh-pages@v3
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: docs/_build/html

- name: Deploy to fair-chem.github.io
uses: peaceiris/actions-gh-pages@v4
with:
deploy_key: ${{ secrets.ACTIONS_DEPLOY_KEY }}
external_repository: FAIR-Chem/fair-chem.github.io
publish_branch: gh-pages
publish_dir: docs/_build/html

2 changes: 1 addition & 1 deletion .gitignore
@@ -125,4 +125,4 @@ out.txt
config.yml

docs/autoapi/
docs/tutorial/advanced/*.db
docs/tutorial/advanced/*.db
File renamed without changes. (63 files)
15 changes: 8 additions & 7 deletions docs/_config.yml
@@ -5,31 +5,31 @@

#######################################################################################
# Book settings
title : Open Catalyst Project Documentation # The title of the book. Will be placed in the left navbar.
author : The Open Catalyst Project # The author of the book
title : FAIR Chemistry Documentation # The title of the book. Will be placed in the left navbar.
author : FAIR Chemistry & Collaborators # The author of the book
copyright : "2024" # Copyright year to be placed in the footer
logo : logo.png # A path to the book logo

# Force re-execution of notebooks on each build.
# See https://jupyterbook.org/content/execute.html
execute:
execute_notebooks: cache
execute_notebooks: force
timeout: 1800

# Define the name of the latex output file for PDF builds
latex:
latex_documents:
targetname: ocp.tex
targetname: fairchem.tex

# Add a bibtex file so that we can create citations
bibtex_bibfiles:
- references.bib

# Information about where the book exists on the web
repository:
url: https://github.com/Open-Catalyst-Project/ocp # Online location of your book
url: https://github.com/FAIR-Chem/fairchem # Online location of your book
path_to_book: docs # Optional path to your book, relative to the repository root
branch: ocp_documentation # Which branch of the repository should be used when creating links (optional)
branch: main # Which branch of the repository should be used when creating links (optional)

# Add GitHub buttons to your book
# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository
@@ -50,4 +50,5 @@ sphinx:
- 'autoapi.extension'
config:
autosummary_generate: True
autoapi_dirs: ['../ocpmodels']
autoapi_dirs: ['../src/fairchem/core','../src/fairchem/data','../src/fairchem/applications/AdsorbML/adsorbml','../src/fairchem/applications/CatTSunami/ocpneb','../src/fairchem/demo/ocpapi']
autoapi_python_use_implicit_namespaces: True
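
For context on the `autoapi_python_use_implicit_namespaces` option: Python (PEP 420) treats a directory without an `__init__.py` as an implicit namespace package, which is presumably how the packages listed in `autoapi_dirs` are laid out after this commit. The sketch below shows that import behavior in a throwaway directory. The package names `nsdemo` and `core` are hypothetical and are not part of fairchem.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway source tree with no __init__.py files anywhere.
# "nsdemo" and "core" are hypothetical names used only for this illustration.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "nsdemo" / "core"
pkg_dir.mkdir(parents=True)
(pkg_dir / "hello.py").write_text("GREETING = 'hello from a namespace package'\n")

# Make the tree importable and drop any stale finder caches.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

module = importlib.import_module("nsdemo.core.hello")
namespace_pkg = sys.modules["nsdemo"]

print(module.GREETING)
# A namespace package has no __init__.py, so it has no usable __file__.
print(getattr(namespace_pkg, "__file__", None))
```

Documentation tools that discover packages by looking for `__init__.py` will miss such directories unless told to accept implicit namespaces.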
15 changes: 8 additions & 7 deletions docs/_toc.yml
@@ -6,11 +6,15 @@ root: index
parts:
- caption: Quickstart & Installation
chapters:
- file: core/intro_series
- file: core/install
- file: core/quickstart
- file: core/gotchas
- file: core/license
- caption: Learn More
chapters:
- file: core/intro_series
- file: videos/technical_talks
- file: core/papers_using_models
- caption: OCP API & Demo
chapters:
- url: https://open-catalyst.metademolab.com/
@@ -32,13 +36,10 @@ parts:
- file: core/inference
- file: core/fine-tuning/fine-tuning-oxides
- file: core/model_faq
- caption: Videos and Talks
chapters:
- file: videos/technical_talks
- caption: Case Studies & Tutorials
- caption: Catalysis Case Studies & Tutorials
chapters:
- file: tutorials/cattsunami_walkthrough
- file: tutorials/adsorbml_walkthrough
- file: core/papers_using_models
- file: tutorials/intro
- file: tutorials/OCP-introduction
- file: tutorials/NRR/NRR_toc
@@ -53,7 +54,7 @@ parts:
- file: legacy_tutorials/OCP_Tutorial
- file: legacy_tutorials/data_preprocessing
- file: legacy_tutorials/data_visualization
- caption: ocpmodels documentation
- caption: fairchem documentation
chapters:
- file: autoapi/index
- caption: Notebook execution times
2 changes: 1 addition & 1 deletion docs/core/ase_dataset_creation.md
@@ -1,7 +1,7 @@

# Making and using ASE datasets

There are multiple ways to train and evaluate OCP models on data other than OC20 and OC22. Writing an LMDB is the most performant option. However, ASE-based dataset formats are also included as a convenience for people with existing data who simply want to try OCP tools without needing to learn about LMDBs.
There are multiple ways to train and evaluate FAIRChem models on data other than OC20 and OC22. Writing an LMDB is the most performant option. However, ASE-based dataset formats are also included as a convenience for people with existing data who simply want to try fairchem tools without needing to learn about LMDBs.


## Using an ASE Database
8 changes: 4 additions & 4 deletions docs/core/datasets/oc20.md
@@ -9,7 +9,7 @@
IS2* datasets are stored as LMDB files and are ready to be used upon download.
S2EF train+val datasets require an additional preprocessing step.

For convenience, a self-contained script can be found [here](https://github.com/Open-Catalyst-Project/ocp/blob/main/scripts/download_data.py) to download, preprocess, and organize the data directories to be readily usable by the existing [configs](https://github.com/Open-Catalyst-Project/ocp/tree/main/configs).
For convenience, a self-contained script can be found [here](https://github.com/FAIR-Chem/fairchem/blob/main/src/fairchem/core/scripts/download_data.py) to download, preprocess, and organize the data directories to be readily usable by the existing [configs](https://github.com/FAIR-Chem/fairchem/tree/main/src/fairchem/core/configs).

For IS2*, run the script as:

@@ -47,10 +47,10 @@ python scripts/download_data.py --task s2ef --split test



To download and process the dataset in a directory other than your local `ocp/data` folder, add the following command line argument `--data-path`.
To download and process the dataset in a directory other than your local `fairchem/data` folder, add the following command line argument `--data-path`.

Note that the baseline [configs](https://github.com/Open-Catalyst-Project/ocp/tree/main/configs)
expect the data to be found in `ocp/data`, make sure you symlink your directory or
Note that the baseline [configs](https://github.com/FAIR-Chem/fairchem/tree/main/src/fairchem/core/configs)
expect the data to be found in `fairchem/data`, make sure you symlink your directory or
modify the paths in the configs accordingly.
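
If the data lives elsewhere, a symlink lets the configs keep their default paths. The sketch below uses temporary directories as stand-ins; the directory names are placeholders, not paths from the repository.

```python
import os
import tempfile
from pathlib import Path

# Stand-ins for a real storage location and a repository checkout (placeholders).
workspace = Path(tempfile.mkdtemp())
actual_data = workspace / "big_disk" / "oc20_data"
repo_root = workspace / "fairchem"
actual_data.mkdir(parents=True)
repo_root.mkdir()

# Point <repo_root>/data at the real location so default config paths resolve.
link = repo_root / "data"
os.symlink(actual_data, link, target_is_directory=True)

print(link.is_symlink(), link.resolve() == actual_data.resolve())
```

The shell equivalent is `ln -s <actual data directory> <repo root>/data`.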

The following sections list dataset download links and sizes for various S2EF
10 changes: 5 additions & 5 deletions docs/core/fine-tuning/fine-tuning-oxides.md
@@ -29,7 +29,7 @@ We get this checkpoint here.
```{code-cell} ipython3
from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('GemNet-OCOC20+OC22', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('GemNet-OC-S2EFS-OC20+OC22', local_cache='/tmp/ocp_checkpoints/')
```

The data we need is provided in `supporting-information.json`. That file is embedded in the supporting information for the article, and is provided here in the tutorial. We load this data and explore it a little. The json file provides a dictionary with the structure:
@@ -185,7 +185,7 @@ The train set is used for training. The test and val sets are used to check for

You choose the splits you want, 80:10:10 is common. We take a simple approach to split the database here. We make an array of integers that correspond to the ids, randomly shuffle them, and then get each row in the randomized order and write them to a new db.

We provide some helper functions in `ocpmodels.common.tutorial_utils` to streamline this process.
We provide some helper functions in `fairchem.core.common.tutorial_utils` to streamline this process.
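
The shuffle-and-split idea described above can be sketched in plain Python. This is an illustration only, not the implementation of `train_test_val_split`; the function name `split_ids` and the fixed seed are assumptions made for the example.

```python
import random

def split_ids(ids, fractions=(0.8, 0.1, 0.1), seed=42):
    """Shuffle ids and cut them into train/test/val lists by the given fractions."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = int(fractions[0] * len(shuffled))
    n_test = int(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]  # remainder, so no id is dropped
    return train, test, val

# ASE database row ids start at 1, so mimic that here.
train, test, val = split_ids(range(1, 101))
print(len(train), len(test), len(val))
```

Each list of ids could then be used to read rows from the source database in that order and write them to a new one.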

```{code-cell} ipython3
from fairchem.core.common.tutorial_utils import train_test_val_split
@@ -235,7 +235,7 @@ yml

## Running the training job

`ocp` provides a `main.py` file that is used for training. Here we construct the Python command you need to run, and run it. `main.py` is not executable, so we have to run it with python, and you need the absolute path to it, which we get from the `ocp_main()` that is defined in the ocpmodels.common.tutorial_utils.
`fairchem` provides a `main.py` file that is used for training. Here we construct the Python command you need to run, and run it. `main.py` is not executable, so we have to run it with python, and you need the absolute path to it, which we get from the `fairchem_main()` that is defined in the fairchem.core.common.tutorial_utils.

You must set a `mode` and provide a `config-yml`. We provide a checkpoint as a starting point; if you don't, training starts from scratch.
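
The command described above can also be assembled as an argument list before running it. In this sketch the flags are the ones used in this tutorial, while the `main.py` location, config file name, and checkpoint path are placeholders (in the tutorial they come from `fairchem_main()`, `yml`, and `checkpoint_path`).

```python
import shlex
import sys

# Placeholder values; in the tutorial these come from fairchem_main(), yml, and checkpoint_path.
main_py = "/path/to/main.py"
config_yml = "config.yml"
checkpoint = "/tmp/ocp_checkpoints/checkpoint.pt"

cmd = [
    sys.executable, main_py,
    "--mode", "train",
    "--config-yml", config_yml,
    "--checkpoint", checkpoint,  # omit to train from scratch
    "--run-dir", "fine-tuning",
    "--identifier", "ft-oxides",
    "--amp",
]

# Print a copy-pasteable command line; pass `cmd` to subprocess.run(...) to execute it.
print(shlex.join(cmd))
```

Building the list first avoids quoting problems when paths contain spaces.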

@@ -261,10 +261,10 @@ This can take up to 30 minutes for 80 epochs, so we only do a few here to see wh
:tags: [hide-output]
import time
from fairchem.core.common.tutorial_utils import ocp_main
from fairchem.core.common.tutorial_utils import fairchem_main
t0 = time.time()
! python {ocp_main()} --mode train --config-yml {yml} --checkpoint {checkpoint_path} --run-dir fine-tuning --identifier ft-oxides --amp > train.txt 2>&1
! python {fairchem_main()} --mode train --config-yml {yml} --checkpoint {checkpoint_path} --run-dir fine-tuning --identifier ft-oxides --amp > train.txt 2>&1
print(f'Elapsed time = {time.time() - t0:1.1f} seconds')
```

28 changes: 14 additions & 14 deletions docs/core/gotchas.md
@@ -11,7 +11,7 @@ kernelspec:
name: python3
---

Common gotchas with OCP
Common gotchas with fairchem
---------------------------------

# OutOfMemoryError
@@ -45,7 +45,7 @@ The problem here is that no neighbors are found for the single atom which causes
```{code-cell} ipython3
from fairchem.core.common.relaxation.ase_utils import OCPCalculator
from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('GemNet-OCOC20+OC22', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('GemNet-OC-S2EFS-OC20+OC22', local_cache='/tmp/ocp_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path)
```

Expand Down Expand Up @@ -79,7 +79,7 @@ add_adsorbate(slab, 'O', height=1.2, position='fcc')
from fairchem.core.models.model_registry import model_name_to_local_file
# OC20 model - trained on adsorption energies
checkpoint_path = model_name_to_local_file('GemNet-OC All', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('GemNet-OC-S2EF-OC20-All', local_cache='/tmp/ocp_checkpoints/')
with contextlib.redirect_stdout(StringIO()) as _:
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=False)
@@ -92,7 +92,7 @@ slab.get_potential_energy()

```{code-cell} ipython3
# An OC22 checkpoint - trained on total energy
checkpoint_path = model_name_to_local_file('GemNet-OCOC22', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('GemNet-OC-S2EFS-OC20+OC22', local_cache='/tmp/ocp_checkpoints/')
with contextlib.redirect_stdout(StringIO()) as _:
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=False)
@@ -105,7 +105,7 @@ slab.get_potential_energy()

```{code-cell} ipython3
# This eSCN model is trained on adsorption energies
checkpoint_path = model_name_to_local_file('eSCN-L4-M2-Lay12 2M', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('eSCN-L4-M2-Lay12-S2EF-OC20-2M', local_cache='/tmp/ocp_checkpoints/')
with contextlib.redirect_stdout(StringIO()) as _:
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=False)
@@ -128,12 +128,12 @@ WARNING:root:Unrecognized arguments: ['symmetric_edge_symmetrization']

You can ignore this warning, it is not important for predictions.

## Unable to identify OCP trainer
## Unable to identify ocp trainer

The trainer is not specified in some checkpoints, and defaults to `forces` which means energy and forces are calculated. This is the default for the ASE OCP calculator, and this warning just alerts you it is setting that.

```
WARNING:root:Unable to identify OCP trainer, defaulting to `forces`. Specify the `trainer` argument into OCPCalculator if otherwise.
WARNING:root:Unable to identify ocp trainer, defaulting to `forces`. Specify the `trainer` argument into OCPCalculator if otherwise.
```

+++
Expand All @@ -158,7 +158,7 @@ from fairchem.core.common.relaxation.ase_utils import OCPCalculator
from fairchem.core.models.model_registry import model_name_to_local_file
import os
checkpoint_path = model_name_to_local_file('GemNet-OCOC20+OC22', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('GemNet-OC-S2EFS-OC20+OC22', local_cache='/tmp/ocp_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path)
```
@@ -184,7 +184,7 @@ from fairchem.core.common.relaxation.ase_utils import OCPCalculator
from fairchem.core.models.model_registry import model_name_to_local_file
import os
checkpoint_path = model_name_to_local_file('GemNet-OCOC20+OC22', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('GemNet-OC-S2EFS-OC20+OC22', local_cache='/tmp/ocp_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path)
```

Expand All @@ -209,7 +209,7 @@ from fairchem.core.common.relaxation.ase_utils import OCPCalculator
from fairchem.core.models.model_registry import model_name_to_local_file
import os
checkpoint_path = model_name_to_local_file('EquiformerV2 (31M) All+MD', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('EquiformerV2-31M-S2EF-OC20-All+MD', local_cache='/tmp/ocp_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path)
```
@@ -224,14 +224,14 @@ atoms.get_potential_energy()
# Stochastic simulation results

Some models are not deterministic (SCN/eSCN/EqV2), i.e. you can get slightly different answers each time you run it.
An example is shown below. See [Issue 563](https://github.com/Open-Catalyst-Project/ocp/issues/563) for more discussion.
An example is shown below. See [Issue 563](https://github.com/FAIR-Chem/fairchem/issues/563) for more discussion.
This happens because a random selection is made when sampling edges, and a different selection is made each time you run it.
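
A toy illustration of why random edge sampling makes outputs vary between runs. This is not the model's actual sampling code; the edge contributions and the `predict` function are invented for the example.

```python
import math
import random

# Invented per-edge contributions to a scalar "energy"; stands in for message passing.
edge_contributions = [0.1 * i for i in range(1, 21)]

def predict(max_edges, rng):
    """Sum a random subset of edge contributions, rescaled to the full edge count."""
    sampled = rng.sample(edge_contributions, max_edges)
    # fsum is order-independent, so only the choice of edges can change the result.
    return math.fsum(sampled) * len(edge_contributions) / max_edges

# Unseeded generator: each call draws a different subset of edges.
rng = random.Random()
results = [predict(10, rng) for _ in range(5)]
print(results)

# Using every edge removes the randomness.
full = [predict(len(edge_contributions), rng) for _ in range(5)]
print(max(full) - min(full))
```

The spread in `results` comes only from which edges were drawn, which mirrors the run-to-run variation described above.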

```{code-cell} ipython3
from fairchem.core.models.model_registry import model_name_to_local_file
from fairchem.core.common.relaxation.ase_utils import OCPCalculator
checkpoint_path = model_name_to_local_file('EquiformerV2 (31M) All+MD', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('EquiformerV2-31M-S2EF-OC20-All+MD', local_cache='/tmp/ocp_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=True)
from ase.build import fcc111, add_adsorbate
@@ -254,11 +254,11 @@ for result in results:

# The forces don't sum to zero

In DFT, the forces on all the atoms should sum to zero; otherwise, there is a net translational or rotational force present. This is not enforced in OCP models. Instead, individual forces are predicted, with no constraint that they sum to zero. If the force predictions are very accurate, then they sum close to zero. You can further improve this if you subtract the mean force from each atom.
In DFT, the forces on all the atoms should sum to zero; otherwise, there is a net translational or rotational force present. This is not enforced in fairchem models. Instead, individual forces are predicted, with no constraint that they sum to zero. If the force predictions are very accurate, then they sum close to zero. You can further improve this if you subtract the mean force from each atom.
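
A minimal sketch of the mean-force correction described above, using made-up force vectors in place of model predictions:

```python
# Made-up predicted forces (eV/Å) for three atoms; not output from any model.
forces = [
    [0.12, -0.05, 0.30],
    [-0.10, 0.02, -0.21],
    [0.01, 0.06, -0.03],
]

n_atoms = len(forces)
# Net force per Cartesian component; nonzero means a spurious net translation.
net = [sum(f[k] for f in forces) for k in range(3)]
mean = [component / n_atoms for component in net]

# Subtract the mean force from every atom so the components sum to ~zero.
corrected = [[f[k] - mean[k] for k in range(3)] for f in forces]
net_corrected = [sum(f[k] for f in corrected) for k in range(3)]

print(net)
print(net_corrected)
```

With a NumPy array of shape `(n_atoms, 3)` the same correction is `forces - forces.mean(axis=0)`. Note that it removes only the net translational force, not any net torque.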

```{code-cell} ipython3
from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('EquiformerV2 (31M) All+MD', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('EquiformerV2-31M-S2EF-OC20-All+MD', local_cache='/tmp/ocp_checkpoints/')
from fairchem.core.common.relaxation.ase_utils import OCPCalculator
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=True)
8 changes: 4 additions & 4 deletions docs/core/inference.md
@@ -64,7 +64,7 @@ print(available_pretrained_models)
```{code-cell} ipython3
from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('GemNet-dTOC22', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path = model_name_to_local_file('GemNet-dT-S2EFS-OC22', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path
```
@@ -101,10 +101,10 @@ It is a good idea to redirect the output to a file. If the output gets too large
```{code-cell} ipython3
%%capture inference
import time
from fairchem.core.common.tutorial_utils import ocp_main
from fairchem.core.common.tutorial_utils import fairchem_main
t0 = time.time()
! python {ocp_main()} --mode predict --config-yml {yml} --checkpoint {checkpoint_path} --amp
! python {fairchem_main()} --mode predict --config-yml {yml} --checkpoint {checkpoint_path} --amp
print(f'Elapsed time = {time.time() - t0:1.1f} seconds')
```

@@ -197,7 +197,7 @@ The results should be the same.

It is worth noting that the default precision of predictions is float16 with main.py, but float32 with the ASE calculator. Supposedly you can specify `--task.prediction_dtype=float32` at the command line, or set it in the config.yml as we do above, but as of this tutorial that does not resolve the issue.

As noted above (see also [Issue 542](https://github.com/Open-Catalyst-Project/ocp/issues/542)), the ASE calculator and main.py use different precisions by default, which can lead to small differences.
As noted above (see also [Issue 542](https://github.com/FAIR-Chem/fairchem/issues/542)), the ASE calculator and main.py use different precisions by default, which can lead to small differences.
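
To give a feel for the size of the effect, this sketch round-trips a made-up energy value through 16-bit and 32-bit floats using only the standard library. The value is illustrative, not a model prediction.

```python
import struct

def roundtrip(value, fmt):
    """Store value in the given struct float format and read it back as a Python float."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

energy = -7.2831  # made-up energy in eV

as_f16 = roundtrip(energy, "<e")  # IEEE 754 half precision
as_f32 = roundtrip(energy, "<f")  # IEEE 754 single precision

print(f"float16 error: {abs(as_f16 - energy):.2e}")
print(f"float32 error: {abs(as_f32 - energy):.2e}")
```

For values of this magnitude, half precision can be off by up to about 2 meV, while single precision stays below a microelectronvolt, so small discrepancies between the two code paths are expected.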

```{code-cell} ipython3
np.mean(np.abs(results['energy'][sind] - OCP * natoms)) # MAE