Monorepo (#670)
* Update README.md

Change the recommended threshold for anomaly detection.

* Update flag_anomaly.py

Update surface atom movement threshold for surface reconstruction detection.

* changes to make OC-Dataset compatible with newer versions of ASE and pymatgen

* move test folder into ocdata for easier import of the helper functions (#46)

* implemented changes

* updating with reverse method applied too

* lint

* making requested updates

* Initial commit

* Update and rename LICENSE to LICENSE.md

* Update README.md

* Update README.md

* add model configs

* pre-commit checks

* Upload OC20-Dense dataset links

* Add metadata information

* readme typo

* Update mapping checksum

* adsorbml eval script

* update docstring

* Update README.md

* add scripts and readme initial version

* add utility script

* some documentation

* Update README.md

* Update README.md

* Create README.md

* Update README.md

* Update README.md

* Update README.md

* pre-commit hooks

* fix eval ionic speedup

* update db/pkls to new ASE

* random placement code

* update READMEs

* rename var for consistency

* add pointer to pretrained checkpoints

* Update MODELS.md

* add ml+sp success rates

* Bugfix: sample at least 1 site when no. of simplices > num_sites

* Bugfix: pymatgen expects rotation angle in radians, not degrees

* Shuffle sites before returning num_sites

* ceil instead of floor

* set center of mass to 0,0,0 before placing adsorbate

* add setup file

* reorg for setup

* set binding adsorbate to site

* update com docs

* in-place bugfix

* eval backwards compatibility fix

* adding notebook

* updating some small stuff

* fix min_diff abs ordering; fmax on only moveable atoms; minor readme fixes

* Placement code refactor (#62)

* Reorganizes db files a little bit

* flake8 config

* Reorganize + lint old tests

* Separating out core classes + runner code

* default python gitignore

* Runner scripts in a separate folder

* Bulk refactor

* Renames surfaces to surface

* Default paths

* Specify types in core.Bulk

* core.Adsorbate refactor + init tests

* Surface --> slab in core.Bulk

* core.Surface refactor

* Copy over atoms when initializing bulk / adsorbate

* Some helper functions for adsorbate, bulk, surface

* Adslab generation, including rigid body rotations along x, y, z

* Structure --> core.Adslab

* More renaming

* Docs

* vasp_flags change from main

* remove comment

* adding my changes since they are pretty extensive. Needs testing still.

* operational. Known issue: intersecting with atoms outside of the simplex - easy fix anticipated

* making proximate placement optional since it has the small issue

* commit for posterity, changing approach

* new approach works well, but definitely clunky. Still needs testing.

* init ci config

* install typo

* pass black

* cache ci build

* add ci badge

* update tests, paths, bulk db

* relative ci path

* ci directory structure fix

* set precomputed defaults to None

* update adslab test, add vasp test

* downgrade codecov

* rename db folder, import error

* Rename `Surface`-->`Slab`, `Adslab`-->`AdsorbateSlabConfig`

* Moves a lot of the slab creation logic to the Slab object

* Pull symbols and covalent radii from ase instead of hardcoding it here

* Makes surface tagging modular

* Initializing a slab from ase.Atoms

* Minor update to how Slabs are initialized from atoms directly

* Return adsorbate smiles if available

* Pmg placement heuristics: on-top, bridge, hollow sites + binding atoms

* Updates test_inputs to latest api

* Pass tests

* update conversion script

* Removes hardcoded min_xy

* Constrains heuristic rotations around x and y

* bugfix: bulk_id_from_db wasn't getting set when randomly sampling a bulk

* explicitly solving for the point of intersection

* cleaned up docs

* making it work with heuristic and fixing par propagation

* Fix undersampling around the edges

* Update bulks pkl

* Drop unused constants

* Updates docstring

* Fix seed to set kpoints deterministically

* Remove unused functions from vasp utils

* AdsorbateSlabConfig doc update

* all functional - will clean up docs and what args are passed

* all cleaned up. some bad placements possible. need to investigate

* added runner script (docstrings not finalized yet)

* correctly center at site

* bugs with placement

* generate from file of indices

* correcting adsorbate position for projection

* cleaning up overlap function

* adding tests + cleanup

* return missing

* pool for runner script, and formatting

* add codecov badge

* uncommenting because this edge case still exists

* save precomputed surfaces if they don't exist

* adding intercalation test and moving anomaly detection

* formatting :)

* formatting :) :)

* save tiled and tagged slabs, only tile one for random if not saving

* tqdm support for driver script

* New mode: random sites + heuristic adsorbate placement

* making bidentate a random choice

* Adsorbate is already moved to site; avoid extra param

* add precompute slab logic

* add missing pool

* save out site and sampled rotation angles to metadata

* Uniformly random 3d rotations

* bug fix - now check for atoms within interstitial gap to solve for intersections

* make bulk db searchable by src id

* adding adsorbate searchability via SMILES

* adding bulk docs

* removing check for 2D materials since we plan to just catch errors at anomaly detection

* adding ability to enumerate specific miller indices

* set default and flags for random placement rotations

* write surface inputs as part of precompute step

* rand site+heur diff prefix

* delete file only if it exists

* adding binding index selection to adsorbate class for the case where atoms are provided as input.

* Updating readme to reflect current repo state

* updates to readme api

* remove outdated precomputed surfaces from readme

* Make things pip-installable

* Readme pass

* Minor updates to readme

* Pass on ocdata.core docs

* Link to the OC20 commit

* Minor

* Version bump

---------

Co-authored-by: Abhishek Das <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Janice Lan <[email protected]>

* Fix repo badges (#64)

* remove old dataset logic

* updates imports

* rm unnecessary lines

* requested changes from Muhammed

* log successful completion

* adding output

* removing paths

* add ml-only success metric

* rename to free atoms

* initial changes - think this looks much faster

* cleaning up timing

* black on files I haven't changed

* add no vasp flag to precompute slabs

* Fix NH2NH2 naming and add reaction strings to adsorbates pkl (#67)

* fix NH2NH2 naming

* rm ase dbs, add reaction string

* rm conversion script

* resolve failing tests

* Allow pre-loaded databases in initializers

Bulk and Adsorbate initializers accept a path to a database stored in a
pkl file. This adds an option to instead pass in an already-loaded
database, to support cases where the database is read outside of these
classes.
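The initializer pattern described above can be sketched as follows. The class and argument names here are assumptions for illustration, not the exact `fairchem.data.oc` API:

```python
import pickle
from typing import Any, Dict, Optional


class Bulk:
    """Sketch: accept either a path to a pickled database or an
    already-loaded database (names are illustrative assumptions)."""

    def __init__(
        self,
        bulk_db_path: Optional[str] = None,
        bulk_db: Optional[Dict[str, Any]] = None,
    ):
        if bulk_db is not None:
            # Reuse a database the caller already loaded, so many
            # objects can share one read of the pkl file.
            self.db = bulk_db
        elif bulk_db_path is not None:
            with open(bulk_db_path, "rb") as f:
                self.db = pickle.load(f)
        else:
            raise ValueError("Provide either bulk_db or bulk_db_path")
```

Passing the pre-loaded dict avoids repeated deserialization when many objects are constructed from the same database.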

* Fix test for new adsorbate db structure

* Backwards compatibility for adsorbate db (#70)

* Backwards compatibility for adsorbate db

Open-Catalyst-Project/Open-Catalyst-Dataset#67
added support for reaction strings in adsorbate database entries. Tuple
unpacking in the Adsorbate initializer assumes length 4 (after the
reaction string was added). Older databases will be missing the reaction
string and the initializer will raise when they are used. This makes the
initializer backwards compatible with older databases, saving the
reaction string only if present.

* _save_adsorbate -> _load_adsorbate

* adding 2023 neurips challenge folder with eval script and mapping/target files

* adding a readme for challenge_eval script

* Initial commit

* Add Client and get_bulks method

* Use common test cases for client method

There are a few test cases that we'll want to run against each client
method: unexpected response code, exception raised in client, and
successful response. This adds a function that runs all cases for each
client method, rather than copying the code around for each test.

* Add support for GET /ocp/adsorbates

* Remove try/catch when reading api response

This was left over from parsing the response under the assumption that
it was JSON serialized. Now we just fetch the plain text in the
response.

* Removed defaults in data models

After getting through a few more data structures in the API, there are
enough fields without reasonable defaults that I think it makes sense to
remove defaults. We'll still handle the addition of new fields, but
removal of a field, if that happens at some point, will require users to
upgrade their library version.
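The trade-off described above (tolerate new server fields, fail loudly on removed ones) can be sketched with a plain dataclass; the field names are assumptions for illustration:

```python
from dataclasses import dataclass, fields


@dataclass
class BulkModel:
    """Data model with no field defaults: a field the server stops
    sending fails loudly instead of silently taking a placeholder."""
    src_id: str
    formula: str

    @classmethod
    def from_api(cls, payload: dict) -> "BulkModel":
        # Tolerate *new* fields by dropping unknown keys; a *removed*
        # field raises TypeError, signaling a required library upgrade.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in known})
```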

* Add support for slab enumeration

* Add support for enumerating adsorbate slab configs

* Add support for submitting relaxations

* Add support for fetching relaxation results

* Add support for deleting relaxations

* Simplify model naming

* Add support for fetching relaxation requests

* Add doc for exceptions in client methods

* Add more type hints to client

* Add exception type for rate limited api calls

* Add exception type for calls that cannot be retried

* Add public api to __init__ file

* Add retry decorator for calls to the ocp api

* Create client module

* Add workflows for adsorbate slab relaxations

* Add context to log statements

* Add logging for rate limited requests

* Return adsorbate details from workflow

* Add quickstart to readme

* Rename filter on miller indices

* Add file IO example to README

* Rename AdsorbateConfiguration to AdsorbateSlabRelaxation

* Add default client to public workflow methods

* Add lifetime to find_adsorbate_binding_sites

* Add integration tests for workflows

* Add unit tests for context.py

* Add unit tests for retry.py

* Add "slab" to public workflow names

* Add unit tests for adsorbates.py

* Add support for omitted_config_ids fields

* Add progress bar that tracks finished relaxations

* Add note about asyncio to readme

* Add methods to convert to ase.Atoms objects

* Add forces/energies to ase.Atoms when possible

* Add FixAtom constraint to sub-surface atoms

* Do not flatten relaxation results across slabs

* Configure setuptools to make the package pip-installable  (#11)

* Initial setup.py

* Readme note

* Remove env yaml

* Make sure to run relaxations with tagged slabs

* Create client with scheme and host

Previously the Client initializer took the full base URL as input. This
splits it into separate scheme and host inputs. Later we'll use the host
to map API URLs to UI URLs.
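The scheme/host split can be sketched as below; the URL paths and property names are assumptions, not the exact ocpapi endpoints:

```python
class Client:
    """Build the client from scheme and host instead of one base URL,
    so the same host can be reused for UI links (sketch only)."""

    def __init__(self, host: str, scheme: str = "https"):
        self._host = host
        self._scheme = scheme

    @property
    def base_url(self) -> str:
        # API requests are made against the host at an API prefix...
        return f"{self._scheme}://{self._host}/ocp"

    def ui_url(self, system_id: str) -> str:
        # ...and the same host maps to the UI page where a submitted
        # system's relaxations can be visualized.
        return f"{self._scheme}://{self._host}/results/{system_id}"
```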

* Add method to get UI results URL

Adds a method that returns the URL at which results can be visualized.

* Add URLs to workflow outputs

Adds the API host and UI URL for each set of relaxations (system in API
terminology).

* Use equiformer v2 as the default model type

* Pypi distribution (#15)

* pypi install update + add license, citing info to setup, readme

* Add patch id to version

* Implement remaining unit tests

* Add circleci config

* Add circlci badge and code coverage

* Check allowed models against server side list

* More general slab filtering

The old slab_filter interface accepted a single slab and returned
True/False to keep/reject it. This updates the interface to accept a
list of adslabs and return a list of adslabs, allowing for operations on
the entire set, including on individual adsorbate configs if needed.
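A list-in/list-out filter enables set-level logic that a per-slab predicate cannot express. A minimal sketch, with the dict fields as illustrative assumptions:

```python
from typing import Dict, List


def keep_low_energy_slabs(adslabs: List[Dict]) -> List[Dict]:
    """New-style filter: take the whole batch and return the subset
    to keep, allowing ranking across the entire set (sketch only)."""
    # Keep only the three lowest-energy candidates overall -- a
    # decision that requires seeing all adslabs at once.
    return sorted(adslabs, key=lambda a: a["energy"])[:3]
```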

* Add prompt_for_slabs_to_keep filter

Changes the default behavior of find_adsorbate_binding_sites() to prompt
users for the set of slabs that they want to submit.

* Update docs about "adslab"

* Documentation fixes

Adds a note to the README about the supported bulks and adsorbates,
fixes some language in documentation throughout the package, and
modifies some docstrings so that they work well with sphinx
documentation generators.

* Release v1.0.0

* Initial commit

* Added force field evaluation code (#1)

* Added force field evaluation code

---------

Co-authored-by: Anuroop Sriram <[email protected]>

* Add files via upload

* skip vasp surface inputs + Pymatgen install fix (#72)

* skip vasp surface inputs

* fix logic

* fix ci

* ci debug

* snapshot pymatgen version

* ci debug

* pmg conda install

* pmg install ci debug

* update pmg debug

* pass tests

* missing tests

* test fix

* update shift

* Add files via upload

* Add files via upload

* Add files via upload

* Support for placing multiple adsorbates (#74)

* initial multi-adsorbate placement

* support for multi configurations

* add multi-ads coverage tests

* update docstring

* typo

* update metadata dict

* adsorbate argument to placement

* Remove unused imports

* Remove more unused imports

* update docs

* indent

* Update README.md

* Update README.md

* init commit

* pre-commit hooks

* add license

* add gitignore

* Add initial Architector examples for testing (#1)

* Add initial Architector examples for testing

* rm checkpoint

---------

Co-authored-by: Muhammed Shuaibi <[email protected]>

* add MIT license (#77)

* add MIT license

* add license to readme

* add orca recipes (#3)

* add orca recipes

* more docs

* explicit setup, directory

* version fix

* support quacc==0.7.2

* include pkls in pip setup (#81)

* [BE] Remove hardcoded paths (#82)

* include pkls in pip setup

* remove hardcoded paths

* remove unused configs

* fix test imports

* be moving too fast

* use pyproject.toml

* github actions

* in workflows directory

* in workflows directory

* correct call to black

* check only 3.9

* fix dependencies

* ocdata not ocpdata

* Store oriented unit bulk (#75)

* save out oriented unit bulk

* informational codecov

* black diff

* fix lint

* test 3.9 only...

* initial commit

* moving stuff around and correcting paths

* bare bones for validation -- files need updating

* 3.9 - 3.11

* general updates

* Add additional Orca keywords for population analysis/properties which add minimal cost (#4)

* Add additional Orca keywords for population analysis/properties which add minimal cost

* actually NormalPrint doesn't add useful things

* update Sella optimizer to fmax 0.05

---------

Co-authored-by: Daniel Levine <[email protected]>

* remove Field

* all cleaned up -- should be ready for review

* clearing cell output from tutorial notebook

* clean up timing

* reducing number of initial configs

* updating file path approach to mirror new approach in ocdata

* explicitly define sella kwargs (#7)

* explicitly define sella kwargs

* lower scf+max_steps

* Support for custom Orca calculator (#6)

* feje orca support

* feje support

* return results

* updates per latest quacc

* add nbo

* move monkey patch to driver script, not recipes

* pin ase+update quacc

* clarify docs

* Added supercell info file

* Update README.md

* scripts for sampling GEOM dataset (#5)

* scripts for sampling geom

* Update biomolecules/geom/sample_geom_drugs.py

Co-authored-by: Daniel Levine <[email protected]>

* minor updates based on review

* updating geom scripts to new file structure

* fixed level of sampling

---------

Co-authored-by: Daniel Levine <[email protected]>

* add sella

* modify nbo input (#9)

* Add printing of Reduced Mulliken and Lowdin populations for each orbital (#10)

Also added Lowdin and Mulliken bond orders since they were the only population feature missing from the NormalPrint level.

* mark slabs test xfail per pmg bug

* assert approx

* no pmg version, lint, remove circleci

* grid3

* Update README.md

fixing tutorial link and adding dataset download link

* adding arXiv links to readme

* add checksum

* move to new src folder

* rename imports for monorepo

* fix up .gitignore @lbluque solved .pt include

* folder promote ocpapi and open-catalyst-dataset

* fixes to workflow

* move main.py to root

* add packages

* ruff fixes

* remove unused gitignore and pre-commit-config

* move ocpapi readme

* remove extra github workflows from data/oc

* remove extra pyproject.toml and setup.py

* move environment deps to packages

---------

Co-authored-by: apalizha <[email protected]>
Co-authored-by: Aini Palizhati <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: Janice Lan <[email protected]>
Co-authored-by: Janice Lan <[email protected]>
Co-authored-by: Abhishek Das <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Kyle Michel <[email protected]>
Co-authored-by: Kyle Michel <[email protected]>
Co-authored-by: Brandon Wood <[email protected]>
Co-authored-by: anuroopsriram <[email protected]>
Co-authored-by: anuroopsriram <[email protected]>
Co-authored-by: Anuroop Sriram <[email protected]>
Co-authored-by: Xiaohan Yu <[email protected]>
Co-authored-by: Michael G. Taylor <[email protected]>
Co-authored-by: lbluque <[email protected]>
Co-authored-by: Daniel Levine <[email protected]>
Co-authored-by: Brandon Wood <[email protected]>
Co-authored-by: Daniel Levine <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
Former-commit-id: 0ededf69a58ecea9f41a509a8f01f08784d591ef
25 people authored May 7, 2024
1 parent 8f20f8e commit 7665d20
Showing 486 changed files with 51,592 additions and 455 deletions.
13 changes: 10 additions & 3 deletions .github/workflows/docs.yml
@@ -30,12 +30,19 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pushd packages/
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f requirements-optional.txt ]; then pip install -r requirements-optional.txt; fi
popd
pushd packages/fairchem-core
pip install -e .[docs,adsorbml]
pip install git+https://github.com/Open-Catalyst-Project/Open-Catalyst-Dataset.git
pip install git+https://github.com/Open-Catalyst-Project/ocpapi.git
# pip install git+https://github.com/Open-Catalyst-Project/CatTSunami.git
popd
pushd packages/fairchem-data-oc
pip install -e .[dev]
popd
pushd packages/fairchem-demo-ocpapi
pip install -e .[dev]
popd
# Build the book
- name: Build the book
9 changes: 8 additions & 1 deletion .github/workflows/lint.yml
@@ -19,10 +19,17 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pushd packages/
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f requirements-optional.txt ]; then pip install -r requirements-optional.txt; fi
popd
pushd packages/fairchem-core
pip install -e .[dev]
popd
- name: ruff
run: |
ruff --version
-ruff check --statistics --config pyproject.toml
+ruff check --statistics --config packages/fairchem-core/pyproject.toml src/fairchem/core/
+ruff check --statistics --config packages/fairchem-data-oc/pyproject.toml src/fairchem/data/oc/
+#ruff check --statistics --config packages/fairchem-data-om/pyproject.toml src/fairchem/data/om/
+#ruff check --statistics --config packages/fairchem-demo-ocpapi/pyproject.toml src/fairchem/demo/ocpapi/
24 changes: 21 additions & 3 deletions .github/workflows/test.yml
@@ -33,18 +33,36 @@ jobs:
${{ runner.os }}-pip-
${{ runner.os }}-
- name: Install dependencies and package
- name: Install core dependencies and package
# this can be added along with a dependabot config to run tests with latest versions
# pip install -r requirements.txt
# pip install -r requirements-optional.txt
run: |
python -m pip install --upgrade pip
pushd packages/
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f requirements-optional.txt ]; then pip install -r requirements-optional.txt; fi
popd
pushd packages/fairchem-core/
pip install -e .[dev]
- name: Test with pytest
popd
pushd packages/fairchem-data-oc/
pip install -e .[dev]
popd
pushd packages/fairchem-demo-ocpapi/
pip install -e .[dev]
popd
- name: Test core with pytest
run: |
pytest tests -vv --cov-report=xml --cov=ocpmodels
pushd src/fairchem/core/
pytest tests -vv --cov-report=xml --cov=fairchem.core
popd
pushd src/fairchem/data/oc/
pytest tests -vv --cov-report=xml --cov=fairchem.data.oc
popd
pushd src/fairchem/demo/ocpapi/
pytest tests -vv --cov-report=xml --cov=fairchem.demo.ocpapi
popd
- if: ${{ matrix.python_version == '3.11' }}
name: codecov-report
5 changes: 3 additions & 2 deletions .gitignore
@@ -1,5 +1,6 @@
wandb
-data
+/data
+/src/fairchem/data
checkpoints
results
logs
@@ -118,7 +119,7 @@ env.yml
docs/legacy_tutorials/videos/

# No ASE dbs, or OCP LMDBs!
-*.pt
+/*.pt
*.lmdb
*.lmdb-lock
out.txt
3 changes: 2 additions & 1 deletion LICENSE.md
@@ -1,7 +1,8 @@
# License

MIT License

-Copyright (c) Facebook, Inc. and its affiliates.
+Copyright (c) Meta, Inc. and its affiliates.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
89 changes: 12 additions & 77 deletions README.md
@@ -1,81 +1,16 @@
# `ocp` by Open Catalyst Project
Welcome to FAIRChem! (Under construction - 2024-05-06~2024-05-07)

![tests](https://github.com/Open-Catalyst-Project/ocp/actions/workflows/test.yml/badge.svg?branch=main)
[![codecov](https://codecov.io/gh/Open-Catalyst-Project/ocp/graph/badge.svg?token=M606LH5LK6)](https://codecov.io/gh/Open-Catalyst-Project/ocp)
(2024-05-06) Repository changes

`ocp` is the [Open Catalyst Project](https://opencatalystproject.org/)'s
library of state-of-the-art machine learning algorithms for catalysis.
To better test integration between our packages and keep them up to date, we have migrated all existing repositories under Open-Catalyst-Project into a single repository (this one), FAIRChem.

<div align="left">
<img src="https://user-images.githubusercontent.com/1156489/170388229-642c6619-dece-4c88-85ef-b46f4d5f1031.gif">
</div>
Our new structure supports the fairchem package namespace with several independently installable packages:
* [fairchem.core](src/fairchem/core) (formerly [OpenCatalystProject - OCP](https://github.com/Open-Catalyst-Project/ocp/tree/main))
* [fairchem.data.oc](src/fairchem/data/oc) (formerly [Open-Catalyst-Dataset](https://github.com/Open-Catalyst-Project/Open-Catalyst-Dataset))
* [fairchem.data.om](src/fairchem/data/om) (formerly [om-data](https://github.com/Open-Catalyst-Project/om-data))
* [fairchem.data.odac](src/fairchem/data/odac) (formerly [odac-data](https://github.com/Open-Catalyst-Project/odac-data/tree/main))
* [fairchem.demo.ocpapi](src/fairchem/demo/ocpapi) (formerly [ocpapi](https://github.com/Open-Catalyst-Project/ocpapi))
* [fairchem.applications.adsorbml](src/fairchem/applications/AdsorbML) (formerly [AdsorbML](https://github.com/Open-Catalyst-Project/AdsorbML))
* [fairchem.applications.cattsunami](src/fairchem/applications/CatTSunami)

## [OCP Documentation](https://open-catalyst-project.github.io/ocp/)
Full documentation, released data/checkpoints, and tutorials are now available.

- [Installation](https://open-catalyst-project.github.io/ocp/core/install.html)
- [Pretrained models](https://open-catalyst-project.github.io/ocp/core/models.html)
- [FAQ](https://open-catalyst-project.github.io/ocp/core/model_faq.html)

`ocp` provides training and evaluation code for tasks and models that take arbitrary
chemical structures as input to predict energies / forces / positions / stresses,
and can be used as a base scaffold for research projects. For an overview of
tasks, data, and metrics, please read the documentations and respective papers:
- [OC20](https://open-catalyst-project.github.io/ocp/core/datasets/oc20.html)
- [OC22](https://open-catalyst-project.github.io/ocp/core/datasets/oc22.html)
- [ODAC23](https://open-catalyst-project.github.io/ocp/core/datasets/odac.html)

Projects and models built on `ocp`:

- SchNet [[`arXiv`](https://arxiv.org/abs/1706.08566)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/main/ocpmodels/models/schnet.py)]
- DimeNet++ [[`arXiv`](https://arxiv.org/abs/2011.14115)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/main/ocpmodels/models/dimenet_plus_plus.py)]
- GemNet-dT [[`arXiv`](https://arxiv.org/abs/2106.08903)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/gemnet)]
- PaiNN [[`arXiv`](https://arxiv.org/abs/2102.03150)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/painn)]
- Graph Parallelism [[`arXiv`](https://arxiv.org/abs/2203.09697)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/gemnet_gp)]
- GemNet-OC [[`arXiv`](https://arxiv.org/abs/2204.02782)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/gemnet_oc)]
- SCN [[`arXiv`](https://arxiv.org/abs/2206.14331)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/scn)]
- AdsorbML [[`arXiv`](https://arxiv.org/abs/2211.16486)] [[`code`](https://github.com/open-catalyst-project/adsorbml)]
- eSCN [[`arXiv`](https://arxiv.org/abs/2302.03655)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/escn)]
- EquiformerV2 [[`arXiv`](https://arxiv.org/abs/2306.12059)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/equiformer_v2)]

Older model implementations that are no longer supported:

- CGCNN [[`arXiv`](https://arxiv.org/abs/1710.10324)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/e7a8745eb307e8a681a1aa9d30c36e8c41e9457e/ocpmodels/models/cgcnn.py)]
- DimeNet [[`arXiv`](https://arxiv.org/abs/2003.03123)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/e7a8745eb307e8a681a1aa9d30c36e8c41e9457e/ocpmodels/models/dimenet.py)]
- SpinConv [[`arXiv`](https://arxiv.org/abs/2106.09575)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/e7a8745eb307e8a681a1aa9d30c36e8c41e9457e/ocpmodels/models/spinconv.py)]
- ForceNet [[`arXiv`](https://arxiv.org/abs/2103.01436)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/e7a8745eb307e8a681a1aa9d30c36e8c41e9457e/ocpmodels/models/forcenet.py)]

## Discussion

For all non-codebase related questions and to keep up-to-date with the latest OCP
announcements, please join the [discussion board](https://discuss.opencatalystproject.org/).

All code-related questions and issues should be posted directly on our
[issues page](https://github.com/Open-Catalyst-Project/ocp/issues).
Make sure to first go through the [FAQ](https://github.com/Open-Catalyst-Project/ocp/tree/main/FAQ.md)
to check if your question's answered already.

## Acknowledgements

- This codebase was initially forked from [CGCNN](https://github.com/txie-93/cgcnn)
by [Tian Xie](http://txie.me), but has undergone significant changes since.
- A lot of engineering ideas have been borrowed from [github.com/facebookresearch/mmf](https://github.com/facebookresearch/mmf).
- The DimeNet++ implementation is based on the [author's Tensorflow implementation](https://github.com/klicperajo/dimenet) and the [DimeNet implementation in Pytorch Geometric](https://github.com/rusty1s/pytorch_geometric/blob/master/torch_geometric/nn/models/dimenet.py).

## License

`ocp` is released under the [MIT](https://github.com/Open-Catalyst-Project/ocp/blob/main/LICENSE.md) license.

## Citing `ocp`

If you use this codebase in your work, please consider citing:

```bibtex
@article{ocp_dataset,
author = {Chanussot*, Lowik and Das*, Abhishek and Goyal*, Siddharth and Lavril*, Thibaut and Shuaibi*, Muhammed and Riviere, Morgane and Tran, Kevin and Heras-Domingo, Javier and Ho, Caleb and Hu, Weihua and Palizhati, Aini and Sriram, Anuroop and Wood, Brandon and Yoon, Junwoong and Parikh, Devi and Zitnick, C. Lawrence and Ulissi, Zachary},
title = {Open Catalyst 2020 (OC20) Dataset and Community Challenges},
journal = {ACS Catalysis},
year = {2021},
doi = {10.1021/acscatal.0c04525},
}
```
[go to the docs!](https://open-catalyst-project.github.io/ocp/)
10 changes: 5 additions & 5 deletions docs/core/fine-tuning/fine-tuning-oxides.md
@@ -27,7 +27,7 @@ First we get the checkpoint that we want. According to the [MODELS](../../core/m
We get this checkpoint here.

```{code-cell} ipython3
-from ocpmodels.models.model_registry import model_name_to_local_file
+from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('GemNet-OCOC20+OC22', local_cache='/tmp/ocp_checkpoints/')
```
@@ -74,7 +74,7 @@ atoms, c['data']['total_energy'], c['data']['forces']
Next, we will create an OCP calculator that we can use to get predictions from.

```{code-cell} ipython3
-from ocpmodels.common.relaxation.ase_utils import OCPCalculator
+from fairchem.core.common.relaxation.ase_utils import OCPCalculator
calc = OCPCalculator(checkpoint_path=checkpoint_path, trainer='forces', cpu=False)
```

@@ -188,7 +188,7 @@ You choose the splits you want, 80:10:10 is common. We take a simple approach to
We provide some helper functions in `ocpmodels.common.tutorial_utils` to streamline this process.

```{code-cell} ipython3
-from ocpmodels.common.tutorial_utils import train_test_val_split
+from fairchem.core.common.tutorial_utils import train_test_val_split
! rm -fr train.db test.db val.db
train, test, val = train_test_val_split('oxides.db')
@@ -200,7 +200,7 @@ train, test, val
We have to create a yaml configuration file for the model we are using. The pre-trained checkpoints contain their config data, so we use this to get the base configuration, and then remove pieces we don't need, and update pieces we do need.

```{code-cell} ipython3
-from ocpmodels.common.tutorial_utils import generate_yml_config
+from fairchem.core.common.tutorial_utils import generate_yml_config
yml = generate_yml_config(checkpoint_path, 'config.yml',
delete=['slurm', 'cmd', 'logger', 'task', 'model_attributes',
@@ -261,7 +261,7 @@ This can take up to 30 minutes for 80 epochs, so we only do a few here to see wh
:tags: [hide-output]
import time
-from ocpmodels.common.tutorial_utils import ocp_main
+from fairchem.core.common.tutorial_utils import ocp_main
t0 = time.time()
! python {ocp_main()} --mode train --config-yml {yml} --checkpoint {checkpoint_path} --run-dir fine-tuning --identifier ft-oxides --amp > train.txt 2>&1
26 changes: 13 additions & 13 deletions docs/core/gotchas.md
@@ -43,8 +43,8 @@ RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the
The problem here is that no neighbors are found for the single atom which causes an error. This may be model dependent. There is currently no way to get atomic energies for some models.

```{code-cell} ipython3
-from ocpmodels.common.relaxation.ase_utils import OCPCalculator
-from ocpmodels.models.model_registry import model_name_to_local_file
+from fairchem.core.common.relaxation.ase_utils import OCPCalculator
+from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('GemNet-OCOC20+OC22', local_cache='/tmp/ocp_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path)
```
@@ -76,7 +76,7 @@ add_adsorbate(slab, 'O', height=1.2, position='fcc')
```
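The OC20 model loaded below is trained on adsorption energies rather than total energies. As a reminder of the usual referencing convention (with made-up numbers; sign and reference-state choices vary between datasets):

```python
# E_ads = E(slab + adsorbate) - E(slab) - E(adsorbate reference)
# All values in eV; these numbers are illustrative, not from any calculation.
e_slab_ads = -310.2   # hypothetical total energy of slab with adsorbate
e_slab = -305.1       # hypothetical total energy of the clean slab
e_ref = -4.3          # hypothetical adsorbate reference energy

e_ads = e_slab_ads - e_slab - e_ref   # negative means binding is favorable
```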

```{code-cell} ipython3
-from ocpmodels.models.model_registry import model_name_to_local_file
+from fairchem.core.models.model_registry import model_name_to_local_file
# OC20 model - trained on adsorption energies
checkpoint_path = model_name_to_local_file('GemNet-OC All', local_cache='/tmp/ocp_checkpoints/')
@@ -154,8 +154,8 @@ Gemnet in particular seems to require at least 4 atoms. This has to do with inte

```{code-cell} ipython3
%%capture
-from ocpmodels.common.relaxation.ase_utils import OCPCalculator
-from ocpmodels.models.model_registry import model_name_to_local_file
+from fairchem.core.common.relaxation.ase_utils import OCPCalculator
+from fairchem.core.models.model_registry import model_name_to_local_file
import os
checkpoint_path = model_name_to_local_file('GemNet-OC OC20+OC22', local_cache='/tmp/ocp_checkpoints/')
@@ -180,8 +180,8 @@ Some models use tags to determine which atoms to calculate energies for. For exa

```{code-cell} ipython3
%%capture
-from ocpmodels.common.relaxation.ase_utils import OCPCalculator
-from ocpmodels.models.model_registry import model_name_to_local_file
+from fairchem.core.common.relaxation.ase_utils import OCPCalculator
+from fairchem.core.models.model_registry import model_name_to_local_file
import os
checkpoint_path = model_name_to_local_file('GemNet-OC OC20+OC22', local_cache='/tmp/ocp_checkpoints/')
@@ -205,8 +205,8 @@ atoms.get_potential_energy()
Not all models require tags though. This EquiformerV2 model does not use them. This is another detail that is important to keep in mind.
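For models that do use them, the OC20 convention is roughly: tag 0 for sub-surface slab atoms, tag 1 for surface slab atoms, tag 2 for adsorbate atoms. A toy sketch of masking per-atom quantities by tag (illustrative values, not the models' actual code):

```python
import numpy as np

# OC20-style tags: 0 = sub-surface, 1 = surface, 2 = adsorbate
tags = np.array([0, 0, 1, 1, 2, 2])
per_atom = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])  # dummy per-atom values

adsorbate_total = per_atom[tags == 2].sum()  # only adsorbate contributions
surface_total = per_atom[tags == 1].sum()    # only surface contributions
```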

```{code-cell} ipython3
-from ocpmodels.common.relaxation.ase_utils import OCPCalculator
-from ocpmodels.models.model_registry import model_name_to_local_file
+from fairchem.core.common.relaxation.ase_utils import OCPCalculator
+from fairchem.core.models.model_registry import model_name_to_local_file
import os
checkpoint_path = model_name_to_local_file('EquiformerV2 (31M) All+MD', local_cache='/tmp/ocp_checkpoints/')
@@ -228,8 +228,8 @@ An example is shown below. See [Issue 563](https://github.com/Open-Catalyst-Proj
This happens because a random selection of edges is made during sampling, and a different selection is made each time you run it.
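The mechanism can be reproduced with a toy sampler: without a fixed seed each call picks different edges, while a seeded generator is repeatable. This is an illustration of the idea only, not the model's actual sampling code:

```python
import random

def sample_edges(edges, k, seed=None):
    # Unseeded -> a different subset each call; seeded -> reproducible
    rng = random.Random(seed)
    return rng.sample(edges, k)

edges = list(range(1000))
a = sample_edges(edges, 10, seed=0)
b = sample_edges(edges, 10, seed=0)  # identical to a, because the seed is fixed
```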

```{code-cell} ipython3
-from ocpmodels.models.model_registry import model_name_to_local_file
-from ocpmodels.common.relaxation.ase_utils import OCPCalculator
+from fairchem.core.models.model_registry import model_name_to_local_file
+from fairchem.core.common.relaxation.ase_utils import OCPCalculator
checkpoint_path = model_name_to_local_file('EquiformerV2 (31M) All+MD', local_cache='/tmp/ocp_checkpoints/')
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=True)
@@ -257,10 +257,10 @@ for result in results:
In DFT, the forces on all the atoms should sum to zero; otherwise, there is a net translational or rotational force present. This is not enforced in OCP models. Instead, individual forces are predicted, with no constraint that they sum to zero. If the force predictions are very accurate, they will sum to nearly zero. You can improve this further by subtracting the mean force from each atom.
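The correction mentioned above is a one-liner. A sketch with a dummy force array (note this removes only the net translational force, not rotational components):

```python
import numpy as np

# Dummy predicted forces, shape (n_atoms, 3), eV/Angstrom
forces = np.array([[0.10, -0.02, 0.00],
                   [-0.07, 0.05, 0.01],
                   [0.02, -0.01, -0.03]])

net = forces.sum(axis=0)                  # not exactly zero for ML predictions
corrected = forces - forces.mean(axis=0)  # subtract the mean force per component
# corrected now sums to zero along each Cartesian axis
```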

```{code-cell} ipython3
-from ocpmodels.models.model_registry import model_name_to_local_file
+from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('EquiformerV2 (31M) All+MD', local_cache='/tmp/ocp_checkpoints/')
-from ocpmodels.common.relaxation.ase_utils import OCPCalculator
+from fairchem.core.common.relaxation.ase_utils import OCPCalculator
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=True)
from ase.build import fcc111, add_adsorbate
10 changes: 5 additions & 5 deletions docs/core/inference.md
@@ -57,12 +57,12 @@ with ase.db.connect('full_data.db') as full_db:
You have to choose a checkpoint to start with. The newer checkpoints may require too much memory for this environment.

```{code-cell} ipython3
-from ocpmodels.models.model_registry import available_pretrained_models
+from fairchem.core.models.model_registry import available_pretrained_models
print(available_pretrained_models)
```

```{code-cell} ipython3
-from ocpmodels.models.model_registry import model_name_to_local_file
+from fairchem.core.models.model_registry import model_name_to_local_file
checkpoint_path = model_name_to_local_file('GemNet-dT OC22', local_cache='/tmp/ocp_checkpoints/')
checkpoint_path
@@ -72,7 +72,7 @@ checkpoint_path
We have to update our configuration yml file with the dataset. Note that the config also requires the train and test sets to be specified, even though we only run predictions here.

```{code-cell} ipython3
-from ocpmodels.common.tutorial_utils import generate_yml_config
+from fairchem.core.common.tutorial_utils import generate_yml_config
yml = generate_yml_config(checkpoint_path, 'config.yml',
delete=['cmd', 'logger', 'task', 'model_attributes',
'dataset', 'slurm'],
@@ -101,7 +101,7 @@ It is a good idea to redirect the output to a file. If the output gets too large
```{code-cell} ipython3
%%capture inference
import time
-from ocpmodels.common.tutorial_utils import ocp_main
+from fairchem.core.common.tutorial_utils import ocp_main
t0 = time.time()
! python {ocp_main()} --mode predict --config-yml {yml} --checkpoint {checkpoint_path} --amp
@@ -166,7 +166,7 @@ We include this here just to show that:
2. This is much slower.

```{code-cell} ipython3
-from ocpmodels.common.relaxation.ase_utils import OCPCalculator
+from fairchem.core.common.relaxation.ase_utils import OCPCalculator
calc = OCPCalculator(checkpoint_path=checkpoint_path, cpu=False)
```

4 changes: 2 additions & 2 deletions docs/core/lmdb_dataset_creation.md
@@ -23,8 +23,8 @@ about these steps as they've been automated as part of this
[download script](https://github.com/Open-Catalyst-Project/ocp/blob/master/scripts/download_data.py).

```{code-cell} ipython3
-from ocpmodels.preprocessing import AtomsToGraphs
-from ocpmodels.datasets import LmdbDataset
+from fairchem.core.preprocessing import AtomsToGraphs
+from fairchem.core.datasets import LmdbDataset
import ase.io
from ase.build import bulk
from ase.build import fcc100, add_adsorbate, molecule
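Conceptually, an atoms-to-graph conversion like `AtomsToGraphs` builds an edge list from interatomic distances within a cutoff. A non-periodic toy sketch of that idea (the real class also handles periodic boundaries, tags, energies, and forces):

```python
import numpy as np

def atoms_to_graph(numbers, positions, cutoff=6.0):
    # Toy, non-periodic version: edges connect atoms closer than the cutoff
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    src, dst = np.where((d > 0) & (d < cutoff))
    return {
        "atomic_numbers": np.asarray(numbers),
        "edge_index": np.stack([src, dst]),  # shape (2, n_edges)
        "distances": d[src, dst],
    }

# Two Cu atoms 2.5 Angstrom apart -> two directed edges
graph = atoms_to_graph([29, 29], np.array([[0.0, 0.0, 0.0],
                                           [0.0, 0.0, 2.5]]))
```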

0 comments on commit 7665d20
