The GT4SD (Generative Toolkit for Scientific Discovery) is an open-source platform to accelerate hypothesis generation in the scientific discovery process. It provides a library for making state-of-the-art generative AI models easier to use.
For full details on the library API and examples see the docs.
Almost all pretrained models are also available via gradio-powered web apps on Hugging Face Spaces.
This branch contains a minimal version that supports only the Regression Transformers: training and inference pipelines to generate small molecules, polymers, or peptides based on numerical property constraints. For details, read the paper.
git clone https://github.com/GT4SD/gt4sd-core.git -b rt-minimal
cd gt4sd-core/
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install .
# for development
# pip install -r dev_requirements.txt
# pip install -e .
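After installing, it can be useful to confirm that the package landed in the active virtual environment. The following is a minimal sketch using only the Python standard library; it assumes the distribution is registered under the name `gt4sd` (an assumption, check the package metadata of this branch if the lookup fails). The helper `installed_version` is a hypothetical name introduced here for illustration and is not part of the library.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "gt4sd" is the assumed distribution name; adjust it if the metadata differs.
    version = installed_version("gt4sd")
    if version is None:
        print("gt4sd was not found in this environment")
    else:
        print(f"gt4sd {version} is installed")
```

Run it with the virtual environment activated, so that the lookup inspects the same interpreter that `pip install .` targeted.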
If you use gt4sd in your projects, please consider citing the following:
@software{GT4SD,
author = {GT4SD Team},
month = {2},
title = {{GT4SD (Generative Toolkit for Scientific Discovery)}},
url = {https://github.com/GT4SD/gt4sd-core},
version = {main},
year = {2022}
}
@article{manica2022gt4sd,
title={Accelerating material design with the generative toolkit for scientific discovery},
author={Manica, Matteo and Born, Jannis and Cadow, Joris and Christofidellis, Dimitrios and Dave, Ashish and Clarke, Dean and Teukam, Yves Gaetan Nana and Giannone, Giorgio and Hoffman, Samuel C and Buchan, Matthew and others},
journal={npj Computational Materials},
volume={9},
number={1},
pages={69},
year={2023},
publisher={Nature Publishing Group UK London}
}
The gt4sd codebase is under the MIT license.
For individual model usage, please refer to the model licenses found in the original packages.