Protenix: Protein + X

⚡ Protenix Web Server • 📄 Technical Report


We're excited to introduce Protenix: a trainable, open-source PyTorch reproduction of AlphaFold 3.

Protenix is built for high-accuracy structure prediction. It serves as an initial step in our journey toward advancing accessible and extensible research tools for the computational biology community.

Protenix predictions

🌟 Related Projects

  • Protenix-Dock: Our implementation of a classical protein-ligand docking framework that leverages empirical scoring functions. Without using deep neural networks, Protenix-Dock delivers competitive performance in rigid docking tasks.

Updates

🎉 Model Update

  • 2025-05-30: Protenix-v0.5.0 is now available! You may try Protenix-v0.5.0 by accessing the server, or upgrade to the latest version using pip.

🔥 Feature Update

📊 Benchmark

We benchmarked the performance of Protenix-v0.5.0 against Boltz-1 and Chai-1 across multiple datasets, including PoseBusters v2, AF3 Nucleic Acid Complexes, AF3 Antibody Set, and our curated Recent PDB set.

V0.5.0 model Metrics

We will soon release the benchmarking toolkit, including the evaluation datasets, data curation pipeline, and metric calculators, to support transparent and reproducible benchmarking.

🛠 Installation

PyPI

pip3 install protenix

For development on a CPU-only machine, it is convenient to install with the --cpu flag in editable mode:

python3 setup.py develop --cpu
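
To sanity-check the install, the package should import cleanly from Python (a minimal sketch; it only verifies that the package is importable, not that a GPU or model weights are set up):

# Minimal sanity check: the package should import without errors.
import protenix
print("protenix imported from:", protenix.__file__)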

Docker (Recommended for Training)

Check the detailed guide: Docker Installation.

🚀 Inference

Expected Input & Output Format

For details on the input JSON format and expected outputs, please refer to the Input/Output Documentation.
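
As a rough illustration, each input entry names a prediction task and lists the sequences to fold; the Python snippet below writes such a file. The field names (name, sequences, proteinChain, ligand) mirror the example files shipped with the repository, but this is only a sketch; treat the Input/Output Documentation as the authoritative schema.

# Sketch of a minimal input JSON written from Python. Field names mirror the
# shipped examples but may differ from the current schema; consult the
# Input/Output Documentation before relying on them.
import json

task = {
    "name": "example_task",
    "sequences": [
        {"proteinChain": {"sequence": "MGSSHHHHHHSSGLVPRGSH", "count": 1}},
        {"ligand": {"ligand": "CCD_ATP", "count": 1}},
    ],
}

with open("my_input.json", "w") as fh:
    json.dump([task], fh, indent=2)  # the file holds a list of tasks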

Prepare Inputs

Convert PDB/CIF File to Input JSON

If your input is a .pdb or .cif file, you can convert it into a JSON file for inference.

# ensure `release_data/ccd_cache/components.cif` exists, or generate it by running:
python scripts/gen_ccd_cache.py -c release_data/ccd_cache/ -n [num_cpu]

# for PDB
# download the pdb file
wget https://files.rcsb.org/download/7pzb.pdb
# convert the pdb file to a json file for inference
protenix tojson --input examples/7pzb.pdb --out_dir ./output

# for CIF (same process)
# download the cif file
wget https://files.rcsb.org/download/7pzb.cif
# convert the cif file to a json file for inference
protenix tojson --input examples/7pzb.cif --out_dir ./output
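
To convert many structures at once, a small driver loop over the same protenix tojson command works; a sketch with placeholder paths:

# Batch-convert every .pdb/.cif file in a folder via the documented CLI.
# The "structures" directory is a placeholder for your own data.
import pathlib
import subprocess

for path in sorted(pathlib.Path("structures").iterdir()):
    if path.suffix.lower() in {".pdb", ".cif"}:
        subprocess.run(
            ["protenix", "tojson", "--input", str(path), "--out_dir", "./output"],
            check=True,
        )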

(Optional) Prepare MSA Files

We provide an independent MSA search utility. You can run it using either a JSON file or a protein FASTA file.

# run MSA search with a JSON file; the precomputed MSA dir info is written to a new JSON file.
protenix msa --input examples/example_without_msa.json --out_dir ./output

# run MSA search with a FASTA file that contains only protein sequences.
protenix msa --input examples/prot.fasta --out_dir ./output

Inference via Command Line

If you installed Protenix via pip, you can run the following command to perform model inference:

# the defaults for n_cycle/n_step/n_samples are 10/200/5; you can modify them by passing --cycle x1 --step x2 --sample x3

# run with example.json, which contains a precomputed MSA dir.
protenix predict --input examples/example.json --out_dir ./output --seeds 101

# run with multiple json files; the default seed is 101.
protenix predict --input ./jsons_dir/ --out_dir ./output

# if the json does not contain a precomputed MSA dir,
# add --use_msa_server to search MSA and then predict.
# if multiple seeds are provided, separate them with commas.
protenix predict --input examples/example_without_msa.json --out_dir ./output --seeds 101,102 --use_msa_server
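
For larger runs, the same protenix predict command can be driven from Python, for example to give each input its own seed list; a sketch with placeholder paths and seeds:

# Drive the documented `protenix predict` CLI from Python, one call per input.
# Input paths and seeds are placeholders.
import subprocess

jobs = {
    "examples/example.json": [101],                   # has a precomputed MSA dir
    "examples/example_without_msa.json": [101, 102],  # needs --use_msa_server
}

for json_path, seeds in jobs.items():
    cmd = [
        "protenix", "predict",
        "--input", json_path,
        "--out_dir", "./output",
        "--seeds", ",".join(str(s) for s in seeds),
    ]
    if "without_msa" in json_path:
        cmd.append("--use_msa_server")
    subprocess.run(cmd, check=True)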

Inference via Bash Script

Alternatively, run inference via the provided script:

bash inference_demo.sh

The script accepts the following arguments:

  • input_json_path: Path to a JSON file that fully specifies the input structure.
  • dump_dir: Directory where inference results will be saved.
  • dtype: Data type used during inference. Supported options: bf16 and fp32.
  • use_msa: Whether to enable MSA features (default: true).
  • use_esm: Whether to enable ESM features (default: false).

Note: By default, layernorm and EvoformerAttention kernels are disabled for simplicity. To enable them and speed up inference, see the Kernels Setup Guide.

🧬 Training

Refer to the Training Documentation for setup and details.

📌 Constraint Feature

Protenix now allows users to specify contacts, enabling the model to leverage additional inter-chain information as constraint guidance! We benchmarked this feature on the PoseBusters dataset and a curated protein-antibody interface subset. Results show that Protenix can generate significantly more accurate structures when guided by constraints. You can try it out via the constraint_esm branch.

Constraint Metrics

Tips: Our online service already supports constraint inputs, so no local setup is required. For local command-line usage, however, be sure to check out the constraint_esm branch, as this feature is not yet included in the main branch.
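
To convey the idea concretely: on the constraint_esm branch, contacts are supplied alongside the usual input JSON. The sketch below is purely hypothetical; the field names (constraint, contact, chain and residue identifiers, max_distance) are invented for illustration, and the real schema is defined on that branch.

# Purely hypothetical sketch of a contact constraint attached to an input task.
# Field names are invented for illustration only; see the constraint_esm branch
# for the actual schema.
contact_hint = {
    "constraint": {
        "contact": [
            {"chain_1": "A", "residue_1": 45,
             "chain_2": "B", "residue_2": 112,
             "max_distance": 8.0},
        ],
    },
}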

Training and Inference Cost

For details on memory usage and runtime during training and inference, refer to the Training & Inference Cost Documentation.

Citing Protenix

If you use Protenix in your research, please cite the following:

@article{chen2025protenix,
  title={Protenix - Advancing Structure Prediction Through a Comprehensive AlphaFold3 Reproduction},
  author={Chen, Xinshi and Zhang, Yuxuan and Lu, Chan and Ma, Wenzhi and Guan, Jiaqi and Gong, Chengyue and Yang, Jincai and Zhang, Hanyu and Zhang, Ke and Wu, Shenghao and Zhou, Kuangqi and Yang, Yanping and Liu, Zhenyu and Wang, Lan and Shi, Bo and Shi, Shaochen and Xiao, Wenzhi},
  year={2025},
  doi = {10.1101/2025.01.08.631967},
  journal = {bioRxiv}
}

Contributing to Protenix

We welcome contributions from the community to help improve Protenix!

📄 Check out the Contributing Guide to get started.

✅ Code Quality: We use pre-commit hooks to ensure consistency and code quality. Please install them before making commits:

pip install pre-commit
pre-commit install

🐞 Found a bug or have a feature request? Open an issue.

Acknowledgements

Our LayerNorm operators draw on both OneFlow and FastFold. We also adopted several module implementations from OpenFold, except for LayerNorm, which we implemented independently.

Code of Conduct

We are committed to fostering a welcoming and inclusive environment. Please review our Code of Conduct for guidelines on how to participate respectfully.

Security

If you discover a potential security issue in this project, or think you may have discovered a security issue, we ask that you notify Bytedance Security via our security center or vulnerability reporting email.

Please do not create a public GitHub issue.

License

The Protenix project, including both code and model parameters, is released under the Apache 2.0 License. It is free for both academic research and commercial use.

Contact Us

We welcome inquiries and collaboration opportunities for advanced applications of our model, such as developing new features, fine-tuning for specific use cases, and more. Please feel free to contact us at [email protected].