Latents-based conformational control in OpenFold3.
ConforNet depends on OpenFold3-preview (>= 0.4.0, a.k.a. OF3p2).
conda create -n confornet python=3.12
conda activate confornet
# 1. Install confornet (pulls OpenFold3 >= 0.4.0 and other deps)
pip install -e .
# 2. Reinstall OpenFold3 from GitHub main.
# The current PyPI release may error on ColabFold MSA server queries
# when a hit points to an obsolete PDB entry (e.g. 6rm0).
# The fix is already on `main` but has not yet been released.
pip install --no-deps --force-reinstall \
"openfold3 @ git+https://github.com/aqlaboratory/openfold-3.git@main"
# 3. Download OF3p2 checkpoint (~2.3 GB)
# setup_openfold uses $OPENFOLD_CACHE as the destination; set it first if you
# want the checkpoint somewhere other than the default (~/.cache/openfold).
export OPENFOLD_CACHE=/path/to/checkpoint/dir
setup_openfold

Evaluation uses the USalign binary at packages/USalign. See packages/README.md.
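The cache-directory lookup above can be sketched in Python. This is a hypothetical approximation of how `setup_openfold` resolves its destination from `$OPENFOLD_CACHE`; the function name and exact logic are illustrative, not the package's actual API.

```python
import os
from pathlib import Path

def resolve_checkpoint_dir() -> Path:
    """Return the checkpoint destination, honoring $OPENFOLD_CACHE if set.

    Hypothetical sketch: falls back to the documented default
    (~/.cache/openfold) when the variable is unset.
    """
    default = Path.home() / ".cache" / "openfold"
    return Path(os.environ.get("OPENFOLD_CACHE", default))
```

Setting the variable before running `setup_openfold` redirects the ~2.3 GB download accordingly.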
Important
The paper reports results with OF3p1 (of3_ft3_v1.pt, OpenFold3 0.3.x). This checkpoint is not compatible with openfold3 >= 0.4.0 and will fail to load. We are standardizing on OF3p2 (of3-p2-155k.pt) going forward and will re-run the benchmarks; updated results will be posted.
To reproduce the exact paper numbers, use OF3p1 with openfold3==0.3.1.
A tiny benchmark is provided under repo/toy_assets/toy_benchmark/ (mdfA from membrane and fs-4zrb_C-4zrb_H from foldswitching), along with a pretrained ConforNet trained to fold membrane transporters toward the outward-facing conformation. Peak GPU memory usage is < 24 GB.
BENCH=toy_benchmark
ASSETS=repo/toy_assets
CKPT=/path/to/of3-p2-155k.pt
# 1. Preprocessing — writes MSA + OF3p batches under $ASSETS/$BENCH/{msa,batch}.
python preprocess.py --benchmark $BENCH --assets-dir $ASSETS
# 2. k-ConforNet diversity training on both test cases.
python scripts/run_diversity.py \
--benchmark $BENCH --assets-dir $ASSETS \
--checkpoint $CKPT \
--output-dir ./output/demo/diversity \
--k-confornets 2 --num-runs 2 --num-samples 5
# This will generate 2 * 2 * 4 * 5 = 80 samples per test case
# May take ~30 GPU minutes. If you have multiple GPUs, launch with torchrun --nproc_per_node=4
# 3. Conformation transfer — use the provided ConforNet
# to fold the mdfA sequence. Demonstrates transfer.
python scripts/run_transfer.py \
--benchmark $BENCH --assets-dir $ASSETS \
--confornet-path $ASSETS/$BENCH/confornet/TM_0287v2_6QV1_B.pt \
--test-case mdfA \
--checkpoint $CKPT \
--output-dir ./output/demo/transfer \
--num-samples 10
# Generates 10 samples; runs quickly.

See repo/demo.ipynb for evaluation + py3Dmol overlay visualization of the outputs (pip install py3Dmol).
All benchmarks reported in the paper are provided under assets/. Follow the benchmark scaffolding in assets/ (references, residue ranges, and test-case mappings); see assets/README.md for details. Once you have defined your benchmark, or if you want to reproduce the paper results, see scripts/README.md for the benchmark entry points: run_diversity, run_mse_training, run_transfer, run_baseline, evaluate, and summarize.
Warning
ConforNets backpropagate through the Pairformer and therefore require more GPU memory than standard inference. Roughly speaking, a 40 GB GPU fits a ~300 aa protein, while an 80 GB GPU fits ~600 aa. The dist_cdf_mse objective is currently less memory-efficient. We are actively working on memory optimizations.
assets/ contains benchmark definitions and reference PDBs. Preprocessing computes MSAs and saves OF3p batches as .pt files.
# Default ./assets directory
python preprocess.py --benchmark domainmotion
# Custom assets directory (copies ./assets there first if needed)
python preprocess.py --benchmark domainmotion --assets-dir /scratch/assets
# Skip MSA step (not recommended unless you know what you're doing)
python preprocess.py --benchmark domainmotion --skip-msa

OF3p batches may take 10+ GB per benchmark.
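Given the layout described above ($ASSETS/$BENCH/{msa,batch}, with batches saved as .pt files), a quick sanity check that preprocessing finished might look like the following. The helper name and return shape are illustrative, not part of the repo's API:

```python
from pathlib import Path

def check_preprocessed(assets_dir: str, benchmark: str) -> dict:
    """Report which preprocessing outputs exist for a benchmark.

    Assumes the documented layout: $ASSETS/$BENCH/msa and $ASSETS/$BENCH/batch,
    with OF3p batches saved as .pt files.
    """
    root = Path(assets_dir) / benchmark
    batch_dir = root / "batch"
    return {
        "msa_dir": (root / "msa").is_dir(),
        "batch_dir": batch_dir.is_dir(),
        "num_batches": len(list(batch_dir.glob("*.pt"))) if batch_dir.is_dir() else 0,
    }
```

Running it after preprocess.py lets you confirm both subdirectories were written before launching GPU jobs.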
There are two independent axes of parallelism, both embarrassingly parallel (no cross-rank communication):
- Intra-node, multi-GPU — `torchrun --nproc_per_node=N`. Each script partitions its work by `job_idx % WORLD_SIZE == LOCAL_RANK`. Reads `LOCAL_RANK`/`WORLD_SIZE` (torchrun) or `FLUX_TASK_RANK`/`NRANKS_PER_NODE` (Flux); see `confornet/utils/dist.py`.
- Inter-node, test-case sharding — benchmark scripts accept `--num-nodes N --node-idx i`. Each invocation keeps only the test cases where `tc_idx % num_nodes == node_idx`. Launch one `torchrun` per node (SLURM array, multiple `srun`s, etc.).
Combine the two: `torchrun --nproc_per_node=4 -m scripts.run_diversity ... --num-nodes 8 --node-idx $NODE_IDX` gives 4 GPUs × 8 nodes, each working on a disjoint slice.
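The two sharding rules above compose as follows. This is a minimal sketch; the function names are illustrative, and the real rank resolution lives in `confornet/utils/dist.py`:

```python
def my_test_cases(test_cases, num_nodes: int, node_idx: int):
    """Inter-node axis: keep test cases where tc_idx % num_nodes == node_idx."""
    return [tc for i, tc in enumerate(test_cases) if i % num_nodes == node_idx]

def my_jobs(jobs, local_rank: int, world_size: int):
    """Intra-node axis: keep jobs where job_idx % world_size == local_rank."""
    return [j for i, j in enumerate(jobs) if i % world_size == local_rank]

# With 8 nodes x 4 GPUs, every (node, rank) pair sees a disjoint slice,
# and the union over all pairs covers the full job list exactly once --
# which is why no cross-rank communication is needed.
```

Because both filters are pure modular arithmetic on indices, any subset of ranks can crash and be relaunched without coordinating with the others.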
- OF3p2 switch: the default checkpoint is `of3-p2-155k.pt`; the paper's OF3p1 checkpoint will not load against `openfold3 >= 0.4.0`.
- USalign: dropped the `mdtraj`/`bioemu_benchmarks` dependencies in favor of the USalign binary. This introduces a 0.1–3 Å difference in global RMSD (~1–2% in reported success rates). Some OOD60 local test cases can show larger differences because the alignment method itself has changed.
