Tools to compute high-frequency native contacts from MD trajectories and build a coarse-grained (CG) Gō-Martini model with martinize2.
The workflow converts a trajectory into per-frame structures (PDB/CIF), computes per-frame contact maps, identifies persistent (high-frequency) contacts, selects a reference frame, and generates CG topology/parameters. It supports PDB or CIF frames natively and uses nm for contact thresholds.
- Python 3.8+
- Packages: MDAnalysis, numpy, tqdm
- contact_map executable (from GoMartini ContactMapGenerator)
- martinize2 in your $PATH
- (Optional) mkdssp for secondary structure
Get contact_map:
- Zenodo: https://zenodo.org/records/3817447
- GitHub: https://github.com/Martini-Force-Field-Initiative/GoMartini/tree/main/ContactMapGenerator
Use traj_to_pdb.py to write frames as frame_####.pdb (or .cif).
python traj_to_pdb.py --trajectory 6ZH9_WT_dry_100.nc --topology 6ZH9_WT_dry.parm7 --ranges 1-195,195-323 --outdir . --stride 1
Options:
- --trajectory: trajectory file (e.g., .xtc, .dcd, .nc)
- --topology: topology or coordinates (e.g., .pdb, .gro, .parm7)
- --ranges: residue blocks per chain (e.g., 2-196,197-325)
- --outdir: where frames are written
- --stride: keep every n-th frame
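As an illustration of the --ranges format, the sketch below parses a string such as 2-196,197-325 into residue blocks. The function name parse_ranges is hypothetical and is not taken from traj_to_pdb.py; the sketch also assumes non-negative residue numbers and one block per chain.

```python
def parse_ranges(spec):
    """Parse a --ranges string such as '2-196,197-325' into (start, end) tuples.

    Hypothetical helper for illustration; assumes non-negative residue numbers.
    """
    blocks = []
    for part in spec.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"invalid residue block: {part!r}")
        blocks.append((start, end))
    return blocks


print(parse_ranges("2-196,197-325"))  # prints [(2, 196), (197, 325)]
```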
Frames must be named frame_0001.pdb, frame_0002.pdb, … (or .cif).
If both .pdb and .cif exist for the same index, PDB is preferred.
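The naming rule and the PDB-over-CIF preference can be sketched as follows. This is a minimal illustration, not the code used by contact_analysis.py; the function name collect_frames and the regular expression are assumptions.

```python
import re
from pathlib import Path

# Matches frame_0001.pdb, frame_0002.cif, ... (at least four digits; an assumption).
FRAME_RE = re.compile(r"^frame_(\d{4,})\.(pdb|cif)$")


def collect_frames(directory):
    """Return {frame index: Path}, preferring .pdb over .cif for the same index."""
    frames = {}
    for path in sorted(Path(directory).iterdir()):
        match = FRAME_RE.match(path.name)
        if match is None:
            continue  # not a frame file
        index, ext = int(match.group(1)), match.group(2)
        # A .cif file never replaces an entry; a .pdb file always wins.
        if index not in frames or ext == "pdb":
            frames[index] = path
    return frames


if __name__ == "__main__":
    for index, path in sorted(collect_frames(".").items()):
        print(index, path.name)
```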
The main script is contact_analysis.py (formerly contact_calculation.py). It:
- Generates .map files for each frame (parallelized).
- Filters contacts using nm thresholds (--go-low, --go-up).
- Annotates intra and inter contacts and removes duplicates.
- Computes contact frequencies across frames and writes:
  - normalized_<type>.txt
  - high_<type>.txt (≥ --threshold)
- Selects the frame with the most high-frequency contacts.
- Runs martinize2 with -go <frame_X.map> matching -f <frame_X.pdb|cif>.
- Produces and filters go_nbparams.itp.
- Computes average CA–CA distances (in nm) for high-frequency pairs missing from the selected ITP and writes missing_high_freq.itp.
- Keeps only pairs with average distance ≤ --go-up.
- (Optional) --add-missing appends missing_high_freq.itp entries (with header) into go_nbparams.itp.
- Writes per-frame counts for high and Gō sets and moves outputs into output_files/.
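The frequency and frame-selection steps listed above can be sketched roughly as below. The data layout (a dict mapping frame index to a set of contact pairs), the function names, and the tie-breaking rule (lowest frame index) are assumptions for illustration; the script itself records ties in a separate file.

```python
from collections import Counter


def contact_frequencies(frames):
    """frames: {frame index: set of contact pairs}. Returns {pair: fraction of frames}."""
    counts = Counter(pair for contacts in frames.values() for pair in contacts)
    return {pair: n / len(frames) for pair, n in counts.items()}


def high_frequency(freqs, threshold=0.7):
    """Pairs whose frequency is at least the threshold (cf. --threshold)."""
    return {pair for pair, freq in freqs.items() if freq >= threshold}


def best_frame(frames, high):
    """Frame containing the most high-frequency contacts.

    Ties are resolved in favor of the lowest index (an assumption of this sketch).
    """
    return max(sorted(frames), key=lambda index: len(frames[index] & high))


# Toy data: residue labels are placeholders, not real output of contact_map.
frames = {
    1: {("A10", "A50"), ("A12", "B7")},
    2: {("A10", "A50")},
    3: {("A10", "A50"), ("A12", "B7"), ("A3", "B9")},
}
high = high_frequency(contact_frequencies(frames), threshold=0.6)
print(sorted(high), best_frame(frames, high))
```

In this toy example, frames 1 and 3 each contain both high-frequency pairs, so the sketch returns frame 1.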
python contact_analysis.py --cm /path/to/contact_map --type inter --cpus 15 --threshold 0.7 --merge all --dssp /usr/bin/mkdssp --from charmm --go-eps 15.0 --go-low 0.3 --go-up 1.1 --add-missing
- --cm (str, "."): directory containing contact_map
- --type (both|intra|inter, "both"): scope of contacts. For monomers, use intra.
- --cpus (int, 15): parallel workers for mapping
- --threshold (float, 0.7): high-frequency cutoff
- --merge (all|None, None): chains to merge before martinize2 (all merges every chain)
- --dssp (str|None, None): path to mkdssp
- --from (amber|charmm, "amber"): source force field for martinize2
- --posres (none|all|backbone, "none"): position restraints
Gō model and contact filtering (nm):
- --go-eps (float, 9.414): epsilon for Gō bias (kJ/mol)
- --go-low (float, 0.3): minimum contact distance threshold
- --go-up (float, 1.1): maximum contact distance threshold
- --go-res-dist (int|None, None): remove contacts below a graph distance
- --go-backbone (str, "BB"): backbone bead name for Gō site
- --go-atomname (str, "CA"): virtual site atom name
- --go-write-file (flag or str): ask martinize2 to write its contact map if it computes it
Water bias and IDR:
- --water-bias (flag): enable water bias
- --water-bias-eps (list of str): e.g., H:3.6 C:2.1 idr:2.1
- --id-regions (list of str): e.g., A-10:45 60:80
- --idr-tune (flag): deprecated passthrough to martinize2
Protein description and edits:
- --noscfix (flag), --scfix (flag), --cys (str)
- --mutate (list of str): e.g., A-PHE45:ALA PHE30:ALA
- --modify (list of str): e.g., A-ASP45:ASP0 ASP:ASP0 +HSE
- --nter, --cter, --nt: terminus patches
Diagnostics:
- --write-graph, --write-repair, --write-canon
- -v (repeatable): increase martinize2 verbosity
- --maxwarn (list of int)
Reference frame:
- --force-frame (int or None): use a specific frame_#### instead of auto-selection
Appending missing contacts:
- --add-missing (flag): append high-frequency pairs from missing_high_freq.itp into the final go_nbparams.itp. The script includes the [ nonbond_params ] header on append, and no extra blank line is added. Distances for these pairs are average CA–CA over all frames (in nm), and only pairs with average ≤ --go-up are kept.
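The append behavior can be sketched as below. This is only an illustration under stated assumptions: the function name append_missing is hypothetical, entry lines are treated as opaque text, and any header line already present in missing_high_freq.itp is skipped so that exactly one header is written on append.

```python
from pathlib import Path

HEADER = "[ nonbond_params ]"


def append_missing(go_itp, missing_itp):
    """Append entries from missing_itp to go_itp under a [ nonbond_params ] header.

    Returns the number of appended entries. Entry lines are copied verbatim.
    """
    entries = [
        line
        for line in Path(missing_itp).read_text().splitlines()
        if line.strip() and line.strip() != HEADER
    ]
    if not entries:
        return 0
    existing = Path(go_itp).read_text().rstrip("\n")
    # Terminate the existing content with a single newline so that
    # no blank line separates it from the appended header.
    prefix = existing + "\n" if existing else ""
    Path(go_itp).write_text(prefix + HEADER + "\n" + "\n".join(entries) + "\n")
    return len(entries)


# Usage (paths are examples):
# append_missing("go_nbparams.itp", "missing_high_freq.itp")
```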
Run python contact_analysis.py -h to view all flags with their default values.
When building missing_high_freq.itp, the script scans all frames and measures CA–CA distances for each missing high-frequency contact.
- PDB frames: MDAnalysis positions are in angstroms. They are converted to nm by dividing by 10.
- CIF frames: a lightweight mmCIF reader provides angstrom coordinates, then the script converts to nm by dividing by 10.
It then averages per-pair distances across all frames (nm) and keeps the pair only if average_distance ≤ --go-up.
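A minimal sketch of the averaging and filtering described above, using only the standard library. The data layout (one dict of CA coordinates in angstroms per frame), the atom keys, and the function names are assumptions for illustration; the script itself reads coordinates through MDAnalysis or its mmCIF reader.

```python
import math

ANGSTROM_PER_NM = 10.0


def average_distances_nm(frames, pairs):
    """Average CA-CA distance in nm for each pair over all frames.

    frames: list of {atom key: (x, y, z) in angstroms}
    pairs:  iterable of (key_i, key_j)
    """
    averages = {}
    for i, j in pairs:
        # Convert each per-frame distance from angstroms to nm before averaging.
        dists = [math.dist(frame[i], frame[j]) / ANGSTROM_PER_NM for frame in frames]
        averages[(i, j)] = sum(dists) / len(dists)
    return averages


def keep_within(averages, go_up=1.1):
    """Keep only pairs whose average distance does not exceed go_up (nm)."""
    return {pair: dist for pair, dist in averages.items() if dist <= go_up}


# Toy coordinates (angstroms); keys are placeholders for CA atoms.
frames = [
    {"A10": (0.0, 0.0, 0.0), "A50": (6.0, 0.0, 0.0), "B7": (0.0, 15.0, 0.0)},
    {"A10": (0.0, 0.0, 0.0), "A50": (8.0, 0.0, 0.0), "B7": (0.0, 13.0, 0.0)},
]
kept = keep_within(average_distances_nm(frames, [("A10", "A50"), ("A10", "B7")]))
print(sorted(kept))
```

Here the A10/A50 pair averages about 0.7 nm and is kept, while the A10/B7 pair averages about 1.4 nm and is dropped by the default 1.1 nm cutoff.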
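The Lennard-Jones parameters used in the ITP are as follows. For a standard 12-6 potential, avg / 2^(1/6) is the sigma whose potential minimum lies at the average distance avg: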
rmin = avg / 2^(1/6)
epsilon = --go-eps # kJ/mol
Outputs:
- filtered_*.txt, annotated_*.txt: per-frame filtered and annotated contacts
- normalized_<type>.txt: contact frequencies across frames
- high_<type>.txt: contacts with frequency ≥ threshold
- best_frame.txt (plus a ties file if needed)
- topol.top, <frame>_CG.pdb
- go_nbparams.itp (filtered as requested)
- go_nbparams_mock_<type>.itp (mock for bookkeeping)
- missing_high_freq.itp (optional source for --add-missing)
- high_counts_per_frame.txt, go_counts_per_frame.txt
All intermediate .txt, .map, and input frames are moved to output_files/.
The final CG PDB stays in the run directory.
- For single proteins (monomers) use --type intra.
- For complexes, use --type inter or --type both:
  - inter: keeps all intra pairs plus only high-frequency inter pairs
  - intra: keeps all intra pairs plus only high-frequency intra pairs
  - both: keeps only high-frequency pairs (intra and inter)
- The script feeds martinize2 with -go <frame_X.map> corresponding to the same frame used by -f (PDB or CIF).
- If you see an MDAnalysis warning about missing element information, it is harmless for distance calculations.
The script logs the exact command line into run.log each time you run it.
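One way such command logging could be implemented is sketched below. The function name log_command and the choice to append one shell-quoted line per run are assumptions, not a description of the script's actual code.

```python
import shlex
import sys


def log_command(argv=None, log_path="run.log"):
    """Append the command line, shell-quoted, as one line of log_path."""
    argv = sys.argv if argv is None else argv
    with open(log_path, "a") as handle:
        handle.write(shlex.join(argv) + "\n")


if __name__ == "__main__":
    log_command()
```

Using shlex.join (Python 3.8+) keeps arguments containing spaces copy-pasteable from the log.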
If you use these tools, please cite the Gō-Martini and MARTINI references, and the contact_map generator repository linked above.