SFGLab/Polymer_model_benchmark

Chromatin model comparison and validation

4D Nucleome Hackathon 2024

In March 2024, at the University of Washington in Seattle (USA), we carried out a project addressing two challenges in functional and structural genomics: the comparison and the validation of chromatin models. During the hackathon we developed a workflow for both tasks: each model is converted to a distance matrix, and Spearman correlation coefficients are calculated between pairs of matrices to estimate how strongly the models correlate. To test the workflow, we ran 5 distinct chromatin modeling software packages. Our results are described in Kubica et al. (2025) [1].

Repository structure

  • run_sims - instructions on how to run each software package

  • analysis:

  • yamls - conda environment files (one YAML file per software package / team member)

  • scratch - playground notebooks that we used during the hackathon

  • Model_Validation - scripts and notebooks for post-hackathon model validation

Workflow

  1. Run the software and collect chromatin models (in XYZ, PDB or CIF format). If the output constitutes an ensemble of models, proceed with the average over the ensemble.
  2. Interpolate the models to the same number of coordinates (e.g., n=214) and calculate distance matrices (make_dist_mats.ipynb).
  3. Calculate Spearman correlation coefficients between pairs of distance matrices (process_distance_matrices.ipynb).
  4. Compare the correlation coefficients to estimate the differences between the models.
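
Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the code from make_dist_mats.ipynb or process_distance_matrices.ipynb. The function names are hypothetical, and linear interpolation along the bead index and Euclidean distances are assumptions; the notebooks may do this differently.

```python
import numpy as np
from scipy.stats import spearmanr


def interpolate_model(coords, n_points=214):
    """Resample an (N, 3) chain of coordinates to n_points beads.

    Assumption: linear interpolation along the bead index.
    """
    coords = np.asarray(coords, dtype=float)
    old = np.linspace(0.0, 1.0, len(coords))
    new = np.linspace(0.0, 1.0, n_points)
    # Interpolate the x, y and z coordinates independently.
    return np.column_stack(
        [np.interp(new, old, coords[:, k]) for k in range(3)]
    )


def distance_matrix(coords):
    """Pairwise Euclidean distances between all beads of one model."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def spearman_between(dist_a, dist_b):
    """Spearman correlation of two distance matrices of equal shape.

    Only the upper triangle (without the diagonal) is used, because the
    matrices are symmetric and the diagonal is always zero.
    """
    iu = np.triu_indices_from(dist_a, k=1)
    rho, _ = spearmanr(dist_a[iu], dist_b[iu])
    return rho


if __name__ == "__main__":
    # Two synthetic random-walk "models" with different bead counts.
    rng = np.random.default_rng(0)
    model_a = np.cumsum(rng.normal(size=(300, 3)), axis=0)
    model_b = np.cumsum(rng.normal(size=(150, 3)), axis=0)

    mats = [distance_matrix(interpolate_model(m)) for m in (model_a, model_b)]
    print("Spearman rho:", spearman_between(mats[0], mats[1]))
```

Because Spearman correlation depends only on ranks, the result does not change when one model is uniformly rescaled. This is convenient when packages report coordinates in different units.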

Software packages used during the hackathon

Data used during the hackathon

Genomic region of interest: chr1:178421513-179491193
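
To get a feel for the scale of this region, the short sketch below parses the coordinate string and relates its length to the example value n=214 from the workflow. The helper name `parse_region` is hypothetical. Reading the result as a per-point resolution is an assumption, since the source does not state how n was chosen.

```python
def parse_region(region):
    """Split a 'chrom:start-end' string into its three parts."""
    chrom, span = region.split(":")
    start, end = (int(x) for x in span.split("-"))
    return chrom, start, end


chrom, start, end = parse_region("chr1:178421513-179491193")
length_bp = end - start          # 1069680 bp, i.e. about 1.07 Mb
bp_per_point = length_bp / 214   # roughly 5 kb per interpolated point

print(f"{chrom}: {length_bp} bp, ~{bp_per_point:.0f} bp per point at n=214")
```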

4DN Data Portal (https://data.4dnucleome.org/)

  • Hi-C: 4DNES4AABNEZ, 4DNESNMAAN97
  • SPRITE: 4DNESI1U7ZW9

ENCODE (https://www.encodeproject.org/)

  • Hi-C: ENCSR968KAY
  • ChIA-PET: ENCSR184YZV_CTCF_ChIAPET (ENCFF379AWZ.hic), ENCSR764VXA
  • ChIP-Seq: ENCSR000DZN_CTCF, ENCSR000DZP_SMC

Rao et al., 2014

  • Hi-C: GSE63525 (for Tier 1 cell line GM12878)

References

Footnotes

  1. Kubica J, Korsak S, Banecki KH, Schirman D, Yadavalli AD, Brenner Clerkin A, et al. (2025) The challenge of chromatin model comparison and validation: A project from the first international 4D Nucleome Hackathon. PLoS Comput Biol 21(8): e1013358. https://doi.org/10.1371/journal.pcbi.1013358

  2. Korsak, S., & Plewczynski, D. (2024). LoopSage: An energy-based Monte Carlo approach for the loop extrusion modeling of chromatin. Methods, 223, 106–117. https://doi.org/10.1016/j.ymeth.2024.01.015

  3. Di Pierro, M., Zhang, B., Aiden, E. L., Wolynes, P. G., & Onuchic, J. N. (2016). Transferable model for chromosome architecture. Proceedings of the National Academy of Sciences, 113(43), 12168–12173. https://doi.org/10.1073/pnas.1613607113

  4. Shi, G., & Thirumalai, D. (2023). A maximum-entropy model to predict 3D structural ensembles of chromatin from pairwise distances with applications to interphase chromosomes and structural variants. Nature Communications, 14(1), 1150. https://doi.org/10.1038/s41467-023-36412-4

  5. Shinkai, S., Itoga, H., Kyoda, K., & Onami, S. (2022). PHi-C2: Interpreting Hi-C data as the dynamic 3D genome state. Bioinformatics, 38(21), 4984–4986. https://doi.org/10.1093/bioinformatics/btac613

  6. Korsak, S., Banecki, K., & Plewczynski, D. (2024). Multiscale molecular modelling of chromatin with MultiMM: From nucleosomes to the whole genome. Cold Spring Harbor Laboratory. http://dx.doi.org/10.1101/2024.07.26.605260
