Scripts for reproducing the analyses of "Tracing Modern Breeding Introgressions in European Potato" Dent et al., 2025

schneebergerlab/PotatoMCAs

🥔 PotatoMCAs

Scripts and data for reproducing the analyses of
“Tracing Modern Breeding Introgressions in European Potato”
Dent et al., 2025


📁 Repository structure

data/

Contains the supplemental data files associated with the manuscript.

  • SupplementaryData1.CuratedPedigree.29Oct24.tsv alias: D1
    → Curated pedigree information generated for this study.

  • SupplementaryData2.SNPdosage.14Oct25.xlsx alias: D2
    → SNP genotype data described in Vos et al. (2015), used to trace introgression haplotypes. One sheet contains SNP dosages for all measured samples; a second sheet contains only the samples that could be assigned to a pedigree record.

  • CommonCatalogue_LeafList_02Feb23.tsv alias: D3
    → List of Common Catalogue varieties that could be assigned to a pedigree record (summary of Supplementary Table S3).

  • DM_pericentromeric_coords.chr.txt alias: D4
    → Positions of pericentromeres in the DM reference genome

  • Gebhardtcombined_cleaned_25Jul25.xlsx alias: D5
    → A 'tidied-up' version of Gebhardt (2023) Supplementary Tables 1-12

  • InfiniumBlast.DM.REF.cleaned.tsv.positions.tsv alias: D6
    → Results of a BLAST search of the Vos et al. (2015) SNP sequences against the DMv6.1 reference.

  • DM_1-3_516_R44_potato_genome_assembly.v6.1_main12_2col.chrsizes alias: D7
    → Chromosome sizes for the DM reference genome.

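A chromosome-sizes table such as D7 can be loaded into a dictionary before plotting along chromosomes. The sketch below assumes the file holds two whitespace-separated columns (chromosome name, then length in bp), which the `2col` in the filename suggests but this README does not confirm. The function name `read_chrom_sizes` and the toy values are illustrative only and are not taken from the repository's scripts.

```python
import io


def read_chrom_sizes(handle):
    """Parse a two-column chromosome-sizes table into {name: length}.

    Assumption: each non-empty line is '<chromosome> <length>' separated
    by whitespace; lines starting with '#' are treated as comments.
    """
    sizes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, length = line.split()[:2]
        sizes[name] = int(length)
    return sizes


# Toy input with made-up values (not the real DM v6.1 chromosome sizes).
example = io.StringIO("chr01\t1000\nchr02\t2500\n")
sizes = read_chrom_sizes(example)
print(sizes)
```

To read the real file, pass an open file handle instead of the `io.StringIO` object.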

scripts/

Analysis scripts used to process pedigree and SNP data.

  • 📜 RunPedigreeAssessment.ipynb
    → This Python notebook primarily takes D1 and D3 and produces summary data on the pedigree and the MCA analysis of European cultivars described in the first half of the manuscript.

  • GraphReader3_CountUnknownsByYear.py
    → Parses the curated pedigree file (D1) and counts unknown parentage over time. Generates data for Supplementary Table S1. Can be run within the RunPedigreeAssessment.ipynb notebook.

  • GraphReader3_CountByContinent.py
    → Parses the curated pedigree file (D1) and counts the number of varieties from Germany, the Netherlands, and Great Britain (and Europe as a whole) that have parents bred in the same country or continent. Generates data for Supplementary Table S2. Line 271 of the script defines the time period; edit it to see how the counts change over time. Can be run within the RunPedigreeAssessment.ipynb notebook.

  • GraphReader3_scoreUnique2_v2.py
    → Parses the curated pedigree file (D1) and the Common Catalogue varieties (D3) and calculates the Major Contributing Ancestors (MCAs) of the varieties in D3. Generates data for Supplementary Table S4 and scores for Figure 1b. Can be run within the RunPedigreeAssessment.ipynb notebook.

  • GraphReader3_RelationshipOfTop25MCAs2.py
    → Parses the curated pedigree file (D1) and the output of the above script, and reports the direct paths (i.e. ancestor/descendant relationships) through the pedigree connecting any of the top 25 Major Contributing Ancestors. Generates data used for drawing edges in Figure 1b. Can be run within the RunPedigreeAssessment.ipynb notebook for European MCAs, and within FindIntrogressedSNPs_21Oct25.ipynb for SNP cluster MCAs.

  • GraphReader3_FindContributionsOfVar.py
    → Parses the curated pedigree file (D1), the Common Catalogue varieties (D3), and a comma-separated list of varieties (e.g. "KATAHDIN,AM 66-42"), and summarises the number and depth of direct paths in the pedigree between the listed varieties and those in D3. Generates summary data for KATAHDIN reported in the main text.

  • 📜 FindIntrogressedSNPs_21Oct25.ipynb
    → This Python notebook primarily takes D2-D6 and identifies introgressed SNP clusters. It also generates the plots of introgressed SNP clusters used for Main Figures 2-4 and Supplemental Figures 3-14 of the manuscript, and runs the MCA analysis for these clusters.

  • GraphReader3_scoreUnique2_Introgressions_outDir.py
    → Adapted from GraphReader3_scoreUnique2_v2.py, this script calculates MCAs by first seeding SNP dosages into the pedigree. It is best run from within the FindIntrogressedSNPs_21Oct25.ipynb notebook.

  • Variety_multiChrom.py
    → Some class definitions used by the GraphReader scripts.

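Several of the scripts above report "direct paths" between an ancestor and a descendant in the pedigree. As a rough illustration of that idea, the sketch below enumerates every parent-to-child chain linking two varieties in a toy pedigree. It is not the repository's implementation: the `parents` mapping, the function name `direct_paths`, and the variety names A to E are hypothetical, and the code assumes the pedigree contains no cycles.

```python
def direct_paths(parents, descendant, ancestor):
    """Return every parent-to-child chain from ancestor down to descendant.

    parents maps a variety name to a tuple of its parents; None marks an
    unknown parent. Assumes an acyclic pedigree.
    """
    if descendant == ancestor:
        return [[ancestor]]
    paths = []
    for parent in parents.get(descendant, ()):
        if parent is None:  # unknown parentage: nothing to follow
            continue
        for path in direct_paths(parents, parent, ancestor):
            paths.append(path + [descendant])
    return paths


# Toy pedigree with made-up variety names.
toy = {
    "C": ("A", "B"),
    "D": ("A", None),
    "E": ("C", "D"),
}

paths = direct_paths(toy, "E", "A")
depths = [len(p) - 1 for p in paths]  # generations between A and E
print(paths)   # [['A', 'C', 'E'], ['A', 'D', 'E']]
print(depths)  # [2, 2]
```

Counting the returned paths and their lengths gives the kind of number and depth summary described for GraphReader3_FindContributionsOfVar.py, although the actual script may compute it differently.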

output/

Plots and tables will be written here when you run the Python notebooks above.


🧪 Usage

# Clone the repository
git clone https://github.com/schneebergerlab/PotatoMCAs.git
cd PotatoMCAs

# Open either of the Python notebooks (.ipynb) in Jupyter Notebook or JupyterLab
# Run each code cell in order
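
Before running a notebook, it can help to confirm that the data files listed above are present in the clone. The following is a minimal sketch: the filenames come from this README, while the helper name `missing_data_files` and the assumption that all files sit directly under `data/` are illustrative.

```python
from pathlib import Path

# Filenames as listed under data/ in this README (aliases D1 to D7).
EXPECTED_DATA_FILES = [
    "SupplementaryData1.CuratedPedigree.29Oct24.tsv",
    "SupplementaryData2.SNPdosage.14Oct25.xlsx",
    "CommonCatalogue_LeafList_02Feb23.tsv",
    "DM_pericentromeric_coords.chr.txt",
    "Gebhardtcombined_cleaned_25Jul25.xlsx",
    "InfiniumBlast.DM.REF.cleaned.tsv.positions.tsv",
    "DM_1-3_516_R44_potato_genome_assembly.v6.1_main12_2col.chrsizes",
]


def missing_data_files(repo_root="."):
    """Return the expected data files that are absent from <repo_root>/data."""
    data_dir = Path(repo_root) / "data"
    return [name for name in EXPECTED_DATA_FILES
            if not (data_dir / name).is_file()]


# Run from the repository root (the directory containing data/).
missing = missing_data_files(".")
if missing:
    print("Missing data files:")
    for name in missing:
        print("  " + name)
else:
    print("All expected data files found.")
```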
