Scripts and data for reproducing the analyses of
“Tracing Modern Breeding Introgressions in European Potato”
Dent et al., 2025
Contains the supplemental data files associated with the manuscript.
- SupplementaryData1.CuratedPedigree.29Oct24.tsv (alias: D1)
  → Curated pedigree information generated for this study.
- SupplementaryData2.SNPdosage.14Oct25.xlsx (alias: D2)
  → SNP genotype data described in Vos et al. (2015), used to trace introgression haplotypes. One sheet contains SNP dosages for all measured samples; a second sheet contains only the samples that could be assigned to a pedigree record.
- CommonCatalogue_LeafList_02Feb23.tsv (alias: D3)
  → List of Common Catalogue varieties that could be assigned to a pedigree record (a summary of Supplementary Table S3).
- DM_pericentromeric_coords.chr.txt (alias: D4)
  → Positions of pericentromeres in the DM reference genome.
- Gebhardtcombined_cleaned_25Jul25.xlsx (alias: D5)
  → A tidied-up version of Gebhardt (2023) Supplementary Tables 1-12.
- InfiniumBlast.DM.REF.cleaned.tsv.positions.tsv (alias: D6)
  → BLAST results for the Vos et al. (2015) SNP sequences against the DMv6.1 reference.
- DM_1-3_516_R44_potato_genome_assembly.v6.1_main12_2col.chrsizes (alias: D7)
  → Chromosome sizes for the DM reference genome.
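As a rough illustration of how one of the smaller reference files might be read, the sketch below parses a two-column chromosome-sizes table into a dictionary. It assumes that D7 is tab-separated with no header, with the chromosome name in the first column and its length in the second, as the `2col` in the filename suggests. The helper `read_chrom_sizes` and the example values are hypothetical and are not taken from the repository.

```python
import csv
import io


def read_chrom_sizes(handle):
    """Parse a two-column (name<TAB>length) table into a dict.

    Assumption: tab-separated columns and no header line; adjust if the
    actual file differs.
    """
    sizes = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        name, length = row[0], int(row[1])
        sizes[name] = length
    return sizes


# Made-up example content standing in for the real file.
example = io.StringIO("chr01\t1000\nchr02\t2500\n")
chrom_sizes = read_chrom_sizes(example)
```

To read the actual file, pass an open file handle (for example `open(path, newline="")`) in place of the `io.StringIO` object.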
Analysis scripts used to process pedigree and SNP data.
- 📜 RunPedigreeAssessment.ipynb
  → This Python notebook primarily takes D1 and D3 and produces the summary data for the pedigree and Major Contributing Ancestor (MCA) analyses of European cultivars presented in the first half of the manuscript.
- GraphReader3_CountUnknownsByYear.py
  → Parses the curated pedigree file (D1) and counts unknown parentage over time. Generates data for Supplementary Table S1. Can be run within the RunPedigreeAssessment.ipynb notebook.
- GraphReader3_CountByContinent.py
  → Parses the curated pedigree file (D1) and counts the number of varieties from Germany, the Netherlands, Great Britain (and Europe as a whole) whose parents were bred in the same country/continent. Generates data for Supplementary Table S2. Line 271 defines the time period, in case you are interested in how this changes over time. Can be run within the RunPedigreeAssessment.ipynb notebook.
- GraphReader3_scoreUnique2_v2.py
  → Parses the curated pedigree file (D1) and the Common Catalogue varieties (D3) and calculates the Major Contributing Ancestors of the varieties in D3. Generates data for Supplementary Table S4 and the scores for Figure 1b. Can be run within the RunPedigreeAssessment.ipynb notebook.
- GraphReader3_RelationshipOfTop25MCAs2.py
  → Parses the curated pedigree file (D1) and the output of the above script, and reports the direct paths (i.e. ancestor/descendant relationships) through the pedigree that connect any of the top 25 Major Contributing Ancestors. Generates the data used for drawing edges in Figure 1b. Can be run within the RunPedigreeAssessment.ipynb notebook for European MCAs, and within FindIntrogressedSNPs_21Oct25.ipynb for SNP cluster MCAs.
- GraphReader3_FindContributionsOfVar.py
  → Parses the curated pedigree file (D1), the Common Catalogue varieties (D3), and a comma-separated list of varieties (e.g. "KATAHDIN,AM 66-42"), and summarises the number and depth of direct paths in the pedigree between the listed varieties and those in D3. Generates the summary data for KATAHDIN reported in the main text.
- 📜 FindIntrogressedSNPs_21Oct25.ipynb
  → This Python notebook primarily takes D2-D6 and identifies introgressed SNP clusters. It also generates the plots of introgressed SNP clusters used for Main Figures 2-4 and Supplemental Figures 3-14 of the manuscript, and runs the MCA analysis for these clusters.
- GraphReader3_scoreUnique2_Introgressions_outDir.py
  → Adapted from GraphReader3_scoreUnique2_v2.py, this script calculates MCAs by first seeding SNP dosages into the pedigree. It is best run from within the Python notebook above.
- Variety_multiChrom.py
  → Some class definitions used by the GraphReader scripts.
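To make the idea of "direct paths" between an ancestor and a descendant concrete, here is a minimal sketch of enumerating such paths in a small pedigree and summarising their number and depth. It is not the repository's implementation: the function `direct_paths`, the dictionary layout, and the variety names are all hypothetical, and the pedigree is assumed to be acyclic.

```python
def direct_paths(parents, descendant, ancestor):
    """Return every parent-by-parent path from descendant up to ancestor.

    `parents` maps a variety name to a list of its known parents.
    Assumption: the pedigree contains no cycles.
    """
    if descendant == ancestor:
        return [[descendant]]
    paths = []
    for parent in parents.get(descendant, []):
        for tail in direct_paths(parents, parent, ancestor):
            paths.append([descendant] + tail)
    return paths


# Hypothetical toy pedigree (placeholder names, not records from D1).
toy_parents = {
    "C": ["A", "B"],
    "D": ["C", "A"],
    "E": ["D", "B"],
}

# All direct paths from variety E back to ancestor A.
paths = direct_paths(toy_parents, "E", "A")

# Depth of a path = number of generations separating the two varieties.
depths = sorted(len(p) - 1 for p in paths)
```

In this toy example, `E` reaches `A` by two routes (via `D` directly, and via `D` and `C`), so `len(paths)` gives the number of direct paths and `depths` lists how many generations each one spans. For a large pedigree, plain recursion like this can become slow, and memoisation or an iterative traversal may be preferable.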
Plots and tables will be written here if you run the python notebook above.
# Clone the repository
git clone https://github.com/schneebergerlab/PotatoMCAs.git
cd PotatoMCAs
# Open either of the Python notebooks (.ipynb) in Jupyter Notebook or JupyterLab
# Run each cell in order