mjsull edited this page Jun 17, 2015 · 1 revision

DiscoPlot of a mock genome. A mock genome was created by adding genomic rearrangements to the chromosome of E. coli str. UTI89. Paired-end reads were generated from the mock genome (query) with GemSim (ref) and mapped back to UTI89 (reference). The first ~500 Kbp were then visualised using DiscoPlot.

DiscoPlots of structural variants

DiscoPlots of common structural variants. Each box shows a common genomic rearrangement represented by a DiscoPlot. Rows A and B were created using 100 bp paired-end reads with an insert size of 300 bp. Rows C and D were created using single-end reads with an average length of 1000 bp. For each box, the rearrangement in the sequenced genome is listed, followed by the scale of the gridlines in brackets. A1, C1: 300 bp deletion (400 bp). A2, C2: 300 bp insertion (400 bp). A3, C3: 300 bp inversion (400 bp). A4, C4: 300 bp sequence translocated 50 Kbp upstream (10 Kbp). B1, D1: 3000 bp deletion (1000 bp). B2, D2: 3000 bp insertion (500 bp). B3, D3: 3000 bp inversion (1000 bp). B4, D4: 3000 bp sequence translocated 50 Kbp upstream (10 Kbp).

DiscoPlot of E. coli genome

The dynamic nature of the genome of Escherichia coli str. UTI89. DiscoPlot of paired-end reads from a clonal culture of UTI89 mapped back to the published reference chromosome and plasmid. Coordinates from 0 to 5,065,741 represent the chromosome of E. coli UTI89; coordinates ≥ 5,066,000 represent the plasmid of E. coli UTI89.
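The shared axis described in this caption can be read back into per-replicon positions. The following is a minimal sketch, not part of DiscoPlot itself: the function name `locate` is hypothetical, and it assumes that plasmid positions are offset by exactly 5,066,000, which the caption implies but does not state outright.

```python
# Boundaries taken from the caption above.
CHROMOSOME_END = 5_065_741  # last coordinate assigned to the chromosome
PLASMID_START = 5_066_000   # first coordinate assigned to the plasmid


def locate(coord):
    """Return (replicon, local_coordinate) for a position on the plot axis.

    Hypothetical helper for illustration only.
    """
    if coord < 0:
        raise ValueError("coordinate must be non-negative")
    if coord <= CHROMOSOME_END:
        return ("chromosome", coord)
    if coord >= PLASMID_START:
        # Assumption: the plasmid is shifted by exactly PLASMID_START.
        return ("plasmid", coord - PLASMID_START)
    # Positions between the two ranges belong to neither replicon.
    return ("gap", None)


print(locate(919_638))    # a chromosomal position
print(locate(5_066_100))  # a plasmid position under the offset assumption
```

Under these assumptions, any point plotted between 5,065,742 and 5,065,999 would fall in padding between the two replicons rather than on either sequence.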

DiscoPlot of E. coli genome

Discordant reads in E. coli str. UTI89.

a) Read alignment indicates inversion of bases 919,638..922,323. A 12 bp inverted repeat is present at the termini of the region. The start and end of the inverted region lie within two probable tail fibre protein genes. Two additional tail fibre assembly proteins are encoded within the boundaries of this region. The region is immediately downstream of a putative DNA invertase gene.

b, f, h, i) Reads are misaligned: they map equally well to a concordant position.

c) Read alignment indicates circularisation of bases 1,653,000..1,662,603. 17 bp direct repeats are present at the termini of this region. The region also encodes five putative phage-related membrane proteins, two putative phage proteins, three phage hypothetical proteins, four hypothetical proteins and a single putative phage-related secreted protein. The size of the crosses indicates that coverage of this region is higher than average. Only a single read (indicated by the cross, top left) indicates potential excision of this region.

d) Read alignments indicate inversion of bases 2,109,690..2,114,003. The region has a ~100 bp inverted repeat at its termini, which encodes a tRNA. The region encodes three hypothetical proteins and an additional tRNA identical to the repeats. A P4-phage integrase is present immediately downstream of the inversion. The lack of concordantly mapping reads at the prophage boundary indicates that the inverted phage has reached fixation in the population.

e) Reads indicate inversion of bases 2,906,008..2,906,936. 15 bp inverted repeats are present at the termini of this region. The 3’ end of a putative tail fibre assembly gene is encoded by this region.

g) Read alignments indicate inversion of bases 4,907,424..4,907,737. The region has a 9 bp inverted repeat at its termini. It is located in a non-coding region between fimA and fimE, which encode the type I fimbriae.
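The sizes of the rearranged regions follow directly from the coordinates in the caption. This short sketch works them out, assuming the `start..end` ranges are 1-based and inclusive (the usual convention for this notation, but not stated on this page). The names `REGIONS` and `span_length` are illustrative only.

```python
# Panel label -> (start, end), copied from the caption above.
REGIONS = {
    "a": (919_638, 922_323),      # inversion
    "c": (1_653_000, 1_662_603),  # circularisation
    "d": (2_109_690, 2_114_003),  # inversion
    "e": (2_906_008, 2_906_936),  # inversion
    "g": (4_907_424, 4_907_737),  # inversion
}


def span_length(start, end):
    """Length in bp of an inclusive start..end range (assumed convention)."""
    return end - start + 1


lengths = {panel: span_length(*coords) for panel, coords in REGIONS.items()}

for panel, length in lengths.items():
    print(f"{panel}) {length} bp")
```

With inclusive coordinates, the regions span roughly 0.3 Kbp (panel g) to 9.6 Kbp (panel c).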
