bam diff seems to return the reads that look identical between A and B #59

@hisplan

Hi,

I have two BAM files that I'd like to compare. Each is about 5.6 GB. I expect them to be identical (I'm doing a reproducibility test).

When I ran the following command:

bam diff --in1 a.bam --in2 b.bam --all --onlyDiffs --recPoolSize -1 --out c.bam

It generated three files:

-rw-r--r-- 1    2373574 Jun 13 15:12 c.bam
-rw-r--r-- 1    1803478 Jun 13 15:12 c_only1_a.bam
-rw-r--r-- 1    1803105 Jun 13 15:12 c_only2_b.bam

I tried to see what actually differs between the two, but they look identical to me. My suspicion is that it has something to do with the multi-mapped reads. Do you have any idea how to resolve this?

$ samtools view c.bam | head -n1
:TAGATCGCAAATGGTA:CTTATCAAGCGC:;A00228:279:HFWFVDMXX:1:1229:8477:28792	0	1	629088	1	46M1I44M	*	0	0	GTCCGAACTAGTATCAGGCTTCAAAATCGAATACGCCGCAGGCCCCCTTCGCCCTATTCTTCATAGCAGAATACACAAACATTATTATAAT	,:FF,FFFFFF:,,FF:FFFFFFF,F::FFF,FF,FF:FFFF:FFF,:F:F,FFFFFF,F,FF:FFF,FFF:FFFFFFFFFFFFF:FFFFF	AS:i:78	HI:i:4	NH:i:4	nM:i:3	ZC:Z:42M1I48M	ZT:Z:AS:i:78;HI:i:3;NH:i:4;nM:i:3
$ samtools view a.bam | grep -F ":TAGATCGCAAATGGTA:CTTATCAAGCGC:;A00228:279:HFWFVDMXX:1:1229:8477:28792"
:TAGATCGCAAATGGTA:CTTATCAAGCGC:;A00228:279:HFWFVDMXX:1:1229:8477:28792	0	1	629088	1	42M1I48M	*	0	0	GTCCGAACTAGTATCAGGCTTCAAAATCGAATACGCCGCAGGCCCCCTTCGCCCTATTCTTCATAGCAGAATACACAAACATTATTATAAT	,:FF,FFFFFF:,,FF:FFFFFFF,F::FFF,FF,FF:FFFF:FFF,:F:F,FFFFFF,F,FF:FFF,FFF:FFFFFFFFFFFFF:FFFFF	NH:i:4	HI:i:3	AS:i:78	nM:i:3
:TAGATCGCAAATGGTA:CTTATCAAGCGC:;A00228:279:HFWFVDMXX:1:1229:8477:28792	0	1	629088	1	46M1I44M	*	0	0	GTCCGAACTAGTATCAGGCTTCAAAATCGAATACGCCGCAGGCCCCCTTCGCCCTATTCTTCATAGCAGAATACACAAACATTATTATAAT	,:FF,FFFFFF:,,FF:FFFFFFF,F::FFF,FF,FF:FFFF:FFF,:F:F,FFFFFF,F,FF:FFF,FFF:FFFFFFFFFFFFF:FFFFF	NH:i:4	HI:i:4	AS:i:78	nM:i:3
$ samtools view b.bam | grep -F ":TAGATCGCAAATGGTA:CTTATCAAGCGC:;A00228:279:HFWFVDMXX:1:1229:8477:28792"
:TAGATCGCAAATGGTA:CTTATCAAGCGC:;A00228:279:HFWFVDMXX:1:1229:8477:28792	0	1	629088	1	42M1I48M	*	0	0	GTCCGAACTAGTATCAGGCTTCAAAATCGAATACGCCGCAGGCCCCCTTCGCCCTATTCTTCATAGCAGAATACACAAACATTATTATAAT	,:FF,FFFFFF:,,FF:FFFFFFF,F::FFF,FF,FF:FFFF:FFF,:F:F,FFFFFF,F,FF:FFF,FFF:FFFFFFFFFFFFF:FFFFF	NH:i:4	HI:i:3	AS:i:78	nM:i:3
:TAGATCGCAAATGGTA:CTTATCAAGCGC:;A00228:279:HFWFVDMXX:1:1229:8477:28792	0	1	629088	1	46M1I44M	*	0	0	GTCCGAACTAGTATCAGGCTTCAAAATCGAATACGCCGCAGGCCCCCTTCGCCCTATTCTTCATAGCAGAATACACAAACATTATTATAAT	,:FF,FFFFFF:,,FF:FFFFFFF,F::FFF,FF,FF:FFFF:FFF,:F:F,FFFFFF,F,FF:FFF,FFF:FFFFFFFFFFFFF:FFFFF	NH:i:4	HI:i:4	AS:i:78	nM:i:3
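
One way to cross-check this independently of bam diff is to compare the two files as multisets of whole records, so that two alignments sharing a read name and position are never paired with each other. The following is only a minimal sketch and not part of bamUtil: the function names `normalize` and `multiset_diff` are hypothetical, and it assumes the input is plain SAM text (for example the output of `samtools view`) with tab-separated fields. Optional tags are sorted so that tag order alone does not count as a difference.

```python
from collections import Counter


def normalize(sam_line):
    """Return a hashable form of one SAM record.

    The 11 mandatory fields are kept in order; the optional tags
    (everything after column 11) are sorted so tag order is ignored.
    """
    fields = sam_line.rstrip("\n").split("\t")
    mandatory, tags = fields[:11], fields[11:]
    return tuple(mandatory) + tuple(sorted(tags))


def multiset_diff(lines_a, lines_b):
    """Compare two iterables of SAM lines as multisets of whole records.

    Returns (only_a, only_b): Counters of records that occur more often
    in A than in B, and more often in B than in A. Header lines
    (starting with '@') are skipped.
    """
    count_a = Counter(normalize(l) for l in lines_a if not l.startswith("@"))
    count_b = Counter(normalize(l) for l in lines_b if not l.startswith("@"))
    # Counter subtraction keeps only positive counts.
    return count_a - count_b, count_b - count_a
```

If both returned counters are empty, the two inputs hold the same records regardless of their order. Since every record is kept in memory, this is meant for a subset (a single region or a handful of read names), not for a full 5.6 GB file.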
