Phase block number compared to WhatsHap #17

Open
@MichelMoser

Hello, thank you for the great tool!

I have just been testing HapDup v0.7 on our fish genome.
Comparing its output with phasing from WhatsHap (WH), I noticed a large difference in phase block size and phase block number between HapDup and the WH pipeline, and I am wondering why.

For the fish chromosomes, WH generated 679 blocks from 2'689'114 phased SNPs.
Margin (HapDup pipeline) generated 5352 blocks from 3'862'108 phased SNPs.
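
To put the two results on a common scale, the counts above can be turned into an average number of phased SNPs per block. This is only a rough sketch: the helper name `snps_per_block` is made up for illustration, and SNPs per block is a proxy for contiguity, not the block length in base pairs.

```python
def snps_per_block(phased_snps: int, blocks: int) -> float:
    """Average number of phased SNPs in one phase block."""
    return phased_snps / blocks


# Counts as reported above for the fish chromosomes.
wh = snps_per_block(2_689_114, 679)
margin = snps_per_block(3_862_108, 5352)

print(f"WhatsHap: {wh:.0f} SNPs/block")
print(f"Margin:   {margin:.0f} SNPs/block")
print(f"Ratio:    {wh / margin:.1f}x")
```

By this measure, WH blocks hold roughly 3960 phased SNPs on average against roughly 722 for Margin, about a 5.5-fold difference, even though Margin phases more SNPs in total.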

The main differences seem to be the read filtering before phasing and the use of MarginPhase for phasing in HapDup, but do these explain such a large gap?

I was wondering whether the HapDup phase blocks could be joined using the WhatsHap SNP and block information to increase contiguity.
I imagine a straightforward approach: intersect the SNP positions phased by both Margin and WH, compare their phase block IDs, and lift over the phase block IDs from WH.
I will do some visual inspection and scripting to test whether the called SNPs overlap and whether the two tools agree on block borders.
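
The lift-over idea could be sketched roughly as follows. This is a minimal illustration, not part of HapDup or WhatsHap: it assumes the phased heterozygous SNPs of each tool have already been parsed from the VCFs into dictionaries mapping `(chrom, pos)` to `(phase_set_id, alt_haplotype)`, where `alt_haplotype` is 0 for `1|0` and 1 for `0|1`. The function name `lift_over_phase_sets`, the `min_shared` threshold, and the `wh:`/`margin:` label prefixes are all hypothetical. Each Margin block is assigned to the WH block it shares the most SNPs with, and its orientation is decided by majority vote over those shared SNPs.

```python
from collections import Counter, defaultdict


def lift_over_phase_sets(margin, whatshap, min_shared=2):
    """Relabel Margin phase blocks with the WH block they share SNPs with.

    Both inputs map (chrom, pos) -> (phase_set_id, alt_haplotype).
    Returns the Margin SNPs with new block labels and, where needed,
    flipped haplotypes so that they match the WH orientation.
    """
    # For every Margin block, count shared SNPs per (WH block, same orientation?).
    votes = defaultdict(Counter)
    for site, (m_ps, m_hap) in margin.items():
        if site in whatshap:
            w_ps, w_hap = whatshap[site]
            votes[m_ps][(w_ps, m_hap == w_hap)] += 1

    # Choose one WH block and one orientation per Margin block.
    mapping = {}
    for m_ps, counter in votes.items():
        shared_per_block = Counter()
        for (w_ps, _), n in counter.items():
            shared_per_block[w_ps] += n
        w_ps, shared = shared_per_block.most_common(1)[0]
        same = counter[(w_ps, True)]
        flipped = counter[(w_ps, False)]
        if shared < min_shared or same == flipped:
            continue  # too little or contradictory evidence: keep as is
        mapping[m_ps] = (w_ps, flipped > same)

    # Apply the mapping; prefixes keep the two ID namespaces apart.
    merged = {}
    for site, (m_ps, m_hap) in margin.items():
        if m_ps in mapping:
            w_ps, flip = mapping[m_ps]
            merged[site] = (f"wh:{w_ps}", 1 - m_hap if flip else m_hap)
        else:
            merged[site] = (f"margin:{m_ps}", m_hap)
    return merged


# Toy example: two Margin blocks fall inside one WH block, a third has no overlap.
margin_calls = {
    ("chr1", 100): (100, 0), ("chr1", 200): (100, 1),
    ("chr1", 300): (300, 1), ("chr1", 400): (300, 0),
    ("chr1", 500): (500, 0),
}
wh_calls = {
    ("chr1", 100): (100, 0), ("chr1", 200): (100, 1),
    ("chr1", 300): (100, 0), ("chr1", 400): (100, 1),
}
merged_calls = lift_over_phase_sets(margin_calls, wh_calls)
print(len({block for block, _ in merged_calls.values()}), "blocks after lift-over")
```

A Margin block that spans several WH blocks is attached to only one of them here, which is a simplification. The number of blocks with split votes (non-zero counts for both orientations) would be a direct measure of how often the two tools disagree, so it may be worth reporting before trusting the joined blocks.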

Cheers,
Michel
