Aligning short terminal exons #1038

MichaelHiller opened this issue Apr 3, 2023 · 6 comments

Comments

@MichaelHiller

Dear Heng and team,

We are aligning polished IsoSeq reads against genomes, and I noticed a few cases where the last exon (13 bp) was soft-clipped rather than aligned. However, the exon aligns perfectly if one allows a ~370 bp intron with GTG ... AAG consensus splice sites.

Blat correctly places the terminal 13 bp exon (second block), while minimap2 stops the alignment at the end of the first block in this image:
image

We are calling minimap2 even with
minimap2 --eqx -a -c -t $nThreats -ax splice:hq -uf --secondary=no -C5 -o ${P_outMinimap2}/ALL.CuP.aln.sam --junc-bed ${ref_annot} -cs long ${genome_fa} ${P_out_isoC}/ALL.CuP.fasta.gz

Is there a way to increase sensitivity so that such terminal exons are aligned correctly? I would be happy to spend more runtime to get the right alignment.
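To see how many reads are affected, one could scan the SAM output for short terminal soft clips. The following is only a minimal sketch: the function names and the `max_clip` threshold of 30 bp are assumptions made for illustration, not part of minimap2. It reports both leading and trailing clips, because for a gene on the reverse strand the terminal exon appears at the start of the CIGAR.

```python
import re

# One CIGAR operation: a length followed by an operator letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def terminal_soft_clips(cigar):
    """Return (leading, trailing) soft-clip lengths of a SAM CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) may flank soft clips; ignore them.
    ops = [(n, op) for n, op in ops if op != "H"]
    lead = ops[0][0] if ops and ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return lead, trail


def short_clipped_reads(sam_lines, max_clip=30):
    """Yield (read_name, leading, trailing) for reads with a short terminal clip.

    max_clip is a hypothetical cutoff; short clips are the ones that could
    plausibly be a missed terminal exon.
    """
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":  # unmapped or malformed
            continue
        lead, trail = terminal_soft_clips(fields[5])
        if 0 < lead <= max_clip or 0 < trail <= max_clip:
            yield fields[0], lead, trail
```

For example, passing an open file handle for the SAM file to `short_clipped_reads` would list the candidate reads that might deserve a second look.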

Thx a lot
Michael

@lh3
Owner

lh3 commented Apr 3, 2023

Do you have the sequences? It might be possible to tune parameters to get the alignment.

@MichaelHiller
Author

Sure, I attach the read and the genomic context around that gene. Running this reduced example gives the same alignment.
read.fa.gz
genome.fa.gz

@lh3
Owner

lh3 commented Apr 8, 2023

Thanks for the example. It is not possible to tune parameters to get the right alignment. I will keep this issue open and think more about it in the future.

lh3 added the enhancement label Apr 8, 2023
@MichaelHiller
Author

Alright, thanks a lot for looking into this.

Conceptually, HISAT2 (and other mappers?) builds a list of candidate downstream exon positions inferred from the mappings of other reads (in our case these are already given via --junc-bed ${ref_annot}) and then checks whether short terminal exons align to these candidates.
This could be a postprocessing step for reads whose terminal parts remain unaligned.
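The postprocessing idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not how HISAT2 or minimap2 actually implement it: `rescue_terminal_exon` and its parameters are hypothetical, the candidate exon starts are assumed to have been parsed from the junction annotation beforehand, only the forward strand and a clipped 3' tail are handled, and matching is a plain mismatch count rather than a real alignment.

```python
def rescue_terminal_exon(genome, aln_end, clipped, acceptors, max_mismatch=0):
    """Try to place a soft-clipped read tail at a known downstream exon start.

    genome       : reference sequence of one contig (str), forward strand
    aln_end      : 0-based exclusive end of the aligned part on the reference
    clipped      : the soft-clipped read tail (str)
    acceptors    : iterable of 0-based candidate exon start positions
                   (hypothetical input, e.g. derived from a junction BED file)
    max_mismatch : number of mismatches tolerated in the placed tail

    Returns (exon_start, intron_length, mismatches) for the best candidate,
    or None if no candidate is within max_mismatch.
    """
    best = None
    for start in sorted(acceptors):
        if start <= aln_end:
            continue  # only consider exons downstream of the alignment end
        window = genome[start:start + len(clipped)]
        if len(window) < len(clipped):
            continue  # candidate runs off the end of the contig
        mm = sum(a != b for a, b in zip(window.upper(), clipped.upper()))
        # Strict "<" keeps the closest candidate when mismatch counts tie.
        if mm <= max_mismatch and (best is None or mm < best[2]):
            best = (start, start - aln_end, mm)
    return best
```

A caller would run this only for reads with a short terminal soft clip and, on success, extend the alignment with an intron of the returned length followed by the matched exon. Because a 13 bp sequence can match by chance, restricting the search to annotated candidates (rather than scanning the whole downstream region) is what keeps such a rescue step specific.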

@nextgenusfs

nextgenusfs commented Apr 14, 2025

I wrote gapmm2 to address a similar issue; it tries to re-align the terminal exons using edlib. I'm sure there are other ways to do this as well. https://github.com/nextgenusfs/gapmm2

@MichaelHiller
Author

Great. We will give it a try. Thanks for posting it.
