Input sequence changed #184

Open
aldendirks opened this issue Mar 21, 2025 · 2 comments

Comments

@aldendirks

Hello,

I'm attempting to run R2DT with the web GUI on a 1,193 bp DNA sequence. The output sequence changes from the long sequence to just

>description
TXXXX

Why is this?

Thanks for your help!

@AntonPetrov
Member

Thank you for creating this issue! The XXXX symbols are used to mask large insertions relative to the template (large is defined as over 100 nucleotides long).

In this case something clearly has gone wrong. I suspect that there is no good template and R2DT is trying to use some poorly matching template with a giant insertion.
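To illustrate how a single huge insertion could collapse the output to `TXXXX`, here is a minimal sketch of the masking rule described above. It is not R2DT's actual code: the function name, the gapped-string alignment representation (`-` for a gap), and the parameter names are all assumptions made for illustration. Only the "over 100 nucleotides" threshold and the `XXXX` placeholder come from the explanation above.

```python
def mask_large_insertions(aligned_query, aligned_template, threshold=100, mask="XXXX"):
    """Hypothetical sketch: replace every run of query nucleotides that is
    aligned to template gaps ('-') and is longer than `threshold` with `mask`."""

    def flush(run):
        # Runs at or below the threshold are kept; longer runs are masked.
        return mask if len(run) > threshold else "".join(run)

    out = []
    run = []  # current run of inserted nucleotides (query base, template gap)
    for q, t in zip(aligned_query, aligned_template):
        if t == "-" and q != "-":
            run.append(q)
        else:
            out.append(flush(run))
            run = []
            if q != "-":  # skip deletions relative to the template
                out.append(q)
    out.append(flush(run))
    return "".join(out)


# One matched nucleotide followed by a 150 nt insertion relative to the template.
query = "T" + "A" * 150
template = "T" + "-" * 150
print(mask_large_insertions(query, template))  # TXXXX
```

Under this reading, an output of `TXXXX` would mean that almost the whole input was treated as one insertion relative to the chosen template.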

I would be happy to look into it if you paste the sequence here or email it to me ([email protected]). Thanks again for using R2DT!

@aldendirks
Author

aldendirks commented Mar 21, 2025

Thank you, here is the sequence:

TTAAGCGGGGCGCAATCCTGGCTCCTTGGCACTGGCCAACGTAAAACGGCAGAATGGGAAATAGGGAAAGGAATGGAAGGAAATTGAGGGGAAAAAGGAGGGTTTAGTTTTAAGAAAAAGCTCACTTTGTTGTCTTTGTCTTTTTTGGAGTCCTTTTTTTACAACTTTCGCTGTGTTGACCAATGACTCCAGGGAATTAAGCGGGGCGCAATCCTGGCTCCTTGGCACTGGCCAACGTAAAATGGCTGAATGGGAAATAGGGAAAGGAACAGAGAAAGTATTGACAAGAAAGGGGGCTTTTTTTTCGGTAATTGTGAACTTACATATTTTGGTGTCTTGACAAAGTATAAAAAGGAAGAAAGGAAAAAAGAAAGGACAGGCATTTGTGTTTGTTTTTCCACTTTTCTTTTCTTTTTTGTTTTTTTGGACTGTGAAAAGCTGGACTTTATCTTTTTGTGACAAGCATTTCTTTGCTTCTTTTGTGTTTCTTATGGCAGGATCTTGGTTTAAAAAGAAAAAGGAAGAAAAAAGAAAAAGGGAGGGAAAGAAAGAAGAGGGTCTTTGGAAGTGAAAGCTTTTGTTTGTTTTATGGCTTGTTTTTCTTTTTCATTCTTTCGGCTTTCTTTTGTAGAGTGTTTTCTTTGCTTTGTCTTTTTCTTTTTTTTGGGACGACAGCGTGCAAGTACATCTTTTCTTCTTTGCAAAAAGGAGAAAAAAGAGGAAAGGCGTTGTTTTTCAAAAAAAAAAGAAATGGGCAAGGCAGACTTTTTGTATTTTTTCGATTTTTTAACTTGCAAGTACATCTCTGCTTCTGTAAAAGGAAAAAGAGAGAGGAAAGGAGTTGGAAAAGAAAAAGAAATACAAAAGGGAAAAGGAGAAGATACAAAGCAAGAGGGAAAAGAAAGGATGAAAGGGGAAGATGGGTCCATGGAATTGACAAGAGTTTTCACCAAGAAGGTTTTCCAATTCTTTTCCAATGGTTTTCTTTCTATTCTTTTTTCTTTTTGGAAAAAGGTTCTGTCATGAAAAAGGCAAGAGACAAATGTGTTGGGAAGACTTTGCAAAAACAAAAAGAAAGGGAACAGAAAATAGCAAGAAGAAAAGCAAAAAAGAAAAGAACAAAAGGGGGGGAAAGCGAGGGCTTGTGTTTGTTTTTTTCTTTTTCTCTCTATTGCTTTTTTGCTTTGTTTTGA

The reality is that this sequence encompasses two ORFs that I'm running together as one sequence. Both have a predicted structure according to the Plasmodium RNase P template. Since they were neighboring (separated by 8 bp), I tried running them as one out of curiosity.

First predicted RNase P

TTAAGCGGGGCGCAATCCTGGCTCCTTGGCACTGGCCAACGTAAAACGGCAGAATGGGAAATAGGGAAAGGAATGGAAGGAAATTGAGGGGAAAAAGGAGGGTTTAGTTTTAAGAAAAAGCTCACTTTGTTGTCTTTGTCTTTTTTGGAGTCCTTTTTTTACAACTTTCGCTGTGTTGACCAATGACTCCAGGGAATTAAGCGGGGCGCAATCCTGGCTCCTTGGCACTGGCCAACGTAAAATGGCTGAATGGGAAATAGGGAAAGGAACAGAGAAAGTATTGACAAGAAAGGGGGCTTTTTTTTCGGTAATTGTGAACTTACATATTTTGGTGTCTTGACAAAGTATAAAAAGGAAGAAAGGAAAAAAGAAAGGACAGGCATTTGTGTTTGTTTTTCCACTTTTCTTTTCTTTTTTGTTTTTTTGGACTGTGAAAAGCTGGACTTTATCTTTTTGTGACAA

8 bp divider

GCATTTCT

Second predicted RNase P

TTGCTTCTTTTGTGTTTCTTATGGCAGGATCTTGGTTTAAAAAGAAAAAGGAAGAAAAAAGAAAAAGGGAGGGAAAGAAAGAAGAGGGTCTTTGGAAGTGAAAGCTTTTGTTTGTTTTATGGCTTGTTTTTCTTTTTCATTCTTTCGGCTTTCTTTTGTAGAGTGTTTTCTTTGCTTTGTCTTTTTCTTTTTTTTGGGACGACAGCGTGCAAGTACATCTTTTCTTCTTTGCAAAAAGGAGAAAAAAGAGGAAAGGCGTTGTTTTTCAAAAAAAAAAGAAATGGGCAAGGCAGACTTTTTGTATTTTTTCGATTTTTTAACTTGCAAGTACATCTCTGCTTCTGTAAAAGGAAAAAGAGAGAGGAAAGGAGTTGGAAAAGAAAAAGAAATACAAAAGGGAAAAGGAGAAGATACAAAGCAAGAGGGAAAAGAAAGGATGAAAGGGGAAGATGGGTCCATGGAATTGACAAGAGTTTTCACCAAGAAGGTTTTCCAATTCTTTTCCAATGGTTTTCTTTCTATTCTTTTTTCTTTTTGGAAAAAGGTTCTGTCATGAAAAAGGCAAGAGACAAATGTGTTGGGAAGACTTTGCAAAAACAAAAAGAAAGGGAACAGAAAATAGCAAGAAGAAAAGCAAAAAAGAAAAGAACAAAAGGGGGGGAAAGCGAGGGCTTGTGTTTGTTTTTTTCTTTTTCTCTCTATTGCTTTTTTGCTTTGTTTTGA

I actually don't know if these are RNase Ps; it's just that, when testing ORFs identified in a very long ITS1 region, these two seemed to have some predicted secondary structure according to that template. I'm not sure whether that is an appropriate way to interpret the output, though.
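For anyone who wants to reproduce this, the combined sequence can be split back into the two regions and written as separate FASTA records so that each one is submitted on its own. This is only a sketch: the helper names and the record names are made up for illustration, and the only input it relies on is the 8 bp divider quoted above.

```python
def split_on_divider(sequence, divider):
    """Split `sequence` into the parts before and after a unique `divider`."""
    if sequence.count(divider) != 1:
        # Refuse to guess if the divider is missing or occurs more than once.
        raise ValueError("divider must occur exactly once in the sequence")
    first, _, second = sequence.partition(divider)
    return first, second


def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text, wrapped at `width` columns."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy example; replace `combined` with the full sequence from this comment.
combined = "TTAAGCGG" + "GCATTTCT" + "TTGCTTCT"
first, second = split_on_divider(combined, "GCATTTCT")
print(to_fasta([("region_1", first), ("region_2", second)]))
```

The uniqueness check is there because a short divider can occur by chance elsewhere in a long, low-complexity sequence; in that case, splitting by coordinates would be safer.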
