DeepPrimeGuideRNA: reverse-complement of all sequences in input does not always give same output? #90

@francoiskroll

Sorry, I think I'm almost done with all my sanity checks 😃. Just want to make sure I'm using the tool correctly before doing anything substantial.

With DeepPrimeGuideRNA(...), why does it matter—sometimes, but not always—whether one works on the 5'–3' genome strand or 3'–5' genome strand?

If I take the example from documentation:

## target in 5'–3' genome direction ##
# PBS & RT on 3'–5' strand
# i.e. exactly the example in documentation
target='ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG'
pbs='GGCAAGGGTGT'
rtt='CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAA'
edit_len=1
edit_pos=34
edit_type='sub'
peg = DeepPrimeGuideRNA('pegRNA_test', target=target, pbs=pbs, rtt=rtt, 
                           edit_len=edit_len, edit_pos=edit_pos, edit_type=edit_type)
score = peg.predict('PE2')
print(score) # 0.09

## reverse-complement of all three sequences, so target is now in 3'–5' genome direction ##
target='CATTTGCAGGTTATAGTTCTTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGTGTTGTCTTTTAT'
pbs='ACACCCTTGCC'
rtt='TTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACG'
edit_len=1
edit_pos=34
edit_type='sub'
peg = DeepPrimeGuideRNA('pegRNA_test', target=target, pbs=pbs, rtt=rtt, 
                           edit_len=edit_len, edit_pos=edit_pos, edit_type=edit_type)
score = peg.predict('PE2')
print(score) # 0.16

Another example:

## target in 3'–5' genome direction ##
# because gene (zebrafish adgrf3b) is in reverse direction
# PBS & RT on 5'–3' strand
target='CAAATGTGTATGGCAGATGTCCAGAGGCGACGAAGCGCCATTGACTTTGGGACCAGGGAGTGAAATAATGCTTT'
pbs='GACATCTGCC'
rtt='GCTTCGTCGCCTCTG'
edit_len  = 1
edit_pos  = 5
edit_type = 'sub'
peg = DeepPrimeGuideRNA('pegRNA', target=target, pbs=pbs, rtt=rtt, 
                           edit_len=edit_len, edit_pos=edit_pos, edit_type=edit_type)
score=peg.predict('PE2')
print(score) # 0.54

## reverse-complement of all three sequences, so target is now in 5'–3' genome direction ##
target='AAAGCATTATTTCACTCCCTGGTCCCAAAGTCAATGGCGCTTCGTCGCCTCTGGACATCTGCCATACACATTTG'
pbs='GGCAGATGTC'
rtt='CAGAGGCGACGAAGC'
edit_len  = 1
edit_pos  = 5
edit_type = 'sub'
peg = DeepPrimeGuideRNA('pegRNA', target=target, pbs=pbs, rtt=rtt, 
                           edit_len=edit_len, edit_pos=edit_pos, edit_type=edit_type)
score=peg.predict('PE2')
print(score) # 1.24

I would have thought reverse-complementing all sequences in input would always give the same output. Am I missing something obvious?
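
For anyone reproducing this, the reverse-complemented inputs in the first example can be derived and checked with a small helper along these lines. This is only a minimal sketch: `reverse_complement` is a hypothetical helper written for this check, not part of the DeepPrime API, and it assumes plain A/C/G/T sequences with no ambiguity codes.

```python
# Hypothetical helper for this sanity check; not part of the DeepPrime API.
# Assumes sequences contain only A/C/G/T (upper or lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Inputs copied from the first example (the documentation example).
target = "ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG"
pbs = "GGCAAGGGTGT"
rtt = "CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAA"

# Reverse-complement all three sequences, as in the second call of that example.
rc_target = reverse_complement(target)
rc_pbs = reverse_complement(pbs)
rc_rtt = reverse_complement(rtt)

print(rc_target)
print(rc_pbs)
print(rc_rtt)
```

This only confirms that the sequences passed to the second call are exact reverse complements of those in the first call. It says nothing about how `edit_pos` should be counted once the strand is flipped.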


EDIT: Corrected first example, as per reply below from @hkimlab.
