Iterative motif search #10

ozgunbabur · 2022-04-28T18:20:12Z

Write an iterative method that will start by looking for enrichments and deficiencies for each location and each amino acid. Then it will:

- select the significant subset using a given threshold
- for each significant result, do a new search within the pattern of that result

Iterate this until nothing comes out as significant. This will give you a tree of results, but you will see that some of the nodes on the tree converge on the same motif, meaning it is actually a DAG.

Report each significant motif with its p-value and its parent-child relations.
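The steps above can be sketched roughly as follows. This is only an illustration, not the project's implementation: `iterative_search`, `matches`, and `toy_scorer` are hypothetical names, and the statistical test is replaced by a caller-supplied function that returns `(constraint, p_value)` pairs. Each node is the full set of constraints accumulated so far, so two search paths that arrive at the same motif share one node, which is what turns the tree into a DAG.

```python
from collections import deque

def matches(seq, constraint):
    """True if seq satisfies one (index, letter, presence) constraint."""
    index, letter, presence = constraint
    return (seq[index] == letter) == presence

def iterative_search(sequences, find_significant, threshold):
    """Breadth-first refinement; nodes are frozensets of constraints.

    find_significant(subset, motif) is a hypothetical callback standing in
    for the real enrichment/deficiency test.
    """
    root = frozenset()
    nodes = {root}
    edges = {}  # (parent, child) -> p-value of the constraint that was added
    queue = deque([root])
    while queue:
        motif = queue.popleft()
        subset = [s for s in sequences if all(matches(s, c) for c in motif)]
        for constraint, p_value in find_significant(subset, motif):
            if p_value >= threshold or constraint in motif:
                continue  # not significant, or already part of this motif
            child = motif | {constraint}
            edges[(motif, child)] = p_value
            if child not in nodes:  # a shared child is expanded only once
                nodes.add(child)
                queue.append(child)
    return nodes, edges

def toy_scorer(subset, motif):
    """Made-up results, used only to exercise the control flow."""
    table = {
        frozenset(): [((0, "A", True), 0.0001),
                      ((1, "B", True), 0.0002),
                      ((1, "C", True), 0.2)],
        frozenset({(0, "A", True)}): [((1, "B", True), 0.0001)],
        frozenset({(1, "B", True)}): [((0, "A", True), 0.0001)],
    }
    return table.get(motif, [])

sequences = ["AB", "AC", "CB", "CC"]
nodes, edges = iterative_search(sequences, toy_scorer, threshold=0.0005)
for (parent, child), p in edges.items():
    print(sorted(parent), "->", sorted(child), "p =", p)
```

In the toy run, the motif containing both constraints is reached from two different parents, so it has two incoming edges but is stored and expanded once.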

AdamFinkleUMB · 2022-05-05T04:32:11Z

Tentatively done but not sure if result is what you want.

ozgunbabur · 2022-05-05T14:50:56Z

Hi Adam, please describe what you have done about this issue and tell us how we can test it.

AdamFinkleUMB · 2022-05-05T21:54:49Z

I fixed the bug we saw today: I needed to convert the number of the motif into a character. The search now neatly returns a readable result if the threshold is kept low.

ozgunbabur · 2022-05-06T01:15:05Z

What is the result on the simulated dataset with window 5?

AdamFinkleUMB · 2022-05-09T20:41:19Z

path = "test_data/simulated-phosphoproteomic-data.txt"
window = 5; length = 2 * window + 1
step = 1024
threshold = 0.0005

Key:
motif => [newfound_motifs]
(index, letter, presence)
Index is 0-based, counted from the left; letter is the amino acid;
presence is whether the amino acid must be present (True) or absent (False).

22212
None => [(4, 'S', False), (5, 'P', True)]

18981
(4, 'S', False) => [(5, 'P', True), (9, 'I', False)]

1239
(5, 'P', True) => []

15209
(9, 'I', False) => [(4, 'H', True), (5, 'P', True), (7, 'K', True)]

440
(4, 'H', True) => []

1017
(5, 'P', True) => []

1918
(7, 'K', True) => [(5, 'P', True)]

106
(5, 'P', True) => []

1512
(5, 'P', True) => []

Final Graph: {None: (5, 'P', True)}
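As a reading aid for the key above, here is a minimal sketch of how one `(index, letter, presence)` triple could be applied to a sequence window. The `matches` helper and the example sequence are made up for illustration. The arithmetic only shows that, with `window = 5`, the middle position of a 0-based, 11-residue window is index 5; whether the modified site actually sits at that position is not confirmed in this thread.

```python
window = 5
length = 2 * window + 1   # 11 positions, indexed 0..10 from the left
middle = length // 2      # index 5 is the middle position of the window

def matches(seq, constraint):
    """True if seq satisfies one (index, letter, presence) constraint."""
    index, letter, presence = constraint
    return (seq[index] == letter) == presence

# Made-up 11-residue window: S at index 4, P at index 5, I at index 9.
example = "GRKLSPTEAIV"

has_p5 = matches(example, (5, "P", True))    # P must be present at index 5
no_s4 = matches(example, (4, "S", False))    # S must be absent at index 4
no_i9 = matches(example, (9, "I", False))    # I must be absent at index 9
print(length, middle, has_p5, no_s4, no_i9)
```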

ozgunbabur · 2022-05-10T13:41:50Z

How should we read these? I would like to understand the resulting DAG structure.

AdamFinkleUMB · 2022-05-12T03:38:49Z

The resulting acyclic graph in this case would have the full set of original sequences as its root, with a single edge (5, "P", True) leading to the subset of sequences with a "P" at index 5. The method itself works, and understanding the DAG structure can be part of the visualization issue.

ozgunbabur · 2022-05-12T15:20:58Z

Let me give an example: The output above produces "(5, 'P', True) => []" twice in the last steps. Why is that?

How can we look at this output and draw the DAG?

Also, where does the index 5 map on the sequence? Is it the center?
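One possible way to make such output drawable, sketched under assumptions: label every node by the full set of constraints accumulated along its path instead of only the last constraint added. Two lines that both print `(5, 'P', True)` would then show up as different nodes if they were reached through different parents, and as the same node if the accumulated sets are equal. The edge set below is a hypothetical example loosely modeled on the first lines of the output, not a reconstruction of the actual run; `label` and `to_dot` are illustrative helpers that emit Graphviz DOT text.

```python
def label(motif):
    """Readable name for a frozenset of (index, letter, presence) triples."""
    if not motif:
        return "all sequences"
    return ", ".join(
        f"{'' if present else 'not '}{letter}@{index}"
        for index, letter, present in sorted(motif)
    )

def to_dot(edges):
    """Render (parent, child) pairs as a Graphviz DOT digraph."""
    lines = ["digraph motifs {"]
    for parent, child in sorted(edges, key=lambda e: (label(e[0]), label(e[1]))):
        lines.append(f'  "{label(parent)}" -> "{label(child)}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical edges, for illustration only.
root = frozenset()
not_s4 = frozenset({(4, "S", False)})
p5 = frozenset({(5, "P", True)})
not_s4_and_p5 = not_s4 | p5

edges = {(root, not_s4), (root, p5), (not_s4, not_s4_and_p5)}
dot = to_dot(edges)
print(dot)
```

The printed text can be pasted into any Graphviz renderer. In this toy graph, "P@5" and "not S@4, P@5" are separate nodes even though the last constraint added on both paths is the same.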

ozgunbabur assigned AdamFinkleUMB Apr 28, 2022

ozgunbabur mentioned this issue Apr 28, 2022

Visualize the result DAG #11

Open

AdamFinkleUMB closed this as completed May 5, 2022

ozgunbabur reopened this May 5, 2022

AdamFinkleUMB closed this as completed May 12, 2022

ozgunbabur reopened this May 12, 2022
