New Addition: ripples and general update to latest version of usher #7306
base: main
Changes from all commits
de920ae
be50626
392ac5b
1927643
d26d227
805426c
d2e7faf
06836fe
7070fa7
d43bfb6
0d49338
5292ac7
19d6d7e
d25847c
e09a68c
a68cadb
09a175a
d80054c
ff0c24e
5825dbf
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,94 @@ | ||
| <tool id='usher_ripples' name='UShER RIPPLES' version='@TOOL_VERSION@+@GALAXY_TOOL_VERSION@' profile='23.2'> | ||
| <description>detects recombination events in large mutation annotated tree (MAT) files.</description> | ||
| <macros> | ||
| <import>macros.xml</import> | ||
| </macros> | ||
| <expand macro="xrefs"/> | ||
| <expand macro='requirements' /> | ||
| <version_command>usher --version</version_command> | ||
|
should we put this also into the macro?
I can and will. From a teaching perspective: what's the rationale behind this request? Is it to avoid redundancy between the different wrappers? Simplicity? Good practice? Thx :-)
Redundancy ... DRY. Here it's not that important, I don't expect this command to change much over time - but who knows :)
ok, makes sense indeed :-) and done so in the meantime.
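For readers following the DRY point, here is a minimal sketch of what moving the version command into the shared macros could look like. The macro name `version_command` is an assumption chosen for illustration; the name actually used in this PR's `macros.xml` may differ.

```xml
<!-- macros.xml (fragment): define the version command once.
     The macro name "version_command" is hypothetical. -->
<macros>
    <xml name="version_command">
        <version_command>usher --version</version_command>
    </xml>
</macros>

<!-- In each tool wrapper (fragment): replace the inline
     <version_command> element with an expansion of the macro. -->
<expand macro="version_command"/>
```

With this layout, a change to the version call only has to be made in one place, and every wrapper that imports `macros.xml` picks it up.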
||
| <command detect_errors='exit_code'><![CDATA[ | ||
| ## get correct extension filenames | ||
| ln -sf '$input_mat' '$input_mat.element_identifier' && | ||
|
|
||
| ripples | ||
| --input-mat '$input_mat.element_identifier' | ||
|
|
||
| --branch-length $branch_length | ||
| --min-coordinate-range $min_coordinate_range | ||
| --max-coordinate-range $max_coordinate_range | ||
| --samples-filename '$samples_filename' | ||
| --parsimony-improvement $parsimony_improvement | ||
| --num-descendants $num_descendants | ||
|
|
||
| --outdir ./ | ||
| --threads \${GALAXY_SLOTS:-1} > output_stdout.txt | ||
|
|
||
| ]]> </command> | ||
| <inputs> | ||
| <param argument="--input-mat" type="data" format="protobuf3" label="Mutation-annotated tree object" help="Load a mutation annotated tree file, in protocol-buffers format (protobuf3)."/> | ||
|
please consider adding min/max to all integers/float params
Would like to, but have no idea what sensible ranges would be ...
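One low-risk way to follow this suggestion without inventing ranges is to set only a lower bound where a negative value is clearly meaningless. The sketch below is an assumption and not part of the PR: it mirrors the `min="0"` already present on the other integer parameters in this diff and applies it to `--num-descendants`, which currently has none. No `max` is set, since no sensible upper bound came out of the discussion.

```xml
<!-- Sketch only: min="0" is an assumed lower bound, copied from the
     sibling integer params; max is intentionally omitted. -->
<param argument="--num-descendants" type="integer" value="10" min="0"
       label="Number of descendants"
       help="Minimum number of leaves that a node should have to be considered for recombination. Default = 10." />
```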
||
| <param argument="--branch-length" type="integer" value="3" min="0" label="Minimum branch length" help="Minimum length of the branch to consider for recombination events. Default = 3." /> | ||
| <param argument="--min-coordinate-range" type="integer" value="1000" min="0" label="Minimal coordinate range" help="Minimum range of the genomic coordinates of the mutations on the recombinant branch. Default = 1,000." /> | ||
| <param argument="--max-coordinate-range" type="integer" value="10000000" min="0" label="Maximal coordinate range" help="Maximum range of the genomic coordinates of the mutations on the recombinant branch. Default = 10,000,000." /> | ||
| <param argument="--samples-filename" type="data" format="txt" label="Sample restriction file" help="Restrict the search to the ancestors of the samples specified in the input file." /> | ||
| <param argument="--parsimony-improvement" type="integer" value="3" min="0" label="Parsimony improvement" help="Minimum improvement in parsimony score of the recombinant sequences during the partial placement. Default = 3." /> | ||
| <param argument="--num-descendants" type="integer" value="10" label="Number of descendants" help="Minimum number of leaves that a node should have to be considered for recombination. Default = 10." /> | ||
| </inputs> | ||
| <outputs> | ||
| <data name="recombination" format="tabular" from_work_dir='recombination.tsv' label="${tool.name} on ${on_string}: recombinations" > | ||
| <actions> | ||
| <action name="column_names" type="metadata" default="recomb_node_id,breakpoint-1_interval,breakpoint-2_interval,donor_node_id,donor_is_sibling,donor_parsimony,acceptor_node_id,acceptor_is_sibling,acceptor_parsimony,original_parsimony,min_starting_parsimony,recomb_parsimony" /> | ||
| </actions> | ||
| </data> | ||
| <data name="descendants" format="tabular" from_work_dir='descendants.tsv' label="${tool.name} on ${on_string}: descendants" > | ||
| <actions> | ||
| <action name="column_names" type="metadata" default="node_id,descendants" /> | ||
| </actions> | ||
| </data> | ||
|
|
||
| </outputs> | ||
| <tests> | ||
| <test expect_num_outputs="2"> | ||
| <param name="input_mat" value="mutation_annotation.pb" ftype="protobuf3"/> | ||
| <param name="samples_filename" value="sample_names.txt" ftype="txt"/> | ||
| <output name="descendants" file="test_26_descendants.tabular" ftype="tabular"/> | ||
| <output name="recombination" file="test_26_recombination.tabular" ftype="tabular"/> | ||
| </test> | ||
| <test expect_num_outputs="2"> | ||
| <param name="input_mat" value="mutation_annotation.pb" ftype="protobuf3"/> | ||
| <param name="samples_filename" value="sample_names.txt" ftype="txt"/> | ||
| <param name="num_descendants" value="20" /> | ||
| <param name="parsimony_improvement" value="5" /> | ||
| <param name="branch_length" value="2" /> | ||
| <output name="descendants" file="test_27_descendants.tabular" ftype="tabular"/> | ||
| <output name="recombination" file="test_27_recombination.tabular" ftype="tabular"/> | ||
| </test> | ||
| </tests> | ||
| <help><![CDATA[ | ||
|
|
||
| .. class:: infomark | ||
|
|
||
| **Purpose** | ||
|
|
||
| RIPPLES (Recombination Inference using Phylogenetic PLacEmentS) is a program used to detect recombination events in large mutation annotated tree (MAT) files. | ||
|
|
||
| ---- | ||
|
|
||
| RIPPLES is a program to rapidly and sensitively detect recombinant nodes and their ancestors in a mutation-annotated tree (MAT). RIPPLES exploits the fact that recombinant lineages arising from diverse genomes will often be found on “long branches” which result from accommodating the divergent evolutionary histories of the two parental haplotypes. Therefore, RIPPLES first identifies long branches in a MAT. RIPPLES then exhaustively breaks the potential recombinant sequence into distinct segments that are differentiated by mutations on the recombinant sequence and separated by up to two breakpoints. For each set of breakpoints, RIPPLES places each of its corresponding segments using maximum parsimony to find the two parental nodes – a donor and an acceptor – that result in the highest parsimony score improvement relative to the original placement on the global phylogeny. The nodes for which a set of breakpoints along with two parental nodes can be identified that provide a parsimony score improvement above a user-specified threshold are reported as recombinants. | ||
|
|
||
| .. class:: infomark | ||
|
|
||
| **RIPPLES Common Options** | ||
|
|
||
| - input-mat: Input mutation-annotated tree file [REQUIRED]. If only this argument is set, print the count of samples and nodes in the tree. | ||
| - branch-length (-l): Minimum length of the branch to consider for recombination events. Default = 3. | ||
| - min-coordinate-range (-r): Minimum range of the genomic coordinates of the mutations on the recombinant branch. Default = 1,000. | ||
| - max-coordinate-range (-R): Maximum range of the genomic coordinates of the mutations on the recombinant branch. Default = 10,000,000. | ||
| - samples-filename (-s): Restrict the search to the ancestors of the samples specified in the input file. | ||
| - parsimony-improvement (-p): Minimum improvement in parsimony score of the recombinant sequences during the partial placement. Default = 3. | ||
| - num-descendants (-n): Minimum number of leaves that a node should have to be considered for recombination. Default = 10. | ||
|
|
||
| You can find more information in the `RIPPLES official documentation page <https://usher-wiki.readthedocs.io/en/latest/ripples.html>`_. | ||
|
|
||
| ]]> </help> | ||
| <expand macro="citations" /> | ||
| </tool> | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,4 @@ | ||
| England/BRIS-1853249/2020|20-04-02 Spain/BRIS-1853249/2020|20-04-02 | ||
| Wales/PHWC-25B04/2020|20-03-24 Spain/BRIS-1853249/2020|20-04-02 | ||
| NPL/61-TW/2020|MT072688.1|20-01-13 Spain/BRIS-1853249/2020|20-04-02 | ||
| Wales/LIVE-A6831/2020|20-03-16 Spain/BRIS-1853249/2020|20-04-02 | ||
| England/BRIS-1853249/2020|20-04-02 Spain/BRIS-1853249/2020|20-04-02_A | ||
| Wales/PHWC-25B04/2020|20-03-24 Spain/BRIS-1853249/2020|20-04-02_B | ||
| NPL/61-TW/2020|MT072688.1|20-01-13 Spain/BRIS-1853249/2020|20-04-02_C | ||
| Wales/LIVE-A6831/2020|20-03-16 Spain/BRIS-1853249/2020|20-04-02_D |