<param name="reference_history" type="data" argument="-r" format="fasta" label="Reference sequence" help="Reference sequence FASTA file from your history."/>
</when>
</conditional>
<conditional name="scoring_matrix_cond">
<param name="datatype" type="select" argument="-t" label="Data type" help="Choose the alignment space (nucleotide, protein, or codon).">
<option value="codon" selected="true">Align sequences in codon space</option>
<option value="nucleotide">Align sequences in nucleotide space</option>
<option value="protein">Align sequences in protein space</option>
<option value="HIV_BETWEEN_F">HIV_BETWEEN_F (for HIV alignments)</option>
</param>
</when>
<when value="history">
<param name="scoring_matrix" type="data" argument="-s" format="tabular" label="Scoring matrix file" help="Scoring matrix file from your history. The file should be a tabular matrix with rows and columns representing amino acids or nucleotides, and cells containing substitution scores."/>
<param name="scoring_matrix" type="data" argument="-s" format="tabular" label="Scoring matrix file" help="Scoring matrix file from your history. The file should be a tabular matrix with rows and columns representing amino acids or nucleotides, and cells containing substitution scores."/>
<option value="history" selected="true">Use a custom scoring matrix from history</option>
</param>
<when value="history">
<param name="scoring_matrix" type="data" argument="-s" format="tabular" label="Scoring matrix file" help="Scoring matrix file from your history. The file should be a tabular matrix with rows and columns representing amino acids or nucleotides, and cells containing substitution scores."/>
</when>
</conditional>
</when>
</conditional>
<param name="local_alignment" type="select" argument="-l" label="Global/local alignment" help="Select the alignment type.">
<option value="trim" selected="true">Trim alignment (global to query, local to reference)</option>
<option value="silent">Silent (try both strands, report best score)</option>
<option value="annotated">Annotated (like silent, but annotates strand)</option>
</param>
<param name="affine_gap" type="boolean" argument="-a" truevalue="-a" falsevalue="" checked="false" label="Disable affine gap scoring" help="Disable affine gap scoring (enabled by default)."/>
<param name="write_reference" type="boolean" argument="-I" truevalue="-I" falsevalue="" checked="false" label="Write out the reference sequence" help="Include the reference sequence in the output."/>
</inputs>
<outputs>
<data name="output" format="fasta" label="${tool.name} on ${on_string}: ${format}"/>
`cawlign` is a codon-aware aligner that maps sequences from a FASTA file to a reference sequence. It can perform nucleotide, protein, and codon-aware alignments.

**Input**

- **Sequences to align**: A FASTA file containing the sequences to be aligned.
- **Reference sequence**: Use a built-in reference sequence or provide one from your history.
- **Scoring matrix**: Use a built-in scoring matrix or provide one from your history. The available built-in matrices depend on the selected data type.

**Output**

A FASTA file containing the alignments. The structure of the output depends on the selected output format parameter (see Output Examples below).

.. class:: infomark

**Alignment Methods**

`cawlign` can perform three types of alignment: nucleotide, protein, and codon-aware.

* **Nucleotide Alignment**: A standard pairwise alignment of nucleotide sequences using the Smith-Waterman-Gotoh algorithm with affine gap penalties.

* **Protein Alignment**: A standard pairwise alignment of protein sequences, also using the Smith-Waterman-Gotoh algorithm with affine gap penalties. The nucleotide sequences are translated into amino acid sequences before alignment.

* **Codon-aware Alignment**: Aligns nucleotide sequences in codon space, which lets it handle frameshift mutations (insertions or deletions whose length is not a multiple of 3 nucleotides) more accurately than a plain nucleotide alignment. The dynamic programming algorithm considers several kinds of codon matches and mismatches, including 3-to-1, 3-to-2, 3-to-4, and 3-to-5 matches. This makes it particularly useful for coding sequences where frameshift mutations may have occurred, such as in viral genomes.
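
To make the phrase "Smith-Waterman-Gotoh with affine gap penalties" concrete, here is a minimal Python sketch of the score-only recurrence. It is not `cawlign`'s implementation: the function name and the scoring values (`match`, `mismatch`, `gap_open`, `gap_extend`) are assumptions chosen for illustration, and a gap of length k is assumed to cost `gap_open + (k - 1) * gap_extend`.

```python
def gotoh_local_score(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score with affine gaps (illustrative values only)."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j), floored at 0 (local).
    # E: best score ending with a gap in `a` (b[j-1] aligned to nothing).
    # F: best score ending with a gap in `b` (a[i-1] aligned to nothing).
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either open a new gap from H or extend an existing one.
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# A two-base insertion costs one gap opening plus one extension.
print(gotoh_local_score("ACGTACGT", "ACGTTTACGT"))
```

Because extending a gap is cheaper than opening one, a single longer indel is preferred over several scattered short ones, which is usually the more plausible biological explanation.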

**Options**

- **Data type**: The type of alignment to perform.

  - **Nucleotide**: Align sequences in nucleotide space.
  - **Protein**: Align sequences in protein space.
  - **Codon**: Align sequences in codon space. This requires the reference to be in-frame.

- **Global/local alignment**: The type of alignment strategy.

  - **Trim**: A trimming alignment that is global with respect to the query and local with respect to the reference.
  - **Global**: Full string alignment; all gaps are scored equally.
  - **Local**: Partial string local (Smith-Waterman type) alignment that maximizes the alignment score.

- **Output format**: The format of the output file.

  - **Reference map**: Aligns query sequences to the reference and does not retain insertions relative to the reference.
  - **Reference align**: Aligns query sequences to the reference and retains insertions relative to the reference. Insertions are shown in lowercase.
  - **Pairwise**: Aligns query sequences to the reference and retains insertions relative to the reference; reports all pairwise alignments.

- **Reverse complementation**: How to handle reverse complementation.

  - **None**: No reverse complementation.
  - **Silent**: Try both forward and reverse-complemented query sequences and report the alignment with the best score.
  - **Annotated**: Like "Silent", but also annotates which strand was used.

- **Disable affine gap scoring**: By default, `cawlign` uses affine gap scoring. Check this option to disable it.

- **Write out the reference sequence**: Include the reference sequence in the output.
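
As a rough illustration of how these options reach the command line, the Python sketch below assembles a flag list from form values. It uses only the flags named by the `argument` attributes visible in this wrapper (`-t`, `-l`, `-r`, `-s`, `-a`, `-I`). The function name, its defaults, and the file names in the usage line are hypothetical, and the flags for output format and reverse complementation are left out because they do not appear in this excerpt.

```python
def build_cawlign_flags(datatype="codon", local_alignment="trim",
                        reference=None, scoring_matrix=None,
                        disable_affine=False, write_reference=False):
    """Return a list of command line flags (hypothetical helper, not part of cawlign)."""
    # Select parameters always contribute a flag and its value.
    flags = ["-t", datatype, "-l", local_alignment]
    # Optional history datasets are passed only when provided.
    if reference is not None:
        flags += ["-r", reference]
    if scoring_matrix is not None:
        flags += ["-s", scoring_matrix]
    # Boolean parameters emit a bare flag when checked and nothing otherwise.
    if disable_affine:
        flags.append("-a")
    if write_reference:
        flags.append("-I")
    return flags


print(build_cawlign_flags(datatype="protein", reference="ref.fa", write_reference=True))
```

Returning a list rather than a single string keeps each argument separate, so file names with spaces do not need extra quoting if the list is later passed to a process launcher.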

.. class:: infomark

**Output Examples**

The examples show how each output format represents the same alignment, which contains one insertion ('gataca') and one deletion relative to the reference. Sequences are truncated for clarity.

**refmap**

The `refmap` output format aligns the query sequences to the reference but does not retain insertions relative to the reference.

The `pairwise` output format reports the full pairwise alignment, including the reference sequence, with insertions and deletions shown as gaps in the corresponding sequence.
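
The relationship between the formats can be sketched in Python by starting from a gapped pairwise alignment and projecting the query onto reference coordinates. This is an illustration of the descriptions above, not `cawlign`'s code: the function name, the `keep_insertions` parameter, and the toy sequences are assumptions.

```python
def project_query(ref_aln, qry_aln, keep_insertions=False):
    """Project a gapped query onto the reference (hypothetical helper)."""
    out = []
    for r, q in zip(ref_aln, qry_aln):
        if r == "-":
            # Column is an insertion relative to the reference.
            if keep_insertions:
                out.append(q.lower())  # refalign-style: keep it, in lowercase
            # refmap-style: drop the column entirely
        else:
            # Reference position: keep the query base or its deletion gap.
            out.append(q)
    return "".join(out)


ref = "ACG------TTTGGG"   # gaps mark the inserted 'gataca'
qry = "ACGGATACATTT---"   # trailing gaps mark a deletion in the query

print(project_query(ref, qry))                        # refmap-style
print(project_query(ref, qry, keep_insertions=True))  # refalign-style
```

The refmap-style result always has the same length as the ungapped reference, which is convenient when many queries must share one coordinate system.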