- [outgroup](#outgroup)
- [phylo](#phylo)
- [dotplot](#dotplot)
+ - [Other functions](#other-functions)
* [Phylogenomics pipeline](#Phylogenomics-pipeline)
* [Input formats](#input-formats)
* [Output formats](#output-formats)
@@ -129,10 +130,10 @@ Then you can download the container image and run:
```
apptainer remote add --no-login SylabsCloud cloud.sylabs.io
apptainer remote use SylabsCloud
- apptainer pull orthoindex.sif library://shang-hongyun/collection/centos8dock-orthoindex.sif:1.0
+ apptainer pull orthoindex.sif library://shang-hongyun/collection/orthoindex:1.2.0
./orthoindex.sif soi -h
```
- The image can be found [here](https://cloud.sylabs.io/library/shang-hongyun/collection/centos8dock-orthoindex.sif).
+ The image can be found [here](https://cloud.sylabs.io/library/shang-hongyun/collection/orthoindex).

## Subcommands ##
```
@@ -158,7 +159,7 @@ optional arguments:
The subcommand `filter` filters orthologous blocks with a default minimum index of 0.6:
```
$ soi filter -h
- usage: soi filter [-h] -s [FILE [FILE ...]] -o [FOLDER/FILE [FOLDER/FILE ...]] [-c FLOAT] [-upper FLOAT] [-n INT]
+ usage: soi filter [-h] -s [FILE [FILE ...]] -o [FOLDER/FILE [FOLDER/FILE ...]] [-c FLOAT] [-u FLOAT] [-n INT] [-g FILE] [-d INT] [-stat OUT_STATS] [-oo]

optional arguments:
-h, --help show this help message and exit
@@ -168,8 +169,14 @@ optional arguments:
Orthologues output from OrthoFinder (folder), or OrthoMCL (file). [required]
-c FLOAT, -cutoff FLOAT
Cutoff (lower limit) of Orthology Index [default=0.6]
- -upper FLOAT Upper limit of Orthology Index [default=1]
+ -u FLOAT, -upper FLOAT
+ Upper limit of Orthology Index [default=1]
-n INT, -min_n INT Minimum gene number in a block [default=0]
+ -g FILE, -gff FILE Gff file. [required for `-d`]
+ -d INT, -min_dist INT
+ Minimum distance to remove a tandem repeated block [default=None]
+ -stat OUT_STATS Output stats by species pairs. [default=None]
+ -oo Output retained orthology instead of synteny. [default=False]
```
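The `-c`/`-u` window and the `-n` threshold can be pictured with a short Python sketch. This is not SOI's implementation. It assumes that the Orthology Index of a block is the fraction of its gene pairs that also occur in the orthologue set, that both limits are inclusive, and that `-n` counts gene pairs; the function names `orthology_index` and `keep_block` are hypothetical.

```python
def orthology_index(block, orthologs):
    """Fraction of gene pairs in a syntenic block that are also orthologous pairs.

    Assumed definition, for illustration only.
    """
    if not block:
        return 0.0
    hits = sum(1 for a, b in block if frozenset((a, b)) in orthologs)
    return hits / len(block)


def keep_block(block, orthologs, cutoff=0.6, upper=1.0, min_n=0):
    """Apply the -c, -u and -n thresholds (bounds assumed to be inclusive)."""
    if len(block) < min_n:
        return False
    return cutoff <= orthology_index(block, orthologs) <= upper


# Toy data: three of the four syntenic pairs are also orthologous pairs.
orthologs = {frozenset(p) for p in [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]}
block = [("A1", "B1"), ("A2", "B2"), ("A3", "B3"), ("A4", "B9")]

print(orthology_index(block, orthologs))        # 0.75
print(keep_block(block, orthologs))             # True: inside the default 0.6-1 window
print(keep_block(block, orthologs, 0.05, 0.4))  # False: outside the 0.05-0.4 window
```

Under these assumptions, lowering `-c` retains more blocks, while combining a low `-c` with a low `-u` isolates blocks in which few pairs are orthologous.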
Usage examples:
```
@@ -179,12 +186,16 @@ soi filter -s wgdi/*.collinearity -o OrthoFinder/OrthoFinder/Result*/ > collinea
# from outputs of MCScanX and OrthoMCL
soi filter -s mcscanx/*.collinearity -o pairs/orthologs.txt > collinearity.ortho

- # from a list file and increase the cutoff
+ # from a list file and decrease the cutoff
ls wgdi/*.collinearity > collinearity.list
- soi filter -s collinearity.list -o OrthoFinder/OrthoFinder/Result*/ -c 0.7 > collinearity.ortho
+ soi filter -s collinearity.list -o OrthoFinder/OrthoFinder/Result*/ -c 0.5 > collinearity.ortho

- # filter a paralogous peak
+ # filter an out-paralogous peak
soi filter -s wgdi/*.collinearity -o OrthoFinder/OrthoFinder/Result*/ -c 0.05 -upper 0.4 > collinearity.para
+
+ # remove intra-species, tandem repeat-derived synteny
+ soi filter -s wgdi/*.collinearity -o OrthoFinder/OrthoFinder/Result*/ -gff all_species_gene.gff -d 200 > collinearity.homo
+
```
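The tandem filter (`-gff` with `-d`) can be illustrated as follows. This is a minimal sketch, not SOI's code. It assumes that distance is measured in gene order along a chromosome, that chromosome IDs are unique across species, and that a block is judged by the median distance between its paired genes; `block_distance` and `is_tandem` are hypothetical names.

```python
from statistics import median


def block_distance(block, gene_pos):
    """Median gene-order distance between paired genes.

    Returns None when a pair spans two chromosomes, in which case the block
    cannot be a tandem-derived block. gene_pos maps gene -> (chromosome, index).
    """
    dists = []
    for g1, g2 in block:
        chrom1, idx1 = gene_pos[g1]
        chrom2, idx2 = gene_pos[g2]
        if chrom1 != chrom2:
            return None
        dists.append(abs(idx1 - idx2))
    return median(dists) if dists else None


def is_tandem(block, gene_pos, min_dist):
    """True when both sides of the block sit closer together than min_dist."""
    d = block_distance(block, gene_pos)
    return d is not None and d < min_dist


gene_pos = {
    "g1": ("chr1", 10), "g2": ("chr1", 11), "g3": ("chr1", 15), "g4": ("chr1", 16),
    "g5": ("chr1", 900), "g6": ("chr1", 901), "h1": ("chr2", 10), "h2": ("chr2", 11),
}
near = [("g1", "g3"), ("g2", "g4")]   # 5 genes apart on chr1
far = [("g1", "g5"), ("g2", "g6")]    # 890 genes apart on chr1
inter = [("g1", "h1"), ("g2", "h2")]  # different chromosomes

print([is_tandem(b, gene_pos, 200) for b in (near, far, inter)])  # [True, False, False]
```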
#### `cluster` ####
The subcommand `cluster` groups orthologous syntenic genes into syntenic orthogroups (SOGs), by constructing an orthologous syntenic graph
@@ -248,8 +259,9 @@ trimming alignments with trimAl v1.2 (Capella-Gutierrez et al. 2009) (parameter:
and reconstructing maximum-likelihood trees with IQ-TREE v2.2.0.3 (Minh et al. 2020).
```
$ soi phylo -h
- usage: soi phylo [-h] -og FILE -pep FILE [-cds FILE] [-both] [-fmt STR] [-root [TAXON [TAXON ...]]] [-pre STR] [-mm FLOAT] [-mc INT] [-sc] [-ss FILE] [-concat] [-p INT] [-tmp FOLDER]
- [-clean]
+ usage: soi phylo [-h] -og FILE -pep FILE [-cds FILE] [-both] [-root [TAXON [TAXON ...]]] [-pre STR] [-mm FLOAT] [-mc INT] [-sc]
+ [-ss FILE] [-fmt {orthomcl,orthofinder,mcscanx}] [-only_aln] [-concat] [-trimal_opts STR] [-iqtree_opts STR]
+ [-p INT] [-tmp FOLDER] [-clean]

optional arguments:
-h, --help show this help message and exit
@@ -258,7 +270,6 @@ optional arguments:
-pep FILE Protein fasta file. [required]
-cds FILE CDS fasta file. [default=None]
-both To use both CDS and PEP to build gene trees. [default: only CDS when `-cds` is true]
- -fmt STR Format of `-orthogroup` input. [default=orthomcl]
-root [TAXON [TAXON ...]], -outgroup [TAXON [TAXON ...]]
Outgroups to root gene trees [default=None]
-pre STR, -prefix STR
@@ -269,10 +280,15 @@ optional arguments:
To limit a common maximum copy number for every species. [default=6]
-sc, -singlecopy Only retrieve singlecopy genes (=`-max_copies 1`). [default=None]
-ss FILE, -spsd FILE To limit a specific copy number for each species (format: 'TAXON<tab>NUMBER'). [default=None]
+ -fmt {orthomcl,orthofinder,mcscanx}
+ Format of `-orthogroup` input. [default=orthomcl]
+ -only_aln Only aligning sequences, to skip trimal and iqtree. [default=None]
-concat To concatenate alignments for tools such as IQTREE (valid when `-singlecopy` is true). [default=None]
+ -trimal_opts STR TrimAl options. [default='-automated1']
+ -iqtree_opts STR IQ-TREE options. [default='']
-p INT, -ncpu INT Number of processors. [default=20]
-tmp FOLDER, -tmpdir FOLDER
- Temporary folder. [default=./tmp/]
+ Temporary folder. [default=./tmp-8a639818-fb56-11ef-b568-4cd98fb9bbe7]
-clean Cleanup temporary folder. [default=None]
```
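The copy-number options (`-mc`, `-sc`, `-ss`) can be read as a per-orthogroup filter. The sketch below is only an illustration. It assumes OrthoMCL-style `TAXON|GENE` identifiers and assumes that a per-species limit from the `-ss` file overrides the common limit; `parse_spsd` and `passes_copy_limits` are hypothetical helpers, not part of SOI.

```python
from collections import Counter


def parse_spsd(text):
    """Parse 'TAXON<tab>NUMBER' lines into {taxon: maximum copy number}."""
    limits = {}
    for line in text.splitlines():
        if line.strip():
            taxon, number = line.split("\t")
            limits[taxon] = int(number)
    return limits


def passes_copy_limits(genes, max_copies=6, per_species=None):
    """Check one orthogroup against the common and per-species copy limits."""
    per_species = per_species or {}
    # Count genes per taxon, taking the taxon from the 'TAXON|GENE' prefix.
    counts = Counter(g.split("|", 1)[0] for g in genes)
    return all(n <= per_species.get(sp, max_copies) for sp, n in counts.items())


og = ["Ath|g1", "Ath|g2", "Osa|g7", "Vvi|g3"]
limits = parse_spsd("Ath\t2\nOsa\t1\n")

print(passes_copy_limits(og))                                    # True
print(passes_copy_limits(og, max_copies=1))                      # False: Ath has two copies
print(passes_copy_limits(og, max_copies=1, per_species=limits))  # True: Ath is allowed two
```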
Usage examples:
@@ -293,35 +309,45 @@ with colored by the Orthology Index or Ks values.
```
$ soi dotplot -h
- usage: soi dotplot [-h] -s FILE [FILE ...] -g FILE -c FILE [-o STR] [--format FORMAT] [--homology] [--cluster] [--diagonal] [--gene-axis] [--number-plots] [--min-block INT]
- [--min-same-block INT] [--xlabel XLABEL] [--ylabel YLABEL] [--figsize NUM] [--fontsize NUM] [--dotsize NUM] [--ofdir FOLDER/FILE [FOLDER/FILE ...]] [--of-ratio FLOAT]
- [--of-color] [--kaks FILE] [--ks-hist] [--max-ks Ks] [--ks-cmap Ks [Ks ...]] [--ks-step Ks] [--use-median] [--method STR] [--lower-ks Ks] [--upper-ks Ks]
- [--plot-ploidy] [--window_size INT] [--window_step INT] [--min_block INT] [--max_distance INT] [--max_ploidy INT] [--min_overlap FLOAT] [--color COLOR]
- [--edgecolor COLOR]
+ usage: soi dotplot [-h] -s FILE [FILE ...] -g FILE -c FILE [-o STR] [--format FORMAT] [--number-plots] [--min-block INT]
+ [--min-dist INT] [--cluster] [--diagonal] [--gene-axis] [--xlines FILE] [--ylines FILE] [--xbars FILE]
+ [--ybars FILE] [--xbarlab] [--ybarlab] [--xlabel XLABEL] [--ylabel YLABEL] [--figsize NUM [NUM ...]]
+ [--fontsize NUM] [--dotsize NUM] [--ofdir FOLDER/FILE [FOLDER/FILE ...]] [--of-ratio FLOAT] [--of-color]
+ [--kaks FILE] [--ks-hist] [--max-ks Ks] [--ks-cmap Ks [Ks ...]] [--ks-step Ks] [--use-median] [--method STR]
+ [--lower-ks Ks] [--upper-ks Ks] [--output-hist] [--cbar] [--plot-ploidy] [--window_size INT]
+ [--window_step INT] [--min_block INT] [--max_distance INT] [--max_ploidy INT] [--min_overlap FLOAT]
+ [--color COLOR] [--edgecolor COLOR] [--plot-bin]
+

optional arguments:
-h, --help show this help message and exit
- -s FILE [FILE ...] syntenic block file (*.collinearity, output of MCSCANX/WGDI)
- -g FILE gene annotation gff file (*.gff, one of MCSCANX/WGDI input)
- -c FILE chromosomes config file (*.ctl, same format as MCSCANX dotplotter)
+ -s FILE [FILE ...] syntenic block file (*.collinearity, output of MCSCANX/WGDI)[required]
+ -g FILE gene annotation gff file (*.gff, one of MCSCANX/WGDI input)[required]
+ -c FILE chromosomes config file (*.ctl, same format as MCSCANX dotplotter)[required]
-o STR output file prefix. [default: the same as `-c`]
--format FORMAT output figure format [default=['pdf', 'png']]
- --homology `-s` is in homology format (gene1<tab>gene2). [default=False]
- --cluster cluster chromosomes. [default=False]
- --diagonal try to put blocks onto the diagonal. [default=False]
- --gene-axis use gene as axis instead of base pair. [default=False]
--number-plots number subplots with (a-d). [default=False]
--min-block INT min gene number in a block. [default=None]
- --min-same-block INT min gene number in a block on the same chromosome. [default=25]
+ --min-dist INT remove tandem with distance shorter than this value. [default=None]

- Art settings:
- art settings for plots
+ Dot plot:
+ settings for dot plots

+ --cluster cluster chromosomes. [default=False]
+ --diagonal try to put blocks onto the diagonal. [default=False]
+ --gene-axis use gene as axis instead of base pair. [default=False]
+ --xlines FILE bed/pos file to add vertical lines. [default=None]
+ --ylines FILE bed/pos file to add horizontal lines. [default=None]
+ --xbars FILE ancestor file to set colorbar for x axis. [default=None]
+ --ybars FILE ancestor file to set colorbar for y axis. [default=None]
+ --xbarlab plot labels for x bars. [default=False]
+ --ybarlab plot labels for y bars. [default=False]
--xlabel XLABEL x label for dot plot. [default=None]
--ylabel YLABEL y label for dot plot. [default=None]
- --figsize NUM figure size [default=18]
- --fontsize NUM font size [default=10]
- --dotsize NUM dot size [default=0.8]
+ --figsize NUM [NUM ...]
+ figure size (width [height]) [default=[16]]
+ --fontsize NUM font size of chromosome labels [default=10]
+ --dotsize NUM dot size [default=1]

Orthology Index filter/color:
filtering or coloring blocks by Orthology Index (prior to Ks color)
@@ -332,7 +358,7 @@ Orthology Index filter/color:
--of-color coloring dots by Orthology Index [default=None]

Ks plot:
- options to plot with Ks
+ options to histogram plot with Ks

--kaks FILE kaks output from KaKs_Calculator/WGDI. [default=None]
--ks-hist plot histogram or not [default=None]
@@ -344,8 +370,10 @@ Ks plot:
--method STR Ks calculation method [default=NG86]
--lower-ks Ks lower limit of median Ks. [default=None]
--upper-ks Ks upper limit of median Ks. [default=None]
+ --output-hist output the data for histogram plot. [default=False]
+ --cbar plot color bar when no histogram plot. [default=False]

- ploidy plot:
+ Ploidy plot:
options to plot relative ploidy (synteny depth)

--plot-ploidy plot relative ploidy. [default=False]
@@ -357,10 +385,18 @@ ploidy plot:
--min_overlap FLOAT min overlap. [default=0.4]
--color COLOR bar fill color. [default=None]
--edgecolor COLOR bar edge color. [default=None]
+
+ Plot Ks by bins:
+ options to plot binned Ks
+
+ --plot-bin plot binned Ks. [default=False]
```
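The new `--figsize NUM [NUM ...]` signature (width with an optional height) corresponds to an `argparse` option declared with `nargs="+"`. The sketch below reproduces only that parsing pattern. How `soi dotplot` chooses the height when it is omitted is not stated in the help text, so reusing the width is an assumption, and `resolve_figsize` is a hypothetical helper.

```python
import argparse

# Stand-in parser that declares only the option of interest.
parser = argparse.ArgumentParser(prog="dotplot-sketch")
parser.add_argument("--figsize", metavar="NUM", type=float, nargs="+", default=[16],
                    help="figure size (width [height])")


def resolve_figsize(values):
    """Return (width, height); a single value is reused for both (assumed)."""
    if len(values) > 2:
        raise ValueError("expected a width and an optional height")
    width = values[0]
    height = values[1] if len(values) > 1 else width
    return width, height


print(resolve_figsize(parser.parse_args([]).figsize))                        # (16, 16)
print(resolve_figsize(parser.parse_args(["--figsize", "12", "8"]).figsize))  # (12.0, 8.0)
```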
Usage examples: see [Quick Start](#Quick-Start).

+ #### Other functions ####
+ **Macro-synteny phylogeny**: See [the function](SOI-tools.md#macro-synteny-phylogeny)
+
### Phylogenomics pipeline ###

See [evolution_example](https://github.com/zhangrengang/evolution_example/) for a pipeline of phylogenomics analyses based on Orthology Index.