Commit 2974569

update readme
Author: zhangrengang
1 parent 9b6402b, commit 2974569

2 files changed: +71 -36 lines changed
README.md (+67 -31)
@@ -14,6 +14,7 @@
 - [outgroup](#outgroup)
 - [phylo](#phylo)
 - [dotplot](#dotplot)
+- [Other functions](#other-functions)
 * [Phylogenomics pipeline](#Phylogenomics-pipeline)
 * [Input formats](#input-formats)
 * [Output formats](#output-formats)
@@ -129,10 +130,10 @@ Then you can download the container image and run:
 ```
 apptainer remote add --no-login SylabsCloud cloud.sylabs.io
 apptainer remote use SylabsCloud
-apptainer pull orthoindex.sif library://shang-hongyun/collection/centos8dock-orthoindex.sif:1.0
+apptainer pull orthoindex.sif library://shang-hongyun/collection/orthoindex:1.2.0
 ./orthoindex.sif soi -h
 ```
-The image can be found [here](https://cloud.sylabs.io/library/shang-hongyun/collection/centos8dock-orthoindex.sif).
+The image can be found [here](https://cloud.sylabs.io/library/shang-hongyun/collection/orthoindex).
 
 ## Subcommands ##
 ```
@@ -158,7 +159,7 @@ optional arguments:
 The subcommand `filter` filters orthologous blocks with a default minimum index of 0.6:
 ```
 $ soi filter -h
-usage: soi filter [-h] -s [FILE [FILE ...]] -o [FOLDER/FILE [FOLDER/FILE ...]] [-c FLOAT] [-upper FLOAT] [-n INT]
+usage: soi filter [-h] -s [FILE [FILE ...]] -o [FOLDER/FILE [FOLDER/FILE ...]] [-c FLOAT] [-u FLOAT] [-n INT] [-g FILE] [-d INT] [-stat OUT_STATS] [-oo]
 
 optional arguments:
   -h, --help            show this help message and exit
@@ -168,8 +169,14 @@ optional arguments:
                         Orthologues output from OrthoFinder (folder), or OrthoMCL (file). [required]
   -c FLOAT, -cutoff FLOAT
                         Cutoff (lower limit) of Orthology Index [default=0.6]
-  -upper FLOAT          Upper limit of Orthology Index [default=1]
+  -u FLOAT, -upper FLOAT
+                        Upper limit of Orthology Index [default=1]
   -n INT, -min_n INT    Minimum gene number in a block [default=0]
+  -g FILE, -gff FILE    Gff file. [required for `-d`]
+  -d INT, -min_dist INT
+                        Minimum distance to remove a tandem repeated block [default=None]
+  -stat OUT_STATS       Output stats by species pairs. [default=None]
+  -oo                   Output retained orthology instead of synteny. [default=False]
 ```
 Usage examples:
 ```
@@ -179,12 +186,16 @@ soi filter -s wgdi/*.collinearity -o OrthoFinder/OrthoFinder/Result*/ > collinea
 # from outputs of MCscanX and OrthoMCL
 soi filter -s mcscanx/*.collinearity -o pairs/orthologs.txt > collinearity.ortho
 
-# from a list file and increase the cutoff
+# from a list file and decrease the cutoff
 ls wgdi/*.collinearity > collinearity.list
-soi filter -s collinearity.list -o OrthoFinder/OrthoFinder/Result*/ -c 0.7 > collinearity.ortho
+soi filter -s collinearity.list -o OrthoFinder/OrthoFinder/Result*/ -c 0.5 > collinearity.ortho
 
-# filter a paralogous peak
+# filter an out-paralogous peak
 soi filter -s wgdi/*.collinearity -o OrthoFinder/OrthoFinder/Result*/ -c 0.05 -upper 0.4 > collinearity.para
+
+# remove intra-species, tandem repeat-derived synteny
+soi filter -s wgdi/*.collinearity -o OrthoFinder/OrthoFinder/Result*/ -gff all_species_gene.gff -d 200 > collinearity.homo
+
 ```
 #### `cluster` ####
 The subcommand ‘cluster’ groups orthologous syntenic genes into syntenic orthogroups (SOGs), through constructing an orthologous syntenic graph
@@ -248,8 +259,9 @@ trimming alignments with trimAl v1.2 (Capella-Gutierrez et al. 2009) (parameter:
 and reconstructing maximum-likelihood trees with IQ-TREE v2.2.0.3 (Minh et al. 2020).
 ```
 $ soi phylo -h
-usage: soi phylo [-h] -og FILE -pep FILE [-cds FILE] [-both] [-fmt STR] [-root [TAXON [TAXON ...]]] [-pre STR] [-mm FLOAT] [-mc INT] [-sc] [-ss FILE] [-concat] [-p INT] [-tmp FOLDER]
-                 [-clean]
+usage: soi phylo [-h] -og FILE -pep FILE [-cds FILE] [-both] [-root [TAXON [TAXON ...]]] [-pre STR] [-mm FLOAT] [-mc INT] [-sc]
+                 [-ss FILE] [-fmt {orthomcl,orthofinder,mcscanx}] [-only_aln] [-concat] [-trimal_opts STR] [-iqtree_opts STR]
+                 [-p INT] [-tmp FOLDER] [-clean]
 
 optional arguments:
   -h, --help            show this help message and exit
@@ -258,7 +270,6 @@ optional arguments:
   -pep FILE             Protein fasta file. [required]
   -cds FILE             CDS fasta file. [default=None]
   -both                 To use both CDS and PEP to build gene trees. [default: only CDS when `-cds` is true]
-  -fmt STR              Format of `-orthogroup` input. [default=orthomcl]
   -root [TAXON [TAXON ...]], -outgroup [TAXON [TAXON ...]]
                         Outgroups to root gene trees [default=None]
   -pre STR, -prefix STR
@@ -269,10 +280,15 @@ optional arguments:
                         To limit a common maximum copy number for every species. [default=6]
   -sc, -singlecopy      Only retrieve singlecopy genes (=`-max_copies 1`). [default=None]
   -ss FILE, -spsd FILE  To limit a specific copy number for each species (format: 'TAXON<tab>NUMBER'). [default=None]
+  -fmt {orthomcl,orthofinder,mcscanx}
+                        Format of `-orthogroup` input. [default=orthomcl]
+  -only_aln             Only aligning sequences, to skip trimal and iqtree. [default=None]
   -concat               To concatenate alignments for tools such as IQTREE (valid when `-singlecopy` is true). [default=None]
+  -trimal_opts STR      TrimAl options. [default='-automated1']
+  -iqtree_opts STR      IQ-TREE options. [default='']
   -p INT, -ncpu INT     Number of processors. [default=20]
   -tmp FOLDER, -tmpdir FOLDER
-                        Temporary folder. [default=./tmp/]
+                        Temporary folder. [default=./tmp-8a639818-fb56-11ef-b568-4cd98fb9bbe7]
   -clean                Cleanup temporary folder. [default=None]
 ```
 Usage examples:
@@ -293,35 +309,45 @@ with colored by the Orthology Index or Ks values.
 
 ```
 $ soi dotplot -h
-usage: soi dotplot [-h] -s FILE [FILE ...] -g FILE -c FILE [-o STR] [--format FORMAT] [--homology] [--cluster] [--diagonal] [--gene-axis] [--number-plots] [--min-block INT]
-                   [--min-same-block INT] [--xlabel XLABEL] [--ylabel YLABEL] [--figsize NUM] [--fontsize NUM] [--dotsize NUM] [--ofdir FOLDER/FILE [FOLDER/FILE ...]] [--of-ratio FLOAT]
-                   [--of-color] [--kaks FILE] [--ks-hist] [--max-ks Ks] [--ks-cmap Ks [Ks ...]] [--ks-step Ks] [--use-median] [--method STR] [--lower-ks Ks] [--upper-ks Ks]
-                   [--plot-ploidy] [--window_size INT] [--window_step INT] [--min_block INT] [--max_distance INT] [--max_ploidy INT] [--min_overlap FLOAT] [--color COLOR]
-                   [--edgecolor COLOR]
+usage: soi dotplot [-h] -s FILE [FILE ...] -g FILE -c FILE [-o STR] [--format FORMAT] [--number-plots] [--min-block INT]
+                   [--min-dist INT] [--cluster] [--diagonal] [--gene-axis] [--xlines FILE] [--ylines FILE] [--xbars FILE]
+                   [--ybars FILE] [--xbarlab] [--ybarlab] [--xlabel XLABEL] [--ylabel YLABEL] [--figsize NUM [NUM ...]]
+                   [--fontsize NUM] [--dotsize NUM] [--ofdir FOLDER/FILE [FOLDER/FILE ...]] [--of-ratio FLOAT] [--of-color]
+                   [--kaks FILE] [--ks-hist] [--max-ks Ks] [--ks-cmap Ks [Ks ...]] [--ks-step Ks] [--use-median] [--method STR]
+                   [--lower-ks Ks] [--upper-ks Ks] [--output-hist] [--cbar] [--plot-ploidy] [--window_size INT]
+                   [--window_step INT] [--min_block INT] [--max_distance INT] [--max_ploidy INT] [--min_overlap FLOAT]
+                   [--color COLOR] [--edgecolor COLOR] [--plot-bin]
+
 
 optional arguments:
   -h, --help            show this help message and exit
-  -s FILE [FILE ...]    syntenic block file (*.collinearity, output of MCSCANX/WGDI)
-  -g FILE               gene annotation gff file (*.gff, one of MCSCANX/WGDI input)
-  -c FILE               chromosomes config file (*.ctl, same format as MCSCANX dotplotter)
+  -s FILE [FILE ...]    syntenic block file (*.collinearity, output of MCSCANX/WGDI)[required]
+  -g FILE               gene annotation gff file (*.gff, one of MCSCANX/WGDI input)[required]
+  -c FILE               chromosomes config file (*.ctl, same format as MCSCANX dotplotter)[required]
   -o STR                output file prefix. [default: the same as `-c`]
   --format FORMAT       output figure format [default=['pdf', 'png']]
-  --homology            `-s` is in homology format (gene1<tab>gene2). [default=False]
-  --cluster             cluster chromosomes. [default=False]
-  --diagonal            try to put blocks onto the diagonal. [default=False]
-  --gene-axis           use gene as axis instead of base pair. [default=False]
   --number-plots        number subplots with (a-d). [default=False]
   --min-block INT       min gene number in a block. [default=None]
-  --min-same-block INT  min gene number in a block on the same chromosome. [default=25]
+  --min-dist INT        remove tandem with distance shorter than this value. [default=None]
 
-Art settings:
-  art settings for plots
+Dot plot:
+  settings for dot plots
 
+  --cluster             cluster chromosomes. [default=False]
+  --diagonal            try to put blocks onto the diagonal. [default=False]
+  --gene-axis           use gene as axis instead of base pair. [default=False]
+  --xlines FILE         bed/pos file to add vertical lines. [default=None]
+  --ylines FILE         bed/pos file to add horizontal lines. [default=None]
+  --xbars FILE          ancestor file to set colorbar for x axis. [default=None]
+  --ybars FILE          ancestor file to set colorbar for y axis. [default=None]
+  --xbarlab             plot labels for x bars. [default=False]
+  --ybarlab             plot labels for y bars. [default=False]
   --xlabel XLABEL       x label for dot plot. [default=None]
   --ylabel YLABEL       y label for dot plot. [default=None]
-  --figsize NUM         figure size [default=18]
-  --fontsize NUM        font size [default=10]
-  --dotsize NUM         dot size [default=0.8]
+  --figsize NUM [NUM ...]
+                        figure size (width [height]) [default=[16]]
+  --fontsize NUM        font size of chromosome labels [default=10]
+  --dotsize NUM         dot size [default=1]
 
 Orthology Index filter/color:
   filtering or coloring blocks by Orthology Index (prior to Ks color)
@@ -332,7 +358,7 @@ Orthology Index filter/color:
   --of-color            coloring dots by Orthology Index [default=None]
 
 Ks plot:
-  options to plot with Ks
+  options to histogram plot with Ks
 
   --kaks FILE           kaks output from KaKs_Calculator/WGDI. [default=None]
   --ks-hist             plot histogram or not [default=None]
@@ -344,8 +370,10 @@ Ks plot:
   --method STR          Ks calculation method [default=NG86]
   --lower-ks Ks         lower limit of median Ks. [default=None]
   --upper-ks Ks         upper limit of median Ks. [default=None]
+  --output-hist         output the data for histogram plot. [default=False]
+  --cbar                plot color bar when no histogram plot. [default=False]
 
-ploidy plot:
+Ploidy plot:
   options to plot relative ploidy (synteny depth)
 
   --plot-ploidy         plot relative ploidy. [default=False]
@@ -357,10 +385,18 @@ ploidy plot:
   --min_overlap FLOAT   min overlap. [default=0.4]
   --color COLOR         bar fill color. [default=None]
   --edgecolor COLOR     bar edge color. [default=None]
+
+Plot Ks by bins:
+  options to plot binned Ks
+
+  --plot-bin            plot binned Ks. [default=False]
 ```
 
 Usage examples: see [Quick Start](#Quick-Start).
 
+#### Other functions ####
+**Macro-synteny phylogeny**: See [the function](SOI-tools.md#macro-synteny-phylogeny)
+
 ### Phylogenomics pipeline ###
 
 See [evolution_example](https://github.com/zhangrengang/evolution_example/) for a pipeline of phylogenomics analyses based on Orthology Index.
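
Reader's note on the `soi filter` options changed above: the `-c`, `-u` and `-n` options can be read as a range filter on each block's Orthology Index plus a minimum gene count. The sketch below illustrates that reading only. The `Block` type, the `filter_blocks` function and the inclusive bounds are assumptions made for illustration and are not taken from the SOI source code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Block:
    """Hypothetical summary of one syntenic block (not an SOI class)."""
    block_id: str
    n_genes: int
    orthology_index: float


def filter_blocks(blocks: Iterable[Block],
                  cutoff: float = 0.6,   # mirrors the documented default of -c
                  upper: float = 1.0,    # mirrors the documented default of -u
                  min_n: int = 0) -> List[Block]:  # mirrors the default of -n
    """Keep blocks whose index lies in [cutoff, upper] and that have >= min_n genes.

    Inclusive bounds are an assumption; the README does not state them.
    """
    return [b for b in blocks
            if b.n_genes >= min_n and cutoff <= b.orthology_index <= upper]


# Made-up example values, chosen only to exercise both settings.
blocks = [Block("b1", 12, 0.85), Block("b2", 30, 0.20), Block("b3", 4, 0.65)]

# Defaults (-c 0.6 -u 1 -n 0): retain the high-index blocks.
orthologous = filter_blocks(blocks)

# Settings from the out-paralogous example (-c 0.05 -upper 0.4).
out_paralogous = filter_blocks(blocks, cutoff=0.05, upper=0.4)
```

With these made-up values, the default settings keep the two high-index blocks, while the `-c 0.05 -upper 0.4` settings keep only the low-index block.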

setup.py (+4 -5)
@@ -3,18 +3,17 @@
 
 from setuptools import setup, find_packages
 from distutils.extension import Extension
-#from Cython.Build import cythonize
 
 with open('README.md') as f:
     long_description = f.read()
 
 
 setup(
-    name='orthoindex',
+    name='soi',
     version=version,
-    description='OrthoIndex: distinguishing synteny from orthology to out-paralogy',
-    url='https://github.com/zhangrengang/orthoindex/',
-    author='Zhang, Ren-Gang and Wang, Zhao-Xuan',
+    description='SOI: identifying orthologous synteny',
+    url='https://github.com/zhangrengang/soi/',
+    author='Zhang, Ren-Gang',
     license='GPL-3.0',
 
     python_requires='>=3.7',
