Microbial-graph project

Full Documentation

Documentation

FAPROTAX workflow

workflow document

References

emp repo
emp repo_cooccurence
emp_sample metadata

Process data from .biom to tsv

Install ruby (the emp repo use ruby to convert .biom to tsv)
Run "run.sh" under 1_dataProcessing/bioTocsv

To run "run.sh", directly type ./run.sh in the terminal
If typing ./run.sh gives you erro, maybe it's because the script is not executable. Then, try to type the command chmod +x to make the file executable: chmod +x run.sh. After making it executable, type ./run.sh should work
run.sh is a bash script, we can edit it accordingly. eg. change the BIOM, or change the MAP file name. If we use the same ones, then no need to change.
I commented out the line that computes OTU co-occurence network in run.sh, because I think we can use our own ways(eg. vsp or others to calculate. I tried out vsp, described below). Feel free to recover this line if you want to see their version of OTU co-occurence network. (To recover, remove the "#" symbol in front of the line)

Within your folder, you'll see several files after run.sh finish (here are the resulting files if BIOM=emp_deblur_90bp.subset_2k.rare_5000.biom): column_metadata.txt, emp_deblur_90bp.subset_2k.rare_5000.biom.json, emp_deblur_90bp.subset_2k.rare_5000.biom.SUMMARY.txt, otu-co-occurrence-network-z20.csv, otu_sample_value.txt, row_metadata.txt. If you also comment out the line for otu_cooccurrence_network calculation, there won't be otu-co-occurrence-network-z20.csv, but feel free to play aruond!

To do the clustering, we'll mainly need the "otu_sample_value.txt", "row_metadata.txt", "column_metadata.txt"

From tsv to graph

If we end up using our own method to cluster(instead of their otu_cooccurrence_network method), perhaps an efficient way is we modify their code to fit our needs.

For now, I tried out with vsp(the package mentioned in stat992). Please see the script "otu_cooccurrence_network_notes.R" under 2_tsvTograph.
I've also put the data needed under 2_tsvTograph/, feel free to use!

The code can convert the "otu_sample_value.txt" into a sparse bipartite adjacency matrix(row: otu; column: sample).
Then we can subset the otus of interest.
After subsetting, we can plug this matrix of interest into the vsp function for clustering. This way, we can see the cluster of otus and the samples, by inspecting the Z and Y matrix respectively. Just an idea though, welcome to any thoguht!!!

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
1_dataProcessing/biomTotsv		1_dataProcessing/biomTotsv
2_tsvTograph		2_tsvTograph
3_final		3_final
4_faprotax		4_faprotax
README.md		README.md
markdownsyntex.md		markdownsyntex.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Microbial-graph project

Full Documentation

FAPROTAX workflow

References

Process data from .biom to tsv

From tsv to graph

About

Releases

Packages

Languages

SzuHannah/Microbial-graph

Folders and files

Latest commit

History

Repository files navigation

Microbial-graph project

Full Documentation

FAPROTAX workflow

References

Process data from .biom to tsv

From tsv to graph

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages