Metaphlan Databases Reinstalling #88

Open
jboconnor13 opened this issue Oct 29, 2024 · 2 comments

Comments

@jboconnor13
Collaborator

When an index has been specified in the config file (i.e. metaphlan_index_name: mpa_vOct22_CHOCOPhlAnSGB_202212) and the files for that version are already installed in the MetaPhlAn database directory specified in the config (metaphlan_bowtie_db: data/metaphlan_db/), the MetaPhlAn database is reinstalled each time the workflow is run. Perhaps the output can be specified in the rule all inputs to resolve this issue?

@jboconnor13
Collaborator Author

jboconnor13 commented Oct 29, 2024

It is also worth noting that this happens when the setup_metaphlan rule is modified so that the installation is done manually in the Snakefile, as described in #86 (see below):

#if [ "{params.index_name}" = "latest" ]; then
#  metaphlan --install --nproc {threads} --bowtie2db {output.loc} {params.extra}

#else
#  metaphlan --install --nproc {threads} --bowtie2db {output.loc} --index {params.index_name} {params.extra}

#fi

# Option to do it manually if --install doesn't seem to work
 cd {output.loc}
# Can specify whatever version you want here
 wget http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/bowtie2_indexes/mpa_vOct22_CHOCOPhlAnSGB_202212_bt2.tar
 tar -xvf mpa_vOct22_CHOCOPhlAnSGB_202212_bt2.tar
 rm mpa_vOct22_CHOCOPhlAnSGB_202212_bt2.tar
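
To double-check the premise that the index files are already in place before the workflow starts, a small helper along these lines could be run by hand. This is only a sketch: `index_present` is a hypothetical name, and the file naming it looks for (`<index>.<n>.bt2` or `<index>.<n>.bt2l`) is an assumption about what the tarball unpacks to, not something confirmed in this thread.

```shell
# Return 0 if at least one Bowtie2 index file for the given index name
# exists in the database directory, and 1 otherwise.
# Assumption: index files are named "<index>.<n>.bt2" or "<index>.<n>.bt2l".
index_present() {
    # $1 = database directory, $2 = index name
    for f in "$1/$2".*.bt2*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Example call with the values from the config quoted above:
#   index_present data/metaphlan_db mpa_vOct22_CHOCOPhlAnSGB_202212 && echo "index found"
```

Note that this only reports what is on disk. It does not change whether snakemake decides to rerun the rule.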

@sterrettJD
Owner

Hey @jboconnor13, what does snakemake say the reason for rerunning is? Is a certain file missing? Has the code changed?

For example, snakemake should say something like this.

[Mon Nov 4 14:12:15 2024]
rule taxa_barplot:
input: tutorial.f0.0.r0.0.nonhost.humann/all_bugs_list.tsv, R_packages_installed
output: tutorial.f0.0.r0.0.nonhost.humann/Metaphlan_microshades.html
jobid: 31
reason: Missing output files: tutorial.f0.0.r0.0.nonhost.humann/Metaphlan_microshades.html
resources: mem_mb=10000, mem_mib=9537, disk_mb=1000, disk_mib=954, tmpdir=, partition=short, runtime=120, slurm=

Rscript     -e "rmarkdown::render('/Users/jost9358/miniconda3/envs/HoMi_tutorial/lib/python3.11/site-packages/homi_pipeline/rule_utils/Metaphlan_microshades.Rmd', output_dir='/scratch/Users/jost9358/HoMi_tutorial/tutorial.f0.0.r0.0.nonhost.humann', params=list(bugslist='/scratch/Users/jost9358/HoMi_tutorial/tutorial.f0.0.r0.0.nonhost.humann/all_bugs_list.tsv', metadata='/scratch/Users/jost9358/HoMi_tutorial/tutorial_metadata.csv', directory='/scratch/Users/jost9358/HoMi_tutorial/tutorial.f0.0.r0.0.nonhost.humann'))"

Submitted job 31 with external jobid '9774328'.

What does the reason section say?
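
If the log is long, a small filter like the one below might make the reason lines easier to find. It is a sketch under assumptions: `show_rerun_reasons` is a hypothetical helper name, and it expects the layout shown in the example above (a `rule <name>:` header followed, possibly indented, by a `reason:` line).

```shell
# Read a snakemake log on standard input and print one line per job
# in the form "<rule name>: <reason text>".
show_rerun_reasons() {
    awk '
        # Remember the most recent rule header, e.g. "rule setup_metaphlan:".
        /^(local)?rule / { rule = $2; sub(/:$/, "", rule) }
        # On a reason line, strip the "reason:" prefix and print it with the rule.
        /^[[:space:]]*reason:/ {
            sub(/^[[:space:]]*reason:[[:space:]]*/, "")
            print rule ": " $0
        }
    '
}
```

It could be fed a saved log file, for example `show_rerun_reasons < run.log` (where `run.log` is a placeholder for wherever the output was captured), or the output of a dry run piped in directly.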
