Merge intervals for WES in GATK GenomicsDBImport #1777

Merged
merged 3 commits into dev from genomicsdbimport-merge-intervals
Jan 27, 2025

Conversation

tdanhorn
Contributor

@tdanhorn tdanhorn commented Jan 23, 2025

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Running sarek with --joint_germline on WES samples with an intervals file containing many thousands of targets causes GATK GenomicsDBImport to create millions of files and run for several days without completing. Adding the --merge-input-intervals option to that process fixes this. This PR adds the option when the --wes pipeline parameter is set.

Closes #1776
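
For illustration, here is a minimal sketch of why merging helps. It assumes the option collapses the targets on each contig into a single spanning interval, so that everything in between is imported as well. This is a plain-Python model of that idea, not GATK's implementation, and the function name `spanning_intervals` and the target coordinates are hypothetical.

```python
def spanning_intervals(intervals):
    """Collapse (contig, start, end) targets into one spanning interval per contig.

    Assumption: this models the merged-intervals behaviour described above;
    it is not code taken from GATK or sarek.
    """
    spans = {}  # contig -> (lowest start, highest end), in first-seen contig order
    for contig, start, end in intervals:
        if contig in spans:
            lo, hi = spans[contig]
            spans[contig] = (min(lo, start), max(hi, end))
        else:
            spans[contig] = (start, end)
    return [(contig, lo, hi) for contig, (lo, hi) in spans.items()]


# Hypothetical exome-style targets: many small regions on few contigs.
targets = [
    ("chr1", 100, 200),
    ("chr1", 500, 650),
    ("chr1", 9000, 9100),
    ("chr2", 50, 75),
    ("chr2", 300, 420),
]

merged = spanning_intervals(targets)
print(f"{len(targets)} intervals -> {len(merged)} after merging")
```

With a real WES target file the reduction is from many thousands of intervals to roughly one per contig, which is what keeps the number of files created during the import manageable.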

github-actions bot commented Jan 23, 2025

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.0.2.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

@tdanhorn tdanhorn changed the base branch from master to dev January 23, 2025 00:16

github-actions bot commented Jan 23, 2025

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 9868622

+| ✅ 215 tests passed       |+
#| ❔  11 tests were ignored |#
!| ❗   4 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

Run details

  • nf-core/tools version 3.0.2
  • Run at 2025-01-27 13:43:09

Contributor

@FriederikeHanssen FriederikeHanssen left a comment

LGTM! Thank you!

@maxulysse maxulysse merged commit dfb2d15 into dev Jan 27, 2025
38 checks passed
@FriederikeHanssen FriederikeHanssen deleted the genomicsdbimport-merge-intervals branch January 27, 2025 18:18
Development

Successfully merging this pull request may close these issues.

Joint germline: GATK GenomicsDBImport chokes on millions of files with WES intervals file
3 participants