use dynamic resource allocation; disable timeline for now; added some default parameters #96
base: dev
Conversation
Codecov Report

@@ Coverage Diff @@
##              dev      #96      +/-   ##
==========================================
- Coverage   36.28%   36.27%   -0.02%
==========================================
  Files          14       14
  Lines         722      725       +3
==========================================
+ Hits          262      263       +1
- Misses        460      462       +2

☔ View full report in Codecov by Sentry.
This is not how dynamic process allocation is supposed to be used: it would force each process to effectively fail at least once just to get the right number of resources for the 4 CPU / 20 GB profiles. Moreover, it also hides the number of CPUs/memory that each process actually requires. I highly advise having labeled processes (not named ones) so you can provide generic groups of resources, like this:
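Since the original snippet is not preserved in this thread, the following is a minimal sketch of the labeled resource groups being described; the label names and values are illustrative assumptions, not taken from the PR:

```groovy
// Illustrative resource groups: generic labels that processes can opt
// into. Label names and values are assumptions, not from this PR.
process {
    withLabel: small {
        cpus   = 1
        memory = 2.GB
    }
    withLabel: large {
        cpus   = 4
        memory = 20.GB
    }
}
```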
...and then go back to the processes and add the right labels. Should you want to make those 'dynamic' too, you could do something like:
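A minimal sketch of the dynamic variant (values illustrative); note that task.attempt only grows past 1 when the errorStrategy allows retries:

```groovy
// Illustrative dynamic variant: resources scale with task.attempt,
// which only increments when failed tasks are retried.
process {
    withLabel: large {
        cpus          = { 4 * task.attempt }
        memory        = { 20.GB * task.attempt }
        errorStrategy = 'retry'
        maxRetries    = 2
    }
}
```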
… require more cpus for processing. Will change if causing issues.
nextflow.config
Outdated
    memory = { 2.GB * task.attempt }
}
withLabel: large {
    cpus = { 4 * task.attempt }
Is the cpu number supposed to increase on each attempt?
For image-to-zarr, yes: bf2raw can use as many cores as are available, and for a large image I would want it to go faster with more cores. But for the other process, Generate_image, it's not really true.
Actually, 20 GB might be too much as a starting point; most laptops have <16 GB. I may just start with 10 GB of memory.
I don't think people using laptops will have an issue with that, because they will run it using the local executor 🤔
Hmm... In the current version I think it applies to all users, since no profile-specific allocation is in place. Or do you mean it will omit this label when the executor is local?
I'm pretty sure the local executor won't use those and will only honour the config in the executor { memory = ..., cpus = ... } scope, so they should be fine to keep as-is... I can double-check later with a tiny example if you want.
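For reference, the executor-scope settings referred to above look roughly like this; the numbers are illustrative, not from this PR:

```groovy
// Executor-scope limits honoured by the local executor; values are
// illustrative assumptions, not from this PR.
executor {
    name   = 'local'
    cpus   = 8        // total CPUs the local executor may use at once
    memory = '32 GB'  // total memory the local executor may use
}
```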
That would be great, because it is reading it with the local executor on my VM. Here's my dummy script:
#!/usr/bin/env nextflow
// Copyright © 2023 Tong LI <[email protected]>
nextflow.enable.dsl=2

process test {
    label 'huge'

    script:
    """
    echo 'hello world'
    """
}

workflow {
    test()
}
❯ more nextflow.config
process {
    withLabel: 'huge' {
        memory = 500.GB
    }
}

Process requirement exceeds available memory -- req: 500 GB; avail: 336.4 GB
I think the (dynamic) resource allocation has to be optional. The numbers are very dependent on the environment being used, and users might be frustrated at having to iterate through cycles they know will fail when they already know what needs to be allocated.
I will continue to bring this PR up to date with recent changes and start to address comments in #104.
We need to document how resource allocation is used and how the dynamic resource allocation can be used. The sanger_lsf profile also needs to be documented if we are going to keep it in the config.
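As a starting point for that documentation, a hypothetical sketch of what an LSF profile such as sanger_lsf typically contains; the queue name and settings here are assumptions, not the actual profile:

```groovy
// Hypothetical sketch of an LSF profile; queue and settings are
// assumptions, not copied from the repository's config.
profiles {
    sanger_lsf {
        process.executor = 'lsf'
        process.queue    = 'normal'
    }
}
```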
@@ -109,6 +109,8 @@ def mergeArgs (stem, data_type, args) {
process image_to_zarr {
    tag "${image}"
    debug verbose_log
    label 'big_mem'
I've split the labels for memory and CPU to give more granular control over resource allocation.
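Assuming both labels are defined in the config, a process that needs both can declare them together, since Nextflow allows multiple label directives; this sketch is illustrative only:

```groovy
// Illustrative only: a process can carry several labels, so memory
// and CPU requirements can be composed independently.
process image_to_zarr {
    label 'big_mem'
    label 'big_cpu'

    script:
    """
    echo 'conversion would happen here'
    """
}
```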
nextflow.config
Outdated
out_dir = './output'
report_dir = './reports'
custom_config_version = 'master'
custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}"
Are these lines (4-5) needed at the moment?
nextflow.config
Outdated
withLabel: big_cpu {
    cpus = {
        4 * task.attempt
@dannda and I had a quick discussion about this. Starting at 4 CPUs and increasing by 4 on each attempt feels like a lot? It may cause a problem when the number being requested can't be allocated.
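One possible mitigation, sketched here with a hypothetical params.max_cpus parameter (not in this PR), is to cap the escalation so requests never exceed a known ceiling:

```groovy
// Sketch: cap the retry escalation at a configurable ceiling.
// params.max_cpus is a hypothetical parameter, not from this PR.
process {
    withLabel: big_cpu {
        cpus = { Math.min( 4 * task.attempt, params.max_cpus as int ) }
    }
}
```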
withLabel: big_mem {
    memory = {
        10.GB * task.attempt
The documentation gives a compelling argument for the dynamic resource allocation feature. However, environmentally, it would be better not to burn through resources unnecessarily to find a shape that works when the person executing the job likely has a good idea of what is required. Perhaps we can add to the documentation a description of how the resource allocation can be used optionally?
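One way it could be made optional, sketched with a hypothetical params.dynamic_resources flag (an assumption, not in this PR):

```groovy
// Sketch: opt-in dynamic allocation behind a hypothetical flag.
// params.dynamic_resources is an assumption, not from this PR.
process {
    withLabel: big_mem {
        memory        = { params.dynamic_resources ? 10.GB * task.attempt : 10.GB }
        errorStrategy = { params.dynamic_resources ? 'retry' : 'terminate' }
        maxRetries    = 2
    }
}
```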
conda {
    conda.enabled = true
    process {
        conda = "$baseDir/envs/environment.yaml"
Just noting here that docs should match these settings.