diff --git a/CHANGELOG.md b/CHANGELOG.md
index 00fb345..0d021bd 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+- New `getGenomeAttribute()` function to retrieve genome attributes from `params.genomes` for the selected genome
+
 ### Changed
 
 - Enhanced `softwareVersionsToYAML()` to support mixed input sources including YAML strings, file paths, topic tuples, and maps ([#24](https://github.com/nf-core/nf-core-utils/pull/24))
diff --git a/docs/ReferencesExtension.md b/docs/ReferencesExtension.md
index 76017d2..03275dc 100644
--- a/docs/ReferencesExtension.md
+++ b/docs/ReferencesExtension.md
@@ -21,8 +21,75 @@ The extension provides two essential functions:
 
 ### 1.2. Migration from Legacy Systems
 
-!!! note "Subworkflow Migration"
-This extension replaces the legacy `utils_references` subworkflow with cleaner, plugin-based functionality that's easier to maintain and use.
+This function is used to retrieve genome attributes in the nf-core TEMPLATE.
+
+It retrieves a specific attribute (such as `fasta`, `gtf`, or index paths) for the selected genome from the `params.genomes` map. It is useful for pipelines that support multiple genomes and need to access reference files or metadata for the currently selected genome.
+
+#### Function Signature
+
+```groovy
+Object getGenomeAttribute(String attribute)
+```
+
+or (static utility):
+
+```groovy
+Object ReferencesUtils.getGenomeAttribute(Map params, String attribute)
+```
+
+#### Parameters
+
+| Parameter   | Type   | Required | Description                                                        |
+| ----------- | ------ | -------- | ------------------------------------------------------------------ |
+| `attribute` | String | Yes      | The attribute name to retrieve (e.g. 'fasta', 'gtf', 'star')       |
+| `params`    | Map    | Yes      | (static) The Nextflow params map containing `genome` and `genomes` |
+
+#### Return Value
+
+Returns the value of the requested attribute for the selected genome, or `null` if not found.
+
+#### Practical Example
+
+```groovy
+// Example params structure
+def params = [
+    genome: 'GRCh38',
+    genomes: [
+        GRCh38: [
+            fasta: 's3://bucket/genome.fa',
+            gtf: 's3://bucket/genes.gtf',
+        ],
+        GRCh37: [
+            fasta: 's3://bucket/genome37.fa',
+            star: 's3://bucket/star_index/'
+        ]
+    ]
+]
+
+// Retrieve the FASTA file for the selected genome
+def fasta = ReferencesUtils.getGenomeAttribute(params, 'fasta')
+// Returns: 's3://bucket/genome.fa'
+
+// Retrieve the GTF file
+def gtf = ReferencesUtils.getGenomeAttribute(params, 'gtf')
+// Returns: 's3://bucket/genes.gtf'
+
+// 'star' is defined only for GRCh37, not for the selected GRCh38, so this returns null
+def missing = ReferencesUtils.getGenomeAttribute(params, 'star')
+// Returns: null
+```
+
+#### Usage in Nextflow
+
+```nextflow
+include { getGenomeAttribute } from 'plugin/nf-core-utils'
+
+workflow {
+    // Example: get the FASTA file for the selected genome
+    genome_fasta = getGenomeAttribute('fasta')
+    log.info "Selected genome FASTA: ${genome_fasta}"
+}
+```
 
 ## 2. Getting Started
 
@@ -33,10 +100,9 @@ Let's start with a simple example that demonstrates the core concept:
 ```nextflow title="basic_references.nf"
 #!/usr/bin/env nextflow
 
-nextflow.enable.dsl = 2
-
 // Import reference utilities
-include { getReferencesFile; getReferencesValue } from 'plugin/nf-core-utils'
+include { getReferencesFile } from 'plugin/nf-core-utils'
+include { getReferencesValue } from 'plugin/nf-core-utils'
 
 // Pipeline parameters
 params.fasta = null  // User can override with custom file
@@ -105,17 +171,6 @@ workflow {
 
 This function intelligently resolves file paths based on user parameters and reference metadata.
 
-#### Function Signature
-
-```groovy
-Channel getReferencesFile(
-    Channel references,  // Reference metadata channel
-    Object param,        // User parameter (file path or null)
-    String attribute,    // Metadata attribute name
-    String basepath      // Base path for relative resolution
-)
-```
-
 #### Parameters
 
 | Parameter | Type | Required | Description |
@@ -162,6 +217,43 @@ workflow {
 }
 ```
 
+#### Practical Example
+
+```nextflow title="file_resolution_example.nf"
+#!/usr/bin/env nextflow
+
+include { getReferencesFile } from 'plugin/nf-core-utils'
+
+params.fasta = null
+params.gtf = "/custom/annotations.gtf"
+params.igenomes_base = 's3://ngi-igenomes/igenomes'
+
+workflow {
+    // Create comprehensive reference metadata
+    references = Channel.of([
+        genome: 'GRCh38',
+        fasta: 'Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa',
+        gtf: 'Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.gtf',
+        readme: 'Homo_sapiens/NCBI/GRCh38/README.txt'
+    ])
+
+    // Resolve multiple reference files
+    genome_fasta = getReferencesFile(references, params.fasta, 'fasta', params.igenomes_base)
+    genome_gtf = getReferencesFile(references, params.gtf, 'gtf', params.igenomes_base)
+
+    // Combine for downstream processing
+    references_ready = genome_fasta.combine(genome_gtf)
+
+    references_ready.view { fasta, gtf ->
+        """
+        Reference files ready:
+        FASTA: ${fasta}
+        GTF: ${gtf}
+        """
+    }
+}
+```
+
 ### 3.2. getReferencesValue - Metadata Value Resolution
 
 This function extracts metadata values with user parameter override support.
@@ -230,9 +322,8 @@ Here's a comprehensive example showing how to integrate reference resolution int
 ```nextflow title="complete_reference_pipeline.nf" hl_lines="8-11 18-27 35-42"
 #!/usr/bin/env nextflow
 
-nextflow.enable.dsl = 2
-
-include { getReferencesFile; getReferencesValue } from 'plugin/nf-core-utils'
+include { getReferencesFile } from 'plugin/nf-core-utils'
+include { getReferencesValue } from 'plugin/nf-core-utils'
 
 // Pipeline parameters with sensible defaults
 params.input = 'samples.csv'
@@ -316,7 +407,8 @@ For pipelines using standardized reference collections, create a systematic appr
 ```nextflow title="igenomes_integration.nf"
 #!/usr/bin/env nextflow
 
-include { getReferencesFile; getReferencesValue } from 'plugin/nf-core-utils'
+include { getReferencesFile } from 'plugin/nf-core-utils'
+include { getReferencesValue } from 'plugin/nf-core-utils'
 
 params.genome = 'GRCh38'
 params.igenomes_base = 's3://ngi-igenomes/igenomes'
@@ -374,7 +466,8 @@ For pipelines supporting multiple genomes:
 ```nextflow title="multi_genome_support.nf"
 #!/usr/bin/env nextflow
 
-include { getReferencesFile; getReferencesValue } from 'plugin/nf-core-utils'
+include { getReferencesFile } from 'plugin/nf-core-utils'
+include { getReferencesValue } from 'plugin/nf-core-utils'
 
 // Support for multiple genomes
 params.genomes = ['GRCh38', 'mm10']
@@ -545,7 +638,8 @@ Always document your reference requirements:
  * Supported genomes: GRCh38, GRCh37, mm10, mm9
  */
 
-include { getReferencesFile; getReferencesValue } from 'plugin/nf-core-utils'
+include { getReferencesFile } from 'plugin/nf-core-utils'
+include { getReferencesValue } from 'plugin/nf-core-utils'
 
 // Reference parameters with documentation
 params.genome = null  // Standard genome name (e.g., 'GRCh38')
diff --git a/src/main/groovy/nfcore/plugin/NfUtilsExtension.groovy b/src/main/groovy/nfcore/plugin/NfUtilsExtension.groovy
index f81fdb1..a13151d 100644
--- a/src/main/groovy/nfcore/plugin/NfUtilsExtension.groovy
+++ b/src/main/groovy/nfcore/plugin/NfUtilsExtension.groovy
@@ -25,6 +25,7 @@ import nfcore.plugin.nfcore.NfcoreConfigValidator
 import nfcore.plugin.nfcore.NfcoreVersionUtils
 import nfcore.plugin.nfcore.NfcoreCitationUtils
 import nfcore.plugin.nfcore.NfcoreReportingOrchestrator
+import nfcore.plugin.ReferencesUtils
 
 /**
  * Implements a custom function which can be imported by
@@ -49,6 +50,20 @@ class NfUtilsExtension extends PluginExtensionPoint {
         println "Hello, ${target}!"
     }
 
+    /**
+     * Get a genome attribute from params.genomes for the selected genome.
+     * Delegates to ReferencesUtils.getGenomeAttribute().
+     *
+     * @param attribute The attribute name to retrieve (e.g. 'fasta', 'gtf')
+     * @return The attribute value or null
+     */
+    @Function
+    Object getGenomeAttribute(String attribute) {
+        Map params = session?.getConfig()?.get('params') as Map
+        if (!params) params = session?.params as Map
+        return ReferencesUtils.getGenomeAttribute(params, attribute)
+    }
+
     // --- Methods from NfcoreExtension ---
     /**
      * Generate methods description text for MultiQC report
diff --git a/src/main/groovy/nfcore/plugin/references/ReferencesUtils.groovy b/src/main/groovy/nfcore/plugin/references/ReferencesUtils.groovy
index ef9d1c2..9f3110a 100644
--- a/src/main/groovy/nfcore/plugin/references/ReferencesUtils.groovy
+++ b/src/main/groovy/nfcore/plugin/references/ReferencesUtils.groovy
@@ -32,6 +32,54 @@ class ReferencesUtils {
         this.session = session
     }
 
+    /**
+     * Instance convenience method that reads params from the initialized Session
+     * and delegates to the static getGenomeAttribute(Map, String) implementation.
+     *
+     * Usage (when ReferencesUtils has been initialized with a Session):
+     *   def utils = new ReferencesUtils()
+     *   utils.init(session)
+     *   utils.getGenomeAttribute('fasta')
+     */
+    Object getGenomeAttribute(String attribute) {
+        Map params = session?.getConfig()?.get('params') as Map
+        // Fallback: some Nextflow Session implementations expose params directly
+        if (!params) params = session?.params as Map
+        if (params) return ReferencesUtils.getGenomeAttribute(params, attribute)
+        return null
+    }
+
+    /**
+     * Return the named attribute for the selected genome in params, or null.
+     * Returns a single value (not wrapped in a List).
+     *
+     * Example:
+     *   def fasta = ReferencesUtils.getGenomeAttribute(params, 'fasta')
+     */
+    static Object getGenomeAttribute(Map params, String attribute) {
+        if (params == null) return null
+
+        final Object genomesObj = params.get('genomes')
+        final Object genomeKeyObj = params.get('genome')
+
+        if (!(genomesObj instanceof Map) || !(genomeKeyObj instanceof String)) {
+            return null
+        }
+
+        final Map genomes = (Map) genomesObj
+        final String genomeKey = (String) genomeKeyObj
+
+        if (!genomes.containsKey(genomeKey)) return null
+
+        final Object genomeObj = genomes.get(genomeKey)
+        if (!(genomeObj instanceof Map)) return null
+
+        final Map genome = (Map) genomeObj
+        if (!genome.containsKey(attribute)) return null
+
+        return genome.get(attribute)
+    }
+
     /**
      * Get references file from a references list or parameters
      *
diff --git a/src/test/groovy/nfcore/plugin/references/ReferencesUtilsTest.groovy b/src/test/groovy/nfcore/plugin/references/ReferencesUtilsTest.groovy
index 9c0bf83..aba616b 100644
--- a/src/test/groovy/nfcore/plugin/references/ReferencesUtilsTest.groovy
+++ b/src/test/groovy/nfcore/plugin/references/ReferencesUtilsTest.groovy
@@ -55,4 +55,135 @@ class ReferencesUtilsTest extends Specification {
         result_file == [null]
         result_value == [null]
     }
+
+    def "test getGenomeAttribute with valid params"() {
+        given:
+        def params = [
+            genome: 'GRCh38',
+            genomes: [
+                GRCh38: [
+                    fasta: 's3://bucket/genome.fa',
+                    gtf: 's3://bucket/genes.gtf',
+                    star: 's3://bucket/star_index/'
+                ],
+                GRCh37: [
+                    fasta: 's3://bucket/genome37.fa'
+                ]
+            ]
+        ]
+
+        when:
+        def fasta = ReferencesUtils.getGenomeAttribute(params, 'fasta')
+        def gtf = ReferencesUtils.getGenomeAttribute(params, 'gtf')
+        def star = ReferencesUtils.getGenomeAttribute(params, 'star')
+
+        then:
+        fasta == 's3://bucket/genome.fa'
+        gtf == 's3://bucket/genes.gtf'
+        star == 's3://bucket/star_index/'
+    }
+
+    def "test getGenomeAttribute with missing attribute"() {
+        given:
+        def params = [
+            genome: 'GRCh38',
+            genomes: [
+                GRCh38: [
+                    fasta: 's3://bucket/genome.fa'
+                ]
+            ]
+        ]
+
+        when:
+        def result = ReferencesUtils.getGenomeAttribute(params, 'gtf')
+
+        then:
+        result == null
+    }
+
+    def "test getGenomeAttribute with missing genome"() {
+        given:
+        def params = [
+            genome: 'GRCh99',
+            genomes: [
+                GRCh38: [
+                    fasta: 's3://bucket/genome.fa'
+                ]
+            ]
+        ]
+
+        when:
+        def result = ReferencesUtils.getGenomeAttribute(params, 'fasta')
+
+        then:
+        result == null
+    }
+
+    def "test getGenomeAttribute with null params"() {
+        when:
+        def result = ReferencesUtils.getGenomeAttribute(null, 'fasta')
+
+        then:
+        result == null
+    }
+
+    def "test getGenomeAttribute with missing genomes map"() {
+        given:
+        def params = [
+            genome: 'GRCh38'
+        ]
+
+        when:
+        def result = ReferencesUtils.getGenomeAttribute(params, 'fasta')
+
+        then:
+        result == null
+    }
+
+    def "test getGenomeAttribute with missing genome key"() {
+        given:
+        def params = [
+            genomes: [
+                GRCh38: [
+                    fasta: 's3://bucket/genome.fa'
+                ]
+            ]
+        ]
+
+        when:
+        def result = ReferencesUtils.getGenomeAttribute(params, 'fasta')
+
+        then:
+        result == null
+    }
+
+    def "test getGenomeAttribute with different value types"() {
+        given:
+        def params = [
+            genome: 'test',
+            genomes: [
+                test: [
+                    string_value: 'some_string',
+                    list_value: ['item1', 'item2'],
+                    map_value: [key: 'value'],
+                    number_value: 42,
+                    boolean_value: true
+                ]
+            ]
+        ]
+
+        when:
+        def string_result = ReferencesUtils.getGenomeAttribute(params, 'string_value')
+        def list_result = ReferencesUtils.getGenomeAttribute(params, 'list_value')
+        def map_result = ReferencesUtils.getGenomeAttribute(params, 'map_value')
+        def number_result = ReferencesUtils.getGenomeAttribute(params, 'number_value')
+        def boolean_result = ReferencesUtils.getGenomeAttribute(params, 'boolean_value')
+
+        then:
+        string_result == 'some_string'
+        list_result == ['item1', 'item2']
+        map_result == [key: 'value']
+        number_result == 42
+        boolean_result == true
+    }
 }
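
Note on trying the lookup outside the plugin: the static `getGenomeAttribute(Map, String)` added in this change is a chain of guards that each fall back to `null`. The snippet below is a minimal standalone Groovy sketch of that behaviour, not the plugin's code. The name `lookupGenomeAttribute` is hypothetical, the bucket paths are placeholders, and the `params` shape (`genome` plus a `genomes` map) is assumed to match the one used in the tests.

```groovy
// Hypothetical standalone re-statement of the lookup, for experimentation only.
// It follows the same guard order as the static method in the diff.
def lookupGenomeAttribute(Map params, String attribute) {
    // Null params, a missing 'genomes' map, or a non-String 'genome' all yield null
    def genomes = params?.get('genomes')
    def genomeKey = params?.get('genome')
    if (!(genomes instanceof Map) || !(genomeKey instanceof String)) return null

    // An unknown genome key (or an entry that is not a Map) also yields null
    def genome = genomes.get(genomeKey)
    if (!(genome instanceof Map)) return null

    // Map.get returns null when the attribute is absent
    return genome.get(attribute)
}

// Assumed params shape; 'star' exists only for the genome that is NOT selected
def params = [
    genome : 'GRCh38',
    genomes: [
        GRCh38: [fasta: 's3://bucket/genome.fa', gtf: 's3://bucket/genes.gtf'],
        GRCh37: [fasta: 's3://bucket/genome37.fa', star: 's3://bucket/star_index/'],
    ],
]

println lookupGenomeAttribute(params, 'fasta')
println lookupGenomeAttribute(params, 'star')
```

Because every failure mode collapses to `null`, a caller cannot tell a mistyped genome name from a genome that lacks the attribute. Pipelines that need a hard failure would have to check the result themselves.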