edmundmiller

The pipeline's sample sheet validation was fully migrated to nf-schema, replacing a custom Python script.

Key changes include:

  • assets/schema_input.json was updated to:
    • Implement comprehensive regex patterns for sample, antibody, and control fields.
    • Enforce antibody/control dependencies using dependentRequired.
    • Add minimum: 1 constraint for replicate IDs.
  • The custom Python validation script bin/check_samplesheet.py and its Nextflow wrapper modules/local/samplesheet_check.nf were removed.
  • subworkflows/local/utils_nfcore_chipseq_pipeline/main.nf gained a validateSamplesheetRow() function to preserve pre-nf-schema logic, including space replacement warnings and single_end computation.
  • subworkflows/local/input_check.nf was simplified, removing its dependency on the Python script and now directly processing pre-validated samplesheet data.
  • main.nf and workflows/chipseq.nf were updated to use the PIPELINE_INITIALISATION.out.samplesheet channel, reflecting the new validation flow.
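The three schema constraints listed above can be illustrated with a small standard-library sketch. The sample pattern, the two-way antibody/control dependency, and the `check_row` helper are assumptions made for illustration; the actual expressions live in assets/schema_input.json and are not quoted in this description. The helper only mimics how a JSON Schema validator treats `pattern`, `dependentRequired`, and `minimum`.

```python
import re

# Illustrative fragment only: the real rules in assets/schema_input.json
# are not shown in this PR description, so these values are assumptions.
ROW_SCHEMA = {
    "properties": {
        "sample": {"pattern": r"^\S+$"},
        "replicate": {"minimum": 1},
        "antibody": {"pattern": r"^[A-Za-z0-9.-]+$"},
        "control": {"pattern": r"^[A-Za-z0-9.-]+$"},
    },
    "required": ["sample", "replicate"],
    # If the key is present, every listed field must be present too.
    "dependentRequired": {"antibody": ["control"], "control": ["antibody"]},
}


def check_row(row, schema=ROW_SCHEMA):
    """Return a list of human-readable violations for one samplesheet row."""
    errors = []
    for key in schema.get("required", []):
        if key not in row:
            errors.append(f"missing required field: {key}")
    for key, rules in schema["properties"].items():
        if key not in row:
            continue
        value = row[key]
        # JSON Schema patterns are unanchored, hence re.search.
        if "pattern" in rules and not re.search(rules["pattern"], str(value)):
            errors.append(f"{key}={value!r} does not match {rules['pattern']}")
        if "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{key}={value!r} is below minimum {rules['minimum']}")
    for key, needed in schema.get("dependentRequired", {}).items():
        if key in row:
            for other in needed:
                if other not in row:
                    errors.append(f"{key} requires {other}")
    return errors
```

For example, `check_row({"sample": "WT", "replicate": 1, "antibody": "H3K27ac"})` reports that the antibody needs a control, which is the kind of dependency the Python script used to check by hand.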

A follow-up cleanup then removed only nf-test related artifacts (the nextflow and nf-test executables and all .nf-test/ directories); all nf-schema implementation changes were preserved.

@nf-core-bot
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.2.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.


github-actions bot commented Jul 1, 2025

nf-core pipelines lint overall result: Failed ❌

Posted for pipeline commit c1c8dd6

+| ✅ 268 tests passed       |+
#| ❔   2 tests were ignored |#
!| ❗  31 tests had warnings |!
-| ❌   1 tests failed       |-

❌ Test failures:

  • modules_config - conf/modules.config contains withName:SAMPLESHEET_CHECK, but the corresponding process is not present in any of the Nextflow scripts.
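This failure indicates that a `withName:SAMPLESHEET_CHECK` selector outlived the process it configured, so removing that selector block from conf/modules.config should clear it. A rough way to spot such leftovers is sketched below with the standard library. The `stale_selectors` helper and its regexes are hypothetical: they handle only simple selectors, ignore `include { X as Y }` aliases, and are not how nf-core lint implements the check.

```python
import re


def stale_selectors(config_text, script_texts):
    """Return withName selectors in config_text that match no process name."""
    # Capture NAME from: withName: NAME, withName: 'NAME', withName: '.*:NAME'
    selectors = re.findall(
        r"withName:\s*['\"]?(?:[^'\"\s{]*:)?(\w+)", config_text
    )
    processes = set()
    for text in script_texts:
        # Process definitions look like: process NAME {
        processes.update(
            re.findall(r"^\s*process\s+(\w+)\s*\{", text, re.MULTILINE)
        )
    return sorted(set(selectors) - processes)
```

Called with the text of conf/modules.config and the contents of the pipeline's .nf files, it returns the selector names that no longer correspond to any process definition.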

❗ Test warnings:

  • pipeline_todos - TODO string in nextflow.config: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs
  • pipeline_todos - TODO string in nextflow.config: Update the field with the details of the contributors to your pipeline. New with Nextflow version 24.10.0
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_if_empty_null - ifEmpty(null) found in prepare_genome.nf: versions = ch_versions.ifEmpty(null) // channel: [ versions.yml ]
  • pipeline_if_empty_null - ifEmpty(null) found in main.nf: versions = ch_versions.ifEmpty(null) // channel: [ versions.yml ]
  • local_component_structure - igv.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - star_align.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - gtf2bed.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - macs3_consensus.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - bam_remove_orphans.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - plot_homer_annotatepeaks.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - genome_blacklist_regions.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - multiqc.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - plot_macs3_qc.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - multiqc_custom_phantompeakqualtools.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - bamtools_filter.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - multiqc_custom_peaks.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - deseq2_qc.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - star_genomegenerate.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - annotate_boolean_peaks.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - frip_score.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
  • local_component_structure - bam_peaks_call_qc_annotate_macs3_homer.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - bed_consensus_quantify_qc_bedtools_featurecounts_deseq2.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - bam_filter_bamtools.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - prepare_genome.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - bam_bedgraph_bigwig_bedtools_ucsc.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - align_star.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure

❔ Tests ignored:

  • nextflow_config - Config default ignored: params.bamtools_filter_pe_config
  • nextflow_config - Config default ignored: params.bamtools_filter_se_config

✅ Tests passed:

Run details

  • nf-core/tools version 3.3.2
  • Run at 2025-09-19 14:59:59

edmundmiller force-pushed the cursor/implement-nf-schema-in-pipeline-14ac branch from 29520d7 to 96867d9 on September 3, 2025 15:15
edmundmiller force-pushed the cursor/implement-nf-schema-in-pipeline-14ac branch from 96867d9 to e92fa1c on September 17, 2025 20:16
edmundmiller and others added 6 commits September 17, 2025 16:00
- Add minimum value constraints for replicate IDs (minimum: 1)
- Implement antibody/control dependency validation using dependentRequired
- Update regex patterns for antibody and control fields to allow alphanumeric, dots, and hyphens
- Improve error messages for better user experience
- Ensure data integrity with stricter validation rules

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

- Remove custom Python samplesheet validation script (bin/check_samplesheet.py)
- Remove corresponding Nextflow module (modules/local/samplesheet_check.nf)
- Eliminate 247 lines of custom validation code
- Pave the way for nf-schema validation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

- Replace file-based samplesheet input with nf-schema samplesheetToList()
- Implement direct channel creation from validated samplesheet data
- Add validateSamplesheetRow() function for custom validation logic
- Preserve space replacement warnings and single_end computation
- Ensure backward compatibility with existing pipeline logic

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
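
The two behaviours this commit preserves, the space replacement warning and the single_end computation, can be sketched as follows. This is a Python stand-in for the Groovy validateSamplesheetRow() function, not the code in the PR; the `fastq_2` field name and the underscore replacement are assumptions based on the usual nf-core samplesheet layout.

```python
import warnings


def validate_samplesheet_row(row):
    """Normalise one validated row: fix spaces in the sample name, set single_end."""
    meta = dict(row)  # copy so the caller's row is left untouched
    sample = meta["sample"]
    if " " in sample:
        # Assumed behaviour: warn and replace spaces rather than fail.
        warnings.warn(f"Spaces in sample name {sample!r} replaced with underscores")
        meta["sample"] = sample.replace(" ", "_")
    # A row without a second FASTQ file is treated as single-end.
    meta["single_end"] = not meta.get("fastq_2")
    return meta
```

Keeping this step separate from the schema means the schema handles hard failures while the function handles soft corrections that should only warn.
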
- Update create_fastq_channel() to handle nf-schema output format
- Remove dependency on SAMPLESHEET_CHECK module
- Process pre-validated data from nf-schema samplesheetToList()
- Add proper handling of replicate and control_replicate metadata
- Eliminate redundant file existence checks (handled by nf-schema)
- Improve single_end detection logic for various data formats

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

- Update NFCORE_CHIPSEQ workflow to accept pre-validated samplesheet channel
- Replace file-based samplesheet input with channel-based data flow
- Connect PIPELINE_INITIALISATION.out.samplesheet to main workflow
- Remove redundant file creation and validation steps
- Streamline data flow from validation to processing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

- Move INPUT_CHECK workflow into utils_nfcore_chipseq_pipeline/main.nf
- Integrate create_fastq_channel function with other pipeline utilities
- Update import path in workflows/chipseq.nf
- Remove standalone input_check.nf subworkflow
- Align with nf-core/rnaseq architectural patterns for cleaner organization
- Reduce local subworkflows from 6 to 5, improving maintainability

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
edmundmiller reopened this Sep 17, 2025
edmundmiller and others added 3 commits September 19, 2025 08:48
Extract metadata correctly from nf-schema structure and transform control 
references to include replicate suffix for proper IP/control BAM pairing.

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>
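
Why the replicate suffix matters for pairing can be shown with a short sketch. It assumes sample IDs take the form `<sample>_REP<replicate>` and that controls are referenced the same way using a `control_replicate` field; both the ID format and the helper names are assumptions for illustration, not code from this commit.

```python
def build_meta(row):
    """Build a meta map whose control reference carries the replicate suffix."""
    meta = {
        "id": f"{row['sample']}_REP{row['replicate']}",
        "antibody": row.get("antibody", ""),
    }
    control = row.get("control")
    # Without the suffix, "WT_INPUT" would never equal the ID "WT_INPUT_REP1".
    meta["control"] = f"{control}_REP{row['control_replicate']}" if control else ""
    return meta


def pair_ip_with_control(metas):
    """Pair each IP sample ID with the ID of the control it references."""
    ids = {m["id"] for m in metas}
    return [(m["id"], m["control"]) for m in metas if m["control"] in ids]
```

With the suffix applied, a lookup by control ID finds the matching control BAM; a bare control name would match nothing and the IP sample would be dropped from peak calling.
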
Use unique staging directories for different input channels to prevent 
file name collisions during MultiQC execution.

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>

Update test snapshots to reflect corrected sample IDs, restored peak calling 
processes, and proper workflow execution with nf-schema integration.

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>
Development

Successfully merging this pull request may close these issues.

JSON schema: Chipseq Get rid of checksamplesheet.py and switch to nf-schema