Skip to content

Conversation

@andrewgull
Copy link
Contributor

@andrewgull andrewgull commented Oct 30, 2025

Fixes issue #4680

Updates the output file copying logic to correctly handle BAM files whose names contain intermediate words/tags (like {sample}.haplotagged.bam).

QC

While the contributions guidelines are more extensive, please particularly ensure that:

  • test.py was updated to call any added or updated example rules in a Snakefile
  • input: and output: file paths in the rules can be chosen arbitrarily
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:)
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to
  • the meta.yaml contains a link to the documentation of the respective tool or command under url:
  • conda environments use a minimal amount of channels and packages, in recommended ordering

Summary by CodeRabbit

  • Bug Fixes

    • Clearer logging of assembly steps and safer skipping of ambiguous temporary outputs to avoid incorrect mappings or overwrites.
  • Refactor

    • Unified discovery and movement of batch run outputs via a file-mapping workflow for more reliable delivery of final results, retaining special-case handling for reference outputs.
  • Chores

    • Updated environment to add an auxiliary workflow utility and pinned packages to support the new mapping workflow.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 30, 2025

📝 Walkthrough

Walkthrough

The wrapper now discovers generated files in a temporary run directory, maps file suffixes to Snakemake output attributes with ambiguity checks, and moves resolved files via move_files; environment YAML adds snakemake-wrapper-utils==0.8.0.

Changes

Cohort / File(s) Summary
CNVkit batch wrapper
bio/cnvkit/batch/wrapper.py
Replace per-output basename/dirname copy logic with temp-directory enumeration; build a file_map (suffix → output attribute) and disambiguate ambiguous matches; assemble a source→destination mapping and call move_files(snakemake, mapping) instead of manual shutil.copy2. Enable stdout logging to append to log; remove unused logging import; explicitly include reference.cnn in mapping.
Environment
bio/cnvkit/batch/environment.yaml
Add dependency snakemake-wrapper-utils==0.8.0 to support move_files workflow.
Pinned environment
bio/cnvkit/batch/environment.linux-64.pin.txt
Add several pinned Bioconda/related package entries (including snakemake-wrapper-utils, pyfaidx, cnvkit, and related ecosystem packages) to reflect environment dependency additions.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Snakemake
  participant Wrapper
  participant TmpDir as "tmpdirname (temp files)"
  participant MoveUtil as "snakemake-wrapper-utils:move_files"
  participant FS as Filesystem

  Snakemake->>Wrapper: invoke CNVkit batch wrapper
  Wrapper->>TmpDir: run CNVkit -> produce temp files
  Wrapper->>TmpDir: listdir(tmpdirname) -> temp_files
  Wrapper->>Wrapper: build file_map (suffix → output attr)
  Wrapper->>Wrapper: disambiguate ambiguous matches, assemble mapping (temp -> output)
  Wrapper->>MoveUtil: move_files(snakemake, mapping)
  MoveUtil->>FS: move/rename temp -> Snakemake destinations
  FS-->>MoveUtil: moved paths
  Wrapper->>Snakemake: finish (emit shell with resolved paths)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Verify suffix→attribute mapping rules and ambiguity handling (e.g., segment vs call/bintest conflicts).
  • Confirm mapping structure and edge-case handling (multiple matches, missing matches) for move_files.
  • Check logging changes and that reference.cnn is reliably captured.
  • Validate the environment pin additions align with packaging expectations.

Possibly related issues

Possibly related PRs

  • feat: pyEffGenomeSize #4452 — Related: also adopts snakemake-wrapper-utils move_files-based temporary-output handling and adds the dependency.

Suggested reviewers

  • fgvieira
  • johanneskoester

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix: resolve 'Couldn't locate file' error with input BAM with extra delimiters' is specific and accurately describes the main bug fix in the changeset, clearly indicating the issue being resolved.
Description check ✅ Passed The pull request description includes the issue reference, explains the specific problem being solved, and has all QC checklist items marked as completed, demonstrating adherence to contribution guidelines.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 61fe501 and 067f845.

📒 Files selected for processing (1)
  • bio/cnvkit/batch/environment.linux-64.pin.txt (1 hunks)
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:30:42.757Z
Learning: In `wrapper.py` scripts, do not flag the use of an undefined variable called `snakemake`.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In `wrapper.py` scripts, do not flag the use of an undefined variable called `snakemake`.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:27-28
Timestamp: 2024-11-26T08:30:23.818Z
Learning: In Snakemake wrappers (e.g., `wrapper.py` files), it's unnecessary to verify the availability of tools like `mtnucratio` within the code, because Snakemake with Conda ensures that the required tools are installed and available.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:33:50.878Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-10-08T17:41:54.542Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/preprocess-variants/wrapper.py:0-0
Timestamp: 2024-11-15T13:48:33.759Z
Learning: In Snakemake wrappers, security considerations like input sanitization are unnecessary, as the wrappers are under full control of the user.
📚 Learning: 2025-02-11T12:24:22.592Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3648
File: bio/nanosim/simulator/environment.yaml:6-6
Timestamp: 2025-02-11T12:24:22.592Z
Learning: In the nanosim bioconda recipe, dependencies are carefully managed with specific version pins (e.g., scikit-learn ~=0.23.2) to ensure compatibility with pre-trained models. These dependencies don't need to be explicitly added to environment.yaml files when the main package is listed as a dependency, as they are handled through the bioconda recipe system.

Applied to files:

  • bio/cnvkit/batch/environment.linux-64.pin.txt
📚 Learning: 2025-06-02T07:56:35.854Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 4159
File: bio/pyfaidx/environment.yaml:6-6
Timestamp: 2025-06-02T07:56:35.854Z
Learning: In the Snakemake-wrapper repository, conda dependency version pinning in environment.yaml files uses spaces around the equals sign (e.g., `- pyfaidx =0.8.1.4`) as the established coding standard, even though conda itself doesn't require the spaces.

Applied to files:

  • bio/cnvkit/batch/environment.linux-64.pin.txt
📚 Learning: 2025-02-11T12:24:22.592Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3648
File: bio/nanosim/simulator/environment.yaml:6-6
Timestamp: 2025-02-11T12:24:22.592Z
Learning: For bioconda packages like nanosim, dependencies like scikit-learn are managed through the bioconda recipe and don't need to be explicitly added to the environment.yaml file when the main package is listed as a dependency.

Applied to files:

  • bio/cnvkit/batch/environment.linux-64.pin.txt
📚 Learning: 2024-11-26T08:30:23.818Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:27-28
Timestamp: 2024-11-26T08:30:23.818Z
Learning: In Snakemake wrappers (e.g., `wrapper.py` files), it's unnecessary to verify the availability of tools like `mtnucratio` within the code, because Snakemake with Conda ensures that the required tools are installed and available.

Applied to files:

  • bio/cnvkit/batch/environment.linux-64.pin.txt
📚 Learning: 2025-04-17T09:24:51.738Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 0
File: :0-0
Timestamp: 2025-04-17T09:24:51.738Z
Learning: In snakemake-wrappers repository, environment.yaml files should follow these conventions:
1. Use whitespace before the equal sign in version specifications (e.g., "datavzrd =2.53.1")
2. Only specify the exact version for the main software package
3. Don't add version constraints for dependencies unless absolutely necessary
4. See full guidelines at: https://snakemake-wrappers.readthedocs.io/en/stable/contributing.html#environment-yaml-file

Applied to files:

  • bio/cnvkit/batch/environment.linux-64.pin.txt
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: docs
  • GitHub Check: testing
  • GitHub Check: Summary
🔇 Additional comments (1)
bio/cnvkit/batch/environment.linux-64.pin.txt (1)

141-141: Pin file correctly reflects snakemake-wrapper-utils dependency addition.

The addition of snakemake-wrapper-utils-0.8.0-pyhdfd78af_0 from bioconda/noarch aligns with the PR's shift to using move_files for output file handling. The package metadata (version, build, checksum) appears valid for a noarch Python package.

Since this is an auto-generated conda lock file, verify that the source environment.yaml was also properly updated to include snakemake-wrapper-utils = 0.8.0 (following project conventions with spaces around the equals sign).


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
bio/cnvkit/batch/wrapper.py (1)

74-75: Remove redundant basename() call.

Since listdir() already returns filenames without directory paths, applying basename() to elements of temp_files is redundant.

Apply this diff:

             logging.error(
                 f"Couldn't locate file {join(tmpdirname, source)} to copy it to {destination}. "
-                f"Possible files are {[basename(f) for f in temp_files]}"
+                f"Possible files are {temp_files}"
             )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb370a9 and 43c2535.

📒 Files selected for processing (1)
  • bio/cnvkit/batch/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • bio/cnvkit/batch/wrapper.py
**/wrapper.py

⚙️ CodeRabbit configuration file

Do not complain about use of undefined variable called snakemake.

Files:

  • bio/cnvkit/batch/wrapper.py
🧠 Learnings (5)
📓 Common learnings
Learnt from: johanneskoester
PR: snakemake/snakemake-wrappers#3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
PR: snakemake/snakemake-wrappers#3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
PR: snakemake/snakemake-wrappers#3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
PR: snakemake/snakemake-wrappers#3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
PR: snakemake/snakemake-wrappers#3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.2)
bio/cnvkit/batch/wrapper.py

69-69: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: docs
  • GitHub Check: testing
  • GitHub Check: Summary
🔇 Additional comments (1)
bio/cnvkit/batch/wrapper.py (1)

71-71: LGTM!

The copy operation correctly constructs the source path and uses shutil.copy2 to preserve file metadata.

Updates the output file lookup logic to correctly handle BAM files whose names contain intermediate words/tags (like `{sample}.haplotagged.bam`).
@andrewgull andrewgull marked this pull request as draft October 30, 2025 15:20
@fgvieira
Copy link
Collaborator

Maybe consider using the function move_files from snakemake-wrapper-utils, like in here:

for move_cmd in move_files(snakemake, optional_output):
shell("{move_cmd} {log}")

@andrewgull andrewgull marked this pull request as ready for review November 5, 2025 12:14
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
bio/cnvkit/batch/wrapper.py (2)

96-104: Address previous review feedback: use strict zip and consider snakemake-wrapper-utils.

Two issues from previous review remain unaddressed:

  1. zip() without strict= parameter: If sources and destinations lengths differ, files will be silently skipped, causing data integrity issues. As flagged by the previous reviewer and Ruff, use strict=True (Python 3.10+) or add an explicit length check.

  2. Consider move_files utility: Reviewer fgvieira suggested using the move_files function from snakemake-wrapper-utils (as shown in this example). This would provide more robust file handling and reduce custom logic.

For the zip issue, apply this diff:

-    for source, destination in zip(sources, destinations):
+    if len(sources) != len(destinations):
+        raise ValueError(
+            f"Mismatch: {len(sources)} matched files but {len(destinations)} outputs. "
+            f"Matched: {sources}"
+        )
+    for source, destination in zip(sources, destinations):

Alternatively, if Python 3.10+ is guaranteed:

-    for source, destination in zip(sources, destinations):
+    for source, destination in zip(sources, destinations, strict=True):

Please also address whether the move_files approach would simplify this logic.


83-93: Missing validation that all expected outputs were matched.

The matching logic silently skips any expected output attributes that don't have corresponding files in the temp directory. If CNVkit fails to generate an expected file, or if the suffix matching fails, the output will be incomplete without error.

Add validation after the matching loop:

    for file in temp_files:
        for suffix, attr in file_map.items():
            if file.endswith(suffix):
                # Skip ambiguous matches
                if attr == "segments" and any(
                    x in file for x in ["call.cns", "bintest.cns"]
                ):
                    continue
                output_path = getattr(snakemake.output, attr, None)
                if output_path is None:
                    raise ValueError(
                        f"Output attribute '{attr}' not defined in Snakemake rule for file '{file}'"
                    )
                destinations.append(output_path)
                sources.append(file)
                break  # stop after first match
    
    # Validate that all expected outputs were found
    expected_outputs = set(destinations)
    declared_outputs = {
        getattr(snakemake.output, attr, None) 
        for attr in file_map.values()
        if hasattr(snakemake.output, attr)
    }
    missing = declared_outputs - expected_outputs
    if missing:
        raise ValueError(
            f"Expected outputs not found in temp directory: {missing}. "
            f"Generated files: {temp_files}"
        )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 43c2535 and 756f490.

📒 Files selected for processing (1)
  • bio/cnvkit/batch/wrapper.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • bio/cnvkit/batch/wrapper.py
**/wrapper.py

⚙️ CodeRabbit configuration file

Do not complain about use of undefined variable called snakemake.

Files:

  • bio/cnvkit/batch/wrapper.py
🧠 Learnings (11)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-12-06T14:25:43.922Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:31:00.099Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:35:42.140Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:23-26
Timestamp: 2024-11-26T08:35:42.140Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file (Python), we rely on Samtools to handle input validation for the depth file provided by the user, so additional file existence checks are not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2025-01-30T14:15:32.176Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3432
File: bio/reference/ensembl-sequence/wrapper.py:64-70
Timestamp: 2025-01-30T14:15:32.176Z
Learning: In Snakemake wrappers that download multiple chromosomes into a single output file, if any chromosome download fails, the entire operation should fail immediately (using `break`) rather than continuing with remaining chromosomes, as partial output would be invalid.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:54.765Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:58-71
Timestamp: 2024-11-26T10:49:54.765Z
Learning: In test Snakefiles within the snakemake-wrappers repository, it is acceptable to use simplified paths and logging configurations that may differ from real-life pipelines.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T14:59:03.678Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.3)
bio/cnvkit/batch/wrapper.py

91-91: Undefined name snakemake

(F821)


96-96: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Summary

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
bio/cnvkit/batch/wrapper.py (2)

83-93: Add validation for missing output attributes.

If snakemake.output doesn't have the expected attribute, getattr(..., None) returns None, which gets appended to destinations. This will cause a cryptic TypeError during copy2() at line 98 rather than a clear error message about the missing output.

Apply this diff to validate output attributes:

                 if attr == "segments" and any(
                     x in file for x in ["call.cns", "bintest.cns"]
                 ):
                     continue
-                destinations.append(getattr(snakemake.output, attr, None))
+                output_path = getattr(snakemake.output, attr, None)
+                if output_path is None:
+                    raise ValueError(
+                        f"Output attribute '{attr}' not defined in Snakemake rule for file '{file}'"
+                    )
+                destinations.append(output_path)
                 sources.append(file)
                 break  # stop after first match

95-104: Add length validation before zipping sources and destinations.

The zip() operation silently stops at the shorter iterable if sources and destinations have different lengths, which could result in missing output files or uncopied temporary files. This creates a data integrity risk if CNVkit produces unexpected auxiliary files or if the matching logic has bugs.

For Python 3.10+, add strict=True to zip:

-    for source, destination in zip(sources, destinations):
+    for source, destination in zip(sources, destinations, strict=True):
         try:
             shutil.copy2(join(tmpdirname, source), destination)

For older Python versions, add an explicit length check:

+    if len(sources) != len(destinations):
+        raise ValueError(
+            f"Mismatch: {len(sources)} matched files but {len(destinations)} outputs expected. "
+            f"Matched files: {sources}"
+        )
     for source, destination in zip(sources, destinations):
         try:
🧹 Nitpick comments (2)
bio/cnvkit/batch/wrapper.py (2)

70-78: Consider adding leading dots to suffixes for clarity.

The suffix keys work correctly with endswith() but could be more explicit. Adding leading dots (e.g., .cnr, .cns, .call.cns) would make it clearer these are file extensions rather than arbitrary substrings.

Example:

     file_map = {
-        "antitargetcoverage.cnn": "antitarget_coverage",
-        "bintest.cns": "bins",
-        "cnr": "regions",
-        "call.cns": "segments_called",
-        "targetcoverage.cnn": "target_coverage",
-        "cns": "segments",
-        "reference.cnn": "reference",
+        ".antitargetcoverage.cnn": "antitarget_coverage",
+        ".bintest.cns": "bins",
+        ".cnr": "regions",
+        ".call.cns": "segments_called",
+        ".targetcoverage.cnn": "target_coverage",
+        ".cns": "segments",
+        ".reference.cnn": "reference",
     }

66-104: Consider using the move_files utility from snakemake-wrapper-utils.

As suggested by reviewer fgvieira, the move_files function from snakemake-wrapper-utils provides a battle-tested approach for moving output files from temporary directories. This would reduce custom code, improve maintainability, and leverage existing utility functions.

Example usage (from other wrappers):

from snakemake_wrapper_utils.move_files import move_files

# After cnvkit runs...
move_files(
    src_dir=tmpdirname,
    dst_files=snakemake.output,
    pattern="{filename}",
)

See reference implementation: bio/tmb/pyeffgenomesize/wrapper.py

Note: You would need to verify that move_files supports the specific file suffix matching patterns required by CNVkit's output structure. Based on learnings

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 756f490 and 7840ec2.

📒 Files selected for processing (1)
  • bio/cnvkit/batch/wrapper.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • bio/cnvkit/batch/wrapper.py
**/wrapper.py

⚙️ CodeRabbit configuration file

Do not complain about use of undefined variable called snakemake.

Files:

  • bio/cnvkit/batch/wrapper.py
🧠 Learnings (9)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-12-06T14:25:43.922Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:35:42.140Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:23-26
Timestamp: 2024-11-26T08:35:42.140Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file (Python), we rely on Samtools to handle input validation for the depth file provided by the user, so additional file existence checks are not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:54.765Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:58-71
Timestamp: 2024-11-26T10:49:54.765Z
Learning: In test Snakefiles within the snakemake-wrappers repository, it is acceptable to use simplified paths and logging configurations that may differ from real-life pipelines.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T14:59:03.678Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.3)
bio/cnvkit/batch/wrapper.py

91-91: Undefined name snakemake

(F821)


96-96: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: testing
  • GitHub Check: docs
  • GitHub Check: Summary

use file.endswith(suffix) to find suffix in file - more robust; remove redundant dots in the beginning of "antitargetcoverage.cnn" and "targetcoverage.cnn"
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
bio/cnvkit/batch/wrapper.py (4)

70-78: Consider adding leading dots to generic extension suffixes.

The suffixes "cnr" and "cns" on lines 73 and 76 lack leading dots, which means file.endswith("cnr") would match unusual filenames like "mycnr" or "samplecnr". While CNVkit likely doesn't generate such names, explicit extension matching is more robust.

Consider this change for clarity:

     file_map = {
         "antitargetcoverage.cnn": "antitarget_coverage",
         "bintest.cns": "bins",
-        "cnr": "regions",
+        ".cnr": "regions",
         "call.cns": "segments_called",
         "targetcoverage.cnn": "target_coverage",
-        "cns": "segments",
+        ".cns": "segments",
         "reference.cnn": "reference",
     }

87-89: Ambiguity check could be more precise.

The substring matching with any(x in file for x in ["call.cns", "bintest.cns"]) works for typical CNVkit outputs but could theoretically match unexpected patterns (e.g., a filename containing "call.cns" in the middle).

For stricter matching, consider:

                 if attr == "segments" and any(
-                    x in file for x in ["call.cns", "bintest.cns"]
+                    file.endswith(x) for x in ["call.cns", "bintest.cns"]
                 ):

98-98: Add explicit strict=True to prevent silent mismatches.

The zip() without strict=True will silently truncate if sources and destinations have different lengths, potentially leaving files uncopied or outputs missing. This was flagged in previous review rounds.

If Python 3.10+ is available:

-    for source, destination in zip(sources, destinations):
+    for source, destination in zip(sources, destinations, strict=True):

Otherwise, add an explicit check before the loop:

+    if len(sources) != len(destinations):
+        raise ValueError(
+            f"Mismatch: {len(sources)} source files but {len(destinations)} destinations. "
+            f"Sources: {sources}, Destinations: {destinations}"
+        )
     for source, destination in zip(sources, destinations):
🧰 Tools
🪛 Ruff (0.14.3)

98-98: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


66-106: Consider using move_files from snakemake-wrapper-utils.

As reviewer fgvieira suggested, the snakemake-wrapper-utils library provides a move_files function designed for this exact scenario. Using it would simplify the code, reduce maintenance burden, and leverage well-tested utility logic.

Reference example from bio/tmb/pyeffgenomesize/wrapper.py:

from snakemake_wrapper_utils.util import move_files

# After cnvkit runs in tmpdirname:
move_files(tmpdirname, snakemake.output)

This would eliminate the need for custom suffix mapping, ambiguity checks, and manual file pairing. Based on learnings from reviewer feedback.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7840ec2 and b3c43e2.

📒 Files selected for processing (1)
  • bio/cnvkit/batch/wrapper.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • bio/cnvkit/batch/wrapper.py
**/wrapper.py

⚙️ CodeRabbit configuration file

Do not complain about use of undefined variable called snakemake.

Files:

  • bio/cnvkit/batch/wrapper.py
🧠 Learnings (17)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:1-10
Timestamp: 2024-11-26T10:49:04.406Z
Learning: Using generic output filenames like "genome.fasta" is acceptable in the `snakemake-wrappers` project.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-12-06T14:25:43.922Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T14:59:03.678Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:04.406Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:1-10
Timestamp: 2024-11-26T10:49:04.406Z
Learning: Using generic output filenames like "genome.fasta" is acceptable in the `snakemake-wrappers` project.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:31:00.099Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:35:42.140Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:23-26
Timestamp: 2024-11-26T08:35:42.140Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file (Python), we rely on Samtools to handle input validation for the depth file provided by the user, so additional file existence checks are not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T09:08:06.041Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:30:23.818Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:27-28
Timestamp: 2024-11-26T08:30:23.818Z
Learning: In Snakemake wrappers (e.g., `wrapper.py` files), it's unnecessary to verify the availability of tools like `mtnucratio` within the code, because Snakemake with Conda ensures that the required tools are installed and available.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T09:16:39.570Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-15T13:48:33.759Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/preprocess-variants/wrapper.py:0-0
Timestamp: 2024-11-15T13:48:33.759Z
Learning: In Snakemake wrappers, security considerations like input sanitization are unnecessary, as the wrappers are under full control of the user.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2025-01-30T14:15:32.176Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3432
File: bio/reference/ensembl-sequence/wrapper.py:64-70
Timestamp: 2025-01-30T14:15:32.176Z
Learning: In Snakemake wrappers that download multiple chromosomes into a single output file, if any chromosome download fails, the entire operation should fail immediately (using `break`) rather than continuing with remaining chromosomes, as partial output would be invalid.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-08-21T08:33:50.878Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:33:50.878Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:54.765Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:58-71
Timestamp: 2024-11-26T10:49:54.765Z
Learning: In test Snakefiles within the snakemake-wrappers repository, it is acceptable to use simplified paths and logging configurations that may differ from real-life pipelines.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.3)
bio/cnvkit/batch/wrapper.py

91-91: Undefined name snakemake

(F821)


98-98: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: testing
  • GitHub Check: docs
🔇 Additional comments (1)
bio/cnvkit/batch/wrapper.py (1)

91-94: Good handling of optional outputs.

The logic correctly skips files whose corresponding output attributes are not defined in the Snakemake rule, avoiding cryptic copy2() failures while allowing CNVkit to produce optional outputs.

Note: The Ruff F821 warning about undefined snakemake is expected per the coding guidelines, as this variable is injected at runtime by Snakemake.

to mv cnvkit-generated files to snakemake-specified output.
add leading dots to generic extension suffixes.
rm not needed imports
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
bio/cnvkit/batch/wrapper.py (2)

64-76: Well-designed suffix mapping structure.

The file discovery and mapping approach is a significant improvement over the previous implementation. The suffix keys are intentionally mixed (some with leading dots like .cnr, .cns for generic extensions; others without like call.cns, bintest.cns for specific patterns). This design correctly prioritizes more-specific matches when used with endswith().

Consider adding a brief comment explaining the suffix design rationale:

# Mapping of file suffixes to snakemake.output attributes
# More specific suffixes (call.cns, bintest.cns) are checked before generic ones (.cns)
file_map = {
    ...
}

78-90: Robust matching logic with proper ambiguity handling.

The matching logic correctly uses endswith() for suffix matching (resolving the past review concern about substring matching). The ambiguity check for segments (lines 85-88) is a defensive measure that prevents incorrect classification when filenames contain both specific and generic patterns.

Consider adding a comment to explain the ambiguity check:

# Skip ambiguous matches - if checking generic .cns but file contains
# more specific patterns (call.cns or bintest.cns), don't match here
if attr == "segments" and any(
    x in file for x in ["call.cns", "bintest.cns"]
):
    continue
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ba9956 and 61fe501.

📒 Files selected for processing (2)
  • bio/cnvkit/batch/environment.yaml (1 hunks)
  • bio/cnvkit/batch/wrapper.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • bio/cnvkit/batch/wrapper.py
**/wrapper.py

⚙️ CodeRabbit configuration file

Do not complain about use of undefined variable called snakemake.

Files:

  • bio/cnvkit/batch/wrapper.py
🧠 Learnings (27)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:30:42.757Z
Learning: In `wrapper.py` scripts, do not flag the use of an undefined variable called `snakemake`.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In `wrapper.py` scripts, do not flag the use of an undefined variable called `snakemake`.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:27-28
Timestamp: 2024-11-26T08:30:23.818Z
Learning: In Snakemake wrappers (e.g., `wrapper.py` files), it's unnecessary to verify the availability of tools like `mtnucratio` within the code, because Snakemake with Conda ensures that the required tools are installed and available.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:33:50.878Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-10-08T17:41:54.542Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/preprocess-variants/wrapper.py:0-0
Timestamp: 2024-11-15T13:48:33.759Z
Learning: In Snakemake wrappers, security considerations like input sanitization are unnecessary, as the wrappers are under full control of the user.
📚 Learning: 2025-04-17T09:24:51.738Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 0
File: :0-0
Timestamp: 2025-04-17T09:24:51.738Z
Learning: In snakemake-wrappers repository, environment.yaml files should follow these conventions:
1. Use whitespace before the equal sign in version specifications (e.g., "datavzrd =2.53.1")
2. Only specify the exact version for the main software package
3. Don't add version constraints for dependencies unless absolutely necessary
4. See full guidelines at: https://snakemake-wrappers.readthedocs.io/en/stable/contributing.html#environment-yaml-file

Applied to files:

  • bio/cnvkit/batch/environment.yaml
📚 Learning: 2025-02-11T12:24:22.592Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3648
File: bio/nanosim/simulator/environment.yaml:6-6
Timestamp: 2025-02-11T12:24:22.592Z
Learning: In the nanosim bioconda recipe, dependencies are carefully managed with specific version pins (e.g., scikit-learn ~=0.23.2) to ensure compatibility with pre-trained models. These dependencies don't need to be explicitly added to environment.yaml files when the main package is listed as a dependency, as they are handled through the bioconda recipe system.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
📚 Learning: 2025-04-17T09:24:51.738Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 0
File: :0-0
Timestamp: 2025-04-17T09:24:51.738Z
Learning: In snakemake-wrappers repository, environment.yaml files should follow these conventions:
1. Use whitespace before the equal sign in version specifications (e.g., "datavzrd =2.53.1")
2. Only specify the exact version for the main software package
3. Don't add version constraints for dependencies unless absolutely necessary
4. See guidelines at: https://snakemake-wrappers.readthedocs.io/en/stable/contributing.html#environment-yaml-file

Applied to files:

  • bio/cnvkit/batch/environment.yaml
📚 Learning: 2024-11-26T08:30:23.818Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:27-28
Timestamp: 2024-11-26T08:30:23.818Z
Learning: In Snakemake wrappers (e.g., `wrapper.py` files), it's unnecessary to verify the availability of tools like `mtnucratio` within the code, because Snakemake with Conda ensures that the required tools are installed and available.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2025-06-02T07:56:35.854Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 4159
File: bio/pyfaidx/environment.yaml:6-6
Timestamp: 2025-06-02T07:56:35.854Z
Learning: In the Snakemake-wrapper repository, conda dependency version pinning in environment.yaml files uses spaces around the equals sign (e.g., `- pyfaidx =0.8.1.4`) as the established coding standard, even though conda itself doesn't require the spaces.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
📚 Learning: 2025-02-11T12:24:22.592Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3648
File: bio/nanosim/simulator/environment.yaml:6-6
Timestamp: 2025-02-11T12:24:22.592Z
Learning: For bioconda packages like nanosim, dependencies like scikit-learn are managed through the bioconda recipe and don't need to be explicitly added to the environment.yaml file when the main package is listed as a dependency.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
📚 Learning: 2024-08-21T08:30:42.757Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:30:42.757Z
Learning: In `wrapper.py` scripts, do not flag the use of an undefined variable called `snakemake`.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:54.765Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:58-71
Timestamp: 2024-11-26T10:49:54.765Z
Learning: In test Snakefiles within the snakemake-wrappers repository, it is acceptable to use simplified paths and logging configurations that may differ from real-life pipelines.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-15T18:32:07.612Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:11-12
Timestamp: 2024-11-15T18:32:07.612Z
Learning: In the `snakemake-wrappers` repository, when developing wrappers, it's acceptable to use "master" as the wrapper version in the `wrapper` directive. This ensures that the documentation always reflects the latest stable version of the wrappers.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
📚 Learning: 2024-08-21T08:33:50.878Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:33:50.878Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:04.406Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:1-10
Timestamp: 2024-11-26T10:49:04.406Z
Learning: In the `snakemake-wrappers` project, it's acceptable to use "master" in the wrapper path instead of pinning to a specific version.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
📚 Learning: 2024-11-15T13:48:33.759Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/preprocess-variants/wrapper.py:0-0
Timestamp: 2024-11-15T13:48:33.759Z
Learning: In Snakemake wrappers, security considerations like input sanitization are unnecessary, as the wrappers are under full control of the user.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.

Applied to files:

  • bio/cnvkit/batch/environment.yaml
  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-12-06T14:25:43.922Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T14:59:03.678Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:35:42.140Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:23-26
Timestamp: 2024-11-26T08:35:42.140Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file (Python), we rely on Samtools to handle input validation for the depth file provided by the user, so additional file existence checks are not necessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T09:08:06.041Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T09:16:39.570Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:31:00.099Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2025-01-30T14:15:32.176Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3432
File: bio/reference/ensembl-sequence/wrapper.py:64-70
Timestamp: 2025-01-30T14:15:32.176Z
Learning: In Snakemake wrappers that download multiple chromosomes into a single output file, if any chromosome download fails, the entire operation should fail immediately (using `break`) rather than continuing with remaining chromosomes, as partial output would be invalid.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:04.406Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:1-10
Timestamp: 2024-11-26T10:49:04.406Z
Learning: Using generic output filenames like "genome.fasta" is acceptable in the `snakemake-wrappers` project.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2025-01-30T14:14:03.928Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3432
File: bio/reference/ensembl-sequence/wrapper.py:64-70
Timestamp: 2025-01-30T14:14:03.928Z
Learning: In Snakemake wrappers, logging should be done using `snakemake.logging.logger`, not `shell.logger`.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-15T18:36:04.660Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/call-variants/wrapper.py:15-23
Timestamp: 2024-11-15T18:36:04.660Z
Learning: In the Snakemake wrappers repository, using `shell=True` and redirecting within shell commands is acceptable.

Applied to files:

  • bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.3)
bio/cnvkit/batch/wrapper.py

13-13: Undefined name snakemake

(F821)


92-92: Loop control variable file not used within loop body

(B007)


92-92: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: testing
  • GitHub Check: docs
  • GitHub Check: Summary
🔇 Additional comments (3)
bio/cnvkit/batch/environment.yaml (1)

7-7: LGTM - Dependency addition follows conventions.

The addition of snakemake-wrapper-utils =0.8.0 correctly follows the established format with proper spacing around the equals sign and supports the move_files utility used in the wrapper refactoring.

bio/cnvkit/batch/wrapper.py (2)

10-13: Good refactoring - follows reviewer suggestion.

The import of move_files from snakemake-wrapper-utils aligns with reviewer fgvieira's suggestion to reuse existing utility code. The logging change to capture both stdout and stderr with append mode improves observability.


92-93: Correct usage of move_files utility.

The implementation correctly uses the move_files utility as suggested by the reviewer. The loop variable file contains shell commands that are interpolated by shell() through variable substitution, so Ruff's "unused variable" warning (B007) is a false positive—the variable is used within the shell command string.

The move_files utility should handle edge cases (such as missing output attributes or extra generated files) according to the snakemake-wrapper-utils design.

Comment on lines +81 to +90
for file in temp_files:
for suffix, attr in file_map.items():
if file.endswith(suffix):
# Skip ambiguous matches
if attr == "segments" and any(
x in file for x in ["call.cns", "bintest.cns"]
):
continue
mapping[attr] = join(tmpdirname, file)
break # stop after first match
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if an output was not specified in the rule output files? As far as I can see, it would still be added to the mapping and it would fail on move_files, no?

Maybe something like:

Suggested change
for file in temp_files:
for suffix, attr in file_map.items():
if file.endswith(suffix):
# Skip ambiguous matches
if attr == "segments" and any(
x in file for x in ["call.cns", "bintest.cns"]
):
continue
mapping[attr] = join(tmpdirname, file)
break # stop after first match
for suffix, attr in file_map.items():
if not snakemake.output.get(attr):
continue
for file in temp_files:
if file.endswith(suffix):
# Skip ambiguous matches
if attr == "segments" and any(
x in file for x in ["call.cns", "bintest.cns"]
):
continue
mapping[attr] = join(tmpdirname, file)
break # stop after first match

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants