-
Notifications
You must be signed in to change notification settings - Fork 202
fix: resolve 'Couldn't locate file' error with input BAM with extra delimiters #4682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThe wrapper now discovers generated files in a temporary run directory, maps file suffixes to Snakemake output attributes with ambiguity checks, and moves resolved files via Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Snakemake
participant Wrapper
participant TmpDir as "tmpdirname (temp files)"
participant MoveUtil as "snakemake-wrapper-utils:move_files"
participant FS as Filesystem
Snakemake->>Wrapper: invoke CNVkit batch wrapper
Wrapper->>TmpDir: run CNVkit -> produce temp files
Wrapper->>TmpDir: listdir(tmpdirname) -> temp_files
Wrapper->>Wrapper: build file_map (suffix → output attr)
Wrapper->>Wrapper: disambiguate ambiguous matches, assemble mapping (temp -> output)
Wrapper->>MoveUtil: move_files(snakemake, mapping)
MoveUtil->>FS: move/rename temp -> Snakemake destinations
FS-->>MoveUtil: moved paths
Wrapper->>Snakemake: finish (emit shell with resolved paths)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related issues
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (6)📓 Common learnings📚 Learning: 2025-02-11T12:24:22.592ZApplied to files:
📚 Learning: 2025-06-02T07:56:35.854ZApplied to files:
📚 Learning: 2025-02-11T12:24:22.592ZApplied to files:
📚 Learning: 2024-11-26T08:30:23.818ZApplied to files:
📚 Learning: 2025-04-17T09:24:51.738ZApplied to files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
🔇 Additional comments (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
bio/cnvkit/batch/wrapper.py (1)
74-75: Remove redundantbasename()call.Since
listdir()already returns filenames without directory paths, applyingbasename()to elements oftemp_filesis redundant.Apply this diff:
logging.error( f"Couldn't locate file {join(tmpdirname, source)} to copy it to {destination}. " - f"Possible files are {[basename(f) for f in temp_files]}" + f"Possible files are {temp_files}" )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
bio/cnvkit/batch/wrapper.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
bio/cnvkit/batch/wrapper.py
**/wrapper.py
⚙️ CodeRabbit configuration file
Do not complain about use of undefined variable called
snakemake.
Files:
bio/cnvkit/batch/wrapper.py
🧠 Learnings (5)
📓 Common learnings
Learnt from: johanneskoester
PR: snakemake/snakemake-wrappers#3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
PR: snakemake/snakemake-wrappers#3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
PR: snakemake/snakemake-wrappers#3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
PR: snakemake/snakemake-wrappers#3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
PR: snakemake/snakemake-wrappers#3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.2)
bio/cnvkit/batch/wrapper.py
69-69: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: docs
- GitHub Check: testing
- GitHub Check: Summary
🔇 Additional comments (1)
bio/cnvkit/batch/wrapper.py (1)
71-71: LGTM!The copy operation correctly constructs the source path and uses
shutil.copy2to preserve file metadata.
Updates the output file lookup logic to correctly handle BAM files whose names contain intermediate words/tags (like `{sample}.haplotagged.bam`).
|
Maybe consider using the function snakemake-wrappers/bio/tmb/pyeffgenomesize/wrapper.py Lines 53 to 54 in 60860c8
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
bio/cnvkit/batch/wrapper.py (2)
96-104: Address previous review feedback: use strict zip and consider snakemake-wrapper-utils.Two issues from previous review remain unaddressed:
zip()withoutstrict=parameter: Ifsourcesanddestinationslengths differ, files will be silently skipped, causing data integrity issues. As flagged by the previous reviewer and Ruff, usestrict=True(Python 3.10+) or add an explicit length check.Consider
move_filesutility: Reviewer fgvieira suggested using themove_filesfunction fromsnakemake-wrapper-utils(as shown in this example). This would provide more robust file handling and reduce custom logic.For the zip issue, apply this diff:
- for source, destination in zip(sources, destinations): + if len(sources) != len(destinations): + raise ValueError( + f"Mismatch: {len(sources)} matched files but {len(destinations)} outputs. " + f"Matched: {sources}" + ) + for source, destination in zip(sources, destinations):Alternatively, if Python 3.10+ is guaranteed:
- for source, destination in zip(sources, destinations): + for source, destination in zip(sources, destinations, strict=True):Please also address whether the
move_filesapproach would simplify this logic.
83-93: Missing validation that all expected outputs were matched.The matching logic silently skips any expected output attributes that don't have corresponding files in the temp directory. If CNVkit fails to generate an expected file, or if the suffix matching fails, the output will be incomplete without error.
Add validation after the matching loop:
for file in temp_files: for suffix, attr in file_map.items(): if file.endswith(suffix): # Skip ambiguous matches if attr == "segments" and any( x in file for x in ["call.cns", "bintest.cns"] ): continue output_path = getattr(snakemake.output, attr, None) if output_path is None: raise ValueError( f"Output attribute '{attr}' not defined in Snakemake rule for file '{file}'" ) destinations.append(output_path) sources.append(file) break # stop after first match # Validate that all expected outputs were found expected_outputs = set(destinations) declared_outputs = { getattr(snakemake.output, attr, None) for attr in file_map.values() if hasattr(snakemake.output, attr) } missing = declared_outputs - expected_outputs if missing: raise ValueError( f"Expected outputs not found in temp directory: {missing}. " f"Generated files: {temp_files}" )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
bio/cnvkit/batch/wrapper.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
bio/cnvkit/batch/wrapper.py
**/wrapper.py
⚙️ CodeRabbit configuration file
Do not complain about use of undefined variable called
snakemake.
Files:
bio/cnvkit/batch/wrapper.py
🧠 Learnings (11)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-12-06T14:25:43.922Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:31:00.099Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:35:42.140Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:23-26
Timestamp: 2024-11-26T08:35:42.140Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file (Python), we rely on Samtools to handle input validation for the depth file provided by the user, so additional file existence checks are not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2025-01-30T14:15:32.176Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3432
File: bio/reference/ensembl-sequence/wrapper.py:64-70
Timestamp: 2025-01-30T14:15:32.176Z
Learning: In Snakemake wrappers that download multiple chromosomes into a single output file, if any chromosome download fails, the entire operation should fail immediately (using `break`) rather than continuing with remaining chromosomes, as partial output would be invalid.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:54.765Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:58-71
Timestamp: 2024-11-26T10:49:54.765Z
Learning: In test Snakefiles within the snakemake-wrappers repository, it is acceptable to use simplified paths and logging configurations that may differ from real-life pipelines.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T14:59:03.678Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.3)
bio/cnvkit/batch/wrapper.py
91-91: Undefined name snakemake
(F821)
96-96: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Summary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
bio/cnvkit/batch/wrapper.py (2)
83-93: Add validation for missing output attributes.If
snakemake.outputdoesn't have the expected attribute,getattr(..., None)returnsNone, which gets appended todestinations. This will cause a crypticTypeErrorduringcopy2()at line 98 rather than a clear error message about the missing output.Apply this diff to validate output attributes:
if attr == "segments" and any( x in file for x in ["call.cns", "bintest.cns"] ): continue - destinations.append(getattr(snakemake.output, attr, None)) + output_path = getattr(snakemake.output, attr, None) + if output_path is None: + raise ValueError( + f"Output attribute '{attr}' not defined in Snakemake rule for file '{file}'" + ) + destinations.append(output_path) sources.append(file) break # stop after first match
95-104: Add length validation before zipping sources and destinations.The
zip()operation silently stops at the shorter iterable ifsourcesanddestinationshave different lengths, which could result in missing output files or uncopied temporary files. This creates a data integrity risk if CNVkit produces unexpected auxiliary files or if the matching logic has bugs.For Python 3.10+, add
strict=Trueto zip:- for source, destination in zip(sources, destinations): + for source, destination in zip(sources, destinations, strict=True): try: shutil.copy2(join(tmpdirname, source), destination)For older Python versions, add an explicit length check:
+ if len(sources) != len(destinations): + raise ValueError( + f"Mismatch: {len(sources)} matched files but {len(destinations)} outputs expected. " + f"Matched files: {sources}" + ) for source, destination in zip(sources, destinations): try:
🧹 Nitpick comments (2)
bio/cnvkit/batch/wrapper.py (2)
70-78: Consider adding leading dots to suffixes for clarity.The suffix keys work correctly with
endswith()but could be more explicit. Adding leading dots (e.g.,.cnr,.cns,.call.cns) would make it clearer these are file extensions rather than arbitrary substrings.Example:
file_map = { - "antitargetcoverage.cnn": "antitarget_coverage", - "bintest.cns": "bins", - "cnr": "regions", - "call.cns": "segments_called", - "targetcoverage.cnn": "target_coverage", - "cns": "segments", - "reference.cnn": "reference", + ".antitargetcoverage.cnn": "antitarget_coverage", + ".bintest.cns": "bins", + ".cnr": "regions", + ".call.cns": "segments_called", + ".targetcoverage.cnn": "target_coverage", + ".cns": "segments", + ".reference.cnn": "reference", }
66-104: Consider using themove_filesutility from snakemake-wrapper-utils.As suggested by reviewer fgvieira, the
move_filesfunction from snakemake-wrapper-utils provides a battle-tested approach for moving output files from temporary directories. This would reduce custom code, improve maintainability, and leverage existing utility functions.Example usage (from other wrappers):
from snakemake_wrapper_utils.move_files import move_files # After cnvkit runs... move_files( src_dir=tmpdirname, dst_files=snakemake.output, pattern="{filename}", )See reference implementation: bio/tmb/pyeffgenomesize/wrapper.py
Note: You would need to verify that
move_filessupports the specific file suffix matching patterns required by CNVkit's output structure. Based on learnings
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
bio/cnvkit/batch/wrapper.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
bio/cnvkit/batch/wrapper.py
**/wrapper.py
⚙️ CodeRabbit configuration file
Do not complain about use of undefined variable called
snakemake.
Files:
bio/cnvkit/batch/wrapper.py
🧠 Learnings (9)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-12-06T14:25:43.922Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:35:42.140Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:23-26
Timestamp: 2024-11-26T08:35:42.140Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file (Python), we rely on Samtools to handle input validation for the depth file provided by the user, so additional file existence checks are not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:54.765Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:58-71
Timestamp: 2024-11-26T10:49:54.765Z
Learning: In test Snakefiles within the snakemake-wrappers repository, it is acceptable to use simplified paths and logging configurations that may differ from real-life pipelines.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T14:59:03.678Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.3)
bio/cnvkit/batch/wrapper.py
91-91: Undefined name snakemake
(F821)
96-96: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: testing
- GitHub Check: docs
- GitHub Check: Summary
use file.endswith(suffix) to find suffix in file - more robust; remove redundant dots in the beginning of "antitargetcoverage.cnn" and "targetcoverage.cnn"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
bio/cnvkit/batch/wrapper.py (4)
70-78: Consider adding leading dots to generic extension suffixes.The suffixes
"cnr"and"cns"on lines 73 and 76 lack leading dots, which meansfile.endswith("cnr")would match unusual filenames like"mycnr"or"samplecnr". While CNVkit likely doesn't generate such names, explicit extension matching is more robust.Consider this change for clarity:
file_map = { "antitargetcoverage.cnn": "antitarget_coverage", "bintest.cns": "bins", - "cnr": "regions", + ".cnr": "regions", "call.cns": "segments_called", "targetcoverage.cnn": "target_coverage", - "cns": "segments", + ".cns": "segments", "reference.cnn": "reference", }
87-89: Ambiguity check could be more precise.The substring matching with
any(x in file for x in ["call.cns", "bintest.cns"])works for typical CNVkit outputs but could theoretically match unexpected patterns (e.g., a filename containing"call.cns"in the middle).For stricter matching, consider:
if attr == "segments" and any( - x in file for x in ["call.cns", "bintest.cns"] + file.endswith(x) for x in ["call.cns", "bintest.cns"] ):
98-98: Add explicitstrict=Trueto prevent silent mismatches.The
zip()withoutstrict=Truewill silently truncate ifsourcesanddestinationshave different lengths, potentially leaving files uncopied or outputs missing. This was flagged in previous review rounds.If Python 3.10+ is available:
- for source, destination in zip(sources, destinations): + for source, destination in zip(sources, destinations, strict=True):Otherwise, add an explicit check before the loop:
+ if len(sources) != len(destinations): + raise ValueError( + f"Mismatch: {len(sources)} source files but {len(destinations)} destinations. " + f"Sources: {sources}, Destinations: {destinations}" + ) for source, destination in zip(sources, destinations):🧰 Tools
🪛 Ruff (0.14.3)
98-98:
zip()without an explicitstrict=parameterAdd explicit value for parameter
strict=(B905)
66-106: Consider usingmove_filesfrom snakemake-wrapper-utils.As reviewer fgvieira suggested, the snakemake-wrapper-utils library provides a
move_filesfunction designed for this exact scenario. Using it would simplify the code, reduce maintenance burden, and leverage well-tested utility logic.Reference example from bio/tmb/pyeffgenomesize/wrapper.py:
from snakemake_wrapper_utils.util import move_files # After cnvkit runs in tmpdirname: move_files(tmpdirname, snakemake.output)This would eliminate the need for custom suffix mapping, ambiguity checks, and manual file pairing. Based on learnings from reviewer feedback.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
bio/cnvkit/batch/wrapper.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
bio/cnvkit/batch/wrapper.py
**/wrapper.py
⚙️ CodeRabbit configuration file
Do not complain about use of undefined variable called
snakemake.
Files:
bio/cnvkit/batch/wrapper.py
🧠 Learnings (17)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:1-10
Timestamp: 2024-11-26T10:49:04.406Z
Learning: Using generic output filenames like "genome.fasta" is acceptable in the `snakemake-wrappers` project.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-12-06T14:25:43.922Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T14:59:03.678Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:04.406Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:1-10
Timestamp: 2024-11-26T10:49:04.406Z
Learning: Using generic output filenames like "genome.fasta" is acceptable in the `snakemake-wrappers` project.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:31:00.099Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:35:42.140Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:23-26
Timestamp: 2024-11-26T08:35:42.140Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file (Python), we rely on Samtools to handle input validation for the depth file provided by the user, so additional file existence checks are not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T09:08:06.041Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:30:23.818Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:27-28
Timestamp: 2024-11-26T08:30:23.818Z
Learning: In Snakemake wrappers (e.g., `wrapper.py` files), it's unnecessary to verify the availability of tools like `mtnucratio` within the code, because Snakemake with Conda ensures that the required tools are installed and available.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T09:16:39.570Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-15T13:48:33.759Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/preprocess-variants/wrapper.py:0-0
Timestamp: 2024-11-15T13:48:33.759Z
Learning: In Snakemake wrappers, security considerations like input sanitization are unnecessary, as the wrappers are under full control of the user.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2025-01-30T14:15:32.176Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3432
File: bio/reference/ensembl-sequence/wrapper.py:64-70
Timestamp: 2025-01-30T14:15:32.176Z
Learning: In Snakemake wrappers that download multiple chromosomes into a single output file, if any chromosome download fails, the entire operation should fail immediately (using `break`) rather than continuing with remaining chromosomes, as partial output would be invalid.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-08-21T08:33:50.878Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:33:50.878Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:54.765Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:58-71
Timestamp: 2024-11-26T10:49:54.765Z
Learning: In test Snakefiles within the snakemake-wrappers repository, it is acceptable to use simplified paths and logging configurations that may differ from real-life pipelines.
Applied to files:
bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.3)
bio/cnvkit/batch/wrapper.py
91-91: Undefined name snakemake
(F821)
98-98: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: testing
- GitHub Check: docs
🔇 Additional comments (1)
bio/cnvkit/batch/wrapper.py (1)
91-94: Good handling of optional outputs.The logic correctly skips files whose corresponding output attributes are not defined in the Snakemake rule, avoiding cryptic
copy2()failures while allowing CNVkit to produce optional outputs.Note: The Ruff F821 warning about undefined
snakemakeis expected per the coding guidelines, as this variable is injected at runtime by Snakemake.
to mv cnvkit-generated files to snakemake-specified output. add leading dots to generic extension suffixes. rm not needed imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
bio/cnvkit/batch/wrapper.py (2)
64-76: Well-designed suffix mapping structure.The file discovery and mapping approach is a significant improvement over the previous implementation. The suffix keys are intentionally mixed (some with leading dots like
.cnr,.cnsfor generic extensions; others without likecall.cns,bintest.cnsfor specific patterns). This design correctly prioritizes more-specific matches when used withendswith().Consider adding a brief comment explaining the suffix design rationale:
# Mapping of file suffixes to snakemake.output attributes # More specific suffixes (call.cns, bintest.cns) are checked before generic ones (.cns) file_map = { ... }
78-90: Robust matching logic with proper ambiguity handling.The matching logic correctly uses
endswith()for suffix matching (resolving the past review concern about substring matching). The ambiguity check for segments (lines 85-88) is a defensive measure that prevents incorrect classification when filenames contain both specific and generic patterns.Consider adding a comment to explain the ambiguity check:
# Skip ambiguous matches - if checking generic .cns but file contains # more specific patterns (call.cns or bintest.cns), don't match here if attr == "segments" and any( x in file for x in ["call.cns", "bintest.cns"] ): continue
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
bio/cnvkit/batch/environment.yaml(1 hunks)bio/cnvkit/batch/wrapper.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
bio/cnvkit/batch/wrapper.py
**/wrapper.py
⚙️ CodeRabbit configuration file
Do not complain about use of undefined variable called
snakemake.
Files:
bio/cnvkit/batch/wrapper.py
🧠 Learnings (27)
📓 Common learnings
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:30:42.757Z
Learning: In `wrapper.py` scripts, do not flag the use of an undefined variable called `snakemake`.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In `wrapper.py` scripts, do not flag the use of an undefined variable called `snakemake`.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:27-28
Timestamp: 2024-11-26T08:30:23.818Z
Learning: In Snakemake wrappers (e.g., `wrapper.py` files), it's unnecessary to verify the availability of tools like `mtnucratio` within the code, because Snakemake with Conda ensures that the required tools are installed and available.
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:33:50.878Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-10-08T17:41:54.542Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/preprocess-variants/wrapper.py:0-0
Timestamp: 2024-11-15T13:48:33.759Z
Learning: In Snakemake wrappers, security considerations like input sanitization are unnecessary, as the wrappers are under full control of the user.
📚 Learning: 2025-04-17T09:24:51.738Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 0
File: :0-0
Timestamp: 2025-04-17T09:24:51.738Z
Learning: In snakemake-wrappers repository, environment.yaml files should follow these conventions:
1. Use whitespace before the equal sign in version specifications (e.g., "datavzrd =2.53.1")
2. Only specify the exact version for the main software package
3. Don't add version constraints for dependencies unless absolutely necessary
4. See full guidelines at: https://snakemake-wrappers.readthedocs.io/en/stable/contributing.html#environment-yaml-file
Applied to files:
bio/cnvkit/batch/environment.yaml
📚 Learning: 2025-02-11T12:24:22.592Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3648
File: bio/nanosim/simulator/environment.yaml:6-6
Timestamp: 2025-02-11T12:24:22.592Z
Learning: In the nanosim bioconda recipe, dependencies are carefully managed with specific version pins (e.g., scikit-learn ~=0.23.2) to ensure compatibility with pre-trained models. These dependencies don't need to be explicitly added to environment.yaml files when the main package is listed as a dependency, as they are handled through the bioconda recipe system.
Applied to files:
bio/cnvkit/batch/environment.yaml
📚 Learning: 2025-04-17T09:24:51.738Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 0
File: :0-0
Timestamp: 2025-04-17T09:24:51.738Z
Learning: In snakemake-wrappers repository, environment.yaml files should follow these conventions:
1. Use whitespace before the equal sign in version specifications (e.g., "datavzrd =2.53.1")
2. Only specify the exact version for the main software package
3. Don't add version constraints for dependencies unless absolutely necessary
4. See guidelines at: https://snakemake-wrappers.readthedocs.io/en/stable/contributing.html#environment-yaml-file
Applied to files:
bio/cnvkit/batch/environment.yaml
📚 Learning: 2024-11-26T08:30:23.818Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:27-28
Timestamp: 2024-11-26T08:30:23.818Z
Learning: In Snakemake wrappers (e.g., `wrapper.py` files), it's unnecessary to verify the availability of tools like `mtnucratio` within the code, because Snakemake with Conda ensures that the required tools are installed and available.
Applied to files:
bio/cnvkit/batch/environment.yamlbio/cnvkit/batch/wrapper.py
📚 Learning: 2025-06-02T07:56:35.854Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 4159
File: bio/pyfaidx/environment.yaml:6-6
Timestamp: 2025-06-02T07:56:35.854Z
Learning: In the Snakemake-wrapper repository, conda dependency version pinning in environment.yaml files uses spaces around the equals sign (e.g., `- pyfaidx =0.8.1.4`) as the established coding standard, even though conda itself doesn't require the spaces.
Applied to files:
bio/cnvkit/batch/environment.yaml
📚 Learning: 2025-02-11T12:24:22.592Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3648
File: bio/nanosim/simulator/environment.yaml:6-6
Timestamp: 2025-02-11T12:24:22.592Z
Learning: For bioconda packages like nanosim, dependencies like scikit-learn are managed through the bioconda recipe and don't need to be explicitly added to the environment.yaml file when the main package is listed as a dependency.
Applied to files:
bio/cnvkit/batch/environment.yaml
📚 Learning: 2024-08-21T08:30:42.757Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:30:42.757Z
Learning: In `wrapper.py` scripts, do not flag the use of an undefined variable called `snakemake`.
Applied to files:
bio/cnvkit/batch/environment.yamlbio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:54.765Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:58-71
Timestamp: 2024-11-26T10:49:54.765Z
Learning: In test Snakefiles within the snakemake-wrappers repository, it is acceptable to use simplified paths and logging configurations that may differ from real-life pipelines.
Applied to files:
bio/cnvkit/batch/environment.yamlbio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-15T18:32:07.612Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:11-12
Timestamp: 2024-11-15T18:32:07.612Z
Learning: In the `snakemake-wrappers` repository, when developing wrappers, it's acceptable to use "master" as the wrapper version in the `wrapper` directive. This ensures that the documentation always reflects the latest stable version of the wrappers.
Applied to files:
bio/cnvkit/batch/environment.yaml
📚 Learning: 2024-08-21T08:33:50.878Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-08-21T08:33:50.878Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.
Applied to files:
bio/cnvkit/batch/environment.yamlbio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:04.406Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:1-10
Timestamp: 2024-11-26T10:49:04.406Z
Learning: In the `snakemake-wrappers` project, it's acceptable to use "master" in the wrapper path instead of pinning to a specific version.
Applied to files:
bio/cnvkit/batch/environment.yaml
📚 Learning: 2024-11-15T13:48:33.759Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/preprocess-variants/wrapper.py:0-0
Timestamp: 2024-11-15T13:48:33.759Z
Learning: In Snakemake wrappers, security considerations like input sanitization are unnecessary, as the wrappers are under full control of the user.
Applied to files:
bio/cnvkit/batch/environment.yamlbio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.
Applied to files:
bio/cnvkit/batch/environment.yamlbio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-15T18:31:15.447Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/test/Snakefile:7-10
Timestamp: 2024-11-15T18:31:15.447Z
Learning: In the Snakemake wrappers repository, avoid suggesting refactoring that involves using `tempfile.gettempdir()` or changing output paths to temporary directories.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-12-06T14:25:43.922Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3498
File: bio/ngscheckmate/ncm/wrapper.py:68-68
Timestamp: 2024-12-06T14:25:43.922Z
Learning: In the `bio/ngscheckmate/ncm/wrapper.py` file for the NGSCheckMate wrapper, do not suggest adding file existence checks for the list file input, as the files are already required by the wrapper.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T14:59:03.678Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:29:27.546Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:21-25
Timestamp: 2024-11-26T08:29:27.546Z
Learning: In `bio/mtnucratio/wrapper.py`, when creating symlinks within a temporary directory, it's acceptable to rely on Python's built-in error handling for I/O permissions and availability. Explicit try-except blocks around symlink creation may not be necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:33:32.457Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:0-0
Timestamp: 2024-11-26T08:33:32.457Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file for the Sex.DetERRmine wrapper, missing input/output errors are raised by the shell command and captured in the logs. Therefore, additional error handling for file operations is not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:35:42.140Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:23-26
Timestamp: 2024-11-26T08:35:42.140Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file (Python), we rely on Samtools to handle input validation for the depth file provided by the user, so additional file existence checks are not necessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T09:08:06.041Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/wrapper.py:18-24
Timestamp: 2024-11-26T09:08:06.041Z
Learning: In Snakemake wrappers, input file validation is managed by Snakemake, so manual validation in the wrapper code is unnecessary.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T09:16:39.570Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3499
File: bio/ngscheckmate/makesnvpattern/test/Snakefile:14-22
Timestamp: 2024-11-26T09:16:39.570Z
Learning: In the `snakemake-wrappers` repository, when writing test `Snakefile`s (e.g., `bio/ngscheckmate/makesnvpattern/test/Snakefile`), hardcoded input/output paths are acceptable because these are examples and the IO can be chosen freely.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T08:31:00.099Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2025-01-30T14:15:32.176Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3432
File: bio/reference/ensembl-sequence/wrapper.py:64-70
Timestamp: 2025-01-30T14:15:32.176Z
Learning: In Snakemake wrappers that download multiple chromosomes into a single output file, if any chromosome download fails, the entire operation should fail immediately (using `break`) rather than continuing with remaining chromosomes, as partial output would be invalid.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-26T10:49:04.406Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3501
File: meta/bio/varscan2_snpeff/test/Snakefile:1-10
Timestamp: 2024-11-26T10:49:04.406Z
Learning: Using generic output filenames like "genome.fasta" is acceptable in the `snakemake-wrappers` project.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2025-01-30T14:14:03.928Z
Learnt from: dlaehnemann
Repo: snakemake/snakemake-wrappers PR: 3432
File: bio/reference/ensembl-sequence/wrapper.py:64-70
Timestamp: 2025-01-30T14:14:03.928Z
Learning: In Snakemake wrappers, logging should be done using `snakemake.logging.logger`, not `shell.logger`.
Applied to files:
bio/cnvkit/batch/wrapper.py
📚 Learning: 2024-11-15T18:36:04.660Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/call-variants/wrapper.py:15-23
Timestamp: 2024-11-15T18:36:04.660Z
Learning: In the Snakemake wrappers repository, using `shell=True` and redirecting within shell commands is acceptable.
Applied to files:
bio/cnvkit/batch/wrapper.py
🪛 Ruff (0.14.3)
bio/cnvkit/batch/wrapper.py
13-13: Undefined name snakemake
(F821)
92-92: Loop control variable file not used within loop body
(B007)
92-92: Undefined name snakemake
(F821)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: testing
- GitHub Check: docs
- GitHub Check: Summary
🔇 Additional comments (3)
bio/cnvkit/batch/environment.yaml (1)
7-7: LGTM - Dependency addition follows conventions.The addition of
snakemake-wrapper-utils =0.8.0correctly follows the established format with proper spacing around the equals sign and supports themove_filesutility used in the wrapper refactoring.bio/cnvkit/batch/wrapper.py (2)
10-13: Good refactoring - follows reviewer suggestion.The import of
move_filesfromsnakemake-wrapper-utilsaligns with reviewer fgvieira's suggestion to reuse existing utility code. The logging change to capture both stdout and stderr with append mode improves observability.
92-93: Correct usage of move_files utility.The implementation correctly uses the
move_filesutility as suggested by the reviewer. The loop variablefilecontains shell commands that are interpolated byshell()through variable substitution, so Ruff's "unused variable" warning (B007) is a false positive—the variable is used within the shell command string.The
move_filesutility should handle edge cases (such as missing output attributes or extra generated files) according to the snakemake-wrapper-utils design.
| for file in temp_files: | ||
| for suffix, attr in file_map.items(): | ||
| if file.endswith(suffix): | ||
| # Skip ambiguous matches | ||
| if attr == "segments" and any( | ||
| x in file for x in ["call.cns", "bintest.cns"] | ||
| ): | ||
| continue | ||
| mapping[attr] = join(tmpdirname, file) | ||
| break # stop after first match |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if an output was not specified in the rule output files? As far as I can see, it would still be added to the mapping and it would fail on move_files, no?
Maybe something like:
| for file in temp_files: | |
| for suffix, attr in file_map.items(): | |
| if file.endswith(suffix): | |
| # Skip ambiguous matches | |
| if attr == "segments" and any( | |
| x in file for x in ["call.cns", "bintest.cns"] | |
| ): | |
| continue | |
| mapping[attr] = join(tmpdirname, file) | |
| break # stop after first match | |
| for suffix, attr in file_map.items(): | |
| if not snakemake.output.get(attr): | |
| continue | |
| for file in temp_files: | |
| if file.endswith(suffix): | |
| # Skip ambiguous matches | |
| if attr == "segments" and any( | |
| x in file for x in ["call.cns", "bintest.cns"] | |
| ): | |
| continue | |
| mapping[attr] = join(tmpdirname, file) | |
| break # stop after first match |
Fixes issue #4680
Updates the output file copying logic to correctly handle BAM files whose names contain intermediate words/tags (like
{sample}.haplotagged.bam).QC
snakemake-wrappers.While the contributions guidelines are more extensive, please particularly ensure that:
test.pywas updated to call any added or updated example rules in aSnakefileinput:andoutput:file paths in the rules can be chosen arbitrarilyinput:oroutput:)tempfile.gettempdir()points tometa.yamlcontains a link to the documentation of the respective tool or command underurl:Summary by CodeRabbit
Bug Fixes
Refactor
Chores