cli: add option to not get the all-altloc selection string from find_altloc_selections.py #221

Draft
k-chrispens wants to merge 3 commits into main from kmc/add-altloc-selection-option

Conversation

@k-chrispens
Collaborator

@k-chrispens k-chrispens commented Apr 13, 2026

Summary by CodeRabbit

  • New Features

    • Added --no-all-altlocs CLI flag to control inclusion of per-chain altloc residue selections (enabled by default).
  • Documentation

    • Updated formatting and line wrapping throughout evaluation documentation for improved readability.
  • Chores

    • Updated tool configuration ordering in project settings.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 13, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c3799260-f8c2-4367-9426-a888eb029bda

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR adds a new include_all_altlocs boolean parameter that controls whether the per-chain aggregate selections covering all altloc residues are included in the output. The parameter is threaded from the CLI script into the core utility function, which applies it conditionally. Ancillary formatting and configuration changes are also included.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Formatting & Configuration: docker-entrypoint.sh, pyproject.toml, scripts/eval/EVALUATION.md | Removed trailing whitespace, reordered tool configuration entries, and normalized documentation line wrapping without functional changes. |
| CLI Parameter Threading: scripts/eval/find_altloc_selections.py | Added --no-all-altlocs CLI flag that sets include_all_altlocs=False, updated _process_row() signature to accept this parameter and forward it to the underlying utility function. |
| Core Utility Implementation: src/sampleworks/utils/cif_utils.py | Updated find_altloc_selections() signature with new include_all_altlocs boolean parameter (default True), conditionally building and yielding per-chain aggregate selections only when enabled. |
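
For readers unfamiliar with negative flags, the CLI threading summarized in the table can be sketched as follows. This is a minimal illustration that assumes the script uses argparse, which the PR page does not confirm; `build_parser` is a hypothetical helper name, and only the `--no-all-altlocs` flag and the `include_all_altlocs` parameter name come from the PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only --no-all-altlocs is taken from the PR."""
    parser = argparse.ArgumentParser(description="Sketch of the altloc CLI flag")
    # store_false makes the destination default to True, so the aggregate
    # per-chain selection stays enabled unless the flag is passed.
    parser.add_argument(
        "--no-all-altlocs",
        dest="include_all_altlocs",
        action="store_false",
        help="Do not emit the per-chain selection covering all altloc residues.",
    )
    return parser


default_args = build_parser().parse_args([])
disabled_args = build_parser().parse_args(["--no-all-altlocs"])
```

With `action="store_false"` and an explicit `dest`, the parsed value can be forwarded directly as the `include_all_altlocs` argument without inverting it at the call site.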

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A parameter hops through the code,
Threading its way down the path trod,
Altlocs included or cast to the side,
With include_all_altlocs as guide—
Config is tidy, the logic runs bright! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a CLI option to exclude the all-altloc selection string from find_altloc_selections.py, which is reflected in the --no-all-altlocs flag and include_all_altlocs parameter additions across multiple files. |


Contributor

Copilot AI left a comment


Pull request overview

Adds a CLI-controlled option to suppress the “all altloc residues per chain” selection emitted by find_altloc_selections(), enabling workflows that only want span-based selections.

Changes:

  • Extend find_altloc_selections() with include_all_altlocs to optionally omit the final per-chain “all altlocs” selection.
  • Add --no-all-altlocs to scripts/eval/find_altloc_selections.py to expose the behavior via CLI.
  • Minor formatting/maintenance updates (docs whitespace, ty rule ordering, lockfile hash, trailing whitespace).
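
The gating behavior described in the first bullet can be sketched with a simplified stand-in. This is not the code from cif_utils.py: `sketch_altloc_selections` is a hypothetical name, it takes precomputed `(chain, start, end)` spans instead of reading a CIF file, and the format of the final aggregate string is an assumption, since the diff on this page does not show it. The per-span string and the `res_id` clauses follow the diff quoted in the review.

```python
from collections.abc import Iterable, Iterator


def sketch_altloc_selections(
    spans: Iterable[tuple[str, int, int]],
    min_span: int = 2,
    include_all_altlocs: bool = True,
) -> Iterator[str]:
    """Simplified stand-in for the gating logic (hypothetical helper)."""
    per_chain: dict[str, list[str]] = {}
    for chain, start, end in spans:
        # Spans shorter than min_span are not yielded individually.
        if end - start + 1 >= min_span:
            yield f"chain {chain} and resi {start}-{end}"
        # Every span, short or long, feeds the aggregate only when enabled.
        if include_all_altlocs:
            if start == end:
                clause = f"(res_id == {start})"
            else:
                clause = f"(res_id >= {start} and res_id <= {end})"
            per_chain.setdefault(chain, []).append(clause)
    # Assumed aggregate format; per_chain is empty when the flag is off.
    for chain, clauses in per_chain.items():
        yield f"chain_id == '{chain}' and ({' or '.join(clauses)})"


spans = [("A", 10, 14), ("A", 20, 20)]
with_all = list(sketch_altloc_selections(spans, min_span=3))
without_all = list(
    sketch_altloc_selections(spans, min_span=3, include_all_altlocs=False)
)
```

In this sketch the single-residue span at 20 appears only in the aggregate string, so passing `include_all_altlocs=False` drops it from the output entirely, which is the span-only workflow the overview mentions.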

Reviewed changes

Copilot reviewed 3 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| src/sampleworks/utils/cif_utils.py | Adds include_all_altlocs flag and gates emission of the final per-chain selection. |
| scripts/eval/find_altloc_selections.py | Wires CLI flag through to find_altloc_selections() and updates row processing. |
| scripts/eval/EVALUATION.md | Whitespace/formatting cleanup only. |
| pyproject.toml | Reorders tool.ty.rules entries (no functional behavior change expected). |
| pixi.lock | Updates local package hash due to changes. |
| docker-entrypoint.sh | Removes trailing whitespace in help text. |


Comment on lines 42 to +46
         Spans of altlocs shorter than this are not yielded as selection strings, but ARE
         included in the final selections which includes all residues with altlocs in each chain.
+    include_all_altlocs : bool
+        If True (default), yield a final per-chain selection string containing all residues
+        with altlocs regardless of span length.
Comment on lines 21 to +87
@@ -38,6 +41,9 @@ def find_altloc_selections(
         Minimum number of consecutive residues to consider an altloc selection.
         Spans of altlocs shorter than this are not yielded as selection strings, but ARE
         included in the final selections which includes all residues with altlocs in each chain.
+    include_all_altlocs : bool
+        If True (default), yield a final per-chain selection string containing all residues
+        with altlocs regardless of span length.
 
     Yields
     ------
@@ -72,12 +78,13 @@ def find_altloc_selections(
         # FIXME use new style selection https://github.com/diff-use/sampleworks/issues/56
         yield f"chain {chain} and resi {start}-{end}"  # old style, more compact, selection
 
-        if chain not in all_altloc_selections:
-            all_altloc_selections[chain] = []
-        if start == end:
-            all_altloc_selections[chain].append(f"(res_id == {start})")
-        else:
-            all_altloc_selections[chain].append(f"(res_id >= {start} and res_id <= {end})")
+        if include_all_altlocs:
+            if chain not in all_altloc_selections:
+                all_altloc_selections[chain] = []
+            if start == end:
+                all_altloc_selections[chain].append(f"(res_id == {start})")
+            else:
+                all_altloc_selections[chain].append(f"(res_id >= {start} and res_id <= {end})")

        find_altloc_selections(cif_file, altloc_label, min_span, include_all_altlocs)
    )
    if not selections:
        logger.warning(f"No altlocs found for {cif_file}")
Contributor

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sampleworks/utils/cif_utils.py (1)

40-46: ⚠️ Potential issue | 🟡 Minor

Docstring now overstates short-span inclusion behavior.

The min_span description still reads as unconditional inclusion in final selections, but this is now conditional on include_all_altlocs=True. Please align this text to prevent API confusion.

📝 Proposed docstring fix
     min_span : int
         Minimum number of consecutive residues to consider an altloc selection.
-        Spans of altlocs shorter than this are not yielded as selection strings, but ARE
-        included in the final selections which includes all residues with altlocs in each chain.
+        Spans shorter than this are not yielded as individual span selections.
+        When ``include_all_altlocs`` is True, they are still included in the final
+        per-chain aggregate selections.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sampleworks/utils/cif_utils.py` around lines 40 - 46, Update the
docstring for the parameters min_span and include_all_altlocs in
src/sampleworks/utils/cif_utils.py to clarify behavior: state that spans of
altlocs shorter than min_span are not yielded as selection strings, and that
those short spans will only be included in the final per-chain selection string
if include_all_altlocs is True; mention both parameter names (min_span,
include_all_altlocs) so the maintainer can locate the docstring to adjust the
wording accordingly.
🧹 Nitpick comments (1)
scripts/eval/find_altloc_selections.py (1)

9-11: Add a NumPy-style docstring to _process_row().

This function is modified in this PR but still lacks a NumPy-style docstring, and it has an observable side effect (warning log when selections are empty).

📚 Proposed docstring addition
 def _process_row(
     row: pd.Series, altloc_label: str, min_span: int, include_all_altlocs: bool
 ) -> pd.Series:
+    """Convert one input row into the output selection schema.
+
+    Parameters
+    ----------
+    row : pd.Series
+        Input row with structure and map metadata.
+    altloc_label : str
+        CIF altloc field name.
+    min_span : int
+        Minimum span length for yielded altloc segments.
+    include_all_altlocs : bool
+        Whether to include per-chain aggregate altloc selections.
+
+    Returns
+    -------
+    pd.Series
+        Output row used by downstream evaluation scripts.
+
+    Notes
+    -----
+    Logs a warning when no altloc selection is found.
+    """

As per coding guidelines, "Always include NumPy-style docstrings for every function and class."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/eval/find_altloc_selections.py` around lines 9 - 11, Add a
NumPy-style docstring to the function _process_row describing its purpose,
parameters (row: pd.Series, altloc_label: str, min_span: int,
include_all_altlocs: bool), return type (pd.Series) and behavior; explicitly
document the observable side effect that it may emit a warning log when
selections are empty and any exceptions or edge cases (e.g., empty inputs or
filtered results). Keep the docstring in NumPy style with short summary,
Parameters, Returns, and Notes/Warnings sections and reference the function's
behavior on empty selections so callers know about the logging side effect.
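
The logging side effect this nitpick asks to document can be illustrated with a small stand-in. This is a sketch under assumptions: `process_row_sketch` is a hypothetical simplification of `_process_row()` that takes a file path and a finder callable instead of a `pd.Series`, and it uses the standard library `logging` module, whereas the logger used in the repository is not shown on this page. The warning text mirrors the one visible in the diff.

```python
import logging
from collections.abc import Callable, Iterable

logger = logging.getLogger("altloc_sketch")


def process_row_sketch(
    cif_file: str, find_selections: Callable[[str], Iterable[str]]
) -> list[str]:
    """Collect selections for one file (hypothetical stand-in).

    Notes
    -----
    Logs a warning when no altloc selection is found.
    """
    selections = list(find_selections(cif_file))
    if not selections:
        logger.warning("No altlocs found for %s", cif_file)
    return selections


class ListHandler(logging.Handler):
    """Collects formatted log messages so the side effect can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)

empty = process_row_sketch("example.cif", lambda path: [])
found = process_row_sketch("example.cif", lambda path: ["chain A and resi 10-14"])
```

Only the first call adds a message to `handler.messages`; the second returns its selections without logging, which is the behavior a Notes section in the docstring would describe.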

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8f537b4e-5d11-48a4-8fdc-9852fde03f03

📥 Commits

Reviewing files that changed from the base of the PR and between f57044e and 17e624f.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • docker-entrypoint.sh
  • pyproject.toml
  • scripts/eval/EVALUATION.md
  • scripts/eval/find_altloc_selections.py
  • src/sampleworks/utils/cif_utils.py

@k-chrispens k-chrispens requested a review from DorisMai April 14, 2026 04:35
@k-chrispens k-chrispens marked this pull request as draft April 14, 2026 04:37
@k-chrispens
Collaborator Author

I will add some tests, converting to draft
