feat: RSCC eval updates#224

Open
k-chrispens wants to merge 4 commits into main from kmc/rscc-eval-updates

Conversation

@k-chrispens (Collaborator) commented Apr 20, 2026

Adds a script for finding the maximum-RMSD subsegment among altlocs longer than a given length, plus speedups for the RSCC grid-search script.

This depends on #223 and should be merged only after it.

Summary by CodeRabbit

Release Notes

  • New Features

    • New evaluation scripts for analyzing alternate location regions and computing maximum RMSD subsegments.
    • Implemented parallel processing for RSCC grid search computations to improve performance.
    • Refactored density computation utilities with reusable transformer components.
  • Tests

    • Added integration test suite and fixtures for evaluation scripts.
  • Chores

    • Added joblib dependency for parallel processing support.

Copilot AI review requested due to automatic review settings April 20, 2026 20:11
@k-chrispens k-chrispens marked this pull request as draft April 20, 2026 20:11
@coderabbitai coderabbitai Bot (Contributor) commented Apr 20, 2026

📝 Walkthrough

This PR extends protein structure evaluation capabilities by introducing new shared utilities for handling alternate conformations (altlocs), refactoring the density computation pipeline, adding two new evaluation scripts for classifying and analyzing altloc regions, parallelizing RSCC grid-search computations, and supporting integration tests. Dependencies include the addition of joblib for parallel processing.

Changes

  • Agent Documentation (@AGENTS.md): New file introducing agent-related repository content.
  • Altloc Utilities (src/sampleworks/utils/atom_array_utils.py): New build_pairwise_altloc_arrays() helper constructs filtered atom-pair arrays for every unordered altloc pair combination, with pair-level robustness to missing common atoms.
  • Density Pipeline Refactoring (src/sampleworks/utils/density_utils.py): Extracted density transformer construction and execution into new public helpers build_density_transformer() and run_density_transformer() to enable reuse across repeated computations on the same grid; compute_density_from_atomarray() now delegates to these helpers.
  • Evaluation Scripts — Classification (scripts/eval/classify_altloc_regions.py): New script classifies altloc regions using the shared build_pairwise_altloc_arrays() utility and lDDT-based selection logic; simplified per-pair construction and updated type signatures for optional pairwise arrays.
  • Evaluation Scripts — RMSD Analysis (scripts/eval/find_max_rmsd_subsegment.py): New script identifies maximum all-atom RMSD windows across contiguous residue subsegments within selections, filtering out windows with compositional heterogeneity and tracking optimal window bounds, RMSD values, and altloc pairs.
  • Evaluation Scripts — RSCC Parallelization (scripts/eval/rscc_grid_search_script.py): Refactored to group processing by (protein, occupancy_key) and parallelize via joblib; replaced monolithic density computation with the new transformer pipeline; introduced group-local base-map caching; improved per-selection error isolation and validation of density array shapes.
  • Test Fixtures (tests/eval/conftest.py): New shared pytest fixtures defining the RsccFixture dataclass and a factory for dynamically creating deterministic RSCC evaluation directory structures with symlinked CCP4/CIF resources and generated configuration files.
  • Integration Tests (tests/eval/test_rscc_grid_search_script.py): New integration tests validating end-to-end RSCC grid-search execution, schema correctness, per-selection error handling, group processing ordering, transformer caching semantics, and immutability of cached xmap arrays.
  • Unit Tests (tests/utils/test_density_utils.py): New tests exercise the split density transformer helpers, validating functional equivalence with the existing wrapper and verifying deterministic repeated execution.
  • Dependencies (pyproject.toml): Added joblib to runtime dependencies for parallel group processing.
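The group-then-parallelize pattern described for rscc_grid_search_script.py can be sketched as follows. This is an illustrative stand-in, not the script's actual API: the group key matches the PR description, but the function names, row format, and worker body are assumptions.

```python
from joblib import Parallel, delayed


def process_group(key, rows):
    # One worker handles every trial sharing a (protein, occupancy_key)
    # pair, so the base map and transformer are built once per group.
    protein, occ_key = key
    return [f"{protein}/{occ_key}/{row}" for row in rows]


def run_groups(trials):
    # Group trials by (protein, occupancy_key) before dispatching.
    groups = {}
    for protein, occ_key, trial in trials:
        groups.setdefault((protein, occ_key), []).append(trial)
    # Each group is independent; joblib returns results in submission order.
    results = Parallel(n_jobs=2)(
        delayed(process_group)(key, rows) for key, rows in groups.items()
    )
    return [row for group in results for row in group]
```

Because workers are dispatched per group rather than per trial, expensive setup (map loading, transformer construction) amortizes over all trials in the group.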

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Main as rscc_grid_search_script.main()
    participant JobLib as joblib.Parallel
    participant Group as process_group()
    participant Setup as Transformer Setup
    participant Trial as Per-Trial Loop
    participant Select as Per-Selection Loop
    participant Output as Write CSV

    User->>Main: invoke with input CSV
    Main->>Main: load protein_configs.csv<br/>group by (protein, occ_key)
    Main->>JobLib: submit groups in parallel
    
    loop for each (protein, occ_key) group
        JobLib->>Group: process_group()
        Group->>Setup: load base_xmap<br/>build_density_transformer()
        Setup->>Setup: parse reference atoms<br/>initialize group cache
        
        loop for each trial in group
            Trial->>Trial: parse/refine alignment
            Trial->>Trial: run_density_transformer()<br/>compute full density once
            
            loop for each selection in trial
                Select->>Select: extract base/computed<br/>sub-regions (cached)
                Select->>Select: compute RSCC
                Select->>Trial: append result row<br/>(handle per-selection errors)
            end
        end
        
        Group->>JobLib: return trial results
    end
    
    JobLib->>Main: collect all group results
    Main->>Output: write rscc_results.csv
    Output->>User: return output path
sequenceDiagram
    actor User
    participant Main as find_max_rmsd_subsegment.main()
    participant LoadStruct as Load Structure
    participant BuildPair as build_pairwise_altloc_arrays()
    participant Window as _find_max_rmsd_window()
    participant RMSD as Compute RMSD
    participant Output as Write CSV

    User->>Main: invoke with input CSV
    Main->>Main: load protein_configs.csv
    
    loop for each protein
        Main->>LoadStruct: resolve CIF path<br/>load structure with altlocs
        LoadStruct->>BuildPair: extract all altloc IDs
        BuildPair->>BuildPair: construct filtered atom pairs<br/>for every unordered pair
        BuildPair->>Window: return pair_arrays dict
        
        loop for each selection in protein
            Window->>Window: parse chain/residue span<br/>enumerate windows
            
            loop for each altloc pair
                loop for each window in span
                    RMSD->>RMSD: check compositional<br/>heterogeneity
                    alt no heterogeneity
                        RMSD->>RMSD: compute all-atom RMSD<br/>on window atoms
                    end
                    RMSD->>Window: track max RMSD window
                end
            end
            
            Window->>Main: return best window,<br/>max_rmsd, altloc_pair
        end
        
        Main->>Output: aggregate per-protein<br/>selections by semicolon-join
    end
    
    Main->>Output: write output CSV<br/>+ optional diagnostic CSV
    Output->>User: return output paths
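The window search in the second diagram reduces to a sliding-window loop over contiguous residues. A minimal NumPy sketch follows; the real script operates on biotite AtomArray pairs and also applies the compositional-heterogeneity filter, which is omitted here:

```python
import numpy as np


def find_max_rmsd_window(coords_a, coords_b, res_ids, window_size=3):
    """Return (best_residue_window, max_rmsd) over contiguous residue windows.

    coords_a, coords_b : (n_atoms, 3) arrays of paired altloc coordinates.
    res_ids : (n_atoms,) integer residue id per atom, shared by both arrays.
    """
    unique_res = np.unique(res_ids)
    best_window, best_rmsd = None, -np.inf
    for start in range(len(unique_res) - window_size + 1):
        window = unique_res[start:start + window_size]
        mask = np.isin(res_ids, window)
        if not mask.any():
            continue
        diff = coords_a[mask] - coords_b[mask]
        rmsd = float(np.sqrt((diff ** 2).sum(axis=1).mean()))
        if rmsd > best_rmsd:
            best_window, best_rmsd = window, rmsd
    return best_window, best_rmsd
```

The all-atom RMSD is computed per window, and only the best window's bounds and value are tracked, mirroring the "track max RMSD window" step above.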

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • PR #129: Adds the same top-level agents documentation file, representing duplicated or concurrent work on repository metadata.
  • PR #112: Modifies src/sampleworks/utils/density_utils.py to refactor transformer input preparation and execution, overlapping with this PR's density pipeline reorganization.
  • PR #223: Directly related refactoring of scripts/eval/classify_altloc_regions.py to use the new shared build_pairwise_altloc_arrays utility and updated lDDT helpers.

Suggested reviewers

  • marcuscollins

Poem

🐰 Whiskers twitching with computational glee,
Altloc pairs now dance so wild and free,
Transformers split, parallelism shines bright—
RMSD windows find their highest height!
One repository, many conformations to see!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 69.77%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Title Check ❓ Inconclusive: The PR title 'feat: RSCC eval updates' is vague; it references RSCC but omits the other significant changes (new scripts, refactoring, performance improvements). Consider a more descriptive title such as 'feat: Add altloc RMSD analysis and parallelize RSCC grid search'.
✅ Passed checks (3 passed)
  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Linked Issues Check ✅ Passed: check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check ✅ Passed: check skipped because no linked issues were found for this pull request.




@coderabbitai coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 5

🧹 Nitpick comments (1)
scripts/eval/find_max_rmsd_subsegment.py (1)

38-39: Expand these function docstrings to NumPy style and document side effects.

_process_structure() loads CIF data, and main() reads/writes CSV files; those side effects should be explicit in the function docs.

As per coding guidelines, “Always include NumPy-style docstrings for every function and class” and “ALWAYS annotate complex array shapes and note side effects.”

Also applies to: 99-104, 185-185

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/eval/find_max_rmsd_subsegment.py` around lines 38 - 39, Expand the
short docstring for _has_compositional_heterogeneity into a full NumPy-style
docstring: document Parameters (arr_i: np.ndarray, arr_j: np.ndarray — expected
shapes and dtype/contents, mask: np.ndarray[bool] — shape must match
arr_i/arr_j), the Returns (bool) and add a Notes/Side effects section stating
there are no I/O side effects and the function only inspects arrays; do the same
for _process_structure (document parameters, return type, and explicitly state
the side effect that it loads CIF data from disk/parses CIF content and any
mutation it performs on in-memory structure) and for main (document CLI args,
that it reads and writes CSV files, and that it has disk I/O side effects and
exit behavior); also update the other function docstrings called out in the
review (the functions referenced at the other noted ranges) to NumPy-style
including array shapes and explicit side-effect notes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/eval/find_max_rmsd_subsegment.py`:
- Around line 265-267: The CLI currently accepts non-positive --window-size
values which break the sliding-window logic; add a validation after parsing
(before calling main(args)) that checks args.window_size is an integer > 0 and
rejects otherwise (raise SystemExit or parser.error with a clear message).
Reference the parser variable and the main(args) call in
find_max_rmsd_subsegment.py so the guard runs immediately after
parser.parse_args() and prevents running the sliding-window routine with 0 or
negative sizes.
- Around line 81-88: The current check only ensures arr_i has some atoms for the
window, allowing partial residue coverage to be scored; change the logic to
require complete residue coverage for both altloc arrays before computing RMSD:
compute masks for both arr_i and arr_j (e.g., mask_i = (arr_i.chain_id == chain)
& np.isin(arr_i.res_id, window_res) and mask_j = (arr_j.chain_id == chain) &
np.isin(arr_j.res_id, window_res)), then verify that the set/unique res_id
values present in arr_i[mask_i] and arr_j[mask_j] exactly match window_res (or
that the count of unique res_ids equals len(window_res)); if either side is
incomplete, continue and do not call _has_compositional_heterogeneity or
biotite_rmsd on that window.

In `@scripts/eval/rscc_grid_search_script.py`:
- Line 247: The current exception handler "except (FileNotFoundError, OSError,
ValueError, RuntimeError) as e" can let AttributeError or TypeError raised
during parsing/filtering bubble up and abort the whole RSCC run; update that
except tuple to also include AttributeError and TypeError so trial-level
parsing/attribute/type failures are caught and treated as per-selection errors
(emitting the error row and continuing), i.e., change the handler around "except
(FileNotFoundError, OSError, ValueError, RuntimeError) as e" to "except
(FileNotFoundError, OSError, ValueError, RuntimeError, AttributeError,
TypeError) as e" and ensure the existing per-trial error logging/emission logic
is used for these cases.

In `@src/sampleworks/eval/grid_search_eval_utils.py`:
- Around line 27-39: The code builds Path objects from row["structure"] and
row["structure_pattern"] without guarding against pandas NaN (np.nan) which
truth-tests truthy and can cause TypeError; before constructing Path, normalize
and validate these cells: use pandas.isna(row["structure"]) (or isinstance
check) to treat NaN as missing, coerce non-empty values with
str(row["structure"]).strip() and only call Path(...) when that normalized
string is non-empty, and apply the same normalization/validation for
row["structure_pattern"] (variables p, pattern, and usage of cif_root should
remain the same); if both normalized values are missing raise the intended
ValueError with row.to_dict().

In `@src/sampleworks/utils/density_utils.py`:
- Around line 115-120: The code sets use_cuda_kernels based only on host CUDA
availability which can enable CUDA paths even when callers pass
device=torch.device("cpu"); change the call that constructs
DifferentiableTransformer so use_cuda_kernels is true only when the requested
device is CUDA and CUDA is available (e.g., use_cuda_kernels = (device.type ==
"cuda" and torch.cuda.is_available())); update the instantiation at
DifferentiableTransformer(...) and ensure any downstream code (e.g.,
dilate_atom_centric which calls torch.cuda.synchronize) will only run CUDA paths
when that flag reflects the requested device.
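The NaN-normalization described for grid_search_eval_utils.py can be sketched like this. resolve_structure_path is a hypothetical helper, not the module's actual function; only the pd.isna guard and strip-before-Path pattern are the point:

```python
import pandas as pd
from pathlib import Path


def resolve_structure_path(row, cif_root):
    """Return a Path from row['structure'], treating NaN or blank as missing.

    A bare `if row['structure']:` truth-test is unsafe here because
    float('nan') is truthy, and Path(nan) raises TypeError.
    """
    cell = row.get("structure")
    if pd.isna(cell):
        return None
    text = str(cell).strip()
    if not text:
        return None
    return Path(cif_root) / text
```

Applying the same normalization to row["structure_pattern"] lets the caller raise the intended ValueError only when both cells are genuinely missing.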


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eb4437dd-4cc2-47be-b5e4-b7b713559434

📥 Commits

Reviewing files that changed from the base of the PR and between 6e7a8cf and ab2de06.

📒 Files selected for processing (14)
  • .gitignore
  • CLAUDE.md
  • scripts/eval/classify_altloc_regions.py
  • scripts/eval/find_max_rmsd_subsegment.py
  • scripts/eval/rscc_grid_search_script.py
  • src/sampleworks/eval/grid_search_eval_utils.py
  • src/sampleworks/utils/atom_array_utils.py
  • src/sampleworks/utils/density_utils.py
  • tests/eval/conftest.py
  • tests/eval/test_rscc_grid_search_script.py
  • tests/resources/1vme/1VME_0.25occA_0.75occB_1.00A.ccp4
  • tests/resources/1vme/1VME_0.5occA_0.5occB_1.00A.ccp4
  • tests/resources/1vme/1VME_single_001_density_input.cif
  • tests/utils/test_density_utils.py

Comment on lines +81 to +88
            mask = (arr_i.chain_id == chain) & np.isin(arr_i.res_id, window_res)
            if mask.sum() == 0:
                continue

            if _has_compositional_heterogeneity(arr_i, arr_j, mask):
                continue

            rmsd_val = float(biotite_rmsd(arr_i[mask], arr_j[mask]))

⚠️ Potential issue | 🟠 Major

Require complete residue coverage before scoring a window.

Line 82 only checks for any atoms, so a window with missing residues in either altloc can still be scored and later emitted as the full best_res[0]-best_res[-1] range. Skip windows unless both masked arrays contain exactly window_res.

🐛 Proposed fix
             mask = (arr_i.chain_id == chain) & np.isin(arr_i.res_id, window_res)
             if mask.sum() == 0:
                 continue
 
+            res_ids_i, _ = get_residues(arr_i[mask])
+            res_ids_j, _ = get_residues(arr_j[mask])
+            if list(map(int, res_ids_i)) != window_res or list(map(int, res_ids_j)) != window_res:
+                continue
+
             if _has_compositional_heterogeneity(arr_i, arr_j, mask):
                 continue
 
             rmsd_val = float(biotite_rmsd(arr_i[mask], arr_j[mask]))
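The complete-coverage guard this comment asks for, reduced to plain NumPy over residue-id arrays (a sketch under the same logic, not the script's code):

```python
import numpy as np


def window_fully_covered(res_ids_i, res_ids_j, window_res):
    """True only when *both* altloc arrays contain every residue in the window."""
    window = {int(r) for r in window_res}
    present_i = {int(r) for r in np.asarray(res_ids_i)}
    present_j = {int(r) for r in np.asarray(res_ids_j)}
    # Subset test: a window is scoreable only if neither side is missing residues.
    return window <= present_i and window <= present_j
```

Calling this before the RMSD computation skips windows where either altloc has partial residue coverage, instead of scoring them and emitting a misleading best_res range.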

Comment on lines +265 to +267
    parser.add_argument("--window-size", type=int, default=3)
    args = parser.parse_args()
    main(args)

⚠️ Potential issue | 🟡 Minor

Reject non-positive --window-size values.

0 or negative values make the sliding-window calculation degenerate and can silently keep original selections instead of producing a valid max-RMSD subsegment.

🛡️ Proposed fix
     parser.add_argument("--window-size", type=int, default=3)
     args = parser.parse_args()
+    if args.window_size <= 0:
+        parser.error("--window-size must be a positive integer")
     main(args)
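An alternative to the post-parse guard is an argparse `type` callable that rejects non-positive values at parse time. A sketch; the flag name matches the script, but the positive_int helper is hypothetical:

```python
import argparse


def positive_int(text):
    # argparse converts ArgumentTypeError into a clean usage error and exits.
    value = int(text)
    if value <= 0:
        raise argparse.ArgumentTypeError(f"{text!r} is not a positive integer")
    return value


parser = argparse.ArgumentParser()
parser.add_argument("--window-size", type=positive_int, default=3)

args = parser.parse_args(["--window-size", "5"])
```

This keeps the validation next to the argument definition, so every caller of the parser gets it for free.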

                        f"density shape {computed_xmap.array.shape} does not match base map "
                        f"shape {base_xmap.array.shape}"
                    )
        except (FileNotFoundError, OSError, ValueError, RuntimeError) as e:

⚠️ Potential issue | 🟠 Major

Preserve trial-level error isolation for attribute/type failures.

Line 185 raises AttributeError, and parsing/filtering can also raise TypeError; neither is caught here, so one bad trial can still abort the full RSCC run instead of emitting per-selection error rows.

🛡️ Proposed fix
-        except (FileNotFoundError, OSError, ValueError, RuntimeError) as e:
+        except (
+            FileNotFoundError,
+            OSError,
+            ValueError,
+            RuntimeError,
+            TypeError,
+            AttributeError,
+        ) as e:
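The per-selection isolation both handlers aim for can be sketched generically; the row schema, function names, and the RECOVERABLE tuple's exact membership are illustrative, though the widened exception set matches the fix above:

```python
import math

# One bad selection should yield an error row, never abort the whole run.
RECOVERABLE = (FileNotFoundError, OSError, ValueError, RuntimeError,
               AttributeError, TypeError)


def score_selections(selections, rscc_fn):
    """Score each selection; failures become NaN rows instead of aborts."""
    rows = []
    for sel in selections:
        try:
            rows.append({"selection": sel, "rscc": rscc_fn(sel), "error": ""})
        except RECOVERABLE as e:
            rows.append({"selection": sel, "rscc": math.nan, "error": str(e)})
    return rows
```

Under joblib this matters twice over: an uncaught exception in a worker is re-raised by Parallel, so without the broad tuple a single malformed trial kills every group.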

Comment thread src/sampleworks/eval/grid_search_eval_utils.py
Comment on lines +115 to +120
    transformer = DifferentiableTransformer(
        xmap=xmap_torch,
        scattering_params=scattering_params,
        em=em_mode,
        device=device,
        use_cuda_kernels=torch.cuda.is_available(),

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect how DifferentiableTransformer handles CUDA kernels and device placement.
rg -n -C5 'class\s+DifferentiableTransformer|use_cuda_kernels|cuda' --iglob '*.py'

Repository: diff-use/sampleworks

Length of output: 35412


🏁 Script executed:

# Search for calls to build_density_transformer and device parameter usage
rg -n 'build_density_transformer|def run_density_transformer' src/sampleworks/utils/density_utils.py -A 10

Repository: diff-use/sampleworks

Length of output: 2149


🏁 Script executed:

# Check how device is used in the transformer and if device.type == "cuda" is standard
rg -n 'device.type' src/sampleworks --iglob '*.py' | head -20

Repository: diff-use/sampleworks

Length of output: 46


🏁 Script executed:

# Verify device handling in the DifferentiableTransformer __init__
sed -n '407,475p' src/sampleworks/core/forward_models/xray/real_space_density.py

Repository: diff-use/sampleworks

Length of output: 2853


🏁 Script executed:

# Check what happens when CUDA kernels are called with CPU device
rg -n 'dilate_atom_centric\|_compute_radial_derivatives' src/sampleworks/core/forward_models/xray/real_space_density.py -A 5 | head -40

Repository: diff-use/sampleworks

Length of output: 46


🏁 Script executed:

# Check what happens when CUDA kernels are invoked - look at the forward path
sed -n '560,580p' src/sampleworks/core/forward_models/xray/real_space_density.py

Repository: diff-use/sampleworks

Length of output: 906


🏁 Script executed:

# Check if CPU device is ever passed to build_density_transformer by searching callers
rg -n 'build_density_transformer|device.*cpu|torch.device.*cpu' src/sampleworks --iglob '*.py' -B 2 -A 2 | head -50

Repository: diff-use/sampleworks

Length of output: 3595


🏁 Script executed:

# Look for where atom_reconciler device is used
rg -n 'to\(device\)|\.device' src/sampleworks/utils/density_utils.py -B 2 -A 2

Repository: diff-use/sampleworks

Length of output: 1348


🏁 Script executed:

# Check the compute_density function to see all device paths
sed -n '165,230p' src/sampleworks/utils/density_utils.py

Repository: diff-use/sampleworks

Length of output: 2589


🏁 Script executed:

# Check what error/behavior occurs in dilate_atom_centric when called with CPU tensors
sed -n '280,345p' src/sampleworks/core/forward_models/xray/real_space_density_deps/ops/dilate_points_cuda.py

Repository: diff-use/sampleworks

Length of output: 2545


🏁 Script executed:

# Verify device.type is the correct PyTorch idiom by checking its actual usage in the codebase
rg -n '\.device\.type\s*==' src/sampleworks --iglob '*.py'

Repository: diff-use/sampleworks

Length of output: 46


🏁 Script executed:

# Check if there are tests or explicit CPU device usages that would trigger this issue
rg -n 'device.*cpu\|torch.device.*cpu' src/sampleworks tests --iglob '*.py' | grep -E 'build_density|run_density|compute_density'

Repository: diff-use/sampleworks

Length of output: 46


🏁 Script executed:

# Look at line 338 of dilate_points_cuda more carefully and surrounding context
sed -n '330,345p' src/sampleworks/core/forward_models/xray/real_space_density_deps/ops/dilate_points_cuda.py

Repository: diff-use/sampleworks

Length of output: 686


Gate CUDA kernels on the requested device, not just host availability.

Line 120 enables CUDA kernels whenever the host has CUDA, even when callers explicitly pass device=torch.device("cpu"). This causes a runtime error: if use_cuda_kernels=True but the transformer uses CPU, the code at line 568 invokes dilate_atom_centric, which unconditionally calls torch.cuda.synchronize(device) with a CPU device, raising an error. Tie the flag to the requested device instead.

Proposed fix
         scattering_params=scattering_params,
         em=em_mode,
         device=device,
-        use_cuda_kernels=torch.cuda.is_available(),
+        use_cuda_kernels=device.type == "cuda" and torch.cuda.is_available(),
     )

Copilot AI (Contributor) left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@k-chrispens k-chrispens force-pushed the kmc/rscc-eval-updates branch from 4cf1ffc to a657989 Compare April 23, 2026 22:26
@k-chrispens k-chrispens marked this pull request as ready for review April 23, 2026 22:36
@k-chrispens (Collaborator, Author) commented:

@coderabbitai review

@coderabbitai coderabbitai Bot (Contributor) commented Apr 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
scripts/eval/rscc_grid_search_script.py (1)

124-124: ⚠️ Potential issue | 🟠 Major

Exception tuples still let AttributeError/TypeError escape.

Both the setup handler (Line 124) and the per-trial handler (Line 209) catch (FileNotFoundError, OSError, ValueError, RuntimeError), but:

  • Line 147 explicitly raises AttributeError("AtomArray | AtomArrayStack is missing coordinates").
  • atom_array.set_annotation(...) (Line 153) and biotite parsing/attribute access can plausibly raise AttributeError/TypeError on malformed inputs.

Under the per-trial handler, these uncaught errors currently abort the whole process_group call, which — because joblib.Parallel re-raises worker exceptions — will terminate the entire RSCC run instead of emitting per-selection rscc=nan rows. Apply the same widening to both handlers.

🛡️ Proposed fix
-    except (FileNotFoundError, OSError, ValueError, RuntimeError) as e:
+    except (
+        FileNotFoundError,
+        OSError,
+        ValueError,
+        RuntimeError,
+        AttributeError,
+        TypeError,
+    ) as e:
         logger.error(f"ERROR setting up group {protein}/{trials[0].altloc_occupancies}: {e}")
-        except (FileNotFoundError, OSError, ValueError, RuntimeError) as e:
+        except (
+            FileNotFoundError,
+            OSError,
+            ValueError,
+            RuntimeError,
+            AttributeError,
+            TypeError,
+        ) as e:
             logger.error(f"ERROR processing trial {trial.trial_dir}: {e}")

Also applies to: 209-209

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/eval/rscc_grid_search_script.py` at line 124, The except clauses that
currently catch (FileNotFoundError, OSError, ValueError, RuntimeError) should
also include AttributeError and TypeError so malformed inputs don't crash the
whole run; update the two handlers (the setup handler around the code that may
raise AttributeError like the explicit raise and atom_array.set_annotation, and
the per-trial handler inside process_group/worker) to catch (FileNotFoundError,
OSError, ValueError, RuntimeError, AttributeError, TypeError) as e and preserve
the existing fallback behavior (e.g., emit rscc=nan or continue) so errors
become per-selection failures rather than terminating the job.
🧹 Nitpick comments (3)
tests/eval/test_rscc_grid_search_script.py (1)

118-118: Unused default argument in flaky_rscc.

_sel=bad_selection is never referenced inside the function body; the failure is decided purely by the target_is_next flag. Drop it to avoid confusion.

♻️ Proposed refactor
-    def flaky_rscc(a, b, _sel=bad_selection):
+    def flaky_rscc(a, b):
         if flaky_rscc.target_is_next:  # type: ignore[attr-defined]
             flaky_rscc.target_is_next = False  # type: ignore[attr-defined]
             raise RuntimeError("simulated rscc failure")
         return real_rscc(a, b)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/eval/test_rscc_grid_search_script.py` at line 118, The function
flaky_rscc currently declares an unused default argument `_sel=bad_selection`;
remove this unused parameter from the `flaky_rscc` signature (drop `_sel` and
its `bad_selection` default) and update any call sites or test invocations to no
longer pass or expect that argument; ensure only `a`, `b`, and the existing
`target_is_next` logic remain in `flaky_rscc`.
tests/eval/conftest.py (2)

87-90: Assertion-based validation disappears under python -O.

These assert calls are the only validation for fixture inputs; they’re silently skipped when Python is run with -O. Pytest doesn’t use -O by default, but if anyone invokes it that way, invalid arguments will produce confusing downstream filesystem errors. Prefer explicit raise ValueError(...) / FileNotFoundError(...) for input validation and reserve assert for invariants.

♻️ Proposed refactor
-    assert _REAL_CIF.exists(), _REAL_CIF
-    assert 1 <= n_groups <= len(_GROUP_CCP4)
-    assert trials_per_group >= 1
-    assert len(selections) >= 1
+    if not _REAL_CIF.exists():
+        raise FileNotFoundError(_REAL_CIF)
+    if not 1 <= n_groups <= len(_GROUP_CCP4):
+        raise ValueError(f"n_groups must be in [1, {len(_GROUP_CCP4)}], got {n_groups}")
+    if trials_per_group < 1:
+        raise ValueError(f"trials_per_group must be >= 1, got {trials_per_group}")
+    if len(selections) < 1:
+        raise ValueError("selections must not be empty")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/eval/conftest.py` around lines 87 - 90, Replace the assertion-based
input checks that disappear under python -O with explicit exceptions: check
existence of _REAL_CIF and raise FileNotFoundError with a helpful message if
missing; validate n_groups against len(_GROUP_CCP4) and raise ValueError when
out of range; ensure trials_per_group >= 1 and selections length >= 1 and raise
ValueError with clear messages for each invalid condition; update the checks
located near the fixture that references variables _REAL_CIF, n_groups,
_GROUP_CCP4, trials_per_group, and selections accordingly.

71-78: Symlink-only approach is Linux-centric.

dst.symlink_to(src) requires privileges on Windows and is also silently broken inside some sandboxed tempdirs. Given tests under tests/eval/ depend exclusively on symlinks, consider either:

  1. Falling back to shutil.copy2 if symlink_to raises OSError/NotImplementedError, or
  2. Skipping this module on non-POSIX via a module-level pytest.importorskip/skipif guard.

Not blocking for this PR if Linux-only CI is acceptable, but worth documenting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/eval/conftest.py` around lines 71 - 78, The _link function currently
uses dst.symlink_to(src) which fails on Windows or in sandboxed tempdirs; update
_link to catch OSError and NotImplementedError around dst.symlink_to(src) and
fall back to copying with shutil.copy2(src, dst) (ensure dst.parent exists and
remove any existing dst before the fallback), and alternatively consider adding
a module-level guard using pytest.importorskip("posix") or
pytest.mark.skipif(not os.name == "posix", reason="symlink-only tests") if you
prefer skipping on non-POSIX environments; refer to the _link function name and
dst.symlink_to and use shutil.copy2 and pytest.importorskip/skipif in your
change.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 32784a89-eb17-409e-b26c-e9fc7049a8eb

📥 Commits

Reviewing files that changed from the base of the PR and between ab2de06 and a657989.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • CLAUDE.md
  • pyproject.toml
  • scripts/eval/classify_altloc_regions.py
  • scripts/eval/find_max_rmsd_subsegment.py
  • scripts/eval/rscc_grid_search_script.py
  • src/sampleworks/utils/atom_array_utils.py
  • src/sampleworks/utils/density_utils.py
  • tests/eval/conftest.py
  • tests/eval/test_rscc_grid_search_script.py
  • tests/resources/1vme/1VME_0.25occA_0.75occB_1.00A.ccp4
  • tests/resources/1vme/1VME_0.5occA_0.5occB_1.00A.ccp4
  • tests/resources/1vme/1VME_single_001_density_input.cif
  • tests/utils/test_density_utils.py
✅ Files skipped from review due to trivial changes (2)
  • pyproject.toml
  • CLAUDE.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/sampleworks/utils/density_utils.py
  • scripts/eval/find_max_rmsd_subsegment.py

Comment on lines +759 to +776
"""Return ``{(id_i, id_j): (array_i, array_j)}`` pre-filtered to common atoms.

For each unordered altloc pair we build the two per-altloc AtomArrays
via ``select_altloc(return_full_array=True)``, which includes blank-altloc
atoms as shared context and then run ``filter_to_common_atoms`` so the two
inputs have identical atom order and count.

We build per-pair rather than using ``map_altlocs_to_stack`` so residues whose
altloc set is a subset of those in the whole structure (e.g. 2YL0 res 60–64
carry only altlocs A and B, not C) still get scored for the pairs where they
exist. A stack level ``filter_to_common_atoms`` would drop them entirely.

TODO: this helper hits the broader issue in how we
handle structures with >2 altlocs.
Fixing that upstream would let us replace this helper
with a direct ``map_altlocs_to_stack`` call and remove a source of
duplication.
"""
Contributor

🛠️ Refactor suggestion | 🟠 Major

Docstring is missing NumPy-style Parameters/Returns/Raises sections.

The description is clear, but the signature (atom_array, altloc_ids) and the non-obvious return semantics (pairs that raise RuntimeError inside filter_to_common_atoms are silently omitted, with a warning) are only documented implicitly. Please add explicit sections so callers don't have to read the implementation to learn that the returned dict can be missing pairs.

♻️ Suggested docstring
 def build_pairwise_altloc_arrays(
     atom_array: AtomArray, altloc_ids: list[str]
 ) -> dict[tuple[str, str], tuple[AtomArrayStack, AtomArrayStack]]:
     """Return ``{(id_i, id_j): (array_i, array_j)}`` pre-filtered to common atoms.
 
     For each unordered altloc pair we build the two per-altloc AtomArrays
     via ``select_altloc(return_full_array=True)``, which includes blank-altloc
     atoms as shared context and then run ``filter_to_common_atoms`` so the two
     inputs have identical atom order and count.
 
     We build per-pair rather than using ``map_altlocs_to_stack`` so residues whose
     altloc set is a subset of those in the whole structure (e.g. 2YL0 res 60–64
     carry only altlocs A and B, not C) still get scored for the pairs where they
     exist. A stack level ``filter_to_common_atoms`` would drop them entirely.
 
+    Parameters
+    ----------
+    atom_array
+        Input structure with ``altloc_id`` annotations.
+    altloc_ids
+        Altloc identifiers to enumerate over (unordered pairs, ``i < j``).
+
+    Returns
+    -------
+    dict[tuple[str, str], tuple[AtomArrayStack, AtomArrayStack]]
+        Mapping from ordered ``(id_i, id_j)`` pairs (``i < j`` in ``altloc_ids``)
+        to the common-atom-filtered stacks. Pairs for which no common atoms
+        exist are omitted (a warning is logged).
+
     TODO: this helper hits the broader issue in how we
     handle structures with >2 altlocs.
     Fixing that upstream would let us replace this helper
     with a direct ``map_altlocs_to_stack`` call and remove a source of
     duplication.
     """

As per coding guidelines: "Always include NumPy-style docstrings for every function and class."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sampleworks/utils/atom_array_utils.py` around lines 759 - 776, Update the
function-level docstring to a NumPy-style docstring: add a Parameters section
documenting atom_array (type: AtomArray-like, expected shape/contents) and
altloc_ids (iterable of str), a Returns section describing that the function
returns a dict mapping unordered altloc id pairs (tuple of str) to tuples of
AtomArray (array_i, array_j) and explicitly note that pairs which raise
RuntimeError inside filter_to_common_atoms are omitted from the returned dict
(and that a warning is emitted), and a Raises section documenting any raised
exceptions (e.g., propagate unexpected exceptions but note that RuntimeError
from filter_to_common_atoms is handled by omission). Reference
select_altloc(return_full_array=True) and filter_to_common_atoms in the
description so callers know how arrays are built and filtered.

# find_altloc_selections.py appends a combined all altloc selection
# (atomworks-style with " or " clauses) at the end of each row. That one is
# a union over every span we already processed individually, so skip it.
# NOTE: This will need to be addressed when we
Collaborator

What will need to be addressed?

# a union over every span we already processed individually, so skip it.
# NOTE: This will need to be addressed when we
# migrate to atomworks-style selections for everything
if " or " in selection_str:
Collaborator

We probably want to allow selections that have "or" in them; presumably there will be some with discontinuous but physically contacting residues that we care about?
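If we do allow them, one option is to split a combined selection into its clauses and score each span individually, sketched below (the selection grammar shown is assumed; a real atomworks-style selection may need a proper parser rather than a plain string split):

```python
def split_or_clauses(selection_str: str) -> list[str]:
    """Split an atomworks-style selection into its ' or ' clauses."""
    return [clause.strip() for clause in selection_str.split(" or ")]
```

This would let the combined row be processed as its constituent spans instead of being skipped outright.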

identifies the contiguous subsegment of that size with the highest
RMSD between any pair of alternate conformations.

Only residues with identical residue names across altlocs are considered.
Collaborator

We should fix this. Maybe make an issue.

)


def _has_compositional_heterogeneity(arr_i, arr_j, mask: np.ndarray) -> bool:
Collaborator

A couple of things:

  1. Didn't one of us create a method for this somewhere else?

  2. This will return True if there are simply different sequences that are the same length, which I wouldn't think of as compositional heterogeneity. Do you want to check for modified residues like CYS->CSO?
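A sketch of the modification-tolerant check suggested here (the parent mapping and function names are hypothetical and far from exhaustive; the real `_has_compositional_heterogeneity` operates on atom arrays rather than name lists):

```python
# Hypothetical sketch: only flag residues whose names differ beyond known
# parent/modified pairs. The mapping below is illustrative, not exhaustive.
MODIFIED_RESIDUE_PARENT = {"CSO": "CYS", "MSE": "MET", "SEP": "SER"}


def _canonical(res_name: str) -> str:
    """Map a modified residue name to its parent, else return it unchanged."""
    return MODIFIED_RESIDUE_PARENT.get(res_name, res_name)


def names_differ_beyond_modification(res_names_i, res_names_j) -> bool:
    """True only when residues still differ after mapping modified forms to parents."""
    if len(res_names_i) != len(res_names_j):
        return True
    return any(
        _canonical(a) != _canonical(b)
        for a, b in zip(res_names_i, res_names_j)
    )
```

Under this scheme CYS vs. CSO would be treated as the same residue, so genuinely different sequences are still flagged while chemical modifications are not.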

if mask.sum() == 0:
continue

if _has_compositional_heterogeneity(arr_i, arr_j, mask):
Collaborator

Is this to prevent an error from biotite_rmsd? If so, again I'd think about whether it might make sense to check for modified but otherwise identical residues (e.g. sulfonates). You could take the RMSD on only the common atoms.

raise ValueError(f"Input CSV missing required columns: {missing}")

all_rows: list[dict] = []
for _, row in input_df.iterrows():
Collaborator

There's a separate module for parallelizing things over rows of a dataframe (I forget what it's called), but it can speed things up a lot if _process_structure is a slow step.
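Since this PR already adds joblib, one option is to fan the rows out with `joblib.Parallel` (a sketch, not necessarily the module referred to above; `_process_structure`'s real signature may differ):

```python
import pandas as pd
from joblib import Parallel, delayed


def _process_structure(row: dict) -> dict:
    # Stand-in for the real per-row work; replace with the actual function.
    return {**row, "processed": True}


def process_rows(input_df: pd.DataFrame, n_jobs: int = 8) -> list[dict]:
    """Run _process_structure over DataFrame rows in parallel, preserving order."""
    return Parallel(n_jobs=n_jobs)(
        delayed(_process_structure)(row.to_dict())
        for _, row in input_df.iterrows()
    )
```

With `n_jobs=1` this degrades gracefully to the current sequential loop, which also makes it easy to test.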

)
else:
final_rows = []
for protein, group in detail_df.groupby("protein", sort=False):
Collaborator

You should use a DataFrame.groupby.agg pattern here. No reason to use a for loop.
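For example (the aggregations shown are illustrative stand-ins for whatever the loop body actually computes):

```python
import pandas as pd

detail_df = pd.DataFrame(
    {
        "protein": ["1vme", "1vme", "2yl0"],
        "rscc": [0.80, 0.90, 0.70],
    }
)

# One groupby/agg call replaces the per-protein Python loop.
summary = detail_df.groupby("protein", sort=False, as_index=False).agg(
    rscc_mean=("rscc", "mean"),
    rscc_max=("rscc", "max"),
    n_trials=("rscc", "size"),
)
```

Named aggregation keeps the output columns explicit, and `sort=False` preserves the original group order like the current loop does.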

4. Aggregate and visualize results by ensemble size, guidance weight, and scaler type

Depending on the GPU, --n-jobs=8-16 work well. A CUDA RuntimeError in a worker is caught per-trial
(the row gets ``rscc=nan``) but may affect other trials in the same worker.
Collaborator

This is cryptic. Why is there always a row with RSCC=nan? It seems like we should just fix that rather than tolerate it.

Or do you mean that if there is a CUDA error, the row gets RSCC=nan?

valid_selections = [s for s in protein_config.selection if s in group_ref_coords]
rows: list[dict] = []

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Collaborator

We should make some effort to spread these out over the available GPUs.
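One simple scheme is round-robin assignment of workers to devices, sketched here as a pure helper (`pick_device` is hypothetical; in the worker its result would feed `torch.device(...)` with `torch.cuda.device_count()` as `n_gpus`):

```python
def pick_device(worker_idx: int, n_gpus: int) -> str:
    """Round-robin a worker index across available GPUs, falling back to CPU."""
    if n_gpus <= 0:
        return "cpu"
    return f"cuda:{worker_idx % n_gpus}"
```

This keeps each worker pinned to one device, so memory pressure spreads evenly instead of every joblib worker piling onto `cuda:0`.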

if base_xmap is None:
raise ValueError(f"Failed to load base map from {base_map_path}")

transformer, _ = build_density_transformer(base_xmap, em_mode=False, device=device)
Collaborator

Probably this needs a comment to explain what the method does and what the transformer is used for later. This is the object that actually builds the map from coordinates, right?

