Skip to content

ci: Add CI workflow for lint, typecheck, and CPU tests#189

Merged
k-chrispens merged 13 commits intomainfrom
ci-lint-and-cpu-tests
Mar 30, 2026
Merged

ci: Add CI workflow for lint, typecheck, and CPU tests#189
k-chrispens merged 13 commits intomainfrom
ci-lint-and-cpu-tests

Conversation

@Abdelsalam-Abbas
Copy link
Copy Markdown
Contributor

@Abdelsalam-Abbas Abdelsalam-Abbas commented Mar 26, 2026

Summary

  • New ci.yml workflow with 3 jobs on ubuntu-latest, no approval gates:
    • lint: ruff check + ruff format --check
    • typecheck: ty check across boltz-dev, protenix-dev, rf3-dev (matrix)
    • cpu-tests: pytest -m 'not gpu' across all 3 envs (412 tests, matrix)
  • Switch self-hosted GPU runners from GitHub Actions cache to NFS-backed caching

Context

Addresses feedback from Karson and Marcus:

  • Non-GPU tests now run automatically on PRs without approval
  • Formatting/linting enforced in CI matching pre-commit hooks (ruff, ty)
  • GPU test approval preserved for pausing during sampleworks machine runs

Test plan

  • Workflow YAML validated
  • CPU tests pass on ubuntu-latest
  • GPU tests work with NFS cache — pixi install: 11min → 11s, boltz/rf3: 14min → 3.5min

Summary by CodeRabbit

  • Chores

    • Added comprehensive CI workflows with lint, type-check, and multi-environment test runs (including manual trigger) and adjusted GPU install caching.
    • Tightened type-checking rules for stricter diagnostics.
    • Improved runtime robustness and fallback determinism for GPU extension loading.
    • Minor formatting and argument/help string cleanups across utilities.
  • Tests

    • Made tests more deterministic by adding targeted mocking and cache resets.
    • Improved test reliability and maintainability.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 26, 2026

📝 Walkthrough

Walkthrough

Introduces a new CI workflow for linting, type-checking, and CPU tests with environment matrix and concurrency; disables caching in the GPU workflow's pixi setup; tightens ty diagnostics in pyproject.toml; ensures deterministic CUDA extension state on load failure; adds test mocking/cache clears for deterministic CCD and checkpoint behavior; assorted cosmetic edits.

Changes

Cohort / File(s) Summary
CI Workflow
.github/workflows/ci.yml
Adds a new CI workflow with lint, typecheck, and cpu-tests jobs, an environment matrix (boltz-dev, protenix-dev, rf3-dev), scoped triggers, concurrency group, and per-job timeouts.
GPU Workflow
.github/workflows/gpu-tests.yml
Updates prefix-dev/setup-pixi step to pass cache: false, disabling that action's cache.
Type-Check Configuration
pyproject.toml
Sets multiple tool.ty.rules diagnostics to warn (e.g., unresolved-attribute, invalid-argument-type, not-iterable).
CUDA Extension Initialization
src/sampleworks/core/forward_models/xray/real_space_density_deps/ops/csrc/__init__.py
On CUDA extension load failure, assigns dilate_points_cuda = None and sets CUDA_AVAILABLE = False to ensure deterministic module state.
Evaluation Scripts
scripts/eval/bond_geometry_eval.py, scripts/eval/run_and_process_tortoize.py, scripts/eval/run_and_process_phenix_clashscore.py
Minor formatting and refactorings: removed unused exception capture, condensed exception raise, and reformatted dict/call layouts — no behavioral changes.
Utilities & Eval Helpers
src/sampleworks/eval/grid_search_eval_utils.py, src/sampleworks/utils/msa.py
Import reordering, signature/indentation cleanup, and consistent double-quoted string literals; no logic changes.
Tests — Formatting
tests/eval/test_structure_utils.py, tests/utils/test_atom_array_utils.py
Formatting/whitespace and list-literal style changes only.
Tests — Deterministic Mocks
tests/models/protenix/test_ccd_expansion.py, tests/utils/test_guidance_script_arguments.py
Add unittest.mock.patch and cache clears to make CCD expansion and checkpoint resolution deterministic; update assertions accordingly.

Sequence Diagram(s)

(No sequence diagrams generated.)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

Suggested reviewers

  • marcuscollins

Poem

🐰 I hopped through code with careful paws,
CI blossoms, linted laws,
CUDA rests, mocked tests abide,
Grids and checkpoints now decide,
A carrot nod — I thump with applause! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'ci: Add CI workflow for lint, typecheck, and CPU tests' directly and accurately describes the main changes: a new GitHub Actions CI workflow is introduced with three jobs for linting, type checking, and CPU-based testing.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ci-lint-and-cpu-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
.github/workflows/ci.yml (1)

37-42: Consider using SHA-pinned action references for consistency and security.

The gpu-tests.yml workflow uses SHA-pinned action references (e.g., actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4), while this workflow uses tag references (@v4, @v0.8.8). SHA pinning prevents supply-chain attacks where a tag could be moved to point to malicious code. For consistency across workflows and improved security posture, consider pinning to full commit SHAs.

Example for this job (apply similarly to other jobs)
      - name: Checkout code
-       uses: actions/checkout@v4
+       uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4

      - name: Install pixi
-       uses: prefix-dev/setup-pixi@v0.8.8
+       uses: prefix-dev/setup-pixi@19eac09b398e3d0c747adc7921926a6d802df4da # v0.8.8
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci.yml around lines 37 - 42, Replace the tag-based action
references with SHA-pinned references: update actions/checkout@v4 and
prefix-dev/setup-pixi@v0.8.8 to their corresponding full commit SHAs (the same
format used in gpu-tests.yml, e.g., actions/checkout@<full-sha>) so the workflow
uses immutable action versions; locate the uses: lines for actions/checkout and
prefix-dev/setup-pixi in the ci.yml job and substitute the tag with the exact
commit SHA for each action.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.github/workflows/ci.yml:
- Around line 37-42: Replace the tag-based action references with SHA-pinned
references: update actions/checkout@v4 and prefix-dev/setup-pixi@v0.8.8 to their
corresponding full commit SHAs (the same format used in gpu-tests.yml, e.g.,
actions/checkout@<full-sha>) so the workflow uses immutable action versions;
locate the uses: lines for actions/checkout and prefix-dev/setup-pixi in the
ci.yml job and substitute the tag with the exact commit SHA for each action.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5f2faafa-ddac-4783-a07b-6b6261a990c1

📥 Commits

Reviewing files that changed from the base of the PR and between 488cd6c and 7fbe08c.

📒 Files selected for processing (2)
  • .github/workflows/ci.yml
  • .github/workflows/gpu-tests.yml

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
tests/models/protenix/test_ccd_expansion.py (1)

21-28: Consider using a pytest fixture to centralize cache management.

The mocking approach is appropriate here to avoid external CIF cache dependencies in CI. However, the repetitive cache_clear() calls before and after each test could be consolidated into a fixture for cleaner, more maintainable code.

♻️ Proposed fixture-based approach

Add a fixture in this file or conftest.py:

`@pytest.fixture`(autouse=True)
def clear_ccd_suffix_cache():
    """Ensure fresh CCD suffix map cache for each test."""
    _build_ccd_suffix_map.cache_clear()
    yield
    _build_ccd_suffix_map.cache_clear()

Then simplify test methods:

 def test_unique_match_expands(self):
     """~QS should expand uniquely to A1AQS."""
     fake_codes = ["A1AQS", "GLY", "ALA"]
-    _build_ccd_suffix_map.cache_clear()
     with patch("protenix.data.ccd.get_all_ccd_code", return_value=fake_codes):
         result = _expand_tilde_ccd_code("~QS")
-    _build_ccd_suffix_map.cache_clear()
     assert result == "A1AQS"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/models/protenix/test_ccd_expansion.py` around lines 21 - 28, Add an
autouse pytest fixture to clear the CCD suffix cache before and after each test
instead of calling _build_ccd_suffix_map.cache_clear() manually in tests; create
a fixture named clear_ccd_suffix_cache (placed in this test module or
conftest.py) that calls _build_ccd_suffix_map.cache_clear(), yields, then calls
_build_ccd_suffix_map.cache_clear() again, and remove the explicit pre/post
cache_clear() calls from tests like test_unique_match_expands and other tests
referencing _build_ccd_suffix_map or relying on _expand_tilde_ccd_code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/utils/test_guidance_script_arguments.py`:
- Around line 128-129: The test currently uses a broad pytest.raises(ValueError)
for validate_model_checkpoint(model_wrapper_type, "") which can catch unrelated
ValueErrors; narrow it by supplying the match argument to pytest.raises with a
regex or exact message that validate_model_checkpoint raises for an empty
checkpoint (e.g. pytest.raises(ValueError, match="checkpoint.*empty" ) ), so
update the assertion to assert the specific error text produced by
validate_model_checkpoint.
- Around line 67-70: The test is mocking the private resolver
sampleworks.utils.guidance_script_arguments._resolve_checkpoint which ties the
test to implementation; instead create a real checkpoint file under tmp_path and
exercise the public entrypoint (e.g., the CLI/argument parser function in
sampleworks.utils.guidance_script_arguments such as parse_args or main) by
passing the file path as an argument, then assert expected behavior; remove the
`@patch` of _resolve_checkpoint in these tests (and the similar patches at the
other locations) and replace them with tmp_path-backed inputs so the tests
remain black-box and resilient to refactors.
- Around line 94-112: The test
test_populate_config_uses_model_checkpoint_argument currently exits the patch on
_resolve_checkpoint before calling
GuidanceConfig.populate_config_for_guidance_type, so the auto-resolution
behavior isn't actually being asserted; update the test to keep the patch on
"sampleworks.utils.guidance_script_arguments._resolve_checkpoint" active through
the call to GuidanceConfig.populate_config_for_guidance_type (i.e., include the
config.populate_config_for_guidance_type(...) inside the with patch(...) block)
and add an assertion on the mock (e.g., mock_resolver.assert_not_called() or
equivalent) after the call to ensure _resolve_checkpoint was not invoked when
args.model_checkpoint is provided.

---

Nitpick comments:
In `@tests/models/protenix/test_ccd_expansion.py`:
- Around line 21-28: Add an autouse pytest fixture to clear the CCD suffix cache
before and after each test instead of calling
_build_ccd_suffix_map.cache_clear() manually in tests; create a fixture named
clear_ccd_suffix_cache (placed in this test module or conftest.py) that calls
_build_ccd_suffix_map.cache_clear(), yields, then calls
_build_ccd_suffix_map.cache_clear() again, and remove the explicit pre/post
cache_clear() calls from tests like test_unique_match_expands and other tests
referencing _build_ccd_suffix_map or relying on _expand_tilde_ccd_code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c70781cc-915a-4d80-8c4f-00bf6adc649d

📥 Commits

Reviewing files that changed from the base of the PR and between 5c78057 and d69ca7b.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • pyproject.toml
  • tests/models/protenix/test_ccd_expansion.py
  • tests/utils/test_guidance_script_arguments.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • pyproject.toml

Comment thread tests/utils/test_guidance_script_arguments.py
Comment thread tests/utils/test_guidance_script_arguments.py Outdated
Comment thread tests/utils/test_guidance_script_arguments.py Outdated
- Use ty overrides (not exclude) to suppress real_space_density.py errors
- Mock _resolve_checkpoint in guidance arg tests (no /checkpoints/ on ubuntu)
- Add missing step_scaler_type to test Namespace
- Mock CCD data in protenix expansion tests (no CIF cache on ubuntu)
- Set all pre-existing ty error rules to warn (51 diagnostics across
  many files; per-file overrides don't scale)
- Fix test assertion: populate auto-resolves when no checkpoint arg given
- Mock _resolve_checkpoint in validate test to avoid env-dependent behavior
NFS cache testing is complete. Restore the environment gate so GPU
tests require manual approval on PRs again.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pyproject.toml (1)

1-3: ⚠️ Potential issue | 🔴 Critical

CI is failing: lock file out of sync with workspace.

The pipeline reports pixi install failed because the lock file doesn't match pyproject.toml. Run pixi install (without --locked) locally to regenerate pixi.lock and commit the updated lock file.

pixi install
git add pixi.lock
git commit -m "chore: regenerate pixi.lock"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 1 - 3, CI is failing because pixi.lock is out of
sync with pyproject.toml; run `pixi install` locally (without --locked) to
regenerate pixi.lock, then add and commit the updated pixi.lock (e.g., git add
pixi.lock && git commit -m "chore: regenerate pixi.lock") so the lock file
matches the build-system section (build-backend = "hatchling.build", requires =
["hatchling"]) in pyproject.toml.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/utils/test_guidance_script_arguments.py`:
- Around line 122-129: The test mocks
sampleworks.utils.guidance_script_arguments._resolve_checkpoint with a
non-realistic message; update the mock in
test_validate_model_checkpoint_requires_non_empty_value to raise a ValueError
whose message matches the real resolver contract (e.g., start with a realistic
prefix like "could not resolve checkpoint" or "failed to resolve checkpoint")
and change the pytest.raises match to assert that prefix (use a regex matching
the prefix) so the test verifies the real resolver-style error rather than an
exact, brittle string.

---

Outside diff comments:
In `@pyproject.toml`:
- Around line 1-3: CI is failing because pixi.lock is out of sync with
pyproject.toml; run `pixi install` locally (without --locked) to regenerate
pixi.lock, then add and commit the updated pixi.lock (e.g., git add pixi.lock &&
git commit -m "chore: regenerate pixi.lock") so the lock file matches the
build-system section (build-backend = "hatchling.build", requires =
["hatchling"]) in pyproject.toml.
🪄 Autofix (Beta)

❌ Autofix failed (check again to retry)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7b2b8279-6444-43e0-858f-7425c5c2c80a

📥 Commits

Reviewing files that changed from the base of the PR and between 88e93e7 and dad7955.

📒 Files selected for processing (13)
  • .github/workflows/ci.yml
  • .github/workflows/gpu-tests.yml
  • pyproject.toml
  • scripts/eval/bond_geometry_eval.py
  • scripts/eval/run_and_process_phenix_clashscore.py
  • scripts/eval/run_and_process_tortoize.py
  • src/sampleworks/core/forward_models/xray/real_space_density_deps/ops/csrc/__init__.py
  • src/sampleworks/eval/grid_search_eval_utils.py
  • src/sampleworks/utils/msa.py
  • tests/eval/test_structure_utils.py
  • tests/models/protenix/test_ccd_expansion.py
  • tests/utils/test_atom_array_utils.py
  • tests/utils/test_guidance_script_arguments.py
✅ Files skipped from review due to trivial changes (9)
  • scripts/eval/run_and_process_phenix_clashscore.py
  • tests/eval/test_structure_utils.py
  • tests/utils/test_atom_array_utils.py
  • scripts/eval/bond_geometry_eval.py
  • src/sampleworks/eval/grid_search_eval_utils.py
  • scripts/eval/run_and_process_tortoize.py
  • .github/workflows/ci.yml
  • .github/workflows/gpu-tests.yml
  • src/sampleworks/utils/msa.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/sampleworks/core/forward_models/xray/real_space_density_deps/ops/csrc/init.py
  • tests/models/protenix/test_ccd_expansion.py

Comment thread tests/utils/test_guidance_script_arguments.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/utils/test_guidance_script_arguments.py (1)

71-87: Strengthen this test by asserting resolver invocation in the call under test.

The current assertion checks final state only. Add a direct mock assertion so this test proves resolution happens when populate_config_for_guidance_type runs.

Suggested tightening
 def test_populate_config_resolves_checkpoint_when_none_provided(_mock_resolve, model_wrapper_type):
     """populate_config_for_guidance_type should auto-resolve checkpoint if no arg exists."""
     config = GuidanceConfig(
         protein="protein",
         structure="/tmp/structure.cif",
         density="/tmp/density.mrc",
         model=model_wrapper_type,
         guidance_type=GuidanceType.PURE_GUIDANCE,
         log_path="/tmp/output/run.log",
     )
+    _mock_resolve.reset_mock()

     config.populate_config_for_guidance_type(
         _build_job(model_wrapper_type),
         Namespace(use_tweedie=False, step_scaler_type="noisespace"),
     )

+    _mock_resolve.assert_called_once()
     assert config.model_checkpoint == "/checkpoints/mock.ckpt"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/utils/test_guidance_script_arguments.py` around lines 71 - 87, The test
only asserts the final checkpoint value but doesn't verify the resolver was
invoked; update test_populate_config_resolves_checkpoint_when_none_provided to
assert that the _mock_resolve mock was called when
populate_config_for_guidance_type runs (e.g., _mock_resolve.assert_called_once()
or _mock_resolve.assert_called_once_with(...) using the expected args from the
call), referencing GuidanceConfig.populate_config_for_guidance_type and the
_mock_resolve fixture so the test proves resolution occurred rather than just
the resulting state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/utils/test_guidance_script_arguments.py`:
- Around line 71-87: The test only asserts the final checkpoint value but
doesn't verify the resolver was invoked; update
test_populate_config_resolves_checkpoint_when_none_provided to assert that the
_mock_resolve mock was called when populate_config_for_guidance_type runs (e.g.,
_mock_resolve.assert_called_once() or _mock_resolve.assert_called_once_with(...)
using the expected args from the call), referencing
GuidanceConfig.populate_config_for_guidance_type and the _mock_resolve fixture
so the test proves resolution occurred rather than just the resulting state.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1f3bd329-18d7-47f1-85dd-3ce32690cd78

📥 Commits

Reviewing files that changed from the base of the PR and between 4404737 and 00155af.

📒 Files selected for processing (1)
  • tests/utils/test_guidance_script_arguments.py

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 30, 2026

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Autofix skipped. No unresolved CodeRabbit review comments with fix instructions found.

Copy link
Copy Markdown
Collaborator

@k-chrispens k-chrispens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will add an issue to address the ty things

@k-chrispens k-chrispens merged commit 025fbd8 into main Mar 30, 2026
8 of 11 checks passed
@k-chrispens k-chrispens deleted the ci-lint-and-cpu-tests branch March 30, 2026 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants