ci: Add CI workflow for lint, typecheck, and CPU tests #189
k-chrispens merged 13 commits into main from
Conversation
📝 Walkthrough
Introduces a new CI workflow for linting, type-checking, and CPU tests with environment matrix and concurrency; disables caching in the GPU workflow's pixi setup; tightens
Changes
Sequence Diagram(s)
(No sequence diagrams generated.)
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Possibly related PRs
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
🧹 Nitpick comments (1)
.github/workflows/ci.yml (1)
37-42: Consider using SHA-pinned action references for consistency and security.

The `gpu-tests.yml` workflow uses SHA-pinned action references (e.g., `actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4`), while this workflow uses tag references (`@v4`, `@v0.8.8`). SHA pinning prevents supply-chain attacks where a tag could be moved to point to malicious code. For consistency across workflows and improved security posture, consider pinning to full commit SHAs.

Example for this job (apply similarly to other jobs):

```diff
   - name: Checkout code
-    uses: actions/checkout@v4
+    uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
   - name: Install pixi
-    uses: prefix-dev/setup-pixi@v0.8.8
+    uses: prefix-dev/setup-pixi@19eac09b398e3d0c747adc7921926a6d802df4da # v0.8.8
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/ci.yml around lines 37 - 42, Replace the tag-based action references with SHA-pinned references: update actions/checkout@v4 and prefix-dev/setup-pixi@v0.8.8 to their corresponding full commit SHAs (the same format used in gpu-tests.yml, e.g., actions/checkout@<full-sha>) so the workflow uses immutable action versions; locate the uses: lines for actions/checkout and prefix-dev/setup-pixi in the ci.yml job and substitute the tag with the exact commit SHA for each action.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5f2faafa-ddac-4783-a07b-6b6261a990c1
📒 Files selected for processing (2)
- .github/workflows/ci.yml
- .github/workflows/gpu-tests.yml
Actionable comments posted: 3
🧹 Nitpick comments (1)
tests/models/protenix/test_ccd_expansion.py (1)
21-28: Consider using a pytest fixture to centralize cache management.

The mocking approach is appropriate here to avoid external CIF cache dependencies in CI. However, the repetitive `cache_clear()` calls before and after each test could be consolidated into a fixture for cleaner, more maintainable code.

♻️ Proposed fixture-based approach

Add a fixture in this file or `conftest.py`:

```python
@pytest.fixture(autouse=True)
def clear_ccd_suffix_cache():
    """Ensure fresh CCD suffix map cache for each test."""
    _build_ccd_suffix_map.cache_clear()
    yield
    _build_ccd_suffix_map.cache_clear()
```

Then simplify test methods:

```diff
 def test_unique_match_expands(self):
     """~QS should expand uniquely to A1AQS."""
     fake_codes = ["A1AQS", "GLY", "ALA"]
-    _build_ccd_suffix_map.cache_clear()
     with patch("protenix.data.ccd.get_all_ccd_code", return_value=fake_codes):
         result = _expand_tilde_ccd_code("~QS")
-    _build_ccd_suffix_map.cache_clear()
     assert result == "A1AQS"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/models/protenix/test_ccd_expansion.py` around lines 21 - 28, Add an autouse pytest fixture to clear the CCD suffix cache before and after each test instead of calling _build_ccd_suffix_map.cache_clear() manually in tests; create a fixture named clear_ccd_suffix_cache (placed in this test module or conftest.py) that calls _build_ccd_suffix_map.cache_clear(), yields, then calls _build_ccd_suffix_map.cache_clear() again, and remove the explicit pre/post cache_clear() calls from tests like test_unique_match_expands and other tests referencing _build_ccd_suffix_map or relying on _expand_tilde_ccd_code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/utils/test_guidance_script_arguments.py`:
- Around line 128-129: The test currently uses a broad pytest.raises(ValueError)
for validate_model_checkpoint(model_wrapper_type, "") which can catch unrelated
ValueErrors; narrow it by supplying the match argument to pytest.raises with a
regex or exact message that validate_model_checkpoint raises for an empty
checkpoint (e.g. pytest.raises(ValueError, match="checkpoint.*empty")), so
update the assertion to assert the specific error text produced by
validate_model_checkpoint.
- Around line 67-70: The test is mocking the private resolver
sampleworks.utils.guidance_script_arguments._resolve_checkpoint which ties the
test to implementation; instead create a real checkpoint file under tmp_path and
exercise the public entrypoint (e.g., the CLI/argument parser function in
sampleworks.utils.guidance_script_arguments such as parse_args or main) by
passing the file path as an argument, then assert expected behavior; remove the
`@patch` of _resolve_checkpoint in these tests (and the similar patches at the
other locations) and replace them with tmp_path-backed inputs so the tests
remain black-box and resilient to refactors.
- Around line 94-112: The test
test_populate_config_uses_model_checkpoint_argument currently exits the patch on
_resolve_checkpoint before calling
GuidanceConfig.populate_config_for_guidance_type, so the auto-resolution
behavior isn't actually being asserted; update the test to keep the patch on
"sampleworks.utils.guidance_script_arguments._resolve_checkpoint" active through
the call to GuidanceConfig.populate_config_for_guidance_type (i.e., include the
config.populate_config_for_guidance_type(...) inside the with patch(...) block)
and add an assertion on the mock (e.g., mock_resolver.assert_not_called() or
equivalent) after the call to ensure _resolve_checkpoint was not invoked when
args.model_checkpoint is provided.
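The `pytest.raises` narrowing suggested in the first inline comment above can be sketched as follows. Note this is a hedged illustration: `validate_model_checkpoint` here is a hypothetical stand-in for the real sampleworks function, and its error message is an assumed example, not the actual text.

```python
import pytest


def validate_model_checkpoint(model_wrapper_type, checkpoint):
    # Hypothetical stand-in for the real validator: rejects empty paths.
    if not checkpoint:
        raise ValueError("model checkpoint path must be non-empty")
    return checkpoint


def test_rejects_empty_checkpoint():
    # match= applies re.search, so a regex over the stable part of the
    # message narrows the assertion to this specific failure mode
    # instead of passing on any unrelated ValueError.
    with pytest.raises(ValueError, match=r"checkpoint.*non-empty"):
        validate_model_checkpoint("boltz", "")
```

The regex should target whatever wording the real validator actually emits; an exact-string match would be brittle, while a short stable fragment is usually enough.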
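The `tmp_path`-backed rewrite proposed in the second inline comment might look like this sketch. `resolve_checkpoint_arg` is a hypothetical helper, not the real sampleworks entrypoint; the point is that the test exercises a real file instead of patching the private `_resolve_checkpoint`.

```python
from pathlib import Path


def resolve_checkpoint_arg(path_str: str) -> Path:
    # Hypothetical public-facing resolution: accept an explicit path and
    # validate it exists, instead of probing a hard-coded /checkpoints/ dir.
    path = Path(path_str)
    if not path.is_file():
        raise ValueError(f"checkpoint not found: {path}")
    return path


def test_accepts_real_checkpoint_file(tmp_path):
    # tmp_path gives each test an isolated directory, so the test stays
    # black-box and environment-independent (works on the ubuntu runners).
    ckpt = tmp_path / "model.ckpt"
    ckpt.write_bytes(b"\x00")  # any content; existence is what matters
    assert resolve_checkpoint_arg(str(ckpt)) == ckpt
```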
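The patch-scoping fix from the third inline comment can be illustrated with a self-contained sketch. The module object and `populate_config` below are hypothetical stand-ins for `sampleworks.utils.guidance_script_arguments` and `GuidanceConfig.populate_config_for_guidance_type`; the mechanics of keeping the patch active through the call, then asserting on the mock, are what carry over.

```python
import types
from unittest.mock import patch

# Hypothetical module object standing in for
# sampleworks.utils.guidance_script_arguments.
guidance_mod = types.SimpleNamespace(
    _resolve_checkpoint=lambda: "/checkpoints/auto.ckpt"
)


def populate_config(config: dict) -> dict:
    # Mirrors the behavior under test: auto-resolve only when unset.
    if config.get("model_checkpoint") is None:
        config["model_checkpoint"] = guidance_mod._resolve_checkpoint()
    return config


with patch.object(guidance_mod, "_resolve_checkpoint") as mock_resolver:
    # The patch stays active through the call under test...
    result = populate_config({"model_checkpoint": "/explicit.ckpt"})
    # ...so we can assert the resolver was never consulted when an
    # explicit checkpoint was provided.
    mock_resolver.assert_not_called()

assert result["model_checkpoint"] == "/explicit.ckpt"
```

Exiting the `with` block before the call under test would restore the real resolver, which is exactly the bug the review comment points out.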
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c70781cc-915a-4d80-8c4f-00bf6adc649d
⛔ Files ignored due to path filters (1)
`pixi.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (3)
- pyproject.toml
- tests/models/protenix/test_ccd_expansion.py
- tests/utils/test_guidance_script_arguments.py
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
- Use ty overrides (not exclude) to suppress real_space_density.py errors
- Mock _resolve_checkpoint in guidance arg tests (no /checkpoints/ on ubuntu)
- Add missing step_scaler_type to test Namespace
- Mock CCD data in protenix expansion tests (no CIF cache on ubuntu)
- Set all pre-existing ty error rules to warn (51 diagnostics across many files; per-file overrides don't scale)
- Fix test assertion: populate auto-resolves when no checkpoint arg given
- Mock _resolve_checkpoint in validate test to avoid env-dependent behavior
NFS cache testing is complete. Restore the environment gate so GPU tests require manual approval on PRs again.
88e93e7 to dad7955 (Compare)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pyproject.toml (1)
1-3: ⚠️ Potential issue | 🔴 Critical

CI is failing: lock file out of sync with workspace.

The pipeline reports `pixi install` failed because the lock file doesn't match `pyproject.toml`. Run `pixi install` (without `--locked`) locally to regenerate `pixi.lock` and commit the updated lock file.

```shell
pixi install
git add pixi.lock
git commit -m "chore: regenerate pixi.lock"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pyproject.toml` around lines 1 - 3, CI is failing because pixi.lock is out of sync with pyproject.toml; run `pixi install` locally (without --locked) to regenerate pixi.lock, then add and commit the updated pixi.lock (e.g., git add pixi.lock && git commit -m "chore: regenerate pixi.lock") so the lock file matches the build-system section (build-backend = "hatchling.build", requires = ["hatchling"]) in pyproject.toml.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/utils/test_guidance_script_arguments.py`:
- Around line 122-129: The test mocks
sampleworks.utils.guidance_script_arguments._resolve_checkpoint with a
non-realistic message; update the mock in
test_validate_model_checkpoint_requires_non_empty_value to raise a ValueError
whose message matches the real resolver contract (e.g., start with a realistic
prefix like "could not resolve checkpoint" or "failed to resolve checkpoint")
and change the pytest.raises match to assert that prefix (use a regex matching
the prefix) so the test verifies the real resolver-style error rather than an
exact, brittle string.
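The realistic-message suggestion above can be sketched like so. Both `validate_model_checkpoint` and the error text are hypothetical stand-ins for the real sampleworks code; the transferable idea is mocking the resolver with a resolver-style message and matching only a stable prefix.

```python
import pytest
from unittest.mock import MagicMock


def validate_model_checkpoint(model_wrapper_type, checkpoint, resolver):
    # Hypothetical shape of the code under test: an empty checkpoint
    # falls through to the resolver, whose error propagates.
    if not checkpoint:
        return resolver(model_wrapper_type)
    return checkpoint


def test_empty_checkpoint_surfaces_resolver_error():
    resolver = MagicMock(
        side_effect=ValueError("could not resolve checkpoint for wrapper 'boltz'")
    )
    # Match a stable prefix rather than the exact string, so the test
    # survives wording tweaks while still verifying a resolver-style failure.
    with pytest.raises(ValueError, match=r"^could not resolve checkpoint"):
        validate_model_checkpoint("boltz", "", resolver)
```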
🪄 Autofix (Beta)
❌ Autofix failed (check again to retry)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7b2b8279-6444-43e0-858f-7425c5c2c80a
📒 Files selected for processing (13)
- .github/workflows/ci.yml
- .github/workflows/gpu-tests.yml
- pyproject.toml
- scripts/eval/bond_geometry_eval.py
- scripts/eval/run_and_process_phenix_clashscore.py
- scripts/eval/run_and_process_tortoize.py
- src/sampleworks/core/forward_models/xray/real_space_density_deps/ops/csrc/__init__.py
- src/sampleworks/eval/grid_search_eval_utils.py
- src/sampleworks/utils/msa.py
- tests/eval/test_structure_utils.py
- tests/models/protenix/test_ccd_expansion.py
- tests/utils/test_atom_array_utils.py
- tests/utils/test_guidance_script_arguments.py
✅ Files skipped from review due to trivial changes (9)
- scripts/eval/run_and_process_phenix_clashscore.py
- tests/eval/test_structure_utils.py
- tests/utils/test_atom_array_utils.py
- scripts/eval/bond_geometry_eval.py
- src/sampleworks/eval/grid_search_eval_utils.py
- scripts/eval/run_and_process_tortoize.py
- .github/workflows/ci.yml
- .github/workflows/gpu-tests.yml
- src/sampleworks/utils/msa.py
🚧 Files skipped from review as they are similar to previous changes (2)
- src/sampleworks/core/forward_models/xray/real_space_density_deps/ops/csrc/__init__.py
- tests/models/protenix/test_ccd_expansion.py
🧹 Nitpick comments (1)
tests/utils/test_guidance_script_arguments.py (1)
71-87: Strengthen this test by asserting resolver invocation in the call under test.

The current assertion checks final state only. Add a direct mock assertion so this test proves resolution happens when `populate_config_for_guidance_type` runs.

Suggested tightening:

```diff
 def test_populate_config_resolves_checkpoint_when_none_provided(_mock_resolve, model_wrapper_type):
     """populate_config_for_guidance_type should auto-resolve checkpoint if no arg exists."""
     config = GuidanceConfig(
         protein="protein",
         structure="/tmp/structure.cif",
         density="/tmp/density.mrc",
         model=model_wrapper_type,
         guidance_type=GuidanceType.PURE_GUIDANCE,
         log_path="/tmp/output/run.log",
     )
+    _mock_resolve.reset_mock()
     config.populate_config_for_guidance_type(
         _build_job(model_wrapper_type),
         Namespace(use_tweedie=False, step_scaler_type="noisespace"),
     )
+    _mock_resolve.assert_called_once()
     assert config.model_checkpoint == "/checkpoints/mock.ckpt"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/utils/test_guidance_script_arguments.py` around lines 71 - 87, The test only asserts the final checkpoint value but doesn't verify the resolver was invoked; update test_populate_config_resolves_checkpoint_when_none_provided to assert that the _mock_resolve mock was called when populate_config_for_guidance_type runs (e.g., _mock_resolve.assert_called_once() or _mock_resolve.assert_called_once_with(...) using the expected args from the call), referencing GuidanceConfig.populate_config_for_guidance_type and the _mock_resolve fixture so the test proves resolution occurred rather than just the resulting state.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1f3bd329-18d7-47f1-85dd-3ce32690cd78
📒 Files selected for processing (1)
tests/utils/test_guidance_script_arguments.py
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. Autofix skipped: no unresolved CodeRabbit review comments with fix instructions found.
k-chrispens
left a comment
LGTM, will add an issue to address the ty things
Summary
`ci.yml` workflow with 3 jobs on `ubuntu-latest`, no approval gates:
- `ruff check` + `ruff format --check`
- `ty check` across boltz-dev, protenix-dev, rf3-dev (matrix)
- `pytest -m 'not gpu'` across all 3 envs (412 tests, matrix)

Context
Addresses feedback from Karson and Marcus:
Test plan
Summary by CodeRabbit
Chores
Tests