test(msa): add MSAManager unit tests (#178) by Abdelsalam-Abbas · Pull Request #220 · diff-use/sampleworks

Abdelsalam-Abbas · 2026-04-13T10:30:16Z

Summary

Adds tests/utils/test_msa.py covering the five tests requested in #178:

1. _hash_arguments is deterministic for identical inputs and sensitive to sequence content, pairing strategy, and value order.
2. Explicit msa_cache_dir is used (both Path and str forms); default None creates ~/.sampleworks/msa (with Path.home monkeypatched to tmp_path so the real home is untouched).
3. get_msa calls _compute_msa when cache files are missing, and skips it on the second call when they exist. Cache-hit/api-call counters are asserted.
4. _compute_msa forwards the expected arguments to run_mmseqs2 for both the paired and unpaired branches (2-sequence input so both fire), including auth_headers built from api_key_header / api_key_value.
5. _compute_msa writes matching .csv and .a3m files: the second column of the csv equals the odd (sequence) lines of the a3m.
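
The csv/a3m consistency check in the last item can be illustrated with a minimal sketch. It is not the code in tests/utils/test_msa.py: `write_pair`, the `key,sequence` header, and the `>seq_N` a3m headers are assumptions made for illustration, and the real files written by _compute_msa may be laid out differently.

```python
import csv
import tempfile
from pathlib import Path


def write_pair(directory, stem, sequences):
    """Hypothetical writer: emit <stem>.csv and <stem>.a3m from the same sequences."""
    csv_path = directory / f"{stem}.csv"
    a3m_path = directory / f"{stem}.a3m"
    with csv_path.open("w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["key", "sequence"])  # assumed header layout
        for i, seq in enumerate(sequences):
            writer.writerow([i, seq])
    with a3m_path.open("w") as fh:
        for i, seq in enumerate(sequences):
            fh.write(f">seq_{i}\n{seq}\n")  # header line, then sequence line
    return csv_path, a3m_path


def csv_sequences(csv_path):
    """Return the second column of every data row (header skipped)."""
    with csv_path.open(newline="") as fh:
        rows = list(csv.reader(fh))
    return [row[1] for row in rows[1:]]


def a3m_sequences(a3m_path):
    """Return the sequence lines, which sit at odd 0-based indices."""
    return a3m_path.read_text().splitlines()[1::2]


with tempfile.TemporaryDirectory() as tmp:
    csv_path, a3m_path = write_pair(Path(tmp), "target_0", ["MKTAYIAK", "MKT-YIAK"])
    from_csv = csv_sequences(csv_path)
    from_a3m = a3m_sequences(a3m_path)

assert from_csv == from_a3m
```

Comparing the two parsed lists, rather than raw file text, keeps the check independent of header naming in either file.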

Notes

- Mocks patch sampleworks.utils.msa.run_mmseqs2 and sampleworks.utils.msa._compute_msa (not the source modules) so the actual call sites in msa.py are intercepted: run_mmseqs2 is imported at module load, and _compute_msa is module-level.
- Test 3 uses a side_effect that writes a valid csv/a3m pair so the unconditional _validate_msa_cache_contents call inside get_msa passes, doubling as a smoke check that the cache layout _compute_msa writes matches what get_msa expects.
- Protenix branch is intentionally not tested per the issue; no new dependency on PROTENIX_AVAILABLE.
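
The reason for patching the name inside the consuming module can be shown with a small, self-contained sketch. The two in-memory modules `demo_backend` and `demo_consumer` are hypothetical stand-ins (they are not sampleworks modules); the point is only that `from X import f` binds a separate reference, so a patch on `X.f` does not reach it.

```python
import sys
import types
from unittest.mock import patch

# Build two throwaway modules in memory (hypothetical names).
backend = types.ModuleType("demo_backend")
exec("def run_search(query):\n    return f'real:{query}'\n", backend.__dict__)
sys.modules["demo_backend"] = backend

consumer = types.ModuleType("demo_consumer")
exec(
    "from demo_backend import run_search\n"  # binds a local reference at load time
    "def compute(query):\n"
    "    return run_search(query)\n",
    consumer.__dict__,
)
sys.modules["demo_consumer"] = consumer

# Patching the defining module leaves the consumer's own reference untouched.
with patch("demo_backend.run_search", return_value="mocked"):
    via_source = consumer.compute("q")

# Patching the name in the consuming module intercepts the call site.
with patch("demo_consumer.run_search", return_value="mocked"):
    via_consumer = consumer.compute("q")

assert via_source == "real:q"
assert via_consumer == "mocked"
```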

Test plan

- pixi run -e boltz-dev python3 -m pytest tests/utils/test_msa.py -v → 7 passed in 0.03s
- pixi run -e boltz-dev all-tests → 748 passed, 509 skipped, no regressions

Summary by CodeRabbit

Tests
- Added a comprehensive test suite for MSA utilities covering deterministic input hashing, cache behavior and directory initialization, correct triggering of MSA computation when cache entries are missing, and subsequent cache hits. Also validates integration with the external alignment tool (correct call patterns and flags), temp-directory usage, and output consistency between generated alignment files.

coderabbitai · 2026-04-13T10:30:39Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f5044a80-6084-483c-9c8a-75906ee1a36b

📥 Commits

Reviewing files that changed from the base of the PR and between cdefe47 and 2450caf.

📒 Files selected for processing (1)

tests/utils/test_msa.py

🚧 Files skipped from review as they are similar to previous changes (1)

tests/utils/test_msa.py

📝 Walkthrough

Adds a new pytest suite that tests MSA utilities: argument hashing, MSAManager cache-dir handling and cache hit/miss behavior, _compute_msa forwarding to run_mmseqs2, and writing/validation of .csv and .a3m outputs.
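
As an illustration of the cache-dir handling mentioned above, here is a hedged sketch of testing a default directory without touching the real home. `CacheManager` and the `.demo/msa` layout are hypothetical stand-ins, and `unittest.mock.patch.object` is used in place of pytest's monkeypatch fixture so the snippet runs on its own.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union
from unittest.mock import patch


class CacheManager:
    """Hypothetical stand-in for a manager with an optional cache directory."""

    def __init__(self, cache_dir: Optional[Union[str, Path]] = None) -> None:
        if cache_dir is None:
            cache_dir = Path.home() / ".demo" / "msa"  # assumed default layout
        self.cache_dir = Path(cache_dir)  # accepts both str and Path
        self.cache_dir.mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    fake_home = Path(tmp)
    # Redirect Path.home so the default branch never touches the real home.
    with patch.object(Path, "home", return_value=fake_home):
        default_manager = CacheManager()
    default_ok = (
        default_manager.cache_dir == fake_home / ".demo" / "msa"
        and default_manager.cache_dir.is_dir()
    )
    explicit_manager = CacheManager(str(fake_home / "explicit"))
    explicit_ok = (
        explicit_manager.cache_dir == fake_home / "explicit"
        and explicit_manager.cache_dir.is_dir()
    )

assert default_ok and explicit_ok
```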

Changes

| Cohort / File(s) | Summary |
|---|---|
| MSA Test Suite `tests/utils/test_msa.py` | New test file covering `_hash_arguments` determinism and sensitivity to input/order; `MSAManager` cache-dir handling for `Path`, `str`, and `None` (default `Path.home()`); `get_msa` cache miss/hit behavior with mocked `_compute_msa`; `_compute_msa` forwarding to `run_mmseqs2` for paired/unpaired flows; and verification that `<target>_0.csv` and `<target>_0.a3m` are written and parsable. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 I hop through hashes, cache and file,
Pairing sequences, testing all the while.
A3Ms and CSVs tucked neat in rows,
Millisecond leaps where the mmseqs flow.
🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'test(msa): add MSAManager unit tests (`#178`)' directly and clearly describes the main change: adding unit tests for the MSAManager class, covering five comprehensive test scenarios as detailed in the PR objectives. |


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/utils/test_msa.py (1)
90-175: Add at least one black-box, no-mock path for MSAManager.get_msa

Most tests here are implementation-coupled (private methods + patched internals). Keeping these unit tests is fine, but add one public-interface test that pre-seeds cache files and calls get_msa without patching internals.

As per coding guidelines: **/test_*.py: Write black-box tests that verify behavior, not implementation. Use realistic inputs and avoid mocks. Test public interfaces with expected behaviors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/utils/test_msa.py` around lines 90 - 175, Add a black-box test that
exercises the public MSAManager.get_msa without mocking internals: create a
realistic input dict and compute the expected hash_key the manager would use,
pre-seed the msa cache files (both CSV and A3M as used by _compute_msa's output
naming) under tmp_path using that hash_key, instantiate MSAManager (or use
existing manager fixture) pointing to tmp_path, call manager.get_msa(data,
pairing, structure_predictor=...) and assert the returned mapping points to the
pre-seeded files and that manager._cache_hits increments, avoiding any
patch.object on _compute_msa or run_mmseqs2 so the test verifies the public
behavior only.
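
A rough sketch of the black-box shape suggested here is given below. `TinyMSACache`, its `key_for` helper, the `<key>_<i>.csv` naming, and the `"greedy"` pairing value are all hypothetical; the real MSAManager hashing and file layout are not shown in this thread and may differ. The sketch only demonstrates the pattern: pre-seed the cache, call the public method, and check the result without patching anything.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class TinyMSACache:
    """Hypothetical stand-in; the real MSAManager API may differ."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_hits = 0
        self.api_calls = 0

    def key_for(self, data, pairing):
        payload = json.dumps({"data": data, "pairing": pairing}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]

    def get_msa(self, data, pairing):
        key = self.key_for(data, pairing)
        paths = {name: self.cache_dir / f"{key}_{i}.csv" for i, name in enumerate(data)}
        if all(p.exists() for p in paths.values()):
            self.cache_hits += 1
            return paths
        # Cache miss: a real implementation would query the alignment server.
        self.api_calls += 1
        raise RuntimeError("would contact the alignment server here")


with tempfile.TemporaryDirectory() as tmp:
    cache = TinyMSACache(tmp)
    data = {"A": "MKTAYIAK", "B": "GSHMLEDP"}
    key = cache.key_for(data, "greedy")
    # Pre-seed one file per chain so the public call finds a complete cache entry.
    for i, seq in enumerate(data.values()):
        (Path(tmp) / f"{key}_{i}.csv").write_text(f"key,sequence\n0,{seq}\n")
    result = cache.get_msa(data, "greedy")
    all_exist = all(p.exists() for p in result.values())
    hit_count, call_count = cache.cache_hits, cache.api_calls

assert all_exist and hit_count == 1 and call_count == 0
```

Because a miss raises in this stand-in, the test fails loudly if the seeded file names ever drift from what the public method expects.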

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/utils/test_msa.py`:
- Around line 149-155: The unpaired call assertions are missing checks that auth
and environment flags are forwarded like the paired call; update the assertions
for unpaired_call in tests/utils/test_msa.py to assert that unpaired_call.kwargs
contains the same auth_headers and env-related flags as the paired call (e.g.,
assert unpaired_call.kwargs["auth_headers"] == <expected_auth_headers> and
assert unpaired_call.kwargs["env"] == <expected_env_flags> or equivalent keys
used elsewhere), mirroring the paired-call assertions so unpaired forwarding
cannot regress silently; locate the test by the symbol unpaired_call and add
matching assertions for the auth and env keys used in the existing paired-call
checks.

---

Nitpick comments:
In `@tests/utils/test_msa.py`:
- Around line 90-175: Add a black-box test that exercises the public
MSAManager.get_msa without mocking internals: create a realistic input dict and
compute the expected hash_key the manager would use, pre-seed the msa cache
files (both CSV and A3M as used by _compute_msa's output naming) under tmp_path
using that hash_key, instantiate MSAManager (or use existing manager fixture)
pointing to tmp_path, call manager.get_msa(data, pairing,
structure_predictor=...) and assert the returned mapping points to the
pre-seeded files and that manager._cache_hits increments, avoiding any
patch.object on _compute_msa or run_mmseqs2 so the test verifies the public
behavior only.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19ebbd21-8f22-4916-9999-7a77a5ec7c0f

📥 Commits

Reviewing files that changed from the base of the PR and between f57044e and 87a005b.

📒 Files selected for processing (1)

tests/utils/test_msa.py

Closes #178. Covers hash determinism/sensitivity, default vs explicit cache directory handling, get_msa cache hit/miss wiring, run_mmseqs2 argument forwarding (paired and unpaired branches), and csv/a3m file content consistency. All mocks patch the names as imported into sampleworks.utils.msa so the module-level call sites are intercepted.
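
For context, the hash determinism/sensitivity checks can be sketched as follows. `hash_arguments` here is a hypothetical stand-in built on `hashlib` and `json`; it is not the `_hash_arguments` implementation in sampleworks, and the pairing values `"greedy"` and `"complete"` are illustrative.

```python
import hashlib
import json


def hash_arguments(data, pairing):
    """Hypothetical stand-in for a cache-key hash over sequences and pairing."""
    # Values are serialized in order, so swapping them changes the digest.
    payload = json.dumps({"sequences": list(data.values()), "pairing": pairing})
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


base = hash_arguments({"A": "MKT", "B": "GSH"}, "greedy")
same = hash_arguments({"A": "MKT", "B": "GSH"}, "greedy")
changed_sequence = hash_arguments({"A": "MKV", "B": "GSH"}, "greedy")
changed_pairing = hash_arguments({"A": "MKT", "B": "GSH"}, "complete")
swapped_order = hash_arguments({"A": "GSH", "B": "MKT"}, "greedy")

# Identical inputs agree; each variation yields a distinct key.
assert base == same
assert len({base, changed_sequence, changed_pairing, swapped_order}) == 4
```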

k-chrispens

This looks correct to me, though I don't see where the original issue said not to add Protenix-branch tests. I believe the tests requested in the issue behave similarly for either branch (the only differences are which server is called and how things are linked up for Protenix), so I will approve here.

Abdelsalam-Abbas had a problem deploying to gpu-testing April 13, 2026 10:30 — with GitHub Actions Error

coderabbitai Bot reviewed Apr 13, 2026

Comment thread tests/utils/test_msa.py Outdated

Abdelsalam-Abbas had a problem deploying to gpu-testing April 13, 2026 19:45 — with GitHub Actions Error

Abdelsalam-Abbas requested review from k-chrispens and marcuscollins April 13, 2026 21:18

Abdelsalam-Abbas added 2 commits April 13, 2026 21:27

test(msa): assert unpaired run_mmseqs2 call forwards env/auth flags

2450caf

Mirror the paired-call assertions on the unpaired call so forwarding of use_env and auth_headers can't regress silently.
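
A minimal sketch of the mirrored-assertion pattern, assuming a hypothetical `compute` function that issues one paired and one unpaired call to an injected `search` callable. The `use_pairing` keyword and the `X-API-Key` header name are illustrative assumptions; only `use_env` and `auth_headers` are named in this PR.

```python
from unittest.mock import MagicMock


def compute(sequences, search, *, use_env, auth_headers):
    """Hypothetical stand-in that issues a paired and an unpaired search."""
    search(sequences, use_pairing=True, use_env=use_env, auth_headers=auth_headers)
    search(sequences, use_pairing=False, use_env=use_env, auth_headers=auth_headers)


search = MagicMock(return_value=[])
headers = {"X-API-Key": "secret"}
compute(["MKT", "GSH"], search, use_env=True, auth_headers=headers)

# Unpack the two recorded calls in the order they were made.
paired_call, unpaired_call = search.call_args_list

# Compare the same keys on both calls so neither branch can drift unnoticed.
shared = ("use_env", "auth_headers")
paired_flags = {k: paired_call.kwargs[k] for k in shared}
unpaired_flags = {k: unpaired_call.kwargs[k] for k in shared}

assert paired_flags == unpaired_flags == {"use_env": True, "auth_headers": headers}
```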

k-chrispens force-pushed the msa-tests branch from cdefe47 to 2450caf Compare April 14, 2026 04:27

k-chrispens requested a deployment to gpu-testing April 14, 2026 04:27 — with GitHub Actions Waiting

k-chrispens approved these changes Apr 14, 2026

k-chrispens linked an issue Apr 14, 2026 that may be closed by this pull request

Add tests for MSAManager #178

Closed

k-chrispens merged commit a777440 into main Apr 14, 2026
8 of 11 checks passed

k-chrispens deleted the msa-tests branch April 14, 2026 04:34

k-chrispens merged 2 commits into main from msa-tests
