Add flexible Diffuse params runs (#225)
Conversation
Declares sampleworks run profiles (boltz2-xrd, boltz2-md, protenix, rf3) with input schemas, args templates, volume mounts, and metadata fields. Applied to Diffuse via `diffuse apply`.
📝 Walkthrough

Adds a params-file execution mode and a Diffuse contract, plus related plumbing.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Diffuse as Diffuse Scheduler
    participant Entrypoint as docker-entrypoint.sh
    participant RunParams as run_params module
    participant RunGrid as run_grid_search.py
    participant Container as Sampleworks Container
    participant FS as Filesystem
    User->>Diffuse: submit job with params_json
    Diffuse->>FS: materialize /diffuse/input/params.json
    Diffuse->>Container: start container with --params /diffuse/input/params.json --output-dir /data/results
    Container->>Entrypoint: invoke entrypoint with --params args
    Entrypoint->>RunParams: call infer_env_from_params -> determines pixi env
    Entrypoint->>RunGrid: exec run_grid_search.py under inferred env
    RunGrid->>RunParams: load and apply_run_params(params.json)
    RunParams->>FS: write params.original.json, params.resolved.json, inline proteins CSV (if present)
    RunGrid->>Container: orchestrate jobs (process pool, guidance)
    RunGrid->>FS: write run_summary.json (status, counts, timestamps, error if any)
    Container->>Diffuse: exit status / logs
```
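The env-inference step in the diagram can be sketched standalone. This is a minimal illustration under assumptions: only `infer_env_from_params` is named in the PR; the model-to-env table and the `env_for_model` helper are hypothetical stand-ins for whatever the real `run_params` module does.

```python
import json
from pathlib import Path

# Hypothetical model -> pixi environment table; the real mapping lives in run_params.
_MODEL_ENVS = {"boltz1": "boltz", "boltz2": "boltz", "protenix": "protenix", "rf3": "rf3"}

def env_for_model(model: str, default: str = "boltz") -> str:
    """Map a model name to the pixi environment assumed to run it."""
    return _MODEL_ENVS.get(model, default)

def infer_env_from_params(params_path: str) -> str:
    """Read params.json and pick the env for its `model` field."""
    params = json.loads(Path(params_path).read_text())
    return env_for_model(params.get("model", ""))
```

The entrypoint would then exec `pixi run -e <env> run_grid_search.py ...` with the returned name.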
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@diffuse.yaml`:
- Around line 243-247: The static_args are using per-model flags
--protenix-checkpoint and --rf3-checkpoint which the script does not recognize;
replace those occurrences in static_args with the single generic flag
--model-checkpoint and keep the same checkpoint path values so the arguments
match the parser in run_grid_search.py (search for --protenix-checkpoint and
--rf3-checkpoint and change them to --model-checkpoint).
- Around line 79-93: The args_template in the diffuse profiles references
incorrect CLI flags for run_grid_search.py: update every profile's flag_args to
use the singular flags the script expects (change --models → --model and
--methods → --method) and replace any profile-specific checkpoint flags in
static_args (e.g., --protenix-checkpoint, --rf3-checkpoint) with the generic
--model-checkpoint used by run_grid_search.py; also correct the input_schema
entry for partial_diffusion_step so its default is a numeric value (not the
string "120") and its type remains number.
- Around line 42-44: The YAML schema sets partial_diffusion_step as type: number
but its default is a quoted string ("120"); update all occurrences of the
partial_diffusion_step default (the four instances) to an unquoted numeric value
(120) so the default matches the declared type; search for the key name
partial_diffusion_step in diffuse.yaml and replace the quoted default "120" with
120 for each occurrence.
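The type/default mismatch in the third comment is the kind of thing a small schema lint catches before `diffuse apply`. A sketch under assumptions: `input_schema` entries are treated as dicts with `key`, `type`, and `default` (matching the snippets quoted below); the checker itself is hypothetical and not part of this PR.

```python
# Per-type predicates; bool is excluded from "number" since bool subclasses int.
_TYPE_CHECKS = {
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "text": lambda v: isinstance(v, str),
    "enum": lambda v: isinstance(v, str),
}

def default_type_errors(schema_entries):
    """Return a message for every entry whose default does not match its declared type."""
    errors = []
    for entry in schema_entries:
        default = entry.get("default")
        check = _TYPE_CHECKS.get(entry.get("type"))
        if default is not None and check is not None and not check(default):
            errors.append(
                f"{entry['key']}: default {default!r} does not match type {entry['type']}"
            )
    return errors
```

Run against the flagged entry, `{"key": "partial_diffusion_step", "type": "number", "default": "120"}` produces one error, while an unquoted `120` passes.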
Pull request overview
Adds a diffuse.yaml contract file for integrating Sampleworks with the Diffuse platform, defining 4 runnable profiles and shared metadata fields so Diffuse can apply/upsert these run configurations.
Changes:
- Introduces `diffuse.yaml` with 4 profiles (`boltz2-xrd`, `boltz2-md`, `protenix`, `rf3`) including container config, input schemas, and argument templates.
- Defines default runtime settings (GPU min, shared memory, pull policy, retry/poll behavior) plus volume mounts for inputs/results/cache/checkpoints.
- Adds a `fields:` section describing standardized metadata fields for runs.
```yaml
ensemble_sizes: --ensemble-sizes
gradient_weights: --gradient-weights
methods: --methods
boolean_args:
```

```yaml
static_args:
  - --output-dir
  - /data/results
  - --protenix-checkpoint
```

```yaml
static_args:
  - --output-dir
  - /data/results
  - --rf3-checkpoint
```

```yaml
default: pure_guidance
- key: partial_diffusion_step
  type: number
  default: "120"
```

```yaml
- key: models
  type: enum
  required: true
  allowed_values: [boltz2]
  default: boltz2
```

```yaml
- key: methods
  type: text
  default: "X-RAY DIFFRACTION"
```
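The quoted `flag_args`/`static_args` snippets above are templates that Diffuse expands into a command line. A hypothetical rendering of that expansion (the real templating engine is not part of this PR) makes clear why the template's flag names must match `run_grid_search.py`'s parser exactly: a stale name like `--methods` only fails at container runtime.

```python
def build_args(params, flag_map, static_args):
    """Expand a params dict into argv: static args first, then mapped flags."""
    argv = list(static_args)
    for key, flag in flag_map.items():
        if key in params and params[key] is not None:
            argv += [flag, str(params[key])]
    return argv
```

For example, `build_args({"methods": "X-RAY DIFFRACTION"}, {"methods": "--methods"}, ["--output-dir", "/data/results"])` yields `--methods ...`, which an argparse parser expecting `--method` would reject.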
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@GRID_SEARCH.md`:
- Around line 29-30: The continued shell lines for the example flags (--model
boltz2 and --method "X-RAY DIFFRACTION") include trailing inline comments after
the backslash which breaks bash line continuation; remove or relocate those
inline annotations so the backslash is the last character on the line (e.g., put
explanatory comments on their own preceding lines or after the full continued
command), ensuring the --model and --method lines end with a literal "\" and no
trailing text so the block is copy/pasteable.
In `@run_grid_search.py`:
- Around line 292-305: The run summary currently derives overall status only
from the JobResult objects in results, which misses jobs dropped by crashed
workers; update the post-run summary logic (the block using results, successful,
failed, status before calling write_run_summary) to treat any missing JobResult
entries as failures by computing expected = len(filtered_jobs), got =
len(results), missing = max(0, expected - got) and adding missing to failed (or
use an explicit worker-failure count returned by run_grid_search() if
available); then compute status = "dry_run" if args.dry_run else ("failed" if
failed > 0 else "success") and pass the adjusted failed_jobs and successful_jobs
into write_run_summary so the summary reflects dropped jobs.
In `@tests/utils/test_run_params.py`:
- Around line 242-243: The test's regex in the pytest.raises call uses an
unescaped dot so "inputs.proteins or proteins_path" matches any char; update the
match pattern used in the with pytest.raises(...) assertion for apply_run_params
to escape the literal dot (e.g., use "inputs\.proteins or proteins_path") so the
test only passes for the exact error message referencing inputs.proteins; keep
the rest of the assertion and the call to apply_run_params(_namespace(...,
proteins=None)) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0c3c980b-dc60-4184-8cbb-b7928e91f3b2
📒 Files selected for processing (9)
- Dockerfile
- GRID_SEARCH.md
- README.md
- diffuse.yaml
- docker-entrypoint.sh
- run_grid_search.py
- src/sampleworks/utils/guidance_script_utils.py
- src/sampleworks/utils/run_params.py
- tests/utils/test_run_params.py
✅ Files skipped from review due to trivial changes (1)
- README.md
```bash
--model boltz2 \ # options: boltz1, boltz2, protenix, rf3 (make sure env aligns!)
--method "X-RAY DIFFRACTION" \ # only useful for Boltz-2, ignored otherwise
```
Move the inline annotations off these continued shell lines.
In bash, `\` only continues the command when it is the last character on the line. With the trailing comments here, this example is not copy/pasteable as written.
🔧 Suggested doc fix

```diff
-  --model boltz2 \ # options: boltz1, boltz2, protenix, rf3 (make sure env aligns!)
-  --method "X-RAY DIFFRACTION" \ # only useful for Boltz-2, ignored otherwise
+  # options: boltz1, boltz2, protenix, rf3 (make sure env aligns!)
+  --model boltz2 \
+  # only useful for Boltz-2, ignored otherwise
+  --method "X-RAY DIFFRACTION" \
```
```python
successful = sum(1 for result in results if result.status == "success")
failed = sum(1 for result in results if result.status == "failed")
status = "dry_run" if args.dry_run else "success"
if failed:
    status = "failed"
write_run_summary(
    args=args,
    output_dir=args.output_dir,
    status=status,
    started_at=started_at,
    total_jobs=len(filtered_jobs),
    successful_jobs=successful,
    failed_jobs=failed,
)
```
There was a problem hiding this comment.
Don't derive the final run status from results alone.
If a worker crashes before writing its `.results.pkl`, `run_grid_search()` records that failure at lines 223-225 but returns no `JobResult` for the dropped jobs. This block then writes `status="success"` with `failed_jobs=0`, so the run summary misrepresents a partially failed execution.
🔧 Suggested fix

```diff
-    successful = sum(1 for result in results if result.status == "success")
-    failed = sum(1 for result in results if result.status == "failed")
+    successful = sum(1 for result in results if result.status == "success")
+    failed = 0 if args.dry_run else max(0, len(filtered_jobs) - successful)
     status = "dry_run" if args.dry_run else "success"
     if failed:
         status = "failed"
```

At minimum, count missing JobResults as failures. Returning explicit worker-failure counts from `run_grid_search()` would be even better.
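Standalone, the suggested accounting looks like this. `JobResult` here is a stand-in dataclass and `summarize` is illustrative, not the repo's function; the point is that jobs with no result at all are counted as failures.

```python
from dataclasses import dataclass

@dataclass
class JobResult:  # stand-in for the real JobResult
    status: str

def summarize(results, expected_jobs, dry_run=False):
    """Count missing results as failures so crashed workers can't yield a 'success' summary."""
    successful = sum(1 for r in results if r.status == "success")
    failed = sum(1 for r in results if r.status == "failed")
    missing = max(0, expected_jobs - len(results))  # jobs dropped by crashed workers
    failed += missing
    status = "dry_run" if dry_run else ("failed" if failed else "success")
    return {"status": status, "successful_jobs": successful, "failed_jobs": failed}
```

With two successes out of three expected jobs, the summary reports a failure instead of silently claiming success.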
```python
with pytest.raises(ValueError, match="inputs.proteins or proteins_path"):
    apply_run_params(_namespace(params=str(params_path), proteins=None))
```
Escape the dot in this match= pattern.
`pytest.raises(..., match=...)` treats the pattern as a regex, so the unescaped dot in `inputs.proteins` matches any character between `inputs` and `proteins`. That can let the test pass on the wrong error string.
🔧 Suggested fix

```diff
-        with pytest.raises(ValueError, match="inputs.proteins or proteins_path"):
+        with pytest.raises(ValueError, match=r"inputs\.proteins or proteins_path"):
```

🪛 Ruff (0.15.12) flags the same issue at line 242: RUF043 — pattern passed to `match=` contains metacharacters but is neither escaped nor raw.
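The failure mode is easy to demonstrate with `re` directly: the unescaped pattern accepts a string that never contains `inputs.proteins`, while the escaped raw string correctly rejects it.

```python
import re

# Unescaped: '.' is a regex metacharacter that matches any single character.
loose = "inputs.proteins or proteins_path"
# A message that never mentions inputs.proteins, yet still matches the loose pattern:
wrong = "inputsXproteins or proteins_path"

assert re.search(loose, wrong) is not None                              # false positive
assert re.search(r"inputs\.proteins or proteins_path", wrong) is None   # escaped: rejects
```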
Summary
- Adds a Diffuse `sampleworks` profile that accepts a flexible `params.json` payload and runs via `--params /diffuse/input/params.json --output-dir ...`.
- The entrypoint infers the pixi environment from the params file and execs `-e <env> run_grid_search.py ...`.
- Writes run artifacts (`params.original.json`, `params.resolved.json`, `run_summary.json`, enriched `job_metadata.json`) and updates docs/examples away from stale fixed-profile flags.
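The `run_summary.json` artifact could be assembled roughly like this. A sketch under assumptions: the PR only names the status, counts, timestamps, and optional error, so the exact field names and the two helper functions here are illustrative, not the repo's implementation.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def build_run_summary(status, started_at, total_jobs, successful_jobs, failed_jobs, error=None):
    """Assemble the run_summary payload: status, counts, timestamps, optional error."""
    summary = {
        "status": status,
        "started_at": started_at,
        "finished_at": datetime.now(timezone.utc).isoformat(),
        "total_jobs": total_jobs,
        "successful_jobs": successful_jobs,
        "failed_jobs": failed_jobs,
    }
    if error is not None:
        summary["error"] = error
    return summary

def write_run_summary(output_dir, **kwargs):
    """Serialize the summary to <output_dir>/run_summary.json."""
    path = Path(output_dir) / "run_summary.json"
    path.write_text(json.dumps(build_run_summary(**kwargs), indent=2))
    return path
```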
Validation
- `bash -n docker-entrypoint.sh`
- `python3 -m py_compile run_grid_search.py src/sampleworks/utils/run_params.py src/sampleworks/utils/guidance_script_utils.py tests/utils/test_run_params.py`
- `PYTHONPATH=src python3` smoke checks for nested config, inline proteins, unknown-param preservation, and model inference

Notes
- Uses `--params <file>` for parameter passing; no `--params-json` or `--params-env` user path is exposed.

Summary by CodeRabbit
New Features
Documentation
Build
Tests