Skip to content

feat(runs): add analysis preset harness#245

Merged
xraymemory merged 5 commits into
mainfrom
feat/run-analysis-eval-harness
Jun 1, 2026
Merged

feat(runs): add analysis preset harness#245
xraymemory merged 5 commits into
mainfrom
feat/run-analysis-eval-harness

Conversation

@xraymemory

@xraymemory xraymemory commented May 28, 2026

Copy link
Copy Markdown
Contributor

Summary

  • add analysis preset TOML configs and run_analysis entrypoints
  • extend sampleworks-runs loading/execution support for analysis presets
  • document evaluation harness usage and add runner coverage

Tests

  • Not run locally (PR creation only).

Summary by CodeRabbit

  • New Features

    • Added analysis workflow system via run_analysis command for preset-driven evaluation jobs
    • Added support for grid-search analysis, altloc detection/classification, and external evaluation tools
    • New CLI options: --depth for controlling CIF file search depth; --skip-pre-jobs to re-run analyses without repeating patching steps
  • Documentation

    • Updated evaluation guide with analysis workflow examples and configuration details
  • Tests

    • Expanded test coverage for analysis orchestration and CLI validation

@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds a complete configurable preset-driven analysis job orchestration system by extending the Job schema to support per-job script selection and output-flag customization, parameterizing the CLI and preset loader to support arbitrary preset directories, refactoring the runner to orchestrate sequential pre-jobs followed by parallel main jobs with per-job script resolution, providing five analysis TOML preset configurations, introducing a run_analysis shell entry point with Pixi project detection and sync/mismatch policy enforcement, and integrating Docker container support with updated entrypoint routing.

Changes

Analysis Presets Framework

Layer / File(s) Summary
Job and Preset schema with analysis environment support
src/sampleworks/runs/schema.py
Job gains script (default "") and output_arg (default "output-dir") fields with validation rejecting leading dashes. Preset gains pre_jobs list field. VALID_PIXI_ENVS expands to include analysis and dev environment variants. Preset validation checks job name uniqueness across both pre_jobs and main jobs.
CLI configuration framework and refactoring
src/sampleworks/runs/cli.py
Introduces CliConfig frozen dataclass to parameterize preset directory discovery, default preset/aliases, and results-dir fallback logic. Refactors main() to delegate to run_cli(config) with centralized parser construction, --list support, config-driven target resolution, job filtering (preserving pre_jobs), expanded --show output (including _print_job helper), and results-dir selection using config defaults.
Analysis CLI entry point and wheel packaging
src/sampleworks/runs/analysis_cli.py, pyproject.toml
New analysis_cli.py module with ANALYSIS_CLI_CONFIG for analysis-specific preset discovery (analyses/ directory) and main(argv) delegating to run_cli. Adds sampleworks-analysis console script entrypoint and forces analyses package inclusion in wheel build.
Parameterized preset discovery and loading
src/sampleworks/runs/loader.py
Extends list_presets() and load_preset() to accept preset_dir_name and preset_dir_env_var parameters. Refactors directory discovery into _preset_dirs() (env override, SAMPLEWORKS_SOURCE_DIR, workspace default, upward search) and _find_upward_preset_dirs(). Parameterizes _read_toml and _find_preset_path. Recognizes pre_jobs as top-level override target. Introduces _build_jobs helper to populate script and output_arg fields during preset construction.
Runner invocation building and pre-jobs orchestration
src/sampleworks/runs/runner.py
Factors invocation building into _build_invocations_for_jobs() supporting output injection via job.output_arg and per-job script resolution via _resolve_script_path(). Adds DISABLE_GPU_ASSIGNMENTS constant and _gpu_assignment_disables_gpus() for centralizing CPU-only job detection. Updates _build_argv() to resolve scripts and emit expanded list/tuple arguments. Refactors run() to separately build pre-invocations and main invocations, support skip_pre_jobs option, validate GPU assignments per phase, print phase-labeled dry-run/launch output, and execute pre-jobs sequentially via _run_sequential() before main jobs.
Analysis preset TOML configurations
analyses/analyze_grid_search.toml, analyses/external_tools.toml, analyses/all.toml, analyses/altloc_find.toml, analyses/altloc_classify.toml
Five TOML preset files with defaults, shared_args parameterization, pre_jobs patching steps, and main job entries. analyze_grid_search: rscc (1 GPU), lddt/bond_geometry (no GPU). external_tools: tortoize/phenix_clashscore (no GPU) with shared patch pre-step. all: combines five evaluation jobs. altloc_find/altloc_classify: standalone CPU jobs independent of grid-search outputs.
run_analysis shell entry point with Pixi project detection
run_analysis, run_analysis.sh
Bash script discovering Sampleworks repo root via is_sampleworks_root() and find_sampleworks_root_upwards(). Parses --preset, --results-dir, --jobs, positional target; enforces required GRID_SEARCH_RESULTS_DIR/INPUTS_DIR; derives analysis paths and environment exports. Detects Pixi project directory by comparing /app's pyproject.toml/pixi.lock against source. Implements pixi sync/mismatch policy: disallows runtime updates when prebuilt /app diverges from source; allows updates and sets rebuild flags when synced. Execs prebuilt Python runner directly (adjusting PATH/CONDA/CUDA) or pixi run for dynamic invocation. run_analysis.sh wrapper enforces strict shell mode and forwards to run_analysis binary.
Docker container updates and entrypoint routing
Dockerfile, docker-entrypoint.sh
Dockerfile adds pixi install -e analysis, removes /app/analyses and run_* files from runtime image, creates /home/dev/workspace, copies run_analysis and run_analysis.sh to /usr/local/bin/, appends /root/.bashrc auto-cd hook. docker-entrypoint.sh extends help text, adds run_analysis to helper-command passthrough, validates analysis in environment case, updates error messages.
Documentation updates and evaluation utilities
README.md, scripts/eval/EVALUATION.md, src/sampleworks/eval/grid_search_eval_utils.py, src/sampleworks/runs/__init__.py
README.md adds "Running preset analyses on ACTL (run_analysis)" section with environment exports, example commands, --set overrides, --skip-pre-jobs flag. EVALUATION.md updates CIF preparation and running guidance to position run_analysis as preferred entrypoint. grid_search_eval_utils.py adds --depth option for CIF recursion depth. Module docstring updated to reflect preset-orchestrator terminology.
Patch script exit code handling
scripts/patch_output_cif_files.py
Refactors main() to return int exit codes (0 success, 1 no-match or errors). Updates docstring and main block to raise SystemExit(main(...)) so process status reflects return value.
Test coverage for analysis framework
tests/runs/test_runner.py, tests/runs/test_cli.py
test_analysis_preset_builds_eval_script_invocations() validates pre-job patching and RSCC wiring. test_pre_jobs_run_before_main_jobs() confirms execution ordering. test_analysis_cli_lists_analysis_presets() checks preset enumeration. test_gpu_validation_ignores_cpu_jobs_but_checks_gpu_duplicates() ensures CPU jobs bypass validation while GPU duplicates error. test_output_arg_rejects_any_leading_dash() validates output_arg constraints. test_jobs_filters_positional_preset() verifies --jobs filtering with positional preset selection.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • diff-use/sampleworks#240: Extended the same sampleworks.runs orchestration infrastructure (preset loader, CLI, runner) with analysis support and new features like pre_jobs, skip_pre_jobs, per-job script/output_arg fields.
  • diff-use/sampleworks#186: Introduced evaluation scripts (run_and_process_tortoize.py, run_and_process_phenix_clashscore.py) that this PR now orchestrates via analysis preset TOML configurations.

Suggested reviewers

  • marcuscollins
  • k-chrispens

Poem

🐰 Presets bloom in TOML rows,
Pre-jobs patch before the flows,
Pixi roots are now our test,
Scripts called per-job—truly blessed!
Analysis runs, at long last free.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding an analysis preset harness (TOML configurations and entrypoints) to the runs module.
Docstring Coverage ✅ Passed Docstring coverage is 93.10% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/run-analysis-eval-harness

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds an “analysis preset” harness alongside the existing experiment preset runner, enabling evaluation scripts under scripts/eval/ to be launched via TOML presets (loaded from analyses/*.toml) using the same Python runner backend as sampleworks-runs.

Changes:

  • Introduces analysis presets (analyses/*.toml) plus a dedicated CLI entrypoint (sampleworks-analysis) and ACTL wrapper (run_analysis) to run evaluation jobs.
  • Extends the preset schema/runner to support per-job script and configurable output_arg injection (including disabling output-flag injection for eval scripts), and improves GPU validation to ignore CPU-only jobs (gpus="none").
  • Updates docs, Docker image setup, and tests to cover analysis preset loading/listing and eval invocation construction.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/runs/test_runner.py Adds coverage for analysis preset invocation building, analysis preset listing, and GPU validation behavior with CPU-only jobs.
src/sampleworks/runs/schema.py Extends job schema with script and output_arg; expands valid pixi envs to include analysis-related envs.
src/sampleworks/runs/runner.py Supports custom job scripts + output-arg injection; adds list-arg formatting; improves GPU validation for gpus="none" jobs; resolves script paths against likely repo roots.
src/sampleworks/runs/loader.py Generalizes preset discovery/loading to support both experiments/ and analyses/ directories via parameters/env vars.
src/sampleworks/runs/cli.py Refactors CLI to be config-driven (CliConfig) so it can power both experiment and analysis entrypoints.
src/sampleworks/runs/analysis_cli.py New sampleworks-analysis entrypoint configured to list/load/run analysis presets from analyses/.
src/sampleworks/runs/__init__.py Updates package docstring to reflect multi-script orchestration and the new analysis CLI.
scripts/eval/EVALUATION.md Documents run_analysis usage and shows equivalent direct script invocations.
run_analysis.sh Adds backward-compatible .sh alias wrapper for run_analysis.
run_analysis Adds ACTL-native wrapper that sets environment defaults and launches sampleworks.runs.analysis_cli.
README.md Adds an ACTL section describing run_analysis and how to override settings with --set.
pyproject.toml Registers sampleworks-analysis console script and includes analyses/ in wheel packaging.
Dockerfile Installs the analysis pixi env and ships run_analysis wrappers into the image.
docker-entrypoint.sh Updates help/validation to recognize the analysis env and advertises run_analysis.
analyses/grid_search.toml Adds core analysis preset for RSCC/LDDT/bond-geometry eval jobs.
analyses/external_tools.toml Adds analysis preset for external-tool-based eval jobs (tortoize, phenix clashscore).
analyses/altloc_find.toml Adds analysis preset for building altloc selections CSV.
analyses/altloc_classify.toml Adds analysis preset for classifying altloc regions.
analyses/all.toml Adds “run everything” analysis preset, including optional external tools.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread run_analysis
Comment on lines +155 to +159
-*)
;;
*)
if [[ -z "$target" ]]; then
target="$arg"
Comment on lines +535 to +539
for root in _source_root_candidates():
candidate = root / expanded
if candidate.exists():
return str(candidate)
return str(expanded)

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Around line 269-270: The README example uses `rscc` as an analysis preset but
`rscc` is a job name; update the example to call a valid preset (e.g.
`grid_search`) and then set job-specific overrides — for example use
`run_analysis grid_search --set shared_args.target-filename=refined.cif` and
demonstrate `--set jobs.rscc.gpus=0` to show how to override the `rscc` job;
locate the example near the `run_analysis` invocation and replace the invalid
`rscc` preset with `grid_search` (references: run_analysis, rscc, grid_search,
shared_args.target-filename, jobs.rscc.gpus).

In `@scripts/eval/EVALUATION.md`:
- Line 71: The example command uses a non-existent preset "rscc" (run_analysis
rscc --set shared_args.target-filename=refined.cif) which will break copy-paste;
update the example to reference a real preset (for example use "grid_search
--jobs rscc" or another actual preset name) and show the adjusted command using
that preset while keeping the --set shared_args.target-filename=refined.cif flag
so readers can run it successfully.

In `@src/sampleworks/runs/schema.py`:
- Around line 96-99: The validation currently only rejects values starting with
"--"; update the check on self.output_arg (in the Job/schema initializer where
the block uses self.name and self.output_arg) to reject any leading dash by
testing startswith("-") instead of startswith("--"), and raise the same
ValueError (including the job name and offending output_arg) so values like
"-output-dir" fail fast rather than producing malformed argv entries.

In `@tests/runs/test_runner.py`:
- Around line 270-347: The three tests
(test_analysis_preset_builds_eval_script_invocations,
test_analysis_cli_lists_analysis_presets,
test_gpu_validation_ignores_cpu_jobs_but_checks_gpu_duplicates) use short inline
docstrings; change each to a NumPy-style docstring: add a one-line summary, a
blank line, and a "Parameters" section (none for these tests -> list nothing or
omit) and a "Returns" section stating "None" (or omit if preferred by project
style), plus any short notes under "Notes" if needed; ensure the docstring is a
triple-quoted block directly under the def and follows NumPy formatting
conventions so linter/style checks pass.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4a080a16-4b27-40a3-9fdf-6a4e54a0e3d4

📥 Commits

Reviewing files that changed from the base of the PR and between 6595672 and 82fe167.

📒 Files selected for processing (19)
  • Dockerfile
  • README.md
  • analyses/all.toml
  • analyses/altloc_classify.toml
  • analyses/altloc_find.toml
  • analyses/external_tools.toml
  • analyses/grid_search.toml
  • docker-entrypoint.sh
  • pyproject.toml
  • run_analysis
  • run_analysis.sh
  • scripts/eval/EVALUATION.md
  • src/sampleworks/runs/__init__.py
  • src/sampleworks/runs/analysis_cli.py
  • src/sampleworks/runs/cli.py
  • src/sampleworks/runs/loader.py
  • src/sampleworks/runs/runner.py
  • src/sampleworks/runs/schema.py
  • tests/runs/test_runner.py

Comment thread README.md Outdated
Comment thread scripts/eval/EVALUATION.md Outdated
Comment thread src/sampleworks/runs/schema.py Outdated
Comment thread tests/runs/test_runner.py
@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Actionable comments posted: 0

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sampleworks/runs/cli.py (1)

248-257: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Allow --jobs with a positional preset target.

Right now sampleworks-runs rf3 --jobs foo fails with “pass jobs either as the positional target or with --jobs” because the if jobs: error runs before checking whether target is a real preset name. That contradicts the parser help, which says --jobs filters the selected preset.

Suggested fix
-    if jobs:
-        parser.error("pass jobs either as the positional target or with --jobs, not both")
-
     if target.endswith(".toml") or "/" in target:
         parser.error("pass custom preset paths with --preset path/to/preset.toml")
 
-    if "," not in target and target in loader.list_presets(
+    if "," not in target and target in loader.list_presets(
         preset_dir_name=config.preset_dir_name,
         preset_dir_env_var=config.preset_dir_env_var,
     ):
-        return target, ""
+        return target, jobs
+
+    if jobs:
+        parser.error("pass jobs either as the positional target or with --jobs, not both")
 
     return config.default_preset, target
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/sampleworks/runs/cli.py` around lines 248 - 257, The code currently
rejects any use of --jobs before checking whether the positional target is a
named preset; change the logic so the jobs conflict is only enforced when the
positional target is not a known preset: call loader.list_presets(...) (using
the same preset_dir_name and preset_dir_env_var from config) to determine if
target is a preset, and only run parser.error("pass jobs either as the
positional target or with --jobs, not both") when jobs is set AND target is not
in that preset list (or when target is a comma-separated job list); i.e., move
or gate the existing if jobs: check behind the loader.list_presets(...) check so
--jobs is allowed when target is a preset.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Dockerfile`:
- Around line 108-115: The RUN shell block that conditionally pre-compiles CUDA
extensions is missing the closing "fi" and currently suppresses real compilation
failures with "|| echo ...", so fix the conditional by adding the matching "fi"
after the else branch and remove the "|| echo ..." that masks errors;
specifically locate the RUN that checks "[ ! -e /dev/nvidiactl ] && [ ! -e
/proc/driver/nvidia/version ]" and the pixi invocation ("pixi run -e boltz
python -c ... dilate_atom_centric") and ensure the else branch runs the pixi
command directly (or with "set -e") so compilation errors propagate and will
fail the build as intended.

---

Outside diff comments:
In `@src/sampleworks/runs/cli.py`:
- Around line 248-257: The code currently rejects any use of --jobs before
checking whether the positional target is a named preset; change the logic so
the jobs conflict is only enforced when the positional target is not a known
preset: call loader.list_presets(...) (using the same preset_dir_name and
preset_dir_env_var from config) to determine if target is a preset, and only run
parser.error("pass jobs either as the positional target or with --jobs, not
both") when jobs is set AND target is not in that preset list (or when target is
a comma-separated job list); i.e., move or gate the existing if jobs: check
behind the loader.list_presets(...) check so --jobs is allowed when target is a
preset.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6d6cf56d-ab0d-49d0-9092-01f74e0fa8f3

📥 Commits

Reviewing files that changed from the base of the PR and between 144c978 and e1f4a92.

📒 Files selected for processing (4)
  • Dockerfile
  • README.md
  • docker-entrypoint.sh
  • src/sampleworks/runs/cli.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • docker-entrypoint.sh
  • README.md

Comment thread Dockerfile Outdated

@marcuscollins marcuscollins left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave it to you to decide what to do with my comments. I don't see anything blocking.

Comment thread analyses/altloc_classify.toml Outdated
description = "Classify altloc selections into side-chain, loop, and domain-shift categories."

[defaults]
GRID_SEARCH_RESULTS_DIR = "/data/results/grid_search_results"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this script shouldn't need the results of the grid search. We use it to come up with the parts of protein structures to analyze in more detail.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the script run without it?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I removed the grid-search-specific defaults from the altloc presets in fc9e822: they now use ALTLOC_ANALYSIS_DIR and ALTLOC_INPUTS_DIR. classify_altloc_regions.py only needs the selections CSV, a CIF root, output path, and thresholds, so yes, it runs without grid-search results as long as those inputs are supplied.

DOMAIN_SHIFT_MIN_SPAN = "50"
LOOP_LDDT_THRESHOLD = "0.75"

[shared_args]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this name, "shared_args" a little confusing. These are just the arguments to the script, right?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes — in this schema shared_args means common CLI arguments merged into every job's script invocation, with per-job args overriding them. I left the name in place for this PR because it is already shared by experiment and analysis presets, but I agree script_args/common_args would be clearer if we want to add a compatibility alias later.

Comment thread analyses/altloc_find.toml Outdated
description = "Build an analysis protein-config CSV by finding altloc selections in input CIFs."

[defaults]
GRID_SEARCH_RESULTS_DIR = "/data/results/grid_search_results"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one also shouldn't need this variable.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, fixed in fc9e822. altloc_find now uses ALTLOC_ANALYSIS_DIR and ALTLOC_INPUTS_DIR instead of depending on GRID_SEARCH_RESULTS_DIR/GRID_SEARCH_INPUTS_DIR.

Comment thread analyses/grid_search.toml Outdated
@@ -0,0 +1,53 @@
description = "Core grid-search evaluations: RSCC, LDDT clustering, and bond geometry."

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this preset--it makes me think of running the grid search generation code. How about "analyze_grid_search.toml"?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in fc9e822: renamed the preset to analyses/analyze_grid_search.toml. I kept grid_search as an accepted CLI alias for compatibility, but the listed/canonical preset name is now analyze_grid_search.

Comment thread src/sampleworks/runs/cli.py Outdated

from . import loader, runner
from .schema import Preset
from .schema import Job, Preset

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I favor absolute imports.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done for the new runner CLI imports in fc9e822 — switched them to absolute sampleworks.runs... imports.

unique: list[Path] = []
for candidate in candidates:
resolved = candidate.expanduser().resolve(strict=False)
if resolved not in seen:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the candidates not all unique?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily. The raw candidate list can hit the same directory via env overrides, /home/dev/workspace, cwd parents, or the installed module path, and symlinks can converge after resolution. The de-dupe after resolve(strict=False) preserves precedence while avoiding duplicate preset discovery.

Comment thread src/sampleworks/runs/runner.py Outdated
candidates: list[Path] = []
for env_var in ("SAMPLEWORKS_SOURCE_DIR", "SAMPLEWORKS_SCRIPT_ROOT"):
override = os.environ.get(env_var)
if override:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a neat new-ish piece of Python that shortens these kinds of statements: https://peps.python.org/pep-0572/. You can now use:

if override := os.environ.get(env_var):
    candidates.append(Path(override))

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, you can even use:

candidates = [Path(override) for env_var in (...) if (override := os.environ.get(env_var) is not None)]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call — applied the simple walrus form in fc9e822. I kept the explicit loop rather than a compact list comprehension because it reads cleaner here and avoids the precedence trap around os.environ.get(env_var) is not None.

return str(expanded)


def _source_root_candidates() -> list[Path]:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels odd that we need this, like it is a sign of a deeper structural problem that we don't know where the scripts are and need to look for them. Thoughts? Is there a better way to do this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's a bit of a smell. The reason is ACTL: source code is synced to /home/dev/workspace, while the baked image/package can live under /app, and we want script edits to take effect without rebuilding the image. The current search prefers explicit env vars and the synced checkout over baked paths. Longer term, I'd like the wrapper to always set SAMPLEWORKS_SOURCE_DIR and have the runner require that for relative scripts, with package-resource fallback only for packaged/local use.

Comment thread tests/runs/test_runner.py
assert analysis_cli.main(["--list"]) == 0
listed = set(capsys.readouterr().out.splitlines())
assert {
"all",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this will have to be updated if we add additional .toml files/presets?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For adding presets, no: this is an issubset check, so extra .toml files won't break it. It only needs an update when we intentionally rename/remove one of the core bundled presets, which happened here with grid_search -> analyze_grid_search and is reflected in fc9e822.

Comment thread Dockerfile
# This image carries pixi environments and checkpoints. Runtime source should
# come from ACTL's synced checkout at /home/dev/workspace, not from stale code
# baked into /app during image construction.
RUN rm -rf /app/src /app/scripts /app/experiments /app/analyses \

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to rebuilt the current images for us to be able to update code on the fly in actl?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For code-only changes, no rebuild should be needed: ACTL syncs the checkout to /home/dev/workspace, and the runner resolves scripts from that live checkout first. Rebuilds are still needed for dependency/pixi-lock changes or wrapper/image changes baked into /usr/local/bin; RUNTIME_PIXI=1 is the escape hatch when intentionally testing a dependency change inside an existing pod.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sampleworks/runs/runner.py (1)

668-669: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: skip_pre_jobs logic is bypassed by unconditional rebuild.

Line 668 unconditionally rebuilds pre_invocations, overwriting the skip_pre_jobs conditional on line 651-653. When skip_pre_jobs=True, line 652 correctly sets pre_invocations = [], but line 668 then repopulates it by calling build_pre_invocations without checking the flag. This breaks the --skip-pre-jobs feature documented in README.md.

🐛 Proposed fix to respect skip_pre_jobs during rebuild
     for pixi_env in pixi_envs:
         _prepare_pixi_env(pixi_env)
-    pre_invocations = build_pre_invocations(preset, results_dir=results_dir)
+    pre_invocations = (
+        [] if skip_pre_jobs else build_pre_invocations(preset, results_dir=results_dir)
+    )
     invocations = build_invocations(preset, results_dir=results_dir)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/sampleworks/runs/runner.py` around lines 668 - 669, The code
unconditionally rebuilds pre_invocations by calling build_pre_invocations after
the skip_pre_jobs branch, which overrides the earlier pre_invocations = [] when
skip_pre_jobs is True; change the logic so build_pre_invocations(preset,
results_dir=results_dir) is only called when skip_pre_jobs is False (i.e.,
respect the skip_pre_jobs flag), leaving pre_invocations as the empty list
otherwise; update the section around the pre_invocations assignment that
currently calls build_pre_invocations and ensure build_invocations(preset,
results_dir=results_dir) remains unaffected.
🧹 Nitpick comments (2)
tests/runs/test_cli.py (1)

136-151: ⚡ Quick win

Test only verifies flag propagation, not actual pre-job skipping behavior.

This test mocks runner.run to check that skip_pre_jobs=True is passed, but doesn't verify that pre-jobs are actually skipped when the runner executes. Given the bug in runner.py line 668 (unconditional pre_invocations rebuild), this test passes while the feature is broken.

As per coding guidelines, "Avoid using mocks; test expected behavior directly instead." Consider enhancing this to be a behavioral test: build a minimal preset with pre_jobs, run with --skip-pre-jobs, and assert that the pre-job output/log is absent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/runs/test_cli.py` around lines 136 - 151, Update the test
test_skip_pre_jobs_flag_is_passed_to_runner to be behavioral instead of mocking
runner.run: create a minimal preset containing a real pre_jobs (or
pre_invocations) task that produces identifiable output/logging, invoke
cli.main(["--preset", "<minimal_preset>", "--skip-pre-jobs"]) and assert the
process exit code is 0 and that the pre-job output/log entries are absent; do
not monkeypatch runner.run—call the real runner so the test validates that the
skip_pre_jobs flag actually prevents executing pre_jobs (references:
test_skip_pre_jobs_flag_is_passed_to_runner, cli.main, runner.run, and the
preset's pre_jobs/pre_invocations behavior).
README.md (1)

254-254: 💤 Low value

Minor: Mixed spelling variants of "analyze"/"analyse".

The preset name uses American spelling (analyze_grid_search) while the surrounding prose uses British spelling ("analyses"). Consider using consistent American spelling throughout: "Running preset analyses" → "Running preset analyzes" or use "analysis" as the noun form consistently.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 254, The README has mixed spelling: the preset name
analyze_grid_search uses American spelling while the surrounding prose uses
British spelling "analyses"; update the surrounding text to use consistent
American spelling (e.g., change "analyses" or the phrase "Running preset
analyses" to "analyzes" or better yet replace with the noun "analysis") so the
preset identifier analyze_grid_search, other presets like all and
external_tools, and the prose use a consistent spelling; search for occurrences
of "analyses"/"analyse" near references to analyze_grid_search and adjust them
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/sampleworks/runs/runner.py`:
- Around line 668-669: The code unconditionally rebuilds pre_invocations by
calling build_pre_invocations after the skip_pre_jobs branch, which overrides
the earlier pre_invocations = [] when skip_pre_jobs is True; change the logic so
build_pre_invocations(preset, results_dir=results_dir) is only called when
skip_pre_jobs is False (i.e., respect the skip_pre_jobs flag), leaving
pre_invocations as the empty list otherwise; update the section around the
pre_invocations assignment that currently calls build_pre_invocations and ensure
build_invocations(preset, results_dir=results_dir) remains unaffected.

---

Nitpick comments:
In `@README.md`:
- Line 254: The README has mixed spelling: the preset name analyze_grid_search
uses American spelling while the surrounding prose uses British spelling
"analyses"; update the surrounding text to use consistent American spelling
(e.g., change "analyses" or the phrase "Running preset analyses" to "analyzes"
or better yet replace with the noun "analysis") so the preset identifier
analyze_grid_search, other presets like all and external_tools, and the prose
use a consistent spelling; search for occurrences of "analyses"/"analyse" near
references to analyze_grid_search and adjust them accordingly.

In `@tests/runs/test_cli.py`:
- Around line 136-151: Update the test
test_skip_pre_jobs_flag_is_passed_to_runner to be behavioral instead of mocking
runner.run: create a minimal preset containing a real pre_jobs (or
pre_invocations) task that produces identifiable output/logging, invoke
cli.main(["--preset", "<minimal_preset>", "--skip-pre-jobs"]) and assert the
process exit code is 0 and that the pre-job output/log entries are absent; do
not monkeypatch runner.run—call the real runner so the test validates that the
skip_pre_jobs flag actually prevents executing pre_jobs (references:
test_skip_pre_jobs_flag_is_passed_to_runner, cli.main, runner.run, and the
preset's pre_jobs/pre_invocations behavior).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: da306317-a6ef-4cbc-80b0-76ab76af3988

📥 Commits

Reviewing files that changed from the base of the PR and between e1f4a92 and fc9e822.

📒 Files selected for processing (16)
  • Dockerfile
  • README.md
  • analyses/all.toml
  • analyses/altloc_classify.toml
  • analyses/altloc_find.toml
  • analyses/analyze_grid_search.toml
  • analyses/external_tools.toml
  • run_analysis
  • scripts/eval/EVALUATION.md
  • src/sampleworks/eval/grid_search_eval_utils.py
  • src/sampleworks/runs/analysis_cli.py
  • src/sampleworks/runs/cli.py
  • src/sampleworks/runs/runner.py
  • src/sampleworks/runs/schema.py
  • tests/runs/test_cli.py
  • tests/runs/test_runner.py
✅ Files skipped from review due to trivial changes (1)
  • scripts/eval/EVALUATION.md
🚧 Files skipped from review as they are similar to previous changes (8)
  • src/sampleworks/runs/analysis_cli.py
  • analyses/altloc_classify.toml
  • src/sampleworks/runs/schema.py
  • Dockerfile
  • analyses/external_tools.toml
  • tests/runs/test_runner.py
  • src/sampleworks/runs/cli.py
  • run_analysis

@xraymemory xraymemory merged commit ae01dde into main Jun 1, 2026
8 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants