feat: append JobResult timing and status to job_metadata.json by Abdelsalam-Abbas · Pull Request #230 · diff-use/sampleworks

Abdelsalam-Abbas · 2026-05-05T20:50:40Z

Summary

Closes #212 ("Output metadata should be updated to include start and end time from JobResult"). job_metadata.json now includes started_at, finished_at, runtime_seconds, status, and exit_code from JobResult in addition to the existing GuidanceConfig snapshot.
Adds JobResult.as_dict() mirroring GuidanceConfig.as_dict() so output_dir and log_path get container-to-host remapping. Without this, the merged file would regress to container paths in Docker runs.
New helper _write_job_metadata is called both inside save_everything (config-only snapshot, in case the run crashes later) and at the end of run_guidance (enriched with JobResult, on both success and failure paths).
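
The two-phase write described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `write_job_metadata` here is a hypothetical stand-in that takes plain dicts, whereas the real `_write_job_metadata` receives the `GuidanceConfig` and `JobResult` objects and calls their `as_dict()` methods. The config keys and timestamp values are made up for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def write_job_metadata(
    output_dir: Path, config: dict, job_result: Optional[dict] = None
) -> Path:
    """Write the config dict, optionally overlaid with job-result fields.

    Hypothetical stand-in for the PR's helper; it works on plain dicts.
    """
    merged = dict(config)
    if job_result is not None:
        merged.update(job_result)  # job-result keys win on collision
    output_dir.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    path = output_dir / "job_metadata.json"
    path.write_text(json.dumps(merged, indent=2))
    return path


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "run_001"  # does not exist yet
    config = {"output_dir": "/host/results/run_001", "model": "hypothetical-model"}

    # Phase 1: config-only snapshot, written early in case the run crashes later.
    path = write_job_metadata(out, config)
    snapshot = json.loads(path.read_text())

    # Phase 2: same file, rewritten with timing and status once the run ends.
    job_result = {
        "started_at": "2026-05-05T20:00:00Z",  # illustrative values only
        "finished_at": "2026-05-05T20:00:42Z",
        "runtime_seconds": 42.0,
        "status": "success",
        "exit_code": 0,
    }
    write_job_metadata(out, config, job_result)
    enriched = json.loads(path.read_text())
```

Because the second call rewrites the whole file from the config plus the result, a crash between the two phases still leaves a valid config-only job_metadata.json on disk.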

Test plan

pixi run -e boltz-dev cpu-tests tests/utils/test_guidance_script_utils.py — 5 tests pass, including the new test_write_job_metadata_remaps_job_result_paths_to_host regression test that sets SAMPLEWORKS_HOST_RESULTS_DIR and asserts JobResult paths come out host-remapped.
Manual: run a guidance job and confirm job_metadata.json now contains both the GuidanceConfig fields and the JobResult timing/status fields.
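
The manual check could be scripted along these lines. This is only a sketch: the field names come from the summary above, while `missing_job_result_fields` is a hypothetical helper that is not part of the repository, and the sample metadata is invented.

```python
import json
from typing import List

# Field names taken from the PR summary.
JOB_RESULT_FIELDS = (
    "started_at",
    "finished_at",
    "runtime_seconds",
    "status",
    "exit_code",
)


def missing_job_result_fields(metadata_text: str) -> List[str]:
    """Return the JobResult field names absent from a job_metadata.json payload."""
    metadata = json.loads(metadata_text)
    return [name for name in JOB_RESULT_FIELDS if name not in metadata]


# A config-only snapshot lacks every JobResult field ...
config_only = json.dumps({"output_dir": "/host/results/run_001"})
missing_before = missing_job_result_fields(config_only)

# ... while an enriched file should report nothing missing.
enriched = json.dumps(
    {
        "output_dir": "/host/results/run_001",
        "started_at": "2026-05-05T20:00:00Z",  # illustrative values only
        "finished_at": "2026-05-05T20:00:42Z",
        "runtime_seconds": 42.0,
        "status": "success",
        "exit_code": 0,
    }
)
missing_after = missing_job_result_fields(enriched)
```

In practice the text would come from reading job_metadata.json in the job's output directory rather than from an inline string.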

Summary by CodeRabbit

New Features
- Job results now serialize to dictionary format with container-to-host path remapping.
Refactor
- Streamlined job metadata writing process to centralize configuration handling.
Tests
- Added comprehensive test coverage for metadata operations and path handling in various scenarios.

Closes #212. The metadata file now carries started_at, finished_at, runtime_seconds, status, and exit_code in addition to the GuidanceConfig snapshot, by merging asdict(job_result) on top after run_guidance finishes. save_everything still writes the GuidanceConfig-only snapshot as before, so a crash inside _run_guidance does not lose context.

JobResult carries output_dir and log_path that, on Docker runs, are container-internal paths. Without remapping, merging the JobResult dict on top of GuidanceConfig.as_dict() in job_metadata.json overwrites the already-remapped host paths with container paths, regressing the host-paths invariant. Add JobResult.as_dict() (mirroring GuidanceConfig.as_dict) so both halves of the merge produce host-remapped paths, and switch the helper to use it instead of dataclasses.asdict. Adds a regression test.
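
The regression described here can be reproduced in miniature. The sketch below makes several assumptions: `JobResultSketch` and `remap_container_path` are simplified stand-ins for `JobResult` and `_remap_container_path`, the container root `/app/results` is invented, and the prefix-swap remapping rule is a guess at the general idea rather than the repository's actual logic. Only the `SAMPLEWORKS_HOST_RESULTS_DIR` variable name and the `output_dir`/`log_path` fields come from the PR.

```python
import os
from dataclasses import asdict, dataclass

CONTAINER_ROOT = "/app/results"  # hypothetical container-side results directory


def remap_container_path(path: str) -> str:
    """Swap the container root for the host root when the env var is set (assumed rule)."""
    host_root = os.environ.get("SAMPLEWORKS_HOST_RESULTS_DIR")
    inside = path == CONTAINER_ROOT or path.startswith(CONTAINER_ROOT + "/")
    if host_root and inside:
        return host_root.rstrip("/") + path[len(CONTAINER_ROOT):]
    return path


@dataclass
class JobResultSketch:
    """Simplified stand-in for JobResult with only the fields needed here."""

    status: str
    exit_code: int
    output_dir: str
    log_path: str

    def as_dict(self) -> dict:
        data = asdict(self)
        for key in ("output_dir", "log_path"):  # the two path-bearing fields
            data[key] = remap_container_path(data[key])
        return data


os.environ["SAMPLEWORKS_HOST_RESULTS_DIR"] = "/data/host_results"
result = JobResultSketch(
    "success", 0, "/app/results/run_001", "/app/results/run_001/run.log"
)
# The config half of the merge is already host-remapped.
config = {"output_dir": remap_container_path("/app/results/run_001")}

# Merging the raw dataclass dict overwrites the host path with the container path.
regressed = {**config, **asdict(result)}
# Merging the remapped dict keeps host paths on both halves.
fixed = {**config, **result.as_dict()}
```

The point of the sketch is the key collision on `output_dir`: whichever dict is merged last wins, so both sides of the merge need to produce host paths.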

AGENTS.md mandates NumPy-style docstrings. JobResult.as_dict and _write_job_metadata now have Parameters/Returns sections matching the convention used elsewhere in this file (e.g., save_everything).

coderabbitai · 2026-05-05T20:50:57Z

📝 Walkthrough

Walkthrough

This PR adds metadata merging capability to the job execution pipeline. A new as_dict() method on JobResult enables its fields to be converted to dictionary form with path remapping. A centralized _write_job_metadata() helper merges GuidanceConfig and JobResult metadata, which run_guidance() now invokes to persist combined metadata. Tests validate merging, directory creation, and host-path remapping.

Changes

Metadata Merging and Persistence

| Layer / File(s) | Summary |
| --- | --- |
| Data Shape `src/sampleworks/utils/guidance_script_arguments.py` | `JobResult` gains `as_dict()` method to serialize its fields with container-to-host path remapping via `_remap_container_path`. |
| Core Implementation `src/sampleworks/utils/guidance_script_utils.py` | New `_write_job_metadata()` function merges `GuidanceConfig` and optional `JobResult` dictionaries, creates output directory if needed, and writes combined metadata to `job_metadata.json`. |
| Integration `src/sampleworks/utils/guidance_script_utils.py` | `run_guidance()` now constructs `JobResult` on both success and failure paths and calls `_write_job_metadata()` before returning to persist runtime metadata. |
| Tests `tests/utils/test_guidance_script_utils.py` | Four tests verify merging behavior without and with `JobResult`, directory creation, and host-path remapping. Helper `_make_job_result()` populates test fixtures. |

Sequence Diagram

sequenceDiagram
    participant RG as run_guidance()
    participant JR as JobResult
    participant GC as GuidanceConfig
    participant WM as _write_job_metadata()
    participant FS as File System
    
    RG->>JR: Construct JobResult (success/failure)
    JR->>JR: Capture timing & status
    RG->>WM: Call _write_job_metadata(output_dir, args, job_result)
    WM->>GC: Call args.as_dict()
    GC-->>WM: Return config metadata dict
    WM->>JR: Call job_result.as_dict()
    JR-->>WM: Return result dict (with remapped paths)
    WM->>WM: Merge both dicts
    WM->>FS: Create output_dir if missing
    WM->>FS: Write merged metadata to job_metadata.json
    WM-->>RG: Complete
    RG-->>RG: Return JobResult

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 Hop, skip, and a metadata merge!
JobResult joins GuidanceConfig in a synchronized surge,
Timestamps and paths dance in containers and hosts,
One file writes true what the guidance run boasts.
Start time, end time—the whole story's told! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: append JobResult timing and status to job_metadata.json' accurately describes the main change: adding JobResult fields (timing and status) to the metadata file. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue `#212`: it appends JobResult timing data (started_at, finished_at, runtime_seconds) and status information to job_metadata.json while preserving existing GuidanceConfig metadata. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the linked objective: adding JobResult fields to job_metadata.json. The as_dict() method, helper function, and tests all serve this purpose with no extraneous modifications. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

🧹 Nitpick comments (1)

tests/utils/test_guidance_script_utils.py (1)
9-10: ⚡ Quick win

Prefer validating metadata through a public entry point, not _write_job_metadata directly.

These tests are behavior-focused, but they’re still coupled to a private helper. Consider keeping one helper-level check at most and shifting the main assertions to a public path (e.g., save_everything/run_guidance output behavior) to reduce brittleness during refactors.

As per coding guidelines: "**/test_*.py: Test public interfaces with expected behaviors."

Also applies to: 64-183
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/utils/test_guidance_script_utils.py` around lines 9 - 10, Tests
currently call the private helper _write_job_metadata directly; change them to
validate behavior via the public API (e.g., call save_everything or the
run_guidance equivalent) and assert on observable outputs instead of internal
state. Keep at most one helper-level assertion if necessary (convert any
remaining _write_job_metadata direct checks into a single, minimal unit test
that documents helper behavior), but move all main assertions that inspect
metadata files, return values, or side-effects to tests that invoke
save_everything/run_guidance and verify the public outputs/side-effects.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3c7586d9-ec6e-42f6-9301-6900d1454f3c

📥 Commits

Reviewing files that changed from the base of the PR and between fbf6d38 and 2940ae1.

📒 Files selected for processing (3)

src/sampleworks/utils/guidance_script_arguments.py
src/sampleworks/utils/guidance_script_utils.py
tests/utils/test_guidance_script_utils.py

Abdelsalam-Abbas added 3 commits May 5, 2026 22:17

docs: convert new docstrings to NumPy style per AGENTS.md

2940ae1

Abdelsalam-Abbas requested a deployment to gpu-testing May 5, 2026 20:50 — with GitHub Actions Waiting

coderabbitai Bot reviewed May 5, 2026

Abdelsalam-Abbas requested review from k-chrispens and marcuscollins May 5, 2026 21:05

Abdelsalam-Abbas wants to merge 3 commits into main from issue-212-jobresult-metadata
