Skip to content

fix(sdk): recognize BenchStatus phase=Skipped as terminal (closes #480)#481

Merged
epappas merged 1 commit into
mainfrom
fix/480-bench-skipped-terminal-phase
May 18, 2026
Merged

fix(sdk): recognize BenchStatus phase=Skipped as terminal (closes #480)#481
epappas merged 1 commit into
mainfrom
fix/480-bench-skipped-terminal-phase

Conversation

@epappas
Copy link
Copy Markdown
Contributor

@epappas epappas commented May 18, 2026

Summary

Closes #480.

crates/basilica-sdk-python/python/basilica/distributed.py defined BENCH_PHASE_SKIPPED = "Skipped" (line 210) but omitted it from _BENCH_TERMINAL_PHASES (lines 217-219). As a result BenchStatus(phase="Skipped").is_terminal returned False, and wait_until_bench_complete polled until the user-supplied timeout, then raised TimeoutError whose message contained (phase=Skipped) — proving the SDK already had the terminal data but did not act on it.

The basilica-backend operator (X2 fix, one-covenant/basilica-backend#650 / #653) emits a terminal BenchStatus{phase=Skipped, lastAttemptOutcome="skipped", lastAttemptAt=...} plus a BenchSkipped Event when workers exit before the bench-controller observes them. The SDK must recognise this as terminal.

This is a multi-file Skipped-as-terminal alignment, not a one-line frozenset edit.

Surface changes

python/basilica/distributed.py

  • _BENCH_TERMINAL_PHASES adds BENCH_PHASE_SKIPPED; the existing BenchStatus.is_terminal property picks the change up by construction.
  • BenchStatus gains three explicit semantic properties:
    • is_successfulTrue iff phase == Succeeded.
    • is_failedTrue iff phase in {Failed, TimedOut}. Skipped is NOT a failure.
    • is_skippedTrue iff phase == Skipped (terminal-but-not-run).
      These pin the semantic that Skipped is terminal but neither success nor failure: the probe was not run; the workload itself may have completed cleanly.
  • wait_until_bench_complete and wait_until_bench_complete_async docstrings now enumerate all four terminal phases (Succeeded, Failed, TimedOut, Skipped) and reference the example consumers.

examples/

examples/20_distributed_diloco.py and examples/22_distributed_with_bench.py already branch on phase != "Succeeded", so the natural Skipped path is the existing phase-aware print(...) + clean return. No example refactor required; the existing else-branch now handles the operator's Skipped phase end-to-end once the waiter returns instead of raising.

tests/test_bench_status_skipped.py (new)

18 new tests covering:

  • Frozenset membership: Skipped in _BENCH_TERMINAL_PHASES; Pending / Running excluded; full four-phase composition.
  • BENCH_PHASE_SKIPPED importable from the top-level basilica package and consistent with the per-module constant.
  • BenchStatus(phase=...) property matrix across all six phases (Succeeded, Failed, TimedOut, Skipped, Pending, Running) — pins is_terminal / is_successful / is_failed / is_skipped.
  • BenchStatus.from_status_dict round-trip of the operator's Skipped JSON shape (mode, phase, message, lastAttemptAt, lastAttemptOutcome).
  • wait_until_bench_complete returns the Skipped BenchStatus without raising.
  • wait_until_bench_complete_async honours the same contract.
  • Regression guard: Succeeded / Failed / TimedOut still terminate cleanly via the waiter.

Version bump

  • CHANGELOG.md: [0.29.4] - 2026-05-18 entry under Fixed + Added.
  • pyproject.toml: 0.29.3 -> 0.29.4.
  • Cargo.toml: 0.29.3 -> 0.29.4.
  • Cargo.lock: regenerated.

Test plan

  • python -m pytest crates/basilica-sdk-python/tests/ -v --ignore=tests/test_dns_propagation_e2e.py -- 157 passed (18 new + 139 pre-existing). The test_dns_propagation_e2e.py collection error is pre-existing (httpx not in venv) and unrelated.
  • Manual smoke: BenchStatus(phase="Skipped").is_terminal == True; is_skipped == True; is_successful == False; is_failed == False.
  • Manual smoke: wait_until_bench_complete against a fake client emitting phase=Skipped returns the BenchStatus rather than raising.
  • CI: test-python-sdk matrix (3.10 / 3.11 / 3.12 / 3.13).
  • CI: version-consistency guard (Cargo.toml == pyproject.toml).
  • Runtime verification: re-run examples/20_distributed_diloco.py against api.basilica.ai once this lands; expect exit 0 with phase-aware Skipped message instead of the prior TimeoutError.

Cross-ref

one-covenant/basilica-backend#419 Stage 4 take-5 Cell B captured the runtime artefact that surfaced this. Operator-side X2 fix is one-covenant/basilica-backend#650 / #653.

Cross-ref: one-covenant/basilica-backend#419

Summary by CodeRabbit

  • Bug Fixes

    • Fixed BenchStatus to treat Skipped phase as terminal, preventing timeouts in wait operations.
  • New Features

    • Added is_successful, is_failed, and is_skipped boolean properties to BenchStatus for improved phase checking.
  • Tests

    • Added comprehensive test coverage for Skipped bench status handling.
  • Chores

    • Bumped version to 0.29.4.

Review Change Stack

The SDK's `_BENCH_TERMINAL_PHASES` frozenset omitted
`BENCH_PHASE_SKIPPED` even though the constant was defined. As a
result `BenchStatus(phase="Skipped").is_terminal` returned False,
and `wait_until_bench_complete` polled until the user-supplied
timeout and then raised `TimeoutError` whose message contained
`(phase=Skipped)` -- proving the SDK had the terminal data but
did not act on it.

The operator (basilica-backend X2 fix, basilica-backend#650 /
basilica-backend#653) emits a terminal `BenchStatus{phase=Skipped,
lastAttemptOutcome="skipped", lastAttemptAt=...}` plus a
`BenchSkipped` Event when workers exit before the bench-controller
observes them. The SDK must recognise this as terminal.

This is a multi-file Skipped-as-terminal alignment, not a one-line
frozenset edit:

- `_BENCH_TERMINAL_PHASES` adds `BENCH_PHASE_SKIPPED`; existing
  `BenchStatus.is_terminal` picks the change up by construction.
- `BenchStatus` gains `is_successful` / `is_failed` / `is_skipped`
  properties. These pin the semantic that Skipped is terminal but
  neither success nor failure -- the probe was not run; the workload
  itself may have completed cleanly.
- `wait_until_bench_complete` and `wait_until_bench_complete_async`
  docstrings now enumerate all four terminal phases (Succeeded,
  Failed, TimedOut, Skipped) and reference the example consumers.
- `examples/20_distributed_diloco.py` and
  `examples/22_distributed_with_bench.py` already branch on
  `phase != "Succeeded"`, so the natural Skipped path is the
  phase-aware print + clean exit. No example refactor required.
- 18 new tests in `tests/test_bench_status_skipped.py` cover the
  frozenset membership, BenchStatus property matrix across all
  six phases, `from_status_dict` round-trip of the operator's
  Skipped shape, sync + async waiter return, and the top-level
  `BENCH_PHASE_SKIPPED` export check.
- CHANGELOG.md, pyproject.toml, Cargo.toml, Cargo.lock bumped to
  0.29.4.

Cross-repo reference: `one-covenant/basilica-backend#419` Stage 4
take-5 Cell B and `one-covenant/basilica-backend#650 / #653`.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

Walkthrough

This PR extends BenchStatus terminal-phase handling to treat phase="Skipped" as terminal, preventing wait_until_bench_complete from timing out when the operator reports a skipped benchmark. The change updates the internal terminal-phase set, clarifies boolean properties, documents waiter behavior, adds comprehensive test coverage, and bumps the version to 0.29.4.

Changes

BenchStatus Skipped phase handling

Layer / File(s) Summary
Terminal phase set and BenchStatus property updates
crates/basilica-sdk-python/python/basilica/distributed.py
BENCH_PHASE_SKIPPED is added to _BENCH_TERMINAL_PHASES; BenchStatus docstring and property docstrings (is_terminal, is_successful, is_failed, is_skipped) are updated to define Skipped as terminal but neither success nor failure; wait_until_bench_complete and wait_until_bench_complete_async docstrings are updated to list Skipped as a terminal phase that will return immediately.
Test suite for Skipped phase handling
crates/basilica-sdk-python/tests/test_bench_status_skipped.py
Test helpers construct BenchStatus instances and fake operator status dicts; tests assert _BENCH_TERMINAL_PHASES contains BENCH_PHASE_SKIPPED, verify export from the top-level basilica package, check property semantics for Skipped (is_terminal=True, is_skipped=True, is_successful=False, is_failed=False), regression-test other phases, verify from_status_dict round-trip for the operator's phase="Skipped" shape, and confirm wait_until_bench_complete(_async) returns a Skipped BenchStatus without raising TimeoutError.
Version and release metadata
crates/basilica-sdk-python/Cargo.toml, crates/basilica-sdk-python/pyproject.toml, crates/basilica-sdk-python/CHANGELOG.md
Crate version bumped to 0.29.4 in both manifest files; changelog entry documents the fix for terminal Skipped handling and the new BenchStatus boolean convenience properties.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

  • one-covenant/basilica#465: Both PRs modify terminal-phase logic and wait_until_bench_complete(_async) behavior in the same distributed.py module; this PR extends the terminal set to explicitly include Skipped.
  • one-covenant/basilica#468: This PR ensures the DistributedBenchStatus.phase field round-trips correctly from Rust to Python so that the Python logic (introduced in this PR) can correctly observe terminal phases like Skipped.

Poem

🐰 A skipped bench need not wait or fret,
Terminal it is, no timeout yet!
Properties clear: is_skipped rings true,
Neither success nor failure to pursue—
One small skip, one great release! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: treating BenchStatus phase=Skipped as terminal, which directly addresses the root issue fixed across multiple files (distributed.py, tests, versioning, and changelog).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/480-bench-skipped-terminal-phase

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/basilica-sdk-python/CHANGELOG.md`:
- Around line 10-35: The CHANGELOG footer reference block is stale and missing
entries for "Unreleased" and "0.29.4", causing the new "## [0.29.4] -
2026-05-18" heading to not resolve; update the reference-link block at the
bottom of CHANGELOG.md to add or refresh the compare links for "Unreleased"
(pointing from the latest tag to HEAD) and "0.29.4" (pointing from the previous
release, e.g. 0.29.3, to 0.29.4) so the bracketed headings resolve
correctly—look for the existing footer reference section in CHANGELOG.md and add
entries matching the style of the other links for the "Unreleased" and "0.29.4"
labels.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a1e74507-c096-4807-91f3-f4eaa4b38a18

📥 Commits

Reviewing files that changed from the base of the PR and between 7b2ca21 and f75a497.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • crates/basilica-sdk-python/CHANGELOG.md
  • crates/basilica-sdk-python/Cargo.toml
  • crates/basilica-sdk-python/pyproject.toml
  • crates/basilica-sdk-python/python/basilica/distributed.py
  • crates/basilica-sdk-python/tests/test_bench_status_skipped.py

Comment on lines +10 to +35
## [0.29.4] - 2026-05-18

### Fixed
- `BenchStatus` recognises `phase=Skipped` as a terminal state.
`_BENCH_TERMINAL_PHASES` now contains all four operator-side terminal
phases (`Succeeded`, `Failed`, `TimedOut`, `Skipped`), so
`BenchStatus.is_terminal` returns `True` on `Skipped` and
`wait_until_bench_complete` / `wait_until_bench_complete_async` return
the terminal `BenchStatus` instead of polling until the user-supplied
timeout and raising `TimeoutError`. Pre-fix, the SDK had the data
(the operator wrote terminal `BenchStatus{phase=Skipped,
lastAttemptOutcome="skipped"}` to the UD CR) but did not act on it
-- the `TimeoutError` message literally contained `(phase=Skipped)`.
Closes #480. Cross-repo reference:
`one-covenant/basilica-backend#419` Stage 4 take-5 Cell B and the
basilica-backend operator X2 fix (`one-covenant/basilica-backend#650
/ #653`).

### Added
- `BenchStatus.is_successful` / `is_failed` / `is_skipped` properties.
Pin the semantic that `Skipped` is terminal but neither success nor
failure -- the bench probe was not run (e.g. workers exited before
the bench-controller observed them). The workload's own exit codes
remain the source of truth for run success; bench is an opt-in,
best-effort measurement.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add/refresh changelog link references for 0.29.4 and Unreleased compare chain.

The new ## [0.29.4] heading is bracketed, but this file’s reference-link block is stale and does not include 0.29.4 (and Unreleased still compares from 0.14.0). Please update the footer references so version headings resolve correctly.

Suggested footer update
-[Unreleased]: https://github.com/one-covenant/basilica/compare/basilica-sdk-python-v0.14.0...HEAD
+[Unreleased]: https://github.com/one-covenant/basilica/compare/basilica-sdk-python-v0.29.4...HEAD
+[0.29.4]: https://github.com/one-covenant/basilica/compare/basilica-sdk-python-v0.29.3...basilica-sdk-python-v0.29.4
+[0.29.3]: https://github.com/one-covenant/basilica/compare/basilica-sdk-python-v0.29.2...basilica-sdk-python-v0.29.3
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/basilica-sdk-python/CHANGELOG.md` around lines 10 - 35, The CHANGELOG
footer reference block is stale and missing entries for "Unreleased" and
"0.29.4", causing the new "## [0.29.4] - 2026-05-18" heading to not resolve;
update the reference-link block at the bottom of CHANGELOG.md to add or refresh
the compare links for "Unreleased" (pointing from the latest tag to HEAD) and
"0.29.4" (pointing from the previous release, e.g. 0.29.3, to 0.29.4) so the
bracketed headings resolve correctly—look for the existing footer reference
section in CHANGELOG.md and add entries matching the style of the other links
for the "Unreleased" and "0.29.4" labels.

@epappas epappas merged commit 6dff16f into main May 18, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(sdk): BenchStatus phase=Skipped not recognized as terminal — wait_until_bench_complete spins until user timeout

1 participant