feat(sdk): collapse bench API to bool opt-in + lazy training.bench (fixes one-covenant/basilica-backend#661) #483

Merged
epappas merged 1 commit into main from sdk-simplification/661-bench-bool
May 18, 2026

Conversation

@epappas epappas commented May 18, 2026

Summary

  • Collapses the bench API surface to a bool opt-in (bench=True / bench=False, default False).
  • Adds the simplified training.bench_diagnostics: Optional[Dict[str, Any]] debug surface alongside the existing lazy training.bench: BenchResult | None.
  • Deprecates bench: str modes ("on-start" / "off"), training.bench_status, and training.wait_until_bench_complete[_async] with DeprecationWarning pointing at the new surface. Legacy paths remain functional for two minor versions.
  • Bumps basilica-sdk-python from 0.29.4 to 0.29.5 (S2 is a user-facing API surface change per the plan).

Why

Per docs/plans/SDK-API-SIMPLIFICATION-PLAN.md on basilica-backend main (Problem 3): the bench API exposed too much state for an opt-in diagnostic helper:

  • bench: str modes ("on-start" / "off") — two string tokens to memorize for a binary opt-in
  • training.wait_until_bench_complete(timeout=...) — raises TimeoutError or returns a four-phase BenchStatus
  • BenchStatus with four terminal phases (Succeeded / Failed / TimedOut / Skipped) + four is_* properties
  • Two access paths to the result (bench_status.result vs training.bench)

For an OPT-IN measurement helper, that is six concepts for a yes/no question. The user wants to know whether the probe measured (training.bench is not None), not which of four terminal phases it landed in. Diagnostic detail moves to the rarely-needed training.bench_diagnostics accessor.

Surface after this PR

@basilica.distributed(..., bench=True)
def train(): ...

with train() as training:
    # ... workload runs ...
    pass

# After __exit__:
if training.bench is not None:
    print(f"busbw: {training.bench.busbw_gbps_p50}")
else:
    print("no measurement")
    # Advanced: inspect why
    print(training.bench_diagnostics)
    # {"phase": "Skipped", "message": "...", "mode": "on-start", ...}

Changes

  • python/basilica/__init__.py: new _normalize_bench_param helper; bench: Union[bool, str] (default False) on deploy_distributed, deploy_distributed_async, deploy_distributed_managed, deploy_distributed_managed_async. Legacy str values emit DeprecationWarning and forward the wire token verbatim.
  • python/basilica/decorators.py: @distributed bench param type changed from str to Union[bool, str] (default False); normalization happens downstream in deploy_distributed.
  • python/basilica/distributed.py:
    • New lazy property DistributedTraining.bench_diagnostics returning Optional[Dict[str, Any]] with phase / message / mode / started_at / completed_at / last_attempt_at / last_attempt_outcome. Returns None when bench wasn't requested (mode != on-start) or when there is no operator status block.
    • DistributedTraining.bench_status, wait_until_bench_complete, wait_until_bench_complete_async now emit DeprecationWarning pointing at training.bench / training.bench_diagnostics. Internal _bench_status_no_warn keeps the waiters readable without double-warning.
  • tests/test_bench_bool_simplification.py: 22 new tests pinning the post-fix shape (bool acceptance, str deprecation, diagnostics dict shape, lazy bench: BenchResult | None semantics, deprecation warnings on the legacy surface).
  • Version bump: 0.29.4 -> 0.29.5 across pyproject.toml / Cargo.toml / Cargo.lock.
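The normalization step described in the list above can be sketched as follows. The body of `_normalize_bench_param` is not shown in this PR description, so this is an illustration under assumptions: the function name here drops the leading underscore to mark it as a stand-in, the warning text is invented, and the `TypeError` for other input types is an assumption rather than documented behavior.

```python
import warnings
from typing import Union


def normalize_bench_param(bench: Union[bool, str]) -> str:
    """Hypothetical sketch: map the user-facing bench value to the wire token."""
    if isinstance(bench, bool):
        # bool opt-in maps onto the unchanged wire tokens.
        return "on-start" if bench else "off"
    if isinstance(bench, str):
        # Legacy string mode: warn, then forward the token verbatim so that
        # downstream validation still sees exactly what the caller passed.
        warnings.warn(
            "bench=<str> is deprecated; pass bench=True or bench=False instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return bench
    # ASSUMPTION: rejecting other types is not described in the PR.
    raise TypeError(f"bench must be bool or str, got {type(bench).__name__}")
```

Because the string path forwards its input unchanged, an invalid legacy token such as `"sometimes"` would still be rejected wherever the wire value is validated today, not by this helper.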

Wire contract

Unchanged. distributed.bench.mode on the operator-facing JSON stays "on-start" / "off". Only the user-facing SDK parameter type changes (str to Union[bool, str], with bool as the preferred form). Operator + CRD schema untouched; this is a pure SDK ergonomics change.

Test plan

  • pytest tests/test_bench_bool_simplification.py: 22 tests assert the post-fix shape; pre-fix run produced 15 failures (anti-pattern reproduced) before the implementation landed
  • pytest tests/ --ignore=tests/test_dns_propagation_e2e.py: 179 existing + 22 new = 201 total, all passing
  • pytest tests/test_bench_status_skipped.py: sibling bench tests still green (the Skipped-terminal-phase fix from #481, "fix(sdk): recognize BenchStatus phase=Skipped as terminal (closes #480)", remains intact)
  • cargo fmt --all -- --check clean
  • cargo check -p basilica-sdk-python clean

Migration impact

Existing users on bench="on-start" / bench="off" continue to work and see a DeprecationWarning pointing at the bool form. Existing users of wait_until_bench_complete / bench_status continue to work and see a DeprecationWarning pointing at training.bench / training.bench_diagnostics. No breaking change in this release; deprecated paths removed at the next major.

Cross-ref: follow-on work after basilica-backend issue 419. See SDK simplification plan for the broader S1-S7 ticket map (S1 Training context-manager-able, S3 command= factory, S4 source= Callable-only, S5 example migration, S6 docstrings, S7 major bump).

Summary by CodeRabbit

Version 0.29.5 Release

  • New Features

    • bench parameter for distributed training methods now accepts boolean values (True/False) for a simplified API.
    • Added bench_diagnostics property to access benchmark debug information.
    • Added bench lazy result property for retrieving benchmark outcomes.
  • Deprecations

    • String-based bench parameter values are deprecated; use boolean values instead.
    • wait_until_bench_complete() and wait_until_bench_complete_async() methods are deprecated in favor of the new bench property.
    • bench_status property is deprecated; use bench or bench_diagnostics instead.

…efs basilica-backend#661)

Part of the SDK API simplification plan
(docs/plans/SDK-API-SIMPLIFICATION-PLAN.md on basilica-backend main).

The bench API surface was over-exposed for an opt-in diagnostic helper:
- bench: str modes ("on-start" / "off") -- two string tokens for a binary
- training.wait_until_bench_complete(timeout=...) raises or returns
- BenchStatus with four terminal phases + four is_* properties
- Two access paths to the result (bench_status.result vs training.bench)

Target after S2 (this change):
- bench: bool -- True opts in, False opts out (default)
- training.bench: BenchResult | None (unchanged; lazy)
- training.bench_diagnostics: Optional[Dict[str, Any]] (new) -- small
  debug dict with phase / message / timings, for the rare case where
  the user wants to know WHY a probe didn't measure
- bench: str ("on-start" / "off") still accepted with DeprecationWarning
- wait_until_bench_complete[_async] and bench_status emit
  DeprecationWarning pointing at the lazy training.bench accessor
- Removed in next major

Changes:
- python/basilica/__init__.py: _normalize_bench_param helper; bench param
  type Union[bool, str] with default False on deploy_distributed,
  deploy_distributed_async, deploy_distributed_managed,
  deploy_distributed_managed_async; deprecation warning emitted by helper
- python/basilica/decorators.py: @distributed bench param type
  Union[bool, str] with default False (forwarded verbatim, normalized
  downstream)
- python/basilica/distributed.py: new training.bench_diagnostics lazy
  property; bench_status, wait_until_bench_complete[_async] emit
  DeprecationWarning; internal _bench_status_no_warn reads the
  BenchStatus without warning
- tests/test_bench_bool_simplification.py: 22 tests pinning the new
  surface (bool acceptance, str deprecation, diagnostics dict shape,
  lazy bench BenchResult|None semantics, wait_until_bench_complete +
  bench_status deprecation warnings)
- pyproject.toml + Cargo.toml + Cargo.lock: bump 0.29.4 -> 0.29.5

All 179 existing SDK tests pass; new tests bring total to 201.

Wire contract is unchanged: distributed.bench.mode is still
"on-start" / "off" on the operator-facing JSON. Only the user-facing
SDK parameter type narrows. Operator + CRD schema untouched.

coderabbitai Bot commented May 18, 2026

Walkthrough

This PR refactors the bench parameter for distributed training deployment from string-only ("on-start"/"off") to a boolean-primary API (True/False) with full backward compatibility. A new normalization layer maps boolean inputs to wire tokens, deprecated bench-related accessors are wrapped with warnings, and comprehensive tests validate the entire API surface and deprecation behavior.

Changes

Bench API bool simplification and deprecation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Deploy API normalization and signature updates | crates/basilica-sdk-python/python/basilica/__init__.py, crates/basilica-sdk-python/python/basilica/decorators.py | Public API signatures for deploy_distributed* methods and @distributed decorator change bench: str = "off" to bench: Union[bool, str] = False. New _normalize_bench_param() helper converts True → "on-start", False → "off", warns on legacy string inputs, and passes them through to preserve downstream validation. Docstrings updated to document boolean opt-in semantics and deprecation path. |
| DistributedTraining bench deprecations and diagnostics | crates/basilica-sdk-python/python/basilica/distributed.py | New bench_diagnostics property returns a simplified debug dict (phase, message, timestamps, outcome) or None when bench is off/unavailable. Legacy bench_status property wrapped with DeprecationWarning and delegated to internal _bench_status_no_warn. Methods wait_until_bench_complete and async variant deprecated with warnings while polling via _bench_status_no_warn to avoid repeated warning spam. |
| Comprehensive bench API test suite | crates/basilica-sdk-python/tests/test_bench_bool_simplification.py | New test module with stubbed client/deployment helpers and test classes covering bool acceptance without warnings, decorator support, string deprecation warnings, bench_diagnostics dict structure validation, lazy training.bench result semantics, and backward compatibility for deprecated waiters and status accessor. |
| Version bump to 0.29.5 | crates/basilica-sdk-python/Cargo.toml, crates/basilica-sdk-python/pyproject.toml | Package version incremented from 0.29.4 to 0.29.5. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • one-covenant/basilica#468: Adds lifecycle fields to DistributedBenchStatus to enable round-trip Rust↔Python serialization of bench phase/message/timestamps, which the main PR's new bench_diagnostics property depends on.
  • one-covenant/basilica#447: Introduces the original distributed-training deployment surface including _build_distributed_request and initial bench wire-mode support that this PR refactors and simplifies.
  • one-covenant/basilica#465: Initial bench lifecycle implementation in distributed.py (bench_status and wait_until_bench_complete waiters) that this PR wraps with deprecations.

Poem

A rabbit hops through bench booleans bright,
True and False make the API right.
Old strings fade to warnings of old,
New diagnostics, simpler and bold. 🐰✨
Backward compatible, test coverage told.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.53% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: collapsing the bench API to a bool opt-in and introducing lazy training.bench, with a reference to the issue number. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/basilica-sdk-python/python/basilica/__init__.py (1)

1432-1434: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid deprecated waiters inside the supported wait_for_bench path.

These helpers currently call wait_until_bench_complete[_async](), so deploy_distributed(..., wait_for_bench="best_effort"|"required") now emits a DeprecationWarning even though the caller did not use a deprecated API. That is misleading, and it will also break consumers that run with warnings-as-errors. Please switch these helpers to a non-warning internal polling path instead of routing through the deprecated public waiter.

Also applies to: 2540-2542
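The warning-routing problem described in this comment can be illustrated with a small stand-in. This is a sketch under assumptions, not the SDK's code: `TrainingSketch`, `_wait_no_warn`, and the module-level `wait_for_bench` helper are hypothetical names, and real polling (sleeping, timeouts, status parsing) is omitted.

```python
import warnings
from typing import Callable, Optional

# The four terminal phases named in this PR's description.
TERMINAL_PHASES = {"Succeeded", "Failed", "TimedOut", "Skipped"}


class TrainingSketch:
    """Hypothetical stand-in for DistributedTraining; models only warning routing."""

    def __init__(self, read_phase: Callable[[], Optional[str]]):
        self._read_phase = read_phase

    def _wait_no_warn(self, max_polls: int = 3) -> Optional[str]:
        # Internal polling path: never emits a DeprecationWarning.
        # Returns the last observed phase, terminal or not.
        phase = None
        for _ in range(max_polls):
            phase = self._read_phase()
            if phase in TERMINAL_PHASES:
                break
        return phase

    def wait_until_bench_complete(self, max_polls: int = 3) -> Optional[str]:
        # Deprecated public waiter: warns once, then delegates.
        warnings.warn(
            "wait_until_bench_complete is deprecated; use training.bench",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._wait_no_warn(max_polls)


def wait_for_bench(training: TrainingSketch) -> Optional[str]:
    # Supported helper: calls the internal path directly, so callers that run
    # with warnings-as-errors are not broken by a warning they did not cause.
    return training._wait_no_warn()
```

With `warnings.simplefilter("error", DeprecationWarning)` active, `wait_for_bench(...)` in this sketch returns normally while `wait_until_bench_complete()` raises, which is the distinction the review comment asks the supported path to preserve.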

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/basilica-sdk-python/python/basilica/distributed.py`:
- Around line 541-551: The internal accessor _bench_status_no_warn should treat
a published bench block with "mode": "off" as no bench; after retrieving
bench_raw in _bench_status_no_warn, check if bench_raw.get("mode") == "off" (or
equivalent string) and return None instead of calling
BenchStatus.from_status_dict, so callers like bench_status and
wait_until_bench_complete[_async] see None when benching was opted out; keep the
existing behavior for other modes and only collapse the "off" case to None.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cae08adb-47f3-4a14-a049-bca86296536b

📥 Commits

Reviewing files that changed from the base of the PR and between 6dff16f and 7895056.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • crates/basilica-sdk-python/Cargo.toml
  • crates/basilica-sdk-python/pyproject.toml
  • crates/basilica-sdk-python/python/basilica/__init__.py
  • crates/basilica-sdk-python/python/basilica/decorators.py
  • crates/basilica-sdk-python/python/basilica/distributed.py
  • crates/basilica-sdk-python/tests/test_bench_bool_simplification.py

Comment on lines +541 to 551
    def _bench_status_no_warn(self) -> Optional[BenchStatus]:
        """Internal: read the full BenchStatus without emitting the
        deprecation warning. Used by ``bench_diagnostics`` and the
        legacy ``wait_until_bench_complete`` waiter so they remain
        callable without double-warning the user."""
        if self._cached_status is None:
            self.refresh()
        bench_raw = (self._cached_status or {}).get("distributed", {}).get("bench")
        if not bench_raw:
            return None
        return BenchStatus.from_status_dict(bench_raw)

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Collapse mode="off" to None in the internal bench accessor.

The operator can still publish a bookkeeping bench block when the user opted out, e.g. {"mode": "off", "phase": "Skipped"}. Returning a BenchStatus here breaks the documented None-on-bench-off behavior for bench_status and wait_until_bench_complete[_async], and it can make wait_for_bench="required" fail even though bench was explicitly disabled.

💡 Suggested fix
     def _bench_status_no_warn(self) -> Optional[BenchStatus]:
         """Internal: read the full BenchStatus without emitting the
         deprecation warning. Used by ``bench_diagnostics`` and the
         legacy ``wait_until_bench_complete`` waiter so they remain
         callable without double-warning the user."""
         if self._cached_status is None:
             self.refresh()
         bench_raw = (self._cached_status or {}).get("distributed", {}).get("bench")
         if not bench_raw:
             return None
+        if bench_raw.get("mode", "off") != "on-start":
+            return None
         return BenchStatus.from_status_dict(bench_raw)
