fix(onboard): bound compatible endpoint probe#5400

Merged: cv merged 3 commits into main from codex/revert-hosted-model-and-fix-e2e on Jun 13, 2026

@cv cv commented Jun 13, 2026

Collaborator

Summary

Reverts the hosted custom inference model-ID changes from #5399 so CI continues using the model ID actually served by https://inference-api.nvidia.com/v1/chat/completions. Bounds the ordinary OpenAI-compatible chat-completions onboarding validation probe with max_tokens: 8, and keeps rebuild/upgrade E2E registry metadata aligned with the hosted-compatible onboarding session.

Changes

  • Restore hosted custom inference defaults and workflow/test expectations to nvidia/nvidia/nemotron-3-super-v3.
  • Preserve provider/model in rebuild and upgrade E2E fixture registry seeding from the onboard session, falling back to hosted-compatible env values.
  • Use the hosted-compatible model env for post-rebuild inference smoke calls instead of hardcoding the public nvidia-prod model ID.
  • Add max_tokens: 8 to the non-strict chat-completions validation probe payload.
  • Add regression coverage for the bounded probe payload and rebuild fixture provider/model alignment.
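
The bounded probe described above can be illustrated with a minimal sketch. It assumes an OpenAI-compatible request body; the function name `buildChatProbePayload`, the constant name, and the prompt text are hypothetical and are not taken from `src/lib/inference/onboard-probes.ts`. Only the `max_tokens: 8` cap and the model ID come from this PR.

```typescript
// Hypothetical shape of an OpenAI-compatible chat-completions request body.
interface ChatProbePayload {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  max_tokens: number;
}

// Cap value taken from this PR's description; the constant name is illustrative.
const PROBE_MAX_TOKENS = 8;

// Build a minimal validation probe. The prompt text is an assumption.
function buildChatProbePayload(model: string): ChatProbePayload {
  return {
    model,
    messages: [{ role: "user", content: "Reply with OK." }],
    max_tokens: PROBE_MAX_TOKENS, // keeps the validation call short and cheap
  };
}

const payload = buildChatProbePayload("nvidia/nvidia/nemotron-3-super-v3");
console.log(JSON.stringify(payload));
```

Capping the response length bounds how long the endpoint can spend generating tokens for a call whose only purpose is to confirm that the endpoint and model respond.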

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
  • Targeted tests pass for changed behavior
  • Full npm test passes (broad runtime changes only)
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Notes:

  • npx prek run --from-ref main --to-ref HEAD passed before the latest fixture update; commit and push hooks passed for the latest update.
  • bash -n passed for the changed rebuild/upgrade shell fixtures.
  • npm test -- src/lib/inference/onboard-probes.test.ts test/e2e-script-workflow.test.ts src/lib/onboard/providers.test.ts passed.
  • npm test -- test/onboard-selection.test.ts test/stale-dist-check.test.ts src/lib/inference/onboard-probes.test.ts passed.
  • Isolated rerun of npm test -- test/onboard-model-router.test.ts -t "prefers the managed Model Router command over PATH" passed after one transient commit-hook failure in that unrelated test.
  • npm run docs passed with 0 errors; Fern reported 2 hidden warnings, so the docs-without-warnings checkbox is left unchecked.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

  • Chores

    • Updated default hosted inference model identifier across CI/workflows, test fixtures, and configs to a new model.
    • Updated scripts to compute provider/model with new default/mapping logic (including a compatible-endpoint mapping).
  • Bug Fixes

    • Set a baseline max_tokens cap for certain hosted probe requests.
  • Tests

    • Adjusted E2E/contract tests and fixtures to expect the new model; improved defensive fallback handling and removed a deprecated alignment test.

cv added 2 commits June 13, 2026 12:47
@cv cv self-assigned this Jun 13, 2026
@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Contributor

Review Change Stack

Walkthrough

This PR replaces the hosted inference model identifier with nvidia/nvidia/nemotron-3-super-v3 across code, CI workflows, fixtures, and tests, and adds a baseline max_tokens: 8 to the chat completions probe payload.

Changes

Hosted Inference Model Migration

| Layer | File(s) | Summary |
| --- | --- | --- |
| Core model constants and fixtures | src/lib/onboard/providers.ts, test/e2e-scenario/fixtures/hosted-inference.ts, test/e2e/lib/ci-compatible-inference.sh | Update default hosted inference model constants to nvidia/nvidia/nemotron-3-super-v3. |
| Probe payload token limit | src/lib/inference/onboard-probes.ts, src/lib/inference/onboard-probes.test.ts | Add max_tokens: 8 to default chat completions probe payload and update test to assert bounded payload. |
| CI/workflow environment updates | .github/workflows/e2e-script.yaml, .github/workflows/e2e-vitest-scenarios.yaml, .github/workflows/nightly-e2e.yaml | Set NEMOCLAW_MODEL and NEMOCLAW_COMPAT_MODEL to nvidia/nvidia/nemotron-3-super-v3 in hosted-inference job steps across workflows. |
| Test expectations and workflow contract updates | test/e2e-script-workflow.test.ts | Update assertions to expect the new hosted model identifier and custom-provider conditional; replace obsolete rebuild fixture alignment test. |
| E2E script registry and post-rebuild changes | test/e2e/test-rebuild-hermes.sh, test/e2e/test-rebuild-openclaw.sh, test/e2e/test-upgrade-stale-sandbox.sh | Adjust embedded Python registry/session logic: prefer session/env values, default provider to compatible-endpoint, default model to the new identifier, use env-resolved model for post-rebuild inference, and add defensive session loading. |
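
The registry seeding behavior summarized above (prefer session values, then environment values, then defaults) might look roughly like the following. This is a sketch in TypeScript rather than the embedded Python the fixtures actually use. The function name, the types, the precedence between `NEMOCLAW_MODEL` and `NEMOCLAW_COMPAT_MODEL`, and the exact handling of `NEMOCLAW_PROVIDER=custom` are all assumptions made for illustration.

```typescript
type Env = Record<string, string | undefined>;

interface OnboardSession {
  provider?: string;
  model?: string;
}

interface RegistryEntry {
  provider: string;
  model: string;
}

// Defaults named in the walkthrough summary.
const DEFAULT_PROVIDER = "compatible-endpoint";
const DEFAULT_MODEL = "nvidia/nvidia/nemotron-3-super-v3";

// Hypothetical helper: resolve the provider/model pair to seed into the registry.
function resolveRegistryEntry(session: OnboardSession | null, env: Env): RegistryEntry {
  // Assumed mapping: a "custom" env provider is recorded as compatible-endpoint.
  const rawEnvProvider = env["NEMOCLAW_PROVIDER"];
  const envProvider = rawEnvProvider === "custom" ? DEFAULT_PROVIDER : rawEnvProvider;

  // Session values win; a missing or empty session falls through to env, then defaults.
  const provider = session?.provider || envProvider || DEFAULT_PROVIDER;
  const model =
    session?.model ||
    env["NEMOCLAW_MODEL"] ||
    env["NEMOCLAW_COMPAT_MODEL"] ||
    DEFAULT_MODEL;

  return { provider, model };
}

const fromSession = resolveRegistryEntry({ provider: "p", model: "m" }, {});
const fromEnv = resolveRegistryEntry(null, {
  NEMOCLAW_PROVIDER: "custom",
  NEMOCLAW_COMPAT_MODEL: "example/model",
});
const fromDefaults = resolveRegistryEntry(null, {});
console.log(fromSession, fromEnv, fromDefaults);
```

Accepting a `null` session mirrors the "defensive session loading" noted in the summary: a missing or unreadable session file degrades to environment values instead of failing the fixture.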

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5380: Touches credential-migration workflow and updates model/provider values used for hosted inference.
  • NVIDIA/NemoClaw#5385: Also updates hosted-inference model selection to nvidia/nvidia/nemotron-3-super-v3 in related code paths.

Suggested labels

integration: openclaw, bug-fix, area: e2e

Suggested reviewers

  • prekshivyas

Poem

🐰 I hopped through CI and changed the tune,
Swapped model strings beneath the moon,
Capped tokens small, made tests align,
Now nemotron-3-super runs just fine.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(onboard): bound compatible endpoint probe' accurately describes the main change: adding a max_tokens bound to the onboarding validation probe for the compatible endpoint. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@github-code-quality

github-code-quality Bot commented Jun 13, 2026


Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
| File | dc23e81 | +/- |
| --- | --- | --- |
| nemoclaw/src/se...cret-scanner.ts | 100% | |
| nemoclaw/src/commands/slash.ts | 100% | |
| nemoclaw/src/li...bprocess-env.ts | 100% | |
| nemoclaw/src/bl...eprint/state.ts | 98% | |
| nemoclaw/src/onboard/config.ts | 98% | |
| nemoclaw/src/bl...int/snapshot.ts | 97% | |
| nemoclaw/src/bl...print/runner.ts | 95% | |
| nemoclaw/src/co...ration-state.ts | 94% | |
| nemoclaw/src/bl...ate-networks.ts | 94% | |
| nemoclaw/src/index.ts | 94% | |

TypeScript / code-coverage/cli

The overall coverage in the branch is 44%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
| File | dc23e81 | +/- |
| --- | --- | --- |
| src/lib/state/o...oard-session.ts | 90% | |
| src/lib/inference/local.ts | 77% | |
| src/lib/sandbox/config.ts | 72% | |
| src/lib/inference/nim.ts | 72% | |
| src/lib/onboard/preflight.ts | 64% | |
| src/lib/state/sandbox.ts | 55% | |
| src/lib/onboard...er-gpu-patch.ts | 50% | |
| src/lib/actions...licy-channel.ts | 49% | |
| src/lib/policy/index.ts | 48% | |
| src/lib/onboard.ts | 17% | |

Updated June 13, 2026 20:26 UTC

@github-actions

github-actions Bot commented Jun 13, 2026

Contributor

E2E Advisor Recommendation

Required E2E: inference-routing-e2e, cloud-onboard-e2e, credential-migration-e2e, launchable-smoke-e2e, rebuild-openclaw-e2e, rebuild-hermes-e2e, upgrade-stale-sandbox-e2e
Optional E2E: credential-migration-vitest, rebuild-hermes-stale-base-e2e, sandbox-operations-e2e, openclaw-tui-chat-correlation-e2e

Dispatch hint: inference-routing-e2e,cloud-onboard-e2e,credential-migration-e2e,launchable-smoke-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e,upgrade-stale-sandbox-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • inference-routing-e2e (medium): Validates OpenAI-compatible inference routing and onboarding probe behavior against the live hosted endpoint after adding default max_tokens and changing the hosted model default.
  • cloud-onboard-e2e (medium): Exercises the reusable e2e-script workflow path that now exports the new hosted compatible inference model during install/onboard.
  • credential-migration-e2e (medium): Covers live migration of NVIDIA_INFERENCE_API_KEY into the custom compatible provider with the updated hosted endpoint/model environment.
  • launchable-smoke-e2e (medium): Provides a direct install-flow smoke test for live hosted compatible inference with the changed model values in nightly direct-job env wiring.
  • rebuild-openclaw-e2e (medium): The OpenClaw rebuild script changed its hosted custom inference fixture/registry behavior; run the exact E2E job that consumes it.
  • rebuild-hermes-e2e (medium): The Hermes rebuild script changed its hosted custom inference fixture/registry behavior; run the main Hermes rebuild E2E lane.
  • upgrade-stale-sandbox-e2e (medium): The stale sandbox upgrade script was changed around provider/model registry state, which is a sandbox lifecycle and deployment-sensitive path.

Optional E2E

  • credential-migration-vitest (medium): Useful to validate the separate e2e-vitest-scenarios workflow job whose live hosted inference environment was changed, although the nightly credential-migration-e2e covers the same live behavior.
  • rebuild-hermes-stale-base-e2e (medium): Additional coverage for the alternate Hermes stale-base mode in the changed test-rebuild-hermes.sh script.
  • sandbox-operations-e2e (high): Broad confidence check for direct nightly hosted inference env wiring and sandbox lifecycle operations after the hosted model/default probe changes.
  • openclaw-tui-chat-correlation-e2e (medium): Adjacent real assistant flow using the updated direct hosted compatible inference environment; useful if maintainers want end-user chat confidence.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: inference-routing-e2e,cloud-onboard-e2e,credential-migration-e2e,launchable-smoke-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e,upgrade-stale-sandbox-e2e

@github-actions

github-actions Bot commented Jun 13, 2026

Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: credential-migration-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=credential-migration-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • credential-migration-vitest: The PR changes the discrete credential migration live Vitest job's hosted inference model wiring in e2e-vitest-scenarios.yaml and updates the hosted-inference fixture used by test/e2e-scenario/live/credential-migration.test.ts. Dispatch the free-standing job directly to verify the compatible hosted inference credential/model path.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=credential-migration-vitest

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • .github/workflows/e2e-vitest-scenarios.yaml
  • src/lib/inference/onboard-probes.ts
  • src/lib/onboard/providers.ts
  • test/e2e-scenario/fixtures/hosted-inference.ts

@github-actions

github-actions Bot commented Jun 13, 2026

Contributor

PR Review Advisor

Findings: 0 needs attention, 0 worth checking, 0 nice ideas
Since last review: 2 prior items resolved, 0 still apply, 0 new items found

Consider writing more tests for
  • **Runtime validation**: Hosted compatible onboarding validation sends max_tokens: 8 to /chat/completions for nvidia/nvidia/nemotron-3-super-v3 and receives an OK response. Unit and workflow contract coverage is appropriate for the code changes, but the touched paths still cross live hosted inference, credential staging, workflow execution, and OpenShell sandbox rebuild/upgrade behavior.
  • **Runtime validation**: DeepSeek V4 Pro validation still uses streaming with max_tokens: 8192 after the default chat-completions probe cap. Unit and workflow contract coverage is appropriate for the code changes, but the touched paths still cross live hosted inference, credential staging, workflow execution, and OpenShell sandbox rebuild/upgrade behavior.
  • **Runtime validation**: Rebuild OpenClaw, rebuild Hermes, and upgrade-stale sandbox fixtures preserve the session/env provider and model when NEMOCLAW_PROVIDER=custom hosted inference is active. Unit and workflow contract coverage is appropriate for the code changes, but the touched paths still cross live hosted inference, credential staging, workflow execution, and OpenShell sandbox rebuild/upgrade behavior.
  • **Runtime validation**: Post-rebuild inference smoke calls use NEMOCLAW_MODEL or NEMOCLAW_COMPAT_MODEL rather than the old nvidia-prod Build model in hosted-compatible lanes. Unit and workflow contract coverage is appropriate for the code changes, but the touched paths still cross live hosted inference, credential staging, workflow execution, and OpenShell sandbox rebuild/upgrade behavior.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/test-rebuild-hermes.sh`:
- Around line 287-288: Update the seeded registry model ID to the migrated
served value in all three E2E scripts: in test/e2e/test-rebuild-hermes.sh (lines
287-288) replace the registry 'model' value with
'nvidia/nvidia/nemotron-3-super-v3' (leave 'provider' as-is), in
test/e2e/test-rebuild-openclaw.sh (lines 222-223) replace the registry 'model'
value with 'nvidia/nvidia/nemotron-3-super-v3', and in
test/e2e/test-upgrade-stale-sandbox.sh (lines 162-163) replace the registry
'model' value with 'nvidia/nvidia/nemotron-3-super-v3' so all seeded registry
entries use the served provider model ID.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c16d651b-d919-4a3d-9591-56d9f2f1d36d

📥 Commits

Reviewing files that changed from the base of the PR and between 3f003c0 and 49f0e18.

📒 Files selected for processing (12)
  • .github/workflows/e2e-script.yaml
  • .github/workflows/e2e-vitest-scenarios.yaml
  • .github/workflows/nightly-e2e.yaml
  • src/lib/inference/onboard-probes.test.ts
  • src/lib/inference/onboard-probes.ts
  • src/lib/onboard/providers.ts
  • test/e2e-scenario/fixtures/hosted-inference.ts
  • test/e2e-script-workflow.test.ts
  • test/e2e/lib/ci-compatible-inference.sh
  • test/e2e/test-rebuild-hermes.sh
  • test/e2e/test-rebuild-openclaw.sh
  • test/e2e/test-upgrade-stale-sandbox.sh

Comment thread test/e2e/test-rebuild-hermes.sh Outdated
Signed-off-by: Carlos Villela <cvillela@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-script-workflow.test.ts`:
- Line 926: Replace the CWD-dependent read with a module-relative read: when
reading the fixture use readFileSync(path.join(__dirname, fixture), "utf8") (or
readFileSync(new URL(fixture, import.meta.url), "utf8") in ESM) instead of
readFileSync(fixture, "utf8"); update the import/require to include path (or
ensure URL usage) so the assignment to body uses a module-relative path derived
from __dirname or import.meta.url and no longer depends on the process CWD.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6377e0da-6fdc-47a7-ba94-65e9f9892c3d

📥 Commits

Reviewing files that changed from the base of the PR and between 49f0e18 and dc23e81.

📒 Files selected for processing (4)
  • test/e2e-script-workflow.test.ts
  • test/e2e/test-rebuild-hermes.sh
  • test/e2e/test-rebuild-openclaw.sh
  • test/e2e/test-upgrade-stale-sandbox.sh

@@ -925,9 +925,10 @@ describe("E2E reusable workflow contract", () => {
for (const fixture of rebuildFixtures) {
const body = readFileSync(fixture, "utf8");

Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use module-relative reads for rebuild fixtures to avoid CWD-coupled test failures.

readFileSync(fixture, "utf8") relies on the test runner’s current working directory. This can flake when the suite is invoked from a different CWD.

Suggested patch
-    for (const fixture of rebuildFixtures) {
-      const body = readFileSync(fixture, "utf8");
+    for (const fixture of rebuildFixtures) {
+      const body = readFileSync(new URL(`../${fixture}`, import.meta.url), "utf8");
       expect(body, fixture).toContain("provider = sess.get('provider')");
       expect(body, fixture).toContain("if env_provider == 'custom'");
       expect(body, fixture).toContain("'provider': provider");
       expect(body, fixture).toContain("'model': model");
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- const body = readFileSync(fixture, "utf8");
+ const body = readFileSync(new URL(`../${fixture}`, import.meta.url), "utf8");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-script-workflow.test.ts` at line 926, Replace the CWD-dependent read
with a module-relative read: when reading the fixture use
readFileSync(path.join(__dirname, fixture), "utf8") (or readFileSync(new
URL(fixture, import.meta.url), "utf8") in ESM) instead of readFileSync(fixture,
"utf8"); update the import/require to include path (or ensure URL usage) so the
assignment to body uses a module-relative path derived from __dirname or
import.meta.url and no longer depends on the process CWD.
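
To see why a module-relative URL removes the CWD dependency raised in this comment, here is a small sketch. It assumes the fixture paths are repo-relative (for example `test/e2e/test-rebuild-hermes.sh`) and that the test module sits one directory below the repo root, which is what the `../${fixture}` form in the suggested patch implies. The helper name `fixtureUrl` and the example `/repo` location are hypothetical; in the real test the base would be `import.meta.url`.

```typescript
// Resolve a repo-relative fixture path against the URL of the test module.
// The result depends only on where the module lives, not on process.cwd().
function fixtureUrl(fixture: string, moduleUrl: string): URL {
  return new URL(`../${fixture}`, moduleUrl);
}

// Stand-in for import.meta.url so the sketch runs in any module mode.
const exampleModuleUrl = "file:///repo/test/e2e-script-workflow.test.ts";

const resolved = fixtureUrl("test/e2e/test-rebuild-hermes.sh", exampleModuleUrl);
console.log(resolved.href); // file:///repo/test/e2e/test-rebuild-hermes.sh
```

Node's `readFileSync` accepts a `file:` `URL` object directly, so the resolved value can be passed to it without converting to a path string first.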

@cv cv merged commit cc1fa5c into main Jun 13, 2026
49 checks passed
@cv cv deleted the codex/revert-hosted-model-and-fix-e2e branch June 13, 2026 21:08
@cv cv added the v0.0.65 Release target label Jun 15, 2026