fix(onboard): bound compatible endpoint probe by cv · Pull Request #5400 · NVIDIA/NemoClaw

cv · 2026-06-13T20:04:25Z

Summary

Reverts the hosted custom inference model-ID changes from #5399 so CI continues using the model ID actually served by https://inference-api.nvidia.com/v1/chat/completions. Bounds the ordinary OpenAI-compatible chat-completions onboarding validation probe with max_tokens: 8, and keeps rebuild/upgrade E2E registry metadata aligned with the hosted-compatible onboarding session.

Changes

Restore hosted custom inference defaults and workflow/test expectations to nvidia/nvidia/nemotron-3-super-v3.
Preserve provider/model in rebuild and upgrade E2E fixture registry seeding from the onboard session, falling back to hosted-compatible env values.
Use the hosted-compatible model env for post-rebuild inference smoke calls instead of hardcoding the public nvidia-prod model ID.
Add max_tokens: 8 to the non-strict chat-completions validation probe payload.
Add regression coverage for the bounded probe payload and rebuild fixture provider/model alignment.
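As a rough illustration of the bounded probe, the sketch below builds a chat-completions payload capped at `max_tokens: 8`. The names `ProbePayload`, `PROBE_MAX_TOKENS`, and `buildProbePayload`, along with the probe message text, are hypothetical and are not taken from `src/lib/inference/onboard-probes.ts`. Only the cap value and the model ID come from this PR.

```typescript
// Hypothetical shape of an OpenAI-compatible chat-completions probe payload.
interface ProbePayload {
  model: string;
  messages: { role: "user"; content: string }[];
  max_tokens: number;
}

// Cap described in this PR for the non-strict validation probe.
const PROBE_MAX_TOKENS = 8;

// Build a minimal probe request body; the message text is an assumption.
function buildProbePayload(model: string): ProbePayload {
  return {
    model,
    messages: [{ role: "user", content: "Reply with OK." }],
    max_tokens: PROBE_MAX_TOKENS,
  };
}

const payload = buildProbePayload("nvidia/nvidia/nemotron-3-super-v3");
console.log(JSON.stringify(payload));
```

Bounding the response keeps the validation call cheap: the probe only needs to confirm that the endpoint accepts the model ID and returns a completion.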

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
Targeted tests pass for changed behavior
Full npm test passes (broad runtime changes only)
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
npm run docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Notes:

npx prek run --from-ref main --to-ref HEAD passed before the latest fixture update; commit and push hooks passed for the latest update.
bash -n passed for the changed rebuild/upgrade shell fixtures.
npm test -- src/lib/inference/onboard-probes.test.ts test/e2e-script-workflow.test.ts src/lib/onboard/providers.test.ts passed.
npm test -- test/onboard-selection.test.ts test/stale-dist-check.test.ts src/lib/inference/onboard-probes.test.ts passed.
Isolated rerun of npm test -- test/onboard-model-router.test.ts -t "prefers the managed Model Router command over PATH" passed after one transient commit-hook failure in that unrelated test.
npm run docs passed with 0 errors; Fern reported 2 hidden warnings, so the docs-without-warnings checkbox is left unchecked.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

Chores
- Updated default hosted inference model identifier across CI/workflows, test fixtures, and configs to a new model.
- Updated scripts to compute provider/model with new default/mapping logic (including a compatible-endpoint mapping).
Bug Fixes
- Set a baseline max_tokens cap for certain hosted probe requests.
Tests
- Adjusted E2E/contract tests and fixtures to expect the new model; improved defensive fallback handling and removed a deprecated alignment test.

This reverts commit 3f003c0.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

coderabbitai · 2026-06-13T20:04:38Z

📝 Walkthrough

This PR replaces the hosted inference model identifier with nvidia/nvidia/nemotron-3-super-v3 across code, CI workflows, fixtures, and tests, and adds a baseline max_tokens: 8 to the chat completions probe payload.

Changes

Hosted Inference Model Migration

| Layer | File(s) | Summary |
| --- | --- | --- |
| Core model constants and fixtures | `src/lib/onboard/providers.ts`, `test/e2e-scenario/fixtures/hosted-inference.ts`, `test/e2e/lib/ci-compatible-inference.sh` | Update default hosted inference model constants to `nvidia/nvidia/nemotron-3-super-v3`. |
| Probe payload token limit | `src/lib/inference/onboard-probes.ts`, `src/lib/inference/onboard-probes.test.ts` | Add `max_tokens: 8` to default chat completions probe payload and update test to assert bounded payload. |
| CI/workflow environment updates | `.github/workflows/e2e-script.yaml`, `.github/workflows/e2e-vitest-scenarios.yaml`, `.github/workflows/nightly-e2e.yaml` | Set `NEMOCLAW_MODEL` and `NEMOCLAW_COMPAT_MODEL` to `nvidia/nvidia/nemotron-3-super-v3` in hosted-inference job steps across workflows. |
| Test expectations and workflow contract updates | `test/e2e-script-workflow.test.ts` | Update assertions to expect the new hosted model identifier and custom-provider conditional; replace obsolete rebuild fixture alignment test. |
| E2E script registry and post-rebuild changes | `test/e2e/test-rebuild-hermes.sh`, `test/e2e/test-rebuild-openclaw.sh`, `test/e2e/test-upgrade-stale-sandbox.sh` | Adjust embedded Python registry/session logic: prefer session/env values, default `provider` to `compatible-endpoint`, default `model` to the new identifier, use env-resolved model for post-rebuild inference, and add defensive session loading. |
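The registry seeding described for the E2E scripts can be sketched as a small resolution function. The real logic lives in Python embedded in the shell fixtures, so this TypeScript version is only a transliteration under assumptions: the function and type names are hypothetical, and the exact precedence between session values, `NEMOCLAW_MODEL`, and `NEMOCLAW_COMPAT_MODEL` is a guess based on the summary above.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical subset of the onboard session that the fixtures read.
interface Session {
  provider?: string;
  model?: string;
}

const DEFAULT_PROVIDER = "compatible-endpoint";
const DEFAULT_MODEL = "nvidia/nvidia/nemotron-3-super-v3";

// Prefer session values, then env values, then hosted-compatible defaults.
function resolveRegistryEntry(
  sess: Session | null,
  env: Env,
): { provider: string; model: string } {
  // Assumed mapping: the env alias "custom" becomes "compatible-endpoint".
  const envProvider =
    env.NEMOCLAW_PROVIDER === "custom" ? DEFAULT_PROVIDER : env.NEMOCLAW_PROVIDER;
  const provider = sess?.provider || envProvider || DEFAULT_PROVIDER;
  // Assumed order: session, NEMOCLAW_MODEL, NEMOCLAW_COMPAT_MODEL, default.
  const model =
    sess?.model || env.NEMOCLAW_MODEL || env.NEMOCLAW_COMPAT_MODEL || DEFAULT_MODEL;
  return { provider, model };
}

// A missing session (defensive loading) falls through to env and defaults.
console.log(resolveRegistryEntry(null, { NEMOCLAW_PROVIDER: "custom" }));
```

Passing `null` for the session models the defensive loading case, where the session file is absent or unreadable and the fixture still has to seed a usable registry entry.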

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

NVIDIA/NemoClaw#5380: Touches credential-migration workflow and updates model/provider values used for hosted inference.
NVIDIA/NemoClaw#5385: Also updates hosted-inference model selection to nvidia/nvidia/nemotron-3-super-v3 in related code paths.

Suggested labels

integration: openclaw, bug-fix, area: e2e

Suggested reviewers

prekshivyas

Poem

🐰 I hopped through CI and changed the tune,
Swapped model strings beneath the moon,
Capped tokens small, made tests align,
Now nemotron-3-super runs just fine.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(onboard): bound compatible endpoint probe' accurately describes the main change: adding a max_tokens bound to the onboarding validation probe for the compatible endpoint. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


github-code-quality · 2026-06-13T20:05:06Z

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the codex/revert-hosted-... branch is 96%. Coverage data for the main branch is not yet available.

Show a code coverage summary of the most covered files.

| File | main | codex/revert-hosted-... `dc23e81` | +/- |
| --- | --- | --- | --- |
| `nemoclaw/src/se...cret-scanner.ts` | — | 100% | — |
| `nemoclaw/src/commands/slash.ts` | — | 100% | — |
| `nemoclaw/src/li...bprocess-env.ts` | — | 100% | — |
| `nemoclaw/src/bl...eprint/state.ts` | — | 98% | — |
| `nemoclaw/src/onboard/config.ts` | — | 98% | — |
| `nemoclaw/src/bl...int/snapshot.ts` | — | 97% | — |
| `nemoclaw/src/bl...print/runner.ts` | — | 95% | — |
| `nemoclaw/src/co...ration-state.ts` | — | 94% | — |
| `nemoclaw/src/bl...ate-networks.ts` | — | 94% | — |
| `nemoclaw/src/index.ts` | — | 94% | — |

TypeScript / code-coverage/cli

The overall coverage in the codex/revert-hosted-... branch is 44%. Coverage data for the main branch is not yet available.

Show a code coverage summary of the most covered files.

| File | main | codex/revert-hosted-... `dc23e81` | +/- |
| --- | --- | --- | --- |
| `src/lib/state/o...oard-session.ts` | — | 90% | — |
| `src/lib/inference/local.ts` | — | 77% | — |
| `src/lib/sandbox/config.ts` | — | 72% | — |
| `src/lib/inference/nim.ts` | — | 72% | — |
| `src/lib/onboard/preflight.ts` | — | 64% | — |
| `src/lib/state/sandbox.ts` | — | 55% | — |
| `src/lib/onboard...er-gpu-patch.ts` | — | 50% | — |
| `src/lib/actions...licy-channel.ts` | — | 49% | — |
| `src/lib/policy/index.ts` | — | 48% | — |
| `src/lib/onboard.ts` | — | 17% | — |

Updated June 13, 2026 20:26 UTC

github-actions · 2026-06-13T20:06:03Z

E2E Advisor Recommendation

Required E2E: inference-routing-e2e, cloud-onboard-e2e, credential-migration-e2e, launchable-smoke-e2e, rebuild-openclaw-e2e, rebuild-hermes-e2e, upgrade-stale-sandbox-e2e
Optional E2E: credential-migration-vitest, rebuild-hermes-stale-base-e2e, sandbox-operations-e2e, openclaw-tui-chat-correlation-e2e

Dispatch hint: inference-routing-e2e,cloud-onboard-e2e,credential-migration-e2e,launchable-smoke-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e,upgrade-stale-sandbox-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

inference-routing-e2e (medium): Validates OpenAI-compatible inference routing and onboarding probe behavior against the live hosted endpoint after adding default max_tokens and changing the hosted model default.
cloud-onboard-e2e (medium): Exercises the reusable e2e-script workflow path that now exports the new hosted compatible inference model during install/onboard.
credential-migration-e2e (medium): Covers live migration of NVIDIA_INFERENCE_API_KEY into the custom compatible provider with the updated hosted endpoint/model environment.
launchable-smoke-e2e (medium): Provides a direct install-flow smoke test for live hosted compatible inference with the changed model values in nightly direct-job env wiring.
rebuild-openclaw-e2e (medium): The OpenClaw rebuild script changed its hosted custom inference fixture/registry behavior; run the exact E2E job that consumes it.
rebuild-hermes-e2e (medium): The Hermes rebuild script changed its hosted custom inference fixture/registry behavior; run the main Hermes rebuild E2E lane.
upgrade-stale-sandbox-e2e (medium): The stale sandbox upgrade script was changed around provider/model registry state, which is a sandbox lifecycle and deployment-sensitive path.

Optional E2E

credential-migration-vitest (medium): Useful to validate the separate e2e-vitest-scenarios workflow job whose live hosted inference environment was changed, although the nightly credential-migration-e2e covers the same live behavior.
rebuild-hermes-stale-base-e2e (medium): Additional coverage for the alternate Hermes stale-base mode in the changed test-rebuild-hermes.sh script.
sandbox-operations-e2e (high): Broad confidence check for direct nightly hosted inference env wiring and sandbox lifecycle operations after the hosted model/default probe changes.
openclaw-tui-chat-correlation-e2e (medium): Adjacent real assistant flow using the updated direct hosted compatible inference environment; useful if maintainers want end-user chat confidence.

New E2E recommendations

None.

Dispatch hint

Workflow: .github/workflows/nightly-e2e.yaml
jobs input: inference-routing-e2e,cloud-onboard-e2e,credential-migration-e2e,launchable-smoke-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e,upgrade-stale-sandbox-e2e

github-actions · 2026-06-13T20:06:04Z

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: credential-migration-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=credential-migration-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

credential-migration-vitest: The PR changes the discrete credential migration live Vitest job's hosted inference model wiring in e2e-vitest-scenarios.yaml and updates the hosted-inference fixture used by test/e2e-scenario/live/credential-migration.test.ts. Dispatch the free-standing job directly to verify the compatible hosted inference credential/model path.
- Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=credential-migration-vitest

Optional Vitest E2E scenarios

None.

Relevant changed files

.github/workflows/e2e-vitest-scenarios.yaml
src/lib/inference/onboard-probes.ts
src/lib/onboard/providers.ts
test/e2e-scenario/fixtures/hosted-inference.ts

github-actions · 2026-06-13T20:07:38Z

PR Review Advisor

Findings: 0 needs attention, 0 worth checking, 0 nice ideas
Since last review: 2 prior items resolved, 0 still apply, 0 new items found

Consider writing more tests for

- **Runtime validation**: Hosted compatible onboarding validation sends max_tokens: 8 to /chat/completions for nvidia/nvidia/nemotron-3-super-v3 and receives an OK response.
- **Runtime validation**: DeepSeek V4 Pro validation still uses streaming with max_tokens: 8192 after the default chat-completions probe cap.
- **Runtime validation**: Rebuild OpenClaw, rebuild Hermes, and upgrade-stale sandbox fixtures preserve the session/env provider and model when NEMOCLAW_PROVIDER=custom hosted inference is active.
- **Runtime validation**: Post-rebuild inference smoke calls use NEMOCLAW_MODEL or NEMOCLAW_COMPAT_MODEL rather than the old nvidia-prod Build model in hosted-compatible lanes.

For each item above: unit and workflow contract coverage is appropriate for the code changes, but the touched paths still cross live hosted inference, credential staging, workflow execution, and OpenShell sandbox rebuild/upgrade behavior.
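One way to picture the second item is a unit-level check that a per-model override is not clobbered by the default cap. This is a sketch under assumptions: `ProbeOptions`, `DEFAULT_PROBE`, and `mergeProbeOptions` are hypothetical names, and the merge strategy shown is not confirmed to match how `onboard-probes.ts` combines its defaults with strict-model settings.

```typescript
// Hypothetical probe tuning knobs; only max_tokens: 8 and 8192 come from the PR.
interface ProbeOptions {
  max_tokens?: number;
  stream?: boolean;
}

// Default for the ordinary (non-strict) chat-completions probe.
const DEFAULT_PROBE: Required<ProbeOptions> = { max_tokens: 8, stream: false };

// Explicit per-model options are spread last, so they win over the default cap.
function mergeProbeOptions(overrides: ProbeOptions = {}): Required<ProbeOptions> {
  return { ...DEFAULT_PROBE, ...overrides };
}

const ordinary = mergeProbeOptions();
const streaming = mergeProbeOptions({ stream: true, max_tokens: 8192 });
console.log(ordinary, streaming);
```

A live E2E run is still needed to confirm that the hosted endpoint accepts both request shapes; this only guards the merge order.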

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/test-rebuild-hermes.sh`:
- Around line 287-288: Update the seeded registry model ID to the migrated
served value in all three E2E scripts: in test/e2e/test-rebuild-hermes.sh (lines
287-288) replace the registry 'model' value with
'nvidia/nvidia/nemotron-3-super-v3' (leave 'provider' as-is), in
test/e2e/test-rebuild-openclaw.sh (lines 222-223) replace the registry 'model'
value with 'nvidia/nvidia/nemotron-3-super-v3', and in
test/e2e/test-upgrade-stale-sandbox.sh (lines 162-163) replace the registry
'model' value with 'nvidia/nvidia/nemotron-3-super-v3' so all seeded registry
entries use the served provider model ID.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c16d651b-d919-4a3d-9591-56d9f2f1d36d

📥 Commits

Reviewing files that changed from the base of the PR and between 3f003c0 and 49f0e18.

📒 Files selected for processing (12)

.github/workflows/e2e-script.yaml
.github/workflows/e2e-vitest-scenarios.yaml
.github/workflows/nightly-e2e.yaml
src/lib/inference/onboard-probes.test.ts
src/lib/inference/onboard-probes.ts
src/lib/onboard/providers.ts
test/e2e-scenario/fixtures/hosted-inference.ts
test/e2e-script-workflow.test.ts
test/e2e/lib/ci-compatible-inference.sh
test/e2e/test-rebuild-hermes.sh
test/e2e/test-rebuild-openclaw.sh
test/e2e/test-upgrade-stale-sandbox.sh

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-script-workflow.test.ts`:
- Line 926: Replace the CWD-dependent read with a module-relative read: when
reading the fixture use readFileSync(path.join(__dirname, fixture), "utf8") (or
readFileSync(new URL(fixture, import.meta.url), "utf8") in ESM) instead of
readFileSync(fixture, "utf8"); update the import/require to include path (or
ensure URL usage) so the assignment to body uses a module-relative path derived
from __dirname or import.meta.url and no longer depends on the process CWD.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6377e0da-6fdc-47a7-ba94-65e9f9892c3d

📥 Commits

Reviewing files that changed from the base of the PR and between 49f0e18 and dc23e81.

📒 Files selected for processing (4)

test/e2e-script-workflow.test.ts
test/e2e/test-rebuild-hermes.sh
test/e2e/test-rebuild-openclaw.sh
test/e2e/test-upgrade-stale-sandbox.sh

coderabbitai · 2026-06-13T20:27:20Z

@@ -925,9 +925,10 @@ describe("E2E reusable workflow contract", () => {
    for (const fixture of rebuildFixtures) {
      const body = readFileSync(fixture, "utf8");


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use module-relative reads for rebuild fixtures to avoid CWD-coupled test failures.

readFileSync(fixture, "utf8") relies on the test runner’s current working directory. This can flake when the suite is invoked from a different CWD.

Suggested patch

-    for (const fixture of rebuildFixtures) {
-      const body = readFileSync(fixture, "utf8");
+    for (const fixture of rebuildFixtures) {
+      const body = readFileSync(new URL(`../${fixture}`, import.meta.url), "utf8");
       expect(body, fixture).toContain("provider = sess.get('provider')");
       expect(body, fixture).toContain("if env_provider == 'custom'");
       expect(body, fixture).toContain("'provider': provider");
       expect(body, fixture).toContain("'model': model");


Suggested change

- const body = readFileSync(fixture, "utf8");
+ const body = readFileSync(new URL(`../${fixture}`, import.meta.url), "utf8");
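To see why the module-relative form is independent of the working directory, the sketch below resolves a fixture path against an explicit module URL. It assumes the fixture paths are repo-relative and that the test file sits one directory below the repository root, which matches the `../` in the suggestion. The helper name `fixtureUrl` and the `/repo` location are hypothetical; in the actual test the second argument would be `import.meta.url`.

```typescript
// Resolve a repo-relative fixture path against the test module's URL
// rather than the process working directory.
function fixtureUrl(fixture: string, moduleUrl: string): URL {
  return new URL(`../${fixture}`, moduleUrl);
}

// Hypothetical checkout location standing in for import.meta.url.
const moduleUrl = "file:///repo/test/e2e-script-workflow.test.ts";
const resolved = fixtureUrl("test/e2e/test-rebuild-hermes.sh", moduleUrl);

console.log(resolved.pathname);
```

Node's `readFileSync` accepts a `file:` URL object directly, so the resolved value can be passed to it without converting back to a path string.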


cv added 2 commits June 13, 2026 12:47

Revert "fix(ci): align hosted nightly inference defaults (#5399)"

5149557

This reverts commit 3f003c0.

fix(onboard): bound compatible endpoint probe

49f0e18

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

cv self-assigned this Jun 13, 2026

coderabbitai Bot reviewed Jun 13, 2026

View reviewed changes

Comment thread test/e2e/test-rebuild-hermes.sh Outdated

fix(e2e): preserve hosted inference in rebuild fixtures

dc23e81

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

coderabbitai Bot reviewed Jun 13, 2026

View reviewed changes

cv merged commit cc1fa5c into main Jun 13, 2026
49 checks passed

cv deleted the codex/revert-hosted-model-and-fix-e2e branch June 13, 2026 21:08

cv added the v0.0.65 Release target label Jun 15, 2026

