test(e2e): migrate cloud inference scenario by cv · Pull Request #5361 · NVIDIA/NemoClaw

cv · 2026-06-12T19:18:05Z

Summary

Migrates the legacy test/e2e/test-cloud-inference-e2e.sh contract into a focused live Vitest scenario for issue #5098. The scenario keeps the live NVIDIA API boundary, install/onboard path, sandbox chat validation, skill filesystem checks, retries, cleanup, and artifact evidence while classifying pre-contract provider rate limiting as an evidence-rich skip.

Related Issue

Fixes #5098

Changes

Adds test/e2e-scenario/live/cloud-inference.test.ts for install/onboard, inference.local chat, repo and sandbox skill validation, retry/artifact capture, and tolerant sandbox cleanup.
Classifies HTTP 429, sanitized endpoint validation, and related pre-contract provider validation failures as skipped Vitest evidence instead of migration failures.
Adds focused support coverage for the provider-skip classifier so credential/auth failures and non-transient product errors still fail the migrated contract.
Wires the cloud-inference-vitest selector into .github/workflows/e2e-vitest-scenarios.yaml and tools/e2e-scenarios/free-standing-jobs.env.
Extends workflow boundary support tests so the new dispatch job keeps the expected artifact, secret, selector, and upload contract.
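
The provider-skip rule described above can be sketched roughly as follows. This is a minimal illustration, not the contents of cloud-inference-provider-skip.ts: the `ProbeResult` and `SkipEvidence` shapes, the function name, and the regular expressions are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real module's exports are not shown on this page.
type ProbeResult = { exitCode: number; output: string };

type SkipEvidence = {
  reason: "rate-limited" | "endpoint-validation";
  exitCode: number;
  excerpt: string; // short slice of the probe output kept as evidence
};

// Assumed patterns. Auth is checked first so it can never be downgraded to a skip.
const AUTH_PATTERN = /\b(401|403|unauthorized|forbidden|invalid api key)\b/i;
const RATE_LIMIT_PATTERN = /\b429\b|too many requests|rate limit/i;
const ENDPOINT_VALIDATION_PATTERN = /endpoint validation failed/i;

function classifyProviderFailure(probe: ProbeResult): SkipEvidence | null {
  if (probe.exitCode === 0) return null; // nothing failed, nothing to skip
  if (AUTH_PATTERN.test(probe.output)) return null; // credential problems must fail
  const excerpt = probe.output.slice(0, 200);
  if (RATE_LIMIT_PATTERN.test(probe.output)) {
    return { reason: "rate-limited", exitCode: probe.exitCode, excerpt };
  }
  if (ENDPOINT_VALIDATION_PATTERN.test(probe.output)) {
    return { reason: "endpoint-validation", exitCode: probe.exitCode, excerpt };
  }
  return null; // unknown failures stay failures
}
```

Returning `null` for anything unrecognized keeps the default strict: only failures that match a known transient provider pattern become skips.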

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
npm run docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Targeted checks run:

npm ci --ignore-scripts
cd nemoclaw && npm ci --ignore-scripts && npm run build
npx @biomejs/biome check test/e2e-scenario/live/cloud-inference.test.ts test/e2e-scenario/live/cloud-inference-provider-skip.ts test/e2e-scenario/support-tests/cloud-inference-provider-skip.test.ts test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts tools/e2e-scenarios/workflow-boundary.mts
npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/cloud-inference-provider-skip.test.ts test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
NEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/cloud-inference.test.ts --silent=false --reporter=default (skipped locally because NVIDIA_API_KEY is not present)
npm run build:cli
npm run test-size:check
npm run source-shape:check
npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
npx vitest run test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
gh workflow run e2e-vitest-scenarios.yaml --ref codex/5098-cloud-inference-e2e --field jobs=cloud-inference-vitest
gh run watch 27438539949 --interval 20 --exit-status
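
The local skip noted for the live scenario command suggests a gate on both the opt-in flag and the API key. A hedged sketch of such a gate is below; the function name and return shape are hypothetical, and the actual test may decide this differently.

```typescript
type Env = Record<string, string | undefined>;

type LiveGate = { run: true } | { run: false; reason: string };

// Assumed gating logic: the opt-in flag must be "1" and the key must be present.
function liveScenarioGate(env: Env): LiveGate {
  if (env["NEMOCLAW_RUN_E2E_SCENARIOS"] !== "1") {
    return { run: false, reason: "NEMOCLAW_RUN_E2E_SCENARIOS is not set to 1" };
  }
  if (!env["NVIDIA_API_KEY"]) {
    return { run: false, reason: "NVIDIA_API_KEY is not present" };
  }
  return { run: true };
}
```

Passing the environment in as a plain record, rather than reading a global, keeps the gate testable without touching real credentials.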

Hook note: NEMOCLAW_TEST_TIMEOUT=20000 npx prek run --files ... passed once before the final provider-skip classifier patch. After the final patch, the targeted checks above passed again, but local file-scoped hooks were blocked by unrelated full CLI timing flakes (e2e-fixture-context SIGTERM/SIGKILL expectation and later 5s CLI test timeouts in web-search-flow, sandbox-mutations, and snapshot-shields). Automatic PR CI and the required cloud-inference-vitest dispatch pass on the final branch tip.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

Tests
- Added a live E2E test for cloud inference that validates chat-completion behavior, sandbox provisioning/cleanup, and artifact reporting.
- Added classification logic and unit tests to determine when external provider failures should cause a scenario skip and to build skip evidence.
Chores
- Added a dedicated CI job for cloud inference vitest runs and wired its status into PR reporting.
- Added workflow boundary validations to enforce job shape, environment rules, and artifact conventions.

copy-pr-bot · 2026-06-12T19:18:09Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-06-12T19:18:14Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 874dc130-dbd2-497d-b3f2-e96eb847574e

📥 Commits

Reviewing files that changed from the base of the PR and between d27a43b and 574b8bd.

📒 Files selected for processing (3)

.github/workflows/e2e-vitest-scenarios.yaml
test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
tools/e2e-scenarios/workflow-boundary.mts

🚧 Files skipped from review as they are similar to previous changes (3)

.github/workflows/e2e-vitest-scenarios.yaml
test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
tools/e2e-scenarios/workflow-boundary.mts

📝 Walkthrough

Adds a new cloud-inference live Vitest E2E scenario: provider-failure classification and evidence, a live test that provisions/cleans sandboxes and validates an inference "PONG" response with retries, and workflow integration plus boundary validators and PR reporting.

Changes

Cloud Inference Live E2E Vitest

Layer / File(s)	Summary
Provider Skip Classification Logic `test/e2e-scenario/live/cloud-inference-provider-skip.ts`, `test/e2e-scenario/support-tests/cloud-inference-provider-skip.test.ts`	Classifies pre-contract external provider failures into transient endpoint-validation, rate-limited/sanitized, or auth-related categories; builds structured skip-evidence objects; includes tests validating classification and evidence shape.
Live E2E Test Orchestration `test/e2e-scenario/live/cloud-inference.test.ts`	Adds a live Vitest test that provisions/tears down OpenClaw sandboxes, verifies CLI binaries, runs installer probes, gates on pre-contract skip evidence, performs inference.local chat completion checks (JSON parsing + retry for "PONG"), runs filesystem validators, and writes scenario artifacts.
Workflow Job Registration and Validation `.github/workflows/e2e-vitest-scenarios.yaml`, `tools/e2e-scenarios/workflow-boundary.mts`, `test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts`	Registers `cloud-inference-vitest` job, enforces job/step env and secret rules (NVIDIA_API_KEY handling), requires pinned checkout/setup-node and specific install/build/test commands, validates artifact upload settings, wires validator into workflow boundary checks, and adds job to PR report `needs`.
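
To illustrate the kind of job-shape rule the workflow boundary row describes, here is a minimal sketch over an already-parsed job object. The `Step` and `Job` types, the `validateJob` name, and the exact rules are assumptions for the example and do not reproduce tools/e2e-scenarios/workflow-boundary.mts.

```typescript
// Hypothetical, simplified view of a parsed workflow job.
type Step = { name?: string; uses?: string; run?: string; env?: Record<string, string> };
type Job = { env?: Record<string, string>; steps: Step[] };

const PINNED_SHA = /@[0-9a-f]{40}$/;
const MUST_BE_PINNED = ["actions/checkout@", "actions/setup-node@"];

function validateJob(jobName: string, job: Job): string[] {
  const errors: string[] = [];
  // Assumed rule: wherever the API key is set, it must come from the secrets context.
  const envBlocks = [job.env ?? {}, ...job.steps.map((s) => s.env ?? {})];
  for (const env of envBlocks) {
    const value = env["NVIDIA_API_KEY"];
    if (value !== undefined && value !== "${{ secrets.NVIDIA_API_KEY }}") {
      errors.push(`${jobName}: NVIDIA_API_KEY must be read from secrets`);
    }
  }
  // Assumed rule: checkout and setup-node must be pinned to a full commit SHA.
  for (const step of job.steps) {
    const uses = step.uses;
    if (uses === undefined) continue;
    if (MUST_BE_PINNED.some((p) => uses.startsWith(p)) && !PINNED_SHA.test(uses)) {
      errors.push(`${jobName}: "${uses}" is not pinned to a commit SHA`);
    }
  }
  // Assumed rule: the job must upload artifacts.
  if (!job.steps.some((s) => s.uses?.startsWith("actions/upload-artifact@"))) {
    errors.push(`${jobName}: missing artifact upload step`);
  }
  return errors;
}
```

Collecting every violation into a list, instead of throwing on the first one, lets a support test report the full set of contract breaks in a single run.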

Sequence Diagram(s)

sequenceDiagram
  participant Test as Live E2E Test
  participant Install as install.sh
  participant Sandbox as OpenClaw Sandbox
  participant Inference as inference.local API
  participant Validator as Bash Validators

  Test->>Install: run installer (probe)
  Install-->>Test: probe stdout/stderr + exit code
  Test->>Test: classify probe -> skip evidence?
  alt skip
    Test->>Test: write skip evidence, skip remaining steps
  else proceed
    Test->>Sandbox: verify CLI on PATH
    Test->>Inference: curl chat completion (JSON)
    loop retries
      Inference-->>Test: JSON response
      Test->>Test: extract content, match /pong/i
    end
    Test->>Validator: run repo/sandbox validators
    Validator-->>Test: validation results
    Test->>Test: write artifacts
  end
  Test->>Sandbox: best-effort cleanup
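
The retry loop in the diagram above can be sketched as follows, assuming an OpenAI-style chat completion response. `fetchRaw` is a hypothetical callback standing in for the curl call against inference.local, and the attempt count and delay are illustrative defaults, not values taken from the PR.

```typescript
// Minimal assumed shape of a chat completion response.
type ChatCompletion = { choices?: { message?: { content?: string } }[] };

// Returns the first choice's content, or null if the payload is not usable JSON.
function extractContent(raw: string): string | null {
  try {
    const parsed = JSON.parse(raw) as ChatCompletion;
    return parsed.choices?.[0]?.message?.content ?? null;
  } catch {
    return null; // non-JSON or null payloads count as a failed attempt
  }
}

async function expectPong(
  fetchRaw: () => Promise<string>,
  attempts = 3,
  delayMs = 1000,
): Promise<string> {
  let last: string | null = null;
  for (let i = 1; i <= attempts; i++) {
    last = extractContent(await fetchRaw());
    if (last !== null && /pong/i.test(last)) return last;
    // Wait between attempts, but not after the final one.
    if (i < attempts) await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`no PONG after ${attempts} attempts; last content: ${String(last)}`);
}
```

Keeping the last extracted content in the error message gives the artifact log something concrete to show when all attempts fail.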

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

NVIDIA/NemoClaw#5243: Related workflow validator and job-selector additions touching the same e2e-vitest-scenarios workflow and boundary logic.
NVIDIA/NemoClaw#5370: Overlaps in workflow-boundary validation and selector derivation changes.
NVIDIA/NemoClaw#5236: Similar addition of a free-standing Vitest job and report-to-pr wiring in the shared workflow.

Suggested labels

area: e2e, area: ci

Suggested reviewers

prekshivyas

Poem

🐰 I flutter through logs by soft moonlight,
Tracing probes and skips with nimble sight,
Sandboxes rise, a PONG in the air,
Artifacts tucked with meticulous care,
I hop away joyful — tests now take flight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the primary objective of the PR: migrating the legacy cloud-inference bash E2E test into a Vitest scenario.
Linked Issues check	✅ Passed	The PR fully satisfies the requirements from issue `#5098`: migrates legacy test-cloud-inference-e2e.sh to Vitest [5098], preserves system boundaries [5098], centralizes support code [5098], documents contract mapping [5098], wires into workflow [5098], and defers legacy deletion [5098].
Out of Scope Changes check	✅ Passed	All changes directly support the cloud-inference test migration: workflow dispatch setup, provider-skip classification for pre-contract failures, Vitest test implementation, support tests, and workflow boundary validation. No unrelated changes detected.

github-actions · 2026-06-12T19:19:17Z

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-advisor-raw-output.txt

github-actions · 2026-06-12T19:19:18Z

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-scenario-advisor-raw-output.txt

github-actions · 2026-06-12T19:23:01Z

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

None.

🔎 Worth checking

PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
- Recommendation: Re-run the PR Review Advisor or perform a manual review.
- Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

None.

Consider writing more tests for

**Runtime validation**: Add or identify targeted runtime/integration validation for the changed behavior; do not report external E2E job pass/fail here. Runtime/sandbox/infrastructure paths need behavioral runtime validation: .github/workflows/e2e-vitest-scenarios.yaml, tools/e2e-scenarios/workflow-boundary.mts.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

github-actions · 2026-06-12T19:32:10Z

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27438223449
Workflow ref: codex/5098-cloud-inference-e2e
Requested scenarios: (default — all supported)
Requested jobs: cloud-inference-vitest
Summary: 1 passed, 1 failed, 22 skipped

Job	Result
cloud-inference-vitest	❌ failure
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
generate-matrix	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

Failed jobs: cloud-inference-vitest. Check run artifacts for logs.

github-actions · 2026-06-12T19:34:39Z

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27438324340
Workflow ref: codex/5098-cloud-inference-e2e
Requested scenarios: (default — all supported)
Requested jobs: cloud-inference-vitest
Summary: 1 passed, 1 failed, 22 skipped

Job	Result
cloud-inference-vitest	❌ failure
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
generate-matrix	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

Failed jobs: cloud-inference-vitest. Check run artifacts for logs.

github-actions · 2026-06-12T19:42:22Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27438539949
Workflow ref: codex/5098-cloud-inference-e2e
Requested scenarios: (default — all supported)
Requested jobs: cloud-inference-vitest
Summary: 2 passed, 0 failed, 22 skipped

Job	Result
cloud-inference-vitest	✅ success
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
generate-matrix	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

test(e2e): migrate cloud inference scenario

5dd88d5

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

cv self-assigned this Jun 12, 2026

test(e2e): cover cloud inference provider skip

f52bd16

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

test(e2e): return from cloud inference skips

fcd5604

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

cv marked this pull request as ready for review June 12, 2026 21:03

cv assigned jyaunches Jun 12, 2026

cv added the v0.0.65 Release target label Jun 13, 2026

cv added 2 commits June 12, 2026 18:04

merge: sync main into cloud inference e2e

d27a43b

Merge remote-tracking branch 'origin/main' into codex/5098-cloud-infe…

574b8bd

…rence-e2e

cv merged commit cf02a94 into main Jun 13, 2026
42 checks passed

cv deleted the codex/5098-cloud-inference-e2e branch June 13, 2026 02:25

cv mentioned this pull request Jun 13, 2026

Epic: Migrate legacy bash E2E into the Vitest E2E system #5098

Open

79 tasks
