test(e2e): migrate sessions agents cli scenario#5363

Merged
cv merged 4 commits into main from codex/5098-sessions-agents-cli on Jun 13, 2026

Conversation

@cv

@cv cv commented Jun 12, 2026

Collaborator

Summary

Migrates test/e2e/test-sessions-agents-cli.sh into a focused live Vitest scenario. The new coverage preserves the host-side nemoclaw <name> sessions and agents CLI contracts, including live credential gating, OpenClaw pairing/scope approval handling, JSON-envelope assertions, cleanup, and rate-limit-aware pre-contract skips.
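A rate-limit-aware pre-contract skip can be pictured as a small classifier that runs on the onboarding result before any contract assertion. The sketch below is illustrative only: `classifyOnboarding`, `PreContractOutcome`, and the marker patterns are hypothetical, since the PR description does not list what the scenario actually matches.

```typescript
// Hypothetical outcome type: proceed to contract assertions, skip, or fail the run.
type PreContractOutcome = "proceed" | "skip" | "fail";

// Assumed markers; the PR does not state which patterns it treats as a rate limit.
const RATE_LIMIT_MARKERS: RegExp[] = [/\b429\b/, /rate[ -]?limit/i, /too many requests/i];

// Classifies an onboarding attempt before any contract assertion runs.
function classifyOnboarding(exitCode: number, output: string): PreContractOutcome {
  if (exitCode === 0) {
    return "proceed";
  }
  // A failure that looks like upstream throttling is treated as a skip, not a regression.
  return RATE_LIMIT_MARKERS.some((marker) => marker.test(output)) ? "skip" : "fail";
}
```

The point of classifying before the contract checks is that a throttled provider never shows up as a failed CLI contract.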

Related Issue

Refs #5098

Changes

  • Added test/e2e-scenario/live/sessions-agents-cli.test.ts for the live sessions/agents CLI migration.
  • Wired sessions-agents-cli-vitest into .github/workflows/e2e-vitest-scenarios.yaml and the free-standing scenario inventory.
  • Updated workflow support tests to cover the new selector and give the growing free-standing selector loop an explicit timeout.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

Targeted checks passed: npm run build:cli, cd nemoclaw && npm ci --ignore-scripts && npm run build, workflow support Vitest under e2e-vitest-support and cli, live-test gated smoke without NVIDIA_API_KEY, Biome lint, CLI typecheck, test-size/source-shape checks, and git diff --check. File-scoped prek, commit, and push hooks were attempted. The broad test-cli hook hit unrelated local flakes and timeouts in existing CLI tests, so only that hook was skipped for commit and push, after the targeted verification above.

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

  • Tests

    • Added live end-to-end Vitest coverage for sessions and agents CLI flows, including onboarding, session/agent lifecycle, JSON output parsing, retry handling, and cleanup validations.
  • Chores

    • Added a dedicated CI job to run the new live Vitest scenario and included its result in PR reporting.
    • Extended workflow validation and selector checks for the new test scenario.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv cv self-assigned this Jun 12, 2026
@copy-pr-bot

copy-pr-bot Bot commented Jun 12, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a live Vitest scenario testing NemoClaw sessions/agents CLI flows with robust output parsing and scope-recovery, a GitHub Actions job to run it (with Node/Docker/OpenShell setup and artifact upload), PR reporting integration, and selector/boundary test updates.

Changes

Sessions and Agents CLI E2E Vitest Scenario

| Layer / File(s) | Summary |
| --- | --- |
| E2E test setup, execution helpers, and teardown (test/e2e-scenario/live/sessions-agents-cli.test.ts) | Imports, constants, environment helpers, ANSI stripping utilities, execution wrappers for CLI commands with artifact collection and timeouts, OpenShell availability checks with conditional installation, and best-effort sandbox teardown. |
| Output parsing and scope recovery logic (test/e2e-scenario/live/sessions-agents-cli.test.ts) | JSON envelope parsing from noisy CLI output, onboarding rate-limit detection, session key and request ID extraction, and scope-upgrade recovery by approving pending device pairings and retrying gateway RPCs. |
| Main test scenario and agent lifecycle (test/e2e-scenario/live/sessions-agents-cli.test.ts) | Live scenario gated by shouldRunLiveE2EScenarios(), prerequisite validation, non-interactive onboarding, main and secondary agent seeding, sessions and agents JSON command validation, session reset/delete with key integrity checks, and agent deletion. |
| Workflow job definition and PR reporting (.github/workflows/e2e-vitest-scenarios.yaml) | sessions-agents-cli-vitest job with Node 22, CLI build, Docker Hub authentication with fallback, OpenShell installation, Vitest execution, and artifact uploads. Updates report-to-pr job needs array to include the new job. |
| Workflow boundary configuration and test validation (tools/e2e-scenarios/workflow-boundary.mts, test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts) | Workflow boundary validator call for sessions-agents-cli-vitest job selector condition. Extended test assertions for both scenarios-based and jobs-based dispatch inputs, verifying valid dispatch, empty registry scenarios, and correct job selection. |
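The output-parsing layer summarized in the walkthrough (ANSI stripping plus JSON envelope extraction from noisy CLI output) could look roughly like the sketch below. It is not the PR's code: the names `stripAnsi` and `parseJsonEnvelope` follow the walkthrough, but their signatures and the scanning strategy here are assumptions.

```typescript
// Matches common CSI escape sequences such as "\u001b[32m" or "\u001b[0K".
const ANSI_CSI = /\u001b\[[0-9;?]*[ -\/]*[@-~]/g;

function stripAnsi(text: string): string {
  return text.replace(ANSI_CSI, "");
}

// Returns the first balanced {...} span in the output that parses as a JSON object,
// or null when no such span exists. Assumed behavior, not the PR's implementation.
function parseJsonEnvelope(output: string): Record<string, unknown> | null {
  const clean = stripAnsi(output);
  for (let start = clean.indexOf("{"); start !== -1; start = clean.indexOf("{", start + 1)) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < clean.length; i++) {
      const ch = clean[i];
      if (inString) {
        // Braces inside string literals must not affect the depth counter.
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === "{") depth++;
      else if (ch === "}" && --depth === 0) {
        try {
          const parsed: unknown = JSON.parse(clean.slice(start, i + 1));
          if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
            return parsed as Record<string, unknown>;
          }
        } catch {
          // Not valid JSON; move on to the next "{" candidate.
        }
        break;
      }
    }
  }
  return null;
}
```

Scanning for a balanced span rather than calling `JSON.parse` on the whole output keeps log lines printed before or after the envelope from breaking the assertion.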

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5243: Related free-standing Vitest job selector and workflow-dispatch plumbing that this PR extends.
  • NVIDIA/NemoClaw#5330: Selector/inventory validator work touching the same workflow-boundary and dispatch logic.
  • NVIDIA/NemoClaw#5231: Adds another free-standing Vitest job and similar selector/test updates in the same subsystem.

Suggested labels

area: e2e

Suggested reviewers

  • prekshivyas

Poem

🐰 A tiny rabbit hops to test,
Strips the ANSI, finds JSON best,
Pairs the devices, retries the call,
Cleans the sandbox — nothing tall.
Cheers for CI, artifacts, and rest!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: migrating a legacy shell E2E test for sessions/agents CLI into a Vitest scenario, which aligns with all four files modified (workflow config, new test, test support, and workflow boundary validator). |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-advisor-raw-output.txt

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-scenario-advisor-raw-output.txt

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
    • Recommendation: Re-run the PR Review Advisor or perform a manual review.
    • Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

  • None.
Consider writing more tests for
  • Runtime validation: Add or identify targeted runtime/integration validation for the changed behavior; do not report external E2E job pass/fail here. Runtime/sandbox/infrastructure paths need behavioral runtime validation: .github/workflows/e2e-vitest-scenarios.yaml, tools/e2e-scenarios/workflow-boundary.mts.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27438223490
Workflow ref: codex/5098-sessions-agents-cli
Requested scenarios: (default — all supported)
Requested jobs: sessions-agents-cli-vitest
Summary: 2 passed, 0 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ✅ success
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

@cv cv marked this pull request as ready for review June 12, 2026 21:03

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

🧹 Nitpick comments (1)
tools/e2e-scenarios/workflow-boundary.mts (1)

2112-2112: 📐 Maintainability & Code Quality | ⚡ Quick win

Add a dedicated contract validator for sessions-agents-cli-vitest.

This new line only enforces selector shape; it doesn’t lock down job-level invariants (secret exposure boundaries, expected run command, artifact path/name). Adding a validateSessionsAgentsCliVitestJob(...) (as done for other free-standing jobs) will prevent silent workflow drift.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-scenarios/workflow-boundary.mts` at line 2112, The PR adds a
selector-only check via validateFreeStandingJobSelector for
"sessions-agents-cli-vitest" but lacks a full job-level contract validator; add
a new validateSessionsAgentsCliVitestJob(errors, jobs,
"sessions-agents-cli-vitest") function (modeled after other free-standing job
validators) and call it alongside or immediately after
validateFreeStandingJobSelector to enforce secrets boundaries, expected run
command, artifact path/name and other invariants; implement the validator to
locate the "sessions-agents-cli-vitest" job in jobs and assert its
secrets/exposed variables, the exact run/steps structure, and artifact
naming/paths consistent with project conventions.
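
A job-level contract validator of the kind requested here might be sketched as follows. Everything in it is hypothetical: the `WorkflowJob` and `JobContract` shapes, the generic `validateJobContract` name, and the three invariants are stand-ins, because the real types and conventions in workflow-boundary.mts are not shown in this thread.

```typescript
// Hypothetical, simplified view of a parsed GitHub Actions job.
interface WorkflowStep {
  name?: string;
  run?: string;
  uses?: string;
  with?: Record<string, unknown>;
}
interface WorkflowJob {
  env?: Record<string, string>;
  steps?: WorkflowStep[];
}
// Hypothetical contract; the actual invariants are project-specific.
interface JobContract {
  runCommandIncludes: string; // substring the test step's run script must contain
  artifactName: string; // expected `with.name` of the upload step
  forbiddenJobEnv: string[]; // secret names that must not appear in job-level env
}

// Appends one message per violated invariant instead of throwing,
// mirroring the (errors, jobs, jobName) calling style named in the comment above.
function validateJobContract(
  errors: string[],
  jobs: Record<string, WorkflowJob>,
  jobName: string,
  contract: JobContract,
): void {
  const job = jobs[jobName];
  if (!job) {
    errors.push(`${jobName}: job is missing`);
    return;
  }
  for (const name of contract.forbiddenJobEnv) {
    if (job.env && name in job.env) {
      errors.push(`${jobName}: ${name} must not be set at job level`);
    }
  }
  const steps = job.steps ?? [];
  if (!steps.some((step) => step.run?.includes(contract.runCommandIncludes))) {
    errors.push(`${jobName}: no step runs "${contract.runCommandIncludes}"`);
  }
  if (!steps.some((step) => step.with?.["name"] === contract.artifactName)) {
    errors.push(`${jobName}: no step uploads artifact "${contract.artifactName}"`);
  }
}
```

A dedicated `validateSessionsAgentsCliVitestJob` could then be a thin wrapper that passes this job's expected values, so a later workflow edit that drops the upload step or changes the run command is reported rather than silently accepted.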
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/live/sessions-agents-cli.test.ts`:
- Around line 140-158: The current cleanupSandbox function swallows all errors
via bestEffort and is reused for setup preconditions, which masks failures;
split it into two helpers: rename the existing function to
bestEffortCleanupSandbox (keep its calls to runNemoclaw([... "destroy" ...]) and
host.command("openshell", ["sandbox", "delete", ...]) wrapped with bestEffort)
and create a new strict cleanupSandbox that runs the same two operations without
bestEffort so failures propagate (i.e., do not catch/suppress errors) — update
callers used during setup to call the strict cleanupSandbox and only use
bestEffortCleanupSandbox for final teardown; align naming with the fixture
client’s bestEffortCleanupSandbox in host.ts to avoid confusion.
- Around line 520-562: The current verification uses "sessions list --agent" and
parseJsonEnvelope to infer agent deletion, which is unreliable; replace that
block to call runNemoclaw with [SANDBOX_NAME, "agents", "list", "--json"] (use
the same apiKey/options pattern as deleteAgent), then parse the JSON envelope
(use parseJsonEnvelope) and assert that no agent entry matches TEST_AGENT_ID
(similar to how sessionEntries(...) was used earlier); remove the
deletedAgentStillVisible boolean/exitCode inversion logic and instead fail if
the agents-list command returns exitCode !== 0 or if the parsed agents list
contains TEST_AGENT_ID. Ensure you reference the existing variables deleteAgent,
TEST_AGENT_ID, runNemoclaw, and parseJsonEnvelope when implementing the
replacement.
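
The two inline suggestions above can be illustrated with a small sketch. It assumes synchronous, injected command runners for brevity; `CommandResult`, `Operation`, `Runner`, `assertAgentDeleted`, and the `{ agents: [{ id }] }` envelope shape are hypothetical, and the real `runNemoclaw` and `host.command` helpers may have different signatures.

```typescript
// Hypothetical shapes for illustration only.
interface CommandResult {
  exitCode: number;
  stdout: string;
}
interface Operation {
  label: string;
  run: () => CommandResult;
}
type Runner = (args: string[]) => CommandResult;

// Strict variant for setup preconditions: the first failing operation throws.
function cleanupSandbox(operations: Operation[]): void {
  for (const operation of operations) {
    const result = operation.run();
    if (result.exitCode !== 0) {
      throw new Error(`${operation.label} failed (exit ${result.exitCode})`);
    }
  }
}

// Lenient variant for final teardown: every operation is attempted,
// and failures are collected instead of thrown.
function bestEffortCleanupSandbox(operations: Operation[]): string[] {
  const problems: string[] = [];
  for (const operation of operations) {
    try {
      cleanupSandbox([operation]);
    } catch (error) {
      problems.push(error instanceof Error ? error.message : String(error));
    }
  }
  return problems;
}

// Deletion check based on `agents list --json` rather than an inverted exit code.
function assertAgentDeleted(run: Runner, sandboxName: string, agentId: string): void {
  const result = run([sandboxName, "agents", "list", "--json"]);
  if (result.exitCode !== 0) {
    throw new Error(`agents list failed (exit ${result.exitCode})`);
  }
  // Assumed envelope shape; noisy output would need envelope extraction first.
  const envelope = JSON.parse(result.stdout) as { agents?: Array<{ id?: string }> };
  if ((envelope.agents ?? []).some((agent) => agent.id === agentId)) {
    throw new Error(`agent ${agentId} is still listed after deletion`);
  }
}
```

Keeping the strict function as the primitive and wrapping it for teardown means both paths run the same operations and differ only in whether failures propagate.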

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 96fa9be4-316a-45d4-951e-c269bd5a3f0c

📥 Commits

Reviewing files that changed from the base of the PR and between 7f583ed and f001c41.

📒 Files selected for processing (5)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/sessions-agents-cli.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/free-standing-jobs.env
  • tools/e2e-scenarios/workflow-boundary.mts

Comment thread test/e2e-scenario/live/sessions-agents-cli.test.ts
Comment thread test/e2e-scenario/live/sessions-agents-cli.test.ts
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27444257786
Workflow ref: codex/5098-sessions-agents-cli
Requested scenarios: sessions-agents-cli
Requested jobs: (default — all free-standing when no scenarios are requested)
Summary: 2 passed, 0 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ✅ success
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

@cv cv added the v0.0.65 Release target label Jun 13, 2026

@coderabbitai coderabbitai Bot left a comment

Contributor

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tools/e2e-scenarios/workflow-boundary.mts (1)

2156-2156: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update the shared secret denylist to include NVIDIA_INFERENCE_API_KEY.

This new selector mapping means sessions-agents-cli-vitest is enforced via shared boundary checks, but the shared denylist still keys on NVIDIA_API_KEY. That misses NVIDIA_INFERENCE_API_KEY exposure for free-standing jobs without dedicated per-job validators.

Suggested fix
 const COMMON_SECRET_ENV_NAMES = [
-  "NVIDIA_API_KEY",
+  "NVIDIA_INFERENCE_API_KEY",
+  "NVIDIA_API_KEY",
   "DOCKERHUB_USERNAME",
   "DOCKERHUB_TOKEN",
   "GITHUB_TOKEN",
 ];
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-scenarios/workflow-boundary.mts` at line 2156, The shared-boundary
denylist for free-standing job validation is missing the new secret key; update
the shared denylist used by the free-standing validator so that
NVIDIA_INFERENCE_API_KEY is included alongside NVIDIA_API_KEY. Modify the code
path referenced by validateFreeStandingJobSelector (the shared
denylist/validator used for "sessions-agents-cli-vitest" /
"sessions-agents-cli") to add "NVIDIA_INFERENCE_API_KEY" to the denylist of
secret keys checked during shared boundary validation so free-standing jobs are
correctly blocked from exposing that secret.
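
To make the gap concrete, here is a minimal sketch of a denylist scan. The list contents come from the suggested fix above; the `findExposedSecrets` helper and the idea that the shared check inspects job-level env keys are assumptions, since the shared validator itself is not quoted in this review.

```typescript
// Names taken from the suggested fix in the review comment.
const COMMON_SECRET_ENV_NAMES: readonly string[] = [
  "NVIDIA_INFERENCE_API_KEY",
  "NVIDIA_API_KEY",
  "DOCKERHUB_USERNAME",
  "DOCKERHUB_TOKEN",
  "GITHUB_TOKEN",
];

// Hypothetical check: returns the denylisted names that a job sets in its env block.
function findExposedSecrets(
  jobEnv: Record<string, string> | undefined,
  denylist: readonly string[] = COMMON_SECRET_ENV_NAMES,
): string[] {
  if (!jobEnv) {
    return [];
  }
  return denylist.filter((name) => Object.prototype.hasOwnProperty.call(jobEnv, name));
}
```

With a list that omits NVIDIA_INFERENCE_API_KEY, a job exposing only that variable yields an empty result, which is the blind spot the comment describes.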

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 467d7b17-5d74-4a5c-853f-b9de470208d4

📥 Commits

Reviewing files that changed from the base of the PR and between 5aea403 and cf5bdd7.

📒 Files selected for processing (3)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • .github/workflows/e2e-vitest-scenarios.yaml

@cv cv merged commit 9982bf8 into main Jun 13, 2026
53 of 56 checks passed
@cv cv deleted the codex/5098-sessions-agents-cli branch June 13, 2026 02:31