test(e2e): centralize fake OpenAI-compatible server #5373

Merged
cv merged 9 commits into main from codex/fake-openai-compatible-e2e
Jun 13, 2026

Conversation

@cv

@cv cv commented Jun 13, 2026

Collaborator

Summary

Centralizes the generic OpenAI-compatible fake provider used by E2E tests so shell and Vitest scenarios share the same canned inference behavior. Also broadens Biome coverage to include .mts helpers and formats the existing .mts tool scripts.

Changes

  • Add a shared .mts fake OpenAI-compatible API server with models, chat completions, responses, auth, streaming, and request logging support.
  • Add shell and Vitest launch helpers and update the double-onboard, token-rotation, gateway-port, gateway-upgrade, and crash-loop E2Es to use them.
  • Include **/*.mts in Biome and apply formatting/linting to existing .mts files.
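
As a rough illustration of the kind of server the first bullet describes, here is a minimal sketch covering only the models route, the chat completions route, bearer auth, and JSONL request logging. It is not the PR's implementation: `createFakeOpenAiServer`, `FakeServerOptions`, and all option names are hypothetical, and streaming and `/v1/responses` are left out.

```typescript
import { appendFileSync } from "node:fs";
import http from "node:http";

// Hypothetical options for this sketch; the PR's server is configured
// through FAKE_OPENAI_* environment variables instead.
interface FakeServerOptions {
  apiKey?: string; // when set, requests must send "Authorization: Bearer <apiKey>"
  model: string; // model id advertised by GET /v1/models
  chatContent: string; // canned assistant reply for chat completions
  logPath?: string; // JSONL file that receives one line per request
}

function createFakeOpenAiServer(options: FakeServerOptions): http.Server {
  return http.createServer((req, res) => {
    const chunks: Buffer[] = [];
    req.on("data", (chunk: Buffer) => chunks.push(chunk));
    req.on("end", () => {
      const body = Buffer.concat(chunks).toString("utf8");
      if (options.logPath) {
        // Log method, path, and body only; the Authorization header is not written.
        const entry = { method: req.method, path: req.url, body };
        appendFileSync(options.logPath, `${JSON.stringify(entry)}\n`);
      }
      const send = (status: number, payload: unknown): void => {
        res.writeHead(status, { "content-type": "application/json" });
        res.end(JSON.stringify(payload));
      };
      if (options.apiKey && req.headers.authorization !== `Bearer ${options.apiKey}`) {
        send(401, { error: { message: "invalid api key" } });
        return;
      }
      if (req.method === "GET" && req.url === "/v1/models") {
        send(200, { object: "list", data: [{ id: options.model, object: "model" }] });
        return;
      }
      if (req.method === "POST" && req.url === "/v1/chat/completions") {
        send(200, {
          id: "chatcmpl-fake",
          object: "chat.completion",
          model: options.model,
          choices: [
            {
              index: 0,
              message: { role: "assistant", content: options.chatContent },
              finish_reason: "stop",
            },
          ],
        });
        return;
      }
      send(404, { error: { message: `no route for ${req.method} ${req.url}` } });
    });
  });
}
```

Listening on port 0 (`server.listen(0, "127.0.0.1")`) lets the operating system pick a free port, which keeps parallel test runs from colliding.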

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

  • Chores

    • General code formatting and minor config updates (including TypeScript module file inclusion).
    • Small script cleanups and readability improvements.
  • Tests

    • Added a reusable OpenAI-compatible mock server and helper scripts for E2E tests.
    • Updated E2E tests and test scripts to use the shared mock and improved startup/teardown.
    • Increased a test timeout for stability.
  • CI

    • Updated nightly E2E workflow to use a Node-based build/setup before tests.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv cv self-assigned this Jun 13, 2026
@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f0d62e9b-9d61-423c-928b-f5f402b91148

📥 Commits

Reviewing files that changed from the base of the PR and between 142caa0 and d2bf57e.

📒 Files selected for processing (1)
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
✅ Files skipped from review due to trivial changes (1)
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts

📝 Walkthrough

Walkthrough

Consolidates a reusable fake OpenAI-compatible server (Node + TS fixture + shell helpers), migrates tests and E2E scripts to use it, updates PR-review advisor overlap/comment logic, adjusts nightly E2E provisioning to explicit Node setup, and applies formatting-only refactors across tools.

Changes

E2E fake-server consolidation and advisor updates

| Layer / File(s) | Summary |
| --- | --- |
| Shared fake OpenAI-compatible server and fixture infrastructure: biome.json, test/e2e-scenario/fixtures/fake-openai-compatible.ts, test/e2e/lib/fake-openai-compatible-api.mts, test/e2e/lib/openai-compatible-api-proof.sh | Adds TypeScript fixture contracts and implementation that spawns a Node fake OpenAI endpoint, waits for readiness by probing /v1/models, records requests to JSONL, exposes requests()/close(), and adds shell helpers start_fake_openai_compatible_api / stop_fake_openai_compatible_api. |
| Live tests and E2E shell scripts migrated to shared fake server: test/e2e-scenario/live/double-onboard.test.ts, test/e2e-scenario/live/token-rotation.test.ts, test/e2e/test-double-onboard.sh, test/e2e/test-concurrent-gateway-ports.sh, test/e2e/test-issue-2478-crash-loop-recovery.sh, test/e2e/test-openshell-gateway-upgrade.sh | Removes inline fake-server implementations (Node/http and Python), rewires tests/scripts to use the shared fixture/helper, and adapts request-artifact writing and cleanup to the fixture API and FAKE_OPENAI_* env variables. |
| CI workflow Node setup for explicit tooling provisioning: .github/workflows/nightly-e2e.yaml | Replaces prior "Install NemoClaw via install.sh" steps for two E2E jobs with explicit Node 22 setup, npm install, CLI build, and OpenShell installation via scripts/install-openshell.sh (with specified env vars unset). |
| PR review advisor GitHub API, overlap computation, and comment follow-up refinements: tools/advisors/github.mts, tools/pr-review-advisor/analyze.mts, tools/pr-review-advisor/comment.mts, tools/pr-review-advisor/workflow-boundary.mts | Enriches GitHub REST/GraphQL error handling, adds bounded-concurrency overlap computation and refined sorting of monolith deltas, and changes testing follow-ups to derive “tests” from result.findings (category "tests") instead of test-depth verdicts. |
| E2E advisor module formatting and import reorganization: tools/e2e-advisor/analyze.mts, tools/e2e-advisor/comment.mts, tools/e2e-advisor/dispatch.mts, tools/e2e-advisor/scenario-comment.mts, tools/e2e-advisor/scenarios.mts | Reformats imports and function bodies into multi-line layouts, reorganizes helper lambdas, and preserves existing gating/normalization behavior. |
| OpenClaw scripts, messaging applier, and advisor utilities formatting refactors: scripts/generate-openclaw-config.mts, scripts/patch-openclaw-slack-deny-feedback.mts, src/lib/messaging/applier/build/messaging-build-applier.mts, tools/advisors/git.mts, tools/advisors/io.mts, tools/advisors/json.mts, tools/advisors/session.mts | Non-functional formatting and small quote/style changes; multi-line rewrites of expressions and error string normalizations without behavior changes. |

Sequence Diagram(s)

sequenceDiagram
  participant TestCase as Test Case
  participant Fixture as startFakeOpenAiCompatibleServer (fixture)
  participant APIServer as fake-openai-compatible-api.mts (Node server)
  participant RequestLog as requests.jsonl

  TestCase->>Fixture: start({port, auth, chatContent, ...})
  Fixture->>APIServer: spawn child process with FAKE_OPENAI_* env
  Fixture->>APIServer: poll GET /v1/models until ready
  APIServer-->>Fixture: READY + bound port
  Fixture-->>TestCase: return baseUrl, requests(), close()

  TestCase->>APIServer: POST /v1/chat/completions or /v1/responses
  APIServer->>RequestLog: append JSONL request metadata

  TestCase->>Fixture: fake.requests()
  Fixture->>RequestLog: parse JSONL
  Fixture-->>TestCase: array of typed requests

  TestCase->>Fixture: fake.close()
  Fixture->>APIServer: SIGTERM
  APIServer-->>Fixture: exit
  Fixture->>Fixture: cleanup temp files
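
The request-log half of the diagram (append JSONL, then parse it back for `requests()`) could look roughly like the sketch below. The `LoggedRequest` shape and both helper names are assumptions made for illustration; this page does not show which fields the real fixture records.

```typescript
import { appendFileSync, existsSync, readFileSync } from "node:fs";

// Hypothetical record shape; the real fixture's logged fields are not shown here.
interface LoggedRequest {
  method: string;
  path: string;
  body: string;
}

// Append one request as a single JSON line.
function appendRequest(logPath: string, request: LoggedRequest): void {
  appendFileSync(logPath, `${JSON.stringify(request)}\n`);
}

// Read the log back into typed records; a missing file means nothing was recorded yet.
function readRequests(logPath: string): LoggedRequest[] {
  if (!existsSync(logPath)) return [];
  return readFileSync(logPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as LoggedRequest);
}
```

One JSON object per line means the server process can append without rewriting the file, and the test process can read the log at any point without coordinating with the server.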

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

area: e2e, chore, area: ci

Suggested reviewers

  • jyaunches

Poem

🐰 I spawned a server for tests to attend,

logs in JSONL, requests to befriend.
Tests now reuse one tidy fixture scene,
CI gets Node, the workflows stay clean.
Rabbity cheers — the e2e carrots gleam!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 3.26% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'test(e2e): centralize fake OpenAI-compatible server' directly and concisely summarizes the main change: consolidating a fake OpenAI-compatible server into a shared fixture used across E2E tests. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@cv cv added the v0.0.65 Release target label Jun 13, 2026

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 2

🧹 Nitpick comments (1)
tools/pr-review-advisor/analyze.mts (1)

773-808: ⚡ Quick win

Bound overlap file-fetch concurrency to avoid GitHub API burst throttling.

Line 773 launches up to 80 overlap checks concurrently, and each may call another paginated files API. This burst can hit secondary rate limits and silently under-report overlaps when the inner catch falls back to sameFiles = [].

Suggested refactor
-  const overlaps = await Promise.all(
-    openPulls
-      .filter((pull) => getPath<number>(pull, ["number"]) !== currentPrNumber)
-      .slice(0, 80)
-      .map(async (pull): Promise<OpenPrOverlap | null> => {
+  const overlaps: Array<OpenPrOverlap | null> = [];
+  for (const pull of openPulls
+    .filter((item) => getPath<number>(item, ["number"]) !== currentPrNumber)
+    .slice(0, 80)) {
+    overlaps.push(
+      await (async (): Promise<OpenPrOverlap | null> => {
         // existing body unchanged
-      }),
-  );
+      })(),
+    );
+  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/pr-review-advisor/analyze.mts` around lines 773 - 808, The overlap
checks currently run up to 80 concurrent tasks (the Promise.all over
openPulls.slice(0, 80)) and each task may call githubRestPaginated, which can
trigger GitHub secondary rate limits and cause sameFiles to silently fall back
to empty; change the implementation to bound concurrency (e.g., use a
concurrency limiter like p-limit or an explicit worker pool) when mapping
openPulls to the overlap check so only a small number (e.g., 4–8) of
githubRestPaginated calls run at once; update the code around overlaps/openPulls
mapping (the async mapper that computes number, title, body, labels,
linkedIssues, duplicateLinkedIssues, sameFiles) to schedule each mapper through
the limiter and await Promise.all on the limited tasks instead of launching all
80 at once.
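
The explicit worker pool mentioned in the prompt can be sketched without a dependency such as p-limit. This is an illustration under assumptions, not the code that landed in analyze.mts: `mapWithConcurrency` is a hypothetical helper name.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight.
// Results are returned in input order, like Promise.all over a map.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item, shared by all runners

  const runWorker = async (): Promise<void> => {
    // Claiming an index is synchronous, so two runners never take the same item.
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  };

  const poolSize = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: poolSize }, () => runWorker()));
  return results;
}
```

With a helper like this, the overlap mapper could run as `await mapWithConcurrency(candidates, 6, computeOverlap)`, where `candidates` and `computeOverlap` stand in for the filtered pull list and the existing async mapper body. A rejected worker still rejects the whole call, so the per-item error handling inside the mapper stays as it is.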
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/fixtures/fake-openai-compatible.ts`:
- Around line 57-87: The readiness probe currently hardcodes "127.0.0.1" inside
canReachModels which ignores the configured host from
startFakeOpenAiCompatibleServer; modify canReachModels to accept a host
parameter and use that host instead of "127.0.0.1", update waitForReady to pass
the configured host through when calling canReachModels (thread the host from
wherever waitForReady is invoked, e.g., startFakeOpenAiCompatibleServer), and
ensure the http.get options use the provided host so readiness checks honor
non-loopback/IPv6 bindings.

In `@test/e2e/test-issue-2478-crash-loop-recovery.sh`:
- Around line 142-151: The preflight in
test/e2e/test-issue-2478-crash-loop-recovery.sh is incorrectly requiring python3
even though the compatible mock is started by node in
start_fake_openai_compatible_api (in
test/e2e/lib/openai-compatible-api-proof.sh); update the preflight checks to
verify the presence of node (and optionally that node supports the required
flag, e.g., by running `node --version` or a no-op with
`--experimental-strip-types`) instead of python3, leave the existing curl
readiness check intact, and ensure any environment-variable gating for
USE_COMPAT_MOCK=1 still triggers this node check.
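
For the first inline comment, a host-aware readiness probe might look like the sketch below. The names `canReachModels` and `waitForReady` come from the review comment, but the signatures, the timeouts, and the choice to treat only HTTP 200 as ready are assumptions; a server with auth enabled would also need the probe to send credentials.

```typescript
import http from "node:http";

// Resolve true when GET /v1/models on the given host and port answers with HTTP 200.
function canReachModels(host: string, port: number, timeoutMs = 1000): Promise<boolean> {
  return new Promise((resolve) => {
    const req = http.get({ host, port, path: "/v1/models", timeout: timeoutMs }, (res) => {
      res.resume(); // drain the body so the socket is released
      resolve(res.statusCode === 200);
    });
    req.on("timeout", () => {
      req.destroy();
      resolve(false);
    });
    req.on("error", () => resolve(false)); // connection refused, reset, and similar
  });
}

// Poll the configured host until the probe succeeds or the deadline passes.
async function waitForReady(host: string, port: number, deadlineMs = 10_000): Promise<void> {
  const deadline = Date.now() + deadlineMs;
  while (Date.now() < deadline) {
    if (await canReachModels(host, port)) return;
    await new Promise((resolve) => setTimeout(resolve, 100));
  }
  throw new Error(`fake server at ${host}:${port} not ready after ${deadlineMs} ms`);
}
```

Passing the host through both functions means a server bound to `::1` or another non-loopback address is probed where it actually listens, instead of always at 127.0.0.1.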


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1d82198e-a9aa-4f7a-9d34-8d5e98823ab3

📥 Commits

Reviewing files that changed from the base of the PR and between ef8e43b and 11f7ae8.

📒 Files selected for processing (27)
  • biome.json
  • scripts/generate-openclaw-config.mts
  • scripts/patch-openclaw-slack-deny-feedback.mts
  • src/lib/messaging/applier/build/messaging-build-applier.mts
  • test/e2e-scenario/fixtures/fake-openai-compatible.ts
  • test/e2e-scenario/live/double-onboard.test.ts
  • test/e2e-scenario/live/token-rotation.test.ts
  • test/e2e/lib/fake-openai-compatible-api.mts
  • test/e2e/lib/openai-compatible-api-proof.sh
  • test/e2e/test-concurrent-gateway-ports.sh
  • test/e2e/test-double-onboard.sh
  • test/e2e/test-issue-2478-crash-loop-recovery.sh
  • test/e2e/test-openshell-gateway-upgrade.sh
  • tools/advisors/git.mts
  • tools/advisors/github.mts
  • tools/advisors/io.mts
  • tools/advisors/json.mts
  • tools/advisors/session.mts
  • tools/e2e-advisor/analyze.mts
  • tools/e2e-advisor/comment.mts
  • tools/e2e-advisor/dispatch.mts
  • tools/e2e-advisor/scenario-comment.mts
  • tools/e2e-advisor/scenarios.mts
  • tools/e2e-scenarios/workflow-boundary.mts
  • tools/pr-review-advisor/analyze.mts
  • tools/pr-review-advisor/comment.mts
  • tools/pr-review-advisor/workflow-boundary.mts

Comment thread test/e2e-scenario/fixtures/fake-openai-compatible.ts Outdated
Comment thread test/e2e/test-issue-2478-crash-loop-recovery.sh
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

github-actions Bot commented Jun 13, 2026

Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-advisor-raw-output.txt

@github-actions

github-actions Bot commented Jun 13, 2026

Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-scenario-advisor-raw-output.txt

@github-actions

github-actions Bot commented Jun 13, 2026

Contributor

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
    • Recommendation: Re-run the PR Review Advisor or perform a manual review.
    • Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation**: Add or identify targeted runtime/integration validation for the changed behavior; do not report external E2E job pass/fail here. Runtime/sandbox/infrastructure paths need behavioral runtime validation: .github/workflows/nightly-e2e.yaml, biome.json, scripts/generate-openclaw-config.mts, scripts/patch-openclaw-slack-deny-feedback.mts, src/lib/messaging/applier/build/messaging-build-applier.mts, tools/advisors/git.mts, tools/advisors/github.mts, tools/advisors/io.mts.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

cv added 5 commits June 12, 2026 19:01
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
…patible-e2e

# Conflicts:
#	tools/e2e-scenarios/workflow-boundary.mts
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 27454997874
Target ref: codex/fake-openai-compatible-e2e
Workflow ref: main
Requested jobs: issue-2478-crash-loop-recovery-e2e,openshell-gateway-upgrade-e2e,double-onboard-e2e,concurrent-gateway-ports-e2e
Summary: 0 passed, 2 failed, 2 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| concurrent-gateway-ports-e2e | ❌ failure |
| double-onboard-e2e | ❌ failure |
| issue-2478-crash-loop-recovery-e2e | ⚠️ cancelled |
| openshell-gateway-upgrade-e2e | ⚠️ cancelled |

Failed jobs: concurrent-gateway-ports-e2e, double-onboard-e2e. Check run artifacts for logs.

@github-actions

Contributor

Selective E2E Results — ⚠️ Run cancelled — no signal

Run: 27455166298
Target ref: codex/fake-openai-compatible-e2e
Workflow ref: main
Requested jobs: double-onboard-e2e,concurrent-gateway-ports-e2e
Summary: 0 passed, 0 failed, 2 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| concurrent-gateway-ports-e2e | ⚠️ cancelled |
| double-onboard-e2e | ⚠️ cancelled |

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27454997804
Workflow ref: codex/fake-openai-compatible-e2e
Requested scenarios: (default — all supported)
Requested jobs: double-onboard-vitest,token-rotation-vitest
Summary: 3 passed, 0 failed, 28 skipped

| Job | Result |
| --- | --- |
| cloud-inference-vitest | ⏭️ skipped |
| common-egress-agent-vitest | ⏭️ skipped |
| credential-migration-vitest | ⏭️ skipped |
| credential-sanitization-vitest | ⏭️ skipped |
| double-onboard-vitest | ✅ success |
| gateway-drift-preflight-vitest | ⏭️ skipped |
| gateway-guard-recovery | ⏭️ skipped |
| generate-matrix | ✅ success |
| hermes-e2e-vitest | ⏭️ skipped |
| hermes-root-entrypoint-smoke-vitest | ⏭️ skipped |
| inference-routing-vitest | ⏭️ skipped |
| issue-4434-tui-unreachable-inference-vitest | ⏭️ skipped |
| launchable-smoke-vitest | ⏭️ skipped |
| live-scenarios | ⏭️ skipped |
| messaging-providers-vitest | ⏭️ skipped |
| model-router-provider-routed-inference-vitest | ⏭️ skipped |
| network-policy-vitest | ⏭️ skipped |
| onboard-negative-paths-vitest | ⏭️ skipped |
| openclaw-inference-switch-vitest | ⏭️ skipped |
| openclaw-skill-cli-vitest | ⏭️ skipped |
| openclaw-tui-chat-correlation-vitest | ⏭️ skipped |
| openshell-version-pin-vitest | ⏭️ skipped |
| rebuild-openclaw-vitest | ⏭️ skipped |
| runtime-overrides-vitest | ⏭️ skipped |
| sandbox-rebuild-vitest | ⏭️ skipped |
| sandbox-survival-vitest | ⏭️ skipped |
| sessions-agents-cli-vitest | ⏭️ skipped |
| shields-config-vitest | ⏭️ skipped |
| skill-agent-vitest | ⏭️ skipped |
| state-backup-restore-vitest | ⏭️ skipped |
| token-rotation-vitest | ✅ success |

@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27455188119
Target ref: codex/fake-openai-compatible-e2e
Workflow ref: codex/fake-openai-compatible-e2e
Requested jobs: issue-2478-crash-loop-recovery-e2e,openshell-gateway-upgrade-e2e,double-onboard-e2e,concurrent-gateway-ports-e2e
Summary: 4 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| concurrent-gateway-ports-e2e | ✅ success |
| double-onboard-e2e | ✅ success |
| issue-2478-crash-loop-recovery-e2e | ✅ success |
| openshell-gateway-upgrade-e2e | ✅ success |

@cv cv merged commit f08d878 into main Jun 13, 2026
43 checks passed
@cv cv deleted the codex/fake-openai-compatible-e2e branch June 13, 2026 04:28

Labels

v0.0.65 Release target


1 participant