refactor(advisors): preload deterministic context as tool results by cv · Pull Request #5386 · NVIDIA/NemoClaw

cv · 2026-06-13T09:55:41Z

Summary

Refactors the PR review and E2E advisors to preload deterministic repository/PR context into the Pi SDK session as synthetic tool-call/tool-result pairs. This keeps advisor prompts focused on instructions, makes context boundaries clearer in transcripts, and removes a one-off retired E2E migration ledger rule from the PR review advisor.

Changes

Added synthetic tool-result preloading support to tools/advisors/session.mts and persist those messages into advisor session exports.
Moved PR review advisor drift, diff, security, validation, metadata, and schema context out of interpolated prompt strings and into specific synthetic tool results.
Moved general E2E and Vitest scenario advisor metadata, changed files, diff, and schema context into explicit synthetic tool results.
Removed retiredE2eMigrationLedgerChanges and its special-case blocker finding from the PR review advisor.
Updated advisor tests and tool README artifact descriptions for the new synthetic context layout.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

Targeted checks passed:

npm run typecheck:cli
npx vitest run test/pr-review-advisor.test.ts test/e2e-scenario-advisor.test.ts
Biome check on changed advisor files

Attempted npx prek run --all-files and npm test; both currently fail in unrelated runtime recovery/secret-boundary tests outside this change set, so their boxes are left unchecked.

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
npm run docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

Release Notes

Tests
- Updated test coverage for E2E and PR review advisors to verify new prompt construction and synthetic context handling.
Documentation
- Updated advisor documentation to clarify how deterministic synthetic tool results are injected into advisor sessions.
Refactor
- Refactored advisors to use deterministic synthetic tool results for context injection instead of embedding metadata directly in prompts.
- Removed legacy retired E2E migration ledger detection from PR review advisor.

coderabbitai · 2026-06-13T09:55:53Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1f3d9904-1aa8-4ce6-8e15-f01c55b8c9a6

📥 Commits

Reviewing files that changed from the base of the PR and between 39fd60a and e77c00f.

📒 Files selected for processing (9)

test/e2e-scenario-advisor.test.ts
test/pr-review-advisor.test.ts
tools/advisors/README.md
tools/advisors/session.mts
tools/e2e-advisor/README.md
tools/e2e-advisor/analyze.mts
tools/e2e-advisor/scenarios.mts
tools/pr-review-advisor/README.md
tools/pr-review-advisor/analyze.mts

📝 Walkthrough

Walkthrough

This PR refactors advisor session infrastructure to decouple deterministic context metadata from prompt text through synthetic tool results. The core session plumbing defines synthetic tool result types and injection logic; E2E advisor, scenario advisor, and PR review advisor are updated to build prompt turns with tool-attached metadata/diff/schema; retired E2E migration ledger governance is removed from PR review; tests are updated to validate the new flow.

Changes

Advisor prompt refactoring to synthetic tool results

Layer / File(s)	Summary
Session-level synthetic tool-result types and injection `tools/advisors/session.mts`, `tools/advisors/README.md`	New exported types `AdvisorSyntheticToolContentType` and `AdvisorSyntheticToolResult` extend `AdvisorPromptTurn` with optional `syntheticToolResults`. Session manager is created and wired into `createAgentSession`; during each turn loop, `injectSyntheticToolResults` prepends synthetic assistant tool-call and tool-result messages, persists them via `sessionManager.appendMessage`, and logs injection details. Helper functions sanitize tool names and generate stable unique IDs. Documentation updated to describe deterministic synthetic tool-result preloading.
E2E analyzer prompt-turn refactoring `tools/e2e-advisor/analyze.mts`, `tools/e2e-advisor/README.md`	Replaces inline schema embedding with structured `buildPromptTurn()` that constructs `AdvisorPromptTurn` with `syntheticToolResults` (metadata, changed files, diff, response schema). `runReadOnlyAdvisor` call updated to pass `promptTurns: [promptTurn]`. System prompt signature changed to optional schema, relying on synthetic tool schema instead. Prompt artifact written from `promptTurn.prompt`.
Scenario advisor prompt-turn refactoring `tools/e2e-advisor/scenarios.mts`	`buildScenarioPromptTurn()` now constructs prompt turn with synthetic metadata/changed-files/diff/response-schema tool results; `buildPrompt()` delegates to it. `buildSystemPrompt` signature updated to optional schema; instruction text routes JSON schema lookup to synthetic `e2e_scenario_response_schema` tool result and adds untrusted-evidence guidance. `runReadOnlyAdvisor` call wired to `promptTurns: [promptTurn]`.
PR review: retired E2E migration ledger cleanup `tools/pr-review-advisor/analyze.mts`	Removes `RetiredE2eMigrationLedgerChange` type, exported `findRetiredE2eMigrationLedgerChanges()` function, and retired-ledger field from `DeterministicReviewContext`. `collectDeterministicContext` no longer computes retired-ledger changes. `addDeterministicFindings` injection removed. System prompt item `#9` updated from legacy ledger governance to prior-comment comparison logic.
PR review: multi-turn synthetic tool context wiring `tools/pr-review-advisor/analyze.mts`, `tools/pr-review-advisor/README.md`	All four prompt turns (drift, security, validation, synthesis) now attach `syntheticToolResults` with turn-specific context (metadata, diff, security context, validation context, response schema) instead of embedding in prompt text. `buildPromptTurns` refactored to construct tool results; `syntheticToolResult()` helper creates typed results. `writePromptArtifacts` extended to emit per-turn `.synthetic-tool-results/` directories with per-tool markdown files. `normalizeReviewResult` switched to inject only source-of-truth findings. README documents synthetic tool result directories.
E2E scenario advisor test updates `test/e2e-scenario-advisor.test.ts`	Imports `buildScenarioPromptTurn`; user prompt test validates that synthetic tool results are referenced and metadata/diff/schema are not embedded inline. System prompt test verifies non-empty content and schema lookup routed via synthetic tools. `buildScenarioPromptTurn` assertions validate expected tool names and content fragments.
PR review advisor test updates `test/pr-review-advisor.test.ts`	Removes `findRetiredE2eMigrationLedgerChanges` import and metadata field; adds source-of-truth wording expectations. Strengthened `buildPromptTurns` assertions verify synthetic tool result presence with specific tool names and content matching. Diff-backtick test validates `pr_review_git_diff` synthetic tool results. Artifact assertions include synthetic-tool-results files. Removes retired-ledger test cases.

Sequence Diagrams

sequenceDiagram
  participant runReadOnlyAdvisor
  participant injectSyntheticToolResults
  participant session.agent.state.messages
  participant sessionManager
  runReadOnlyAdvisor->>injectSyntheticToolResults: pass turn + syntheticToolResults
  injectSyntheticToolResults->>injectSyntheticToolResults: sanitizeToolName, uniqueToolCallId
  injectSyntheticToolResults->>session.agent.state.messages: append assistant toolCall + toolResult
  injectSyntheticToolResults->>sessionManager: appendMessage (persist synthetic entries)
  runReadOnlyAdvisor->>session.agent.state.messages: append user prompt turn

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

NVIDIA/NemoClaw#5126: Adds retired E2E migration ledger detection and blocker finding handling via findRetiredE2eMigrationLedgerChanges, while this PR removes that function and its related test coverage.
NVIDIA/NemoClaw#5170: Modifies tools/pr-review-advisor/analyze.mts system prompt rubric guidance (main PR updates prompt-turn context handling and removes retired-ledger rule; retrieved PR adds Vitest E2E suite simplicity review lens).
NVIDIA/NemoClaw#5234: Updates tools/e2e-advisor/scenarios.mts and test/e2e-scenario-advisor.test.ts scenario advisor prompt/system-schema wiring and decision/selector behavior.

Suggested labels

refactor, area: ci, v0.0.65

Suggested reviewers

jyaunches

Poem

🐰 Synthetic whispers through the advisor's ear,
No more embedded schemas crammed in here,
Tool results arrive like turnips on a plate,
Context flows with refinement—oh, how great!
Retired ledgers rest, old rules depart,
Prompt turns blossom as a work of art. 🌱

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and accurately summarizes the main architectural change: refactoring advisors to use synthetic tool results for deterministic context preloading rather than embedding it inline.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch refactor/advisor-synthetic-context

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-code-quality · 2026-06-13T09:57:13Z

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the refactor/advisor-syn... branch is 96%. Coverage data for the main branch is not yet available.

Show a code coverage summary of the most covered files.

File	main	refactor/advisor-syn... `e77c00f`	+/-
`nemoclaw/src/se...cret-scanner.ts`	—	100%	—
`nemoclaw/src/commands/slash.ts`	—	100%	—
`nemoclaw/src/li...bprocess-env.ts`	—	100%	—
`nemoclaw/src/bl...eprint/state.ts`	—	98%	—
`nemoclaw/src/onboard/config.ts`	—	98%	—
`nemoclaw/src/bl...int/snapshot.ts`	—	97%	—
`nemoclaw/src/bl...print/runner.ts`	—	95%	—
`nemoclaw/src/co...ration-state.ts`	—	94%	—
`nemoclaw/src/bl...ate-networks.ts`	—	94%	—
`nemoclaw/src/index.ts`	—	94%	—

TypeScript / code-coverage/cli

The overall coverage in the refactor/advisor-syn... branch is 44%. Coverage data for the main branch is not yet available.

Show a code coverage summary of the most covered files.

File	main	refactor/advisor-syn... `e77c00f`	+/-
`src/lib/state/o...oard-session.ts`	—	90%	—
`src/lib/inference/local.ts`	—	78%	—
`src/lib/sandbox/config.ts`	—	72%	—
`src/lib/inference/nim.ts`	—	72%	—
`src/lib/onboard/preflight.ts`	—	64%	—
`src/lib/state/sandbox.ts`	—	55%	—
`src/lib/onboard...er-gpu-patch.ts`	—	50%	—
`src/lib/actions...licy-channel.ts`	—	49%	—
`src/lib/policy/index.ts`	—	48%	—
`src/lib/onboard.ts`	—	17%	—

_{Updated June 13, 2026 09:59 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.}

github-actions · 2026-06-13T09:57:20Z

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

None. No existing NemoClaw E2E job is required because the diff only changes CI advisor tooling, advisor documentation, and advisor unit tests. It does not alter runtime/user-facing flows such as installation, onboarding, sandbox lifecycle, credentials, network policy, inference routing, deployment, or real assistant execution.

Optional E2E

None.

New E2E recommendations

CI advisor synthetic tool-result integration (medium): The PR adds deterministic synthetic tool-result injection into the shared advisor session runner. Existing product E2E jobs would not validate that these synthetic messages are persisted, visible to the model session, and exported correctly without real advisor credentials.
- Suggested test: Add a focused mocked Pi SDK integration test for runReadOnlyAdvisor that verifies synthetic tool results are appended before each turn, sessionManager receives the messages, raw diagnostics include the injected tool results, and prompt artifact export preserves the synthetic context.

github-actions · 2026-06-13T09:57:21Z

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

None. Changes are limited to advisor tooling, advisor tests outside test/e2e-scenario/, and documentation. They do not change the Vitest scenario workflow, scenario registry/runtime support, live scenario files, fixtures, manifests, or other surfaces that affect Vitest E2E scenario behavior.

Optional Vitest E2E scenarios

None.

Relevant changed files

None.

github-actions · 2026-06-13T09:59:33Z

PR Review Advisor

Findings: 1 needs attention, 2 worth checking, 0 nice ideas
Top item: Restore retired E2E migration ledger governance

Review findings

🛠️ Needs attention

Restore retired E2E migration ledger governance (tools/pr-review-advisor/analyze.mts): The PR removes the deterministic scan and injected blocker for reintroduced retired repo-local E2E migration ledgers. Current review governance still requires blocker reporting when those retired ledgers are added or modified, so removing the source guard lets future reintroductions pass without deterministic enforcement.
- Recommendation: Keep or replace the deterministic retired-ledger detector and blocker injection for `test/e2e-scenario/migration/legacy-inventory.json` and `test/e2e/docs/parity-inventory.generated.json`, and retain a regression test proving an added or modified retired ledger produces a blocker.
- Evidence: The diff removes `retiredE2eMigrationLedgerChanges` from deterministic context, deletes `findRetiredE2eMigrationLedgerChanges`, deletes `addDeterministicFindings`, removes the Legacy E2E migration governance prompt rule, and deletes tests that asserted ledger detection/blocker injection.

🔎 Worth checking

Source-of-truth review needed: Synthetic tool-result session injection: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: `injectSyntheticToolResults` directly mutates `session.agent.state.messages`; existing tests inspect `syntheticToolResults` arrays and artifacts but not the runtime session messages.
Synthetic tool-result injection lacks a session-level contract test (tools/advisors/session.mts:390): The new runner mutates `session.agent.state.messages` and persists handcrafted assistant/toolResult messages before each prompt turn. That internal SDK coupling is security-relevant because deterministic diff, metadata, schema, and GitHub context drive the advisor's security and acceptance decisions. If the SDK/provider serializes these messages differently, drops the tool linkage, or treats synthetic content as ordinary assistant text, the advisor may be under-informed or more exposed to prompt-injection content than intended.
- Recommendation: Add a focused contract test around `runReadOnlyAdvisor` using a fake Pi session/session manager that asserts synthetic assistant tool calls and matching toolResult messages are appended before `session.prompt`, with matching sanitized IDs, expected roles, and preserved content boundaries.
- Evidence: `injectSyntheticToolResults` constructs assistant `toolCall` and `toolResult` messages and appends them directly to `session.agent.state.messages`; current tests assert prompt-turn data and artifacts, but not the runtime message injection seen by the SDK.

🌱 Nice ideas

None.

Consider writing more tests for

**Runtime validation** — Mock `createAgentSession`/`SessionManager` and assert `runReadOnlyAdvisor` appends synthetic assistant tool calls plus matching toolResult messages before `session.prompt` for every turn.. The PR changes advisor runtime/session infrastructure and deterministic context delivery. Existing tests cover prompt builders and artifact layout, but not the runtime Pi SDK message-injection contract or the removed deterministic governance guard.
**Runtime validation** — Verify duplicate, empty, and unsafe synthetic `toolName`/`toolCallId` values are sanitized and de-duplicated while preserving assistant call to tool-result linkage.. The PR changes advisor runtime/session infrastructure and deterministic context delivery. Existing tests cover prompt builders and artifact layout, but not the runtime Pi SDK message-injection contract or the removed deterministic governance guard.
**Runtime validation** — Verify synthetic prompt artifacts safely fence content containing backticks and prompt-injection-like text.. The PR changes advisor runtime/session infrastructure and deterministic context delivery. Existing tests cover prompt builders and artifact layout, but not the runtime Pi SDK message-injection contract or the removed deterministic governance guard.
**Runtime validation** — Restore or replace the retired E2E migration ledger regression test that asserts an added or modified retired ledger produces a blocker finding.. The PR changes advisor runtime/session infrastructure and deterministic context delivery. Existing tests cover prompt builders and artifact layout, but not the runtime Pi SDK message-injection contract or the removed deterministic governance guard.
**Restore retired E2E migration ledger governance** — Keep or replace the deterministic retired-ledger detector and blocker injection for `test/e2e-scenario/migration/legacy-inventory.json` and `test/e2e/docs/parity-inventory.generated.json`, and retain a regression test proving an added or modified retired ledger produces a blocker.
**Retired E2E migration ledger governance** — A PR review advisor test should feed a diff that adds or modifies `test/e2e-scenario/migration/legacy-inventory.json` and assert a blocker finding is injected.. The diff deletes `findRetiredE2eMigrationLedgerChanges`, the `retiredE2eMigrationLedgerChanges` deterministic context field, deterministic blocker injection, and the tests that covered those paths.
**Synthetic tool-result session injection** — Mock `createAgentSession` and `SessionManager` to verify synthetic tool calls/results are appended before each `session.prompt`, with matching sanitized IDs and content preserved as tool-result text.. `injectSyntheticToolResults` directly mutates `session.agent.state.messages`; existing tests inspect `syntheticToolResults` arrays and artifacts but not the runtime session messages.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

refactor(advisors): preload deterministic context as tool results

2588aa3

cv self-assigned this Jun 13, 2026

Merge branch 'main' into refactor/advisor-synthetic-context

e77c00f

cv merged commit 158f575 into main Jun 13, 2026
45 checks passed

cv deleted the refactor/advisor-synthetic-context branch June 13, 2026 10:04

Conversation

cv commented Jun 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Type of Change

Verification

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

github-code-quality Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage Overview

TypeScript / code-coverage/plugin

TypeScript / code-coverage/cli

Uh oh!

github-actions Bot commented Jun 13, 2026

E2E Advisor Recommendation

E2E Recommendation Advisor

Required E2E

Optional E2E

New E2E recommendations

Uh oh!

github-actions Bot commented Jun 13, 2026

Vitest E2E Scenario Recommendation

Vitest E2E Scenario Advisor

Required Vitest E2E scenarios

Optional Vitest E2E scenarios

Relevant changed files

Uh oh!

github-actions Bot commented Jun 13, 2026

PR Review Advisor

🛠️ Needs attention

🔎 Worth checking

🌱 Nice ideas

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

cv commented Jun 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 13, 2026 •

edited

Loading

github-code-quality Bot commented Jun 13, 2026 •

edited

Loading