Skip to content

refactor(advisors): preload deterministic context as tool results#5386

Merged
cv merged 2 commits into
mainfrom
refactor/advisor-synthetic-context
Jun 13, 2026
Merged

refactor(advisors): preload deterministic context as tool results#5386
cv merged 2 commits into
mainfrom
refactor/advisor-synthetic-context

Conversation

@cv

@cv cv commented Jun 13, 2026

Copy link
Copy Markdown
Collaborator

Summary

Refactors the PR review and E2E advisors to preload deterministic repository/PR context into the Pi SDK session as synthetic tool-call/tool-result pairs. This keeps advisor prompts focused on instructions, makes context boundaries clearer in transcripts, and removes a one-off retired E2E migration ledger rule from the PR review advisor.

Changes

  • Added synthetic tool-result preloading support to tools/advisors/session.mts and persist those messages into advisor session exports.
  • Moved PR review advisor drift, diff, security, validation, metadata, and schema context out of interpolated prompt strings and into specific synthetic tool results.
  • Moved general E2E and Vitest scenario advisor metadata, changed files, diff, and schema context into explicit synthetic tool results.
  • Removed retiredE2eMigrationLedgerChanges and its special-case blocker finding from the PR review advisor.
  • Updated advisor tests and tool README artifact descriptions for the new synthetic context layout.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

Targeted checks passed:

  • npm run typecheck:cli
  • npx vitest run test/pr-review-advisor.test.ts test/e2e-scenario-advisor.test.ts
  • Biome check on changed advisor files

Attempted npx prek run --all-files and npm test; both currently fail in unrelated runtime recovery/secret-boundary tests outside this change set, so their boxes are left unchecked.

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

Release Notes

  • Tests

    • Updated test coverage for E2E and PR review advisors to verify new prompt construction and synthetic context handling.
  • Documentation

    • Updated advisor documentation to clarify how deterministic synthetic tool results are injected into advisor sessions.
  • Refactor

    • Refactored advisors to use deterministic synthetic tool results for context injection instead of embedding metadata directly in prompts.
    • Removed legacy retired E2E migration ledger detection from PR review advisor.

@cv cv self-assigned this Jun 13, 2026
@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1f3d9904-1aa8-4ce6-8e15-f01c55b8c9a6

📥 Commits

Reviewing files that changed from the base of the PR and between 39fd60a and e77c00f.

📒 Files selected for processing (9)
  • test/e2e-scenario-advisor.test.ts
  • test/pr-review-advisor.test.ts
  • tools/advisors/README.md
  • tools/advisors/session.mts
  • tools/e2e-advisor/README.md
  • tools/e2e-advisor/analyze.mts
  • tools/e2e-advisor/scenarios.mts
  • tools/pr-review-advisor/README.md
  • tools/pr-review-advisor/analyze.mts

📝 Walkthrough

Walkthrough

This PR refactors advisor session infrastructure to decouple deterministic context metadata from prompt text through synthetic tool results. The core session plumbing defines synthetic tool result types and injection logic; E2E advisor, scenario advisor, and PR review advisor are updated to build prompt turns with tool-attached metadata/diff/schema; retired E2E migration ledger governance is removed from PR review; tests are updated to validate the new flow.

Changes

Advisor prompt refactoring to synthetic tool results

Layer / File(s) Summary
Session-level synthetic tool-result types and injection
tools/advisors/session.mts, tools/advisors/README.md
New exported types AdvisorSyntheticToolContentType and AdvisorSyntheticToolResult extend AdvisorPromptTurn with optional syntheticToolResults. Session manager is created and wired into createAgentSession; during each turn loop, injectSyntheticToolResults prepends synthetic assistant tool-call and tool-result messages, persists them via sessionManager.appendMessage, and logs injection details. Helper functions sanitize tool names and generate stable unique IDs. Documentation updated to describe deterministic synthetic tool-result preloading.
E2E analyzer prompt-turn refactoring
tools/e2e-advisor/analyze.mts, tools/e2e-advisor/README.md
Replaces inline schema embedding with structured buildPromptTurn() that constructs AdvisorPromptTurn with syntheticToolResults (metadata, changed files, diff, response schema). runReadOnlyAdvisor call updated to pass promptTurns: [promptTurn]. System prompt signature changed to optional schema, relying on synthetic tool schema instead. Prompt artifact written from promptTurn.prompt.
Scenario advisor prompt-turn refactoring
tools/e2e-advisor/scenarios.mts
buildScenarioPromptTurn() now constructs prompt turn with synthetic metadata/changed-files/diff/response-schema tool results; buildPrompt() delegates to it. buildSystemPrompt signature updated to optional schema; instruction text routes JSON schema lookup to synthetic e2e_scenario_response_schema tool result and adds untrusted-evidence guidance. runReadOnlyAdvisor call wired to promptTurns: [promptTurn].
PR review: retired E2E migration ledger cleanup
tools/pr-review-advisor/analyze.mts
Removes RetiredE2eMigrationLedgerChange type, exported findRetiredE2eMigrationLedgerChanges() function, and retired-ledger field from DeterministicReviewContext. collectDeterministicContext no longer computes retired-ledger changes. addDeterministicFindings injection removed. System prompt item #9 updated from legacy ledger governance to prior-comment comparison logic.
PR review: multi-turn synthetic tool context wiring
tools/pr-review-advisor/analyze.mts, tools/pr-review-advisor/README.md
All four prompt turns (drift, security, validation, synthesis) now attach syntheticToolResults with turn-specific context (metadata, diff, security context, validation context, response schema) instead of embedding in prompt text. buildPromptTurns refactored to construct tool results; syntheticToolResult() helper creates typed results. writePromptArtifacts extended to emit per-turn .synthetic-tool-results/ directories with per-tool markdown files. normalizeReviewResult switched to inject only source-of-truth findings. README documents synthetic tool result directories.
E2E scenario advisor test updates
test/e2e-scenario-advisor.test.ts
Imports buildScenarioPromptTurn; user prompt test validates that synthetic tool results are referenced and metadata/diff/schema are not embedded inline. System prompt test verifies non-empty content and schema lookup routed via synthetic tools. buildScenarioPromptTurn assertions validate expected tool names and content fragments.
PR review advisor test updates
test/pr-review-advisor.test.ts
Removes findRetiredE2eMigrationLedgerChanges import and metadata field; adds source-of-truth wording expectations. Strengthened buildPromptTurns assertions verify synthetic tool result presence with specific tool names and content matching. Diff-backtick test validates pr_review_git_diff synthetic tool results. Artifact assertions include synthetic-tool-results files. Removes retired-ledger test cases.

Sequence Diagrams

sequenceDiagram
  participant runReadOnlyAdvisor
  participant injectSyntheticToolResults
  participant session.agent.state.messages
  participant sessionManager
  runReadOnlyAdvisor->>injectSyntheticToolResults: pass turn + syntheticToolResults
  injectSyntheticToolResults->>injectSyntheticToolResults: sanitizeToolName, uniqueToolCallId
  injectSyntheticToolResults->>session.agent.state.messages: append assistant toolCall + toolResult
  injectSyntheticToolResults->>sessionManager: appendMessage (persist synthetic entries)
  runReadOnlyAdvisor->>session.agent.state.messages: append user prompt turn
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5126: Adds retired E2E migration ledger detection and blocker finding handling via findRetiredE2eMigrationLedgerChanges, while this PR removes that function and its related test coverage.
  • NVIDIA/NemoClaw#5170: Modifies tools/pr-review-advisor/analyze.mts system prompt rubric guidance (main PR updates prompt-turn context handling and removes retired-ledger rule; retrieved PR adds Vitest E2E suite simplicity review lens).
  • NVIDIA/NemoClaw#5234: Updates tools/e2e-advisor/scenarios.mts and test/e2e-scenario-advisor.test.ts scenario advisor prompt/system-schema wiring and decision/selector behavior.

Suggested labels

refactor, area: ci, v0.0.65

Suggested reviewers

  • jyaunches

Poem

🐰 Synthetic whispers through the advisor's ear,
No more embedded schemas crammed in here,
Tool results arrive like turnips on a plate,
Context flows with refinement—oh, how great!
Retired ledgers rest, old rules depart,
Prompt turns blossom as a work of art. 🌱

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately summarizes the main architectural change: refactoring advisors to use synthetic tool results for deterministic context preloading rather than embedding it inline.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/advisor-synthetic-context

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-code-quality

github-code-quality Bot commented Jun 13, 2026

Copy link
Copy Markdown

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File e77c00f +/-
nemoclaw/src/se...cret-scanner.ts 100%
nemoclaw/src/commands/slash.ts 100%
nemoclaw/src/li...bprocess-env.ts 100%
nemoclaw/src/bl...eprint/state.ts 98%
nemoclaw/src/onboard/config.ts 98%
nemoclaw/src/bl...int/snapshot.ts 97%
nemoclaw/src/bl...print/runner.ts 95%
nemoclaw/src/co...ration-state.ts 94%
nemoclaw/src/bl...ate-networks.ts 94%
nemoclaw/src/index.ts 94%

TypeScript / code-coverage/cli

The overall coverage in the branch is 44%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File e77c00f +/-
src/lib/state/o...oard-session.ts 90%
src/lib/inference/local.ts 78%
src/lib/sandbox/config.ts 72%
src/lib/inference/nim.ts 72%
src/lib/onboard/preflight.ts 64%
src/lib/state/sandbox.ts 55%
src/lib/onboard...er-gpu-patch.ts 50%
src/lib/actions...licy-channel.ts 49%
src/lib/policy/index.ts 48%
src/lib/onboard.ts 17%

Updated June 13, 2026 09:59 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.

@github-actions

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • None. No existing NemoClaw E2E job is required because the diff only changes CI advisor tooling, advisor documentation, and advisor unit tests. It does not alter runtime/user-facing flows such as installation, onboarding, sandbox lifecycle, credentials, network policy, inference routing, deployment, or real assistant execution.

Optional E2E

  • None.

New E2E recommendations

  • CI advisor synthetic tool-result integration (medium): The PR adds deterministic synthetic tool-result injection into the shared advisor session runner. Existing product E2E jobs would not validate that these synthetic messages are persisted, visible to the model session, and exported correctly without real advisor credentials.
    • Suggested test: Add a focused mocked Pi SDK integration test for runReadOnlyAdvisor that verifies synthetic tool results are appended before each turn, sessionManager receives the messages, raw diagnostics include the injected tool results, and prompt artifact export preserves the synthetic context.

@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • None. Changes are limited to advisor tooling, advisor tests outside test/e2e-scenario/, and documentation. They do not change the Vitest scenario workflow, scenario registry/runtime support, live scenario files, fixtures, manifests, or other surfaces that affect Vitest E2E scenario behavior.

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • None.

@github-actions

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 1 needs attention, 2 worth checking, 0 nice ideas
Top item: Restore retired E2E migration ledger governance

Review findings

🛠️ Needs attention

  • Restore retired E2E migration ledger governance (tools/pr-review-advisor/analyze.mts): The PR removes the deterministic scan and injected blocker for reintroduced retired repo-local E2E migration ledgers. Current review governance still requires blocker reporting when those retired ledgers are added or modified, so removing the source guard lets future reintroductions pass without deterministic enforcement.
    • Recommendation: Keep or replace the deterministic retired-ledger detector and blocker injection for `test/e2e-scenario/migration/legacy-inventory.json` and `test/e2e/docs/parity-inventory.generated.json`, and retain a regression test proving an added or modified retired ledger produces a blocker.
    • Evidence: The diff removes `retiredE2eMigrationLedgerChanges` from deterministic context, deletes `findRetiredE2eMigrationLedgerChanges`, deletes `addDeterministicFindings`, removes the Legacy E2E migration governance prompt rule, and deletes tests that asserted ledger detection/blocker injection.

🔎 Worth checking

  • Source-of-truth review needed: Synthetic tool-result session injection: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `injectSyntheticToolResults` directly mutates `session.agent.state.messages`; existing tests inspect `syntheticToolResults` arrays and artifacts but not the runtime session messages.
  • Synthetic tool-result injection lacks a session-level contract test (tools/advisors/session.mts:390): The new runner mutates `session.agent.state.messages` and persists handcrafted assistant/toolResult messages before each prompt turn. That internal SDK coupling is security-relevant because deterministic diff, metadata, schema, and GitHub context drive the advisor's security and acceptance decisions. If the SDK/provider serializes these messages differently, drops the tool linkage, or treats synthetic content as ordinary assistant text, the advisor may be under-informed or more exposed to prompt-injection content than intended.
    • Recommendation: Add a focused contract test around `runReadOnlyAdvisor` using a fake Pi session/session manager that asserts synthetic assistant tool calls and matching toolResult messages are appended before `session.prompt`, with matching sanitized IDs, expected roles, and preserved content boundaries.
    • Evidence: `injectSyntheticToolResults` constructs assistant `toolCall` and `toolResult` messages and appends them directly to `session.agent.state.messages`; current tests assert prompt-turn data and artifacts, but not the runtime message injection seen by the SDK.

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation** — Mock `createAgentSession`/`SessionManager` and assert `runReadOnlyAdvisor` appends synthetic assistant tool calls plus matching toolResult messages before `session.prompt` for every turn.. The PR changes advisor runtime/session infrastructure and deterministic context delivery. Existing tests cover prompt builders and artifact layout, but not the runtime Pi SDK message-injection contract or the removed deterministic governance guard.
  • **Runtime validation** — Verify duplicate, empty, and unsafe synthetic `toolName`/`toolCallId` values are sanitized and de-duplicated while preserving assistant call to tool-result linkage.. The PR changes advisor runtime/session infrastructure and deterministic context delivery. Existing tests cover prompt builders and artifact layout, but not the runtime Pi SDK message-injection contract or the removed deterministic governance guard.
  • **Runtime validation** — Verify synthetic prompt artifacts safely fence content containing backticks and prompt-injection-like text.. The PR changes advisor runtime/session infrastructure and deterministic context delivery. Existing tests cover prompt builders and artifact layout, but not the runtime Pi SDK message-injection contract or the removed deterministic governance guard.
  • **Runtime validation** — Restore or replace the retired E2E migration ledger regression test that asserts an added or modified retired ledger produces a blocker finding.. The PR changes advisor runtime/session infrastructure and deterministic context delivery. Existing tests cover prompt builders and artifact layout, but not the runtime Pi SDK message-injection contract or the removed deterministic governance guard.
  • **Restore retired E2E migration ledger governance** — Keep or replace the deterministic retired-ledger detector and blocker injection for `test/e2e-scenario/migration/legacy-inventory.json` and `test/e2e/docs/parity-inventory.generated.json`, and retain a regression test proving an added or modified retired ledger produces a blocker.
  • **Retired E2E migration ledger governance** — A PR review advisor test should feed a diff that adds or modifies `test/e2e-scenario/migration/legacy-inventory.json` and assert a blocker finding is injected.. The diff deletes `findRetiredE2eMigrationLedgerChanges`, the `retiredE2eMigrationLedgerChanges` deterministic context field, deterministic blocker injection, and the tests that covered those paths.
  • **Synthetic tool-result session injection** — Mock `createAgentSession` and `SessionManager` to verify synthetic tool calls/results are appended before each `session.prompt`, with matching sanitized IDs and content preserved as tool-result text.. `injectSyntheticToolResults` directly mutates `session.agent.state.messages`; existing tests inspect `syntheticToolResults` arrays and artifacts but not the runtime session messages.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@cv cv merged commit 158f575 into main Jun 13, 2026
45 checks passed
@cv cv deleted the refactor/advisor-synthetic-context branch June 13, 2026 10:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant