feat: inform LLM about runId background task in prompt_sandbox #257

Merged
sweetmantech merged 3 commits into test from
sweetmantech/myc-4399-api-tools-prompt_sandbox-inform-llm-of-runid-and-fromsandbox
Mar 4, 2026

Conversation

@sweetmantech
Contributor

@sweetmantech sweetmantech commented Mar 4, 2026

Summary

  • Updated prompt_sandbox tool description to explain runId behavior to the LLM
  • When a fresh sandbox triggers a background task (no snapshot), the tool returns a runId with empty output
  • The LLM was unaware of this and would try to interpret/summarize the empty result
  • Now the description explicitly tells the LLM to inform the user their request is processing and results will appear in the task progress view
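
The behavior described above can be sketched from the caller's side. This is a minimal illustration, not the repository's code: the `PromptSandboxResult` shape, the `output` field name, and both helper functions are assumptions made for the example. Only the `runId` field and the "empty output while the background task runs" behavior come from this PR.

```typescript
// Hypothetical result shape. Only `runId` is named in the PR; `output` is an assumed field name.
interface PromptSandboxResult {
  output: string;
  runId?: string;
}

type ResultKind = "background-task" | "streamed";

// A non-empty runId signals that the command was dispatched to a background task.
function classifyResult(result: PromptSandboxResult): ResultKind {
  return result.runId !== undefined && result.runId !== ""
    ? "background-task"
    : "streamed";
}

// For background tasks, report status instead of interpreting the empty output.
function userFacingMessage(result: PromptSandboxResult): string {
  if (classifyResult(result) === "background-task") {
    return (
      "Your request is being processed in the sandbox. " +
      "Results will appear in the task progress view."
    );
  }
  return result.output;
}
```

The point of the sketch is that an empty `output` is never summarized when a `runId` is present; the updated tool description asks the LLM to make the same distinction.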

Test plan

  • All 1261 tests pass (1 pre-existing failure unrelated to this change)
  • Verify LLM no longer tries to summarize empty output when runId is returned

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation
    • Expanded tool description to clarify behavior when a task ID is present: sandbox/setup and background task dispatch, cases of empty immediate output (do not summarize or interpret), and guidance that results will appear in the task progress view.

When the sandbox is being set up for the first time, prompt_sandbox
dispatches the command to a background task and returns a runId with
empty output. The LLM was unaware of this and would try to interpret
the empty result. Updated the tool description to explain that runId
means a background task is running and the UI shows live progress.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@vercel
Contributor

vercel bot commented Mar 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
recoup-api | Ready | Preview | Mar 4, 2026 4:08pm

@coderabbitai

coderabbitai bot commented Mar 4, 2026

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)
  • lib/chat/tools/__tests__/createPromptSandboxStreamingTool.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 56d404e1-276e-49c9-bdbd-3adf0d54561d

Walkthrough

The public tool description for createPromptSandboxStreamingTool has been extended to clarify behavior when a runId is present, explaining that the tool dispatches background tasks and outputs appear in the progress view rather than streaming output.

Changes

Cohort / File(s) | Summary
Tool Documentation: lib/chat/tools/createPromptSandboxStreamingTool.ts | Extended the description property with guidance on background-task behavior when a runId is returned: the tool dispatches a background task, may produce empty streaming output, and results should be viewed in the UI progress/task view rather than summarized from the streaming output.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

A prompt’s note now sings more clear,
When runId comes, the path is near.
Background work hums out of sight,
Progress shows the final light—
Small words, big clarity, cheers! 🎉

🚥 Pre-merge checks | ❌ 1

❌ Failed checks (1 warning)

Check name: Solid & Clean Code
Status: ⚠️ Warning
Explanation: Tool description violates Single Responsibility Principle by bundling five distinct concerns into one monolithic 260+-word string, and contains a factually inaccurate UI claim contradicting the backend-only codebase.
Resolution: Refactor to separate concerns: remove the false UI claim, extract LLM-specific guidance into a separate constant, and simplify the description to focus on tool purpose and essential technical details for backend appropriateness.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
lib/chat/tools/createPromptSandboxStreamingTool.ts (1)

55-60: Consider extracting the background task guidance to a named constant.

The description is becoming verbose with multiple concatenated sentences. For consistency with SANDBOX_PROMPT_NOTE (lines 5-11) and improved maintainability, consider extracting the runId guidance to a dedicated constant.

♻️ Suggested refactor
 export const SANDBOX_PROMPT_NOTE =
   "IMPORTANT: When you make changes to any files inside the orgs/ directory, " +
   "always commit and push those changes directly to main so they are preserved and shared across sessions.\n\n" +
   "IMPORTANT: When a prompt includes attached file URLs (e.g. from email attachments), " +
   "always download the files first using curl and save them locally before referencing them. " +
   "These URLs are temporary and expire after 1 hour. Never store the download URL directly in files — " +
   "download the content, save it to the appropriate location in the repo, and reference the local path instead.";

+export const SANDBOX_BACKGROUND_TASK_NOTE =
+  "IMPORTANT: When the result contains a `runId`, it means the sandbox is being set up for the first time " +
+  "and the command was dispatched to a background task. The output will be empty because the task is still running. " +
+  "The UI automatically shows a live progress view for background tasks — do NOT summarize or interpret the empty output. " +
+  "Simply tell the user their request is being processed in the sandbox and the results will appear in the task progress view above. " +
+  "Do NOT automatically poll or check the task status — instead, let the user know they can ask you to check on it whenever they want.";
+
 const promptSandboxSchema = z.object({

Then use it in the description:

     description:
       "Send a prompt to the agent running in the artist's persistent sandbox environment. " +
       "This is your primary tool — use it for release management (creating, updating, or reviewing releases), " +
       "file operations, data analysis, content generation, and any multi-step task. " +
       "The sandbox has skills for managing RELEASE.md documents, generating deliverables, and more. " +
       "Reuses the account's existing running sandbox or creates one from the latest snapshot. " +
-      "Streams output in real-time. " +
-      "IMPORTANT: When the result contains a `runId`, it means the sandbox is being set up for the first time " +
-      "and the command was dispatched to a background task. The output will be empty because the task is still running. " +
-      "The UI automatically shows a live progress view for background tasks — do NOT summarize or interpret the empty output. " +
-      "Simply tell the user their request is being processed in the sandbox and the results will appear in the task progress view above. " +
-      "Do NOT automatically poll or check the task status — instead, let the user know they can ask you to check on it whenever they want.",
+      "Streams output in real-time. " +
+      SANDBOX_BACKGROUND_TASK_NOTE,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/chat/tools/createPromptSandboxStreamingTool.ts` around lines 55 - 60,
Extract the long runId/background-task guidance from the description into a
named constant (e.g., RUNID_BACKGROUND_TASK_NOTE) and reuse it in
createPromptSandboxStreamingTool's description string instead of concatenating
those sentences inline; keep the text identical, place the new constant near
SANDBOX_PROMPT_NOTE for consistency, export it if other modules need it, and
replace the concatenated block in the description with a reference to
RUNID_BACKGROUND_TASK_NOTE.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 27d5b8f3-c60c-4a92-aae9-d2f41577e86b

📥 Commits

Reviewing files that changed from the base of the PR and between 2ae53f2 and 1b23496.

📒 Files selected for processing (1)
  • lib/chat/tools/createPromptSandboxStreamingTool.ts

Comment on lines +56 to +59
"IMPORTANT: When the result contains a `runId`, it means the sandbox is being set up for the first time " +
"and the command was dispatched to a background task. The output will be empty because the task is still running. " +
"The UI automatically shows a live progress view for background tasks — do NOT summarize or interpret the empty output. " +
"Simply tell the user their request is being processed in the sandbox and the results will appear in the task progress view above. " +

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for UI components that handle runId or task progress views

echo "=== Searching for runId handling in UI/app components ==="
rg -n 'runId' --type=ts --type=tsx | grep -E '(app/|components/)' | head -20

echo ""
echo "=== Searching for task progress view components ==="
rg -n -i 'progress.*view|task.*progress|background.*task' --type=ts --type=tsx | grep -E '(app/|components/)' | head -20

echo ""
echo "=== Searching for Trigger.dev run subscription patterns ==="
rg -n 'useRealtimeRun|subscribeToRun|task.*status' --type=ts --type=tsx | head -20

echo ""
echo "=== Checking app directory structure ==="
find app -type f -name "*.ts" -o -name "*.tsx" | head -20

Repository: recoupable/api

Length of output: 975


🏁 Script executed:

#!/bin/bash
# Search for UI components in other common locations

echo "=== Checking for components directory ==="
find . -type d -name "components" | head -10

echo ""
echo "=== Checking for pages directory ==="
find . -type d -name "pages" | head -10

echo ""
echo "=== Search for runId in all TypeScript/JavaScript files ==="
rg 'runId' --type=ts | head -30

echo ""
echo "=== Check createPromptSandboxStreamingTool.ts for imports and context ==="
head -100 lib/chat/tools/createPromptSandboxStreamingTool.ts | grep -E '^import|^export|runId|progress'

echo ""
echo "=== Verify: does promptSandboxStreaming return runId? ==="
rg -A10 'return.*runId|runId.*:' lib/ | head -40

Repository: recoupable/api

Length of output: 6782


Remove or revise the inaccurate claim about the UI progress view.

The description references "The UI automatically shows a live progress view for background tasks," but this repository is a backend API-only codebase with no frontend UI components. The claim will mislead the LLM and ultimately confuse users about where results will appear.

Revise to:

"IMPORTANT: When the result contains a `runId`, it means the sandbox is being set up for the first time " +
"and the command was dispatched to a background task. The output will be empty because the task is still running. " +
"Do NOT summarize or interpret the empty output. " +
"Simply tell the user their request is being processed in the sandbox and the results will appear once the task completes."

The runId behavior itself is correctly documented based on the backend implementation, but remove references to UI components that don't exist in this codebase.
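
A check like the one below could guard against UI references creeping back into the description. It is only a sketch: `UI_CLAIM_PATTERNS` and `findUiClaims` are hypothetical names, and the pattern list is an assumption based on the two phrases flagged in this comment. The strings are copied from the original claim and the suggested revision above.

```typescript
// Hypothetical patterns for wording that assumes a frontend; the list is an assumption.
const UI_CLAIM_PATTERNS: RegExp[] = [/\bUI\b/, /progress view/i];

// Returns the source of every pattern that matches the given description text.
function findUiClaims(description: string): string[] {
  return UI_CLAIM_PATTERNS
    .filter((pattern) => pattern.test(description))
    .map((pattern) => pattern.source);
}

// The claim flagged in this review, quoted from the original description.
const originalClaim =
  "The UI automatically shows a live progress view for background tasks";

// The revised wording suggested in this review.
const revisedNote =
  "IMPORTANT: When the result contains a `runId`, it means the sandbox is being set up for the first time " +
  "and the command was dispatched to a background task. The output will be empty because the task is still running. " +
  "Do NOT summarize or interpret the empty output. " +
  "Simply tell the user their request is being processed in the sandbox and the results will appear once the task completes.";

console.log(findUiClaims(originalClaim).length); // both patterns match
console.log(findUiClaims(revisedNote).length); // no pattern matches
```

Whether such a check belongs in the test suite depends on whether a separate frontend really does render a progress view for runs, which this repository alone cannot confirm.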

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/chat/tools/createPromptSandboxStreamingTool.ts` around lines 56 - 59,
Update the long instruction string in createPromptSandboxStreamingTool (the
multi-line message built around the `runId` behavior) to remove the inaccurate
"UI automatically shows a live progress view" claim and replace it with the
provided wording: state that when a `runId` is present the sandbox is being set
up, the output will be empty while the background task runs, do NOT summarize
the empty output, and tell the user the request is being processed and results
will appear once the task completes; locate the string construction in
createPromptSandboxStreamingTool.ts (the block concatenating multiple quoted
segments) and substitute the revised text exactly as suggested.

@sweetmantech sweetmantech merged commit 0d37cbc into test Mar 4, 2026
3 checks passed