feat: inform LLM about runId background task in prompt_sandbox by sweetmantech · Pull Request #257 · recoupable/api

sweetmantech · 2026-03-04T15:43:12Z

Summary

- Updated the prompt_sandbox tool description to explain runId behavior to the LLM
- When a fresh sandbox triggers a background task (no snapshot), the tool returns a runId with empty output
- The LLM was unaware of this and would try to interpret or summarize the empty result
- The description now explicitly tells the LLM to inform the user that their request is processing and that results will appear in the task progress view
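
The behavior described above can be sketched from the caller's side. This is a minimal illustration, not code from this PR: `PromptSandboxResult`, `NextStep`, and `interpretResult` are hypothetical names, and the only details taken from the description are that a background dispatch returns a `runId` together with empty output.

```typescript
// Hypothetical result shape. Only `runId` and the empty output are taken
// from the PR description; the field names are assumptions.
interface PromptSandboxResult {
  output: string;
  runId?: string;
}

type NextStep =
  | { kind: "background"; runId: string; message: string }
  | { kind: "completed"; output: string };

// Decide how to respond without ever summarizing an empty background result.
function interpretResult(result: PromptSandboxResult): NextStep {
  if (result.runId) {
    return {
      kind: "background",
      runId: result.runId,
      message:
        "Your request is being processed in the sandbox. " +
        "Ask me to check on it whenever you like.",
    };
  }
  return { kind: "completed", output: result.output };
}
```

Branching on the presence of `runId`, rather than on whether the output is empty, keeps a legitimately empty command result from being mistaken for a background dispatch.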

Test plan

- 1261 tests pass (1 pre-existing failure, unrelated to this change)
- Verify the LLM no longer tries to summarize empty output when a runId is returned

🤖 Generated with Claude Code

Summary by CodeRabbit

Documentation
- Expanded tool description to clarify behavior when a task ID is present: sandbox/setup and background task dispatch, cases of empty immediate output (do not summarize or interpret), and guidance that results will appear in the task progress view.

When the sandbox is being set up for the first time, prompt_sandbox dispatches the command to a background task and returns a runId with empty output. The LLM was unaware of this and would try to interpret the empty result. Updated the tool description to explain that runId means a background task is running and the UI shows live progress. Co-Authored-By: Claude Opus 4.6 <[email protected]>

vercel · 2026-03-04T15:43:20Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
recoup-api	Ready	Preview	Mar 4, 2026 4:08pm

coderabbitai · 2026-03-04T15:43:32Z

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)

lib/chat/tools/__tests__/createPromptSandboxStreamingTool.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 56d404e1-276e-49c9-bdbd-3adf0d54561d

Walkthrough

The public tool description for createPromptSandboxStreamingTool has been extended to clarify what a runId in the result means: the tool has dispatched a background task, and its output appears in the progress view rather than in the streamed output.

Changes

Cohort / File(s)	Summary
Tool Documentation `lib/chat/tools/createPromptSandboxStreamingTool.ts`	Extended the `description` property with guidance for results that contain a `runId`: the tool has dispatched a background task, the streaming output may be empty, and results should be viewed in the UI progress/task view rather than summarized from the streaming output.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

feat: stream prompt_sandbox output for faster first token #246: Introduced or previously modified the createPromptSandboxStreamingTool that this PR's description update directly affects.
feat: inline setup for fresh sandbox onboarding (simplified) #256: Adds fromSnapshot/runId background behavior and routes tied to the updated description and output handling.

Poem

A prompt’s note now sings more clear,
When runId comes, the path is near.
Background work hums out of sight,
Progress shows the final light—
Small words, big clarity, cheers! 🎉

🚥 Pre-merge checks | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Solid & Clean Code	⚠️ Warning	Tool description violates Single Responsibility Principle by bundling five distinct concerns into one monolithic 260+-word string, and contains a factually inaccurate UI claim contradicting the backend-only codebase.	Refactor to separate concerns: remove the false UI claim, extract LLM-specific guidance into a separate constant, and simplify the description to focus on tool purpose and essential technical details for backend appropriateness.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

lib/chat/tools/createPromptSandboxStreamingTool.ts (1)

55-60: Consider extracting the background task guidance to a named constant.

The description is becoming verbose with multiple concatenated sentences. For consistency with SANDBOX_PROMPT_NOTE (lines 5-11) and improved maintainability, consider extracting the runId guidance to a dedicated constant.

♻️ Suggested refactor

 export const SANDBOX_PROMPT_NOTE =
   "IMPORTANT: When you make changes to any files inside the orgs/ directory, " +
   "always commit and push those changes directly to main so they are preserved and shared across sessions.\n\n" +
   "IMPORTANT: When a prompt includes attached file URLs (e.g. from email attachments), " +
   "always download the files first using curl and save them locally before referencing them. " +
   "These URLs are temporary and expire after 1 hour. Never store the download URL directly in files — " +
   "download the content, save it to the appropriate location in the repo, and reference the local path instead.";

+export const SANDBOX_BACKGROUND_TASK_NOTE =
+  "IMPORTANT: When the result contains a `runId`, it means the sandbox is being set up for the first time " +
+  "and the command was dispatched to a background task. The output will be empty because the task is still running. " +
+  "The UI automatically shows a live progress view for background tasks — do NOT summarize or interpret the empty output. " +
+  "Simply tell the user their request is being processed in the sandbox and the results will appear in the task progress view above. " +
+  "Do NOT automatically poll or check the task status — instead, let the user know they can ask you to check on it whenever they want.";
+
 const promptSandboxSchema = z.object({

Then use it in the description:

     description:
       "Send a prompt to the agent running in the artist's persistent sandbox environment. " +
       "This is your primary tool — use it for release management (creating, updating, or reviewing releases), " +
       "file operations, data analysis, content generation, and any multi-step task. " +
       "The sandbox has skills for managing RELEASE.md documents, generating deliverables, and more. " +
       "Reuses the account's existing running sandbox or creates one from the latest snapshot. " +
-      "Streams output in real-time. " +
-      "IMPORTANT: When the result contains a `runId`, it means the sandbox is being set up for the first time " +
-      "and the command was dispatched to a background task. The output will be empty because the task is still running. " +
-      "The UI automatically shows a live progress view for background tasks — do NOT summarize or interpret the empty output. " +
-      "Simply tell the user their request is being processed in the sandbox and the results will appear in the task progress view above. " +
-      "Do NOT automatically poll or check the task status — instead, let the user know they can ask you to check on it whenever they want.",
+      "Streams output in real-time. " +
+      SANDBOX_BACKGROUND_TASK_NOTE,
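
One way to keep a description composed from several constants readable and easy to test is a small join helper. The sketch below is illustrative only: `buildDescription` and `BASE_DESCRIPTION` are hypothetical names, and both strings are shortened stand-ins for the real text in the diff above.

```typescript
// Shortened, hypothetical stand-ins for the real strings in
// lib/chat/tools/createPromptSandboxStreamingTool.ts.
const BASE_DESCRIPTION =
  "Send a prompt to the agent running in the artist's persistent sandbox environment. " +
  "Streams output in real-time. ";

const SANDBOX_BACKGROUND_TASK_NOTE =
  "IMPORTANT: When the result contains a `runId`, the command was dispatched to a background task. " +
  "Do NOT summarize or interpret the empty output.";

// Trims each part and joins them with a single space, so individual
// constants do not need to carry trailing whitespace.
function buildDescription(base: string, ...notes: string[]): string {
  return [base, ...notes].map((part) => part.trim()).join(" ");
}

const description = buildDescription(BASE_DESCRIPTION, SANDBOX_BACKGROUND_TASK_NOTE);
```

A unit test can then assert that the composed description contains the exported note, without duplicating the full text in the test file.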

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/chat/tools/createPromptSandboxStreamingTool.ts` around lines 55 - 60,
Extract the long runId/background-task guidance from the description into a
named constant (e.g., RUNID_BACKGROUND_TASK_NOTE) and reuse it in
createPromptSandboxStreamingTool's description string instead of concatenating
those sentences inline; keep the text identical, place the new constant near
SANDBOX_PROMPT_NOTE for consistency, export it if other modules need it, and
replace the concatenated block in the description with a reference to
RUNID_BACKGROUND_TASK_NOTE.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/chat/tools/createPromptSandboxStreamingTool.ts`:
- Around line 56-59: Update the long instruction string in
createPromptSandboxStreamingTool (the multi-line message built around the
`runId` behavior) to remove the inaccurate "UI automatically shows a live
progress view" claim and replace it with the provided wording: state that when a
`runId` is present the sandbox is being set up, the output will be empty while
the background task runs, do NOT summarize the empty output, and tell the user
the request is being processed and results will appear once the task completes;
locate the string construction in createPromptSandboxStreamingTool.ts (the block
concatenating multiple quoted segments) and substitute the revised text exactly
as suggested.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 27d5b8f3-c60c-4a92-aae9-d2f41577e86b

📥 Commits

Reviewing files that changed from the base of the PR and between 2ae53f2 and 1b23496.

📒 Files selected for processing (1)

lib/chat/tools/createPromptSandboxStreamingTool.ts

coderabbitai · 2026-03-04T16:08:36Z

lib/chat/tools/createPromptSandboxStreamingTool.ts

+      "IMPORTANT: When the result contains a `runId`, it means the sandbox is being set up for the first time " +
+      "and the command was dispatched to a background task. The output will be empty because the task is still running. " +
+      "The UI automatically shows a live progress view for background tasks — do NOT summarize or interpret the empty output. " +
+      "Simply tell the user their request is being processed in the sandbox and the results will appear in the task progress view above. " +


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for UI components that handle runId or task progress views
echo "=== Searching for runId handling in UI/app components ==="
rg -n 'runId' --type=ts --type=tsx | grep -E '(app/|components/)' | head -20
echo ""
echo "=== Searching for task progress view components ==="
rg -n -i 'progress.*view|task.*progress|background.*task' --type=ts --type=tsx | grep -E '(app/|components/)' | head -20
echo ""
echo "=== Searching for Trigger.dev run subscription patterns ==="
rg -n 'useRealtimeRun|subscribeToRun|task.*status' --type=ts --type=tsx | head -20
echo ""
echo "=== Checking app directory structure ==="
find app -type f -name "*.ts" -o -name "*.tsx" | head -20

Repository: recoupable/api

Length of output: 975

🏁 Script executed:

#!/bin/bash
# Search for UI components in other common locations
echo "=== Checking for components directory ==="
find . -type d -name "components" | head -10
echo ""
echo "=== Checking for pages directory ==="
find . -type d -name "pages" | head -10
echo ""
echo "=== Search for runId in all TypeScript/JavaScript files ==="
rg 'runId' --type=ts | head -30
echo ""
echo "=== Check createPromptSandboxStreamingTool.ts for imports and context ==="
head -100 lib/chat/tools/createPromptSandboxStreamingTool.ts | grep -E '^import|^export|runId|progress'
echo ""
echo "=== Verify: does promptSandboxStreaming return runId? ==="
rg -A10 'return.*runId|runId.*:' lib/ | head -40

Repository: recoupable/api

Length of output: 6782

Remove or revise the inaccurate claim about the UI progress view.

The description references "The UI automatically shows a live progress view for background tasks," but this repository is a backend API-only codebase with no frontend UI components. The claim will mislead the LLM and ultimately confuse users about where results will appear.

Revise to:

      "IMPORTANT: When the result contains a `runId`, it means the sandbox is being set up for the first time " +
      "and the command was dispatched to a background task. The output will be empty because the task is still running. " +
      "Do NOT summarize or interpret the empty output. " +
      "Simply tell the user their request is being processed in the sandbox and the results will appear once the task completes."

The runId behavior itself is correctly documented based on the backend implementation, but remove references to UI components that don't exist in this codebase.
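
A check like the one this comment performs by hand could be automated with a small lint over the description text. This is a rough sketch under stated assumptions: `findUiClaims` and the `UI_PHRASES` list are hypothetical, not part of this repository, and the phrase list would need tuning to whatever client surfaces actually exist.

```typescript
// Hypothetical phrases that suggest the text is describing a client UI.
const UI_PHRASES: RegExp[] = [/\bthe ui\b/i, /progress view/i];

// Splits the description into rough sentences and returns the ones that
// mention a UI phrase, so a reviewer or test can flag them.
function findUiClaims(description: string): string[] {
  const sentences = description.match(/[^.!?]+[.!?]*/g) ?? [];
  return sentences
    .map((sentence) => sentence.trim())
    .filter((sentence) => UI_PHRASES.some((phrase) => phrase.test(sentence)));
}
```

The sentence splitting is deliberately naive; it is enough to point a reviewer at the offending sentence, not to parse prose reliably.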

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/chat/tools/createPromptSandboxStreamingTool.ts` around lines 56 - 59, Update the long instruction string in createPromptSandboxStreamingTool (the multi-line message built around the `runId` behavior) to remove the inaccurate "UI automatically shows a live progress view" claim and replace it with the provided wording: state that when a `runId` is present the sandbox is being set up, the output will be empty while the background task runs, do NOT summarize the empty output, and tell the user the request is being processed and results will appear once the task completes; locate the string construction in createPromptSandboxStreamingTool.ts (the block concatenating multiple quoted segments) and substitute the revised text exactly as suggested.

vercel bot deployed to Preview March 4, 2026 15:44 View deployment

feat: tell LLM not to poll task status, let user ask instead

1b23496

Co-Authored-By: Claude Opus 4.6 <[email protected]>

vercel bot deployed to Preview March 4, 2026 16:05 View deployment

test: verify prompt_sandbox description explains runId behavior

8ad50f0

Co-Authored-By: Claude Opus 4.6 <[email protected]>

vercel bot deployed to Preview March 4, 2026 16:08 View deployment

coderabbitai bot reviewed Mar 4, 2026


sweetmantech merged commit 0d37cbc into test Mar 4, 2026
3 checks passed

sweetmantech merged 3 commits into test from sweetmantech/myc-4399-api-tools-prompt_sandbox-inform-llm-of-runid-and-fromsandbox
