Migrate sandbox backend to together-sandbox SDK #734

Open
necoline wants to merge 3 commits into thinking-machines-lab:main from necoline:neco/fix-csb-bartender-api-clean

necoline commented May 26, 2026

Summary

  • Replace the hand-rolled Bartender and Pint HTTP clients with the official together-sandbox Python SDK
  • Rename CodeSandboxSandbox → TogetherSandbox, file codesandbox_sandbox.py → together_sandbox.py
  • Accept TOGETHER_API_KEY, CSB_API_KEY, or CSB_STREAM_API_KEY for auth
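
As an illustration of the auth bullet above, here is a minimal sketch of resolving an API key from the three accepted environment variables. The helper name `resolve_api_key`, the lookup order, and the error message are assumptions for illustration; the PR only states that all three variables are accepted, not how the adapter or the SDK prioritizes them.

```python
import os
from typing import Mapping, Optional

# The three variable names come from the PR summary.
# The precedence (first match wins, in this order) is an assumption.
API_KEY_ENV_VARS = ("TOGETHER_API_KEY", "CSB_API_KEY", "CSB_STREAM_API_KEY")


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty API key found among the accepted variables.

    Hypothetical helper: not taken from together_sandbox.py.
    """
    env = os.environ if env is None else env
    for name in API_KEY_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError(
        "No sandbox API key found; set one of: " + ", ".join(API_KEY_ENV_VARS)
    )
```

Passing a plain dict instead of `os.environ` keeps the lookup easy to exercise in tests, for example `resolve_api_key({"CSB_API_KEY": "secret"})`.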

Test plan

  • Verified import: from tinker_cookbook.sandbox.together_sandbox import TogetherSandbox
  • Tested end-to-end with a single task (chess-best-move): snapshot cache hit, sandbox create/start, run_command("echo hello") → exit_code=0, stdout='hello\n', cleanup/shutdown
  • Full eval run (requires all task snapshots to be cached or registry credentials for fresh builds)

🤖 Generated with Claude Code

necoline and others added 3 commits May 19, 2026 17:25
New files:
- tinker_cookbook/sandbox/codesandbox_sandbox.py: SandboxInterface implementation
  using the Bartender API for lifecycle and Pint for in-sandbox operations
- tinker_cookbook/recipes/harbor_rl/scripts/eval_terminal_bench_csb.py: eval
  script that uses the CSB backend instead of Modal

The CSB adapter handles:
- Docker image build + push to CSB registry
- Snapshot creation with alias-based caching
- Sandbox start + wait for running state
- Command execution via Pint (login shell, cd-based workdir)
- File read/write via Pint
- DNS configuration on sandbox start
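
For readers unfamiliar with the "login shell, cd-based workdir" approach listed above, the following is a rough sketch of how a command could be wrapped before being sent to the sandbox. The function name `wrap_command` and the choice of `bash -lc` are assumptions for illustration; the commit message does not show how the adapter actually builds the command.

```python
import shlex
from typing import List, Optional


def wrap_command(command: str, workdir: Optional[str] = None) -> List[str]:
    """Build an argv that runs `command` in a login shell.

    Hypothetical helper: the working directory is applied with a leading
    `cd` rather than a separate API parameter, matching the "cd-based
    workdir" description. Using bash specifically is an assumption.
    """
    script = command
    if workdir:
        # Quote the path so spaces or shell metacharacters stay literal.
        script = f"cd {shlex.quote(workdir)} && {command}"
    # -l starts a login shell (profile files are sourced); -c runs the script.
    return ["bash", "-lc", script]
```

With this sketch, `wrap_command("ls", "/my dir")` yields a script of `cd '/my dir' && ls`, so a failed `cd` short-circuits and the command never runs in the wrong directory.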

Tested: 3/5 Terminal-Bench tasks completed with 0 sandbox errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Mirrors train_terminal_bench.py but uses csb_sandbox_factory instead of
Modal's default_sandbox_factory. Same interface as the eval script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…box SDK

Replace the hand-rolled _BartenderClient and _PintClient HTTP classes
(~516 lines) with a thin wrapper around the official together-sandbox
Python SDK. The SDK handles Docker builds, snapshot management, sandbox
lifecycle, command execution, and file operations.

- Delete codesandbox_sandbox.py, add together_sandbox.py
- Rename class CodeSandboxSandbox → TogetherSandbox
- Auth: accept TOGETHER_API_KEY, CSB_API_KEY, or CSB_STREAM_API_KEY
- Env vars for resources: CSB_CPU → TOGETHER_SANDBOX_CPU, etc.
- Add together-sandbox git dependency to pyproject.toml
- Tested end-to-end: snapshot cache hit, sandbox create/start,
  run_command, and cleanup all working

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>