diff --git a/README.md b/README.md
index c705dc48..32f5f858 100644
--- a/README.md
+++ b/README.md
@@ -168,6 +168,27 @@ This is useful for containerized or multi-node deployments where config is hoste
 
 > **Security best practice:** Never hardcode secrets in remote config files. Use environment variable references like `bot_token = "${DISCORD_BOT_TOKEN}"` and inject the actual values via local environment variables or Kubernetes Secrets. OpenAB expands `${VAR}` identically for both local and remote config.
 
+### SSH Sandbox (Local Deployments)
+
+For local deployments, you can run the agent inside an isolated VM or container using SSH as a transparent stdio transport; no changes to OpenAB are needed. SSH is a byte pipe over stdin/stdout, which is exactly how `AcpConnection` communicates with agents.
+
+```toml
+[agent]
+command = "ssh"
+args = [
+  "-T",                                      # no PTY (required: a PTY corrupts JSON-RPC)
+  "-o", "BatchMode=yes",                     # fail-fast, no interactive prompts
+  "-o", "ServerAliveInterval=30",            # keep-alive for long sessions
+  "-o", "ServerAliveCountMax=3",
+  "-o", "StrictHostKeyChecking=accept-new",  # daemon has no terminal for prompts
+  "user@sandbox-host",
+  "claude", "--acp"
+]
+working_dir = "/tmp"  # local cwd for the SSH process, not the remote agent's workdir
+```
+
+See [docs/ssh-sandbox.md](docs/ssh-sandbox.md) for setup details, MCP server access patterns, and known limitations.
+
 ## Configuration Reference
 
 > 📖 Full reference with all options, defaults, and Helm mapping: [docs/config-reference.md](docs/config-reference.md)
diff --git a/config.toml.example b/config.toml.example
index 277bdcf1..cc32c8ba 100644
--- a/config.toml.example
+++ b/config.toml.example
@@ -93,6 +93,22 @@ working_dir = "/home/agent"
 # working_dir = "/home/agent"
 # env = {}
 # Auth via: kubectl exec -it -- cursor-agent login
+# SSH sandbox: run agent inside an isolated VM or container (local deployments)
+# SSH is a transparent byte pipe over stdio; no changes to ACP protocol needed.
+# See docs/ssh-sandbox.md for setup guide and known limitations.
+# [agent]
+# command = "ssh"
+# args = [
+#   "-T",                                      # no PTY (required: a PTY corrupts JSON-RPC)
+#   "-o", "BatchMode=yes",                     # fail-fast, no interactive prompts
+#   "-o", "ServerAliveInterval=30",            # keep-alive for long sessions
+#   "-o", "ServerAliveCountMax=3",
+#   "-o", "StrictHostKeyChecking=accept-new",  # daemon has no terminal for prompts
+#   "user@sandbox-host",
+#   "claude", "--acp"
+# ]
+# working_dir = "/tmp"  # local cwd for the SSH process, not the remote agent's workdir
+
 [pool]
 max_sessions = 10
 session_ttl_hours = 24
diff --git a/docs/ssh-sandbox.md b/docs/ssh-sandbox.md
new file mode 100644
index 00000000..adc32f17
--- /dev/null
+++ b/docs/ssh-sandbox.md
@@ -0,0 +1,124 @@
+# SSH Sandbox for Local Deployments
+
+OpenAB targets k3s in the cloud, where Kubernetes NetworkPolicy and Pod isolation handle security. For local deployments (developer laptop, home server), the default config runs the agent with full host permissions:
+
+```toml
+[agent]
+command = "claude"
+args = ["--acp"]
+```
+
+The Claude subprocess inherits the host's full filesystem and network access. For a Discord bot accepting messages from arbitrary users, this is a meaningful attack surface.
+
+## SSH as a Zero-Code-Change Transport
+
+`AcpConnection::spawn()` treats the agent as a stdio JSON-RPC process. SSH is a transparent byte pipe over that same stdio, so no changes to the ACP protocol, `SessionPool`, or `AcpConnection` internals are needed.
+
+```
+Current                          Proposed
+───────────────────────          ──────────────────────────────
+OpenAB                           OpenAB
+  │ spawn                          │ spawn
+  ▼                                ▼
+claude (host permissions)        ssh -T user@sandbox
+  ├─ reads ~/.ssh ✗                │ encrypted stdio pipe
+  ├─ reads ~/Documents ✗           ▼
+  └─ unrestricted network ✗      claude (inside sandbox)
+                                   ├─ restricted filesystem ✓
+                                   ├─ network allowlist ✓
+                                   └─ MCP via host proxy ✓
+```
+
+## Configuration
+
+```toml
+[agent]
+command = "ssh"
+args = [
+  "-T",                                      # no PTY (required, see below)
+  "-o", "BatchMode=yes",                     # fail-fast, no interactive prompts
+  "-o", "ServerAliveInterval=30",            # keep-alive for long sessions
+  "-o", "ServerAliveCountMax=3",
+  "-o", "StrictHostKeyChecking=accept-new",  # TOFU on first connect; pre-populate known_hosts in production
+  "user@sandbox-host",
+  "claude", "--acp"
+]
+working_dir = "/tmp"
+```
+
+### Why `-T` Is Required
+
+| Flag | Behavior | JSON-RPC safe? |
+|------|----------|----------------|
+| `-T` | Clean byte pipe, stderr separated | Yes |
+| `-t` | Warns "PTY not allocated", stderr leaks into stdout | No: corrupts JSON stream |
+| `-tt` | Forced PTY + piped stdin → hangs indefinitely | No: deadlock |
+
+A PTY converts line endings (`\n` → `\r\n`), merges stderr into stdout, and enables echo mode, all of which break JSON-RPC parsing. **`-T` is mandatory, not optional.**
+
+### `StrictHostKeyChecking=accept-new` and TOFU
+
+`accept-new` trusts the host key on first connection without prompting and refuses to connect if that key later changes. This is trust-on-first-use (TOFU): acceptable for ephemeral local VMs, but the first connection is not authenticated. For production environments, pre-populate `~/.ssh/known_hosts` manually and verify the scanned key's fingerprint out of band, since `ssh-keyscan` does not authenticate the host either:
+
+```bash
+ssh-keyscan sandbox-host >> ~/.ssh/known_hosts
+```
+
+Then switch to `StrictHostKeyChecking=yes` so that unknown or changed host keys are rejected.
+
+### Why `BatchMode=yes`
+
+OpenAB runs as a daemon without a terminal, so an interactive password or passphrase prompt would hang the process. `BatchMode=yes` disables those prompts and makes `ssh` fail immediately instead. SSH key-based auth must be configured beforehand.
+
+## Sandbox Options
+
+The SSH target is your choice; OpenAB does not care what is behind the SSH connection:
+
+| Environment | SSH target | Notes |
+|-------------|-----------|-------|
+| Mac (OrbStack) | `vm-name@orb` | Via `~/.orbstack/ssh/config` ProxyCommand |
+| Linux | `user@nspawn-container` | systemd-nspawn with SSH |
+| Remote machine | `user@10.0.0.5` | Any Linux server |
+| Docker | wrapper script using `docker exec` | Alternative to SSH |
+
+## MCP Server Access from Sandbox
+
+If MCP servers run on the host, the sandbox cannot reach them via `localhost` (which resolves to the sandbox's own loopback). Options:
+
+**Option A: Host DNS alias (OrbStack)**
+```
+claude (VM) ──http://host.internal:PORT──► MCP server (host)
+```
+
+**Option B: SSH remote port forwarding (universal)**
+```toml
+# add to the ssh args, before "user@sandbox-host"; -R (not -L) exposes the host's port 8080 inside the sandbox:
+"-R", "8080:localhost:8080"
+```
+```
+claude (VM) ──http://localhost:8080──► [tunnel] ──► MCP server (host)
+```
+
+**Option C: Host networking (Docker `--network host`, which removes network isolation)**
+```
+claude (container) ──http://localhost:PORT──► MCP server (host)
+```
+
+## Known Limitations
+
+### `kill_on_drop` does not reliably terminate remote processes
+
+`kill_on_drop` kills only the local SSH client process; the remote subprocess is not signalled directly. With `-T` there is no PTY, so the remote agent typically sees only its stdio closing rather than a SIGHUP, and it may keep running (especially under `nohup` or with ControlMaster active).
+
+Mitigations:
+- Do **not** use SSH ControlMaster for agent connections. If your `~/.ssh/config` has `ControlMaster auto`, add `ControlMaster no` to the sandbox host entry or pass `-o ControlMaster=no` in `args`.
+- Ensure the SSH server has `ClientAliveInterval` set to detect dead clients.
+- Session pool TTL cleanup (`session_ttl_hours`) will eventually reclaim idle sessions.
+
+### SSH connection startup latency
+
+Each `AcpConnection::spawn()` incurs an SSH handshake (~50–200 ms). This is negligible for long-lived sessions (default pool TTL = 24 h), but noticeable if sessions are frequently recycled.
+
+### SSH key auth is required
+
+OpenAB runs as a daemon without a terminal. Configure SSH key-based authentication on the sandbox host before using this transport.
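
The key-auth requirement in the last section can be checked before OpenAB is started. The following is a minimal sketch and not part of this patch: `check_sandbox_auth` and the `SSH_BIN` override are hypothetical names introduced for illustration, and the 5-second `ConnectTimeout` is an arbitrary choice. It runs a no-op remote command with the same `-T` and `BatchMode=yes` options as the config, so a login that would need a prompt shows up as a failure instead of a hang.

```shell
#!/usr/bin/env bash
# Preflight sketch: confirm non-interactive key auth works before pointing OpenAB at the sandbox.
# SSH_BIN and check_sandbox_auth are hypothetical names for this example, not part of OpenAB.
SSH_BIN="${SSH_BIN:-ssh}"

check_sandbox_auth() {
  local target="$1"
  # BatchMode=yes makes ssh fail instead of prompting; ConnectTimeout bounds the wait.
  # "true" is a no-op remote command, so exit status 0 means the login itself succeeded.
  if "$SSH_BIN" -T -o BatchMode=yes -o ConnectTimeout=5 "$target" true </dev/null; then
    echo "ok: key-based login to $target works"
  else
    echo "fail: cannot log in to $target non-interactively" >&2
    return 1
  fi
}
```

After sourcing the file, `check_sandbox_auth user@sandbox-host` returns a non-zero status when the key is missing, passphrase-protected without an agent, or the host key is not yet trusted under the active `StrictHostKeyChecking` setting.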