Description
What would you like to be added?
I propose adding an API to make sandbox implementations (`docker`, `podman`, `sandbox-exec`) pluggable.
Possible third-party implementations:
- containerd, possibly via the nerdctl (contaiNERD CTL) CLI
  - An alternative to `docker` and `podman`
  - Allows using advanced containerd shims like gVisor and Kata
- Lima VM
  - A VM can provide stronger isolation than containers
- sshocker
  - Executes a command on a remote machine, with a mount of the local current directory
- Alcoholless (`alcless`): lightweight security sandbox for Homebrew
  - Incorporates `su`, `sudo`, and `rsync` to execute a command as another user, with a copy of the current directory
  - Not just specific to Homebrew, despite the name
- elfconv
  - Executes ELF binaries inside a WebAssembly runtime, with AOT compilation
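As a rough illustration of what "pluggable" could mean here, the sketch below defines a driver interface and a registry that looks drivers up by name. All names (`SandboxDriver`, `SandboxRegistry`, `makeEchoDriver`) are hypothetical and do not exist in Gemini CLI; the stand-in driver only echoes its input instead of invoking a real sandbox.

```typescript
// Hypothetical types: nothing here is part of Gemini CLI today.
interface SandboxResult {
  stdout: string;
  stderr: string;
  exitCode: number | null; // null if terminated by a signal
}

interface SandboxDriver {
  readonly name: string;
  run(command: string, directory?: string): Promise<SandboxResult>;
}

// Registry that maps a driver name (e.g. from user config) to an implementation.
class SandboxRegistry {
  private drivers = new Map<string, SandboxDriver>();

  register(driver: SandboxDriver): void {
    if (this.drivers.has(driver.name)) {
      throw new Error(`sandbox driver already registered: ${driver.name}`);
    }
    this.drivers.set(driver.name, driver);
  }

  get(name: string): SandboxDriver {
    const driver = this.drivers.get(name);
    if (!driver) {
      throw new Error(`unknown sandbox driver: ${name}`);
    }
    return driver;
  }

  names(): string[] {
    return Array.from(this.drivers.keys()).sort();
  }
}

// Stand-in driver for illustration: it reports what it was asked to run
// and never executes anything.
function makeEchoDriver(name: string): SandboxDriver {
  return {
    name,
    async run(command: string, directory?: string): Promise<SandboxResult> {
      const where = directory === undefined ? '(root)' : directory;
      return { stdout: `[${name}] ${where}: ${command}`, stderr: '', exitCode: 0 };
    },
  };
}
```

A third-party implementation would then only need to provide its own `SandboxDriver` and register it; the rest of the CLI would call `registry.get(name).run(...)` without knowing which technology is behind it.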
Why is this needed?
This API is needed to allow exploring alternative sandbox implementations.
An alternative implementation may alleviate problems of the existing sandbox implementations:
- `docker` and `podman` are relatively vulnerable compared to hardware-assisted VMs
- `sandbox-exec` has been deprecated by Apple and is likely to be removed in a future release of macOS
- The existing implementations lack advanced features, such as:
  - monitoring and breaking connections to the Internet
  - backing up files before making a breaking operation
Additional context
Protocol overview
The protocol of the API could be anything, such as REST or gRPC. However, I suggest repurposing the existing MCP tools such as `run_shell_command` and `read_file` (with additional changes if needed):
gemini-cli/packages/core/src/tools/shell.ts
Lines 32 to 77 in ef736f0
```typescript
export class ShellTool extends BaseTool<ShellToolParams, ToolResult> {
  static Name: string = 'run_shell_command';
  private whitelist: Set<string> = new Set();

  constructor(private readonly config: Config) {
    super(
      ShellTool.Name,
      'Shell',
      `This tool executes a given shell command as \`bash -c <command>\`. Command can start background processes using \`&\`. Command is executed as a subprocess that leads its own process group. Command process group can be terminated as \`kill -- -PGID\` or signaled as \`kill -s SIGNAL -- -PGID\`.
The following information is returned:
Command: Executed command.
Directory: Directory (relative to project root) where command was executed, or \`(root)\`.
Stdout: Output on stdout stream. Can be \`(empty)\` or partial on error and for any unwaited background processes.
Stderr: Output on stderr stream. Can be \`(empty)\` or partial on error and for any unwaited background processes.
Error: Error or \`(none)\` if no error was reported for the subprocess.
Exit Code: Exit code or \`(none)\` if terminated by signal.
Signal: Signal number or \`(none)\` if no signal was received.
Background PIDs: List of background processes started or \`(none)\`.
Process Group PGID: Process group started or \`(none)\``,
      {
        type: 'object',
        properties: {
          command: {
            type: 'string',
            description: 'Exact bash command to execute as `bash -c <command>`',
          },
          description: {
            type: 'string',
            description:
              'Brief description of the command for the user. Be specific and concise. Ideally a single sentence. Can be up to 3 sentences for clarity. No line breaks.',
          },
          directory: {
            type: 'string',
            description:
              '(OPTIONAL) Directory to run the command in, if not the project root directory. Must be relative to the project root directory and must already exist.',
          },
        },
        required: ['command'],
      },
      false, // output is not markdown
      true, // output can be updated
    );
  }
```
gemini-cli/packages/core/src/tools/read-file.ts
Lines 45 to 80 in ef736f0
```typescript
export class ReadFileTool extends BaseTool<ReadFileToolParams, ToolResult> {
  static readonly Name: string = 'read_file';

  constructor(
    private rootDirectory: string,
    private config: Config,
  ) {
    super(
      ReadFileTool.Name,
      'ReadFile',
      'Reads and returns the content of a specified file from the local filesystem. Handles text, images (PNG, JPG, GIF, WEBP, SVG, BMP), and PDF files. For text files, it can read specific line ranges.',
      {
        properties: {
          absolute_path: {
            description:
              "The absolute path to the file to read (e.g., '/home/user/project/file.txt'). Relative paths are not supported. You must provide an absolute path.",
            type: 'string',
            pattern: '^/',
          },
          offset: {
            description:
              "Optional: For text files, the 0-based line number to start reading from. Requires 'limit' to be set. Use for paginating through large files.",
            type: 'number',
          },
          limit: {
            description:
              "Optional: For text files, maximum number of lines to read. Use with 'offset' to paginate through large files. If omitted, reads the entire file (if feasible, up to a default limit).",
            type: 'number',
          },
        },
        required: ['absolute_path'],
        type: 'object',
      },
    );
    this.rootDirectory = path.resolve(rootDirectory);
  }
```
This design will allow implementing sandbox plugins as MCP servers.
Flow:
- LLM calls the MCP tool `run_shell_command`
- Gemini CLI redirects the `run_shell_command` request to the sandbox driver, via MCP-over-stdio
- The sandbox driver executes the shell command with a sandboxing technology, and returns the result to Gemini CLI
- Gemini CLI redirects the result to LLM
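
The flow above can be sketched as follows. The message shapes follow MCP's JSON-RPC `tools/call` request and text-content result, and the stdio framing assumes one JSON message per line. The function names (`buildRequest`, `handleRequest`, `encodeLine`) and the injected `Executor` are hypothetical; a real driver would replace the executor with a call into its sandboxing technology.

```typescript
// Request and response shapes modelled on MCP's `tools/call`.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: { command: string; directory?: string } };
}

interface ToolCallResponse {
  jsonrpc: '2.0';
  id: number;
  result: { content: { type: 'text'; text: string }[]; isError: boolean };
}

// Hypothetical hook: the sandbox-specific part of a driver.
type Executor = (
  command: string,
  directory?: string,
) => Promise<{ stdout: string; exitCode: number }>;

// Gemini CLI side: wrap the LLM's tool call in a request for the driver.
function buildRequest(id: number, command: string, directory?: string): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: {
      name: 'run_shell_command',
      arguments: directory === undefined ? { command } : { command, directory },
    },
  };
}

// Driver side: run the command through the sandbox and report the result.
async function handleRequest(req: ToolCallRequest, exec: Executor): Promise<ToolCallResponse> {
  if (req.params.name !== 'run_shell_command') {
    return {
      jsonrpc: '2.0',
      id: req.id,
      result: {
        content: [{ type: 'text', text: `unsupported tool: ${req.params.name}` }],
        isError: true,
      },
    };
  }
  const { command, directory } = req.params.arguments;
  const { stdout, exitCode } = await exec(command, directory);
  return {
    jsonrpc: '2.0',
    id: req.id,
    result: { content: [{ type: 'text', text: stdout }], isError: exitCode !== 0 },
  };
}

// Stdio framing: one JSON message per line.
function encodeLine(message: object): string {
  return JSON.stringify(message) + '\n';
}
```

Gemini CLI would write `encodeLine(buildRequest(...))` to the driver's stdin, read one line back from its stdout, and pass the text content on to the LLM.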
Potential standardization
The protocol should be useful for other AI agent products too, so there might be a chance of standardizing the protocol, perhaps as a part of MCP, or maybe as a part of Open Container Initiative specifications.