64 changes: 43 additions & 21 deletions README.md
@@ -87,33 +87,48 @@ result outside the sandbox.

 ## Host Tool Registry

-Host products can register generic tools for sandbox agents without adding
-product-specific logic to WP Codebox core. A tool definition declares a stable
-name, JSON input/output schemas, policy metadata, and a host-side handler. The
+`HostToolRegistry` is a WP Codebox transport adapter, not a generic tool
+contract. Agents API owns canonical tool declarations, tool calls, execution
+results, pending external-tool states, and product-neutral runtime metadata.
+Data Machine or another host owns the concrete tool sources and product policy.
+WP Codebox only exposes caller-provided per-run tool declarations to sandbox
+agents, routes allowed calls across the browser/host boundary, and records
+transport diagnostics.
+
+Host products can register a caller-provided canonical tool declaration plus a
+host-side handler without adding product-specific logic to WP Codebox core. The
 runtime still gates execution through `RuntimePolicy.commands`, so callers must
-explicitly allow each registered tool name before a sandbox can invoke it.
+explicitly allow each registered canonical tool name before a sandbox can invoke
+it.

 ```ts
 import { createHostToolRegistry, createRuntime } from "@automattic/wp-codebox-core"
 import { createPlaygroundRuntimeBackend } from "@automattic/wp-codebox-playground"

 const hostTools = createHostToolRegistry([
   {
-    name: "host.echo",
-    description: "Echo a structured payload from the host bridge.",
-    inputSchema: {
-      type: "object",
-      required: ["message"],
-      properties: { message: { type: "string" } },
-      additionalProperties: false,
+    declaration: {
+      name: "client/echo",
+      description: "Echo a structured payload from the host bridge.",
+      parameters: {
+        type: "object",
+        required: ["message"],
+        properties: { message: { type: "string" } },
+        additionalProperties: false,
+      },
+      executor: "client",
+      scope: "run",
+      runtime: { completion_signal: "progress" },
     },
+    name: "client/echo",
+    description: "Echo a structured payload from the host bridge.",
     outputSchema: {
       type: "object",
       required: ["message"],
       properties: { message: { type: "string" } },
       additionalProperties: false,
     },
-    policy: { capability: "host.echo", risk: "read" },
+    policy: { capability: "client/echo", risk: "read" },
     handler: (input) => input,
   },
 ])
@@ -124,26 +139,33 @@ const runtime = await createRuntime({
   policy: {
     network: "deny",
     filesystem: "sandbox",
-    commands: ["host.echo"],
+    commands: ["client/echo"],
     secrets: "none",
     approvals: "never",
   },
   hostTools,
 }, createPlaygroundRuntimeBackend())

 const result = await runtime.execute({
-  command: "host.echo",
+  command: "client/echo",
   args: ['input-json={"message":"hello"}'],
 })
 ```

-Host tool output is always a JSON envelope with schema
-`wp-codebox/host-tool-result/v1`. Successful calls return `status: "ok"` and an
-`output` value validated against the tool's output schema. Invalid input,
-invalid output, and handler failures return `status: "error"` with a stable error
-code and message instead of terminal-shaped stderr. Product-specific evidence
-commands should live in product extensions that register tools through this
-surface.
+Host tool output is a Codebox transport diagnostic envelope with schema
+`wp-codebox/host-tool-result/v1`. Successful calls return `status: "ok"`, an
+`output` value validated against the transport output schema, and `toolResult`
+using the canonical Agents API result shape: `success`, `tool_name`, `result`,
+`metadata`, and optional `runtime`. Invalid input, invalid output, malformed
+JSON, and handler failures return `status: "error"` with a stable transport error
+code while `toolResult` maps the same failure to a canonical tool error. The
+`diagnostics` object is the Codebox-owned portion of the envelope and preserves
+the transport, policy command, validation schemas, and resolved policy metadata.
+
+Product-specific tools such as Homeboy evidence commands should live in product
+extensions that provide canonical tool declarations and handlers through this
+transport surface. Codebox should not encode Data Machine policy semantics,
+product tool names, or cross-product tool mediation rules in this layer.

 Trusted worker hosts that need repo-local commands can use the playground
 package's `createHostCommandTool()` adapter instead of exposing arbitrary shell.
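
A caller-side sketch of reading the envelope that the README text in this diff describes may help when reviewing the new shape. It assumes the envelope reaches the caller as a JSON string, which the diff does not confirm. The local types are stand-ins limited to the fields named in this PR, and `unwrapHostToolResult` is a hypothetical helper, not part of the package.

```typescript
// Local stand-ins for the envelope fields named in this PR; they are
// assumptions for illustration, not the package's exported types.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

interface CanonicalOk { success: true; tool_name: string; result: Json; metadata: { [key: string]: Json } }
interface CanonicalError { success: false; tool_name: string; error: string; metadata: { [key: string]: Json } }

interface EnvelopeOk { schema: string; tool: string; status: "ok"; output: Json; toolResult: CanonicalOk }
interface EnvelopeError { schema: string; tool: string; status: "error"; toolResult: CanonicalError }
type Envelope = EnvelopeOk | EnvelopeError

const RESULT_SCHEMA = "wp-codebox/host-tool-result/v1"

// Hypothetical helper: parse a serialized envelope, return the canonical
// result on success, and throw with the transport code on failure.
function unwrapHostToolResult(raw: string): Json {
  const envelope = JSON.parse(raw) as Envelope
  if (envelope.schema !== RESULT_SCHEMA) {
    throw new Error(`Unexpected host tool result schema: ${String(envelope.schema)}`)
  }
  if (envelope.status === "error") {
    // The canonical error carries the transport code in metadata.code.
    const code = envelope.toolResult.metadata.code
    throw new Error(`${String(code)}: ${envelope.toolResult.error}`)
  }
  return envelope.toolResult.result
}
```

Branching on `status` keeps transport concerns separate from the canonical `toolResult`, which is the part an Agents API consumer would presumably forward unchanged.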
191 changes: 173 additions & 18 deletions packages/runtime-core/src/host-tool-registry.ts
@@ -18,6 +18,18 @@ export interface HostToolPolicyMetadata {
   description?: string
 }

+export type HostToolRuntimeMetadata = JsonObject
+
+export interface HostToolCanonicalDeclaration {
+  name: string
+  source?: string
+  description: string
+  parameters?: HostToolJsonSchema
+  executor?: "client"
+  scope?: "run"
+  runtime?: HostToolRuntimeMetadata
+}
+
 export interface HostToolCallContext {
   tool: string
   policyCommand: string
@@ -27,19 +39,46 @@ export interface HostToolCallContext {
 export type HostToolHandler = (input: JsonValue, context: HostToolCallContext) => Promise<JsonValue> | JsonValue

 export interface HostToolDefinition {
+  /**
+   * Canonical per-run tool declaration supplied by the caller. Codebox treats
+   * this as transport input; Agents API owns the generic declaration contract.
+   */
+  declaration?: HostToolCanonicalDeclaration
   name: string
   description: string
-  inputSchema: HostToolJsonSchema
+  parameters?: HostToolJsonSchema
+  inputSchema?: HostToolJsonSchema
   outputSchema: HostToolJsonSchema
   policy: HostToolPolicyMetadata
+  runtime?: HostToolRuntimeMetadata
   handler: HostToolHandler
 }

+export interface HostToolCanonicalResultOk {
+  success: true
+  tool_name: string
+  result: JsonValue
+  metadata: JsonObject
+  runtime?: HostToolRuntimeMetadata
+}
+
+export interface HostToolCanonicalResultError {
+  success: false
+  tool_name: string
+  error: string
+  metadata: JsonObject
+  runtime?: HostToolRuntimeMetadata
+}
+
+export type HostToolCanonicalResult = HostToolCanonicalResultOk | HostToolCanonicalResultError
+
 export interface HostToolResultOk {
   schema: typeof HOST_TOOL_RESULT_SCHEMA
   tool: string
   status: "ok"
   output: JsonValue
+  toolResult: HostToolCanonicalResultOk
+  diagnostics: HostToolTransportDiagnostics
   startedAt: string
   finishedAt: string
 }
@@ -53,15 +92,29 @@ export interface HostToolResultError {
     message: string
     details?: JsonValue
   }
+  toolResult: HostToolCanonicalResultError
+  diagnostics: HostToolTransportDiagnostics
   startedAt: string
   finishedAt: string
 }

 export type HostToolResult = HostToolResultOk | HostToolResultError

 export interface HostToolCatalogEntry {
+  /** Agents API-shaped declaration exposed to sandbox agents. */
+  declaration: HostToolCanonicalDeclaration
   name: string
   description: string
+  parameters: HostToolJsonSchema
+  inputSchema: HostToolJsonSchema
+  outputSchema: HostToolJsonSchema
+  policy: HostToolPolicyMetadata
+}
+
+export interface HostToolTransportDiagnostics {
+  transport: "wp-codebox-host-tool"
+  resultSchema: typeof HOST_TOOL_RESULT_SCHEMA
+  policyCommand: string
   inputSchema: HostToolJsonSchema
   outputSchema: HostToolJsonSchema
   policy: HostToolPolicyMetadata
@@ -93,13 +146,18 @@ export class HostToolRegistry {
   }

   list(): HostToolCatalogEntry[] {
-    return [...this.tools.values()].map(({ name, description, inputSchema, outputSchema, policy }) => ({
-      name,
-      description,
-      inputSchema,
-      outputSchema,
-      policy,
-    }))
+    return [...this.tools.values()].map((definition) => {
+      const declaration = canonicalDeclarationForHostTool(definition)
+      return {
+        declaration,
+        name: declaration.name,
+        description: declaration.description,
+        parameters: declaration.parameters ?? {},
+        inputSchema: inputSchemaForHostTool(definition),
+        outputSchema: definition.outputSchema,
+        policy: definition.policy,
+      }
+    })
   }
 }
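
Since the README in this PR says callers must explicitly allow each registered canonical tool name, one way a host might build the `commands` allow-list is from the catalog entries that `list()` returns. The following is a minimal sketch under stated assumptions: `CatalogEntryLike` is a local stand-in with only the two fields it reads, `commandsForAcceptedRisks` is a hypothetical helper, and the `client/write_note` tool and the `"write"` risk value are made up for illustration.

```typescript
// Minimal stand-in for the catalog entries returned by list(); an assumption
// limited to the two fields this sketch reads.
interface CatalogEntryLike {
  name: string
  policy: { capability?: string; risk?: string }
}

// Hypothetical helper: collect the canonical names whose declared risk is in
// the caller's accepted set. Entries without a risk value are left out.
function commandsForAcceptedRisks(entries: CatalogEntryLike[], acceptedRisks: string[]): string[] {
  return entries
    .filter((entry) => entry.policy.risk !== undefined && acceptedRisks.indexOf(entry.policy.risk) !== -1)
    .map((entry) => entry.name)
}

const catalog: CatalogEntryLike[] = [
  { name: "client/echo", policy: { capability: "client/echo", risk: "read" } },
  // "client/write_note" and the "write" risk value are invented for this example.
  { name: "client/write_note", policy: { capability: "client/write_note", risk: "write" } },
]

// Only the read-risk tool ends up in the allow-list.
const commands = commandsForAcceptedRisks(catalog, ["read"])
```

Whether risk-based filtering is the right policy is a host decision; the point is only that the allow-list uses the canonical names from the catalog rather than the legacy dotted ids.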

@@ -109,32 +167,39 @@ export function createHostToolRegistry(definitions: HostToolDefinition[] = []):

 export async function executeHostTool(definition: HostToolDefinition, input: JsonValue, context: HostToolCallContext): Promise<HostToolResult> {
   const startedAt = new Date().toISOString()
-  const inputIssue = validateJsonValueAgainstSchema(input, definition.inputSchema, "input")
+  const inputIssue = validateJsonValueAgainstSchema(input, inputSchemaForHostTool(definition), "input")
   if (inputIssue) {
-    return hostToolError(definition.name, startedAt, "host-tool-invalid-input", inputIssue)
+    return hostToolError(definition, context.policyCommand, startedAt, "host-tool-invalid-input", inputIssue)
   }

   try {
     const output = await definition.handler(input, context)
     const outputIssue = validateJsonValueAgainstSchema(output, definition.outputSchema, "output")
     if (outputIssue) {
-      return hostToolError(definition.name, startedAt, "host-tool-invalid-output", outputIssue)
+      return hostToolError(definition, context.policyCommand, startedAt, "host-tool-invalid-output", outputIssue)
     }

     return {
       schema: HOST_TOOL_RESULT_SCHEMA,
       tool: definition.name,
       status: "ok",
       output,
+      toolResult: hostToolCanonicalSuccess(definition, output),
+      diagnostics: hostToolDiagnostics(definition, context.policyCommand),
       startedAt,
       finishedAt: new Date().toISOString(),
     }
   } catch (error) {
-    return hostToolError(definition.name, startedAt, "host-tool-handler-error", error instanceof Error ? error.message : String(error))
+    return hostToolError(definition, context.policyCommand, startedAt, "host-tool-handler-error", error instanceof Error ? error.message : String(error))
   }
 }

-function hostToolError(tool: string, startedAt: string, code: string, message: string, details?: JsonValue): HostToolResultError {
+export function createHostToolTransportError(definition: HostToolDefinition | string, policyCommand: string, startedAt: string, code: string, message: string, details?: JsonValue): HostToolResultError {
+  return hostToolError(definition, policyCommand, startedAt, code, message, details)
+}
+
+function hostToolError(definition: HostToolDefinition | string, policyCommand: string, startedAt: string, code: string, message: string, details?: JsonValue): HostToolResultError {
+  const tool = typeof definition === "string" ? definition : definition.name
   return {
     schema: HOST_TOOL_RESULT_SCHEMA,
     tool,
@@ -144,23 +209,113 @@ function hostToolError(tool: string, startedAt: string, code: string, message: s
       message,
       ...(details === undefined ? {} : { details }),
     },
+    toolResult: typeof definition === "string"
+      ? hostToolCanonicalError(tool, message, code, details)
+      : hostToolCanonicalError(definition, message, code, details),
+    diagnostics: typeof definition === "string"
+      ? hostToolDiagnosticsForUnknown(policyCommand)
+      : hostToolDiagnostics(definition, policyCommand),
     startedAt,
     finishedAt: new Date().toISOString(),
   }
 }

 function assertValidHostToolDefinition(definition: HostToolDefinition): void {
-  if (!definition.name || !/^[a-z0-9][a-z0-9._-]*$/i.test(definition.name)) {
-    throw new Error("Host tool name must be a stable non-empty tool id")
+  const declaration = canonicalDeclarationForHostTool(definition)
+  if (!declaration.name || !/^[a-z][a-z0-9_-]*\/[a-z][a-z0-9_-]*$/i.test(declaration.name)) {
+    throw new Error("Host tool name must be a stable canonical tool id such as client/search_docs")
+  }
+  if (declaration.source !== sourceFromToolName(declaration.name)) {
+    throw new Error(`Host tool ${declaration.name} source must match its canonical name prefix`)
+  }
+  if (declaration.executor !== "client") {
+    throw new Error(`Host tool ${declaration.name} executor must be client`)
   }
-  if (!definition.description) {
-    throw new Error(`Host tool ${definition.name} is missing a description`)
+  if (declaration.scope !== "run") {
+    throw new Error(`Host tool ${declaration.name} scope must be run`)
   }
+  if (!declaration.description) {
+    throw new Error(`Host tool ${declaration.name} is missing a description`)
+  }
   if (typeof definition.handler !== "function") {
-    throw new Error(`Host tool ${definition.name} is missing a handler`)
+    throw new Error(`Host tool ${declaration.name} is missing a handler`)
   }
+  definition.name = declaration.name
+  definition.description = declaration.description
+  definition.inputSchema = inputSchemaForHostTool(definition)
 }
+
+function canonicalDeclarationForHostTool(definition: HostToolDefinition): HostToolCanonicalDeclaration {
+  const name = definition.declaration?.name ?? definition.name
+  const parameters = definition.declaration?.parameters ?? definition.parameters ?? definition.inputSchema
+  const runtime = definition.declaration?.runtime ?? definition.runtime
+  return {
+    name,
+    source: definition.declaration?.source ?? sourceFromToolName(name),
+    description: definition.declaration?.description ?? definition.description,
+    parameters,
+    executor: definition.declaration?.executor ?? "client",
+    scope: definition.declaration?.scope ?? "run",
+    ...(runtime ? { runtime } : {}),
+  }
+}
+
+function hostToolCanonicalSuccess(definition: HostToolDefinition, result: JsonValue): HostToolCanonicalResultOk {
+  const runtime = canonicalDeclarationForHostTool(definition).runtime
+  return {
+    success: true,
+    tool_name: definition.name,
+    result,
+    metadata: {},
+    ...(runtime ? { runtime } : {}),
+  }
+}
+
+function hostToolCanonicalError(definition: HostToolDefinition | string, error: string, code: string, details?: JsonValue): HostToolCanonicalResultError {
+  const toolName = typeof definition === "string" ? definition : definition.name
+  const runtime = typeof definition === "string" ? undefined : canonicalDeclarationForHostTool(definition).runtime
+  return {
+    success: false,
+    tool_name: toolName,
+    error,
+    metadata: {
+      code,
+      ...(details === undefined ? {} : { details }),
+    },
+    ...(runtime ? { runtime } : {}),
+  }
+}
+
+function hostToolDiagnostics(definition: HostToolDefinition, policyCommand: string): HostToolTransportDiagnostics {
+  return {
+    transport: "wp-codebox-host-tool",
+    resultSchema: HOST_TOOL_RESULT_SCHEMA,
+    policyCommand,
+    inputSchema: inputSchemaForHostTool(definition),
+    outputSchema: definition.outputSchema,
+    policy: definition.policy,
+  }
+}
+
+function hostToolDiagnosticsForUnknown(policyCommand: string): HostToolTransportDiagnostics {
+  return {
+    transport: "wp-codebox-host-tool",
+    resultSchema: HOST_TOOL_RESULT_SCHEMA,
+    policyCommand,
+    inputSchema: {},
+    outputSchema: {},
+    policy: {},
+  }
+}
+
+function sourceFromToolName(name: string): string {
+  return name.includes("/") ? name.split("/", 1)[0] : "client"
+}
+
+function inputSchemaForHostTool(definition: HostToolDefinition): HostToolJsonSchema {
+  return definition.declaration?.parameters ?? definition.parameters ?? definition.inputSchema ?? {}
+}

 function validateJsonValueAgainstSchema(value: JsonValue, schema: HostToolJsonSchema, path: string): string | undefined {
   if (schema.type && !jsonValueMatchesType(value, schema.type)) {
     return `${path} must be ${schema.type}`
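
The stricter name pattern in `assertValidHostToolDefinition` changes which tool ids register successfully, so a quick check of a few names may be useful during review. The sketch below copies the regular expression from the diff and re-implements the prefix extraction with `indexOf` and `slice`. The helper names `isCanonicalToolName` and `sourceOf` are hypothetical, and `homeboy/evidence` is an invented tool id, not one taken from the PR.

```typescript
// Same pattern as the one added to assertValidHostToolDefinition in this PR:
// two segments separated by one slash, each starting with a letter.
const CANONICAL_TOOL_NAME = /^[a-z][a-z0-9_-]*\/[a-z][a-z0-9_-]*$/i

// Hypothetical helper wrapping the pattern check.
function isCanonicalToolName(name: string): boolean {
  return CANONICAL_TOOL_NAME.test(name)
}

// Hypothetical re-implementation of the prefix rule: the text before the
// first slash, or "client" when the name has no slash.
function sourceOf(name: string): string {
  const slash = name.indexOf("/")
  return slash === -1 ? "client" : name.slice(0, slash)
}

const checks = {
  newStyle: isCanonicalToolName("client/echo"),          // README example after this PR
  legacyDotted: isCanonicalToolName("host.echo"),        // README example before this PR
  nested: isCanonicalToolName("client/a/b"),             // more than one slash
  otherSource: sourceOf("homeboy/evidence"),             // invented tool id
}
```

With this pattern the legacy dotted id from the old README no longer validates, which suggests existing registrations would need renaming. Slash-less names also fail the pattern, so the `"client"` fallback in the prefix rule appears to matter only before that check runs.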