
OpenShell VoiceClaw support: validate sandbox contracts and track networking follow-ons #6207

@jyaunches

Problem statement

Parent issue #5998 removes VoiceClaw's NemoClaw fork by making its OpenClaw plugin installable and durable. Sibling issue #6201 tracks the OpenClaw plugin/runtime/provider boundary. This issue tracks only the OpenShell boundary: current-contract validation, confirmed defects, and longer-term sandbox networking capabilities.

The current VoiceClaw POC keeps raw audio on the host. The sandbox sends text/control traffic through an exact OpenShell policy to a host audio bridge; that bridge owns the binary WebSocket speech/persona connections. Therefore no new OpenShell feature blocks the #5998 managed-plugin MVP unless current-version validation finds a regression.

The older VoiceClaw platform brief also asked for binary media egress, webhook ingress, split-horizon egress, and injected-environment durability. Several premises in that brief predate current OpenShell behavior, so the first task is to distinguish an existing contract from a real upstream gap.

Requirements and likely OpenShell alignment

Requirement: Preserve exact sandbox-to-host HTTP access (host.openshell.internal, destination/port, binary, method/path, and credential controls) across restart, rebuild, re-onboard, and supported drivers
Likely OpenShell alignment: Strong; existing contract, validation only
Basis and proposed framing: OpenShell already keeps the agent behind the nested policy proxy while giving the Docker supervisor host networking and the stable host alias (OpenShell PR #1080). Private destinations can be admitted through narrow policy rather than a bypass (OpenShell PR #60). File a defect only if the current pinned version fails the contract.

Requirement: Carry the POC's fixed-destination binary WSS audio streams
Likely OpenShell alignment: Strong for the existing relay; weak evidence for a new primitive
Basis and proposed framing: OpenShell merged post-upgrade WebSocket relay support (PR #718); a later minimal repro carried binary frames up to 256 KiB byte-for-byte and closed as non-actionable (issue #760). Current policy also documents that binary WebSocket frames are relayed but not rewritten. Test WSS through a bare/raw L4 endpoint first. Request a first-class WEBSOCKET_BINARY action or frame/rate limits only if raw relay is insufficient and the additional enforcement contract is concrete.

Requirement: Add a destination-allowlisted UDP/RTP/WebRTC media lane with codec, bandwidth, and concurrency limits
Likely OpenShell alignment: Uncertain / RFC-level; not a current POC requirement
Basis and proposed framing: The POC does not put WebRTC, RTP, or UDP inside the sandbox. OpenShell's security model deliberately forces ordinary egress through its TCP CONNECT/policy path, and it has treated uncontrolled DNS/UDP as an exfiltration boundary (issue #1169). A generic socket escape is unlikely to align. Revisit only with a protocol-specific design that remains deny-by-default and proves why WSS cannot meet the use case.

Requirement: Publish future Twilio/Vonage-style webhook endpoints with TLS, caller authorization/HMAC verification, rate/IP controls, and a sandbox-scoped target
Likely OpenShell alignment: Strong architectural alignment; partial implementation already exists
Basis and proposed framing: OpenShell has merged persisted, gateway-owned sandbox service exposure with HTTP/WebSocket routing (PR #1101), and the canonical ingress roadmap explicitly favors gateway authorization plus a supervisor relay to a declared sandbox target (issue #994). Extend that model instead of proposing raw inbound sockets. Keep OpenShell provider-neutral; provider-specific HMAC semantics may belong in the plugin/host integration unless a generic gateway verifier is agreed. This is future carrier work, not an MVP blocker.

Requirement: Make allowlisted upstream egress work on split-horizon/corporate DNS or an enterprise forward proxy
Likely OpenShell alignment: Medium-high for a policy-preserving upstream connector; reproduce the exact path first
Basis and proposed framing: OpenShell has already triaged supervisor support for corporate HTTP_PROXY/HTTPS_PROXY as a valid enterprise gap requiring a security spike (issue #1792). Docker supervisors use host networking (PR #1080), and private answers have a constrained policy path (PR #60), so VoiceClaw's older failure may already be driver/version-specific. Ask for resolver/upstream-dial composition that preserves NO_PROXY, destination, DNS/SSRF, binary, L7, credential, and CONNECT-chain controls, not a bypass around the OpenShell proxy.

Requirement: Provide a declarative, policy-bound supervisor proxy to a host-local service as a fallback
Likely OpenShell alignment: Medium; there is an almost exact open proposal but no maintainer commitment
Basis and proposed framing: OpenShell issue #1633 proposes generalizing the inference.local shape so a declared endpoint reuses the existing CONNECT listener and reaches supervisor-side loopback with normal L7 policy. That is safer and more portable than VoiceClaw's 0.0.0.0:11435 Sonar proxy. Avoid a second listener: OpenShell closed that shape once its concrete downstream need disappeared (PR #1501). Prefer fixing native upstream routing; use #1633 only when a real host-local broker remains necessary.

Requirement: Reapply the complete OpenShell-injected proxy/TLS environment to every OpenShell-owned agent launch, exec, and connect child
Likely OpenShell alignment: Strong; likely already aligned, so reproduce ownership before filing
Basis and proposed framing: OpenShell has repeatedly accepted small additive trust-store fixes, including GIT_SSL_CAINFO (PR #918) and DENO_CERT (PR #1441). Current source derives NODE_EXTRA_CA_CERTS, DENO_CERT, SSL_CERT_FILE, REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, and GIT_SSL_CAINFO from one helper for restricted children. Validate Node, curl, Python, and git after an OpenClaw gateway respawn and a fresh openshell sandbox connect. If OpenClaw or an SDK replaces its inherited environment, OpenShell is likely to redirect the fix downstream, as it did for MCP grandchildren (issue #886).

What OpenShell activity suggests

  • Maintainers accept integration capabilities when they are generic, tied to a concrete consumer, policy-gated, fail-closed, documented, and covered end to end. Private-IP policy (PR #60), messaging credential rewrite (PR #1286), and gateway-owned service exposure (PR #1101) follow that pattern.
  • They avoid expanding the trusted/listening surface without a durable need. PR #1501 was closed after its downstream blocker disappeared, specifically to avoid an additional listener.
  • They expect a minimal, account-independent reproduction before treating behavior as an OpenShell defect. Issue #760 was closed after its binary echo test proved the relay correct; conversely, focused failing tests are acted on quickly (for example issue #1878 and PR #1880).
  • They prefer generic hooks over integration-specific semantics. The current extensibility roadmap keeps common protocol enforcement in core and routes custom payload validation toward proxy middleware or gateway interceptors.
  • Upstream feature proposals need a design rather than a request to "please build this," agent diagnostic evidence, tests/docs, DCO sign-off, and—for a first external PR—the project's vouch workflow (CONTRIBUTING.md).

Validation sequence

  1. Prove the existing sandbox-to-host REST contract to the VoiceClaw audio bridge after rebuild/re-onboard on each supported driver.
  2. Put a TLS/WSS wrapper on the fixed host speech endpoints and run a binary frame integrity/latency smoke test through OpenShell's existing raw L4 policy path.
  3. Reproduce split-horizon egress directly through OpenShell and compare it with a host-origin request; record the driver, resolved address, policy decision, and upstream dial result.
  4. Probe Node, curl, Python, and git TLS trust before and after gateway respawn and from a new openshell sandbox connect session.
  5. Exercise current openshell service expose behavior with a synthetic signed webhook and record which layer owns TLS termination, caller authentication/HMAC validation, routing, and sandbox policy.
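Step 4 above (and the last row of the alignment table) asks for the injected trust-store environment to be checked in each restricted child before and after a gateway respawn. The following added example is not OpenShell's or VoiceClaw's code; it is a probe that any practitioner can run inside the child process. It checks that the six variables the issue names are all set, agree on one bundle path, and point at a readable PEM bundle, and it diffs two snapshots so drift after a respawn shows up as a named variable. The proxy variable names, the sample bundle and the in-memory reader used below are constructed for illustration.

```python
"""Trust-store environment probe for OpenShell-owned child processes.
Added example, not project code. Run it in the child (e.g. via the
interpreter the agent already uses) and compare snapshots across a
gateway respawn or a fresh `openshell sandbox connect` session."""
import os
import re
from typing import Callable, Dict, List, Optional

# Variables the issue lists as derived from one helper for restricted children.
TRUST_VARS = ("NODE_EXTRA_CA_CERTS", "DENO_CERT", "SSL_CERT_FILE",
              "REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE", "GIT_SSL_CAINFO")
# Assumed proxy variables; reported for diffing only, not treated as required.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")

PEM_CERT = re.compile(rb"-----BEGIN CERTIFICATE-----\s.*?-----END CERTIFICATE-----", re.S)

# Constructed sample bundle (not a real certificate) used by the demo/test.
SAMPLE_BUNDLE = (b"-----BEGIN CERTIFICATE-----\n"
                 b"MIIBconstructedNotARealCertificateBody==\n"
                 b"-----END CERTIFICATE-----\n")


def read_file(path: str) -> Optional[bytes]:
    """Default reader: bytes of the file, or None when it cannot be opened."""
    try:
        with open(path, "rb") as fh:
            return fh.read()
    except OSError:
        return None


def probe_trust_env(env: Dict[str, str],
                    reader: Callable[[str], Optional[bytes]] = read_file) -> Dict[str, object]:
    """Check one environment mapping; return a report with ok/problems.

    `env` is a plain mapping so snapshots can be probed offline; `reader`
    is injectable so the bundle check can run against in-memory data."""
    problems: List[str] = []
    problems += [f"missing: {v}" for v in TRUST_VARS if not env.get(v)]

    # Every trust variable should resolve to the same bundle path.
    paths = sorted({env[v] for v in TRUST_VARS if env.get(v)})
    if len(paths) > 1:
        problems.append("bundle path mismatch: " + ", ".join(paths))

    # Each referenced bundle must be readable and contain at least one PEM cert.
    cert_count = 0
    for path in paths:
        data = reader(path)
        if data is None:
            problems.append(f"unreadable bundle: {path}")
            continue
        found = len(PEM_CERT.findall(data))
        if found == 0:
            problems.append(f"no PEM certificates in: {path}")
        cert_count += found

    return {
        "ok": not problems,
        "problems": problems,
        "bundle_paths": paths,
        "cert_count": cert_count,
        "proxy_env": {v: env.get(v) for v in PROXY_VARS},
    }


def diff_snapshots(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """List trust/proxy variables whose value changed or disappeared."""
    changes = []
    for v in TRUST_VARS + PROXY_VARS:
        if before.get(v) != after.get(v):
            changes.append(f"{v}: {before.get(v)!r} -> {after.get(v)!r}")
    return changes


def snapshot_current_env() -> Dict[str, str]:
    """Capture only the variables of interest from the live process."""
    return {v: os.environ[v] for v in TRUST_VARS + PROXY_VARS if v in os.environ}


if __name__ == "__main__":
    report = probe_trust_env(snapshot_current_env())
    print("ok" if report["ok"] else "PROBLEMS", report)
```

Take one snapshot with `snapshot_current_env()` before the respawn, write it to disk, and feed it together with the fresh snapshot to `diff_snapshots()`; an empty diff plus an `ok` report is the evidence the acceptance criteria ask for, and a named variable in the diff points at the layer that dropped or rewrote it.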

Only confirmed gaps should become separate OpenShell issues. Each upstream issue should have one primitive, a current-version minimal repro, explicit security invariants, and an end-to-end test plan. Ingress follow-up should join the design in OpenShell #994 rather than duplicate it.
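Step 2 of the sequence (and the second row of the alignment table) calls for a byte-for-byte binary frame integrity test through the raw L4 relay, in the spirit of the OpenShell #760 repro that carried frames up to 256 KiB. The following added example is not VoiceClaw or OpenShell code. It provides an RFC 6455 binary-frame encoder/decoder so the comparison harness runs offline, a deterministic payload generator standing in for captured audio, and a checker that reports the first differing offset rather than a bare pass/fail. In a real run the encoded frames would be sent to the TLS/WSS-wrapped host speech endpoint and the echoed frames decoded here; the WebSocket handshake itself is out of scope and assumed to be handled by the client library in use.

```python
"""Binary WebSocket frame integrity harness (added example, not project code).
The codec implements RFC 6455 framing for FIN binary frames with 7/16/64-bit
length encoding and client-side masking; `local_roundtrip` is an offline
stand-in for the relay so the harness can be exercised without a server."""
import hashlib
import os
import struct
from typing import Dict, Optional, Tuple

OPCODE_BINARY = 0x2


def encode_binary_frame(payload: bytes, mask_key: Optional[bytes] = None) -> bytes:
    """Build one FIN binary frame. Client-to-server frames must be masked
    (RFC 6455 section 5.3), so a random 4-byte key is used when none is given."""
    if mask_key is None:
        mask_key = os.urandom(4)
    if len(mask_key) != 4:
        raise ValueError("mask key must be exactly 4 bytes")
    header = bytearray([0x80 | OPCODE_BINARY])  # FIN=1, opcode=binary
    n = len(payload)
    if n < 126:
        header.append(0x80 | n)                 # MASK bit + 7-bit length
    elif n < (1 << 16):
        header.append(0x80 | 126)
        header += struct.pack("!H", n)          # 16-bit extended length
    else:
        header.append(0x80 | 127)
        header += struct.pack("!Q", n)          # 64-bit extended length
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return bytes(header) + mask_key + masked


def decode_frame(data: bytes) -> Tuple[int, bytes, bool]:
    """Parse one frame and return (opcode, payload, fin), unmasking if needed."""
    if len(data) < 2:
        raise ValueError("truncated header")
    fin = bool(data[0] & 0x80)
    opcode = data[0] & 0x0F
    masked = bool(data[1] & 0x80)
    n = data[1] & 0x7F
    pos = 2
    if n == 126:
        (n,) = struct.unpack("!H", data[pos:pos + 2])
        pos += 2
    elif n == 127:
        (n,) = struct.unpack("!Q", data[pos:pos + 8])
        pos += 8
    key = b""
    if masked:
        key = data[pos:pos + 4]
        pos += 4
    payload = data[pos:pos + n]
    if len(payload) != n:
        raise ValueError("truncated payload")
    if masked:
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return opcode, payload, fin


def make_probe_payload(size: int, seed: int = 0) -> bytes:
    """Deterministic pseudo-random bytes (constructed stand-in for audio) so a
    corrupted relay is caught even if it preserves length and byte histogram."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{seed}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])


def check_relay_integrity(sent: bytes, received: bytes) -> Dict[str, object]:
    """Compare sent vs echoed bytes; report first differing offset and lengths."""
    first_diff = next((i for i, (a, b) in enumerate(zip(sent, received)) if a != b), None)
    if first_diff is None and len(sent) != len(received):
        first_diff = min(len(sent), len(received))  # one side was truncated/padded
    return {
        "ok": first_diff is None,
        "sent_len": len(sent),
        "received_len": len(received),
        "first_diff_offset": first_diff,
        "digest_match": hashlib.sha256(sent).digest() == hashlib.sha256(received).digest(),
    }


def local_roundtrip(size: int) -> Dict[str, object]:
    """Encode -> decode -> compare, covering every length-encoding branch."""
    payload = make_probe_payload(size)
    opcode, echoed, fin = decode_frame(encode_binary_frame(payload))
    report = check_relay_integrity(payload, echoed)
    report["binary_fin_frame"] = (opcode == OPCODE_BINARY and fin)
    return report


if __name__ == "__main__":
    for size in (0, 125, 126, 65535, 65536, 256 * 1024):
        print(size, local_roundtrip(size))
```

The sizes in the `__main__` loop sit on the 7-bit, 16-bit and 64-bit length boundaries plus the 256 KiB figure from the upstream repro; when driving a real endpoint, keep `make_probe_payload()` for the send side and feed the decoded echo to `check_relay_integrity()` so that a failure report names an offset a maintainer can act on.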

Ownership corrections

Acceptance criteria

  • The five current-version validation probes above have reproducible evidence against the NemoClaw-pinned OpenShell release.
  • Existing OpenShell capabilities are recorded as validation contracts rather than filed as duplicate feature requests.
  • Every surviving OpenShell defect has a focused upstream issue or linked fix with a minimal repro and security-preserving acceptance test.
  • Any new binary-media proposal states exactly which protocol and enforcement are missing and why the existing WSS/raw-L4 path is insufficient.
  • Any webhook work extends OpenShell #994's gateway-owned service model and separates generic routing/auth controls from provider/plugin-specific behavior.
  • #5998's managed-plugin MVP remains unblocked by optional media/ingress platform work.

Not in this issue

Labels

  • area: networking (DNS, proxy, TLS, ports, host aliases, or connectivity)
  • area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery)
  • needs: design (Requires product or architecture direction)
