7 changes: 7 additions & 0 deletions .changeset/device-auth-default.md
@@ -0,0 +1,7 @@
---
"a2go": patch
---

fix: enable OpenClaw device pairing by default

Changed the `A2GO_DISABLE_DEVICE_AUTH` default from `true` to `false` so device pairing is enabled out of the box. Users who need headless/automated access can skip pairing by setting `A2GO_DISABLE_DEVICE_AUTH=true`. Documented the env var and the device pairing flow in the README, the RunPod template readme, and the add-model skill.
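
As an illustration of what the new default means in practice, here is a minimal shell sketch of the `${VAR:-default}` fallback that the entrypoint relies on. The helper name `resolve_device_auth` is hypothetical and exists only for this example; it is not part of a2go.

```shell
# Hypothetical helper (not part of a2go): prints the effective setting.
resolve_device_auth() {
    # ${VAR:-false} falls back to "false" when the variable is unset OR empty,
    # so pairing stays enabled unless the user explicitly sets "true".
    printf '%s\n' "${A2GO_DISABLE_DEVICE_AUTH:-false}"
}

# Nothing set: the default applies and device pairing remains enabled.
unset A2GO_DISABLE_DEVICE_AUTH
resolve_device_auth

# Explicit opt-out for headless use (run in a subshell so it does not leak).
(A2GO_DISABLE_DEVICE_AUTH=true; resolve_device_auth)
```

Because `:-` also covers the empty string, an accidentally blank `A2GO_DISABLE_DEVICE_AUTH=` behaves like the default rather than disabling pairing.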
5 changes: 5 additions & 0 deletions .changeset/hermes-api-key-fix.md
@@ -0,0 +1,5 @@
---
"a2go": patch
---

Fix Hermes gateway failing to start when `A2GO_AUTH_TOKEN` is a short or placeholder value. Hermes rejects weak API keys when binding to 0.0.0.0 — the entrypoint now auto-generates a secure 32-byte hex key and displays it in the ready banner. Also fix the LFM2.5-Audio plugin to use `llama-server` instead of the removed `llama-liquid-audio-server` binary (audio support merged upstream in llama.cpp b8967 via mtmd).
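
The key-replacement rule described above can be sketched roughly as follows. This is a simplified illustration based on the check in `scripts/entrypoint-unified.sh`, not a copy of it: the function names `is_weak_key` and `ensure_strong_key` are hypothetical, and the placeholder word list is abbreviated.

```shell
# Hypothetical helper: succeeds (exit 0) when the key looks weak.
is_weak_key() {
    # Anything shorter than 32 characters counts as weak.
    [ "${#1}" -lt 32 ] && return 0
    # So does a key that starts with a well-known placeholder word
    # (case-insensitive, abbreviated list for this sketch).
    printf '%s\n' "$1" | grep -qiE '^(changeme|test|dummy|password|secret|placeholder)' && return 0
    return 1
}

# Hypothetical helper: keeps a strong key, replaces a weak one.
ensure_strong_key() {
    if is_weak_key "$1"; then
        # 32 random bytes, printed as 64 hex characters.
        openssl rand -hex 32
    else
        printf '%s\n' "$1"
    fi
}

# The default token is a placeholder, so a fresh key is generated when openssl is installed.
if command -v openssl >/dev/null 2>&1; then
    ensure_strong_key "changeme"
fi
```

Note that the placeholder pattern is anchored only at the start of the string, so a long token that merely begins with one of these words is also replaced.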
2 changes: 1 addition & 1 deletion .claude/skills/add-model/SKILL.md
@@ -158,7 +158,7 @@ Hermes is the agent framework. It manages tools internally (browser, terminal, s
If the pod was deployed with `"agent":"openclaw"`, OpenClaw serves a web UI on port 18789. Use the `/agent-browser` skill to test end-to-end:

1. **Open the agent UI** — navigate to `https://{pod-id}-18789.proxy.runpod.net`
2. **Device pairing** — approve the device when prompted
2. **Device pairing** — approve the device via SSH: `openclaw devices list` then `openclaw devices approve <id>` (unless `A2GO_DISABLE_DEVICE_AUTH=true` is set)
3. **Send a chat message** — verify the agent responds correctly
4. **Test tool calling in the UI** — ask the agent to perform a task that requires tools
5. **Test image generation** (if image service is running) — ask the agent to generate an image
2 changes: 1 addition & 1 deletion Dockerfile.unified
@@ -47,7 +47,7 @@ RUN python3 -m venv /opt/engines/pytorch/venv && \
sdnq git+https://github.com/huggingface/diffusers.git \
transformers accelerate safetensors \
qwen-tts liquid-audio soundfile \
fastapi "uvicorn[standard]" httpx && \
fastapi "uvicorn[standard]" httpx python-multipart && \
/opt/engines/pytorch/venv/bin/pip install --no-cache-dir \
flash-attn --no-build-isolation || true && \
find /opt/engines/pytorch/venv -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null; \
22 changes: 21 additions & 1 deletion README.md
@@ -26,7 +26,26 @@ a2go helps you run open-source AI models on your own hardware — locally, on a

On Runpod the URL is `https://<pod-id>-18789.proxy.runpod.net/?token=<A2GO_AUTH_TOKEN>`.

First time: approve device pairing when prompted (SSH into the machine, run `openclaw devices list` then `openclaw devices approve <requestId>`).
First time: approve device pairing when prompted — see [Device pairing](#device-pairing).

### Device pairing

OpenClaw requires you to approve each new browser before it can chat. This prevents unauthorized access even if someone discovers your pod URL and token.

When you open the Web UI for the first time, you'll see a pairing request. SSH into the machine and approve it:

```bash
openclaw devices list # shows pending requests
openclaw devices approve <id> # approve the device
```

To skip device pairing entirely (e.g. for automated/headless setups), set the environment variable:

```
A2GO_DISABLE_DEVICE_AUTH=true
```

This is less secure — anyone with your token can connect without approval.

## Docker (Linux / Windows / Runpod)

@@ -51,6 +70,7 @@ Models download on first start and persist on the volume.
| `A2GO_CONFIG` | JSON config — models to load | `{}` (auto-detect) |
| `A2GO_AUTH_TOKEN` | Web UI + API auth token | `changeme` |
| `A2GO_API_KEY` | LLM API key (OpenAI-compatible endpoint) | `changeme` |
| `A2GO_DISABLE_DEVICE_AUTH` | Skip OpenClaw device pairing (`true`/`false`). When `false`, new browsers must be approved via SSH — see [Device pairing](#device-pairing). | `false` |
| `TELEGRAM_BOT_TOKEN` | Enable Telegram bot integration | — |
| `GITHUB_TOKEN` | GitHub auth for Claude Code | — |

2 changes: 1 addition & 1 deletion scripts/a2go-media-server
@@ -70,7 +70,7 @@ def build_app(config: dict, port: int, llm_url: str, web_root: str) -> FastAPI:
import traceback
print(f"[MediaServer] ERROR: plugin '{plugin_id}' failed to load: {exc}")
traceback.print_exc()
print(f"[MediaServer] Skipping '{engine_id}', continuing with remaining plugins")
print(f"[MediaServer] Skipping '{plugin_id}', continuing with remaining plugins")

# ── Aggregated health (probes LLM + gateway + media plugins) ──
# Registered before backward-compatible flat routes so that plugin /health
25 changes: 22 additions & 3 deletions scripts/entrypoint-unified.sh
@@ -180,7 +180,7 @@ if [ -n "${RUNPOD_POD_ID:-}" ]; then
else
A2GO_ALLOWED_ORIGINS_JSON='[]'
fi
A2GO_DISABLE_DEVICE_AUTH="${A2GO_DISABLE_DEVICE_AUTH:-true}"
A2GO_DISABLE_DEVICE_AUTH="${A2GO_DISABLE_DEVICE_AUTH:-false}"

BOT_CMD="openclaw"
if [ "$AGENT" = "openclaw" ] && ! command -v "$BOT_CMD" >/dev/null 2>&1; then
@@ -921,14 +921,26 @@ EOF
# Start Hermes gateway (API server on port 8642, public — protected by
# API_SERVER_KEY; needed for external API access and platform webhooks
# like Telegram/Discord/WhatsApp)
#
# Hermes rejects "placeholder" API keys (short strings, common words)
# when binding to 0.0.0.0. Generate a proper hex key if the user's
# A2GO_AUTH_TOKEN would be blocked.
HERMES_API_KEY="$A2GO_AUTH_TOKEN"
if [ ${#HERMES_API_KEY} -lt 32 ] || echo "$HERMES_API_KEY" | grep -qiE '^(changeme|test|dummy|password|secret|placeholder|a2go-local-changeme)'; then
HERMES_API_KEY="$(openssl rand -hex 32)"
echo "Note: Generated secure API key for Hermes gateway (original token too short/placeholder)."
echo " Hermes API key: $HERMES_API_KEY"
fi
echo "$HERMES_API_KEY" > /tmp/oc_hermes_api_key

echo ""
echo "Starting Hermes gateway..."
OPENAI_API_KEY="$A2GO_API_KEY" \
OPENAI_BASE_URL="http://localhost:${LLM_PORT}/v1" \
API_SERVER_ENABLED=true \
API_SERVER_PORT=8642 \
API_SERVER_HOST=0.0.0.0 \
API_SERVER_KEY="$A2GO_AUTH_TOKEN" \
API_SERVER_KEY="$HERMES_API_KEY" \
"$HERMES_CMD" gateway run &
GATEWAY_PID=$!
;;
@@ -956,12 +968,19 @@ gpu_vram = data.get('gpuDetected', {}).get('vramMb', 0) or data.get('gpu', {}).g
print(f\"VRAM: {' + '.join(parts)} = ~{total // 1000}GB / {gpu_vram // 1000}GB\")
" 2>/dev/null || echo "VRAM: see profile")"

HERMES_KEY_FILE="/tmp/oc_hermes_api_key"
HERMES_KEY_INFO=""
if [ -f "$HERMES_KEY_FILE" ]; then
HERMES_KEY_INFO="Hermes API key: $(cat "$HERMES_KEY_FILE")"
fi

echo ""
oc_print_ready "LLM API" "$LLM_MODEL_NAME" "$LLM_CONTEXT tokens" "token" \
"$VRAM_SUMMARY" \
"Profile: $PROFILE_NAME ($PROFILE_ID)" \
"Media UI (local): http://localhost:${MEDIA_SERVER_PORT}" \
"${MEDIA_PROXY_URL:+Media UI (public): ${MEDIA_PROXY_URL}}"
"${MEDIA_PROXY_URL:+Media UI (public): ${MEDIA_PROXY_URL}}" \
"${HERMES_KEY_INFO}"

# Print service details
if [ -n "$MEDIA_PID" ]; then
20 changes: 12 additions & 8 deletions scripts/media_plugins/audio_lfm2.py
@@ -1,8 +1,11 @@
"""
LFM2.5-Audio plugin — TTS and STT via native llama-liquid-audio-server (GGUF).
LFM2.5-Audio plugin — TTS and STT via llama-server with mtmd audio support (GGUF).

Spawns llama-liquid-audio-server as a subprocess, loading 4 GGUF files (~2GB VRAM).
Spawns llama-server as a subprocess, loading 4 GGUF files (~2GB VRAM).
Proxies OpenAI-compatible TTS/STT requests through the native binary.

Note: Prior to llama.cpp b8967 this used a separate llama-liquid-audio-server
binary. Its audio support was merged upstream into llama-server via the mtmd library.
"""

import base64
@@ -36,7 +39,7 @@


class LFM2AudioPlugin(MediaPlugin):
"""LFM2.5-Audio TTS/STT via native GGUF binary (~2GB VRAM)."""
"""LFM2.5-Audio TTS/STT via llama-server with mtmd audio (~2GB VRAM)."""

name = "lfm2-audio"
role = "audio"
@@ -69,6 +72,7 @@ def load_model(self, config: dict) -> None:
"-ngl", "999",
"--host", "127.0.0.1",
"--port", str(SUBPROCESS_PORT),
"--jinja",
]
if files.get("vocoder"):
cmd.extend(["-mv", files["vocoder"]])
@@ -90,11 +94,11 @@ def load_model(self, config: dict) -> None:
print(f"[Audio] Native server ready (PID {self._process.pid})")

def _find_binary(self) -> str:
"""Find llama-liquid-audio-server binary."""
path = "/opt/engines/a2go-llamacpp/bin/llama-liquid-audio-server"
"""Find llama-server binary (audio support merged via mtmd in b8967)."""
path = "/opt/engines/a2go-llamacpp/bin/llama-server"
if os.path.isfile(path) and os.access(path, os.X_OK):
return path
raise FileNotFoundError(f"llama-liquid-audio-server not found at {path}")
raise FileNotFoundError(f"llama-server not found at {path}")

def _find_gguf_files(self, model_dir: str) -> dict:
"""Locate the 4 GGUF files in the model directory."""
@@ -129,15 +133,15 @@ def _wait_for_ready(self, timeout: int = 120) -> None:
except Exception:
out = "(no log)"
raise RuntimeError(
f"llama-liquid-audio-server exited with code {self._process.returncode}\n{out}"
f"llama-server (audio) exited with code {self._process.returncode}\n{out}"
)
try:
self._client.get(f"{self._base_url}/", timeout=2.0)
return
except (httpx.ConnectError, httpx.ReadTimeout):
pass
time.sleep(2)
raise TimeoutError(f"llama-liquid-audio-server not ready after {timeout}s")
raise TimeoutError(f"llama-server (audio) not ready after {timeout}s")

def health(self) -> dict:
alive = self._process is not None and self._process.poll() is None
2 changes: 1 addition & 1 deletion scripts/validate-openclaw-config.mjs
@@ -90,7 +90,7 @@ function generateDockerConfig() {
...base.gateway,
controlUi: {
allowedOrigins: ['https://test-pod-18789.proxy.runpod.net'],
dangerouslyDisableDeviceAuth: true,
dangerouslyDisableDeviceAuth: false,
},
},
}
15 changes: 15 additions & 0 deletions templates/runpod/readme.md
@@ -38,6 +38,21 @@ Each agent supports additional environment variables for integrations like Teleg
- [OpenClaw documentation](https://docs.openclaw.ai/getting-started)
- [Hermes documentation](https://hermes-agent.nousresearch.com/docs)

### Device pairing (OpenClaw)

OpenClaw requires you to approve each new browser before it can chat. When you open the Web UI for the first time, you'll see a pairing request. SSH into the pod and approve it:

```bash
openclaw devices list # shows pending requests
openclaw devices approve <id> # approve the device
```

To skip device pairing (e.g. for automated/headless setups), add this environment variable:

| Variable | Value | Effect |
|----------|-------|--------|
| `A2GO_DISABLE_DEVICE_AUTH` | `true` | Skips device pairing — anyone with your token can connect without approval |

### 3. Deploy

Hit deploy. The pod will automatically download your selected models and start all services. First boot takes a few minutes depending on model size. Subsequent starts use the cached models on your volume.