gh-aw compiler: ARC/DinD topology requires daemon-visible path redirections #42807

Description

@lpcox

Summary

When compiling workflows with topology: arc-dind frontmatter, the gh-aw compiler must redirect all file paths to daemon-visible locations. In ARC/DinD, the runner container and Docker daemon (DinD sidecar) have separate filesystems — they share only the workspace volume (/home/runner/_work/). Files at /tmp, /usr/local/bin, and most of /home/runner are invisible to the Docker daemon and cannot be bind-mounted into the agent container.

The canonical daemon-visible base path is ${RUNNER_TEMP} (/home/runner/_work/_temp), which is under the shared workspace volume.
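As a rough illustration of the rule above, the following sketch classifies a path by whether it sits under `${RUNNER_TEMP}`. The helper name `is_daemon_visible` is hypothetical, and the check is purely lexical (it does not resolve symlinks or `..` segments), so treat it as a sanity check rather than a guarantee.

```shell
# Default taken from the issue text; a real runner sets RUNNER_TEMP itself.
RUNNER_TEMP="${RUNNER_TEMP:-/home/runner/_work/_temp}"

# is_daemon_visible PATH (hypothetical helper): succeed only when PATH is
# RUNNER_TEMP itself or lies underneath it.
is_daemon_visible() {
  case "$1" in
    "${RUNNER_TEMP}"|"${RUNNER_TEMP}/"*) return 0 ;;
    *) return 1 ;;
  esac
}

for p in /usr/local/bin/copilot "${RUNNER_TEMP}/gh-aw/bin/copilot"; do
  if is_daemon_visible "$p"; then
    echo "daemon-visible:     $p"
  else
    echo "not daemon-visible: $p"
  fi
done
```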

Security invariant: ARC/DinD must provide identical protections to non-ARC runners. The split-filesystem architecture requires different mechanisms to achieve the same security properties, but the agent's effective capabilities (write surface, credential access, network isolation, audit integrity) must be equivalent.

Validated via canary

All changes below were validated end-to-end on bbq-beets-four-nines/agentic-workflows-canary PR #1559 (branch fix/arc-dind-lock-yml-v0.27.15) using AWF v0.27.20 on ARC runners. Copilot CLI successfully starts and executes workloads.

Prerequisites

Bump firewall version to v0.27.20

The compiler must use AWF v0.27.20 or later for topology: arc-dind. Earlier versions have bugs in sysroot volume/mount handling that prevent the agent container from starting. The fixes shipped across three PRs.

Required Compiler Changes

1. Redirect all gh-aw artifacts to ${RUNNER_TEMP}/gh-aw/

| Artifact | Current location | Required location |
| --- | --- | --- |
| Tool cache | /tmp/gh-aw/tool-cache or RUNNER_TOOL_CACHE | ${RUNNER_TEMP}/gh-aw/tool-cache |
| Copilot CLI binary | /usr/local/bin/copilot | ${RUNNER_TEMP}/gh-aw/bin/copilot (copy step) |
| Prompts | /tmp/gh-aw/aw-prompts/ | ${RUNNER_TEMP}/gh-aw/aw-prompts/ (copy step) |
| Node binary | setup-node install path | ${RUNNER_TEMP}/gh-aw/tool-cache/node/ (copy step) |

Why: /tmp, /usr/local/bin, and /home/runner/_work/_tool are all invisible to the Docker daemon in split-fs mode.
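The "copy step" entries could look roughly like the sketch below. The helpers `stage_file` and `stage_dir` are hypothetical names, and the fallback to a scratch directory exists only so the sketch runs outside a runner. The Node copy is omitted because the setup-node install path varies.

```shell
set -eu

# Fall back to a scratch directory so the sketch also runs outside a runner.
RUNNER_TEMP="${RUNNER_TEMP:-$(mktemp -d)}"
GH_AW_ROOT="${RUNNER_TEMP}/gh-aw"

# stage_file SRC DEST: copy one file, creating DEST's parent directory first.
stage_file() {
  mkdir -p "$(dirname "$2")"
  cp -p "$1" "$2"
}

# stage_dir SRC DEST: copy the contents of SRC (dotfiles included) into DEST.
stage_dir() {
  mkdir -p "$2"
  cp -R "$1/." "$2/"
}

# Copy each artifact only if it exists at its non-ARC location.
if [ -x /usr/local/bin/copilot ]; then
  stage_file /usr/local/bin/copilot "${GH_AW_ROOT}/bin/copilot"
fi
if [ -d /tmp/gh-aw/aw-prompts ]; then
  stage_dir /tmp/gh-aw/aw-prompts "${GH_AW_ROOT}/aw-prompts"
fi
```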

2. Mount strategy: ro base + rw overlays

Use overlapping mounts — Docker applies the most-specific path, so rw subdirs override the ro parent:

--mount "${RUNNER_TEMP}/gh-aw:${RUNNER_TEMP}/gh-aw:ro" \
--mount "${RUNNER_TEMP}/gh-aw/home:${RUNNER_TEMP}/gh-aw/home:rw" \
--mount "${RUNNER_TEMP}/gh-aw/sandbox/agent:${RUNNER_TEMP}/gh-aw/sandbox/agent:rw"

Parity with non-ARC protections

On non-ARC runners, AWF achieves isolation via separate filesystem paths with distinct ownership and permissions — the agent never has write access to system prompt files, harness scripts, safe-output configs, or firewall logs. The ARC/DinD mount strategy must replicate these same properties:

| Protection | Non-ARC mechanism | ARC/DinD mechanism |
| --- | --- | --- |
| Prompt integrity | Separate path, different owner | ro mount |
| Binary integrity | Installed to /usr/local/bin (root-owned) | ro mount of bin/ |
| Safe outputs config | Written by runner, agent accesses via MCP gateway only | ro mount of safeoutputs/ |
| Firewall audit logs | Written by squid/api-proxy (different UID), agent can't modify | ro mount of sandbox/firewall/ |
| Credential isolation | API proxy injects tokens agent never sees | Same (unchanged) |
| Network isolation | Squid domain allowlist + iptables | Same (unchanged) |
| Agent write surface | HOME dot-dirs + workspace + /tmp | HOME (rw mount) + workspace + agent log dir |

A blanket rw mount would break parity by giving the agent write access to paths it cannot reach on non-ARC runners.

What each mount provides

| Path | Mode | Contents | Rationale |
| --- | --- | --- | --- |
| ${RUNNER_TEMP}/gh-aw/ (base) | ro | bin/, actions/, aw-prompts/, safeoutputs/, sandbox/firewall/ | Protects executables, config, audit logs from tampering |
| ${RUNNER_TEMP}/gh-aw/home/ | rw | Copilot HOME (.cache, .config, .copilot, etc.) | SEA extraction, auth, tool state |
| ${RUNNER_TEMP}/gh-aw/sandbox/agent/ | rw | Copilot's own log files (logs/) | --log-dir target for Copilot CLI |

Not agent-writable (protected by ro parent):

  • safeoutputs/config.json, safeoutputs/tools.json — permission definitions
  • sandbox/firewall/logs/ — squid access logs (network audit trail)
  • sandbox/firewall/audit/ — api-proxy audit logs (API call records)
  • bin/copilot, actions/*.cjs — executables (prevents modification between harness retries)
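To make the "most-specific path wins" behavior concrete, here is a small lexical model of the three mounts. The function name `effective_mode` is hypothetical. It does not query Docker or AWF; it only mirrors the mount table as a pattern match, with the two rw subtrees checked before the ro base.

```shell
GH_AW_ROOT="${RUNNER_TEMP:-/home/runner/_work/_temp}/gh-aw"

# effective_mode PATH (hypothetical helper): print the mode a path would get
# under the three mounts. The rw subtrees are listed first so they take
# precedence over the ro base, mirroring most-specific-path resolution.
effective_mode() {
  case "$1" in
    "${GH_AW_ROOT}/home"|"${GH_AW_ROOT}/home/"*) echo rw ;;
    "${GH_AW_ROOT}/sandbox/agent"|"${GH_AW_ROOT}/sandbox/agent/"*) echo rw ;;
    "${GH_AW_ROOT}"|"${GH_AW_ROOT}/"*) echo ro ;;
    *) echo unmounted ;;
  esac
}

effective_mode "${GH_AW_ROOT}/safeoutputs/config.json"
effective_mode "${GH_AW_ROOT}/sandbox/firewall/logs/access.log"
effective_mode "${GH_AW_ROOT}/sandbox/agent/logs/session.log"
effective_mode "${GH_AW_ROOT}/home/.cache/sea"
```

The file names `access.log`, `session.log`, and `sea` are placeholders for illustration only.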

3. Set HOME inside user command

export HOME="${RUNNER_TEMP}/gh-aw/home"

  • /home/runner is read-only in sysroot/chroot mode because it comes from the sysroot base image
  • Copilot CLI needs writable HOME for:
    • SEA bundle extraction (~/.cache)
    • Config/auth data (~/.config)
    • Tool state (~/.copilot, ~/.local, etc.)
  • Single HOME redirect replaces the need for individual dot-directory mounts
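A possible prologue for the user command is sketched below. It assumes the dot-directories can simply be pre-created under the redirected HOME, and it adds a write probe (not part of the original proposal) so that a read-only HOME fails loudly instead of producing the silent crash described later in this issue.

```shell
# Fall back to a scratch directory so the sketch also runs outside a runner.
RUNNER_TEMP="${RUNNER_TEMP:-$(mktemp -d)}"
export HOME="${RUNNER_TEMP}/gh-aw/home"

# Pre-create the dot-directories named above (an assumption; the CLI may
# create them itself once HOME is writable).
mkdir -p "${HOME}/.cache" "${HOME}/.config" "${HOME}/.copilot" "${HOME}/.local"

# Probe writability up front and report a clear error if it fails.
if ! touch "${HOME}/.cache/.write-probe" 2>/dev/null; then
  echo "HOME is not writable: ${HOME}" >&2
  exit 1
fi
rm -f "${HOME}/.cache/.write-probe"
echo "HOME=${HOME}"
```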

4. Rewrite all paths in user command

All /tmp/gh-aw references in the user command must point to daemon-visible paths:

  • Binary: ${RUNNER_TEMP}/gh-aw/bin/copilot
  • --prompt-file ${RUNNER_TEMP}/gh-aw/aw-prompts/prompt.txt
  • --log-dir ${RUNNER_TEMP}/gh-aw/sandbox/agent/logs/
  • --add-dir ${RUNNER_TEMP}/gh-aw/
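One way to picture the rewrite is as a text filter over the generated command, as in this sketch. The function name `rewrite_paths` is hypothetical, and the compiler presumably performs the substitution on its own internal representation rather than with sed. The replacement text keeps `${RUNNER_TEMP}` literal so that it expands on the runner.

```shell
# rewrite_paths (hypothetical helper): read a command on stdin and emit it
# with non-daemon-visible paths replaced. The single quotes keep
# ${RUNNER_TEMP} unexpanded in the output.
rewrite_paths() {
  sed \
    -e 's|/usr/local/bin/copilot|${RUNNER_TEMP}/gh-aw/bin/copilot|g' \
    -e 's|/tmp/gh-aw|${RUNNER_TEMP}/gh-aw|g'
}

printf '%s\n' \
  '/usr/local/bin/copilot --prompt-file /tmp/gh-aw/aw-prompts/prompt.txt --add-dir /tmp/gh-aw/' \
  | rewrite_paths
```

The second pattern is a plain prefix match, so it would also rewrite an unrelated path such as /tmp/gh-awx; a real implementation would want a stricter boundary.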

5. AWF log/audit dirs under daemon-visible path

--proxy-logs-dir ${RUNNER_TEMP}/gh-aw/sandbox/firewall/logs
--audit-dir ${RUNNER_TEMP}/gh-aw/sandbox/firewall/audit

The default location, /tmp/awf-*/logs, is not daemon-visible, so post-run log access fails. The redirected directories are protected from agent modification by the ro parent mount (§2).
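A sketch of preparing the directory tree before AWF starts follows. Whether AWF creates missing directories on its own is not stated here, so pre-creating them on the runner side is an assumption. The subdirectory names come from the mount table in §2 and the flags above.

```shell
# Fall back to a scratch directory so the sketch also runs outside a runner.
RUNNER_TEMP="${RUNNER_TEMP:-$(mktemp -d)}"
GH_AW_ROOT="${RUNNER_TEMP}/gh-aw"

# Create the ro-base subdirectories and both rw mount points.
for d in bin actions aw-prompts safeoutputs home \
         sandbox/agent/logs sandbox/firewall/logs sandbox/firewall/audit; do
  mkdir -p "${GH_AW_ROOT}/${d}"
done

# Print the two flags pointing at the directories just created.
printf '%s\n' "--proxy-logs-dir ${GH_AW_ROOT}/sandbox/firewall/logs"
printf '%s\n' "--audit-dir ${GH_AW_ROOT}/sandbox/firewall/audit"
```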

What the Compiler Does NOT Need to Handle

  • Sysroot stage setup → automatic with --runner-topology arc-dind
  • Network isolation mode → automatic with sysroot
  • Docker host path prefix → auto-detected from DOCKER_HOST

Key Principles

  1. Security parity: ARC/DinD protections must be identical to non-ARC. Different mechanisms, same effective isolation.
  2. Daemon visibility: Anything the agent container needs must exist under ${RUNNER_TEMP}, which is on the shared workspace volume.
  3. Minimal write surface: Under ${RUNNER_TEMP}/gh-aw/, the agent gets write access only to HOME and its own log directory. The workspace remains writable, as on non-ARC runners.

Debugging Reference

The full iterative debugging log is in the canary PR (bbq-beets-four-nines/agentic-workflows-canary#1559). Key failure modes discovered:

  1. /etc/ld.so.cache mount failure → fixed by skipping etc mounts in sysroot
  2. Hosts file mount failure → fixed by skipping hosts generation in sysroot
  3. Custom mount filtered by sysroot home filter → fixed by narrowing filter to dot-dirs only
  4. Node not found → tool-cache redirect + copy to RUNNER_TEMP
  5. Copilot binary not found → copy to RUNNER_TEMP
  6. SEA extraction EACCES on ~/.cache → writable HOME
  7. Silent crash (exit 1, 0B output) → /home/runner read-only, HOME redirect fixes
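Failure modes 4 through 7 could be caught before launch with a preflight check along these lines. The function name `preflight` is hypothetical and is not part of gh-aw or AWF; the paths it checks are the ones listed in this issue.

```shell
# preflight ROOT (hypothetical helper): print one line per missing
# prerequisite under ROOT and return non-zero if any check fails.
preflight() {
  pf_root="$1"
  pf_failures=0
  if [ ! -x "${pf_root}/bin/copilot" ]; then
    echo "missing or non-executable: ${pf_root}/bin/copilot (failure mode 5)"
    pf_failures=$((pf_failures + 1))
  fi
  if [ ! -d "${pf_root}/tool-cache/node" ]; then
    echo "missing: ${pf_root}/tool-cache/node (failure mode 4)"
    pf_failures=$((pf_failures + 1))
  fi
  if [ ! -d "${pf_root}/home" ] || [ ! -w "${pf_root}/home" ]; then
    echo "missing or not writable: ${pf_root}/home (failure modes 6 and 7)"
    pf_failures=$((pf_failures + 1))
  fi
  [ "$pf_failures" -eq 0 ]
}
```

A typical call would be `preflight "${RUNNER_TEMP}/gh-aw" || exit 1` immediately before invoking AWF.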

Labels: enhancement
