[ARC-DinD] Chroot /host base userland not staged on split-fs runners — security-preserving fix + empty-/host test harness

## Summary

On ARC (Actions Runner Controller) runners with a Docker-in-Docker (DinD) sidecar — i.e. a **split runner/daemon filesystem** — AWF chroot mode cannot currently run an agent end-to-end. The community thread [github/gh-aw#34896](https://github.com/github/gh-aw/issues/34896) has tracked this layer-by-layer across many gh-aw releases. As of **gh-aw v0.81.3 (firewall v0.27.10, mcpg v0.3.30)** two of three recent blockers are fixed:

- ✅ MCP gateway `gateway.domain: awmg-mcpg` accepted by mcpg v0.3.30.
- ✅ `binariesSourcePath` read-only collision fixed in #5482 (runner-binaries overlay now mounts at `/host/tmp/awf-runner-bin`, AWF v0.27.10).
- ❌ **Remaining blocker:** after chrooting into `/host`, the daemon's base userland is absent:
  ```
  [entrypoint][WARN] one-shot-token.so failed to load on host dynamic linker (host libc incompatibility, e.g. musl/Alpine)
  chroot: failed to run command '/bin/sh': No such file or directory
  [entrypoint][ERROR] capsh not found on host system
  ```

This issue is a deep-dive on the **root cause of the remaining blocker**, the **security constraints** that must shape the fix, the **latent gaps queued behind it**, and a **comprehensive test plan** that simulates ARC split-fs conditions (notably an **empty mounted `/host`**) so this class of failure is caught in CI instead of being discovered one layer at a time on real runners.

## Background: how the chroot `/host` is assembled

`buildSystemMounts()` (`src/services/agent-volumes/system-mounts.ts:13-37`) emits fixed read-only bind mounts for the chroot base system:

```
/usr:/host/usr:ro
/bin:/host/bin:ro
/sbin:/host/sbin:ro
/lib:/host/lib:ro
/lib64:/host/lib64:ro
/opt:/host/opt:ro
...
/tmp:/host/tmp:rw
```

On a normal runner the source paths (`/usr`, `/bin`, …) are the runner's own glibc userland and everything works. On a **split-fs ARC/DinD** runner, gh-aw emits `--docker-host-path-prefix /tmp/gh-aw`, and `translateBindMountHostPath()` (`src/services/host-path-prefix.ts`) rewrites every source to the daemon-visible staging root:

```
/tmp/gh-aw/usr:/host/usr:ro
/tmp/gh-aw/bin:/host/bin:ro
/tmp/gh-aw/lib:/host/lib:ro
...
```

(Kernel VFS `/dev`, `/sys`, `/proc` and `/dev/null` are correctly excluded from prefixing — `host-path-prefix.ts:41-48`.)
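
The rewrite described above can be sketched as follows. This is a minimal illustration, not the code in `src/services/host-path-prefix.ts`: the names `translateMountSketch` and `isKernelVfs` are hypothetical, and the exclusion list only mirrors the paths named in this issue.

```typescript
// Hypothetical sketch of the prefixing behaviour; not the real implementation.
// Kernel VFS roots that must keep their original source path.
// /dev/null is covered because it lives under /dev.
const KERNEL_VFS_ROOTS = ["/dev", "/sys", "/proc"];

function isKernelVfs(source: string): boolean {
  return KERNEL_VFS_ROOTS.some(
    (root) => source === root || source.startsWith(root + "/"),
  );
}

// Rewrites "source:target[:mode]" so that the source resolves under `prefix`.
function translateMountSketch(mount: string, prefix: string): string {
  const sep = mount.indexOf(":");
  const source = sep === -1 ? mount : mount.slice(0, sep);
  const rest = sep === -1 ? "" : mount.slice(sep);
  if (prefix === "" || isKernelVfs(source)) {
    return mount; // no prefix configured, or kernel VFS: leave untouched
  }
  const trimmed = prefix.replace(/\/+$/, ""); // avoid a double slash
  return trimmed + source + rest;
}
```

Note that the translation only changes where the daemon looks for the source; it does nothing to ensure the translated path has any content, which is the defect described next.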

**The defect:** nothing ever *populates* `/tmp/gh-aw/{usr,bin,lib,…}` with a base userland. The mounts point at empty staged directories, so inside the chroot `/host/bin/sh` and `/host/usr/sbin/capsh` do not exist. The entrypoint's chroot preflight (`containers/agent/entrypoint.sh:681-704`) then fails exactly as reported. The "musl/Alpine" wording in the warning is **misleading**: the reporter's daemon is Debian/glibc with both `/bin/sh` and `capsh` present — the chroot simply enters an *empty* `/host`, and the generic warning blames musl because no dynamic loader is found at all.

## Why no existing AWF primitive fixes this

The thread (and our own `docs/arc-dind.md`) points at `dind.preStageDirs` as the staging step. **It does not populate the system tree.** `DEFAULT_PRE_STAGE_DIRS` (`src/dind-bootstrap.ts:11-19`) only `mkdir`s empty work dirs:

```
.cache  .config  .local  .local/state  home  mcp-logs  sandbox
```

`stageEngineBinary()` stages a *single* binary. `runDindBootstrap()` (`src/dind-bootstrap.ts:103-127`) returns early unless `config.dind.preStageDirs`/`stageEngineBinary` is set — and gh-aw does not emit those, so the resolved config shows `enableDind=false` even when `dockerHostPathPrefix` is set.

**Conclusion: there is no capability today that stages a base userland into the chroot.** Any fix that just "emits `dind.preStageDirs`" will produce empty system dirs and still fail. This is a *missing capability*, not a config-emission oversight.

## Security considerations (these must shape the fix)

The remaining blocker has two superficially attractive fixes that are **security-regressive** and should be rejected:

1. **Bind the daemon's real `/bin`, `/usr`, `/lib` into `/host`.** This sources the chroot base userland, including the binaries that run *before* capability drop, from the **runner/daemon filesystem**. On ARC that filesystem is attacker-influenceable: a malicious or compromised DinD image, or anything that can write the shared `/tmp/gh-aw` emptyDir, controls the code AWF executes as root pre-`capsh`. This is the exact trust boundary AWF deliberately moved away from in the iptables → network-isolation work: **egress/identity enforcement must not depend on untrusted runner-side state.** Trusting the daemon rootfs for the chroot base reintroduces that dependency at an even more sensitive point (pre-privilege-drop code execution).

2. **Copy the daemon's userland into the staging root at runtime.** Same problem — provenance is the daemon image, not a verified AWF artifact.

**Security-preserving direction:** source the chroot base userland from **AWF's own signed agent image** (`ghcr.io/github/gh-aw-firewall/agent`), which already ships a glibc base + `bash` + `libcap2-bin` (`capsh`) + the loader needed by `one-shot-token.so`. Two viable mechanisms, both keeping provenance inside AWF's trust boundary:

- **(A) Self-bind from the agent container.** In `entrypoint.sh`, before chroot, detect an empty/foreign `/host` and overlay the agent image's own `/bin`, `/usr/sbin/capsh`, `/lib`, loader, and a minimal busybox/coreutils set into `/host` (e.g. via a writable overlay assembled in `/host/tmp` and `PATH`/loader redirection). This requires no trust in the daemon: the binaries come from the signed agent image that AWF itself ships.
- **(B) Stage from the signed image via a helper container.** Extend `dind-bootstrap.ts` with a real `stageBaseSystem()` that runs `DEFAULT_STAGING_IMAGE` (already `ghcr.io/github/gh-aw-firewall/agent:latest`, `src/dind-bootstrap.ts:8`) to copy a curated base userland into the daemon-visible staging root *before* compose start. Provenance is the AWF image, but it crosses the daemon filesystem — so it must be paired with integrity checks (see below).

Whichever mechanism is chosen, the following invariants must hold and be tested:
- The base userland executed before `capsh` privilege-drop must originate from the AWF-signed image, never from runner/daemon-writable paths.
- The staged tree must not be writable by the agent (post-drop UID) at exec time.
- Credential-isolation guarantees (procfs `hidepid=2`, `/dev/null` credential overlays, `/etc/shadow` exclusion) must remain intact when `/host` is synthesized.
- If integrity cannot be assured (e.g. an unverifiable shared staging path), AWF should **fail closed** with a clear diagnostic rather than silently chrooting into an attacker-influenceable `/host`.
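
The fail-closed invariant can be sketched as a preflight that refuses to proceed when the chroot root lacks the binaries the entrypoint needs. This is an illustration under assumptions: `assertHostUsable` and `REQUIRED_HOST_PATHS` are hypothetical names, the filesystem probe is injected so the logic stays testable, and a presence check alone says nothing about provenance, so it complements the integrity checks rather than replacing them.

```typescript
// Hypothetical sketch: fail closed when the chroot root cannot run the entrypoint.
// Paths are the two reported missing in this issue, relative to the chroot root.
const REQUIRED_HOST_PATHS = ["/bin/sh", "/usr/sbin/capsh"];

function assertHostUsable(
  hostRoot: string,
  exists: (path: string) => boolean, // injected probe, e.g. fs.existsSync
): void {
  const missing = REQUIRED_HOST_PATHS.filter((p) => !exists(hostRoot + p));
  if (missing.length > 0) {
    // Name the real cause instead of guessing at a libc mismatch.
    throw new Error(
      `refusing to chroot into ${hostRoot}: missing ${missing.join(", ")} ` +
        "(empty or unstaged base userland)",
    );
  }
}
```

In a Node context the probe would typically be `fs.existsSync`, for example `assertHostUsable("/host", fs.existsSync)`. The error text deliberately avoids the musl/Alpine wording, since an empty tree and an incompatible libc are different failures.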

## Latent gaps queued behind the current blocker

The thread's recent progression table tracks only three layers (gateway → container start → chroot exec). Once `/bin/sh` + `capsh` are present, the originally-enumerated gaps will resurface in order. They should be designed for now, not rediscovered serially:

- **Engine identity vars through `capsh`** — `engine.env` `HOME`/`USER`/`LOGNAME` were historically clobbered to the pre-drop values; verify `chroot.identity` (now emitted by gh-aw) actually wins *after* the user switch.
- **Agent binary visibility** — confirm the runner-installed `copilot`/engine binary lands in the #5482 overlay (`/host/tmp/awf-runner-bin`) and is on `PATH` inside the chroot for **both** the agent job and the `safe-outputs.threat-detection` job.
- **`/etc/passwd`, `/etc/group`, `/etc/hosts` synthesis** — AWF should synthesize minimal identity + `host.docker.internal` entries for the UID it switches to, without requiring workflow-level `sandbox.agent.mounts`.
- **Threat-detection silent no-op (security regression)** — the auto-generated detection job runs without the agent job's pre-steps and, on chroot setup failure (`spawn ENOENT`), is marked successful because `GH_AW_DETECTION_CONTINUE_ON_ERROR !== 'false'`. A correctly-configured workflow then believes outputs were screened when the detector no-op'd. AWF/gh-aw must distinguish "engine never started" (fail loud) from "model produced unparseable output" (continue-on-error). This is the highest-severity latent gap.
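
The distinction asked for in the last item can be sketched as a small classifier. The types and function names here are hypothetical, and the sketch assumes the detector emits JSON on stdout, which this issue does not confirm. The only behaviour taken from the source is that continue-on-error is active unless `GH_AW_DETECTION_CONTINUE_ON_ERROR` is the string `'false'`.

```typescript
// Hypothetical sketch: separate "engine never started" from "bad model output".
type DetectionOutcome = "ok" | "engine-never-started" | "unparseable-output";

interface DetectionRun {
  spawnErrorCode?: string; // e.g. "ENOENT" when the engine binary is absent
  stdout: string; // assumed to be JSON when the engine ran successfully
}

function classifyDetection(run: DetectionRun): DetectionOutcome {
  if (run.spawnErrorCode !== undefined) {
    return "engine-never-started";
  }
  try {
    JSON.parse(run.stdout);
    return "ok";
  } catch (_err) {
    return "unparseable-output";
  }
}

// continueOnError mirrors GH_AW_DETECTION_CONTINUE_ON_ERROR !== 'false'.
function shouldFailJob(outcome: DetectionOutcome, continueOnError: boolean): boolean {
  if (outcome === "engine-never-started") {
    return true; // never mask a detector that did not run
  }
  if (outcome === "unparseable-output") {
    return !continueOnError;
  }
  return false;
}
```

The design point is that continue-on-error only ever applies to the parse-failure branch, so a chroot setup failure cannot be reported as a screened run.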

## The meta-gap: no CI reproduces split-fs DinD

Every fix so far has advanced exactly one layer, then a new layer breaks weeks later on real runners — because **no automated test reproduces an empty/foreign `/host`**. The existing chroot integration tests (`tests/integration/chroot-*.test.ts`) and `smoke-chroot` all run on a normal runner where `/host` is the runner's own populated glibc tree, so they never exercise the split-fs staging path. The reporter independently noted "why CI likely doesn't catch it" (the chroot patch is gated on a `tcp://localhost` `DOCKER_HOST` absent on GitHub-hosted runners). Closing this meta-gap is arguably more valuable than any single layer fix.

## Related gap: pre-agent toolchain installs don't reach the chroot on ARC split-fs

This is a second architectural gap, distinct from the empty-`/host` base-userland problem above but worth solving in the same effort. The base-userland fix gets the chroot a working `/bin/sh` + `capsh`; it does **not** get a build-test workflow's compilers and SDKs into the chroot.

### The mental model that breaks on ARC

For a build-test-style workflow the assumption is: *pre-agent steps install packages/toolchains on the host, then the agent sees them via chroot `/host`.* This holds on a **normal runner** (one filesystem) but breaks on **ARC/DinD** (two filesystems), because the installs land on the wrong one.

**Normal runner — one filesystem:**
- Pre-agent steps (`apt-get install`, `setup-go`/`setup-node`, `npm i -g`, tool caches) run in the **runner shell**, writing the **runner's** `/usr`, `/opt/hostedtoolcache`, `$HOME`, …
- AWF bind-mounts that same FS read-only: `/usr:/host/usr:ro`, `/opt:/host/opt:ro`, etc. (`src/services/agent-volumes/system-mounts.ts:13-24`).
- chroot `/host` = the runner's world → the agent sees everything pre-agent steps installed. ✅

**ARC/DinD — runner FS ≠ daemon FS:**
1. **Pre-agent workflow steps run in the runner container**, on the **runner's filesystem** — that's where `apt`/`setup-*`/tool caches land, same as a normal runner.
2. **AWF's agent container is launched by the daemon** (compose over `DOCKER_HOST=tcp://…`). Its bind-mount sources (`/usr`, `/bin`, `/opt`, …) are resolved by the **daemon**, against the **daemon's** filesystem — or, with `--docker-host-path-prefix /tmp/gh-aw`, against the shared `/tmp/gh-aw` staging dir.
3. So chroot `/host` is assembled from the **daemon's world, not the runner's**. The toolchains the pre-agent steps installed on the runner are **invisible to the agent**.

This is upstream Gap 4 (runner-installed `copilot` not visible in chroot) **generalized to every package and toolchain a build-test workflow installs** — and it compounds the empty-`/host` problem: on split-fs the daemon's `/tmp/gh-aw/{usr,bin,lib}` isn't even populated, so `/host` is empty rather than "the daemon's toolchain."

### What actually crosses the split into the chroot on ARC

Only things on a path **both** containers can see, or baked/staged into the daemon side:

- **The workspace and `/tmp`** — `${workspaceDir}:/host…:rw` and `/tmp:/host/tmp:rw` (`system-mounts.ts:23-24`). In ARC these are typically the shared `gh-aw-tmp` emptyDir, so writes there *are* visible.
- **The runner tool cache, only if explicitly wired** — `container.runnerToolCachePath` (`src/awf-config-schema.json:607`, `src/runner-tool-cache.ts`) mounts `/opt/hostedtoolcache` RO into the chroot. This knob exists *specifically because* the tool cache doesn't otherwise cross the split — but it only helps if that cache lives on a volume the daemon can also see.
- **The #5482 runner-binaries overlay** — `/host/tmp/awf-runner-bin`, a narrow path for staging a couple of CLIs into the daemon side.
- **Whatever is baked into the DinD image** — which is why the upstream reporter had to build a custom Ubuntu DinD with Node/`capsh` pre-installed.

### Implications for build-test on ARC

A workflow that installs toolchains in pre-agent steps **won't expose them to the agent on ARC** unless one of the following holds:

- **Install into a shared volume** the daemon also mounts (workspace, `/tmp/gh-aw`, or a shared tool-cache) instead of the runner's `/usr`/`/opt`.
- **Bake toolchains into the DinD daemon image.**
- **Stage them daemon-side** via a helper container (the manual bootstrap pattern from the upstream thread).
- **Move the installs inside the agent/chroot** (post-firewall, network permitting via the egress allowlist) rather than pre-agent.

### Security note (same trust boundary as above)

Staging runner-side toolchains into the daemon-visible path is acceptable *because the provenance is the workflow's own pre-agent steps mounted RO* — but it must not become a vector for the agent (post privilege-drop UID) to write paths that earlier/other privileged steps then execute. Anything staged for the chroot must be RO at agent exec time, and this must not weaken the `/host` integrity / fail-closed posture proposed for the base userland.

### Suggested scope

Generalize the `runnerToolCachePath` + `awf-runner-bin` overlay into a first-class **"stage runner toolchains into a daemon-visible chroot path"** capability, with `build-test` as the motivating workflow.


---

## Proposed implementation plan

### 1. Add a `stageBaseSystem()` capability sourced from the AWF-signed image
- Implement base-userland staging from `DEFAULT_STAGING_IMAGE` (mechanism A self-bind preferred; B as fallback) in `src/dind-bootstrap.ts` and/or `containers/agent/entrypoint.sh`.
- Curate the minimal set: dynamic loader + `libc`/`libcap`/`libutil`, `/bin/sh` (+ `bash`), `capsh`, and the coreutils the entrypoint uses (`mkdir`, `chmod`, `cat`, `head`, `tee`, `cp`, `tar`).
- Wire detection: when `dockerHostPathPrefix` is set (or an empty `/host` is detected at entrypoint), run staging automatically. Today `enableDind=false` even with the prefix set — close that half-configured state.
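
The detection wiring in the last item might look like the sketch below. The input shape and the names `StagingDecisionInput` and `decideBaseSystemStaging` are hypothetical; the real AWF config types may differ. It encodes only the two triggers named above: a configured path prefix, or an empty `/host` observed at entrypoint.

```typescript
// Hypothetical sketch of the staging trigger; not the real AWF config types.
interface StagingDecisionInput {
  dockerHostPathPrefix?: string; // set via --docker-host-path-prefix
  hostBinEntries: string[]; // directory listing of /host/bin at entrypoint
}

type StagingDecision =
  | { stage: true; reason: "path-prefix-set" | "empty-host" }
  | { stage: false };

function decideBaseSystemStaging(input: StagingDecisionInput): StagingDecision {
  if (input.dockerHostPathPrefix) {
    // Split-fs runner: the translated sources are not populated by anything else.
    return { stage: true, reason: "path-prefix-set" };
  }
  if (input.hostBinEntries.length === 0) {
    // No prefix, but the chroot root is empty anyway.
    return { stage: true, reason: "empty-host" };
  }
  return { stage: false };
}
```

Returning a reason alongside the decision keeps the diagnostic specific, which helps when the half-configured state is hit on a real runner.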

### 2. Preserve security invariants
- Base userland provenance = AWF image only; never daemon/runner-writable paths for pre-drop execution.
- Fail-closed diagnostic when `/host` is empty/foreign and a verified base cannot be staged.
- Re-assert procfs `hidepid=2`, `/dev/null` credential overlays, and `/etc/shadow` exclusion under the synthesized `/host`.

### 3. Close the queued layers (design now)
- Verify `chroot.identity` HOME/USER/LOGNAME survive `capsh`.
- Ensure the engine binary overlay (`/host/tmp/awf-runner-bin`) is on `PATH` for agent **and** detection jobs.
- Synthesize `/etc/passwd`/`/etc/group`/`/etc/hosts` in chroot.
- Make threat-detection fail loud on engine-spawn failure (distinguish from parse failure).
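
The file synthesis in the third item could be sketched as below. The `ChrootIdentity` shape and the function names are hypothetical, and the gateway address is an assumed input that the caller would have to supply. The record layouts follow the standard `passwd`, `group`, and `hosts` file formats.

```typescript
// Hypothetical sketch: minimal identity files for the UID AWF switches to.
interface ChrootIdentity {
  user: string;
  uid: number;
  gid: number;
  home: string;
}

// passwd format: name:password:UID:GID:GECOS:home:shell
function synthesizePasswd(id: ChrootIdentity): string {
  return `${id.user}:x:${id.uid}:${id.gid}::${id.home}:/bin/sh\n`;
}

// group format: name:password:GID:members
function synthesizeGroup(id: ChrootIdentity): string {
  return `${id.user}:x:${id.gid}:\n`;
}

// hosts format: address, whitespace, hostname. gatewayIp is an assumed input.
function synthesizeHosts(gatewayIp: string): string {
  return `127.0.0.1\tlocalhost\n${gatewayIp}\thost.docker.internal\n`;
}
```

Generating these files inside AWF keeps the workflow free of `sandbox.agent.mounts` workarounds, and keeps `/etc/shadow` out of the picture entirely.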

### 4. Comprehensive ARC-simulation test suite (the core deliverable)
Add tests that reproduce split-fs DinD **without** needing a real ARC cluster:

- **Empty `/host` integration test** — start the agent with the system mounts pointed at a freshly-created **empty** staging dir (simulating `/tmp/gh-aw/{usr,bin,lib}` that was never populated). Assert: (a) without the fix, the chroot preflight fails with the documented `/bin/sh`/`capsh` error; (b) with `stageBaseSystem()`, the agent runs a trivial command to completion inside the chroot.
- **Foreign/musl `/host` test** — point the base mounts at an Alpine/musl rootfs (or a deliberately-incompatible loader) and assert AWF either stages its own glibc base and succeeds, or **fails closed** with the actionable diagnostic — never silently proceeds.
- **Split-fs path-prefix test** — exercise `translateBindMountHostPath()` with `--docker-host-path-prefix` and assert the staged tree is what the chroot actually enters (no empty-dir passthrough), with kernel VFS still excluded.
- **Provenance/integrity test** — assert the staged base userland originates from the AWF image and that agent-UID is not able to write the staged tree before exec.
- **Identity-vars probe test** — a probe binary prints `id`/`$HOME`/`$USER`/`$LOGNAME` from inside the chroot; assert `chroot.identity` values win post-`capsh`.
- **Engine-binary visibility test** — in simulated chroot mode, the installed `copilot` is discoverable on `PATH` from inside `/host`.
- **Threat-detection ARC test** — with `safe-outputs.threat-detection` enabled in the simulated split-fs environment: a successful run produces a parseable result; a deliberately unstaged engine causes the detection job to **fail**, not silently pass.
- **CI wiring** — add a `smoke-chroot`-style job (or extend the existing one) that runs the empty-`/host` and foreign-`/host` scenarios on every PR, so this layer is permanently guarded.
- **Toolchain-visibility test** — install a toolchain in a simulated pre-agent step on a "runner" path distinct from the daemon-visible staging root; assert it is **not** visible in the chroot by default, and **is** visible once staged via the capability in step 5.
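
A starting point for the empty-`/host` fixture is sketched below using Node's built-in `fs`, `os`, and `path` modules. The helper names are hypothetical and the sketch stops at the filesystem layer: it reproduces the created-but-never-populated staging layout and reports which required paths are absent, leaving the container launch to the real integration harness.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical fixture: mimics /tmp/gh-aw/{usr,bin,lib} created but never populated.
function createEmptyStagingRoot(): string {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), "awf-empty-host-"));
  for (const dir of ["usr", "bin", "lib"]) {
    fs.mkdirSync(path.join(root, dir));
  }
  return root;
}

// Lists which chroot-relative paths are absent under the staging root.
function missingUnder(root: string, required: string[]): string[] {
  return required.filter((rel) => !fs.existsSync(path.join(root, rel)));
}

// Runs `body` against a fresh empty staging root and always cleans up.
function withEmptyStagingRoot<T>(body: (root: string) => T): T {
  const root = createEmptyStagingRoot();
  try {
    return body(root);
  } finally {
    fs.rmSync(root, { recursive: true, force: true });
  }
}
```

A test could then call `withEmptyStagingRoot((root) => missingUnder(root, ["bin/sh", "usr/sbin/capsh"]))` and expect both paths to be reported missing before the fix, and neither once base-userland staging has run against the same root.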

### 5. Stage runner toolchains into a daemon-visible chroot path (build-test on ARC)
- Generalize `runnerToolCachePath` + the `/host/tmp/awf-runner-bin` overlay into a first-class capability that stages workflow-installed toolchains (compilers, SDKs, tool caches) into a daemon-visible path the chroot mounts RO.
- Keep provenance = the workflow's own pre-agent steps; staged tree must be RO at agent exec time.

### 6. Docs
- Update `docs/arc-dind.md` / `docs/chroot-mode.md` to describe base-userland staging, correct the `dind.preStageDirs` expectation (it does not stage the system tree), and document the security model (image-sourced base, fail-closed behavior).

## Acceptance criteria

- On a simulated split-fs runner with an **empty mounted `/host`**, the AWF agent chroots and runs a command to completion using an AWF-image-sourced base userland — with **no** dependency on the daemon's rootfs for pre-`capsh` execution.
- A foreign/musl or unverifiable `/host` causes a **fail-closed** error, never a silent chroot into untrusted state.
- The `safe-outputs.threat-detection` job runs end-to-end in the simulated environment and **fails loudly** if the engine cannot be spawned.
- The empty-`/host` and foreign-`/host` scenarios run in CI on every PR.
- A toolchain installed in a simulated pre-agent step is invisible in the chroot by default and visible once staged via the daemon-visible staging capability (regression-guarded in CI).

## References

- Upstream thread: github/gh-aw#34896 (esp. comments dated 2026-06-23 and 2026-06-25 enumerating the v0.80.x → v0.81.3 progression).
- Prior fix: #5482 — `binariesSourcePath` overlay relocated to `/host/tmp/awf-runner-bin` (AWF v0.27.10).
- Code: `src/services/agent-volumes/system-mounts.ts:13-37`, `src/services/host-path-prefix.ts:23-48`, `src/dind-bootstrap.ts:8-127`, `containers/agent/entrypoint.sh:681-704`.
- Docs: `docs/arc-dind.md`, `docs/chroot-mode.md`.


