Update runner doctor catalog with ARC/DinD modes A14 and A15 #5832

Merged
lpcox merged 5 commits into main from copilot/update-runner-doctor-a14-a15
Jul 3, 2026

Copilot AI commented Jul 2, 2026

Contributor

The runner doctor knowledge base was missing two newly observed ARC/DinD failure modes and a citation update for the existing path-prefix guidance. This PR syncs those lessons into the shared catalog, the workflow prompt, and the portable self-contained doctor agent.

  • Catalog updates

    • Added A14 for ARC/DinD sidecars missing the Docker Compose v2 plugin (docker compose up -d fails with unknown shorthand flag: 'd' in -d).
    • Added A15 for rootless artifact permission repair on ARC/DinD when the repair bind mount does not apply dockerHostPathPrefix, leaving Squid logs unreadable.
    • Appended github/gh-aw-firewall#5753 to A1 to capture the canonical daemon-visible path guidance.
  • Diagnostic lookup improvements

    • Added quick-lookup mappings for the new A14 and A15 error strings.
    • Added A15 to the Known unresolved items section with the tracked fix/workaround status.
  • Doctor playbook sync

    • Added A14/A15 examples to the workflow doctor’s “Match symptom → failure mode” section.
    • Mirrored the same playbook and catalog changes into the portable .github/agents/self-hosted-runner-doctor.md so the embedded agent stays aligned with the shared source.
  • Regression guard

    • Extended the focused runner-doctor Jest test to assert the shared catalog and portable agent both contain the new rows, quick-lookup entries, unresolved-item note, and A1 citation update.
| A14 | `unknown shorthand flag: 'd' in -d` / `Command failed with exit code 125: docker compose up -d --pull never` | ARC/DinD sidecar image lacks `docker-compose-plugin` | Add `docker-compose-plugin` to the DinD sidecar Dockerfile | `docker compose version` inside the DinD sidecar | github/gh-aw-firewall#5729 |
| A15 | `[WARN] Rootless artifact permission repair failed for .../sandbox/firewall/logs (exit 1)`; squid log files unreadable after ARC/DinD run; `awf logs summary` returns `Failed to load logs: EACCES` | repair bind mount skips `dockerHostPathPrefix` translation | Open fix in progress; document workaround and probe | `ls -la <proxy-logs-dir>` and AWF warning check | github/gh-aw-firewall#5816 |
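
The quick-lookup idea behind these two rows can be sketched as a substring match from log text to a failure mode ID. This is only an illustration: the error strings and IDs are taken from the rows above, while `QUICK_LOOKUP`, `QuickLookupEntry`, and `matchFailureMode` are hypothetical names, not part of the catalog or the doctor agent.

```typescript
// Hypothetical quick-lookup table: symptom substrings -> failure mode ID.
// The IDs and error strings come from the A14/A15 catalog rows; the data
// structure and function name are illustrative assumptions.
type FailureModeId = "A14" | "A15";

interface QuickLookupEntry {
  id: FailureModeId;
  needles: string[]; // any one of these substrings is enough to match
}

const QUICK_LOOKUP: QuickLookupEntry[] = [
  {
    id: "A14",
    needles: [
      "unknown shorthand flag: 'd' in -d",
      "Command failed with exit code 125: docker compose up -d --pull never",
    ],
  },
  {
    id: "A15",
    needles: [
      "Rootless artifact permission repair failed",
      "Failed to load logs: EACCES",
    ],
  },
];

// Returns the first failure mode whose symptom appears in the log text,
// or undefined when nothing in the table matches.
function matchFailureMode(logText: string): FailureModeId | undefined {
  for (let i = 0; i < QUICK_LOOKUP.length; i++) {
    const entry = QUICK_LOOKUP[i];
    const hit = entry.needles.some((needle) => logText.indexOf(needle) !== -1);
    if (hit) {
      return entry.id;
    }
  }
  return undefined;
}
```

A plain substring match keeps the lookup easy to audit against the catalog text. A real implementation might prefer regular expressions where paths or exit codes vary.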

Copilot AI changed the title from "[WIP] Update Runner Doctor knowledge-base for A14 and A15 issues" to "Update runner doctor catalog with ARC/DinD modes A14 and A15" Jul 2, 2026
Copilot AI requested a review from lpcox July 2, 2026 23:37
Copilot finished work on behalf of lpcox July 2, 2026 23:37
@lpcox lpcox marked this pull request as ready for review July 3, 2026 01:40
Copilot AI review requested due to automatic review settings July 3, 2026 01:40

Copilot AI left a comment

Contributor

Pull request overview

Updates the self-hosted runner “doctor” knowledge base and related prompts to include two newly observed ARC/DinD failure modes (A14, A15) and to keep the shared catalog, workflow playbook, and portable agent in sync—backed by a Jest regression test.

Changes:

  • Added failure modes A14 (missing Docker Compose v2 plugin in ARC/DinD sidecar) and A15 (rootless permission repair bind mount not applying dockerHostPathPrefix).
  • Updated A1 citations to include github/gh-aw-firewall#5753 and added quick-lookup mappings + an unresolved-items note for A15.
  • Extended the runner-doctor workflow Jest test to assert shared + portable + workflow playbook alignment for these additions.
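
To make the A15 root cause concrete, here is a minimal sketch of the path translation a bind mount would need when the Docker daemon sees runner paths under a prefix. `dockerHostPathPrefix` is the option named in this PR. The function names, the example paths, and the exact joining rule are assumptions for illustration, not the AWF implementation.

```typescript
// Sketch of daemon-visible path translation (assumed behavior).
// When the Docker daemon runs in a DinD sidecar, a path on the runner may be
// visible to the daemon only under a prefix, so bind mount sources need it.
function toDaemonVisiblePath(runnerPath: string, dockerHostPathPrefix?: string): string {
  if (!dockerHostPathPrefix) {
    return runnerPath; // no prefix configured: the path is used as-is
  }
  const prefix = dockerHostPathPrefix.replace(/\/+$/, ""); // drop trailing slashes
  const suffix = runnerPath.charAt(0) === "/" ? runnerPath : "/" + runnerPath;
  return prefix + suffix;
}

// Builds a `source:target` bind mount spec with the translated source path.
function bindMountSpec(
  runnerPath: string,
  containerPath: string,
  dockerHostPathPrefix?: string
): string {
  return `${toDaemonVisiblePath(runnerPath, dockerHostPathPrefix)}:${containerPath}`;
}
```

Under these assumptions, a repair step that builds its mount from the raw runner path rather than the translated one would point the daemon at a different directory, which is consistent with the unreadable Squid logs described for A15.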
Summary per file:

| File | Description |
| --- | --- |
| scripts/ci/self-hosted-runner-doctor-workflow.test.ts | Adds regression assertions that shared catalog + portable agent + workflow playbook include A14/A15 and updated citations. |
| .github/workflows/shared/self-hosted-failure-modes.md | Adds A14/A15 rows, quick-lookup entries, unresolved A15 note, and A1 citation update in the shared catalog. |
| .github/workflows/self-hosted-runner-doctor.md | Updates the workflow doctor playbook examples to reference A14/A15 symptom matches. |
| .github/agents/self-hosted-runner-doctor.md | Mirrors playbook + catalog updates into the portable, self-contained runner-doctor agent doc. |
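
The alignment check described for the test file can be sketched as a drift detector over several document copies. This is not the actual test in `scripts/ci/self-hosted-runner-doctor-workflow.test.ts`. The marker strings are taken from this PR, and `REQUIRED_MARKERS`, `missingMarkers`, and `findDrift` are hypothetical helpers that work on in-memory strings so the example stays self-contained.

```typescript
// Markers that every copy of the catalog is expected to contain.
// The IDs and the citation come from this PR; the helpers are illustrative.
const REQUIRED_MARKERS: string[] = [
  "| A14 |",
  "| A15 |",
  "github/gh-aw-firewall#5753",
];

// Returns the markers that do not appear anywhere in the document text.
function missingMarkers(doc: string, markers: string[]): string[] {
  return markers.filter((marker) => doc.indexOf(marker) === -1);
}

// Maps each document name to its missing markers; documents that contain
// every marker are left out, so an empty result means the copies are aligned.
function findDrift(
  docs: Record<string, string>,
  markers: string[]
): Record<string, string[]> {
  const drift: Record<string, string[]> = {};
  Object.keys(docs).forEach((docName) => {
    const missing = missingMarkers(docs[docName], markers);
    if (missing.length > 0) {
      drift[docName] = missing;
    }
  });
  return drift;
}
```

In a Jest test, the documents would be read from disk and the result compared against an empty object, for example with `expect(findDrift(docs, REQUIRED_MARKERS)).toEqual({})`.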

Review details

  • Files reviewed: 4/4 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment thread .github/workflows/shared/self-hosted-failure-modes.md Outdated
Comment thread .github/agents/self-hosted-runner-doctor.md Outdated
lpcox and others added 2 commits July 2, 2026 20:32
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
github-actions Bot commented Jul 3, 2026

Contributor

✅ Copilot review passed with no inline comments.

@copilot Add the ready-for-aw label to this PR to trigger agentic CI smoke tests.

github-actions Bot commented Jul 3, 2026

Contributor

🔌 Smoke Services — All services reachable! ✅

github-actions Bot commented Jul 3, 2026

Contributor

Chroot tests failed: Smoke Chroot failed. See logs for details.

github-actions Bot commented Jul 3, 2026

Contributor

📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅

github-actions Bot commented Jul 3, 2026

Contributor

Smoke Gemini completed. All facets verified. 💎

Smoke test completed with failures. Connectivity to GitHub.com and MCP tools for reads were unavailable.

github-actions Bot commented Jul 3, 2026

Contributor

Contribution Check completed successfully!

Contribution check passed for PR #5832: the PR has a clear description with related issue references, includes a focused regression test for the catalog updates, updates the relevant documentation/playbook files, and places files in appropriate locations.

github-actions Bot commented Jul 3, 2026

Contributor

🔑 Smoke Copilot PAT: PAT auth validated. All systems operational. ✅

github-actions Bot commented Jul 3, 2026

Contributor

Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓

github-actions Bot commented Jul 3, 2026

Contributor

Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓

github-actions Bot commented Jul 3, 2026

Contributor

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

github-actions Bot commented Jul 3, 2026

Contributor

Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓

github-actions Bot commented Jul 3, 2026

Contributor

Build Test Suite completed successfully!

github-actions Bot commented Jul 3, 2026

Contributor

🌑 The shadows whisper... Smoke Codex failed. The oracle requires further meditation...

github-actions Bot commented Jul 3, 2026

Contributor

🚀 Security Guard has started processing this pull request

github-actions Bot commented Jul 3, 2026

Contributor

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 98.62% | 98.66% | 📈 +0.04% |
| Statements | 98.51% | 98.55% | 📈 +0.04% |
| Functions | 99.56% | 99.56% | ➡️ +0.00% |
| Branches | 94.42% | 94.42% | ➡️ +0.00% |

📁 Per-file Coverage Changes (1 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/workdir-setup.ts | 93.0% → 94.8% (+1.74%) | 93.0% → 94.8% (+1.74%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts
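
The Delta column is the PR percentage minus the base percentage, rounded to two decimals. The sketch below shows that arithmetic only. It is not the code in `scripts/ci/compare-coverage.ts`. The function name is hypothetical, and the 📉 marker for a decrease is an assumption, since the report above shows only 📈 and ➡️.

```typescript
// Formats a coverage delta in the style of the table above (assumed format).
// `base` and `pr` are coverage percentages such as 98.62 and 98.66.
function formatCoverageDelta(base: number, pr: number): string {
  // Round to two decimals first so floating-point noise does not leak through.
  const rounded = Math.round((pr - base) * 100) / 100;
  const arrow = rounded > 0 ? "📈" : rounded < 0 ? "📉" : "➡️";
  const sign = rounded >= 0 ? "+" : "-";
  return `${arrow} ${sign}${Math.abs(rounded).toFixed(2)}%`;
}
```

With the Lines row from the table, `formatCoverageDelta(98.62, 98.66)` yields `📈 +0.04%`, and the unchanged Functions row yields `➡️ +0.00%`.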

github-actions Bot mentioned this pull request Jul 3, 2026

github-actions Bot commented Jul 3, 2026

Contributor

Smoke Test: GitHub Actions Services Connectivity

  • Redis PING: ❌ Network is unreachable
  • PostgreSQL pg_isready: ❌ No response
  • PostgreSQL SELECT 1: ❌ Network is unreachable

Overall: FAIL. host.docker.internal (172.17.0.1) is unreachable from this runner. Service containers may not be running or the host bridge is not accessible.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🔌 Service connectivity validated by Smoke Services
Add label ready-for-aw to run again

github-actions Bot commented Jul 3, 2026

Contributor

Smoke Test: Claude Engine Validation

| Check | Result |
| --- | --- |
| API Status | ✅ PASS |
| GH Check | ✅ PASS |
| File Status | ✅ PASS |

Overall Result: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

Generated by Smoke Claude for #5832 · 55.6 AIC · ⊞ 3.3K
Add label ready-for-aw to run again

github-actions Bot commented Jul 3, 2026

Contributor

@lpcox
GitHub MCP Testing: ✅
GitHub.com connectivity: ✅
File write/read test: ✅
BYOK inference test: ✅

Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra

Status: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)
Add label ready-for-aw to run again

github-actions Bot commented Jul 3, 2026

Contributor

Smoke Test: Copilot BYOK Mode

  • ✅ GitHub MCP connectivity
  • ✅ github.com HTTP 200
  • ✅ File write/read test
  • ✅ Direct BYOK inference (COPILOT_PROVIDER_API_KEY → api-proxy → api.githubcopilot.com)

Status: PASS — Running in direct BYOK mode via api-proxy sidecar

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🔑 BYOK report filed by Smoke Copilot BYOK
Add label ready-for-aw to run again

github-actions Bot commented Jul 3, 2026

Contributor

Smoke Test: Copilot PAT Auth — PASS

| Test | Result |
| --- | --- |
| GitHub MCP connectivity | |
| GitHub.com HTTP | ✅ 200 |
| File write/read | |

Overall: PASS | Auth mode: PAT (COPILOT_GITHUB_TOKEN)

cc @lpcox

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🔑 PAT report filed by Smoke Copilot PAT
Add label ready-for-aw to run again

github-actions Bot commented Jul 3, 2026

Contributor

@lpcox

Smoke Test COPILOT BYOK Direct Mode:

  • MCP connectivity: ✅ (pre-fetched PR data validated)
  • GitHub.com HTTP: ✅
  • Sandbox file I/O: ✅
  • Direct BYOK inference via api-proxy → Azure OpenAI (Foundry, o4-mini-aw): ✅

Overall: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🔑 BYOK (AOAI api-key) report filed by Smoke Copilot BYOK AOAI (api-key)
Add label ready-for-aw to run again

github-actions Bot commented Jul 3, 2026

Contributor

🔬 Smoke Test Results

| Test | Status |
| --- | --- |
| GitHub MCP connectivity | |
| GitHub.com HTTP | ✅ 200 |
| File write/read | ⚠️ pre-step template vars unexpanded |

Overall: PASS (core connectivity verified)

PR author: @lpcox

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

📰 BREAKING: Report filed by Smoke Copilot
Add label ready-for-aw to run again

github-actions Bot commented Jul 3, 2026

Contributor

Gemini Smoke Test Results

  • GitHub MCP Testing: ❌
  • GitHub.com Connectivity: ❌
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅

PR Titles:

  1. chore: update model-to-API mapping (2026-07-02) (#5804)
  2. (Not found)

Overall status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini
Add label ready-for-aw to run again

github-actions Bot commented Jul 3, 2026

Contributor

API Proxy OTEL Tracing — Smoke Test Results

| Scenario | Status | Detail |
| --- | --- | --- |
| Module Loading | | otel.js loads successfully; exports: startRequestSpan, setTokenAttributes, setBudgetAttributes, endSpan, endSpanError, shutdown, isEnabled. isEnabled() returns true. |
| Test Suite | | 59 tests across 2 suites (otel.test.js, otel-fanout.test.js), all pass |
| Env Var Forwarding | | src/services/api-proxy-env-config.ts forwards GH_AW_OTLP_ENDPOINTS, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, GITHUB_AW_OTEL_TRACE_ID, GITHUB_AW_OTEL_PARENT_SPAN_ID, OTEL_SERVICE_NAME to the api-proxy container |
| Token Tracker Integration | | onUsage callback present in token-tracker-http.js (line 324) as the OTEL hook point; onSpanEnd callback also wired |
| OTEL Diagnostics | | No spans exported (no OTLP endpoint configured in smoke run); graceful fallback to /var/log/api-proxy/otel.jsonl confirmed, no errors thrown |

All 5 scenarios pass.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

📡 OTel tracing validated by Smoke OTel Tracing
Add label ready-for-aw to run again

github-actions Bot commented Jul 3, 2026

Contributor

🏗️ Build Test Suite Results

| Ecosystem | Project | Build/Install | Tests | Status |
| --- | --- | --- | --- | --- |
| Bun | elysia | | 1/1 passed | ✅ PASS |
| Bun | hono | | 1/1 passed | ✅ PASS |
| C++ | fmt | | N/A | ✅ PASS |
| C++ | json | | N/A | ✅ PASS |
| Deno | oak | N/A | 1/1 passed | ✅ PASS |
| Deno | std | N/A | 1/1 passed | ✅ PASS |
| .NET | hello-world | | N/A | ✅ PASS |
| .NET | json-parse | | N/A | ✅ PASS |
| Go | color | | 1/1 passed | ✅ PASS |
| Go | env | | 1/1 passed | ✅ PASS |
| Go | uuid | | 1/1 passed | ✅ PASS |
| Java | gson | | 1/1 passed | ✅ PASS |
| Java | caffeine | | 1/1 passed | ✅ PASS |
| Node.js | clsx | | passed | ✅ PASS |
| Node.js | execa | | passed | ✅ PASS |
| Node.js | p-limit | | passed | ✅ PASS |
| Rust | fd | | 1/1 passed | ✅ PASS |
| Rust | zoxide | | 1/1 passed | ✅ PASS |

Overall: 8/8 ecosystems passed — ✅ PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

Generated by Build Test Suite for #5832 · 47.9 AIC · ⊞ 6.9K
Add label ready-for-aw to run again

lpcox commented Jul 3, 2026

Collaborator

Copilot AI commented Jul 3, 2026

Contributor Author

@copilot fix the failing ci check https://github.com/github/gh-aw-firewall/actions/runs/28636561500/job/84924077190?pr=5832

Fixed in the latest commit — the A14 table rows in both .github/workflows/shared/self-hosted-failure-modes.md and .github/agents/self-hosted-runner-doctor.md were missing a trailing |, which caused the MD055 markdownlint error. Added the trailing pipe to both rows.
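
The kind of defect behind this MD055 failure can be illustrated with a small check for table rows that open with a pipe but do not close with one. This is a simplified stand-in, not markdownlint's rule implementation: it ignores fenced code blocks and other pipe styles, and `rowsMissingTrailingPipe` is a hypothetical name.

```typescript
// Returns 1-based line numbers of table-like rows that start with "|" but
// do not end with "|". Simplified: treats every line that begins with a pipe
// as a table row and does not skip fenced code blocks.
function rowsMissingTrailingPipe(markdown: string): number[] {
  const offending: number[] = [];
  const lines = markdown.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].replace(/\s+$/, ""); // ignore trailing whitespace
    if (line.charAt(0) === "|" && line.charAt(line.length - 1) !== "|") {
      offending.push(i + 1);
    }
  }
  return offending;
}
```

Run against a three-row table whose middle row lacks the closing pipe, the function reports line 2 only.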

Copilot finished work on behalf of lpcox July 3, 2026 04:15
@lpcox lpcox merged commit b7ededb into main Jul 3, 2026
21 of 23 checks passed
@lpcox lpcox deleted the copilot/update-runner-doctor-a14-a15 branch July 3, 2026 04:16