docs: add B7 failure mode (EACCES during chroot-home cleanup in rootless Docker) #5692
…ess Docker)

Add B7 to the runner doctor knowledge base across all three files:
- .github/workflows/shared/self-hosted-failure-modes.md
- .github/workflows/self-hosted-runner-doctor.md
- .github/agents/self-hosted-runner-doctor.md

B7 covers the case where AWF's removeWorkDirectories() fails with EACCES on agent-written files in the chroot-home temp directory due to UID namespace remapping in rootless Docker mode. Fixed in AWF v0.27.13.

Closes #5689

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds a new self-hosted runner “runner doctor” failure mode entry (B7) to document and help diagnose AWF cleanup failures on rootless Docker where UID namespace remapping can cause EACCES during deletion of chroot-home temp files.
Changes:
- Added failure mode B7 to the shared failure-mode catalog, including an error-string quick lookup entry.
- Updated the workflow runner-doctor playbook to map the new `EACCES`/`unlink` symptom to B7.
- Updated the portable agent copy of the runner-doctor doc to keep it in sync with the shared catalog and playbook.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/shared/self-hosted-failure-modes.md | Adds B7 row and corresponding error-string lookup mapping for chroot-home cleanup EACCES/unlink. |
| .github/workflows/self-hosted-runner-doctor.md | Adds a symptom → failure mode hint mapping the cleanup EACCES/unlink pattern to B7. |
| .github/agents/self-hosted-runner-doctor.md | Mirrors the B7 row + lookup entry + symptom hint in the portable agent version to keep catalogs consistent. |
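The symptom-to-failure-mode hint described above can be illustrated with a small log matcher. This is a minimal sketch, not the catalog's actual lookup entry: the regular expression assumes a Node.js-style error line (`EACCES: permission denied, unlink '<path>'`) and the `/tmp/awf-<ts>-chroot-home/` path shape from the PR description. `B7_PATTERN` and `classify_log_line` are hypothetical names introduced for illustration.

```python
import re
from typing import Optional

# Assumption: the failing log line resembles a Node.js system error, e.g.
#   EACCES: permission denied, unlink '/tmp/awf-1700000000000-chroot-home/...'
# The real error-string lookup entry in the catalog may be worded differently.
B7_PATTERN = re.compile(r"EACCES\b.*\bunlink\b.*/tmp/awf-[^/\s]+-chroot-home/")


def classify_log_line(line: str) -> Optional[str]:
    """Return "B7" if the line looks like the chroot-home cleanup failure, else None."""
    return "B7" if B7_PATTERN.search(line) else None


if __name__ == "__main__":
    sample = (
        "Error: EACCES: permission denied, "
        "unlink '/tmp/awf-1700000000000-chroot-home/.cache/file'"
    )
    print(classify_log_line(sample))
```

Requiring all three signals (`EACCES`, `unlink`, and the chroot-home path) keeps unrelated permission errors from being attributed to B7.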
Review details
- Files reviewed: 3/3 changed files
- Comments generated: 0
- Review effort level: Low
✅ Build Test Suite completed successfully!
🔑 Smoke Copilot PAT PAT auth validated. All systems operational. ✅
Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.
📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤
✅ Smoke Gemini completed. All facets verified. 💎
🔌 Smoke Services — All services reachable! ✅
✅ Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓
✅ Contribution Check completed successfully! Contribution check complete for PR #5692: the docs-only change follows the applicable CONTRIBUTING.md guidelines, with clear PR description, issue reference, appropriate file placement, and no missing tests required for new code functionality.
🚀 Security Guard has started processing this pull request
✅ Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓
📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅
✅ Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟
✅ Smoke Claude passed |
Smoke Test: Claude Engine Validation
Overall result: PASS ✅
✅ Copilot BYOK (Direct) Mode: PASS All smoke tests passed:
Running direct BYOK mode via
🤖 Smoke Test: PASS
PR: docs: add B7 failure mode (EACCES during chroot-home cleanup in rootless Docker) Overall: PASS
🔥 Smoke Test: Copilot PAT — PASS
Auth mode: PAT (COPILOT_GITHUB_TOKEN) Overall: PASS
Smoke Test
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "registry.npmjs.org"
See Network Configuration for more information.
Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra Status: PASS
🧪 Chroot Version Comparison Results
Overall: ❌ Tests did not fully pass — Python and Node.js versions differ between host and chroot environments.
@lpcox Smoke Test (Direct BYOK → Azure Foundry) Results:
Running in direct BYOK mode via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) Overall: PASS
🔬 Smoke Test: API Proxy OpenTelemetry Tracing
Overall: all scenarios pass (graceful degradation confirmed for unconfigured OTEL environment).
Smoke Test: GitHub Actions Services Connectivity
Result: FAIL
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
Smoke Test: Gemini Engine Validation
Overall status: FAIL
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "localhost"
See Network Configuration for more information.
Summary
Adds the B7 failure mode to the runner doctor knowledge base, as proposed in #5689.
B7: AWF exits with unhandled `EACCES` during cleanup when `removeWorkDirectories()` tries to delete agent-written files in `/tmp/awf-<ts>-chroot-home/` that are owned by remapped UIDs (rootless Docker UID namespace remapping). Fixed in AWF v0.27.13 via a repair-container pattern.
Changes
Updates three files:
- `.github/workflows/shared/self-hosted-failure-modes.md`: B7 row + error-string lookup entry
- `.github/workflows/self-hosted-runner-doctor.md`: symptom → failure mode hint
- `.github/agents/self-hosted-runner-doctor.md`: B7 row + error-string lookup + symptom hint

Closes #5689
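Since the description states that B7 is fixed in AWF v0.27.13, a triage step could check whether a runner's AWF version already includes the fix. The following is a rough sketch under stated assumptions: it presumes a plain `MAJOR.MINOR.PATCH` version string with an optional leading `v`, and `parse_version` and `has_b7_fix` are hypothetical helpers, not part of AWF or the runner doctor.

```python
from typing import Tuple

# Version named in the PR description as containing the B7 fix.
FIXED_IN: Tuple[int, int, int] = (0, 27, 13)


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'v0.27.13' or '0.27.13' into a tuple of ints (assumed format)."""
    parts = text.strip().lstrip("v").split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return (major, minor, patch)


def has_b7_fix(version: str) -> bool:
    """True if the given AWF version is at or above the release with the fix."""
    return parse_version(version) >= FIXED_IN


if __name__ == "__main__":
    for candidate in ("v0.27.12", "v0.27.13", "0.28.0"):
        print(candidate, has_b7_fix(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating `0.27.9` as newer than `0.27.13`.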