Skip to content

Runner Doctor Updater: add cache-memory restore fallback key#5888

Open
lpcox with Copilot wants to merge 3 commits into
mainfrom
copilot/aw-fix-runner-doctor-updater
Open

Runner Doctor Updater: add cache-memory restore fallback key#5888
lpcox with Copilot wants to merge 3 commits into
mainfrom
copilot/aw-fix-runner-doctor-updater

Conversation

Copilot AI commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

The Runner Doctor Updater failed when the agent emitted missing_data (cache_memory_miss) and then terminated before producing a useful issue payload. This change reduces cache cold-start misses by broadening cache restore fallback behavior in the updater workflow.

  • Cache restore key behavior

    • Updated self-hosted-runner-doctor-updater.lock.yml to add a workflow-level fallback restore key in addition to the existing date-scoped key.
    • This allows reuse of prior cache-memory entries when a same-day key is unavailable.
  • Regression guard

    • Updated scripts/ci/self-hosted-runner-doctor-updater-workflow.test.ts to assert presence of the new fallback restore key so future lock updates don’t regress this behavior.
restore-keys: |
  memory-none-nopolicy-${{ env.GH_AW_WORKFLOW_ID_SANITIZED }}-${{ env.CACHE_MEMORY_DATE }}-
  memory-none-nopolicy-${{ env.GH_AW_WORKFLOW_ID_SANITIZED }}-

Copilot AI linked an issue Jul 3, 2026 that may be closed by this pull request
Copilot AI changed the title [WIP] Fix missing data in Runner Doctor Updater Runner Doctor Updater: add cache-memory restore fallback key Jul 3, 2026
Copilot finished work on behalf of lpcox July 3, 2026 20:29
Copilot AI requested a review from lpcox July 3, 2026 20:29
@lpcox lpcox marked this pull request as ready for review July 3, 2026 20:30
Copilot AI review requested due to automatic review settings July 3, 2026 20:30

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adjusts the locked Runner Doctor Updater workflow’s actions/cache restore behavior to reduce cache_memory_miss cold-start failures by adding a broader fallback restore key, and updates CI coverage to guard against regressions in that restore-key behavior.

Changes:

  • Added a workflow-level cache restore fallback key (in addition to the existing date-scoped prefix) in self-hosted-runner-doctor-updater.lock.yml.
  • Updated the workflow-lock regression test to check for the presence of the new fallback restore key.
Show a summary per file
File Description
scripts/ci/self-hosted-runner-doctor-updater-workflow.test.ts Adds a regression assertion for the fallback restore key (but currently too broad to reliably detect regressions).
.github/workflows/self-hosted-runner-doctor-updater.lock.yml Adds a second, less-specific restore-keys entry to allow reuse of prior cache-memory entries when same-day cache is missing.

Review details

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 1
  • Review effort level: Low

expect(lock).toContain('🩺 Runner Doctor Update');
expect(lock).toContain('shared/self-hosted-failure-modes.md');
expect(lock).toContain('Compute scan window');
expect(lock).toContain('memory-none-nopolicy-${{ env.GH_AW_WORKFLOW_ID_SANITIZED }}-');
@lpcox

lpcox commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator

@copilot address review feedback

Copilot AI commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

@copilot address review feedback

Fixed in the latest commit. The assertion now uses a regex that anchors a newline immediately after the trailing -, so the date-scoped key (...-${{ env.CACHE_MEMORY_DATE }}-) cannot match it:

expect(lock).toMatch(/memory-none-nopolicy-\$\{\{ env\.GH_AW_WORKFLOW_ID_SANITIZED \}\}-\n/);

Copilot finished work on behalf of lpcox July 3, 2026 20:36
@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

✅ Copilot review passed with no inline comments.

@copilot Add the ready-for-aw label to this PR to trigger agentic CI smoke tests.

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

🚀 Security Guard has started processing this pull request

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Chroot tests failed Smoke Chroot failed - See logs for details.

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

🔑 Smoke Copilot PAT PAT auth validated. All systems operational. ✅

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Contribution Check completed successfully!

PR #5888 follows the applicable contribution guidelines; no review comment needed.

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Smoke Claude passed

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

🌑 The shadows whisper... Smoke Codex failed. The oracle requires further meditation...

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Build Test Suite completed successfully!

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

🔌 Smoke Services — All services reachable! ✅

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Smoke Gemini completed. All facets verified. 💎

Smoke test completed with failures. Connectivity to GitHub.com is blocked and GitHub MCP tools were not found in the environment.

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

✅ Coverage Check Passed

Overall Coverage

Metric Base PR Delta
Lines 98.59% 98.63% 📈 +0.04%
Statements 98.52% 98.55% 📈 +0.03%
Functions 99.44% 99.44% ➡️ +0.00%
Branches 94.34% 94.34% ➡️ +0.00%
📁 Per-file Coverage Changes (1 files)
File Lines (Before → After) Statements (Before → After)
src/workdir-setup.ts 93.2% → 94.9% (+1.69%) 93.2% → 94.9% (+1.69%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

🔥 Smoke Test: PAT Auth — PASS

Test Result
GitHub MCP ✅ PR list fetched
GitHub.com HTTP ✅ 200
File write/read ✅ Verified

Auth mode: PAT (COPILOT_GITHUB_TOKEN)@lpcox

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🔑 PAT report filed by Smoke Copilot PAT
Add label ready-for-aw to run again

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Smoke Test: Claude Engine Validation

Check Result
API Status ✅ PASS
GH Check ✅ PASS
File Status ✅ PASS

Overall Result: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

Generated by Smoke Claude for #5888 · 55.6 AIC · ⊞ 3.3K ·
Add label ready-for-aw to run again

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Copilot BYOK (Direct Mode) Smoke Tests - PASS

  • Test 1: GitHub MCP Testing - ✅
  • Test 2: GitHub.com Connectivity - ✅ (HTTP 200)
  • Test 3: File Write/Read - ✅
  • Test 4: BYOK Inference - ✅

Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY) via api-proxy → api.githubcopilot.com

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🔑 BYOK report filed by Smoke Copilot BYOK
Add label ready-for-aw to run again

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Gemini Engine Smoke Test Results

  • GitHub MCP Testing: ❌ (Tools unavailable/failed)
  • GitHub.com Connectivity: ❌ (Connection failed)
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅

Overall Status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini
Add label ready-for-aw to run again

@github-actions github-actions Bot mentioned this pull request Jul 3, 2026
@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

✅ GitHub MCP connectivity & PR data validated
✅ GitHub.com connectivity
✅ Sandbox file I/O
✅ BYOK inference path
Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw)
Overall: PASS
cc @lpcox

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🔑 BYOK (AOAI api-key) report filed by Smoke Copilot BYOK AOAI (api-key)
Add label ready-for-aw to run again

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

🔥 Smoke Test Results

Test Status
GitHub MCP Connectivity
GitHub.com HTTP ✅ 200
File Write/Read

Overall: PASS

Author: @lpcox

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

📰 BREAKING: Report filed by Smoke Copilot
Add label ready-for-aw to run again

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Smoke Test: Services Connectivity

  • Redis PING: ❌ Network is unreachable
  • PostgreSQL pg_isready: ❌ No response
  • PostgreSQL SELECT 1: ❌ Network is unreachable

Overall: FAILhost.docker.internal (172.17.0.1) is unreachable from this runner. Service containers are not accessible.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🔌 Service connectivity validated by Smoke Services
Add label ready-for-aw to run again

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

@lpcox

  • GitHub MCP connectivity: ✅
  • GitHub.com connectivity: ✅
  • File write/read test: ✅
  • BYOK inference test: ✅
    Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra
    Overall status: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)
Add label ready-for-aw to run again

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color 1/1 passed ✅ PASS
Go env 1/1 passed ✅ PASS
Go uuid 1/1 passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx All passed ✅ PASS
Node.js execa All passed ✅ PASS
Node.js p-limit All passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

Generated by Build Test Suite for #5888 · 48.8 AIC · ⊞ 6.9K ·
Add label ready-for-aw to run again

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

📡 Smoke Test: API Proxy OpenTelemetry Tracing

Scenario Result Notes
S1: Module Loading ✅ Pass otel.js loads successfully; isEnabled: true; exports: startRequestSpan, setTokenAttributes, setBudgetAttributes, endSpan, endSpanError, shutdown, isEnabled + internal test helpers
S2: Test Suite ✅ Pass 59 tests passed across 2 suites (otel.test.js, otel-fanout.test.js); 0 failures
S3: Env Var Forwarding ⚠️ Expected-pending src/services/api-proxy-service.ts does not yet forward OTEL_EXPORTER_OTLP_ENDPOINT/GITHUB_AW_OTEL_TRACE_ID — expected during development
S4: Token-Tracker Integration ✅ Pass token-tracker-http.js contains onUsage callback (OTEL hook point)
S5: OTEL Diagnostics i️ No spans No span file at runtime path — api-proxy OTEL file export not yet active in this run

Summary: All core OTEL implementation is in place and all 59 tests pass. Env var forwarding from host service config to the container is a known pending item. No unexpected failures.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

📡 OTel tracing validated by Smoke OTel Tracing
Add label ready-for-aw to run again

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[aw] Runner Doctor Updater is missing required data

3 participants