fix(copilot-sdk): post-completion idle watchdog to bound SDK hang after final tool result #41572

Merged
pelikhan merged 6 commits into main from copilot/fix-cli-process-hang-issue
Jun 26, 2026
Conversation

Copilot AI commented Jun 26, 2026

Contributor

After the agent's final tool call returns success, the SDK's sendAndWait can fail to resolve — leaving the process hung until the step timeout kills the runner and burning the full remaining step budget (up to ~20 min) with no additional work produced.

Changes

copilot_sdk_session.cjs — post-completion idle watchdog

  • Arms a setTimeout (default 5 min, override via GH_AW_SDK_IDLE_MS) whenever the session enters "work complete" state: hasOutput === true && pendingToolCalls.size === 0.
  • Disarms on every new tool.execution_start event (agent still mid-turn) and is cleared in finally when sendAndWait resolves normally.
  • On fire: calls session.disconnect() and sets postCompletionWatchdogTriggered; the catch block treats this as a successful completion — same semantics as the existing SDK_IDLE_TIMEOUT_PATTERN path.
  • Exports SDK_POST_COMPLETION_IDLE_MS_DEFAULT for external reference.
T+12m45s  create_issue → safeoutputs returns success
T+12m45s  agent emits final markdown summary
T+17m45s  [sdk-driver] warning: post-completion idle watchdog fired after 300000ms — force-disconnecting session
T+17m45s  [sdk-driver] warning: post-completion watchdog triggered disconnect — treating as completed
          [copilot-harness] attempt 1: process exit event exitCode=0   ← instead of T+20m timeout
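
The arm/disarm rules listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in copilot_sdk_session.cjs: the factory name createPostCompletionWatchdog, the onEvent/isArmed/dispose methods, the tool.execution_complete event name, and the toolCallId field are all hypothetical. Only hasOutput, pendingToolCalls, assistant.message, and tool.execution_start come from the PR text.

```javascript
// Hypothetical sketch of a post-completion idle watchdog (not the PR's actual code).
function createPostCompletionWatchdog({ idleMs, onFire }) {
  let timer = null;
  let hasOutput = false;
  const pendingToolCalls = new Set();

  function disarm() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
  }

  function armIfComplete() {
    disarm();
    // "Work complete": output collected and no tool call in flight.
    if (!hasOutput || pendingToolCalls.size !== 0) return;
    timer = setTimeout(() => {
      timer = null;
      // Re-check at fire time in case state changed after arming.
      if (hasOutput && pendingToolCalls.size === 0) onFire();
    }, idleMs);
  }

  return {
    // Event shapes below are assumed for illustration.
    onEvent(event) {
      if (event.type === "assistant.message") hasOutput = true;
      if (event.type === "tool.execution_start") pendingToolCalls.add(event.toolCallId);
      if (event.type === "tool.execution_complete") pendingToolCalls.delete(event.toolCallId);
      armIfComplete();
    },
    isArmed: () => timer !== null,
    dispose: disarm, // call from finally when sendAndWait resolves normally
  };
}
```

In this sketch onFire would be the place to call session.disconnect() and set a flag such as postCompletionWatchdogTriggered, so the catch block can tell a watchdog disconnect apart from a genuine failure.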

copilot_sdk_driver.test.cjs — 4 new regression tests

| Test | Asserts |
| --- | --- |
| watchdog fires on hang | exitCode 0, output preserved |
| watchdog disarmed by new tool call | normal completion, single disconnect |
| watchdog not armed before first output | no spurious disconnect |
| sendAndWait resolves before timer fires | watchdog cleared, no double-disconnect |


✨ PR Review Safe Output Test - Run 28215703568

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

  • accounts.google.com
  • android.clients.google.com
  • clients2.google.com
  • contentautofill.googleapis.com
  • safebrowsingohttpgateway.googleapis.com
  • www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

💥 [THE END] — Illustrated by Smoke Claude · 67.9 AIC · ⌖ 25.5 AIC · ⊞ 8.7K ·

Copilot AI and others added 2 commits June 26, 2026 01:51
… after final tool result

When the Copilot SDK's sendAndWait never resolves after the agent's final
tool call returns success, the process would hang until the full step timeout
fires (burning the entire remaining step budget).

This adds a post-completion idle watchdog (GH_AW_SDK_IDLE_MS, default 5 min)
to copilot_sdk_session.cjs:

- The watchdog is armed whenever the session enters the "work complete" state:
  output has been collected AND no tool calls are still pending.
- It is disarmed when a new tool call starts (the agent is still mid-turn).
- If no new SDK events arrive within the idle window, the watchdog
  force-disconnects the session; the catch block treats that as a successful
  completion (same semantics as the existing idle-timeout path).

This bounds the post-completion hang from full-step-timeout (~20 min) to at
most the watchdog idle period (~5 min), without affecting normal completions
where sendAndWait resolves on its own.

Closes #41551

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…ce-condition guard

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix Copilot CLI process hanging after safe-output" to "fix(copilot-sdk): post-completion idle watchdog to bound SDK hang after final tool result" Jun 26, 2026
Copilot AI requested a review from pelikhan June 26, 2026 01:53
@pelikhan

Collaborator

@copilot Review the copilot SDK documentation to ensure we are not missing any events.

Copilot AI commented Jun 26, 2026

Contributor Author

@copilot Review the copilot SDK documentation to ensure we are not missing any events.

Reviewed the @github/copilot-sdk SessionEvent union in session-events.d.ts. Found several events not yet handled in copilot_sdk_session.cjs:

| SDK event | Was missing | Action taken |
| --- | --- | --- |
| assistant.turn_start | yes | Sets inAssistantTurn = true, disarming the post-completion watchdog while the agent is mid-LLM-inference (before it dispatches any tool calls in the new turn) |
| assistant.turn_end | yes | Sets inAssistantTurn = false, allowing the watchdog to re-arm after the turn |
| session.task_complete | yes | Logged to JSONL with success + summary fields |
| subagent.started | yes | Logged to JSONL |
| subagent.completed | yes | Logged to JSONL |
| subagent.failed | yes | Logged to JSONL |

Also updated the watchdog arm/fire conditions to include !inAssistantTurn, so a slow LLM inference turn that starts with zero pending tool calls no longer risks a premature force-disconnect.
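
The updated arm condition can be pictured as a small predicate plus a state update for the two turn events. This is a hedged sketch: shouldArmWatchdog and applyTurnEvent are hypothetical names, and the state object is an assumption; only inAssistantTurn, hasOutput, pendingToolCalls, and the two event names come from the comment above.

```javascript
// Hypothetical predicate; the field names follow the comment above.
function shouldArmWatchdog(state) {
  return state.hasOutput && state.pendingToolCalls.size === 0 && !state.inAssistantTurn;
}

// Assumed state update for the two turn events named in the table.
function applyTurnEvent(state, eventType) {
  if (eventType === "assistant.turn_start") return { ...state, inAssistantTurn: true };
  if (eventType === "assistant.turn_end") return { ...state, inAssistantTurn: false };
  return state;
}

// A turn that starts with zero pending tool calls no longer qualifies for arming.
const idle = { hasOutput: true, pendingToolCalls: new Set(), inAssistantTurn: false };
const midTurn = applyTurnEvent(idle, "assistant.turn_start");
console.log(shouldArmWatchdog(idle), shouldArmWatchdog(midTurn)); // true false
```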

Events intentionally left in default (skip) after review:

  • tool.execution_partial_result / tool.execution_progress — always ephemeral: true, filtered before the switch
  • session.idle — always ephemeral: true, filtered before the switch
  • external_tool.requested / external_tool.completed — not used in gh-aw's driver (no external tools registered); external_tool.completed is always ephemeral so it cannot be used to remove entries from pendingToolCalls

Two new regression tests added: watchdog disarmed during a turn_start → turn_end cycle, and session.task_complete written to JSONL. All 11,123 tests pass.

@pelikhan pelikhan marked this pull request as ready for review June 26, 2026 02:21
Copilot AI review requested due to automatic review settings June 26, 2026 02:21

Copilot AI left a comment

Contributor

Pull request overview

Adds a “post-completion” idle watchdog to the Copilot SDK session runner to prevent hangs where sendAndWait never resolves after the agent has effectively finished, reducing wasted runner time and step timeouts.

Changes:

  • Arm/rearm a post-completion watchdog timer when hasOutput === true and pendingToolCalls.size === 0, force-disconnecting the session if no further events arrive within the configured window.
  • Treat watchdog-triggered disconnects as successful completion when output is present and there are no pending tool calls.
  • Add regression tests covering watchdog firing, disarming behavior, and normal-completion behavior.
| File | Description |
| --- | --- |
| actions/setup/js/copilot_sdk_session.cjs | Adds the post-completion idle watchdog and success-handling path to bound SDK hangs after final tool completion. |
| actions/setup/js/copilot_sdk_driver.test.cjs | Adds regression tests validating watchdog behavior and ensuring completion semantics are preserved. |

Review details

  • Files reviewed: 2/2 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment thread actions/setup/js/copilot_sdk_driver.test.cjs Outdated
Comment on lines +580 to +592
      // sendAndWait resolves before watchdog fires (watchdog idle = 20ms in test).
      return { data: { content: "completed normally" } };
    }),
    disconnect,
  };
  class FakeCopilotClient {
    start = vi.fn().mockResolvedValue(undefined);
    createSession = vi.fn().mockResolvedValue(session);
    stop = stop;
  }

  const prevIdleMs = process.env.GH_AW_SDK_IDLE_MS;
  process.env.GH_AW_SDK_IDLE_MS = "20";
@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

PR Code Quality Reviewer completed the code quality review.

@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

Test Quality Sentinel completed test quality analysis.

TQS analysis already completed in this workflow run (run-id 28213036855): comment posted and APPROVE review submitted for PR #41572 (score 90/100, 4 design tests, 0 violations). Both add_comment and submit_pull_request_review limits were consumed in the prior agent session.

@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR does not have the 'implementation' label and has ≤100 new lines of code in business logic directories (default_business_additions=0, file_count=2).

@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 90/100 — Excellent

Analyzed 4 test(s) in actions/setup/js/copilot_sdk_driver.test.cjs: 4 design, 0 implementation, 0 guideline violation(s). Test inflation detected (281 test lines / 63 production lines ≈ 4.46:1); 10-point deduction applied.

📊 Metrics & Test Classification (4 tests analyzed)
| Metric | Value |
| --- | --- |
| New/modified tests analyzed | 4 |
| ✅ Design tests (behavioral contracts) | 4 (100%) |
| ⚠️ Implementation tests (low value) | 0 (0%) |
| Tests with error/edge cases | 4 (100%) |
| Duplicate test clusters | 0 |
| Test inflation detected | YES: 281 test lines / 63 production lines ≈ 4.46:1 (threshold: 2:1) |
| 🚨 Coding-guideline violations | 0 |

| Test | File | Classification | Issues Detected |
| --- | --- | --- | --- |
| post-completion idle watchdog fires and treats session as completed | copilot_sdk_driver.test.cjs | ✅ Design | |
| post-completion watchdog does not fire when tool calls are still pending | copilot_sdk_driver.test.cjs | ✅ Design | |
| post-completion watchdog does not trigger when output not yet collected | copilot_sdk_driver.test.cjs | ✅ Design | |
| post-completion watchdog does not treat success as failure when sendAndWait resolves before timer fires | copilot_sdk_driver.test.cjs | ✅ Design | |

Go: 0 (*_test.go); JavaScript: 4 (*.test.cjs). No other languages detected.

Score components: behavioral_ratio=40/40, edge_case_ratio=30/30, duplication_penalty=0, inflation_penalty=10 → Score = 90

📝 Per-Test Analysis

post-completion idle watchdog fires and treats session as completed — ✅ Design test. Simulates the production regression (sendAndWait hanging after the final tool result) by having sendAndWait block until the watchdog calls session.disconnect(). Asserts exitCode=0, hasOutput=true, output content, and disconnect call count. High value — deleting this test would allow the core SDK hang regression to go undetected. Mocks target external Copilot SDK session objects (legitimate).

post-completion watchdog does not fire when tool calls are still pending — ✅ Design test. Tests the watchdog disarm invariant: a new tool.execution_start must disarm the watchdog even after a prior arm. Asserts disconnect called exactly once (finally cleanup, not watchdog). High value — deleting would allow premature disconnects mid-session.

post-completion watchdog does not trigger when output not yet collected — ✅ Design test. Verifies the guard condition: watchdog only arms when hasOutput=true. Tool calls completing before any assistant.message must not arm the watchdog. Asserts normal completion and single disconnect. High value — covers the guard precondition preventing false triggers on noisy early events.

post-completion watchdog does not treat success as failure when sendAndWait resolves before timer fires — ✅ Design test. Regression guard for the happy path: watchdog armed but unneeded must not cause double-disconnect or wrong exit codes. Asserts disconnect called exactly once. High value — ensures the watchdog does not degrade normal completions.

Mocking assessment: All vi.fn() calls target the external Copilot SDK session interface (disconnect, stop, sendAndWait, createSession, start). Mocking external SDK client objects is correct for testing driver-layer behavior. Call-count assertions (toHaveBeenCalledTimes(1), toHaveBeenCalled) verify behavioral side-effects (watchdog fires / does not fire), not internal function routing — appropriate use of mocks.

Inflation note: The 4.46:1 ratio exceeds the 2:1 threshold and incurs a 10-point deduction. The scaffolding cost is justified — each async timing test requires Promise-coordinated fake SDKs, event sequences, and env-var isolation. Verbosity is structural, not duplicative.

Verdict

Check passed. 0% implementation tests (threshold: 30%). All 4 new tests verify observable behavioral contracts of the post-completion idle watchdog — covering the hang-fix scenario, watchdog disarm invariant, no-output guard, and happy-path non-interference. No coding-guideline violations detected.

🧪 Test quality analysis by Test Quality Sentinel · 95.5 AIC · ⌖ 16.9 AIC · ⊞ 8.4K ·

@github-actions github-actions Bot left a comment

Contributor

✅ Test Quality Sentinel: 90/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 4 new tests verify behavioral contracts of the post-completion idle watchdog.

@github-actions github-actions Bot left a comment

Contributor

Skills-Based Review 🧠

Applied /diagnose, /tdd, /zoom-out, and /grill-with-docs — commenting with minor-to-medium observations. No blocking issues; the fix is correct and well-tested.

📋 Key Themes & Highlights

Key Themes

  • Test spec precision: One assertion uses .toHaveBeenCalled() where .toHaveBeenCalledTimes(2) is stated in the comment — a small but exploitable gap in the regression spec.
  • Env var naming: GH_AW_SDK_IDLE_MS is ambiguous alongside the existing SDK_IDLE_TIMEOUT_PATTERN path; GH_AW_SDK_POST_COMPLETION_IDLE_MS would reduce operator confusion.
  • Test boilerplate: The env var save/restore pattern is copy-pasted 4× — a beforeEach/afterEach block would tighten this.
  • Comment accuracy: The "race condition guard" label and one test name do not reflect the actual mechanics.
  • Sliding-window semantics: The watchdog re-arms on every qualifying event; worth documenting whether this is intentional (it probably is) so future maintainers don't change it inadvertently.

Positive Highlights

  • ✅ Correct placement of the watchdog in the event handler and solid finally-block cleanup — no timer leaks.
  • ✅ The !catastrophicToolDenialsError guard in the catch path correctly preserves the existing denial-failure semantics.
  • ✅ 4 targeted regression tests that each isolate a distinct state transition — arm, disarm, no-premature-arm, and normal-resolution.
  • ✅ Exported SDK_POST_COMPLETION_IDLE_MS_DEFAULT constant makes the default visible and testable without reading the source.
  • ✅ Clear PR description with a worked timeline example that makes the fix easy to validate.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 74.2 AIC · ⌖ 11.4 AIC · ⊞ 6.5K

expect(result.hasOutput).toBe(true);
expect(result.output).toContain("Issue filed successfully");
// disconnect is called twice: once by the watchdog and once in finally.
expect(disconnectWithSignal).toHaveBeenCalled();

Contributor

[/tdd] The comment says disconnect is called twice (watchdog + finally), but the assertion only checks .toHaveBeenCalled() — it does not verify the exact count.

This leaves the spec ambiguous: if the watchdog somehow called disconnect three times, the test would still pass. Pin the count to match the stated expectation.

💡 Suggested fix
// disconnect is called twice: once by the watchdog and once in finally.
expect(disconnectWithSignal).toHaveBeenCalledTimes(2);

Using toHaveBeenCalledTimes(2) turns the comment into a machine-checked specification, preventing a future over-disconnect from going undetected.

postCompletionWatchdog = setTimeout(() => {
postCompletionWatchdog = null;
// Re-check conditions at fire time: a new tool call could have started
// between arming the watchdog and the timer firing (race condition guard).

Contributor

[/diagnose] The phrase "race condition guard" is inaccurate in a single-threaded JavaScript runtime — there is no concurrent mutation here.

The guard is catching stale state at timer-fire time: between when the watchdog was armed and when the macrotask fires, new events could have changed hasOutput or pendingToolCalls. "Stale-state guard" is the accurate description, and makes the intent clearer to future readers.

💡 Suggested comment
// Stale-state guard: events processed between arming the watchdog and the
// timer firing could have changed hasOutput / pendingToolCalls.
if (!hasOutput || pendingToolCalls.size !== 0 || !session) return;

// block treats the result as a successful completion. This bounds the damage
// from the SDK driver bug where sendAndWait never resolves after the final
// tool result is returned.
const postCompletionIdleMs = getEnvPositiveIntOrDefault("GH_AW_SDK_IDLE_MS", SDK_POST_COMPLETION_IDLE_MS_DEFAULT);

Contributor

[/grill-with-docs] GH_AW_SDK_IDLE_MS is ambiguous alongside the existing SDK_IDLE_TIMEOUT_PATTERN mechanism — an operator could easily conflate the two and misconfigure the wrong timeout.

Consider renaming to GH_AW_SDK_POST_COMPLETION_IDLE_MS to match the exported constant name (SDK_POST_COMPLETION_IDLE_MS_DEFAULT) and to self-document the specific hang scenario it addresses.

💡 Context

The existing idle-timeout path is triggered by the SDK emitting a Timeout after Nms waiting for session.idle error — a different failure mode. A new operator reading the env var docs would have no way to distinguish which timeout GH_AW_SDK_IDLE_MS controls without reading the source.
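
For readers unfamiliar with the helper called in the snippet above, a function with the shape of getEnvPositiveIntOrDefault might look like the following. The real helper's body is not shown anywhere in this thread, so this is an assumed reimplementation for illustration only; the third env parameter is an addition made here so the example can be exercised without touching process.env. The 5 minute default matches the value stated in the PR description.

```javascript
const SDK_POST_COMPLETION_IDLE_MS_DEFAULT = 5 * 60 * 1000; // 5 min, per the PR description

// Hypothetical reimplementation: the real helper's body is not shown in this PR.
function getEnvPositiveIntOrDefault(name, fallback, env = process.env) {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  // Reject NaN, fractions, zero, and negatives so a typo cannot disable the watchdog.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Tests in this PR set the variable to "20"; an unset or invalid value falls back.
console.log(getEnvPositiveIntOrDefault("GH_AW_SDK_IDLE_MS", SDK_POST_COMPLETION_IDLE_MS_DEFAULT, { GH_AW_SDK_IDLE_MS: "20" })); // 20
console.log(getEnvPositiveIntOrDefault("GH_AW_SDK_IDLE_MS", SDK_POST_COMPLETION_IDLE_MS_DEFAULT, {})); // 300000
```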

}
});

it("post-completion watchdog does not fire when tool calls are still pending", async () => {

Contributor

[/tdd] The test name "does not fire when tool calls are still pending" doesn't match the actual scenario.

The scenario emits an assistant.message first (which arms the watchdog), then starts a new tool call (which disarms it), completes it (which re-arms it), and finally resolves sendAndWait normally before the timer fires. The scenario being tested is watchdog disarmed on new tool start, re-armed on completion, cleared on normal resolution.

A more accurate name would help future readers understand the contract at a glance, e.g.: "post-completion watchdog is disarmed by a new tool call and cleared on normal sendAndWait resolution".


const prevIdleMs = process.env.GH_AW_SDK_IDLE_MS;
// Use a very short idle timeout so the watchdog fires quickly in tests.
process.env.GH_AW_SDK_IDLE_MS = "20";

Contributor

[/tdd] The GH_AW_SDK_IDLE_MS save/restore pattern is copy-pasted across all 4 new tests. A beforeEach/afterEach pair would eliminate the repetition and make it impossible to forget cleanup in one test.

💡 Suggested refactor
let prevSdkIdleMs;

beforeEach(() => {
  prevSdkIdleMs = process.env.GH_AW_SDK_IDLE_MS;
  process.env.GH_AW_SDK_IDLE_MS = "20";
});

afterEach(() => {
  if (prevSdkIdleMs === undefined) delete process.env.GH_AW_SDK_IDLE_MS;
  else process.env.GH_AW_SDK_IDLE_MS = prevSdkIdleMs;
});

Place these inside the relevant describe block so they scope only to the watchdog tests.

postCompletionWatchdog = null;
// Re-check conditions at fire time: a new tool call could have started
// between arming the watchdog and the timer firing (race condition guard).
if (!hasOutput || pendingToolCalls.size !== 0 || !session) return;

Contributor

[/diagnose] The !session guard is always false at this point — the watchdog can only be armed inside session.on(...), which is called after session is assigned. The guard is dead code.

Consider removing it to keep the condition accurate, or replace it with a comment explaining why it was considered (and why it was safe to drop).

// tool call was just started, or no output yet).
// The watchdog fires only if sendAndWait never resolves on its own after
// the final tool result is returned — the common SDK post-completion hang.
if (hasOutput && pendingToolCalls.size === 0) {

Contributor

[/zoom-out] The watchdog uses a sliding-window strategy: it re-arms on every event that satisfies hasOutput && pendingToolCalls.size === 0, not just on the initial transition into that state.

This means if the SDK keeps emitting qualifying events (e.g., repeated assistant.message chunks) after the final tool call, the window keeps resetting and the watchdog never fires — potentially defeating its purpose for a chatty SDK.

If the intent is "N ms of silence after the final tool result", the behaviour is correct and just needs a clearer comment. If the intent is "fire N ms after first entering the complete state", the logic needs a one-shot arm:

💡 One-shot arm alternative
// Arm once when transitioning into the post-completion state.
if (hasOutput && pendingToolCalls.size === 0 && !postCompletionWatchdog) {
  postCompletionWatchdog = setTimeout(() => { ... }, postCompletionIdleMs);
}
// Disarm when no longer complete.
if (!(hasOutput && pendingToolCalls.size === 0) && postCompletionWatchdog) {
  clearTimeout(postCompletionWatchdog);
  postCompletionWatchdog = null;
}

The PR description says the SDK "fails to resolve after the final tool result", which implies silence after that point — so sliding-window is likely fine. Just worth making the choice explicit in a comment.

@github-actions github-actions Bot left a comment

Contributor

REQUEST_CHANGES — Two issues that need fixing before merge; two additional cleanup items.

🔴 Blocking issues

1. Double-disconnect can override the successful return (HIGH)

The watchdog fires void session.disconnect() and the finally block unconditionally calls await session.disconnect() again. If the SDK's disconnect() rejects on a second call, the finally-block rejection propagates to the caller instead of the return { exitCode: 0 } from the catch block. In JavaScript, an exception thrown from finally overrides any return or throw in try/catch. The feature works correctly only if the SDK happens to be idempotent on disconnect — an assumption that is not documented or tested.

See inline comment on line 418 for a concrete fix.
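
The language behaviour behind this issue can be reproduced in isolation. The sketch below is not the driver code: unguarded and guarded are hypothetical names, and the inner try/catch is only one possible guard, not necessarily the fix the PR adopted.

```javascript
// A rejection inside finally replaces the value returned from catch.
async function unguarded(disconnect) {
  try {
    throw new Error("watchdog disconnect");
  } catch {
    return { exitCode: 0 };
  } finally {
    await disconnect(); // if this rejects, the caller never sees { exitCode: 0 }
  }
}

// One possible guard (an assumption for illustration).
async function guarded(disconnect) {
  try {
    throw new Error("watchdog disconnect");
  } catch {
    return { exitCode: 0 };
  } finally {
    try {
      await disconnect();
    } catch {
      // Swallow: the session is already being torn down.
    }
  }
}
```

With a disconnect that rejects on its second call, unguarded rejects with the disconnect error while guarded still resolves to { exitCode: 0 }.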

2. Watchdog re-arms on any SDK event — defeats the timeout on keepalive-emitting sessions (MEDIUM)

The watchdog is rearmed (clearTimeout + fresh setTimeout) on every event where hasOutput && pendingToolCalls.size === 0. In the exact hung-session scenario this watchdog is designed to fix, if the SDK emits any periodic keepalive or status event while sendAndWait is stuck, the idle window resets indefinitely. The watchdog will never fire.

See inline comment on line 331 for two alternative designs.
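
The difference between the two strategies can be shown with a deterministic model rather than real timers. Everything here is an illustrative assumption: the function names are hypothetical, event timestamps are assumed to be sorted in ascending order, and the one-minute keepalive cadence is invented to show the effect (the thread does not confirm that the SDK emits such events).

```javascript
// Model: given timestamps (ms) of qualifying events, when does each strategy fire?
// Returns the fire time, or null if it never fires within horizonMs.
function slidingWindowFireTime(eventTimes, idleMs, horizonMs) {
  // Each qualifying event resets the deadline to eventTime + idleMs.
  let deadline = Infinity;
  for (const t of eventTimes) {
    if (t >= deadline) break; // the watchdog already fired before this event
    deadline = t + idleMs;
  }
  return deadline <= horizonMs ? deadline : null;
}

function oneShotFireTime(eventTimes, idleMs, horizonMs) {
  // Only the first qualifying event arms the timer.
  if (eventTimes.length === 0) return null;
  const deadline = eventTimes[0] + idleMs;
  return deadline <= horizonMs ? deadline : null;
}

// Hypothetical keepalive every 60 s during a 20 min step, with a 5 min idle window.
const keepalives = Array.from({ length: 21 }, (_, i) => i * 60_000);
console.log(slidingWindowFireTime(keepalives, 300_000, 1_200_000)); // null
console.log(oneShotFireTime(keepalives, 300_000, 1_200_000)); // 300000
```

When the session is silent after the final event, both strategies fire at the same time, so the choice only matters if events keep arriving during the hang.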

🟡 Cleanup items (non-blocking)
  • Test assertion too weak (line 409, test file): toHaveBeenCalled() should be toHaveBeenCalledTimes(2) since the comment explicitly says disconnect fires twice.
  • Dead !session guard (line 335, session file): session is a const that is never nulled; the guard is unreachable.

🔎 Code quality review by PR Code Quality Reviewer · 127.1 AIC · ⌖ 8.86 AIC · ⊞ 5.2K

Comments that could not be inline-anchored

actions/setup/js/copilot_sdk_session.cjs:418

Double-disconnect: watchdog and finally both call session.disconnect() — if the SDK is not idempotent, the finally-block exception overrides the successful catch-block return.

<details>
<summary>💡 Details and suggested fix</summary>

When the watchdog fires (line 338), it calls void session.disconnect() (fire-and-forget). The finally block then unconditionally calls await session.disconnect() a second time. In JavaScript, a finally block always runs even after a return i…

actions/setup/js/copilot_sdk_session.cjs:331

Watchdog re-arms on every SDK event — SDK keepalives or status messages emitted while sendAndWait is hung will prevent the watchdog from ever firing.

<details>
<summary>💡 Details</summary>

The watchdog is rearmed (clearTimeout + fresh setTimeout) on every SDK event where hasOutput && pendingToolCalls.size === 0 — not only on tool or message events. This implements an idle timer: "no events for N ms."

The problem is the hang scenario this watchdog is designed to fix: `sendAnd…

actions/setup/js/copilot_sdk_driver.test.cjs:409

Assertion is weaker than the comment claims — toHaveBeenCalled() does not verify the expected two-call behavior.

<details>
<summary>💡 Suggested fix</summary>

The inline comment on the previous line says:

// disconnect is called twice: once by the watchdog and once in finally.
expect(disconnectWithSignal).toHaveBeenCalled();

toHaveBeenCalled() passes with one or more calls. If the double-disconnect is intentional behavior (and it is — see line 418 in `copilot_sdk_sessi…

actions/setup/js/copilot_sdk_session.cjs:335

!session guard is dead code — session is a const that is never set to null.

<details>
<summary>💡 Details</summary>

session is assigned once via const session = await client.createSession(...) and is never reassigned or nulled anywhere in runWithCopilotSDK. The !session branch in the watchdog callback guard:

if (!hasOutput || pendingToolCalls.size !== 0 || !session) return;

...will never be true. This is confusing to future readers who might wonder what lifec…

@pelikhan

Collaborator

@copilot Run pr-finisher skill.

…val in flaky test

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 26, 2026

Contributor Author

@copilot Run pr-finisher skill.

Done. Addressed the two open review threads:

  • Removed the unused disconnect mock from the "watchdog fires" test.
  • Increased GH_AW_SDK_IDLE_MS from 20 to 500 in the "resolves before timer fires" test so normal completion always finishes before the watchdog on any CI runner.

Local validation passes. Commit: fix(tests): remove unused disconnect mock and increase watchdog interval in flaky test

@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

✅ All tools validated successfully! Agent Container Smoke Test confirms agent container is ready.

@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

🚀 Smoke Pi MISSION COMPLETE! Pi delivered. 🥧

Smoke test completed; created issue. PR auto-comment unsupported in current event context, label skipped because tests failed.

@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

🚀 Smoke Gemini MISSION COMPLETE! Gemini has spoken. ✨

@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

🚀 Smoke Antigravity MISSION COMPLETE! Antigravity has spoken. ✨

@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

🎬 THE END - Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨

@github-actions

Contributor

📰 BREAKING: Smoke Copilot is now investigating this pull request. Sources say the story is developing...

@github-actions

Contributor

📰 BREAKING: Smoke Copilot - AOAI (apikey) is now investigating this pull request. Sources say the story is developing...

@github-actions

Contributor

📰 BREAKING: Smoke Copilot - AOAI (Entra) is now investigating this pull request. Sources say the story is developing...

@github-actions

Contributor

Agent Container Tool Check

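| Tool | Status | Version |
| --- | --- | --- |
| bash | ✅ | 5.2.21 |
| sh | ✅ | available |
| git | ✅ | 2.54.0 |
| jq | ✅ | 1.7 |
| yq | ✅ | 4.53.3 |
| curl | ✅ | 8.5.0 |
| gh | ✅ | 2.95.0 |
| node | ✅ | 22.23.0 |
| python3 | ✅ | 3.11.15 (PyPy 7.3.23) |
| go | ✅ | 1.24.13 |
| java | ✅ | 21.0.11 |
| dotnet | ✅ | 10.0.301 |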

Result: 12/12 tools available ✅

Overall Status: PASS

🔧 Tool validation by Agent Container Smoke Test · 17.4 AIC · ⌖ 6.3 AIC · ⊞ 5.4K ·

@github-actions

Contributor

Smoke Test Results

  • GitHub MCP: ✅
  • Web Fetch: ✅
  • File Writing: ✅
  • Bash Tool: ✅
  • Build gh-aw: ❌

Overall Status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

Smoke Gemini — Powered by Gemini ·

@github-actions

Contributor

Smoke Test: Codex - 28215703546
Merged PRs: #41578, #41575
Results: ✅ GitHub PR reads, ✅ Serena, ✅ Playwright, ❌ web-fetch unavailable, ✅ temp file, ✅ build, ✅ comment-memory, ✅ cache memory, ⚪ issue field skipped
Overall: FAIL

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

  • accounts.google.com
  • android.clients.google.com
  • clients2.google.com
  • contentautofill.googleapis.com
  • safebrowsingohttpgateway.googleapis.com
  • www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex · 4.04 AIC · ⌖ 4.49 AIC · ⊞ 12.5K ·

@github-actions

Contributor

Comment Memory

Quiet terminal
Tools whisper through the smoke test
Git waits, caches hum

Note

This comment is managed by comment memory.

It stores persistent context for this thread in the code block at the top of this comment.
Edit only the text inside the backtick fences; workflow metadata and the footer are regenerated automatically.

Learn more about comment memory


@github-actions

💥 Smoke Test: Claude — Run 28215703568

Core #1-12: 1✅ 2✅ 3✅ 4✅ 5❌(Tavily) 6✅ 7✅ 8✅ 9✅ 10✅ 11✅ 12✅
PR Review #13-19: 13✅ 14✅ 15✅ 16✅ 17✅ 18✅ 19⚠️(skipped)

Overall: PARTIAL — Tavily MCP registered 0 tools.

💥 [THE END] — Illustrated by Smoke Claude · 67.9 AIC · ⌖ 25.5 AIC · ⊞ 8.7K ·

@github-actions github-actions Bot left a comment

💥 Automated smoke test review - all systems nominal!

💥 [THE END] — Illustrated by Smoke Claude · 67.9 AIC · ⌖ 25.5 AIC · ⊞ 8.7K

// Default idle period for the post-completion watchdog: 5 minutes.
// When the agent has produced output and all tracked tool calls have completed,
// the driver arms a watchdog timer. If no new SDK events arrive within this
// window, the driver force-disconnects the session and treats it as a successful

Consider documenting the default timeout value here for maintainability.

// (default 5 minutes), the watchdog force-disconnects the session and the catch
// block treats the result as a successful completion. This bounds the damage
// from the SDK driver bug where sendAndWait never resolves after the final
// tool result is returned.

Nice watchdog guard — a brief comment on the race-condition rationale would help future readers.
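
For readers following this review thread, the arm/disarm behavior described in the quoted code comments could look roughly like the sketch below. It is not the code from copilot_sdk_session.cjs: `createPostCompletionWatchdog`, `update`, `setTimer`, and `clearTimer` are hypothetical names introduced here for illustration. Only `SDK_POST_COMPLETION_IDLE_MS_DEFAULT`, `hasOutput`, and `pendingToolCalls` are taken from the PR description, and the real driver may wire them differently.

```javascript
// Illustrative sketch only; helper and parameter names are assumptions,
// not the implementation merged in this PR.
const SDK_POST_COMPLETION_IDLE_MS_DEFAULT = 5 * 60 * 1000; // 5 minutes

function createPostCompletionWatchdog({
  idleMs = SDK_POST_COMPLETION_IDLE_MS_DEFAULT,
  onFire,
  setTimer = setTimeout, // injectable so the logic can be tested without waiting
  clearTimer = clearTimeout,
}) {
  let handle = null;
  let triggered = false;

  // Cancel any pending timer; safe to call when nothing is armed.
  function disarm() {
    if (handle !== null) {
      clearTimer(handle);
      handle = null;
    }
  }

  // (Re)start the idle window. Re-arming on each event resets the countdown.
  function arm() {
    disarm();
    handle = setTimer(() => {
      handle = null;
      triggered = true;
      onFire();
    }, idleMs);
  }

  return {
    // Call after each SDK event with the current session state.
    update({ hasOutput, pendingToolCalls }) {
      if (hasOutput && pendingToolCalls.size === 0) {
        arm(); // "work complete": output exists and no tool call is in flight
      } else {
        disarm(); // agent is still mid-turn, or has produced nothing yet
      }
    },
    disarm,
    get triggered() {
      return triggered;
    },
  };
}
```

In a driver shaped like the one this PR describes, `onFire` would be the place to call `session.disconnect()`, `disarm()` would run in the `finally` block once `sendAndWait` settles, and the catch block would consult `triggered` to decide whether a disconnect error counts as a completed run.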

@pelikhan pelikhan merged commit d5a2522 into main Jun 26, 2026
183 checks passed
@pelikhan pelikhan deleted the copilot/fix-cli-process-hang-issue branch June 26, 2026 03:56
@github-actions

Comment Memory

Stone bot tap the keys
Green sparks jump on GitHub page
Smoke run done by dawn

Note

This comment is managed by comment memory.

It stores persistent context for this thread in the code block at the top of this comment.
Edit only the text inside the backtick fences; workflow metadata and the footer are regenerated automatically.

Learn more about comment memory

📰 BREAKING: Report filed by Smoke Copilot · 362.2 AIC · ⌖ 24.2 AIC · ⊞ 18.9K ·

@github-actions github-actions Bot left a comment

Smoke review done. Two inline taps.

📰 BREAKING: Report filed by Smoke Copilot · 362.2 AIC · ⌖ 24.2 AIC · ⊞ 18.9K

@github-actions

Smoke tests: ✅1, ✅2, ❌3, ❌4, ✅5, ✅6, ✅7, ✅8. Overall: FAIL. @github-actions

📰 BREAKING: Report filed by Smoke Copilot - AOAI (apikey) · 295.7 AIC · ⌖ 6.44 AIC · ⊞ 17.9K ·

@github-actions github-actions Bot left a comment

📰 BREAKING: Report filed by Smoke Copilot - AOAI (apikey) · 295.7 AIC · ⌖ 6.44 AIC · ⊞ 17.9K

3 participants