[audit-workflows] Agentic Workflow Audit - 2026-06-23 (92.5% healthy; copilot-sdk LONG-RUN dominant) #41112
Agentic Workflow Audit - 2026-06-23
Window: 2026-06-23 10:36Z to 21:22Z (~10.8h active). Overnight (06-22 21:44Z to 06-23 10:36Z) had 0 runs, confirmed quiet via empty continuation, so this 24h window is effectively complete despite the short active span.
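The window arithmetic can be checked with a short sketch. The timestamps come from the window line above; the helper name `hours_between` is hypothetical and is not part of any audit tooling.

```python
from datetime import datetime

def hours_between(start_iso: str, end_iso: str) -> float:
    """Return the elapsed time between two ISO-8601 timestamps, in hours."""
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    return (end - start).total_seconds() / 3600

# Active span reported for this audit (UTC).
active_h = hours_between("2026-06-23T10:36:00+00:00", "2026-06-23T21:22:00+00:00")
# Overnight span reported as having 0 runs (UTC).
quiet_h = hours_between("2026-06-22T21:44:00+00:00", "2026-06-23T10:36:00+00:00")

print(f"active: {active_h:.1f}h, quiet: {quiet_h:.1f}h, covered: {active_h + quiet_h:.1f}h")
```

Under these inputs the active and quiet spans together cover a little under 24 hours, which is consistent with treating the window as effectively complete.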
Overview
Engines: copilot 108 · claude 28 · pi 18 · codex 7 · antigravity 2 · gemini 2. No null_metadata runs this window (clean classification).
Health verdict: On-baseline and healthy. No new failure class, no shared incident, zero missing-tools/data/MCP failures. Two chronic offenders went quiet: Avenger (0 runs, 2nd consecutive day; schedule likely paused) and Skillet (0 fails, de-escalation holds).
Trends
Success rate sits at 92.5%, comfortably above the 90% guide line and holding the rebound from the 06-20 Skillet-driven dip (69.6%). The last five daily readings (91.5 → 69.6 → 91.7 → 94.5 → 92.5) show the fleet re-stabilized once the Skillet push-trigger storm cleared; remaining failures are almost entirely known-recurring families rather than new breakage.
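As a rough illustration of the guide-line check, the sketch below flags readings under 90% and computes the day-over-day change. The rates and the 06-20 dip are from the report; the other date labels assume consecutive daily readings ending on 06-23, and the variable names are hypothetical.

```python
GUIDE_LINE = 90.0  # guide line from the report, in percent

# (date, success rate %). Only 06-20 and 06-23 are dated in the report;
# the remaining labels assume one reading per consecutive day.
daily_rates = [
    ("06-19", 91.5),
    ("06-20", 69.6),
    ("06-21", 91.7),
    ("06-22", 94.5),
    ("06-23", 92.5),
]

# Days that fell below the guide line.
below_guide = [day for day, rate in daily_rates if rate < GUIDE_LINE]

# Change between the two most recent readings, in percentage points.
latest_delta = round(daily_rates[-1][1] - daily_rates[-2][1], 1)

print("below guide line:", below_guide)
print("latest day-over-day change (pp):", latest_delta)
```

Only the 06-20 reading falls under the guide line in this series; the latest reading is down two points from the previous one but still above 90%.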
Token volume is claude-measured only: copilot (the majority engine at 108/174) and codex do not report token counts, so recent bars are sparse and the absolute level understates true fleet consumption. AI-credit (aic) spend (~12.6K) is the more complete cost signal this window, and it was skewed by two long-running failed copilot jobs (see anomaly below).
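To make the coverage gap concrete, here is a minimal sketch of how much of the fleet sits behind engines that do not report tokens. The per-engine counts and the 174-run total are taken from the report; note that the listed engine counts sum to 165, so this sketch uses 174 as the denominator on the assumption that it is the full run count. All names are hypothetical.

```python
# Per-engine run counts from the overview line.
engine_runs = {
    "copilot": 108,
    "claude": 28,
    "pi": 18,
    "codex": 7,
    "antigravity": 2,
    "gemini": 2,
}

TOTAL_RUNS = 174  # assumption: taken from the "108/174" figure in the report
NON_REPORTING = ("copilot", "codex")  # engines stated not to report token counts

def share(runs: int, total: int = TOTAL_RUNS) -> float:
    """Percentage of total runs, rounded to one decimal place."""
    return round(100 * runs / total, 1)

copilot_share = share(engine_runs["copilot"])
claude_share = share(engine_runs["claude"])
unreported_share = share(sum(engine_runs[e] for e in NON_REPORTING))

print(f"copilot: {copilot_share}% of runs")
print(f"claude (token-measured): {claude_share}% of runs")
print(f"runs with no token counts (copilot + codex): {unreported_share}%")
```

Under these assumptions roughly two thirds of runs contribute nothing to the token chart, which is why aic spend is the better cost signal for this window.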
Failure Breakdown (13 total)
8 real prod-main failures, all known-recurring families
Dominant family: copilot-sdk-driver LONG-RUN agent-job fail, 4 of 8 real prod-main fails. Either the agent runs 20–53 min and then the agent job concludes failure, or it never starts (0-tok). Day 21+ of this chronic path.
2 intentional test failures (by design) + 3 dev/PR-branch failures
Intentional (excluded from health math):
Daily Credit Limit Test (Intentionally Broken), Daily Max AI Credits Test (Intentionally Fails).
Dev/PR branch (copilot/ feature branches, not main):
Changeset Generator (codex PR), Design Decision Gate (claude PR), Smoke Claude on Copilot (workflow_dispatch probe on copilot/fix-clause-copilot-configuration; by-design config-fix probe).
Daily Agent of the Day Blog Writer (run 28036291237) is the most expensive single failing run of the window: 52.6 minutes, $144.15 AI credits, 2.59M cache-read tokens, 56 assistant messages / 75 tool calls, with model claude-sonnet-4.6 routed via the copilot engine. Its firewall summary shows 1,336 of 1,541 requests blocked (87%), including repeated proxy.golang.org:443 denials. The signature suggests the agent spent most of its 52 minutes retrying blocked network egress before the agent job ultimately failed: a costly burn with no output. Paired with Daily Safe Output Integrator (24.5m/$145), these two runs dominated wasted aic spend.
Recurrence Tracking (repo memory updated)
[aw] Failure Investigator (6h).
Next Actions (no new recommendations; existing fixes still pending)
node into the AWF chroot (rec-chroot-node-bindmount): Issues Report Generator now failing 3 days straight; highest-confidence fixable item.
fix-cache-strategy-binary-path): Cache Strategy Analyzer remains ~100% fail, count 5.