Keep detection in warn mode on parser/agent failures so non-reviewable safe outputs are blocked#41547
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #41547 does not have the 'implementation' label and has 0 new lines of code in business logic directories (threshold: 100). |
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ Test Quality Sentinel completed test quality analysis. Test Quality Sentinel analysis for PR #41547 was already completed in the prior invocation of this workflow run (28204877295). Score: 90/100 APPROVE. Comment and review were already submitted successfully. |
There was a problem hiding this comment.
Pull request overview
Adjusts threat-detection result parsing so that “warn mode” (GH_AW_DETECTION_CONTINUE_ON_ERROR=true) consistently produces a warning conclusion for detection-side failures, preserving downstream safe-output gating behavior (blocking non-reviewable outputs while avoiding hard failures).
Changes:
- Updated
setDetectionFailure()to always emitconclusion=warningin warn mode (removing the prior conditional escalation tofailureforagent_failure/parse_error). - Updated the corresponding
main()test to expect warn-mode behavior for detection execution failures (nosetFailed()).
Show a summary per file
| File | Description |
|---|---|
| actions/setup/js/parse_threat_detection_results.cjs | Ensures warn mode always outputs conclusion=warning (and does not fail the job) for detection-side failures. |
| actions/setup/js/parse_threat_detection_results.test.cjs | Updates assertions to match warn-mode semantics for detection execution failure. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 2/2 changed files
- Comments generated: 0
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnose and /tdd — approving with two non-blocking suggestions.
📋 Key Themes & Highlights
Key Themes
- Dead variable (
/diagnose):detectionExecutionOutcomeis still read from env but no longer used in any conditional — only in a diagnostic log. A comment or inlining would prevent future confusion. - Missing symmetric test (
/tdd): There is no test forGH_AW_DETECTION_CONTINUE_ON_ERROR=false+DETECTION_AGENTIC_EXECUTION_OUTCOME=failure, so the "strict mode stays strict" invariant has no explicit guard after this change.
Positive Highlights
- ✅ Minimal and correct fix — removing
mustFailis the right call; theisWarnModeflag is now the sole gatekeeper, which matches the documented contract ofcontinueOnError. - ✅ Test rename is accurate — the old name "should fail even in warn mode" described the buggy behaviour; the new name reads as a spec.
- ✅ Clean PR description — the before/after code snippet makes the intent immediately clear.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 52.7 AIC · ⌖ 9.66 AIC · ⊞ 6.5K
| }); | ||
|
|
||
| it("should fail when detection execution failed even in warn mode", async () => { | ||
| it("should warn when detection execution failed in warn mode", async () => { |
There was a problem hiding this comment.
[/tdd] The updated test covers warn mode + execution failure → warning, but there is no symmetric test for error mode (GH_AW_DETECTION_CONTINUE_ON_ERROR=false) combined with DETECTION_AGENTIC_EXECUTION_OUTCOME=failure. Adding it would document that the fix is bounded to warn mode and that strict mode still produces failure.
💡 Suggested test to add
it("should fail when detection execution failed in error mode", async () => {
process.env.GH_AW_DETECTION_CONTINUE_ON_ERROR = "false";
process.env.DETECTION_AGENTIC_EXECUTION_OUTCOME = "failure";
mockExistsSync.mockReturnValue(false);
await mod.main();
expect(mockCore.setOutput).toHaveBeenCalledWith("conclusion", "failure");
expect(mockCore.setOutput).toHaveBeenCalledWith("success", "false");
expect(mockCore.setOutput).toHaveBeenCalledWith("reason", "agent_failure");
expect(mockCore.exportVariable).toHaveBeenCalledWith("GH_AW_DETECTION_CONCLUSION", "failure");
expect(mockCore.setFailed).toHaveBeenCalledWith(expect.stringContaining("Detection log file not found"));
});This pins that DETECTION_AGENTIC_EXECUTION_OUTCOME no longer overrides warn mode and confirms strict mode is unchanged — a regression guard for both sides of the logic.
🧪 Test Quality Sentinel Report✅ Test Quality Score: 90/100 — Excellent
📊 Metrics & Test Classification (1 test analyzed)
Go: 0 ( Inflation note: Test file shows 4 lines added vs 1 line added in the production file (4:1 ratio, threshold 2:1). However, this is test maintenance — the production change was primarily a deletion of a special-case guard ( Verdict
|
There was a problem hiding this comment.
Correct fix. Removing mustFail makes isWarnMode the single authoritative gate for setDetectionFailure, which is the right design — the old escalation-to-failure was bypassing warn-mode safe-output policy rather than honoring it. Test update accurately reflects the new behavior; reason=agent_failure + success=false still distinguish scanner
🔎 Code quality review by PR Code Quality Reviewer · 64.6 AIC · ⌖ 7.11 AIC · ⊞ 5.2K
| @@ -453,7 +453,6 @@ async function main() { | |||
| const runDetection = process.env.RUN_DETECTION; | |||
| const continueOnError = process.env.GH_AW_DETECTION_CONTINUE_ON_ERROR !== "false"; | |||
| const detectionExecutionOutcome = process.env.DETECTION_AGENTIC_EXECUTION_OUTCOME || ""; | |||
There was a problem hiding this comment.
detectionExecutionOutcome is now read but has no behavioral effect — it only appears in a core.info log statement after this change.
💡 Suggested clarification
Without a comment, a future reader may wonder whether detectionExecutionFailed was accidentally dropped, or whether this variable can be removed entirely. A short note prevents that confusion:
// retained for diagnostic logging only; no longer gates conclusion (see setDetectionFailure)
const detectionExecutionOutcome = process.env.DETECTION_AGENTIC_EXECUTION_OUTCOME || "";Downstream consumers can still distinguish the failure type via reason=agent_failure and success=false outputs, so nothing is lost behaviorally — this is purely a readability concern.
|
@copilot run pr-finisher skill |
In the referenced Issue Monster run, threat detection execution failed, but detection conclusion was escalated to
failureinstead of staying in warn mode. That prevented warn-mode safe-output policy from activating, soassign_to_agentproceeded when it should have been blocked and only cautionary commenting should have remained.Detection conclusion semantics
actions/setup/js/parse_threat_detection_results.cjssoGH_AW_DETECTION_CONTINUE_ON_ERROR=truealways yieldsconclusion=warningfor detection-side failures (agent_failure/parse_error), instead of conditionally escalating tofailure.Safe-output policy alignment
assign_to_agent) are gated by warn-mode threat policy, while comments retain the threat caution path.Behavioral expectation update
actions/setup/js/parse_threat_detection_results.test.cjsto reflect warn-mode outcome for detection execution failure.