Harden issue-intent confidence handling: clearer INTENT suffix + lenient validation#43344
Conversation
…rip-on-error lenient validation - Fix 1: Strengthen INTENT description suffix in generate_safe_outputs_tools.cjs from 'confidence (LOW/MEDIUM/HIGH)' to 'confidence (string, exactly one of: LOW, MEDIUM, HIGH)' - Fix 2: Add StripOnError/x-strip-on-error to FieldValidation struct and mark confidence and rationale in set_issue_type and set_issue_field as lenient fields. When these optional enrichment fields fail validation, they are stripped from the item instead of rejecting the whole message. - Also make validateIssueIntentLabels lenient: invalid label confidence and non-string rationale are stripped rather than rejecting the whole item. - Update tests: mock config includes x-strip-on-error, new tests cover lenient behavior for numeric confidence values (0.9, '0.95') and non-string rationale. Closes #43164 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ Test Quality Sentinel completed test quality analysis. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #43344 does not have the 'implementation' label and has only 16 new lines of code in business logic directories (threshold: 100). |
|
✅ PR Code Quality Reviewer completed the code quality review. |
There was a problem hiding this comment.
Pull request overview
This PR hardens the “issue intent” safe-output validation pipeline so optional enrichment fields (notably confidence and rationale) don’t cause otherwise-valid tool calls to be rejected when the agent sends malformed values. It also updates the tool description suffix to reduce model confusion about the expected types/values.
Changes:
- Clarifies the INTENT description suffix to explicitly state
confidenceis a string enum andrationaleis a string with max length. - Introduces
x-strip-on-error/StripOnErrorto allow invalid optional enrichment fields to be dropped instead of rejecting the entire item. - Makes
add_labelslabel-levelconfidence/rationalehandling lenient, and adds JS/Go test coverage for stripping behavior.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/safe_outputs_validation_config.go | Adds StripOnError to FieldValidation and enables it for intent enrichment fields. |
| pkg/workflow/safe_output_validation_config_test.go | Verifies StripOnError is enabled for confidence/rationale in issue-intent configs. |
| actions/setup/js/safe_output_type_validator.cjs | Implements strip-on-error behavior in validateItem and makes label intent validation lenient. |
| actions/setup/js/safe_output_type_validator.test.cjs | Adds tests ensuring invalid confidence/rationale are stripped (not rejected). |
| actions/setup/js/generate_safe_outputs_tools.cjs | Updates INTENT suffix text to be more explicit about type/allowed values. |
| actions/setup/js/generate_safe_outputs_tools.test.cjs | Updates expected INTENT suffix string in tests. |
Review details
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 6/6 changed files
- Comments generated: 1
- Review effort level: Low
🧪 Test Quality Sentinel Report✅ Test Quality Score: 81/100 — Excellent
📊 Metrics (7 tests)
i️ Inflation Note
Verdict
Warning Firewall blocked 1 domainThe following domain was blocked by the firewall during workflow execution:
network:
allowed:
- defaults
- "awmgmcpg"See Network Configuration for more information.
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnosing-bugs, /tdd, and /codebase-design — commenting with non-blocking suggestions.
📋 Key Themes & Highlights
Key Themes
- Dual code paths: The
x-strip-on-errorlogic is correctly implemented for item-level fields via the flag invalidateItem, but the equivalent label-level stripping invalidateIssueIntentLabelsis a parallel, hand-rolled implementation that doesn't reference the flag. This duplication is a future maintenance risk. - Test granularity: Some new tests bundle two independent failure modes in a single
itblock, making failure output less diagnostic. - Cross-language naming bridge: The Go
StripOnError/ JSx-strip-on-errormapping is undocumented in the JSDoc typedef, which will puzzle future maintainers.
Positive Highlights
- ✅ The bug description is clear and the root cause (agent sends numeric confidence, validator rejects the whole message) is properly addressed — not just the symptom.
- ✅ The
x-strip-on-errorextension follows the existingx-naming convention — nice consistency. - ✅ Strong test coverage added (+75 lines) covering both
set_issue_typeandset_issue_fieldfor both numeric-string and raw-number confidence values. - ✅ The INTENT description suffix improvement is precise and directly actionable for the model.
- ✅ Go and JS validation configs stay in sync —
StripOnErroradded to Go struct and the test validates it's set.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 63.8 AIC · ⌖ 8.02 AIC · ⊞ 6.6K
Comment /matt to run again
Comments that could not be inline-anchored
actions/setup/js/safe_output_type_validator.cjs:723
[/diagnosing-bugs] The x-strip-on-error flag is accessed via bracket notation on the validation object (validation["x-strip-on-error"]), but FieldValidation JSDoc defines the property as x-strip-on-error (with hyphens). In strict TypeScript / closure-annotated JS, this might not align with the typed property. More importantly, there is no test asserting that normalizedItem retains the valid fields when stripping — only that confidence is undefined. Consider adding an asserti…
actions/setup/js/safe_output_type_validator.cjs:159
[/diagnosing-bugs] The label-level rationale/confidence stripping is handled directly in validateIssueIntentLabels by silently ignoring invalid values, but the same x-strip-on-error extension used for item-level fields is not applied here. This creates two separate code paths for the same concept. A regression in either path may be harder to spot because the logic is duplicated rather than unified.
<details>
<summary>💡 Why this matters</summary>
If someone later changes the l…
actions/setup/js/safe_output_type_validator.test.cjs:1025
[/tdd] The two new x-strip-on-error tests batch multiple distinct input scenarios ("0.95" numeric string and raw 0.9) into a single it block. Per the red-green-refactor discipline, each distinct failure mode deserves its own test so the failure output is immediately diagnostic.
<details>
<summary>💡 Suggested split</summary>
it('strips confidence when sent as a numeric string ("0.95")', async () => {
const result = validateItem({ type: 'set_issue_type', issue_type: 'Bug', …
</details>
<details><summary>actions/setup/js/safe_output_type_validator.cjs:217</summary>
**[/codebase-design]** The JSDoc for `x-strip-on-error` is placed in the `@typedef` comment block for `FieldValidationConfig` but the runtime checks the flag via `validation["x-strip-on-error"]` at line 723. Because the hyphenated name cannot be a normal JS property key in the typed interface, two naming conventions now exist simultaneously — the JSON key `x-strip-on-error` and its Go mirror `StripOnError`. The JSDoc comment uses the JSON form, but downstream readers need to know the mapping.
…
</details>
<details><summary>actions/setup/js/safe_output_type_validator.test.cjs:395</summary>
**[/tdd]** The new label-strip tests cover `add_labels` but not `set_issue_type` or `set_issue_field` label entries. If those tools also accept a `labels` array with `confidence`/`rationale`, edge cases there are untested. If they do not, a comment clarifying that `validateIssueIntentLabels` is only called from `add_labels` would prevent future confusion.
<details>
<summary>💡 Clarification or additional test</summary>
If `validateIssueIntentLabels` is used only by `add_labels`, document that…
</details>There was a problem hiding this comment.
Review: Harden issue-intent confidence handling
Overall: The fix is correct, well-scoped, and safe. The root cause (model occasionally sending numeric confidence) is real, and the x-strip-on-error mechanism is a clean, extensible pattern to handle optional enrichment fields leniently.
What's good:
- The Go ↔ JS config round-trip is consistent:
StripOnError boolserialises to"x-strip-on-error"and the JS validator reads it with bracket access. - New tests cover all four cases (numeric string
"0.95", raw number0.9, non-string rationale, non-string confidence) across both top-level fields and label-level fields. - The description suffix change is minimal and unambiguous.
validateIssueIntentLabelsis hardened inline (strip instead of reject) to match the new flag semantics.
Minor suggestions (non-blocking):
Two inline comments filed — both are style/readability improvements with no correctness impact.
🧵 Reviewed using Impeccable skills by Impeccable Skills Reviewer · 62.9 AIC · ⌖ 6.39 AIC · ⊞ 4.9K
| const rationale = normalizeIssueIntentRationale(label.rationale, options); | ||
| if (rationale) { | ||
| normalizedLabel.rationale = rationale; | ||
| // Strip non-string rationale instead of rejecting (optional enrichment field) |
There was a problem hiding this comment.
The empty-body if branch for non-string rationale is intentional but easy to misread as a coding error. Consider inverting the condition to make the "strip and continue" intent explicit:
if (typeof label.rationale === 'string') {
const rationale = normalizeIssueIntentRationale(label.rationale, options);
if (rationale) normalizedLabel.rationale = rationale;
}
// non-string rationale silently stripped (optional enrichment field)This avoids the empty if body and makes the stripping behaviour self-evident.
@copilot please address this.
| errors.push(result.error); | ||
| // When x-strip-on-error is set, strip the invalid optional field instead of rejecting the item. | ||
| // This is used for enrichment-only fields like confidence and rationale. | ||
| if (validation["x-strip-on-error"]) { |
There was a problem hiding this comment.
The x-strip-on-error flag is accessed with bracket notation (validation["x-strip-on-error"]) here, but the JSDoc typedef (line 217) documents it as a property named x-strip-on-error. Bracket access is the only valid approach given the hyphen in the name — that's fine — but the typedef should explicitly note that the hyphen makes dot-access impossible, to avoid confusion:
/**
* `@property` {boolean} ['x-strip-on-error'] - ...
* (Bracket access only due to hyphen: validation['x-strip-on-error'])
*/Also consider whether a camelCase alias (e.g. stripOnError) in the JS config would be cleaner than a hyphenated key — the Go struct already uses StripOnError with json:"x-strip-on-error" for serialisation.
@copilot please address this.
There was a problem hiding this comment.
Two medium-priority maintainability concerns; no blocking correctness bugs found
The stripping-instead-of-rejecting approach is sound for the immediate problem (numeric confidence values silently dropping valid tool calls). The Go struct addition is clean and the test coverage is adequate for the new paths.
Two issues worth addressing before the pattern proliferates:
-
x-strip-on-errorhas no guard against co-occurring withrequired(line 716) — currently safe, but the code will silently pass structurally invalid items if someone combines both flags in future config. A one-line guard or an assertion in tests would close the gap permanently. -
Two independent stripping paths —
validateIssueIntentLabelshardcodes stripping for label sub-fields whilevalidateItemuses the schema flag. They work correctly today but will require parallel maintenance for any future behavioural change (e.g. adding a warning-on-strip log). This should be documented or unified.
🔎 Code quality review by PR Code Quality Reviewer · 110.3 AIC · ⌖ 6.53 AIC · ⊞ 5.4K
Comment /review to run again
| errors.push(result.error); | ||
| // When x-strip-on-error is set, strip the invalid optional field instead of rejecting the item. | ||
| // This is used for enrichment-only fields like confidence and rationale. | ||
| if (validation["x-strip-on-error"]) { |
There was a problem hiding this comment.
x-strip-on-error silently suppresses required-field failures with no guard against misconfiguration. If a future field is ever marked both required: true and x-strip-on-error: true, this branch swallows the required-field error and passes a structurally invalid item — with no runtime warning.
💡 Details and suggested fix
validateField returns isValid: false for required-field missing errors. The new x-strip-on-error branch catches that return path identically to a type-mismatch error. Nothing prevents a config author from combining both flags in the future.
Add an explicit guard at the call site:
if (validation["x-strip-on-error"]) {
if (validation.required) {
// required-field failures are hard errors regardless of x-strip-on-error
errors.push(result.error);
} else {
delete normalizedItem[fieldName];
}
} else {
errors.push(result.error);
}Alternatively, add a config-time assertion (e.g., in tests) that no field is simultaneously required and x-strip-on-error. The current config is safe, but the code provides no structural defence against future drift.
| const rationale = normalizeIssueIntentRationale(label.rationale, options); | ||
| if (rationale) { | ||
| normalizedLabel.rationale = rationale; | ||
| // Strip non-string rationale instead of rejecting (optional enrichment field) |
There was a problem hiding this comment.
Two parallel stripping paths for label sub-fields (validateIssueIntentLabels) and item-level fields (validateItem) are now structurally independent and will diverge. The hardcoded stripping here is not driven by the x-strip-on-error schema flag, so the flag set on labels items' confidence/rationale entries in the config has no effect on this code path.
💡 Details
validateIssueIntentLabels strips invalid rationale (line 150) and confidence (line 174) unconditionally via hardcoded logic. The generic validateItem path (line 716) strips based on the x-strip-on-error flag in the field schema.
These two paths are already inconsistent: any future change to the stripping behaviour — adding a log/warning on strip, or making stripping conditional on an option — will need to be applied in two places. The test at line 119 of the test file marks the labels array with x-strip-on-error: true, but validateIssueIntentLabels is the actual runtime code path for label validation and never checks that flag.
If the intent is a single authoritative strip-on-error mechanism, the hardcoded stripping in validateIssueIntentLabels should be made configurable (e.g. pass an options flag or read from the schema), or the JSDoc/comment should explicitly note that these are separate paths with separate maintenance requirements.
|
@copilot run pr-finisher skill |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…el stripping Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Ran the pr-finisher pass and pushed fixes in Local validation is passing ( |
The agent intermittently sends
confidenceas a numeric value (0.9,"0.95") instead of the required enum string. The validator was rejecting the entire message on this optional field, silently dropping valid tool calls.Changes
Fix 1 — Clearer INTENT description suffix (
generate_safe_outputs_tools.cjs)Explicitly states both type and allowed values to reduce model confusion:
Fix 2 —
x-strip-on-errorlenient validationIntroduces
x-strip-on-errortoFieldValidation(following the existingx-extension convention likex-synonyms). When set, a failing optional enrichment field is stripped from the item rather than rejecting the whole message.StripOnError bool \json:"x-strip-on-error,omitempty"`added to GoFieldValidation` structconfidenceandrationaleinset_issue_type/set_issue_fieldmarked withx-strip-on-error: truevalidateIteminsafe_output_type_validator.cjsdeletes the field instead of accumulating an error when the flag is setvalidateIssueIntentLabelsmade lenient for label-levelconfidence(non-enum values stripped) andrationale(non-string values stripped)Net effect: a
set_issue_typeorset_issue_fieldcall with"confidence": 0.9now passes validation with the bad field dropped, rather than being silently lost.