
feat(contract)!: tool-call parse-failure + truncation diagnostics #1875

Merged
roryford merged 1 commit into main from feat/tool-call-observability
Jun 14, 2026

@roryford

Owner

What

Adds two non-fatal GenerationEvent diagnostics so hosts can observe tool calls that previously vanished silently inside ToolCallTransform (ManifoldContract).

#1857 .toolCallParseFailed(rawBody:)

When a well-formed open/close marker pair surrounds a body the dialect parser rejects (parseBody returns nil), the transform now emits a non-fatal diagnostic carrying the raw body instead of dropping the call with no event. Hosts can finally distinguish "model emitted a broken tool call" from "model emitted no tool call" and recover/report.
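A minimal sketch of the parse-failure path, to make the before/after concrete. `SketchEvent`, `eventForDelimitedBody`, and the toy `parseBody` rule below are hypothetical stand-ins, not the real ManifoldContract types; only the case name `.toolCallParseFailed(rawBody:)` and the "parseBody returns nil" condition come from this PR.

```swift
// Hypothetical stand-in for GenerationEvent. Only the case name
// .toolCallParseFailed(rawBody:) is taken from this PR.
enum SketchEvent: Equatable, Sendable {
    case toolCall(name: String)
    case toolCallParseFailed(rawBody: String)
}

// Hypothetical dialect parser: accepts bodies of the form "name:<identifier>"
// and returns nil for anything else.
func parseBody(_ body: String) -> String? {
    guard body.hasPrefix("name:") else { return nil }
    let name = String(body.dropFirst("name:".count))
    return name.isEmpty ? nil : name
}

// Given the text found between a well-formed open/close marker pair,
// produce either a tool call or a non-fatal diagnostic carrying the raw body.
// Before this change, the nil branch would have produced no event at all.
func eventForDelimitedBody(_ body: String) -> SketchEvent {
    if let name = parseBody(body) {
        return .toolCall(name: name)
    }
    return .toolCallParseFailed(rawBody: body)
}
```

Under these assumptions, a host can branch on the diagnostic to log or retry, while a stream with no markers at all still produces neither event.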

#1858 .toolCallTruncated(rawBody:) (opt-in)

Previously, ToolCallTransform.finalize() (and the body-size cap) discarded an open-but-unclosed tool block, including all buffered body text, so a truncated tool call disappeared silently. New opt-in seam:

ToolCallTransform(markers: markers, surfaceTruncatedToolBody: true)

Default is false, so default behavior is unchanged (still drops silently). When enabled, the partial body is surfaced as a .toolCallTruncated(rawBody:) diagnostic so a mid-stream truncation is observable.
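The opt-in behavior can be illustrated with a toy model. `SketchTransform`, `SketchDiagnostic`, `openBlock()`, and `append(_:)` are hypothetical names invented for this sketch; the real ToolCallTransform internals are not shown in this PR description. Only the `surfaceTruncatedToolBody` flag, its default of false, and the `.toolCallTruncated(rawBody:)` case name come from the text above.

```swift
// Hypothetical stand-in for the diagnostic subset of GenerationEvent.
enum SketchDiagnostic: Equatable, Sendable {
    case toolCallTruncated(rawBody: String)
}

// Toy model of a transform that buffers the body of an open tool block.
struct SketchTransform {
    let surfaceTruncatedToolBody: Bool
    // Non-nil while inside a tool block whose close marker has not arrived.
    private var openBody: String? = nil

    init(surfaceTruncatedToolBody: Bool = false) {
        self.surfaceTruncatedToolBody = surfaceTruncatedToolBody
    }

    mutating func openBlock() { openBody = "" }

    // Appends only while a block is open; text outside a block is ignored here.
    mutating func append(_ text: String) { openBody?.append(text) }

    // Called at end of stream. An unclosed block is always cleared, and is
    // surfaced as a diagnostic only when the opt-in flag is set.
    mutating func finalize() -> [SketchDiagnostic] {
        guard let body = openBody else { return [] }
        openBody = nil
        return surfaceTruncatedToolBody ? [.toolCallTruncated(rawBody: body)] : []
    }
}
```

With the flag left at its default, `finalize()` returns nothing for an unclosed block, which mirrors the unchanged silent-drop behavior; with the flag set, the buffered partial body comes back as the diagnostic payload.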

Both cases follow the existing throttleDiagnostic(reason:) precedent: advisory metadata, Sendable/Equatable String payloads, no chat-message state mutation.

Freeze interaction

GenerationEvent carries a "Vocabulary freeze (1.0)" header. We are pre-1.0 (0.51), so completing the vocabulary is appropriate feat!: work. Done with full freeze hygiene:

  • Freeze-doc header updated to list both new cases as part of the frozen vocabulary.
  • Every exhaustive switch over GenerationEvent across Sources/ and Tests/ updated (12 sites): GenerationStreamConsumer, EventRecorder, ScenarioRunner, the APIFreezeTests/BackendSeamConsumer freeze fixture, and 8 backend/contract test switches (ToolCallContractTests ×2, ParallelToolCallOrderingTests, Claude/Ollama/OpenAI/OpenAIResponses stream-extractor eventKey switches, CloudThinkingTokenTests, OpenAIResponsesBackendTests, OllamaToolCallLiveReplayTests). Core-internal switches stay exhaustive (new arms added); none introduced a new @unknown default convention.
  • .github/api-breakage-allowlist.txt: two enumelement ... has been added as a new enum case lines (matching the generationCompleted precedent) plus a line for the ToolCallTransform.init(markers:) signature change (added a defaulted parameter; existing markers:-only callers still compile). Digester run locally → exit 0.
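For hosts with their own exhaustive switches, the migration described in the list above amounts to adding two arms. The sketch below assumes a cut-down enum: `SketchGenerationEvent`, the `.token` case, and `logLine(for:)` are hypothetical, and the real GenerationEvent has more cases than shown. The two new case names and `throttleDiagnostic(reason:)` are the only names taken from this PR.

```swift
// Hypothetical, cut-down stand-in for GenerationEvent.
enum SketchGenerationEvent: Equatable, Sendable {
    case token(String)                         // hypothetical pre-existing case
    case throttleDiagnostic(reason: String)    // precedent named in this PR
    case toolCallParseFailed(rawBody: String)  // new in this PR
    case toolCallTruncated(rawBody: String)    // new in this PR
}

// An exhaustive switch with no default arm: removing either of the last two
// arms makes this function fail to compile, which is how the compiler finds
// every site that needs updating.
func logLine(for event: SketchGenerationEvent) -> String {
    switch event {
    case .token(let text):
        return "token: \(text)"
    case .throttleDiagnostic(let reason):
        return "throttle: \(reason)"
    case .toolCallParseFailed(let rawBody):
        return "tool call parse failed (\(rawBody.count) chars)"
    case .toolCallTruncated(let rawBody):
        return "tool call truncated (\(rawBody.count) chars)"
    }
}
```

A switch that already has a `default` or `@unknown default` arm keeps compiling without changes, at the cost of treating the new diagnostics the same as any other unhandled event.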

Tests

In OutputParserSessionTests:

Sabotage checks were used during development and removed before commit.

Verification

  • swift build --build-tests — clean
  • swift build --build-tests --traits Server,Macros — clean (switched-enum trait sweep)
  • OutputParserSessionTests slice: 18/18 pass
  • API digester (ManifoldContract): exit 0 with updated allowlist

BREAKING CHANGE: GenerationEvent gains .toolCallParseFailed(rawBody:) and .toolCallTruncated(rawBody:); exhaustive switches over GenerationEvent without a default/@unknown default arm must add handling for the new cases.

Resolves #1857
Resolves #1858

@roryford roryford marked this pull request as ready for review June 14, 2026 21:34
@roryford

Owner Author

Orchestrator review (adversarial): approve. Freeze handling is correct — both new cases (toolCallParseFailed/toolCallTruncated) follow the throttleDiagnostic precedent, the freeze-doc header is updated to list them, and the allowlist has bare API breakage: lines (no # comments) incl. the source-compatible init(markers:) defaulted-param signature change. The switch-site sweep is complete by compiler-truth — both swift build --build-tests and --traits Server,Macros compile, and Swift enforces exhaustiveness, so a missed site couldn't build. #1857 is always-on (per the issue); #1858 is opt-in via surfaceTruncatedToolBody (default false → silent-discard unchanged), gating both the body-cap drop and the finalize() flush. Good catch that the brief's GenerationEventClosedAuditTest premise was wrong (it's ConversationEvent/RunEvent disjointness) — the real freeze fixture BackendSeamConsumer.swift was updated instead. Merging on green.

@roryford roryford changed the title feat!(contract): tool-call parse-failure + truncation diagnostics feat(contract)!: tool-call parse-failure + truncation diagnostics Jun 14, 2026
Add two non-fatal GenerationEvent diagnostics so hosts can observe tool
calls that previously vanished silently in ToolCallTransform:

- #1857 .toolCallParseFailed(rawBody:): when a delimited open/close marker
  pair surrounds a body the dialect parser rejects (parseBody returns nil),
  emit a diagnostic carrying the raw body instead of dropping the call with
  no event. Hosts can now distinguish "broken tool call" from "no tool call".
- #1858 .toolCallTruncated(rawBody:): opt-in via
  ToolCallTransform(markers:surfaceTruncatedToolBody:) (default false, so
  default behavior is unchanged). When enabled, finalize() and the body-size
  cap surface the buffered partial body of an unterminated tool block so a
  mid-stream truncation is observable rather than silently discarded.

Both follow the throttleDiagnostic(reason:) precedent — advisory metadata,
Sendable/Equatable String payloads, no chat-message state mutation.

Freeze hygiene: the GenerationEvent "Vocabulary freeze (1.0)" header is
updated to list the new cases. Every exhaustive switch over GenerationEvent
across Sources/ and Tests/ gains the new arms (12 sites:
GenerationStreamConsumer, EventRecorder, ScenarioRunner, the APIFreeze
BackendSeamConsumer freeze fixture, and 8 backend/contract test switches).
api-breakage-allowlist
gains the two new-enum-case lines plus the ToolCallTransform.init signature
change (defaulted param; existing markers: callers still compile). Digester
passes locally with exit 0.

BREAKING CHANGE: GenerationEvent gains .toolCallParseFailed(rawBody:) and
.toolCallTruncated(rawBody:); exhaustive switches over GenerationEvent without
a default/@unknown default arm must add handling for the new cases.

Resolves #1857
Resolves #1858

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@roryford roryford force-pushed the feat/tool-call-observability branch from 6c7cef8 to fe3461d Compare June 14, 2026 21:37
@roryford roryford merged commit 578a892 into main Jun 14, 2026
11 checks passed
@roryford roryford deleted the feat/tool-call-observability branch June 14, 2026 21:51