fix(parsing): schema-aware tool-arg coercion for XML-style parsers by hallerite · Pull Request #28 · PrimeIntellect-ai/renderers

hallerite · 2026-05-13T13:54:14Z

Summary

XML-style chat templates (Qwen3.5, GLM-5/4.5, MiniMax-M2, Laguna) render tool-call argument values verbatim inside <arg_value>X</arg_value> (or <parameter>X</parameter>) tags with no quoting. A value of true and the string "true" produce identical wire bytes — without the tool schema, the parser has no signal to choose between them and defaults to json.loads, silently corrupting string args that happen to look like JSON. This PR plumbs the schema in and preserves the type.

Raised by Robin (Poolside) on #21.

Changes

API change — additive, backwards-compatible:

class Renderer(Protocol):
    def parse_response(
        self,
        token_ids: list[int],
        *,
        tools: list[ToolSpec] | None = None,   # ← new
    ) -> ParsedResponse: ...

When tools=None (the historical default), behavior is unchanged. When tools is supplied, the four XML-style parsers consult each parameter's declared JSON-schema type and preserve declared-string params verbatim. Matches vLLM / SGLang reference parsers (glm45_tool_parser.py, hermes_tool_parser.py).

Helpers added to renderers/parsing.py:

_build_param_type_index(tools) — accepts either flat ToolSpec or OpenAI envelope {"type":"function","function":{…}}, returns {tool_name: {param_name: schema_fragment}}.
_coerce_arg_value(text, schema) -> (value, used_json_fallback) — declared-string → text verbatim; anything else → try json.loads, fall back to text. The bool lets callers flag INVALID_JSON only on genuine parse failures (not on schema-driven string preservation).

Parsers updated (with tools plumbing): parse_qwen35, parse_glm, parse_minimax, parse_laguna_xs2.

Parsers that accept-but-ignore tools (their wire formats quote strings, so schema isn't needed): Qwen3 hermes, Qwen3-VL, DeepSeek-V3, Kimi K2, Kimi K2.5, Nemotron3, gpt-oss harmony, Default.

Client (renderers/client.py): generate() already accepted tools; now forwards it through to parse_response so HTTP callers get schema-aware parsing automatically.

Downstream callers

Anything outside this repo that calls renderer.parse_response(token_ids) needs a one-line update to opt into schema-aware parsing:

parsed = renderer.parse_response(completion_ids, tools=tools)

Existing callers keep working unchanged (default tools=None preserves the historical behavior).

Test outcome

Model	Parser shape	Before	After
`Qwen/Qwen3-8B`	hermes JSON	✅ 5/5	✅ 5/5
`moonshotai/Kimi-K2-Instruct`	section JSON	✅ 5/5	✅ 5/5
`Qwen/Qwen3.5-9B`	XML	❌ 0/5	✅ 5/5
`zai-org/GLM-5`	XML	❌ 0/5	✅ 5/5
`MiniMaxAI/MiniMax-M2.5`	XML	❌ 0/5	✅ 5/5
`poolside/Laguna-XS.2`	XML	❌ 0/5	✅ 5/5

Test plan

uv run pytest tests/test_tool_arg_type_preservation.py → 30 passed
uv run pytest tests/ (full suite) → 1067 passed, 49 skipped, 1 xfailed, 0 failed
uv run ruff check renderers/ tests/ clean
uv run ruff format --check renderers/ tests/ clean

🤖 Generated with Claude Code

Every XML-style tool parser (Qwen3.5, GLM, MiniMax) currently ``json.loads`` each ``<arg_value>`` body, so a string argument that happens to look like JSON ("true", "42", "[1,2,3]") round-trips as a bool / int / list. The hermes-JSON (Qwen3) and section-JSON (Kimi K2) parsers sidestep this because their wire format quotes strings; both serve as controls. These tests fail loudly against current main — 15 failures across the three XML parsers, 10 passes on the controls. They are the spec for the fix, not documentation of accepted behavior; CI stays red until ``parse_response`` becomes schema-aware (Robin PR #21). Laguna-XS.2 has the same bug and should be added to the matrix when PR #21 merges. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

LagunaXS2Renderer landed in #21; its parser has the same string-type corruption as the other XML parsers (5/5 cases fail). Count is now 20 failed, 10 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

XML-style chat templates (Qwen3.5, GLM-5/4.5, MiniMax-M2, Laguna) render tool-call argument values verbatim inside ``<arg_value>X</arg_value>`` (or ``<parameter>X</parameter>``) tags with no quoting. A value of ``true`` and the string ``"true"`` produce identical wire bytes; without the tool schema, the parser has no signal to choose between them and defaults to ``json.loads`` — silently corrupting string args that look like JSON. This adds ``tools: list[ToolSpec] | None = None`` to ``parse_response`` on the ``Renderer`` Protocol and every concrete renderer. When supplied, the four XML-style parsers (``parse_qwen35``, ``parse_glm``, ``parse_minimax``, ``parse_laguna_xs2``) consult each parameter's declared JSON-schema ``type`` and preserve declared-string params verbatim. Without ``tools``, behavior is unchanged. Two new helpers in ``parsing.py``: - ``_build_param_type_index`` — accepts either the flat ``ToolSpec`` shape or the OpenAI ``{"type":"function","function":{...}}`` envelope and returns ``{tool_name: {param_name: schema_fragment}}``. - ``_coerce_arg_value`` — returns ``(value, used_json_fallback)``; the bool is True only when ``json.loads`` was tried and raised, so the ``INVALID_JSON`` status fires only for genuine parse failures, not for schema-driven string preservation. The JSON-shaped parsers (Qwen3 hermes, Qwen3-VL, DeepSeek-V3, Kimi K2, Kimi K2.5, Nemotron3, gpt-oss harmony, Default) sidestep this bug because their wire format quotes strings; they accept the ``tools`` kwarg for API uniformity but ignore it. Matches the reference behavior of vLLM / SGLang's ``glm45_tool_parser.py`` and ``hermes_tool_parser.py``. Raised by Robin (Poolside) on PR #21. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

The in-repo SGLang HTTP client already had ``tools`` on its ``generate`` signature; plumb it through to the renderer's ``parse_response`` so XML-style parsers can use the schema-aware coercion path. Updates the ``_FakeRenderer`` test double to accept the kwarg and adds an assertion that the client actually forwards it. Downstream callers (e.g. verifiers) need the matching change on their side to opt into schema-aware parsing. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

hallerite and others added 4 commits May 13, 2026 13:51

Merge main: pick up LagunaXS2Renderer (#21)

3837056

test: add Laguna-XS.2 to the XML-parser failure matrix

26b92c8

LagunaXS2Renderer landed in #21; its parser has the same string-type corruption as the other XML parsers (5/5 cases fail). Count is now 20 failed, 10 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

hallerite changed the title ~~test: pin tool-arg string-type corruption in XML-style parsers~~ fix(parsing): schema-aware tool-arg coercion for XML-style parsers May 13, 2026

hallerite force-pushed the test/xml-tool-arg-type-preservation branch from 0a68898 to 1186398 Compare May 13, 2026 14:36

hallerite merged commit 4062040 into main May 13, 2026
6 checks passed

hallerite deleted the test/xml-tool-arg-type-preservation branch May 13, 2026 14:55

hallerite mentioned this pull request May 13, 2026

chore: bump renderers to 0.1.8.dev1 and migrate to ParsedToolCall API PrimeIntellect-ai/verifiers#1366

Open

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(parsing): schema-aware tool-arg coercion for XML-style parsers#28

fix(parsing): schema-aware tool-arg coercion for XML-style parsers#28
hallerite merged 5 commits into
mainfrom
test/xml-tool-arg-type-preservation

hallerite commented May 13, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hallerite commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Downstream callers

Test outcome

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hallerite commented May 13, 2026 •

edited

Loading