Add deepseek_dsml tool parser for DeepSeek-V4's native DSML format#1337
Open
snagnever wants to merge 1 commit into
DeepSeek-V4 (Flash/Pro) emits tool calls in its native DSML format:

    <|DSML|tool_calls>
    <|DSML|invoke name="get_weather">
    <|DSML|parameter name="city" string="true">Paris</|DSML|parameter>
    </|DSML|invoke>
    ...
    </|DSML|tool_calls>

with multiple <|DSML|invoke> per block (native parallel calls). string="true" means a literal string value, string="false" a JSON value.

Adds mlx_lm/tool_parsers/deepseek_dsml.py (modeled on minimax_m2) and an _infer_tool_parser entry so the official DSML chat template auto-selects it.

The start/end markers use the "<|DSML|tool_calls" prefix (dropping the trailing ">"): mlx-lm matches markers by token-id sequence and the ">" merges with the following byte on this tokenizer (same class of issue as ml-explore#1335); the parser extracts the invokes/parameters regardless of the leftover ">".

Verified on mlx-community/DeepSeek-V4-Flash-2bit-DQ: 0 -> 39/40 (98%) on a jdhodges-style tool suite (the no-tool-template baseline scored 0), 8/8 on parallel multi-tool cases.

Adds tests (single, parallel, mixed string/JSON).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
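For readers who want to see the format rules from the commit message in action, here is a minimal sketch of how such a block could be parsed. It is not the code in mlx_lm/tool_parsers/deepseek_dsml.py: the function name `parse_dsml_tool_calls`, the regular expressions, and the returned dict shape are hypothetical, and it assumes the markers are spelled exactly as shown above.

```python
import json
import re

# Hypothetical patterns for illustration; the PR's parser may differ.
_INVOKE_RE = re.compile(
    r'<\|DSML\|invoke name="([^"]+)">(.*?)</\|DSML\|invoke>',
    re.DOTALL,
)
_PARAM_RE = re.compile(
    r'<\|DSML\|parameter name="([^"]+)" string="(true|false)">'
    r"(.*?)</\|DSML\|parameter>",
    re.DOTALL,
)


def parse_dsml_tool_calls(text):
    """Return one {"name", "arguments"} dict per <|DSML|invoke> found in text."""
    calls = []
    for name, body in _INVOKE_RE.findall(text):
        arguments = {}
        for key, is_string, raw in _PARAM_RE.findall(body):
            # string="true": keep the literal text; string="false": decode as JSON.
            arguments[key] = raw if is_string == "true" else json.loads(raw)
        calls.append({"name": name, "arguments": arguments})
    return calls


example = (
    ">\n"  # leftover ">" from the prefix marker is simply skipped by findall
    '<|DSML|invoke name="get_weather">\n'
    '<|DSML|parameter name="city" string="true">Paris</|DSML|parameter>\n'
    '<|DSML|parameter name="days" string="false">3</|DSML|parameter>\n'
    "</|DSML|invoke>\n"
    '<|DSML|invoke name="get_time">\n'
    '<|DSML|parameter name="tz" string="true">UTC</|DSML|parameter>\n'
    "</|DSML|invoke>\n"
)
calls = parse_dsml_tool_calls(example)
```

Because the sketch searches for invokes anywhere in the text, two invokes in one block yield two calls, and a stray ">" before the first invoke is ignored.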
Adds a tool parser for DeepSeek-V4 (Flash/Pro)'s native DSML tool-call format, so tool calling works for these models through the OpenAI-compatible server.
Why
DeepSeek-V4 emits tool calls in its own DSML format, not Hermes `<tool_call>` JSON or any of the existing parsers' formats, so there is currently no `tool_parser_type` that parses its output, and tool calls come back as plain content.
Format
- Multiple `<|DSML|invoke>` per block → native parallel calls.
- `string="true"` → literal string value; `string="false"` → JSON value (number/bool/array/object).
Change
- `mlx_lm/tool_parsers/deepseek_dsml.py`: new parser (modeled on `minimax_m2`).
- `_infer_tool_parser`: one entry so the official DSML chat template (deepseek-ai/DeepSeek-V4-Flash#16) auto-selects it; no manual `tool_parser_type` needed.
Marker note
The start/end markers use the `<|DSML|tool_calls` prefix (dropping the trailing `>`). mlx-lm matches markers by exact token-id sequence, and on this tokenizer the `>` merges with the following byte (`...calls>\n` → one token), so the full marker never matches. This is the same class of issue as #1335. The `|DSML|` core is a special token, so the prefix is a stable anchor; the parser tolerates the leftover `>` before the first `<|DSML|invoke>`.
Verification
On `mlx-community/DeepSeek-V4-Flash-2bit-DQ` (no tool template → tools dropped → 0 tool calls as a baseline), adding the official DSML template + this parser gives 39/40 (98%) on a jdhodges-style tool suite and 8/8 on the parallel multi-tool cases, competitive with full-size local models.
`python -m unittest tests.test_tool_parsing` passes (existing parsers unaffected; this is purely additive).
Pairs with the official template PR #16. Independent of #1335/#1336 (different module), though it uses the same prefix-marker insight.
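The prefix-marker insight can be illustrated with a toy example. This is only a sketch of the idea, not mlx-lm's matching code: the token ids below are made up (they are not the real DeepSeek-V4 vocabulary), and `find_subsequence` is a hypothetical helper standing in for "match a marker by exact token-id sequence".

```python
def find_subsequence(stream, marker):
    """Return the index where marker occurs as a contiguous run in stream, or -1."""
    n = len(marker)
    for i in range(len(stream) - n + 1):
        if stream[i:i + n] == marker:
            return i
    return -1


# Made-up token ids for illustration only.
LT, DSML, TOOL_CALLS, GT, GT_NEWLINE, INVOKE = 1, 2, 3, 4, 5, 6

# "<|DSML|tool_calls>" tokenized on its own ends with a standalone ">" token.
full_marker = [LT, DSML, TOOL_CALLS, GT]
# "<|DSML|tool_calls" (trailing ">" dropped) stops before the unstable token.
prefix_marker = [LT, DSML, TOOL_CALLS]

# In generated output the ">" merges with the following "\n" into one token,
# so the standalone ">" id never appears in the stream.
stream = [LT, DSML, TOOL_CALLS, GT_NEWLINE, LT, DSML, INVOKE]

full_hit = find_subsequence(stream, full_marker)      # no match
prefix_hit = find_subsequence(stream, prefix_marker)  # matches at the block start
```

Under these assumptions the full marker is never found, while the prefix matches at the start of the block, leaving the merged ">" token for the parser to skip.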