Fix LFM2.5 pythonic tool parser auto-detection #1333
Open
ChristianWeyer wants to merge 1 commit into
LFM2.5 renamed the chat-template delimiters from <|tool_list_start|>/<|tool_list_end|> (original LFM2) to <|tool_call_start|>/<|tool_call_end|>, but _infer_tool_parser() only matched the old token. Loading any LFM2.5 checkpoint (e.g. LiquidAI/LFM2.5-8B-A1B, -8B-A1B-MLX-8bit, -1.2B-Thinking) left tokenizer.has_tool_calling False, so the server returned tool calls as raw <|tool_call_start|>[...]<|tool_call_end|> text in message.content with tool_calls = []; clients see no tool call and the agent loop breaks.

The pythonic parser already handles both formats correctly; only the auto-detection needed updating.

Added a unit test for _infer_tool_parser covering both LFM2 and LFM2.5 template snippets plus the negative path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
What
_infer_tool_parser() in tokenizer_utils.py detects the pythonic tool-call format via the chat-template token <|tool_list_start|>. LFM2.5 renamed those delimiters to <|tool_call_start|>/<|tool_call_end|> (the inner pythonic call format is unchanged), so the detection misses for every LFM2.5 checkpoint, including the official LiquidAI/LFM2.5-8B-A1B, LiquidAI/LFM2.5-8B-A1B-MLX-8bit, LiquidAI/LFM2.5-1.2B-Thinking, etc.

Symptom: loading any LFM2.5 model gives tokenizer.has_tool_calling == False, so mlx_lm.server's /v1/chat/completions returns tool calls as raw <|tool_call_start|>[func(arg='val')]<|tool_call_end|> text inside message.content, with tool_calls = [] and finish_reason = "stop". OpenAI-compatible clients (Mastra, LangChain, ai-sdk, …) see no tool call and the agent loop breaks.
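To make the symptom concrete, here is a minimal sketch of what a parser has to do with the raw text that currently leaks into message.content. It is not the mlx-lm pythonic parser; extract_pythonic_calls is a hypothetical helper, and it assumes the payload is a Python list literal of calls with keyword arguments only, as in the example above.

```python
import ast
import re

# Hypothetical helper for illustration; not the parser shipped in mlx-lm.
_CALL_BLOCK = re.compile(
    r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL
)


def extract_pythonic_calls(text):
    """Return [{"name": ..., "arguments": {...}}, ...] for each call in text."""
    calls = []
    for block in _CALL_BLOCK.findall(text):
        # The payload looks like a Python list of call expressions.
        tree = ast.parse(block.strip(), mode="eval")
        if not isinstance(tree.body, ast.List):
            continue
        for node in tree.body.elts:
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                # Assumption: only keyword arguments with literal values.
                args = {
                    kw.arg: ast.literal_eval(kw.value)
                    for kw in node.keywords
                    if kw.arg is not None
                }
                calls.append({"name": node.func.id, "arguments": args})
    return calls


raw = "<|tool_call_start|>[func(arg='val')]<|tool_call_end|>"
print(extract_pythonic_calls(raw))
```

When detection fails, nothing like this runs on the server side, which is why the delimited text reaches the client untouched.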
Fix
Match either token. The pythonic parser already handles both formats; only the auto-detection needed updating.

Same root cause and fix shape as the parallel llama.cpp issue ggml-org/llama.cpp#23838, and the same class of token-rename detection lag that prompted the Gemma 4 auto-detect addition.
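The shape of the change can be sketched roughly as follows. This is a simplified stand-in, not the actual diff: infer_tool_parser_sketch and PYTHONIC_MARKERS are hypothetical names, and the real _infer_tool_parser() presumably checks other tool-call formats that are omitted here.

```python
# Hypothetical stand-in for the detection step; the real function in
# tokenizer_utils.py is not reproduced here.
PYTHONIC_MARKERS = ("<|tool_list_start|>", "<|tool_call_start|>")


def infer_tool_parser_sketch(chat_template):
    """Return "pythonic" if the template contains either marker, else None."""
    if not isinstance(chat_template, str):
        return None
    if any(marker in chat_template for marker in PYTHONIC_MARKERS):
        return "pythonic"
    return None
```

Keeping the old token in the match list means original LFM2 templates continue to be detected while LFM2.5 templates are picked up as well.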
Test
Added TestTokenizers.test_infer_tool_parser in tests/test_tokenizers.py covering:
- LFM2 template (<|tool_list_start|>): still returns "pythonic"
- LFM2.5 template (<|tool_call_start|>): now returns "pythonic" (was None)
- Negative path (None input)

This is a pure unit test against _infer_tool_parser, so it needs no HF download and is fast and deterministic. Verified locally that the new test fails on main without the fix and passes with it.
End-to-end verified against the real LiquidAI/LFM2.5-8B-A1B-MLX-8bit tokenizer.

And via mlx_lm.server /v1/chat/completions with a tools payload. After the fix, the response now correctly returns:
{
  "finish_reason": "tool_calls",
  "message": {
    "tool_calls": [
      {
        "function": {"name": "...", "arguments": "{\"query\": \"...\"}"},
        "type": "function",
        "id": "..."
      }
    ]
  }
}

All existing tests/test_tool_parsing.py and tests/test_tokenizers.py tests still pass; ran black (25.1.0) and isort --profile=black per CONTRIBUTING.md.
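For reviewers who want to check the response shape from the client side, a small sketch along these lines may help. decode_tool_calls is a hypothetical helper, and the function name and arguments in the sample are made-up values chosen only to match the structure shown above; note that arguments arrives as a JSON-encoded string and needs a second decode.

```python
import json


def decode_tool_calls(choice):
    """Return [(name, arguments_dict), ...] from one chat-completions choice."""
    if choice.get("finish_reason") != "tool_calls":
        return []
    calls = choice.get("message", {}).get("tool_calls") or []
    return [
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in calls
    ]


# Made-up sample shaped like the post-fix response; values are illustrative.
fixed = {
    "finish_reason": "tool_calls",
    "message": {
        "tool_calls": [
            {
                "function": {"name": "search", "arguments": "{\"query\": \"mlx\"}"},
                "type": "function",
                "id": "call_0",
            }
        ]
    },
}

# Shaped like the pre-fix symptom: raw text in content, empty tool_calls.
broken = {
    "finish_reason": "stop",
    "message": {
        "content": "<|tool_call_start|>[search(query='mlx')]<|tool_call_end|>",
        "tool_calls": [],
    },
}

print(decode_tool_calls(fixed))
print(decode_tool_calls(broken))
```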