You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Thinking model disabled assistant prefill (ggml-org#15404)
* feat: Set enable_thinking IFF not disabled and supported
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* fix: Fix inverted logic condition for prefill error
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* fix: Always parse the enable_thinking kwarg to overwrite the default value
From what I can tell, this started as a Qwen3-specific keyword, but from
the use in `chat.cpp` translates this inputs.enable_thinking to the right
thinking kwarg for the given model, this is now more of a standardized
kwarg, so it should always override the default value when sent as part of
the chat_template_kwargs field in the API.
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* fix: Don't limit tempalte expansion check to jinja
With the use_jinja check, non-jinja models would enable thinking and always
fail assistant prefill
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* feat: Add the error text to json type errors in json_value
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* feat: Explicitly reject string values for "enable_thinking"
There are too many possible "truthy" / "falsy" strings and too many
ambiguous strings that don't have a clear truthy/falsy value, so the
simplest thing to do here is to reject the request. Ideally, this would be
a 422 (Unprocessable Entity), but right now it's coming back as a 500.
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* refactor: Move logic for detecting template enable_thinking support to common
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* fix: Use raw pointer for common chat template function
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
---------
Signed-off-by: Gabe Goodhart <[email protected]>
0 commit comments