Description
Some thinking models (eg. MiniMax M2.5 2.7) rely on their own <think>...</think> blocks being present in historical assistant messages - it's how they were trained to reason. OpenFang drops them. We are using openAI connector to a local version of minimax to test, and this is confirmed.
What happens
- The OpenAI-compat driver captures
reasoning_content on ingress — you can see it in the logs: "Captured reasoning_content from response".
- The session-message schema stores only
{role, content}. The reasoning is never persisted.
- Inline
<think>...</think> in content (what a proxy emits when it merges reasoning back in as a workaround) is also stripped before storage.
- On the next turn, the outbound messages array is content-only. The model doesn't see its own prior thinking and quality degrades.
Reproducible with any agent + any reasoning model + a wire capture on turn 2: the thinking arrives intact from the provider, but does not reach disk.
Fix requests
- Stop stripping inline
<think>...</think> from content on ingress. Strip at display time, never at storage time.
- Extend the session-message schema to persist
reasoning_content as an optional sibling to content.
- Re-emit it on outbound assistant turns — native shape where supported (OpenAI o-series, Claude thinking blocks, Gemini thought signatures), or inlined into
content wrapped in the model's native thinking tags otherwise.
- optional this should be optional for models for dont support it or people with smaller context windows, not sure the right way to implement optionality
Alternatives Considered
No response
Additional Context
No response
Description
Some thinking models (eg. MiniMax M2.5 2.7) rely on their own
<think>...</think>blocks being present in historical assistant messages - it's how they were trained to reason. OpenFang drops them. We are using openAI connector to a local version of minimax to test, and this is confirmed.What happens
reasoning_contenton ingress — you can see it in the logs: "Captured reasoning_content from response".{role, content}. The reasoning is never persisted.<think>...</think>incontent(what a proxy emits when it merges reasoning back in as a workaround) is also stripped before storage.Reproducible with any agent + any reasoning model + a wire capture on turn 2: the thinking arrives intact from the provider, but does not reach disk.
Fix requests
<think>...</think>fromcontenton ingress. Strip at display time, never at storage time.reasoning_contentas an optional sibling tocontent.contentwrapped in the model's native thinking tags otherwise.Alternatives Considered
No response
Additional Context
No response