fix: treat lfm2 as ambiguous model type (LLM + embedding + audio) #1036

Open
unstoppablesssss wants to merge 1 commit into jundot:main from unstoppablesssss:fix/lfm2-ambiguous-model-type

@unstoppablesssss

Problem

LFM2.5 Instruct models (e.g. LFM2.5-1.2B-Instruct-MLX-4bit) are misdetected as embedding models and fail on /v1/chat/completions requests.

Root cause: lfm2 is listed in EMBEDDING_MODEL_TYPES, so any model with model_type: lfm2 is unconditionally classified as embedding — even when the architecture is Lfm2ForCausalLM (a standard LLM).

LFM2 family variants

| Model | model_type | Architecture | Expected type |
| --- | --- | --- | --- |
| LFM2.5-1.2B-Instruct | lfm2 | Lfm2ForCausalLM | llm |
| LFM2.5-1.2B-Thinking | lfm2 | Lfm2ForCausalLM | llm |
| LFM2 embedding | lfm2 | BertModel etc. | embedding |
| LFM2 audio (STS) | lfm_audio | LFM2AudioModel | audio_sts |
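
The model_type and Architecture columns correspond to fields in a model's config.json. A minimal sketch of reading them, assuming the usual Hugging Face layout (a `model_type` string and an `architectures` list). The helper name `read_model_identity` is hypothetical and is not an oMLX function:

```python
import json
import tempfile
from pathlib import Path


def read_model_identity(model_dir):
    """Return (model_type, architectures) from <model_dir>/config.json.

    Hypothetical helper for illustration only; not part of oMLX.
    Missing fields fall back to an empty string / empty list.
    """
    config = json.loads((Path(model_dir) / "config.json").read_text())
    return config.get("model_type", ""), list(config.get("architectures") or [])


# Usage with a throwaway config that mirrors the Instruct row of the table.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text(
        json.dumps({"model_type": "lfm2", "architectures": ["Lfm2ForCausalLM"]})
    )
    identity = read_model_identity(tmp)
    print(identity)
```

Both values are needed here because `model_type` alone cannot separate the first three rows of the table.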

Fix

  1. Move lfm2 from EMBEDDING_MODEL_TYPES → AMBIGUOUS_EMBEDDING_MODEL_TYPES (same pattern as qwen3, gemma3-text)
  2. Narrow the LFM2 audio catch-all from startswith('lfm') to an exact match on lfm_audio, so that LFM2 LLM variants are not misclassified as audio_sts
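
The two changes can be sketched as follows. This is an illustrative approximation, not the actual oMLX code: the function name `detect_model_type`, the contents of the sets beyond the names mentioned in this PR, and the `ForCausalLM` suffix rule used for disambiguation are all assumptions.

```python
# Illustrative sets only; the real sets in oMLX are larger and may differ.
EMBEDDING_MODEL_TYPES = {"bert"}  # assumption: lfm2 is no longer listed here
AMBIGUOUS_EMBEDDING_MODEL_TYPES = {"qwen3", "gemma3-text", "lfm2"}


def detect_model_type(model_type, architectures):
    """Classify a model from its config fields (hypothetical helper)."""
    mt = model_type.lower()

    # Change 2: exact match instead of mt.startswith("lfm"), so that
    # "lfm2" no longer falls into the audio branch.
    if mt == "lfm_audio":
        return "audio_sts"

    if mt in EMBEDDING_MODEL_TYPES:
        return "embedding"

    # Change 1: ambiguous types are resolved by architecture.
    # Assumption: a *ForCausalLM architecture marks a text-generation model.
    if mt in AMBIGUOUS_EMBEDDING_MODEL_TYPES:
        if any(arch.endswith("ForCausalLM") for arch in architectures):
            return "llm"
        return "embedding"

    return "llm"


print(detect_model_type("lfm2", ["Lfm2ForCausalLM"]))
print(detect_model_type("lfm2", ["BertModel"]))
print(detect_model_type("lfm_audio", ["LFM2AudioModel"]))
```

Under these assumptions the three calls cover the Instruct, embedding, and audio rows of the variants table.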

Verification

Tested locally on oMLX 0.3.7 with LFM2.5-1.2B-Instruct-MLX-4bit:

  • Before: Discovered model: LFM2.5-1.2B-Instruct-MLX-4bit (type: embedding, engine: embedding)
  • After: Discovered model: LFM2.5-1.2B-Instruct-MLX-4bit (type: llm, engine: batched)

Chat completions now work correctly.

…ants)

lfm2 was unconditionally classified as embedding model type, causing
LFM2.5 Instruct models (model_type: lfm2, arch: Lfm2ForCausalLM) to be
misdetected as embedding engines and fail on chat/completions requests.

LFM2 family has multiple variants:
- LFM2.5-1.2B-Instruct → LLM (CausalLM architecture)
- LFM2 embedding models → embedding
- LFM2 audio models → audio_sts (model_type: lfm_audio)

Move lfm2 from EMBEDDING_MODEL_TYPES to AMBIGUOUS_EMBEDDING_MODEL_TYPES
so architecture-based disambiguation applies (same pattern as qwen3,
gemma3-text). Also narrow the LFM2 audio catch-all from startswith('lfm')
to exact match on 'lfm_audio' to avoid misclassifying LFM2 LLM variants.