Summary
Qwen3-VL-Embedding (multimodal vision-language embedding model) crashes during embedding generation on oMLX v0.3.8. The server application stays alive, but the model crashes internally, returning empty responses.
Environment
Steps to Reproduce
- Download Qwen3-VL-Embedding via oMLX
- Send an embedding request via the OpenAI-compatible /v1/embeddings endpoint
- Model crashes during embedding generation
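The steps above can be sketched as a small script. This is a minimal reproduction under stated assumptions, not the exact request used in the report: the host and port in `BASE_URL` are hypothetical, the model identifier is simply the name used in this issue (the locally installed ID may differ), and the payload uses only the `model` and `input` fields of the OpenAI-style embeddings request.

```python
import json
import urllib.request

# Assumed values: adjust to the local oMLX setup.
BASE_URL = "http://localhost:8000"   # hypothetical host/port, not from the report
MODEL_ID = "Qwen3-VL-Embedding"      # name as written in this issue; local ID may differ


def build_embedding_request(text, model=MODEL_ID, base_url=BASE_URL):
    """Build a POST request for the OpenAI-compatible /v1/embeddings endpoint."""
    payload = {"model": model, "input": text}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_embedding_request("hello world"))` against a running server should return a JSON body containing an embedding. With the failure described here, the call would presumably raise instead, since the connection is closed before the response body is complete.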
Expected Behavior
Qwen3-VL-Embedding should generate embeddings (same as Qwen3-Embedding, which works perfectly).
Actual Behavior
curl: (18) transfer closed with outstanding read data remaining
- Response is empty
- Model crashes internally during embedding generation
- oMLX application process stays alive (health endpoint still responds)
- Only the embedding API endpoint returns errors
Comparison
| Model | Type | Status |
| --- | --- | --- |
| Qwen3-Embedding-8B-mxfp8 | Text-only embedding | ✅ Works perfectly |
| Qwen3-VL-Embedding | Multimodal VL embedding | ❌ Crash on embed |
Impact
Cannot use multimodal embeddings for applications that need to embed both text and images (e.g., visual memory search, image-text cross-modal retrieval). This blocks the upgrade path to VL-capable embedding models on oMLX.
Logs / Debug Info
- Swagger UI: transfer closed error
- curl: transfer closed with outstanding read data remaining
- No visible error in health endpoint — application remains healthy
- Suggests model-level crash inside the embedding pipeline, not a server-level crash
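The reasoning in this list (health check passes, embedding request fails, so the crash is likely at the model level) can be expressed as a small check. This is only a sketch: `probe` and `classify` are hypothetical helper names, the labels they return are made up for illustration, and the health endpoint path is not stated in this report, so it has to be supplied by the caller.

```python
import http.client
import urllib.error
import urllib.request


def probe(url, data=None, timeout=30):
    """Return 'ok', or a short label describing how the request failed.

    Sends a GET when data is None, otherwise a JSON POST with data as the body.
    """
    headers = {"Content-Type": "application/json"} if data is not None else {}
    request = urllib.request.Request(url, data=data, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()  # a truncated body raises http.client.IncompleteRead
        return "ok"
    except urllib.error.HTTPError as exc:
        return f"http {exc.code}"
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        return f"connection error: {type(exc).__name__}"


def classify(health_status, embed_status):
    """Map the two probe results onto the distinction drawn in this report."""
    if health_status != "ok":
        return "server-level failure"
    if embed_status != "ok":
        return "model-level failure"
    return "healthy"
```

For the behavior reported here, the health probe would be expected to return `"ok"` while the embeddings probe returns a connection error, which `classify` labels as a model-level failure. That supports the hypothesis but does not prove it; server logs would still be needed to confirm where the crash occurs.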