- **Regression introduced in:** 0.4.0 (also present in 0.4.1) - **Last working version:** 0.3.8 - **Model:** `Qwen3-VL-8B-Instruct-MLX-4bit` - **Error:** `There is no Stream(gpu, 1) in current thread` on every `POST /v1/chat/completions` - **SpecPrefill:** off - **Likely cause:** 0.4.0 VLM loader thread refactor breaks Qwen3-VL's secondary GPU stream
Qwen3-VL-8B-Instruct-MLX-4bitThere is no Stream(gpu, 1) in current threadon everyPOST /v1/chat/completions