Fix #1249: Handle unexpected 'strict' kwarg in mlx_lm load() #1297
Thanks for the patch. The signature-guarded shape is the right direction for forward-compat. I want to hold the merge for now though, for two reasons:
My preference is to wait until both mlx-lm and mlx-vlm expose
One small forward note when we do land it:
…lm load() using fallback
40b3427 to 697b9a8
I completely agree with the approach! I have updated this PR to implement the "Option B" fallback strategy for both mlx_lm (omlx/engine/batched.py) and mlx_vlm (omlx/engine/vlm.py). It now attempts a standard load first, specifically catches a

I have also opened the sibling PR on mlx-vlm as requested: Blaizzy/mlx-vlm#1198

Great news: Testing this updated logic locally, I can confirm it works perfectly end-to-end! I am now able to successfully load and run both gemma-4-e4b-it-OptiQ-4bit and the problematic

Let me know if there's anything else you need before this is ready to merge once the upstream pins are bumped!
Fixes #1249
Motivation:
In v0.3.9.dev2, omlx/engine/batched.py attempts to pass strict=False to mlx_lm.load() to tolerate extra weights from converted/abliterated models. However, the upstream mlx_lm.load() function does not currently accept a strict argument. This results in a 500 error: load() got an unexpected keyword argument 'strict'.
Solution:
This PR uses inspect.signature to dynamically check if the installed version of mlx_lm supports the strict argument before passing it. This ensures compatibility with both older versions of mlx_lm and future versions where the argument is exposed (e.g., via the upstream PR I've opened on ml-explore/mlx-lm).
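For illustration, here is a minimal sketch of the signature guard described above. It is not the exact diff in this PR: accepts_kwarg, guarded_load, old_load, and new_load are hypothetical names, and the two loaders are stand-ins rather than the real mlx_lm API. The sketch only treats strict as supported when it is declared explicitly, and a bare **kwargs is deliberately not counted, since it may just forward to something that rejects the argument.

```python
import inspect


def accepts_kwarg(func, name):
    """Return True if `func` declares a parameter `name` that can be passed by keyword."""
    try:
        param = inspect.signature(func).parameters.get(name)
    except (TypeError, ValueError):
        # Some callables expose no introspectable signature; assume no support.
        return False
    return param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )


def guarded_load(load_fn, model_path, **kwargs):
    """Call `load_fn`, adding strict=False only if its signature declares `strict`."""
    if accepts_kwarg(load_fn, "strict"):
        kwargs["strict"] = False
    return load_fn(model_path, **kwargs)


# Hypothetical stand-ins for an older and a newer loader (not the real mlx_lm API).
def old_load(path, lazy=False):
    return ("old", path, lazy)


def new_load(path, lazy=False, strict=True):
    return ("new", path, lazy, strict)


print(guarded_load(old_load, "some/model"))  # ('old', 'some/model', False)
print(guarded_load(new_load, "some/model"))  # ('new', 'some/model', False, False)
```

With this shape, the older loader is called without the extra argument, so the "unexpected keyword argument" error cannot occur, while a loader that declares strict receives strict=False automatically.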