Fix #1249: Handle unexpected 'strict' kwarg in mlx_lm load() #1297

Open
zyguy wants to merge 1 commit into jundot:main from zyguy:fix-1249-strict-load
Conversation

zyguy commented May 18, 2026

Fixes #1249

Motivation:
In v0.3.9.dev2, omlx/engine/batched.py attempts to pass strict=False to mlx_lm.load() to tolerate extra weights from converted/abliterated models. However, the upstream mlx_lm.load() function does not currently accept a strict argument. This results in a 500 error: load() got an unexpected keyword argument 'strict'.

Solution:
This PR uses inspect.signature to dynamically check if the installed version of mlx_lm supports the strict argument before passing it. This ensures compatibility with both older versions of mlx_lm and future versions where the argument is exposed (e.g., via the upstream PR I've opened on ml-explore/mlx-lm).
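
The signature check described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `call_with_supported_kwargs` is a hypothetical helper, and `load_old` / `load_new` are stand-ins for a loader without and with a `strict` parameter rather than real `mlx_lm` functions.

```python
import inspect


def call_with_supported_kwargs(func, *args, **optional_kwargs):
    """Call func, forwarding only the optional kwargs its signature accepts."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all in the signature means any keyword is accepted.
    accepts_any = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    supported = {
        name: value
        for name, value in optional_kwargs.items()
        if accepts_any or name in params
    }
    return func(*args, **supported)


# Hypothetical stand-ins for two loader versions (assumption, not mlx_lm).
def load_old(path):
    return ("old", path, None)


def load_new(path, strict=True):
    return ("new", path, strict)
```

With these stand-ins, `call_with_supported_kwargs(load_old, "m", strict=False)` silently drops `strict`, while the same call against `load_new` forwards it. The trade-off is that a dropped argument is invisible to the caller, so logging when a kwarg is skipped may be worth considering.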

Owner

jundot commented May 19, 2026

Thanks for the patch. The signature-guarded shape is the right direction for forward-compat. I want to hold the merge for now though, for two reasons:

  1. omlx currently pins mlx-lm at ed1fca4, which does not have strict in load(). So in the version oMLX ships today, this PR is a no-op: the if "strict" in load_sig.parameters branch never fires until I bump the pin after ml-explore/mlx-lm#1284 ("[mlx_lm] Expose 'strict' parameter in load() function") lands.

  2. The model in the original report (supergemma4-e4b-abliterated-mlx) is model_type=gemma4, which routes through VLMBatchedEngine and calls mlx_vlm.utils.load, not mlx_lm.load. mlx-vlm does not expose strict on its load() either, so the abliterated-weight case will stay broken on the VLM path even after this PR. Same situation for mlx-embeddings.

My preference is to wait until both mlx-lm and mlx-vlm expose strict upstream, then land the omlx-side change in one go so issue #1249 actually closes end-to-end. Would you be up for opening a sibling PR against mlx-vlm? I am happy to coordinate.

One small forward note when we do land it: strict=False as a blanket default also silences missing/shape/type mismatches, not just extra-key ones. The Option B pattern from the issue (try strict=True, catch the "parameters not in model" ValueError specifically, then retry with strict=False + warning) keeps the safety net for everything except the abliterated case. Worth considering for the eventual merged version.
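
To make the concern above concrete, here is a toy model of a strict versus non-strict weight check. It is an assumption-laden illustration only: `check_weights` is a hypothetical function operating on name-to-shape dictionaries, it is not the mlx loader, and its error messages are merely modelled on the "parameters not in model" wording quoted above.

```python
def check_weights(expected, provided, strict=True):
    """Toy checker: both arguments map parameter name -> shape tuple."""
    expected_keys, provided_keys = set(expected), set(provided)
    report = {
        "extra": sorted(provided_keys - expected_keys),
        "missing": sorted(expected_keys - provided_keys),
        "mismatched": sorted(
            k for k in expected_keys & provided_keys if expected[k] != provided[k]
        ),
    }
    if strict:
        # In strict mode every kind of discrepancy is an error.
        if report["extra"]:
            raise ValueError(
                f"Received {len(report['extra'])} parameters not in model: "
                f"{report['extra']}"
            )
        if report["missing"]:
            raise ValueError(f"Missing parameters: {report['missing']}")
        if report["mismatched"]:
            raise ValueError(f"Shape mismatch for: {report['mismatched']}")
    # In non-strict mode nothing is raised, whatever the report contains.
    return report
```

In this toy, `strict=False` lets a checkpoint with a missing or wrongly shaped tensor through just as quietly as one with harmless extra keys, which is why retrying non-strictly only after the specific extra-keys error keeps the other checks meaningful.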

zyguy force-pushed the fix-1249-strict-load branch from 40b3427 to 697b9a8 on May 19, 2026 16:07

Author

zyguy commented May 19, 2026

I completely agree with the approach!

I have updated this PR to implement the "Option B" fallback strategy for both mlx_lm (omlx/engine/batched.py) and mlx_vlm (omlx/engine/vlm.py). It now attempts a standard load first, specifically catches a ValueError containing "parameters not in model", checks for strict support via inspect, and retries with strict=False while logging a warning. This safely preserves the error reporting for actual missing weights or shape mismatches.
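
The fallback flow described in this comment might look roughly like the sketch below. It is written from the description only and is not the PR's actual diff: `load_with_fallback` and `fake_load` are hypothetical names, and `load_fn` stands in for whichever loader (`mlx_lm.load` or `mlx_vlm.utils.load`) the engine calls.

```python
import inspect
import logging

logger = logging.getLogger(__name__)

# Substring quoted in the discussion for the extra-weights error.
EXTRA_WEIGHTS_MARKER = "parameters not in model"


def load_with_fallback(load_fn, path, **kwargs):
    """Try a normal load; retry non-strictly only for the extra-weights error."""
    try:
        return load_fn(path, **kwargs)
    except ValueError as exc:
        # Any other ValueError (missing weights, shape mismatch) propagates.
        if EXTRA_WEIGHTS_MARKER not in str(exc):
            raise
        # If the installed loader has no `strict` parameter, we cannot retry.
        if "strict" not in inspect.signature(load_fn).parameters:
            raise
        logger.warning(
            "Extra weights found in %s; retrying with strict=False", path
        )
        return load_fn(path, **{**kwargs, "strict": False})


# Hypothetical loader that rejects extra weights unless strict=False.
def fake_load(path, strict=True):
    if strict:
        raise ValueError("Received 2 parameters not in model: a, b")
    return "loaded:" + path
```

Here `load_with_fallback(fake_load, "m")` logs a warning and returns the non-strict result, whereas a loader raising a different `ValueError`, or one lacking `strict`, surfaces the original exception unchanged.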

I have also opened the sibling PR on mlx-vlm as requested: Blaizzy/mlx-vlm#1198

Great news: Testing this updated logic locally, I can confirm it works perfectly end-to-end! I am now able to successfully load and run both gemma-4-e4b-it-OptiQ-4bit and the problematic supergemma4-e4b-abliterated-mlx without hitting any 500 errors. 🚀

Let me know if there's anything else you need before this is ready to merge once the upstream pins are bumped!

Successfully merging this pull request may close these issues.

Model load fails with strict parameter mismatch for converted/abliterated models