Add JANG model loader integration #212
Validation update:
Validation update:
Final validation update:
Performance/streaming update:
2f48ce6 to 0ee615b
# Conflicts:
#   vllm_mlx/routes/chat.py
ea128df to 9b0bb10
Hi @samuelfaj — thanks for the work. Applying our new SOP §0 necessity gate (see docs/development/pr_merge_sop.md) I need a demand signal before merging. Holding for clarification, not closing yet. Reasoning:
To unlock merge, I need one or more of:
For now please rebase on top of latest
Apologies for the friction; the necessity gate is new this week and I'm working through the backlog. Your #204 (Qwen tool-call fix) is being reviewed now since it has clear user value.
Thanks for putting this together. Two requests before review:
(1) Please split this into independent PRs. The diff is +4007 LOC across 27 files but the title scopes it to the JANG loader. The JANG-loader part is a coherent change on its own:
The TUI (
(2) Verify the JANG import path. The PR imports
Happy to review the loader-only PR once it's split out; that part looks reasonable on first read.
Summary
- `jang_config.json` before the vendored architecture fallback.
- `jang_tools.load_jangtq.load_jangtq_model` and standard JANG models through `jang_tools.loader.load_jang_model`.
- `rapid-mlx[jang]` dependency extra and regression tests for JANGTQ, JANG v2, and normal DeepSeek V4 fallback behavior.
- `jang-tools` does not fall through Transformers AutoConfig for the vendored `deepseek_v4` architecture.
Root cause
DeepSeek V4 JANGTQ bundles declare `weight_format: mxtq` and store routed experts as `tq_packed`/`tq_norms` tensors. The existing loader treated them like normal DeepSeek V4 MLX weights, so `mlx_lm.load_model` rejected thousands of unexpected JANGTQ parameters. During live validation, `jang-tools` also hit a DSV4 tokenizer/EOS expansion path that calls Transformers AutoConfig; the wrapper now patches that call for DSV4 JANGTQ to load `tokenizer.json` directly.
Validation
- `uv run --extra dev --extra jang python -m pytest tests/test_jangtq_loader.py tests/test_deepseek_v4_vendored.py -q`
- `uv run --extra dev ruff check pyproject.toml vllm_mlx/utils/tokenizer.py tests/test_jangtq_loader.py`
- `uv run --extra jang python - <<'PY' ... import jang_tools ... PY`
- `DeepSeek-V4-Flash-JANGTQ` detected as `weight_format=mxtq`, `profile=JANGTQ2`.
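For reviewers, the loader selection described under Summary and Root cause can be sketched roughly as below. This is an illustration only, not the code in this PR: it assumes `weight_format` is a top-level key in `jang_config.json`, the helper names `detect_jang_format` and `choose_loader` are hypothetical, and the loaders are returned as dotted-path strings so the sketch runs without `jang-tools` installed.

```python
import json
from pathlib import Path

# Hypothetical label for "not a JANG bundle, use the existing vendored path".
VENDORED_FALLBACK = "vendored-fallback"


def detect_jang_format(model_dir):
    """Return 'jangtq', 'jang', or None for a model directory.

    Assumption: weight_format is a top-level key in jang_config.json.
    """
    config_path = Path(model_dir) / "jang_config.json"
    if not config_path.is_file():
        return None  # no JANG metadata, so the normal loader applies
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    if config.get("weight_format") == "mxtq":
        return "jangtq"
    return "jang"


def choose_loader(model_dir):
    """Pick a loader (as a dotted path) before the vendored fallback runs."""
    kind = detect_jang_format(model_dir)
    if kind == "jangtq":
        return "jang_tools.load_jangtq.load_jangtq_model"
    if kind == "jang":
        return "jang_tools.loader.load_jang_model"
    return VENDORED_FALLBACK
```

The point of checking `jang_config.json` first is that a JANGTQ bundle never reaches the standard DeepSeek V4 weight path, which is where the unexpected-parameter rejection occurred.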