
Python Parity #2

Merged
skryl merged 20 commits into main from codex/parity on Feb 27, 2026

Conversation

@skryl (Owner) commented Feb 26, 2026

No description provided.

claude and others added 20 commits on February 25, 2026 at 18:15
Update mlx-ruby submodule to 476f721 (main) which includes:
- mx.array() dtype: keyword argument support (Issue #1)
- mx.mean() keepdims: parameter (Issue #2)
- Array#coerce for numeric-left ops (Issues #7, #8)
- update_modules_impl Module→Hash recursion fix (Issue #12)
- MLX_BUILD_SAFETENSORS=ON by default (Issue #5)

Update mlx-onnx submodule to 128d7de which adds GreaterEqual
ONNX lowering (Issue #14a). All 11 dense models now export to
ONNX with 100% node coverage.

Workarounds removed across lib/ and test/:
- .array(X).astype(T) → .array(X, dtype: T) in 50+ locations
- expand_dims(mean(x, axis), -1) → mean(x, axis, keepdims: true)
- Local update_modules_impl patch (now in upstream)

PRD updated to mark 8 issues as resolved, with remaining open
issues documented.

https://claude.ai/code/session_01BTKLwVSDwcZ7bcG4StMxUZ
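The numeric-left fix (Issues #7, #8) relies on Ruby's standard `coerce` protocol: when `Integer#*` receives an operand it doesn't understand, it calls `coerce` on that operand and retries. A pure-Ruby sketch of the idea — `Wrapper` is a hypothetical stand-in for an MLX array, not the actual mlx-ruby class:

```ruby
# Minimal illustration of Array#coerce for numeric-left ops.
# Wrapper is a hypothetical stand-in, not the mlx-ruby API.
class Wrapper
  attr_reader :data

  def initialize(data)
    @data = data
  end

  def *(other)
    if other.is_a?(Wrapper)
      Wrapper.new(data.zip(other.data).map { |a, b| a * b })
    else
      Wrapper.new(data.map { |x| x * other })
    end
  end

  # Without this, `2 * Wrapper.new([1, 2])` raises TypeError, because
  # Integer#* doesn't know Wrapper. With it, Ruby rewrites the expression
  # as broadcast_two * self.
  def coerce(num)
    [Wrapper.new(Array.new(data.size, num)), self]
  end
end

w = Wrapper.new([1, 2, 3])
(2 * w).data  # numeric-left now works via coerce
```

Here `2 * w` dispatches to `Integer#*`, which fails, calls `w.coerce(2)`, and retries as `Wrapper * Wrapper` — the same mechanism the upstream fix uses so scalars can appear on the left of array ops.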
Port SwitchLinear and SwitchGLU from Python mlx-lm switch_layers.py,
replacing per-token tolist routing loops in mixtral.rb and deepseek.rb
with batched gather_mm operations. This eliminates the SIGILL crashes
during ONNX tracing (data-dependent control flow from tolist) and
brings MoE models to 95-98% ONNX node coverage.

- Add lib/mlx_lm/models/switch_layers.rb with SwitchLinear and SwitchGLU
- Refactor mixtral.rb SparseMoeBlock to use SwitchGLU
- Refactor deepseek.rb DeepseekMoE and MoEGate to use SwitchGLU
- Add sanitize methods for stacking per-expert weights on load

https://claude.ai/code/session_01BTKLwVSDwcZ7bcG4StMxUZ
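The routing change can be sketched in plain Ruby. Lambda "experts" stand in for the real per-expert weight matrices; the actual SwitchGLU stacks the weights and issues batched gather_mm calls, but the grouping idea is the same — and it removes the data-dependent per-token control flow that crashed ONNX tracing:

```ruby
# Illustrative only: lambdas stand in for expert weight matrices.
experts = { 0 => ->(x) { x * 2 }, 1 => ->(x) { x + 10 } }
tokens  = [1.0, 2.0, 3.0]
routes  = [0, 1, 0] # expert index chosen for each token

# Old style: data-dependent control flow, one expert call per token.
per_token = tokens.each_with_index.map { |t, i| experts[routes[i]].call(t) }

# Batched style: group token positions by expert, run each expert once
# over its whole group (gather_mm does this as a single batched matmul).
batched = Array.new(tokens.size)
routes.each_with_index.group_by { |r, _| r }.each do |expert_id, pairs|
  idxs = pairs.map { |_, i| i }
  outs = idxs.map { |i| experts[expert_id].call(tokens[i]) }
  idxs.zip(outs) { |i, o| batched[i] = o }
end
```

Both paths produce identical outputs; the batched form just replaces N expert invocations with one per active expert, which is both faster and traceable.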
Exhaustive class-by-class comparison of upstream Python mlx-lm (v0.30.7,
~130 files, ~575 classes) against the Ruby port (33 files, 13 architectures).
Covers all 11 categories: models, generation, sampling, KV cache, tokenizer,
quantization, tuner, CLI/server, tool parsers, chat templates, and utilities.

Current coverage: 13/107 model architectures (12%), core inference pipeline
fully functional. Biggest gaps: 94 missing models, training pipeline, batch
generation, HF Hub integration, advanced quantization, tool calling.

https://claude.ai/code/session_01BTKLwVSDwcZ7bcG4StMxUZ
Adds a detailed 9-phase implementation plan covering:
- Phase 1 (A-E): Shared infrastructure (RoPE variants, extended cache, MLA, SSM, gated delta, etc.)
- Phases 2-7: All ~96 missing model architectures grouped by dependency/complexity
- Phase 8 (A-C): HuggingFace Hub integration (download, save, upload)
- Phase 9 (A-E): ONNX export support for all model architectures

Phase 9 leverages the existing onnx_export_test.rb subprocess-based testing
infrastructure, which auto-generates compat and export tests from TINY_CONFIGS
entries. Notes that ArgPartition and GatherMM are now supported upstream in
mlx-onnx (commit 33d4b2e), resolving the MoE compat report false positives.

https://claude.ai/code/session_01BTKLwVSDwcZ7bcG4StMxUZ
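The subprocess-based pattern referenced above might look roughly like this sketch — the `TINY_CONFIGS` entries and the `run_in_subprocess` helper here are illustrative stand-ins, not the repo's actual test code. The point of forking is that a native crash (such as the SIGILL during ONNX tracing) kills only the child, so one bad model cannot take down the whole suite:

```ruby
# Hypothetical sketch; config contents and helper names are illustrative.
TINY_CONFIGS = {
  "llama"   => { hidden_size: 8, num_layers: 1 },
  "mixtral" => { hidden_size: 8, num_layers: 1, num_experts: 2 },
}

# Run a block in a forked child so native crashes are isolated from
# the parent test process. Returns true if the child exited cleanly.
def run_in_subprocess
  pid = fork { yield; exit!(0) }
  _, status = Process.waitpid2(pid)
  status.success?
end

# One auto-generated check per config entry.
results = TINY_CONFIGS.to_h do |arch, config|
  ok = run_in_subprocess { raise "bad config" unless config[:hidden_size].positive? }
  [arch, ok]
end
```

`exit!(0)` skips `at_exit` handlers in the child, which avoids double-running parent cleanup hooks; `fork` requires a POSIX platform.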
- Delete prd/conversion_plan.md (original 12-phase plan, now superseded)
- Create prd/implementation_plan.md with 9-phase execution plan:
  Phase 1: Shared infra (RoPE, cache, MLA, SSM, gated delta, etc.)
  Phase 2-7: All ~96 missing model architectures by dependency group
  Phase 8: HuggingFace Hub integration
  Phase 9: ONNX export validation for all ~109 architectures
- Each phase includes a native library gap report checkpoint to
  surface missing mlx-ruby/mlx-onnx functionality before proceeding
- Move detailed phase content from parity checklist to implementation
  plan; checklist now references the plan document

https://claude.ai/code/session_01BTKLwVSDwcZ7bcG4StMxUZ
@skryl merged commit b80517d into main on Feb 27, 2026
2 checks passed
@skryl deleted the codex/parity branch on February 27, 2026 at 04:19