Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
  •  
  •  
  •  
1 change: 1 addition & 0 deletions .claude/skills
58 changes: 58 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
name: CI

on:
push:
branches:
- master
pull_request:

jobs:
test:
name: Full Test Suite (${{ matrix.os }}, Ruby ${{ matrix.ruby }})
runs-on: ${{ matrix.os }}
timeout-minutes: 120
strategy:
fail-fast: false
matrix:
os:
- ubuntu-latest
- macos-14
ruby:
- "4.0.1"

steps:
- name: Checkout
uses: actions/checkout@v4
with:
submodules: recursive

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.12"
cache: pip
cache-dependency-path: requirements.txt

- name: Install Linux native dependencies
if: runner.os == 'Linux'
run: |
sudo apt-get update
sudo apt-get install -y libopenblas-dev liblapack-dev liblapacke-dev gfortran
if [ ! -f /usr/include/lapacke.h ] && [ -f /usr/include/x86_64-linux-gnu/lapacke.h ]; then
sudo ln -s /usr/include/x86_64-linux-gnu/lapacke.h /usr/include/lapacke.h
fi
echo "CMAKE_INCLUDE_PATH=/usr/include/x86_64-linux-gnu:/usr/include" >> "$GITHUB_ENV"

- name: Set up Ruby
uses: ruby/setup-ruby@v1
with:
ruby-version: ${{ matrix.ruby }}
bundler-cache: true

- name: Install test dependencies
run: |
bundle exec rake test:deps
echo "${GITHUB_WORKSPACE}/.venv-test/bin" >> "$GITHUB_PATH"

- name: Run all tests
run: bundle exec rake test
7 changes: 5 additions & 2 deletions Gemfile
Original file line number Diff line number Diff line change
Expand Up @@ -2,5 +2,8 @@ source "https://rubygems.org"

gemspec

gem "minitest", "~> 5.20"
gem "rake", "~> 13.0"
# Force CI/dependency resolution to use released mlx gem, not local submodule gemspecs.
gem "mlx", ">= 0.30.7.6", "< 1.0"

# Use local mlx-ruby submodule during development.
# gem "mlx", path: "mlx-ruby"
44 changes: 41 additions & 3 deletions Gemfile.lock
Original file line number Diff line number Diff line change
Expand Up @@ -2,25 +2,63 @@ PATH
remote: .
specs:
mlx-ruby-lm (0.30.7.1)
mlx (~> 0.1)
mlx (>= 0.30.7.5, < 1.0)
safetensors (~> 0.2)
tokenizers (~> 0.6)

GEM
remote: https://rubygems.org/
specs:
minitest (5.20.0)
rake (13.1.0)
minitest (5.27.0)
mlx (0.30.7.6)
ostruct (0.6.3)
rake (13.3.1)
safetensors (0.2.2-aarch64-linux)
safetensors (0.2.2-aarch64-linux-musl)
safetensors (0.2.2-arm64-darwin)
safetensors (0.2.2-x86_64-darwin)
safetensors (0.2.2-x86_64-linux)
safetensors (0.2.2-x86_64-linux-musl)
tokenizers (0.6.3-aarch64-linux)
tokenizers (0.6.3-aarch64-linux-musl)
tokenizers (0.6.3-arm64-darwin)
tokenizers (0.6.3-x86_64-darwin)
tokenizers (0.6.3-x86_64-linux)
tokenizers (0.6.3-x86_64-linux-musl)

PLATFORMS
aarch64-linux
aarch64-linux-musl
arm64-darwin
x86_64-darwin
x86_64-linux
x86_64-linux-musl

DEPENDENCIES
minitest (~> 5.20)
mlx (>= 0.30.7.6, < 1.0)
mlx-ruby-lm!
ostruct
rake (~> 13.0)

CHECKSUMS
minitest (5.27.0) sha256=2d3b17f8a36fe7801c1adcffdbc38233b938eb0b4966e97a6739055a45fa77d5
mlx (0.30.7.6) sha256=1bd1f6b944e990147fdbe2654ba2830f14dc9ff7dcb9c4c5a314c916b4b92d66
mlx-ruby-lm (0.30.7.1)
ostruct (0.6.3) sha256=95a2ed4a4bd1d190784e666b47b2d3f078e4a9efda2fccf18f84ddc6538ed912
rake (13.3.1) sha256=8c9e89d09f66a26a01264e7e3480ec0607f0c497a861ef16063604b1b08eb19c
safetensors (0.2.2-aarch64-linux) sha256=5b50146d50a76fe0395b7aef4d13a1da8fcad44e9cf0f5aead935d5d17fb04dd
safetensors (0.2.2-aarch64-linux-musl) sha256=d6dea4e4f5ca11cff8ba4c017382838df5d33d78f79fabd9a5e5e482aa6afd57
safetensors (0.2.2-arm64-darwin) sha256=19d77df47154038974f76a4e1bac2d778ea04ca2c49abcd5b9f9c0f1a899d10b
safetensors (0.2.2-x86_64-darwin) sha256=a1dc2b415f6ef35c8887b15a6f72c673f3b008455c33aa399154c0eabf5adbcd
safetensors (0.2.2-x86_64-linux) sha256=f447d3d3110a7592b521a23f58b0251283659b27f3700ab627ac6ba517fa04ff
safetensors (0.2.2-x86_64-linux-musl) sha256=0d52871f2b672485cda73bc94807bb6bd74409a33414fdc341950ceb88f76049
tokenizers (0.6.3-aarch64-linux) sha256=9d54a23f2e2246cc942d183af4549e3972b937d9b01f7a387cb146bf698eee84
tokenizers (0.6.3-aarch64-linux-musl) sha256=c178d8556769256857d77fb396f8ab004b29d058f59c620a2cfc56b01b501e27
tokenizers (0.6.3-arm64-darwin) sha256=29a6a5582dce106d846a906ee9e4254c12db45a3855c3ff6881d4be8be03e6b6
tokenizers (0.6.3-x86_64-darwin) sha256=4b71386cc08ceff5f86b448c74b2b297c00a280a1d502399b6cda23ef94e01fd
tokenizers (0.6.3-x86_64-linux) sha256=77a45cbde59daac33bdda1a74d45c18080478992a00ee7d898e7b8d15d0b3149
tokenizers (0.6.3-x86_64-linux-musl) sha256=a4b08c53bf0c8f7674c3abd03e013f0bb7c0c2457174b116c2872a37c64f0297

BUNDLED WITH
4.0.6
176 changes: 176 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,176 @@
# mlx-ruby-lm

Ruby LLM inference toolkit built on the `mlx` gem.

## Included tools

### CLI

Executable: `mlx_lm`

Commands:

- `mlx_lm generate`
- `mlx_lm chat`
- `mlx_lm server`

Example:

```bash
mlx_lm generate --model /path/to/model --prompt "Hello"
```

### Ruby APIs

- `MlxLm::LoadUtils`: load model weights/config/tokenizer from model directory.
- `MlxLm::Generate`: token generation (`generate`, `stream_generate`, `generate_step`).
- `MlxLm::SampleUtils`: samplers and logits processors (`top_p`, `top_k`, repetition penalty).
- `MlxLm::ChatTemplate`: default/chatml prompt formatting.
- `MlxLm::Server`: OpenAI-compatible chat completion server (`/v1/models`, `/v1/chat/completions`).
- `MlxLm::Quantize`: model quantization/dequantization helpers.
- `MlxLm::Perplexity`: perplexity/log-likelihood helpers.
- `MlxLm::Benchmark`: simple generation throughput and model stats helpers.
- `MlxLm::Tuner`: LoRA adapters (`LoRALinear`, `LoRAEmbedding`, `apply_lora_layers`).
- `MlxLm::ConvertUtils`: dtype conversion and parameter/size utilities.

Minimal usage:

```ruby
require "mlx"
require "mlx_lm"

model, tokenizer = MlxLm::LoadUtils.load("/path/to/model")
text = MlxLm::Generate.generate(model, tokenizer, "Hello", max_tokens: 64)
puts text
```

## Included models

Current registry includes 106 `model_type` values.

Families covered include:

- Llama/Gemma/Qwen/Phi
- Mistral/Mixtral/Granite/Cohere
- DeepSeek/GLM/InternLM/Kimi
- Mamba/RWKV/Recurrent Gemma
- MoE variants (for example `*_moe`, `mixtral`, `jamba`, `granitemoe*`)
- Vision-language variants (for example `qwen*_vl`, `kimi_vl`, `pixtral`, `lfm2-vl`)

Registered `model_type` values:

```text
Klear
afm7
afmoe
apertus
baichuan_m1
bailing_moe
bailing_moe_linear
bitnet
cohere
cohere2
dbrx
deepseek
deepseek_v2
deepseek_v3
deepseek_v32
dots1
ernie4_5
ernie4_5_moe
exaone
exaone4
exaone_moe
falcon_h1
gemma
gemma2
gemma3
gemma3_text
gemma3n
glm
glm4
glm4_moe
glm4_moe_lite
glm_moe_dsa
gpt2
gpt_bigcode
gpt_neox
gpt_oss
granite
granitemoe
granitemoehybrid
helium
hunyuan
hunyuan_v1_dense
internlm2
internlm3
iquestloopcoder
jamba
kimi_k25
kimi_linear
kimi_vl
lfm2
lfm2-vl
lfm2_moe
lille-130m
llama
llama4
llama4_text
longcat_flash
longcat_flash_ngram
mamba
mamba2
mimo
mimo_v2_flash
minicpm
minicpm3
minimax
ministral3
mistral3
mixtral
nanochat
nemotron
nemotron-nas
nemotron_h
olmo
olmo2
olmo3
olmoe
openelm
phi
phi3
phi3small
phimoe
phixtral
pixtral
plamo
plamo2
qwen
qwen2
qwen2_moe
qwen2_vl
qwen3
qwen3_5
qwen3_5_moe
qwen3_moe
qwen3_next
qwen3_vl
qwen3_vl_moe
recurrent_gemma
rwkv7
seed_oss
smollm3
solar_open
stablelm
starcoder2
step3p5
telechat3
youtu_llm
```

To inspect the current registry from Ruby:

```ruby
require "mlx_lm"
puts MlxLm::Models::REGISTRY.keys.sort
```
36 changes: 36 additions & 0 deletions Rakefile
Original file line number Diff line number Diff line change
@@ -1,15 +1,51 @@
require "rake/testtask"
require_relative "tasks/onnx_report_task"
require_relative "tasks/parity_inventory_task"

VENV_DIR = File.expand_path(".venv-test", __dir__)
VENV_PYTHON = File.join(VENV_DIR, "bin", "python")
REQUIREMENTS_FILE = File.expand_path("requirements.txt", __dir__)

Rake::TestTask.new(:test) do |t|
t.libs << "test" << "lib"
t.test_files = FileList["test/**/*_test.rb"]
end

namespace :test do
desc "Install Python dependencies required by parity tests"
task :deps do
next unless File.exist?(REQUIREMENTS_FILE)

sh("python3 -m venv #{VENV_DIR}") unless File.exist?(VENV_PYTHON)
sh("#{VENV_PYTHON} -m pip install --upgrade pip")
sh("#{VENV_PYTHON} -m pip install -r #{REQUIREMENTS_FILE}")
end

Rake::TestTask.new(:parity) do |t|
t.libs << "test" << "lib"
t.test_files = FileList["test/parity/**/*_test.rb"]
end
end

namespace :parity do
desc "Regenerate the Python/Ruby parity inventory snapshot"
task :inventory do
ParityInventoryTask.run!
end

desc "Verify the parity inventory snapshot is up-to-date"
task :inventory_check do
next if ParityInventoryTask.run!(check: true)

raise "parity inventory snapshot is stale"
end
end

namespace :onnx do
desc "Run compat-only ONNX suite and generate report artifacts under test/reports"
task :report do
OnnxReportTask.run!
end
end

task default: :test
Loading