New MTP models not supported #323

@danielringwald

Google's post announcing the new multi-token prediction (MTP) models: https://blog.google/innovation-and-ai/technology/developers-tools/multi-token-prediction-gemma-4/

Hugging Face collection with the models: https://huggingface.co/collections/mlx-community/gemma-4-assistant-mtp

Simple script to download and prompt the model:

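from mlx_lm import load, generate

# Load the MTP assistant checkpoint (downloaded from the Hugging Face Hub on first use).
model, tokenizer = load("mlx-community/gemma-4-31B-it-assistant-bf16")
# Format a single-turn chat prompt as plain text.
messages = [{"role": "user", "content": "Explain quantum entanglement simply."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
# Generate up to 512 tokens and print them as they are produced.
response = generate(model, tokenizer, prompt=prompt, max_tokens=512, verbose=True)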

Resulting error: ValueError: Model type gemma4_assistant not supported.
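
For triage, here is a rough sketch of how one might check ahead of time whether an installed package has an architecture module for a given `model_type`. It assumes (not confirmed in this issue) that the loader resolves architectures by importing a module named after `model_type` under `mlx_lm.models`. The helper names `model_type_from_config` and `has_model_module` are hypothetical, and the check ignores any name remapping the library may apply internally.

```python
import importlib.util
import json


def model_type_from_config(config_text):
    """Return the `model_type` field from the text of a config.json file."""
    return json.loads(config_text).get("model_type")


def has_model_module(model_type, package="mlx_lm.models"):
    """Return True if a module `<package>.<model_type>` can be located.

    Assumption: the library looks up architectures by module name under
    `package`. This only locates the module; it does not load any weights.
    """
    try:
        return importlib.util.find_spec(f"{package}.{model_type}") is not None
    except ModuleNotFoundError:
        # The parent package itself is not installed.
        return False


if __name__ == "__main__":
    # Hypothetical minimal config mirroring the model type from the error above.
    config_text = '{"model_type": "gemma4_assistant"}'
    model_type = model_type_from_config(config_text)
    print(model_type, "supported:", has_model_module(model_type))
```

If this prints `supported: False` on the latest release, that would be consistent with the `ValueError` above and suggests the architecture still needs to be added rather than a local install problem.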

Labels: enhancement (New feature or request)
