Support for PARO quantized models #73

@sangemaru

Problem

Standard quantization options degrade model accuracy and fidelity relative to the unquantized model.

Proposal

Support PARO (pairwise rotation quantization) models, which offer INT4 quantization at near-BF16 fidelity in MLX.
Candidate models: https://huggingface.co/z-lab/Qwen3.6-27B-PARO and https://huggingface.co/z-lab/Qwen3.6-35B-A3B-PARO
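
For context, here is a minimal NumPy sketch of the general idea behind rotation-before-quantization. It is not the PARO implementation and does not use MLX. The helper names (`givens_rotate_pairs`, `quantize_int4`), the channel pairs, the angles, and the group size are all hypothetical and chosen only for illustration; a real method would presumably learn or calibrate the rotations. The sketch shows that rotating disjoint pairs of input channels in both the weights and the activations leaves the layer output unchanged before quantization, so the rotation can be used to redistribute outlier channels ahead of an INT4 step.

```python
import numpy as np

def givens_rotate_pairs(x, pairs, angles):
    """Rotate disjoint channel pairs (i, j) along the last axis of x.

    Hypothetical helper: pairs and angles are illustrative, not taken from PARO.
    """
    out = x.copy()
    for (i, j), theta in zip(pairs, angles):
        c, s = np.cos(theta), np.sin(theta)
        xi, xj = x[..., i], x[..., j]
        out[..., i] = c * xi - s * xj
        out[..., j] = s * xi + c * xj
    return out

def quantize_int4(w, group_size=8):
    """Symmetric group-wise INT4 quantize-then-dequantize along the last axis."""
    rows, cols = w.shape
    g = w.reshape(rows, cols // group_size, group_size)
    scale = np.abs(g).max(axis=-1, keepdims=True) / 7.0
    scale = np.where(scale == 0, 1.0, scale)   # avoid division by zero
    q = np.clip(np.round(g / scale), -8, 7)    # signed 4-bit integer range
    return (q * scale).reshape(rows, cols)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
W[:, 0] *= 20.0                                # simulate an outlier input channel
x = rng.normal(size=8)

pairs = [(0, 1), (2, 3)]                       # assumed disjoint channel pairs
angles = [np.pi / 4, 0.3]                      # assumed angles, not learned

W_rot = givens_rotate_pairs(W, pairs, angles)
x_rot = givens_rotate_pairs(x, pairs, angles)

y_ref = W @ x
y_rot = W_rot @ x_rot                          # same output: the rotation is orthogonal

err_plain = np.abs(quantize_int4(W) @ x - y_ref).mean()
err_rot = np.abs(quantize_int4(W_rot) @ x_rot - y_ref).mean()
print("mean abs error, plain INT4:  ", err_plain)
print("mean abs error, rotated INT4:", err_rot)
```

Whether the rotated variant actually has lower error depends on the chosen pairs and angles, so the sketch prints both errors rather than asserting an improvement.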
