MTP FP16 for M1/M2 #43

@beamivalice

Problem

Just dropping in to say that someone has found out why M1/M2 is slower with MTP. See here: mlx-lm

Proposal

If you could provide a model on Hugging Face with FP16 MTP for M1/M2 users to select, it would be a tremendous help.
