MTP FP16 for M1/M2

### Problem

Just dropping in to say that someone has found out why MTP is slower on M1/M2. See [this mlx-lm PR comment](https://github.com/ml-explore/mlx-lm/pull/990#issuecomment-4410052832).



### Proposal

If you could provide a model on Hugging Face with the MTP weights in FP16, as an option for M1/M2 users, it would be a tremendous help.