Daniel Han-Chen

danielhanchen

https://unsloth.ai/

Recent Activity

new activity about 12 hours ago

microsoft/phi-4:Suggested tokenizer changes by Unsloth.ai

posted an update about 17 hours ago

We fixed many bugs in Phi-4 & uploaded fixed GGUF + 4-bit versions! ✨ Our fixed versions are even higher on the Open LLM Leaderboard than Microsoft's!

GGUFs: https://huggingface.co/unsloth/phi-4-GGUF
Dynamic 4-bit: https://huggingface.co/unsloth/phi-4-unsloth-bnb-4bit

You can also now finetune Phi-4 for free on Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Phi_4-Conversational.ipynb

Read our blogpost for more details on bug fixes etc: https://unsloth.ai/blog/phi4

new activity 1 day ago

unsloth/DeepSeek-V3-GGUF:Getting error with Q3-K-M

Articles

Faster fine-tuning using TRL & Unsloth

Jan 10, 2024

• 44

danielhanchen's activity

New activity in microsoft/phi-4 about 12 hours ago

Suggested tokenizer changes by Unsloth.ai

#21 opened about 22 hours ago by

gugarosa

New activity in unsloth/DeepSeek-V3-GGUF 1 day ago

Getting error with Q3-K-M

#2 opened 3 days ago by

alain401

Advice on running llama-server with Q2_K_L quant

#6 opened 2 days ago by

vmajor

New activity in unsloth/DeepSeek-V3-GGUF 2 days ago

llama.cpp cannot load Q6_K model

#3 opened 3 days ago by

vmajor

New activity in unsloth/Llama-3.3-70B-Instruct 29 days ago

Big thanks for these "without original" uploads!

#1 opened about 1 month ago by

jukofyork

New activity in unsloth/gemma-2-27b-it-bnb-4bit 4 months ago

Aphrodite/VLLM/SGLang all refuse to load this model

#5 opened 4 months ago by

fullstack

New activity in unsloth/gemma-7b-bnb-4bit 4 months ago

No module named 'triton'

#3 opened 4 months ago by

NeelM0906

New activity in unsloth/Hermes-3-Llama-3.1-8B-bnb-4bit 4 months ago

update base_model

#1 opened 4 months ago by

davanstrien

New activity in unsloth/mistral-7b-instruct-v0.3 4 months ago

ValueError: The following `model_kwargs` are not used by the model: ['num_logits_to_keep'] (note: typos in the generate arguments will also show up in this list)

#1 opened 4 months ago by

NeelM0906

New activity in unsloth/Phi-3-mini-4k-instruct-v0-bnb-4bit 5 months ago

Cant use the tokenizer using Unsloth Fastmodel

#2 opened 5 months ago by

aryarishit

New activity in unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit 6 months ago

RuntimeError: Unsloth: `unsloth/Meta-Llama-3.1-8B-bnb-4bit` is not a base model or a PEFT model.

#3 opened 6 months ago by

yorickdejong

New activity in unsloth/Mistral-Nemo-Base-2407 6 months ago

difference

#1 opened 6 months ago by

ehartford

New activity in google/gemma-2-9b-it 6 months ago

9B - query_pre_attn_scalar = 256 not 224

#26 opened 6 months ago by

danielhanchen

New activity in google/gemma-2-9b 6 months ago

9B - query_pre_attn_scalar = 256 not 224

#22 opened 6 months ago by

danielhanchen

New activity in unsloth/llama-3-8b 7 months ago

is this the llama-3-8b model clone?

#1 opened 9 months ago by

malhajar

New activity in unsloth/gemma-2b-bnb-4bit 7 months ago

Model seems to be not PEFT model

#1 opened 7 months ago by

neuralresearcher

New activity in unsloth/mistral-7b-v0.2-bnb-4bit 7 months ago

full disk on colab

#2 opened 7 months ago by

Dav22

New activity in unsloth/Phi-3-mini-4k-instruct-bnb-4bit 7 months ago

TGI - RuntimeError: mat1 and mat2 shapes cannot be multiplied (4145x3072 and 1x14155776)

#3 opened 8 months ago by

turjo4nis

New activity in unsloth/llama-3-8b-bnb-4bit 7 months ago

34 hour for file tunning ?

#7 opened 8 months ago by

dad1909

New activity in unsloth/llama-3-70b-Instruct-bnb-4bit 8 months ago

Update config.json

#1 opened 8 months ago by

huseink