Phi-4 (All Versions) Collection Microsoft's new Phi-4 model in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 4 items • Updated 3 days ago • 22
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 3 days ago • 176
Article Fine-tune ModernBERT for text classification using synthetic data By davidberenstein1957 • 12 days ago • 22
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 7 items • Updated 5 days ago • 33
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12, 2024 • 127
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated 3 days ago • 545
The Big Benchmarks Collection Collection Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated Nov 18, 2024 • 183
Gemma-APS Release Collection Gemma models for text-to-propositions segmentation. The models are distilled from a fine-tuned Gemini Pro model applied to multi-domain synthetic data. • 3 items • Updated 29 days ago • 20
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 12 items • Updated 5 days ago • 120
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28, 2024 • 171
Llama 3.1 GPTQ, AWQ, and BNB Quants Collection Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & VLLM 🤗 • 9 items • Updated Sep 26, 2024 • 56
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 354
EVIDENT PlatVR [datasets] Collection This work is supported by the Ministry of Industry, Trade and Tourism, Spain (AEI-010500-2023-280). • 3 items • Updated Apr 17, 2024 • 1
EVIDENT PlatVR [models] Collection This work is supported by the Ministry of Industry, Trade and Tourism, Spain (AEI-010500-2023-280). • 3 items • Updated Apr 17, 2024 • 1
ASR Collection Automatic Speech Recognition ITG model collection • 3 items • Updated Apr 12, 2024 • 1
MT5 release Collection The MT5 release follows the T5 family, but is pretrained on multilingual data. The updated UMT5 models are pretrained on an updated corpus. • 10 items • Updated 29 days ago • 17
Flan-T5 release Collection The Flan-T5 release covers 4 checkpoints of different sizes. It also includes upgraded versions trained using Universal sampling. • 7 items • Updated 29 days ago • 21