Phi-4 (All Versions) Collection Microsoft's new Phi-4 model in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 4 items • Updated 3 days ago • 22
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning Paper • 2501.03226 • Published 5 days ago • 33
Cosmos Collection The collection of Cosmos models • 31 items • Updated about 10 hours ago • 206
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 10 days ago • 91
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation Paper • 2501.01895 • Published 8 days ago • 43
Executable Code Actions Elicit Better LLM Agents Paper • 2402.01030 • Published Feb 1, 2024 • 40
Article 🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark By wolfram • 9 days ago • 36
How Well Do LLMs Generate Code for Different Application Domains? Benchmark and Evaluation Paper • 2412.18573 • Published 18 days ago • 1
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines Paper • 2310.03714 • Published Oct 5, 2023 • 33
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response Paper • 2412.14922 • Published 23 days ago • 85
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published 17 days ago • 89
YuLan-Mini: An Open Data-efficient Language Model Paper • 2412.17743 • Published 19 days ago • 61
Spectrum: Targeted Training on Signal to Noise Ratio Paper • 2406.06623 • Published Jun 7, 2024 • 12