CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published 9 days ago • 45
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated about 15 hours ago • 150
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2, 2024 • 22
Article Synthetic dataset generation techniques: Self-Instruct By davanstrien • May 15, 2024 • 14
Llama 3.x Models Collection Our highest-performance models, built with Llama 3, 3.1, and 3.2 • 10 items • Updated Oct 31, 2024 • 3
Llamafied Yi Collection Yi base models converted to Llama architecture. • 4 items • Updated Nov 14, 2023 • 9
Open LLM Leaderboard best models ❤️🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: • 60 items • Updated 28 minutes ago • 503