Nathan Habib

SaylorTwift

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

Agent Laboratory: Using LLM Agents as Research Assistants

upvoted an article 2 days ago

🌁#82: AI and ML in Real Life

upvoted a collection 3 days ago

Articles

CO₂ Emissions and Models Performance: Insights from the Open LLM Leaderboard

Open LLM Leaderboard: DROP deep dive

What's going on with the Open LLM Leaderboard?

Organizations

Posts 1

Post

414

How do I test an LLM for my unique needs?
If you work in finance, law, or medicine, generic benchmarks are not enough.
This blog post uses Argilla, Distilabel, and 🌤️Lighteval to generate an evaluation dataset and evaluate models.

https://github.com/argilla-io/argilla-cookbook/blob/main/domain-eval/README.md

Collections 2

Papers 1

arxiv:2310.16944

spaces 2

Mt Bench Viz No Compare

Mt Bench Viz

models 2

SaylorTwift/gpt2_test

Text Generation • Updated Sep 23, 2024 • 795

SaylorTwift/xlm-roberta-base-finetuned-panx-fr

Updated Mar 13, 2023

datasets 8

SaylorTwift/TinyLlama__TinyLlama-1.1B-Chat-v0.6-details

Viewer • Updated Nov 1, 2024 • 780 • 32

SaylorTwift/results-test

Viewer • Updated Nov 1, 2024 • 3 • 4

SaylorTwift/gpt2

Viewer • Updated Jul 22, 2024 • 43.2k • 31

SaylorTwift/bbh

Viewer • Updated Jun 16, 2024 • 6.76k • 5.58k • 3

SaylorTwift/drop

Updated May 6, 2024 • 3

SaylorTwift/details_mistralai__Mistral-7B-Instruct-v0.2_private

Viewer • Updated Apr 2, 2024 • 162 • 39

SaylorTwift/the_pile_books3_minus_gutenberg

Viewer • Updated Mar 3, 2023 • 193k • 425 • 8

SaylorTwift/Gutenberg

Viewer • Updated Mar 2, 2023 • 54.8k • 82 • 5