rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8, 2025 • 176
Cautious Optimizers: Improving Training with One Line of Code Paper • 2411.16085 • Published Nov 25, 2024 • 15
nGPT: Normalized Transformer with Representation Learning on the Hypersphere Paper • 2410.01131 • Published Oct 1, 2024 • 9
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1, 2024 • 145 • 17
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated Paper • 2407.10969 • Published Jul 15, 2024 • 21
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark Paper • 2406.01574 • Published Jun 3, 2024 • 44
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Paper • 2405.21060 • Published May 31, 2024 • 64 • 3
Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting Paper • 2404.18911 • Published Apr 29, 2024 • 29 • 2
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 606 • 142