Article: Superposition in Transformers: A Novel Way of Building Mixture of Experts • By BenChaliah • 7 days ago
Collection: Scaling Test-Time Compute with Open Models • Models and datasets used in our blog post: https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute • 10 items • Updated 6 days ago
Paper: MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages • arXiv 2410.01036 • Published Oct 1, 2024
Collection: Moshi v0.1 Release • MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024
Article: Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth • By mlabonne • Jul 29, 2024
Article: Memory-efficient Diffusion Transformers with Quanto and Diffusers • Jul 30, 2024
Article: Extracting Concepts from LLMs: Anthropic’s recent discoveries 📖 • By m-ric • Jun 20, 2024
Article: makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch • By AviSoori1x • May 7, 2024
Article: SeeMoE: Implementing a MoE Vision Language Model from Scratch • By AviSoori1x • Jun 23, 2024
Article: seemore: Implement a Vision Language Model from Scratch • By AviSoori1x • Jun 23, 2024
Paper: Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone • arXiv 2404.14219 • Published Apr 22, 2024
Paper: Mixture-of-Depths: Dynamically allocating compute in transformer-based language models • arXiv 2404.02258 • Published Apr 2, 2024