Arunkumar Venkataramanan's picture

52 137

Arunkumar Venkataramanan

ArunkumarVR

·

https://arunkumarramanan.github.io

AI & ML interests

AGI Research: Reasoning, Safety & Alignment (Superalignment), Generative AI (GenAI), Multi-Modal Foundation Models (FMs), Large Language Models (LLMs), Transformers & Diffusion Models, Open LLM Training, Optimization & Finetuning, Serving & Inference

Recent Activity

liked a model 2 days ago

microsoft/phi-4-gguf

upvoted a collection 2 days ago

liked a model 2 days ago

microsoft/phi-4

View all activity

Organizations

ArunkumarVR's activity

upvoted a collection 2 days ago

Phi-4

Phi-4 small language model. • 2 items • Updated 3 days ago • 30

upvoted a collection 25 days ago

Scaling Test-Time Compute with Open Models

Models and datasets used in our blog post: https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute • 10 items • Updated 6 days ago • 21

upvoted a collection 28 days ago

Phi-3

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated 3 days ago • 545

upvoted 6 collections about 1 month ago

Llama 3.3

This collection hosts the transformers and original repos of the Llama 3.3 • 1 item • Updated Dec 6, 2024 • 106

PixMo

A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated 5 days ago • 53

Molmo

Artifacts for open multimodal language models. • 5 items • Updated 5 days ago • 292

Tulu 3 Datasets

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated 5 days ago • 64

OLMoE

Artifacts for open mixture-of-experts language models. • 13 items • Updated 5 days ago • 29

OLMo 2

Artifacts for the second set of OLMo models. • 22 items • Updated 5 days ago • 74

upvoted a collection about 2 months ago

OpenScholar_V1

The set of models, index, data associated with the paper "OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs". • 8 items • Updated Nov 22, 2024 • 31

upvoted 4 collections 2 months ago

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 20 days ago • 208

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 457

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 20 days ago • 198

upvoted a paper 2 months ago

GPT-4o System Card

Paper • 2410.21276 • Published Oct 25, 2024 • 83

upvoted 5 collections 3 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 638

NVLM 1.0

A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks and text-only tasks. • 2 items • Updated about 12 hours ago • 51

Llama-3.1-Nemotron-70B

SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated about 12 hours ago • 150

Llama 3.2 Evals

This collection provides detailed information on how we derived the reported benchmark metrics for the Llama 3.2 models, including the configurations • 4 items • Updated Dec 6, 2024 • 22

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 554