Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2406.12246

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 40
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 20

DS' Daily paper

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20, 2024 • 87
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Paper • 2405.21060 • Published May 31, 2024 • 64
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

Paper • 2405.20541 • Published May 30, 2024 • 22
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Paper • 2406.01574 • Published Jun 3, 2024 • 44

Multi-modality LVM

VoCo-LLaMA: Towards Vision Compression with Large Language Models

Paper • 2406.12275 • Published Jun 18, 2024 • 29
TroL: Traversal of Layers for Large Language and Vision Models

Paper • 2406.12246 • Published Jun 18, 2024 • 34
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning

Paper • 2406.15334 • Published Jun 21, 2024 • 9
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning

Paper • 2406.12742 • Published Jun 18, 2024 • 14

TroL: Traversal of Layers for Large Language and Vision Models

Paper • 2406.12246 • Published Jun 18, 2024 • 34
A Closer Look into Mixture-of-Experts in Large Language Models

Paper • 2406.18219 • Published Jun 26, 2024 • 16
ThinK: Thinner Key Cache by Query-Driven Pruning

Paper • 2407.21018 • Published Jul 30, 2024 • 31
Meltemi: The first open Large Language Model for Greek

Paper • 2407.20743 • Published Jul 30, 2024 • 68

Vision-Language

SILC: Improving Vision Language Pretraining with Self-Distillation

Paper • 2310.13355 • Published Oct 20, 2023 • 8
Woodpecker: Hallucination Correction for Multimodal Large Language Models

Paper • 2310.16045 • Published Oct 24, 2023 • 15
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Paper • 2201.12086 • Published Jan 28, 2022 • 3
ImageNetVC: Zero-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories

Paper • 2305.15028 • Published May 24, 2023 • 1

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 22
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 16
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 9
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 10

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs