ICML2023

AI & ML interests

None defined yet.

Recent Activity

hysts updated a Space about 10 hours ago

ICML2023/ICML2023_papers

ICML2023's activity

hysts

updated a Space about 10 hours ago

ICML2023 Papers

vwxyzjn

authored 5 papers 6 days ago

The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization

Paper • 2403.17031 • Published Mar 24, 2024 • 3

A2C is a special case of PPO

Paper • 2205.09123 • Published May 18, 2022

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models

Paper • 2410.18252 • Published Oct 23, 2024 • 5

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published Nov 22, 2024 • 58

2 OLMo 2 Furious

Paper • 2501.00656 • Published 11 days ago • 15

mbrack

authored a paper 19 days ago

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published 23 days ago • 4

akhaliq

posted an update 23 days ago

Post

4595

Google drops Gemini 2.0 Flash Thinking

a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more

now available in anychat, try it out: akhaliq/anychat

AtAndDev

posted an update 24 days ago

Post

388

@s3nh Hey man check your discord! Got some news.

Kameshr

authored a paper about 1 month ago

Think Beyond Size: Adaptive Prompting for More Effective Reasoning

Paper • 2410.08130 • Published Oct 10, 2024 • 1

akhaliq

posted an update about 1 month ago

Post

5749

QwQ-32B-Preview is now available in anychat

A reasoning model that is competitive with OpenAI o1-mini and o1-preview

try it out: akhaliq/anychat

akhaliq

posted an update about 1 month ago

Post

3789

New model drop in anychat

allenai/Llama-3.1-Tulu-3-8B is now available

try it here: akhaliq/anychat

akhaliq

posted an update about 2 months ago

Post

2766

anychat

supports chatgpt, gemini, perplexity, claude, meta llama, grok all in one app

try it out there: akhaliq/anychat

xzyao

authored a paper about 2 months ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 48

Lupin1998

authored 6 papers 3 months ago

Cascade-DETR: Delving into High-Quality Universal Object Detection

Paper • 2307.11035 • Published Jul 20, 2023

Behavior Contrastive Learning for Unsupervised Skill Discovery

Paper • 2305.04477 • Published May 8, 2023

Rethinking Memory and Communication Cost for Efficient Large Language Model Training

Paper • 2310.06003 • Published Oct 9, 2023 • 2

SemiReward: A General Reward Model for Semi-supervised Learning

Paper • 2310.03013 • Published Oct 4, 2023 • 1

LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory

Paper • 2404.11163 • Published Apr 17, 2024

Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 31