CelesteChen's Collections
Align
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment
Paper
•
2410.13785
•
Published
•
19
Aligning Large Language Models via Self-Steering Optimization
Paper
•
2410.17131
•
Published
•
22
Baichuan Alignment Technical Report
Paper
•
2410.14940
•
Published
•
50
SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation
Paper
•
2410.14745
•
Published
•
47
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
Paper
•
2410.16184
•
Published
•
24
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
Paper
•
2410.18451
•
Published
•
16
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch
Paper
•
2410.18693
•
Published
•
40
A Critical Evaluation of AI Feedback for Aligning Large Language Models
Paper
•
2402.12366
•
Published
•
3
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper
•
2411.02337
•
Published
•
35
SelfCodeAlign: Self-Alignment for Code Generation
Paper
•
2410.24198
•
Published
•
23
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Paper
•
2410.24175
•
Published
•
17
Accelerating Direct Preference Optimization with Prefix Sharing
Paper
•
2410.20305
•
Published
•
6
Self-Consistency Preference Optimization
Paper
•
2411.04109
•
Published
•
17
Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model
Paper
•
2411.04496
•
Published
•
22
Direct Preference Optimization Using Sparse Feature-Level Constraints
Paper
•
2411.07618
•
Published
•
15
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs
Paper
•
2411.14199
•
Published
•
30
SynFinTabs: A Dataset of Synthetic Financial Tables for Information and Table Extraction
Paper
•
2412.04262
•
Published
•
4
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases
Paper
•
2412.04862
•
Published
•
50
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
Paper
•
2412.05237
•
Published
•
47
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
Paper
•
2412.06676
•
Published
•
9
Paper
•
2412.08905
•
Published
•
101