230324 Scaling Expert Language Models with Unsupervised Domain Discovery.md

https://arxiv.org/abs/2303.14177

Scaling Expert Language Models with Unsupervised Domain Discovery (Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, Luke Zettlemoyer)

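This is a domain-expert MoE model. The domains are defined by k-means clustering, which always leaves me a bit uneasy, given that we are now in an era where injecting domain knowledge feels increasingly questionable.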
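To make the "domains from k-means" idea concrete, here is a minimal sketch in plain Python. It is not the paper's pipeline: the toy documents, the bag-of-words features, the deterministic farthest-point initialization, and the helper names (`embed`, `kmeans`, `route`) are all assumptions made for illustration. It only shows the general shape: cluster documents without labels, treat each cluster as a "domain", and send new text to the nearest cluster's expert.

```python
import math
from collections import Counter

# Toy corpus (hypothetical): two obvious "domains", but no labels are given.
docs = [
    "python function class import module",
    "import python class method function",
    "protein gene cell dna enzyme",
    "gene dna cell protein mutation",
]
vocab = sorted({w for d in docs for w in d.split()})

def embed(text):
    """L2-normalized bag-of-words vector over the toy vocabulary."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centers):
    return min(range(len(centers)), key=lambda c: sq_dist(vec, centers[c]))

def kmeans(vectors, k, iters=10):
    """Lloyd's algorithm with farthest-point init (a simplification for determinism)."""
    centers = [list(vectors[0])]
    while len(centers) < k:
        # Pick the vector farthest from all centers chosen so far.
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        assign = [nearest(v, centers) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, [nearest(v, centers) for v in vectors]

vectors = [embed(d) for d in docs]
centers, assign = kmeans(vectors, k=2)

def route(text):
    """Index of the cluster (i.e. the expert) closest to a new piece of text."""
    return nearest(embed(text), centers)

print(assign)
print(route("dna protein cell"))
```

Even in this toy version, the choice of features and of `k` decides what counts as a "domain", so the clustering is unsupervised but not free of design priors.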

#mixture_of_experts