https://arxiv.org/abs/2002.03532
Understanding and Improving Knowledge Distillation (Jiaxi Tang, Rakesh Shivanna, Zhe Zhao, Dong Lin, Anima Singh, Ed H. Chi, Sagar Jain)
Analyzes why knowledge distillation (KD) works, decomposing its effect into three mechanisms:
- label smoothing
- reweighting training examples by the teacher's confidence
- providing a prior on the logits
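The first mechanism can be seen in the standard Hinton-style KD loss, where the teacher's softened distribution plays a role analogous to a smoothed label. Below is a minimal NumPy sketch of that loss; the function names (`kd_loss`, `softmax`) and the hyperparameters `T` and `alpha` are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T spreads mass over classes,
    # which is what makes teacher targets resemble smoothed labels.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Cross-entropy on hard labels blended with KL-style distillation
    against the teacher's temperature-softened distribution."""
    p_t = softmax(teacher_logits, T)
    log_p_s = np.log(softmax(student_logits, T))
    # Soft term, scaled by T^2 to keep gradient magnitudes comparable.
    soft = -(p_t * log_p_s).sum(axis=-1).mean() * T * T
    idx = np.arange(len(labels))
    hard = -np.log(softmax(student_logits)[idx, labels]).mean()
    return alpha * soft + (1 - alpha) * hard

student = np.array([[2.0, 0.0], [0.5, 0.3]])
teacher = np.array([[3.0, -1.0], [1.0, 0.0]])
labels = np.array([0, 0])
print(kd_loss(student, teacher, labels))
```

A student whose logits disagree with a confident teacher pays a larger soft-term penalty, which is the reweighting effect in miniature.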
#distillation