200316 Weak and Strong Gradient Directions.md

https://arxiv.org/abs/2003.07422

Weak and Strong Gradient Directions: Explaining Memorization, Generalization, and Hardness of Examples at Scale (Piotr Zielinski, Shankar Krishnan, Satrajit Chatterjee)

Explaining Memorization and Generalization: A Large-Scale Study with Coherent Gradients (Piotr Zielinski, Shankar Krishnan, Satrajit Chatterjee)

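Extends the experiments of the coherent gradient paper (https://arxiv.org/abs/2002.10657). The coherent gradient hypothesis proposes that gradients which are similar across training examples reinforce one another while dissimilar gradients are suppressed, and that this effect matters for generalization and training performance.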
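
A toy sketch of the reinforcement idea, not the papers' experimental setup: when per-example gradients point the same way, their sum is large, and when they disagree, they cancel. The function name `coherence_ratio` and the ratio itself (squared norm of the summed gradient over the sum of squared per-example norms) are assumptions chosen here for illustration.

```python
def coherence_ratio(grads):
    """Return ||sum_i g_i||^2 / sum_i ||g_i||^2 for a list of per-example gradients.

    Illustrative measure only (an assumption of this sketch):
    it equals n when all n gradients are identical, and 0 when they cancel exactly.
    """
    dim = len(grads[0])
    # Sum the per-example gradients coordinate by coordinate.
    total = [sum(g[k] for g in grads) for k in range(dim)]
    numerator = sum(t * t for t in total)
    denominator = sum(x * x for g in grads for x in g)
    return numerator / denominator


# Four examples whose gradients agree: they reinforce each other.
aligned = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

# Four examples whose gradients point in opposing directions: they cancel.
opposing = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

# Three examples agree and one is idiosyncratic: the shared direction dominates.
mixed = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

print(coherence_ratio(aligned))   # 4.0
print(coherence_ratio(opposing))  # 0.0
print(coherence_ratio(mixed))     # 2.5
```

In the `mixed` case the summed gradient is `[3.0, 1.0]`, so the direction shared by three examples moves the parameters three times as far as the direction supported by a single example.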

#optimization