https://arxiv.org/abs/2004.13342
Scheduled DropHead: A Regularization Method for Transformer Models (Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei, Ming Zhou)
#regularization