
230515 Improved baselines for vision-language pre-training.md


https://arxiv.org/abs/2305.08675

Improved baselines for vision-language pre-training (Enrico Fini, Pietro Astolfi, Adriana Romero-Soriano, Jakob Verbeek, Michal Drozdzal)

Tuning of CLIP pre-training. The main contributions seem to be introducing augmentation and adding a non-contrastive loss.
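As a rough illustration of what "adding a non-contrastive loss" on top of CLIP could look like, here is a minimal NumPy sketch. It is not the paper's actual objective: the negative-cosine term between two augmented image views, the `aux_weight` parameter, the temperature of 0.07, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    # Symmetric InfoNCE: the i-th image and i-th text form the positive pair.
    logits = l2_normalize(img_emb) @ l2_normalize(txt_emb).T / temperature
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)

def non_contrastive_loss(view_a, view_b):
    # Assumed auxiliary term: negative cosine similarity between embeddings of
    # two augmented views of the same image. No negatives are involved.
    return -(l2_normalize(view_a) * l2_normalize(view_b)).sum(axis=-1).mean()

def combined_loss(img_view_a, img_view_b, txt_emb, aux_weight=1.0):
    # Hypothetical combination: CLIP loss plus a weighted auxiliary term.
    contrastive = clip_contrastive_loss(img_view_a, txt_emb)
    auxiliary = non_contrastive_loss(img_view_a, img_view_b)
    return contrastive + aux_weight * auxiliary
```

In a real training setup the two image views would come from independently sampled augmentations of the same image, and the auxiliary term would normally need a mechanism such as stop-gradient or a predictor head to avoid collapse; both are omitted here to keep the sketch short.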

#clip