Hi, thanks for your great work! Would it be possible for you to also release the Recap-CLIP models trained with ViT-S/16 vision encoders?
I am specifically interested in these two models from Table 4 of your paper:
- ViT-S/16 image encoder + small text encoder
- ViT-S/16 image encoder + base text encoder
Thanks and looking forward to the release!