Hi, thanks for your interest in our work. Indeed, we train a codebook of 262144 codes. However, as stated in the paper, we observe that optimizing such a large codebook on a relatively small amount of data (ImageNet) is difficult. To help the model predict over such a large vocabulary, we adopt the asymmetric token factorization technique, so the codebook sizes actually used under factorization are 64 and 4096 respectively (64 × 4096 = 262144). The `vocab_size` here is for the case where the asymmetric method is not adopted.
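For intuition, here is a minimal sketch of how such a factorization works: a single id in the full 262144-entry vocabulary is split into two smaller sub-token ids over vocabularies of size 64 and 4096, so the model only needs to predict over the two small vocabularies. The function names and the choice of which factor acts as the "coarse" index are my own illustration, not the repository's actual implementation.

```python
# Illustrative sketch of asymmetric token factorization.
# Factor sizes 64 and 4096 come from the reply above; the splitting
# convention (coarse = id // 4096) is an assumption for illustration.
V1, V2 = 64, 4096
assert V1 * V2 == 262144  # full vocabulary size

def factorize(token_id: int) -> tuple[int, int]:
    """Split a full-vocabulary id into two sub-token ids (sizes 64 and 4096)."""
    return token_id // V2, token_id % V2

def defactorize(id1: int, id2: int) -> int:
    """Recombine the two sub-token ids into the full-vocabulary id."""
    return id1 * V2 + id2

# Example: instead of one 262144-way prediction, the model outputs
# a 64-way and a 4096-way prediction that jointly identify the code.
tid = 200000
i1, i2 = factorize(tid)
assert defactorize(i1, i2) == tid
```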