questions about the autoregressive model vocab size #41

Closed
sysuyy opened this issue Nov 3, 2024 · 2 comments

Comments

@sysuyy

sysuyy commented Nov 3, 2024

Hello! Thank you for your wonderful work. Why is the vocab size in the config set to 512 and not 262,144?
@RobertLuo1
Collaborator

Hi, thanks for your interest in our work. Indeed, we train a codebook of 262,144 codes. However, as stated in the paper, we observe that optimizing such a large codebook on a relatively small amount of data (ImageNet) is difficult. To help the model predict over such a large vocabulary, we adopt the asymmetric token factorization technique: the codebook is factorized into two sub-codebooks of sizes 64 and 4096, respectively (64 × 4096 = 262,144). The vocab_size here applies only when the asymmetric method is not adopted.
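For anyone landing here with the same question, here is a minimal sketch of what the factorization arithmetic could look like. It assumes the flat code index is split by integer division and modulo into a (coarse, fine) pair, and that the model predicts two small softmaxes per step instead of one 262,144-way softmax; all names (V_COARSE, V_FINE, head_coarse, head_fine) are hypothetical and not taken from this repo.

```python
import torch

# Hypothetical asymmetric token factorization sketch: a flat codebook of
# 262,144 codes is factorized into sub-vocabularies of 64 and 4096
# (64 * 4096 = 262,144). Names are illustrative, not from the repo.
V_COARSE, V_FINE = 64, 4096

def factorize(token_id: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Split a flat token id in [0, 262144) into a (coarse, fine) pair."""
    return token_id // V_FINE, token_id % V_FINE

def defactorize(coarse: torch.Tensor, fine: torch.Tensor) -> torch.Tensor:
    """Recombine a (coarse, fine) pair into the flat token id."""
    return coarse * V_FINE + fine

# The AR model can then use two small output heads per step instead of
# one 262,144-way classifier:
hidden_dim = 768
head_coarse = torch.nn.Linear(hidden_dim, V_COARSE)  # 64-way logits
head_fine = torch.nn.Linear(hidden_dim, V_FINE)      # 4096-way logits

h = torch.randn(1, hidden_dim)                        # dummy hidden state
coarse = head_coarse(h).argmax(-1)                    # in [0, 64)
fine = head_fine(h).argmax(-1)                        # in [0, 4096)
print(defactorize(coarse, fine))                      # flat index in [0, 262144)
```

The practical upshot is that the output layers shrink from one 262,144-way projection to a 64-way plus a 4096-way projection, which is far easier to optimize on an ImageNet-scale dataset.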

@sysuyy
Author

sysuyy commented Nov 3, 2024

Thank you for your detailed reply!
