Compare to SimVQ #43

Closed
suruoxi opened this issue Dec 12, 2024 · 2 comments
Comments

@suruoxi

suruoxi commented Dec 12, 2024

Have you compared IBQ to SimVQ? I think SimVQ has the similar motivation of "optimize all codebook embeddings".

@ShiFengyuan1999
Collaborator

Hi @suruoxi, thanks for your interest in our work.

SimVQ has a similar motivation to IBQ, but we adopt different methods to optimize all codebook embeddings: SimVQ optimizes a linear transformation $W \in R^{D \times D}$, while our IBQ optimizes the codes $C \in R^{K \times D}$ themselves, where $K \gg D$.
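
For concreteness, here is a minimal PyTorch sketch of the two parameterizations as described above. The class names, code dimension, and random initialization are illustrative assumptions, and neither snippet reproduces the actual SimVQ or IBQ training procedure (e.g., how gradients are routed through the quantization step); it only contrasts which parameters are trained.

```python
import torch
import torch.nn as nn

K, D = 262144, 8  # illustrative codebook size and code dimension

class SimVQStyleCodebook(nn.Module):
    """Keeps a fixed base codebook and learns only a D x D linear map W,
    so all K codes move whenever W is updated (D*D trainable values)."""
    def __init__(self, K, D):
        super().__init__()
        self.register_buffer("base_codes", torch.randn(K, D))  # frozen base codes
        self.W = nn.Linear(D, D, bias=False)                   # learned W in R^{D x D}

    def codebook(self):
        return self.W(self.base_codes)  # effective codes: base_codes @ W^T

class IBQStyleCodebook(nn.Module):
    """Learns the K x D code matrix itself (K*D trainable values, K >> D)."""
    def __init__(self, K, D):
        super().__init__()
        self.codes = nn.Parameter(torch.randn(K, D))  # learned C in R^{K x D}

    def codebook(self):
        return self.codes

simvq_like = SimVQStyleCodebook(K, D)
ibq_like = IBQStyleCodebook(K, D)
print(sum(p.numel() for p in simvq_like.parameters()))  # D*D = 64
print(sum(p.numel() for p in ibq_like.parameters()))    # K*D = 2,097,152
```

The parameter counts printed at the end make the contrast concrete: the SimVQ-style path trains only a $D \times D$ transform, while the IBQ-style path trains all $K \times D$ code entries.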

SimVQ and IBQ cannot be compared directly, because SimVQ is trained on ImageNet 128x128 while IBQ is trained on ImageNet 256x256. Moreover, SimVQ only reports reconstruction performance and does not provide generation results. Still, we can list some numbers from the papers for reference.

| Method | Train Resolution | Codebook Size | rFID $\downarrow$ |
| --- | --- | --- | --- |
| SimVQ | 128 $\times$ 128 | 262144 | 1.99 |
| OpenMAGVIT2 | 128 $\times$ 128 | 262144 | 1.18 |
| OpenMAGVIT2 | 256 $\times$ 256 | 262144 | 1.17 |
| IBQ | 256 $\times$ 256 | 262144 | 1.00 |

@suruoxi
Author

suruoxi commented Dec 16, 2024

Thanks for the reply; the rFID of IBQ is impressive.

@suruoxi suruoxi closed this as completed Dec 16, 2024