
Will the 16 or 32 codebook dimension hamper the representational capacity of autoencoders? #127

@yilinliu77

Thanks for the excellent collection and implementation of vector quantization techniques! It has been very helpful for getting to know the technique and studying the details.

I've integrated these techniques into my autoencoder and encountered some challenges. Initially, training the autoencoder without quantization yielded good results. However, introducing quantization methods such as ResidualFSQ and ResidualVQ hurt both training and validation losses, preventing them from reaching the levels achieved without quantization. Intriguingly, testing the quantization on a small subset of the data (0.8k out of 120k samples) gave comparable results with and without quantization. Yet on the full dataset, the training and validation losses stayed high, decreased only slowly, and nearly plateaued.
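For context, the integration looks roughly like the sketch below (simplified, not my exact code: the encoder/decoder are placeholders and the ResidualVQ settings are illustrative):

```python
import torch
from torch import nn
import torch.nn.functional as F
from vector_quantize_pytorch import ResidualVQ

class QuantizedAutoencoder(nn.Module):
    def __init__(self, dim = 256):
        super().__init__()
        self.encoder = nn.Linear(512, dim)   # placeholder for my actual encoder
        self.decoder = nn.Linear(dim, 512)   # placeholder for my actual decoder
        self.quantizer = ResidualVQ(
            dim = dim,
            num_quantizers = 8,              # illustrative settings
            codebook_size = 1024,
        )

    def forward(self, x):
        z = self.encoder(x)                              # (batch, seq, dim)
        z_q, indices, commit_loss = self.quantizer(z)    # quantized bottleneck
        recon = self.decoder(z_q)
        return recon, commit_loss.sum()

model = QuantizedAutoencoder()
x = torch.randn(4, 32, 512)
recon, commit_loss = model(x)
loss = F.mse_loss(recon, x) + 0.25 * commit_loss         # 0.25 is an illustrative weight
```

Without the `self.quantizer` call (i.e. decoding `z` directly), the reconstruction loss converges nicely; with it, the loss stalls as described above.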

Upon examining the implementation, I noticed that the quantizers project the input features down to a much smaller dimension (8, 16, or 32) before the actual quantization happens. I'm concerned this dimensionality reduction might compromise the model's representational capacity, since any subsequent quantization step, such as binarization or scalar quantization, operates entirely within this low-dimensional space.
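Concretely, this is the projection I am referring to (a sketch based on my reading of the README; the sizes below are illustrative, not my actual config):

```python
import torch
from vector_quantize_pytorch import VectorQuantize, ResidualFSQ

# VectorQuantize with codebook_dim < dim: features are projected 256 -> 16,
# quantized in the 16-dim codebook space, then projected back up to 256.
vq = VectorQuantize(
    dim = 256,
    codebook_size = 256,
    codebook_dim = 16,   # the low-dimensional space my concern is about
)

# ResidualFSQ: each quantizer's effective dimension is len(levels), here 4,
# so the 256-dim feature is projected down before scalar quantization.
rfsq = ResidualFSQ(
    dim = 256,
    levels = [8, 5, 5, 3],
    num_quantizers = 8,
)
rfsq.eval()  # mirroring the README usage example

x = torch.randn(1, 1024, 256)
q_vq, indices_vq, commit_loss = vq(x)   # q_vq: (1, 1024, 256), but quantized in 16 dims
q_fsq, indices_fsq = rfsq(x)            # q_fsq: (1, 1024, 256), but quantized in 4 dims
```

So although the outputs come back at the original dimension, all the information has to pass through the narrow projected space, which is what I suspect limits capacity on the full dataset.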

Has anyone else experienced similar issues with quantization in autoencoders, and if so, how did you address them?
