
About specific details of whitening the pretrained extractor #30

Closed
mcl0 opened this issue Aug 29, 2024 · 3 comments


mcl0 commented Aug 29, 2024

Hi, thank you for your amazing work!
I'm still a little confused about how the whitened extractor checkpoint is obtained. From the appendix of the paper, I gather that the linear layer used for whitening should be appended after the last layer of the original extractor, but I'm at a loss as to how to implement that. I would really appreciate it if you could help me write a script that converts the provided dec_48b.pth into dec_48b_whit.torchscript.pt.

Thanks a lot!


asdcaszc commented Sep 2, 2024

Do you know how to extract dec_48b.pth from the checkpoint.pth of a model I trained myself with 'hidden'?

pierrefdz (Contributor) commented

Hi,

@mcl0 The code for the whitening layer is in https://github.com/facebookresearch/stable_signature/blob/main/finetune_ldm_decoder.py#L112
If the ckpt is not yet whitened, that code creates a whitened ckpt and saves it.
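
In short, the whitening step amounts to the following (a minimal sketch, not the repo's exact code; msg_decoder, loader, the ZCA-style formulation, and the sqrt(nbit) scaling are placeholders/assumptions): run the extractor over a calibration set of images, estimate a whitening transform of its outputs, fold that transform into an nn.Linear appended after the extractor, and export the result with torch.jit:

import torch
import torch.nn as nn

@torch.no_grad()
def append_whitening(msg_decoder: nn.Module, loader, device="cpu") -> nn.Module:
    # Collect raw extractor outputs over a calibration set of images.
    ys = []
    for imgs in loader:
        ys.append(msg_decoder(imgs.to(device)).cpu())
    ys = torch.cat(ys, dim=0)            # shape (N, nbit)
    nbit = ys.shape[1]

    # Estimate the mean and a ZCA-style whitening matrix from the covariance.
    mean = ys.mean(dim=0, keepdim=True)
    centered = ys - mean
    cov = centered.T @ centered
    e, v = torch.linalg.eigh(cov)        # cov = v @ diag(e) @ v.T
    L = v @ torch.diag(e.clamp(min=1e-12).rsqrt()) @ v.T

    # Fold the whitening into a linear layer: y -> sqrt(nbit) * L @ (y - mean).
    # L is symmetric, so it can be used directly as the nn.Linear weight.
    linear = nn.Linear(nbit, nbit, bias=True)
    linear.weight.data = (nbit ** 0.5) * L
    linear.bias.data = -(nbit ** 0.5) * (L @ mean.squeeze(0))

    # Append it after the extractor; the result can then be scripted, e.g.
    #   torch.jit.save(torch.jit.script(whitened), "dec_48b_whit.torchscript.pt")
    return nn.Sequential(msg_decoder, linear.to(device))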

@asdcaszc The ckpt that you obtain after training with the hidden code will also contain the optimizer states. I think you have to load the state dict and then take the "encoder_decoder" key:

import torch

ckpt_path = "checkpoint.pth"  # checkpoint produced by hidden's training
ckpt = torch.load(ckpt_path, map_location="cpu")  # full dict, incl. optimizer states
ckpt = ckpt["encoder_decoder"]  # keep only the model weights
torch.save(ckpt, "new_model.pth")
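
(One caveat, as an assumption about the hidden setup rather than something confirmed in this thread: if training used DataParallel, the keys under "encoder_decoder" may carry a "module." prefix that you would need to strip before loading the state dict into a bare model.)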

mcl0 (Author) commented Sep 4, 2024

@pierrefdz
Thanks a lot for your reply! 😄 I'm sorry I missed that code earlier.

mcl0 closed this as completed on Sep 4, 2024