
About specific details of whitening the pretrained extractor #30

Closed
mcl0 opened this issue Aug 29, 2024 · 3 comments


mcl0 commented Aug 29, 2024

Hi, thank you for your amazing work!
I'm still a little confused about how the whitened extractor checkpoint is obtained. From the appendix of the paper, I gather that the linear layer used for whitening should be appended after the last layer of the original extractor, but I'm at a loss as to how to implement that. I would really appreciate it if you could help me write a script that converts the provided dec_48b.pth into dec_48b_whit.torchscript.pt.

Thanks a lot!


asdcaszc commented Sep 2, 2024

Do you know how to extract dec_48b.pth from the checkpoint.pth of a model I trained myself with 'hidden'?

pierrefdz (Contributor) commented

Hi,

@mcl0 The code for the whitening layer is in https://github.com/facebookresearch/stable_signature/blob/main/finetune_ldm_decoder.py#L112
If the ckpt is not yet whitened, that code creates a whitened ckpt and saves it.
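
In short, the whitening step amounts to the following (a minimal sketch, not the repo's exact code; msg_decoder, loader, the ZCA-style formulation, and the sqrt(nbit) scaling are placeholders/assumptions): run the extractor over a calibration set of images, estimate a whitening transform of its outputs, fold that transform into an nn.Linear appended after the extractor, and export the result with torch.jit:

import torch
import torch.nn as nn

@torch.no_grad()
def append_whitening(msg_decoder: nn.Module, loader, device="cpu") -> nn.Module:
    # Collect raw extractor outputs over a calibration set of images.
    ys = []
    for imgs in loader:
        ys.append(msg_decoder(imgs.to(device)).cpu())
    ys = torch.cat(ys, dim=0)            # shape (N, nbit)
    nbit = ys.shape[1]

    # Estimate the mean and a ZCA-style whitening matrix from the covariance.
    mean = ys.mean(dim=0, keepdim=True)
    centered = ys - mean
    cov = centered.T @ centered
    e, v = torch.linalg.eigh(cov)        # cov = v @ diag(e) @ v.T
    L = v @ torch.diag(e.clamp(min=1e-12).rsqrt()) @ v.T

    # Fold the whitening into a linear layer: y -> sqrt(nbit) * L @ (y - mean).
    # L is symmetric, so it can be used directly as the nn.Linear weight.
    linear = nn.Linear(nbit, nbit, bias=True)
    linear.weight.data = (nbit ** 0.5) * L
    linear.bias.data = -(nbit ** 0.5) * (L @ mean.squeeze(0))

    # Append it after the extractor; the result can then be scripted, e.g.
    #   torch.jit.save(torch.jit.script(whitened), "dec_48b_whit.torchscript.pt")
    return nn.Sequential(msg_decoder, linear.to(device))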

@asdcaszc The ckpt that you obtain after training with the hidden code will also contain the optimizer states. I think you have to load the state dict and then take the "encoder_decoder" key:

import torch

ckpt_path = "checkpoint.pth"  # checkpoint produced by hidden's training
ckpt = torch.load(ckpt_path, map_location="cpu")  # full dict, incl. optimizer states
ckpt = ckpt["encoder_decoder"]  # keep only the model weights
torch.save(ckpt, "new_model.pth")
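
(One caveat, as an assumption about the hidden setup rather than something confirmed in this thread: if training used DataParallel, the keys under "encoder_decoder" may carry a "module." prefix that you would need to strip before loading the state dict into a bare model.)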

mcl0 (Author) commented Sep 4, 2024

@pierrefdz
Thanks a lot for your reply! 😄 I'm sorry I missed that code earlier.

mcl0 closed this as completed on Sep 4, 2024