Hi, thanks for sharing the code and the paper!
I'm trying to reproduce the results and noticed a small mismatch between the paper (supplementary) and the code.
In the supplementary:
(p.2 bottom, Section 1.2 "Implementation Details of Report Generation")
In the first stage, following [8], we set the learning rate for the image encoder at 5e−4 and for the encoder projection layer and text decoder at 5e−5.
- encoder learning rate = 5e-4
- decoder learning rate = 5e-5
But in the code, the defaults seem swapped:
parser.add_argument("--encoder_lr", type=float, default=5e-5)
parser.add_argument("--decoder_lr", type=float, default=5e-4)
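For reference, here is a minimal sketch of how the supplementary values could be passed explicitly instead of relying on the defaults. The `build_parser` helper is hypothetical and only mirrors the two arguments quoted above; the actual training script presumably defines many more.

```python
import argparse


def build_parser():
    # Hypothetical helper: reproduces only the two arguments quoted above.
    parser = argparse.ArgumentParser()
    # Defaults as they currently appear in the code.
    parser.add_argument("--encoder_lr", type=float, default=5e-5)
    parser.add_argument("--decoder_lr", type=float, default=5e-4)
    return parser


# Override the defaults with the values stated in the supplementary
# (encoder 5e-4, decoder 5e-5), as one would on the command line.
args = build_parser().parse_args(["--encoder_lr", "5e-4", "--decoder_lr", "5e-5"])
print(args.encoder_lr, args.decoder_lr)
```

If the paper values are the intended ones, passing both flags this way should reproduce them without changing the code.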
Could you please confirm which setting was used for the main experiments?🥹
Thank you!