Hi, I've been trying to run this code, but I couldn't find details in the paper about the hyperparameter settings, such as the initial learning rate, or about the expected final training and validation losses. Are there any suggested hyperparameter settings, and what loss curves and final accuracy should I expect? Thanks!