
Transformer training loss vs. validation loss #47

Closed
wjc2830 opened this issue Jan 14, 2025 · 2 comments

Comments


wjc2830 commented Jan 14, 2025

Hey there! Thanks a lot for your contribution to this awesome open-source project. I ran into a bit of a snag while training the transformer model and thought I'd reach out for some advice.

Specifically, I noticed that while the training loss decreases as expected, the validation loss starts to increase quite early, around epoch 2 or 3. Is this normal? I also noticed that in the checkpoint-saving callback, the criterion is set to train/loss (a minimal sketch of what I mean follows below).

Just wanted to check if this is expected behavior or if I should be looking into it further. Thanks!
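For reference, here is a minimal sketch of the checkpoint setup I'm describing, assuming a PyTorch Lightning-style ModelCheckpoint; the actual callback name and metric keys in this repo may differ:

```python
# Hypothetical illustration only; the repo's actual configuration may differ.
from pytorch_lightning.callbacks import ModelCheckpoint

# The callback selects checkpoints by the *training* loss...
checkpoint_callback = ModelCheckpoint(
    monitor="train/loss",
    mode="min",
    save_top_k=1,
)
# ...whereas I would normally expect monitor="val/loss" when the
# validation loss is meant to drive model selection.
```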

@RobertLuo1
Collaborator

Hi, thanks for your attention to our work. Indeed, we have also noticed this phenomenon. We suspect that the class-conditional dropout mechanism causes the inconsistency between the training and validation losses (a rough sketch of the mechanism is below). Metrics such as FID correlate more closely with the training loss.
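For illustration, here is a rough sketch of the class-conditional (classifier-free-guidance-style) label dropout we mean; the function name and drop probability are illustrative, not the exact code in this repo:

```python
import torch

def maybe_drop_labels(labels: torch.Tensor,
                      num_classes: int,
                      drop_prob: float = 0.1,
                      training: bool = True) -> torch.Tensor:
    """Randomly replace class labels with a reserved "null" index during training.

    Because this dropout is active only in training, the training loss is
    averaged over a mix of conditional and unconditional predictions, while
    the validation loss is computed fully conditioned, so the two numbers
    are not directly comparable.
    """
    if not training:
        return labels
    drop = torch.rand(labels.shape[0], device=labels.device) < drop_prob
    null_label = num_classes  # extra embedding index reserved for "no class"
    return torch.where(drop, torch.full_like(labels, null_label), labels)
```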

@wjc2830
Author

wjc2830 commented Jan 21, 2025

Got it, thanks for the reply!
