
Transformer training loss vs. validation loss #47

Closed
wjc2830 opened this issue Jan 14, 2025 · 2 comments

Comments


wjc2830 commented Jan 14, 2025

Hey there! Thanks a lot for your contribution to this awesome open-source project. I ran into a bit of a snag while training the transformer model and thought I'd reach out for some advice.

Specifically, I noticed that while the training loss decreases as expected, the validation loss starts to increase quite early, around epoch 2 or 3. Is this normal? I also noticed that in the checkpoint-saving callback, the criterion is set to train/loss (a minimal sketch of what I mean follows below).

Just wanted to check if this is expected behavior or if I should be looking into it further. Thanks!
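For reference, here is a minimal sketch of the checkpoint setup I'm describing, assuming a PyTorch Lightning-style ModelCheckpoint; the actual callback name and metric keys in this repo may differ:

```python
# Hypothetical illustration only; the repo's actual configuration may differ.
from pytorch_lightning.callbacks import ModelCheckpoint

# The callback selects checkpoints by the *training* loss...
checkpoint_callback = ModelCheckpoint(
    monitor="train/loss",
    mode="min",
    save_top_k=1,
)
# ...whereas I would normally expect monitor="val/loss" when the
# validation loss is meant to drive model selection.
```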

@RobertLuo1
Collaborator

Hi, thanks for your attention to our work. Indeed, we have also noticed this phenomenon. We suspect that the class-conditional dropout mechanism causes the inconsistency between the training and validation losses (a rough sketch of the mechanism is below). Metrics such as FID correlate more closely with the training loss.
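For illustration, here is a rough sketch of the class-conditional (classifier-free-guidance-style) label dropout we mean; the function name and drop probability are illustrative, not the exact code in this repo:

```python
import torch

def maybe_drop_labels(labels: torch.Tensor,
                      num_classes: int,
                      drop_prob: float = 0.1,
                      training: bool = True) -> torch.Tensor:
    """Randomly replace class labels with a reserved "null" index during training.

    Because this dropout is active only in training, the training loss is
    averaged over a mix of conditional and unconditional predictions, while
    the validation loss is computed fully conditioned, so the two numbers
    are not directly comparable.
    """
    if not training:
        return labels
    drop = torch.rand(labels.shape[0], device=labels.device) < drop_prob
    null_label = num_classes  # extra embedding index reserved for "no class"
    return torch.where(drop, torch.full_like(labels, null_label), labels)
```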

@wjc2830
Author

wjc2830 commented Jan 21, 2025

Got it, thanks for the reply!
