Use more sophisticated checkpoint naming scheme #3

Open
pgagarinov opened this issue Feb 1, 2021 · 0 comments
Assignees
Labels
enhancement New feature or request pytorch-hyperlight

Comments

@pgagarinov
Owner

The default checkpoint file naming scheme uses only the epoch number and the step number as keys to distinguish checkpoint file names across epochs and steps. This scheme is not sufficient when many checkpoints are created within the same notebook for different models (or for the same model with different hyperparameters). We should use an adaptive naming scheme that accounts for

  • (optionally) Jupyter notebook name
  • Model class name or/and experiment id
  • run id (needed when the same model is trained multiple times)
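
As a rough illustration of the keys listed above, a file name could be assembled like this. It is only a sketch: `make_checkpoint_name`, its parameters, the `-` separator and the `.ckpt` suffix are all hypothetical choices, not an existing pytorch-hyperlight API.

```python
from typing import Optional


def make_checkpoint_name(
    model_name: str,
    run_id: str,
    epoch: int,
    step: int,
    experiment_id: Optional[str] = None,
    notebook_name: Optional[str] = None,
) -> str:
    """Build a checkpoint file name from the proposed keys (hypothetical helper)."""
    parts = []
    # The notebook name is optional, so it is only included when given.
    if notebook_name:
        parts.append(notebook_name)
    parts.append(model_name)
    # The experiment id can be used alongside the model class name.
    if experiment_id:
        parts.append(experiment_id)
    # The run id separates repeated trainings of the same model.
    parts.append(f"run={run_id}")
    # Epoch and step are the keys the default scheme already uses.
    parts.append(f"epoch={epoch}")
    parts.append(f"step={step}")
    return "-".join(parts) + ".ckpt"


name = make_checkpoint_name(
    "LitClassifier", "0003", epoch=4, step=1200, notebook_name="mnist_demo"
)
print(name)  # mnist_demo-LitClassifier-run=0003-epoch=4-step=1200.ckpt
```

With names built this way, two models trained in the same notebook, or two runs of the same model, no longer collide on identical epoch/step pairs.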

We should also incorporate a warning mechanism that alerts the user when the checkpoint directory grows too much because it contains too many outdated checkpoints.
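
One possible shape for such a warning, again only as a sketch: the function name `warn_if_checkpoint_dir_too_large`, the `*.ckpt` pattern and both default thresholds are assumptions made for illustration, and the check does not try to decide which checkpoints are outdated.

```python
import warnings
from pathlib import Path


def warn_if_checkpoint_dir_too_large(
    checkpoint_dir: str,
    max_files: int = 50,            # assumed threshold on the number of files
    max_bytes: int = 5 * 1024**3,   # assumed threshold on total size (5 GiB)
) -> bool:
    """Warn when the checkpoint directory exceeds either threshold.

    Returns True if a warning was issued (hypothetical helper).
    """
    # Collect every checkpoint file under the directory, including subfolders.
    ckpt_files = [p for p in Path(checkpoint_dir).rglob("*.ckpt") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in ckpt_files)

    too_large = len(ckpt_files) > max_files or total_bytes > max_bytes
    if too_large:
        warnings.warn(
            f"Checkpoint directory '{checkpoint_dir}' holds {len(ckpt_files)} "
            f"files ({total_bytes / 1024**2:.1f} MiB); consider deleting "
            "outdated checkpoints.",
            UserWarning,
        )
    return too_large
```

A check like this could run once before training starts, so the user sees the warning before new checkpoints are added.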

@pgagarinov pgagarinov added the enhancement New feature or request label Feb 1, 2021
@pgagarinov pgagarinov self-assigned this Feb 1, 2021
@pgagarinov pgagarinov changed the title from "Use checkpoint names unique for each trial within the jupyter notebook" to "Use more sophisticated checkpoint naming scheme" Feb 1, 2021