Use more sophisticated checkpoint naming scheme #3

pgagarinov · 2021-02-01T16:48:51Z

The default checkpoint file naming scheme uses only the epoch number and step number as keys to distinguish checkpoint file names across epochs/steps. Such a naming scheme is not sufficient when many checkpoints are created within the same notebook for different models (or for the same models with different hyperparameters). We should use an adaptive naming scheme that accounts for:

(optionally) Jupyter notebook name
Model class name and/or experiment ID
Run ID (needed when the same model is trained multiple times)
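A minimal sketch of what such a naming scheme could look like, using only the Python standard library. The function name `make_checkpoint_name`, its parameters, and the `.ckpt` suffix are hypothetical and chosen for illustration; this is not the existing pytorch-hyperlight API, and the notebook name is assumed to be passed in by the caller rather than detected automatically.

```python
import re
import uuid
from typing import Optional


def _slug(text: str) -> str:
    """Replace characters that are awkward in file names with underscores."""
    return re.sub(r"[^A-Za-z0-9_.]+", "_", text).strip("_")


def make_checkpoint_name(
    model: object,
    epoch: int,
    step: int,
    experiment_id: Optional[str] = None,
    run_id: Optional[str] = None,
    notebook_name: Optional[str] = None,
) -> str:
    """Build a checkpoint file name from the keys listed in this issue.

    Hypothetical helper: the key order and separators are assumptions.
    """
    if run_id is None:
        # Assumption: a short random id is enough to tell repeated runs apart.
        run_id = uuid.uuid4().hex[:8]
    parts = [notebook_name, type(model).__name__, experiment_id, run_id]
    slugs = [_slug(p) for p in parts if p]
    prefix = "-".join(s for s in slugs if s)
    return f"{prefix}-epoch={epoch}-step={step}.ckpt"
```

Because hyphens inside individual keys are replaced with underscores, the `-` separator between keys stays unambiguous, which should make the names easier to parse back into their components later.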

We should also incorporate a warning mechanism that alerts the user when the checkpoint directory grows too large because it contains too many outdated checkpoints.
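The warning could be sketched roughly as follows, again with the standard library only. The function name `check_checkpoint_dir`, the `*.ckpt` pattern, and the default thresholds are assumptions made for this example, not values proposed anywhere in this issue.

```python
import warnings
from pathlib import Path


def check_checkpoint_dir(
    ckpt_dir,
    max_files: int = 20,
    max_bytes: int = 5 * 1024**3,
    pattern: str = "*.ckpt",
) -> bool:
    """Warn if the checkpoint directory holds too many or too large files.

    Hypothetical helper: returns True when a warning was issued.
    """
    # Collect checkpoint files recursively, so per-run subfolders are included.
    files = [p for p in Path(ckpt_dir).rglob(pattern) if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    too_large = len(files) > max_files or total_bytes > max_bytes
    if too_large:
        warnings.warn(
            f"Checkpoint directory {ckpt_dir} contains {len(files)} files "
            f"({total_bytes / 1024**2:.1f} MiB); consider removing outdated "
            f"checkpoints.",
            UserWarning,
        )
    return too_large
```

Such a check could be called once before training starts and again after each checkpoint is saved. It only warns and never deletes anything, so the decision about which checkpoints are outdated stays with the user.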

pgagarinov added the enhancement New feature or request label Feb 1, 2021

pgagarinov self-assigned this Feb 1, 2021

pgagarinov changed the title ~~Use checkpoint names unique for each trial within the jupyter notebook~~ Use more sophisticated checkpoint naming scheme Feb 1, 2021

pgagarinov added the pytorch-hyperlight label Feb 2, 2021
