Description
Enter the chapter number
Chapter 11. Training Deep Neural Networks
Enter the page number
No response
What is the cell's number in the notebook
Cells 105-106 in 11_training_deep_neural_networks.ipynb
Enter the environment you are using to run the notebook
Jupyter on macOS
Describe your issue
The last step of Exercise 8 reads:
Step 7: Retrain your model using 1cycle scheduling and see if it improves training speed and model accuracy.
Solution code in cells 105 and 106:

```python
n_epochs = 60
optimizer = torch.optim.NAdam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, epochs=n_epochs, steps_per_epoch=len(train_loader), max_lr=1e-2)
criterion = nn.CrossEntropyLoss()
accuracy = torchmetrics.Accuracy(task="multiclass", num_classes=10).to(device)
history = train_with_early_stopping(model, optimizer, criterion, accuracy,
                                    train_loader, valid_loader, n_epochs,
                                    patience=20, scheduler=scheduler)
```

The `train_with_early_stopping()` function (defined earlier in the notebook) calls `scheduler.step()` at the end of every epoch. This seems to work; however, the PyTorch documentation for `OneCycleLR` says `step()` should be called after every batch, not after every epoch.
Also, with the NAdam optimizer, validation accuracy consistently dropped to ~0.10 (chance level for 10 classes) after a few epochs and training diverged, although this might be an issue with my environment. I resolved this by replacing NAdam with the SGD optimizer.
Enter what you expected to happen
No response
If you found a workaround, describe it here
No response