Open
Labels
enhancement (New feature or request)
Description
If some weights in the network become NaN at some point during training, I'd like the training to stop with an error.
Currently, when training on a GPU, no error is raised: training simply continues and usually gives poor results. The problem only surfaces later, when the model is executed on the CPU and a "Floating-point invalid operation" error is thrown at some point.
Perhaps such a check could be an optional (on by default) step after each parameter update.
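
To make the request concrete, here is a rough sketch of what such a check could look like. It is not based on this library's API. It is plain Python with a toy SGD loop, and every name in it (`check_finite`, `NonFiniteWeightError`, `train`, the `check_weights` flag) is hypothetical and only meant to illustrate the idea of an optional, on-by-default check after each parameter update.

```python
import math


class NonFiniteWeightError(FloatingPointError):
    """Raised when a parameter becomes NaN or infinite (hypothetical name)."""


def check_finite(params, step):
    """Raise if any value in `params` (name -> list of floats) is NaN or inf."""
    for name, values in params.items():
        for i, v in enumerate(values):
            if not math.isfinite(v):
                raise NonFiniteWeightError(
                    f"step {step}: parameter {name}[{i}] is {v}"
                )


def sgd_step(params, grads, lr):
    """Plain SGD update in place: w <- w - lr * g."""
    for name, values in params.items():
        for i, g in enumerate(grads[name]):
            values[i] -= lr * g


def train(params, grad_fn, steps, lr=0.1, check_weights=True):
    """Toy training loop; the check runs after every update, on by default."""
    for step in range(steps):
        sgd_step(params, grad_fn(params), lr)
        if check_weights:
            # Stop at the first update that produces a NaN or inf weight,
            # instead of silently continuing to train.
            check_finite(params, step)
    return params
```

With something like this, training would fail at the step where the weights first go bad, rather than much later at inference time. On a GPU the check would presumably be a reduction over the weight buffers, so making it optional leaves room to turn it off if the overhead matters.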