How to read the training curve of a neural network?

Modified on Mon, 20 Feb, 2023 at 4:02 PM

A neural network has many parameters (structure, number of epochs, ...) that can be tuned to obtain the most accurate model. Each time a neural network is trained, the error curve shows, at each epoch (one full pass of the model through all the training data), the current error (MSE) of the model when predicting the results for two sets of data: the data used for training, and an unseen set of data (the validation set).
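
To make the two curves concrete, here is a minimal sketch of how a training error and a validation error can be recorded at each epoch. It is not the platform's implementation: the model is a simple linear fit trained by stochastic gradient descent rather than a neural network, and the names `mse`, `train`, `train_set` and `val_set`, the learning rate and the synthetic data are all assumptions chosen for illustration.

```python
import random

def mse(w, b, data):
    """Mean squared error of the linear model y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(train_set, val_set, epochs=50, lr=0.05):
    """Fit y = w*x + b by stochastic gradient descent.

    Returns two lists with one MSE value per epoch:
    the training curve and the validation curve.
    """
    w, b = 0.0, 0.0
    train_curve, val_curve = [], []
    for _ in range(epochs):
        # One epoch = one full pass over the training data.
        for x, y in train_set:
            err = w * x + b - y
            w -= lr * 2 * err * x
            b -= lr * 2 * err
        # The validation set is only evaluated, never used for updates.
        train_curve.append(mse(w, b, train_set))
        val_curve.append(mse(w, b, val_set))
    return train_curve, val_curve

# Synthetic data (assumption): y = 3x + 1 plus a little noise.
rng = random.Random(0)
points = [(i / 20, 3 * (i / 20) + 1 + rng.gauss(0, 0.1)) for i in range(40)]
rng.shuffle(points)
train_set, val_set = points[:30], points[30:]

train_curve, val_curve = train(train_set, val_set)
```

Plotting `train_curve` and `val_curve` against the epoch index gives the kind of error curve discussed in this article.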

As a general rule, both errors should decrease and converge to a value as small as possible. The shape of the curve can also be a warning that there is an issue with the training of the neural network. Here are some typical error curves, along with a possible interpretation and a suggestion on how to change the model:


(a) The error has not converged yet, which suggests that more epochs will provide better results.

(b) The error converged at a relatively high value. A bigger network might help (more neurons/layers).

(c) The train and validation errors are getting separated: the model overfits. A smaller/simpler network will help.

(d) The model converges at small values. This is probably the best of the four models (but could still be improved).
Examples of typical error curves when training a neural network.
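
The four cases (a) to (d) can also be approximated with a few simple checks on the recorded curves. The sketch below is only a rough heuristic and not something the platform does: the function name `diagnose`, the `target_error`, `window` and `tol` parameters, and the default values are all assumptions that would need adjusting for real data.

```python
def diagnose(train_curve, val_curve, target_error, window=5, tol=0.05):
    """Return 'a', 'b', 'c' or 'd' for a pair of per-epoch error curves.

    target_error: error considered "small enough" (assumption, problem-specific).
    window:       number of final epochs used to judge convergence.
    tol:          relative change below which the curve is treated as flat.
    """
    # Epoch at which the validation error was lowest.
    best_epoch = min(range(len(val_curve)), key=val_curve.__getitem__)

    # (c) Validation error has risen again while training error kept falling.
    val_rebound = val_curve[-1] > val_curve[best_epoch] * (1 + tol)
    train_still_falling = train_curve[-1] < train_curve[best_epoch]
    if val_rebound and train_still_falling:
        return "c"

    # (a) Validation error is still dropping noticeably over the last epochs.
    start = val_curve[-window]
    recent_drop = (start - val_curve[-1]) / start if start > 0 else 0.0
    if recent_drop > tol:
        return "a"

    # (b) Flat, but at a value above the target.
    if val_curve[-1] > target_error:
        return "b"

    # (d) Flat and small.
    return "d"

# Hand-made example curves (assumptions, for illustration only).
still_falling = [1 / (e + 1) for e in range(10)]
plateau_high = [1.0, 0.6, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
overfit_train = [1.0, 0.5, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
overfit_val = [1.0, 0.6, 0.4, 0.35, 0.4, 0.5, 0.6, 0.7]
good = [1.0, 0.3, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02]

results = [
    diagnose(still_falling, still_falling, target_error=0.05),
    diagnose(plateau_high, plateau_high, target_error=0.05),
    diagnose(overfit_train, overfit_val, target_error=0.05),
    diagnose(good, good, target_error=0.05),
]
```

The overfitting check comes first because a validation curve that is rising again would otherwise be read as "converged". Looking at the plot remains the more reliable way to judge a curve.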

You can go to the advanced options of the neural network and activate the hyperparameter optimisation, in which the platform automatically looks for the best model by trying and evaluating models with different architectures (see here).
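
The general idea of trying and evaluating several architectures can be sketched as a small grid search. This is not the platform's actual optimisation procedure, which is not described here: `search_architectures` and `toy_evaluate` are hypothetical names, and `toy_evaluate` is a made-up stand-in for "train a network with this architecture and return its validation MSE".

```python
from itertools import product

def search_architectures(evaluate, layer_options, neuron_options):
    """Try every (layers, neurons) combination and keep the lowest error.

    evaluate: callable taking (n_layers, n_neurons) and returning a
              validation error; supplied by the caller (assumption).
    """
    best_config, best_error = None, float("inf")
    for n_layers, n_neurons in product(layer_options, neuron_options):
        error = evaluate(n_layers, n_neurons)
        if error < best_error:
            best_config, best_error = (n_layers, n_neurons), error
    return best_config, best_error

def toy_evaluate(n_layers, n_neurons):
    # Made-up error surface with its minimum at 2 layers of 16 neurons.
    # A real version would train a model and measure its validation MSE.
    return (n_layers - 2) ** 2 + (n_neurons - 16) ** 2 / 100

best_config, best_error = search_architectures(
    toy_evaluate, layer_options=[1, 2, 3], neuron_options=[8, 16, 32]
)
```

Candidates are compared on the validation error rather than the training error, so that an overfitting model as in case (c) is not selected.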