How to read a learning curve?

Modified on Tue, 10 Oct, 2023 at 10:47 AM

Learning curves show how a model's performance evolves with the amount of data used to train it. They can be used, for instance, to determine whether more data would improve the model. They also help you diagnose whether the model is a good fit, underfitting, or overfitting.

What is a learning curve?

To produce a learning curve, several models are trained, each using a different proportion of the available training data. For each proportion, a performance metric (mean squared error, in our case) is computed. To reduce variability in the results, it is common to train multiple models on the same proportion of data and average their scores (see the curve on test data on the left).

In the figure shown below, for instance, you can see that training the model with more data improves its performance: the mean squared error (MSE) keeps decreasing on the test data. Point 1 shows the MSE when the model is trained on 50% of the available data, and point 2 shows the MSE when it is trained on 100% of the available data.
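The procedure above can be sketched with scikit-learn's `learning_curve` helper, which trains a model on increasing proportions of the data and cross-validates each one. The synthetic quadratic dataset below is an assumption for illustration; any regression dataset would do.

```python
# Sketch: compute a learning curve with scikit-learn, averaging the MSE
# over cross-validation folds for each training-set proportion.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

# Synthetic data (assumed for the example): quadratic trend plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=200)

sizes, train_scores, test_scores = learning_curve(
    LinearRegression(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),   # 10% .. 100% of the training data
    cv=5,                                   # 5 models per proportion, averaged
    scoring="neg_mean_squared_error",       # sklearn maximizes, hence "neg_"
)
# Flip the sign back to plain MSE and average over the folds.
train_mse = -train_scores.mean(axis=1)
test_mse = -test_scores.mean(axis=1)
print(dict(zip(sizes, test_mse)))
```

Plotting `train_mse` and `test_mse` against `sizes` gives the curve discussed here; the value at the last size corresponds to training on 100% of the available data.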

How to read a learning curve?

In the examples below, we trained three models (linear regression, polynomial regression, and decision tree regression) on the same data and plotted the learning curve for each model:

Example for an underfitting model: Linear regression
  • The trend in the data is captured in neither the train set nor the test set. The error is very high for both sets (small gap between the two curves).
  • Bringing in more data won’t help. You need to use a more complex model to capture the trends in the data.
Example for a suitable model: Polynomial regression
  • The trend in the data is captured well for both sets. The train set error is low and the test set error is similar (small gap between the two curves, or a gap that keeps shrinking).
  • Bringing in more data can help if the curves have not fully converged.
Example for an overfitting model: Decision Tree Regression
  • The train set is captured very well, but the test set is not. The train set error is low while the test set error is high (large gap between the two curves).
  • Use a less complex model to reduce overfitting. If you are sure you are using the right model, bringing in more data also helps reduce overfitting.
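The three gap patterns above can be reproduced in a few lines. This is a sketch on assumed synthetic quadratic data: an unpruned decision tree memorizes the train set (large gap), a plain linear model misses the trend (high error on both sets), and a degree-2 polynomial fits well (small gap).

```python
# Sketch: compare the train/test MSE gap for the three model families
# discussed above, on the same synthetic quadratic dataset (assumed).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "linear (underfits)": LinearRegression(),
    "poly deg=2 (good fit)": make_pipeline(PolynomialFeatures(2), LinearRegression()),
    "tree (overfits)": DecisionTreeRegressor(random_state=0),  # unpruned
}
results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    train_mse = mean_squared_error(y_tr, model.predict(X_tr))
    test_mse = mean_squared_error(y_te, model.predict(X_te))
    results[name] = (train_mse, test_mse)
    print(f"{name}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

Reading the printed numbers against the bullet points: the linear model has a high error on both sets, the polynomial has low and similar errors, and the unpruned tree has a near-zero train error with a much higher test error.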

Read here to find out what “less complex” or “more complex” means in practice.
