Model complexity

The article How to read a Learning curve? advises reducing model complexity in case of overfitting and increasing it in case of underfitting. Any of the models in the platform can be made more or less complex.

If a model needs to become less complex, that means the model has to become “smaller”: fewer coefficients or fewer model parts that have to be fitted to the data.

For a Polynomial Regression model that is quite straightforward: reduce the max degree of the polynomial (overall or for some of the inputs).
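As a rough illustration, here is a minimal scikit-learn sketch (an assumption about the underlying library; the platform may expose the degree setting differently):

```python
# A minimal sketch using scikit-learn (an assumption; the platform's own
# polynomial-degree setting may look different).
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# degree=3 would be more complex; dropping to degree=2 shrinks the model.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
```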

For a Neural Network, “less complex” means reducing the model size: a smaller number of hidden layers and/or a smaller size of each layer. Increasing the dropout rate also makes the model less complex in some sense.
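A minimal Keras sketch of both knobs (an assumption; the platform may configure networks differently):

```python
# Fewer/smaller hidden layers and a higher dropout rate both reduce
# effective model complexity. The input size of 10 is hypothetical.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(10,)),             # 10 input features (hypothetical)
    layers.Dense(32, activation="relu"),  # one small hidden layer
    layers.Dropout(0.3),                  # raising this rate regularizes further
    layers.Dense(1),                      # single regression output
])
model.compile(optimizer="adam", loss="mse")
```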

For a Random Forest, that means the decision trees within the forest must not become too big. To achieve this, the max depth of the trees can be reduced, or the minimum number of split and/or leaf samples increased. Always prefer a Random Forest over a single Decision Tree.
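For example, in a scikit-learn style setup (again an assumption about the underlying library):

```python
# Shallower trees and larger split/leaf minimums keep each tree small.
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(
    n_estimators=100,
    max_depth=5,           # cap tree depth
    min_samples_split=10,  # require more samples before splitting a node
    min_samples_leaf=5,    # require more samples in each leaf
)
```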

Less complex models in general are:

  • Linear Regression (there is not much you can do to increase or decrease model complexity here)
  • Nearest Neighbor Regression
  • Support Vector Regression

Using these models can help, but in many cases they may be too simple. It is often better to use one of the more complex model types and restrict its complexity via the hyperparameters mentioned above.

If you need to increase model complexity, all of the above suggestions also apply, but in the opposite direction.

