Neural network uncertainty vs. model error

Within the neural network model, you have the option to enable uncertainty as an additional output of your neural network. A neural network has many parameters (structure, number of epochs, dropout fraction, …) that can be tuned to obtain the most accurate model.

Dropout Fraction

The Dropout Fraction can be used to control the randomness in your network. A higher Dropout Fraction increases the randomness in your model by randomly turning off (‘dropping out’) some fraction of the neurons in each training step. A higher dropout fraction can help reduce the chance of overfitting (through a process known as ‘regularization’), but it can also mean that your model requires more training steps to learn to fit the data. Typical values are between 0 and 0.2.
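
For readers who want to see what this means in code: Monolith handles this for you, but below is a minimal sketch of a small Keras regression network with a dropout fraction of 0.2. The layer sizes and input shape are arbitrary assumptions.

    import tensorflow as tf

    # Minimal regression network; each Dropout layer randomly zeroes
    # 20 % of the previous layer's outputs during each training step.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),                    # 8 input features (assumed)
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.2),                  # dropout fraction = 0.2
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(1),                      # single predicted output
    ])
    model.compile(optimizer="adam", loss="mse")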

Monolith also uses this Dropout Fraction to provide an indication of uncertainty on the model output. This is only possible if you have used a non-zero value for the Dropout Fraction.

Calculating uncertainty

Say you imported a dataset and then used the Train Test Split function to randomly sample that dataset into:

  • Dataset A - 80 % of the data for training the model
  • Dataset B - remaining 20 % of the data for validating the model
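
Monolith's Train Test Split function performs this split for you; as a rough equivalent outside Monolith, the same 80/20 random split could be reproduced with scikit-learn (the file name below is a placeholder):

    import pandas as pd
    from sklearn.model_selection import train_test_split

    data = pd.read_csv("my_dataset.csv")    # placeholder file name
    dataset_a, dataset_b = train_test_split(
        data,
        test_size=0.2,      # Dataset B: 20 % of the data for validation
        random_state=42,    # fixed seed for a reproducible split
    )                       # Dataset A: remaining 80 % for training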

You then use Dataset A as the input to train a neural network. Whilst setting up the neural network, you enable uncertainty. As a result, training the neural network on Dataset A involves 100 repetitions. In each repetition, a fraction of the neurons is randomly turned off (‘dropped out’), so the predicted output from the neural network is slightly different every time. After 100 repetitions the network is fully trained, and the predicted output from the network for any combination of inputs is the mean value across these repetitions.

The uncertainty visualised around that mean output is ± 2 standard deviations (which means there is roughly a 95 % likelihood that the true value lies within the uncertainty range).
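
In code, this procedure (often called Monte Carlo dropout) amounts to running the trained network repeatedly with dropout left active and summarising the spread of the predictions. A minimal sketch, reusing the hypothetical Keras model from above:

    import numpy as np

    def predict_with_uncertainty(model, x, n_repetitions=100):
        # training=True keeps dropout active at prediction time, so each
        # forward pass drops out a different random subset of neurons.
        predictions = np.stack(
            [model(x, training=True).numpy() for _ in range(n_repetitions)]
        )
        mean = predictions.mean(axis=0)              # reported prediction
        uncertainty = 2 * predictions.std(axis=0)    # ± 2 standard deviations
        return mean, uncertainty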

Calculating the error

To calculate the Error between this predicted mean and the true measured value, you can use Dataset B, the 20 % of the data that was not used to train the neural network. Comparing the predicted mean value from the neural network with the known value shows how accurate the neural network is at any given datapoint.
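
Continuing the sketch above, the error at each datapoint of Dataset B is simply the difference between the predicted mean and the known value (x_test and y_test stand in for the held-out inputs and outputs):

    # Per-datapoint error on the held-out 20 % (Dataset B)
    mean, uncertainty = predict_with_uncertainty(model, x_test)
    error = np.abs(mean.ravel() - y_test)    # absolute error at each datapoint
    print("Mean absolute error:", error.mean())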

How do error and uncertainty relate to each other?

Let's assume a model was trained with uncertainty and is now evaluated on a test set. For each prediction, there are four possible options (see figure below):

  1. The model has a low error and low uncertainty: This is the ideal situation. The model is right and is sure of itself.
  2. The model has low error and high uncertainty: This is also a good situation. Although the model is not as sure (higher uncertainty), the prediction is still very good.
  3. The model has high error and high uncertainty: This is not ideal, but still relatively good. The prediction of the model is wrong, but at least the model is aware of it and calculates a high uncertainty. Therefore, the user knows not to trust the model too much in this region (maybe more data is needed).
  4. The model has high error and low uncertainty: This is not a good situation. The model makes a wrong prediction, but thinks that it's sure of itself and might convince the user to trust this wrong prediction.
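
To make the four cases concrete, each test point can be sorted into a quadrant by thresholding the two quantities. The sketch below continues the earlier example; using the medians as thresholds is an arbitrary illustrative choice:

    # Sort each test point into one of the four error/uncertainty quadrants.
    error_threshold = np.median(error)
    uncertainty_threshold = np.median(uncertainty)

    high_error = error > error_threshold
    high_uncertainty = uncertainty.ravel() > uncertainty_threshold

    ideal = ~high_error & ~high_uncertainty          # case 1: trustworthy
    cautious = ~high_error & high_uncertainty        # case 2: good but unsure
    honest_miss = high_error & high_uncertainty      # case 3: wrong but aware
    overconfident = high_error & ~high_uncertainty   # case 4: wrong and sure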

You can read more about uncertainty and error in this blog post.
