Description
Decision tree regression trains a decision tree based machine learning model.
Decision tree models represent data by constructing a tree of branched binary decisions. These branches end in leaf nodes that represent the average target value of the observations in the node.
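The "leaf nodes average the target" behaviour can be seen directly in a minimal sketch. This assumes scikit-learn (the article does not name a library) and uses a toy one-feature dataset invented for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy data: two clusters of rows with different target levels.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([1.0, 1.2, 0.8, 5.0, 5.2, 4.8])

# A depth-1 tree makes a single binary decision, leaving two leaf nodes.
tree = DecisionTreeRegressor(max_depth=1).fit(X, y)

# Each prediction is the mean target value of the training rows in that leaf.
print(tree.predict([[2.5]]))   # mean of [1.0, 1.2, 0.8] = 1.0
print(tree.predict([[11.5]]))  # mean of [5.0, 5.2, 4.8] = 5.0
```

Any input falling on the same side of the learned split receives the same leaf average, which is why single-tree predictions are piecewise constant.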
Application
Similar to other machine learning regression models, decision tree regression can be used to predict one or more “target” output values from “feature” inputs. See the last section for the advantages and disadvantages of this type of model and when it is (and is not) a good choice.
How to use
Decision tree regression requires three inputs.
| Input | Description |
| --- | --- |
| Data | The data used to train the model. |
| Inputs | The columns in the data to use as inputs (“features”) to make predictions from. |
| Outputs | The column(s) in the data to use as the output(s) of the model. These are the values the model will predict based on the input features. |
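As a sketch of how the three inputs map onto a fit, the snippet below uses scikit-learn (an assumption; the article names no library) with synthetic arrays standing in for the data, input columns, and output columns. Note that multiple output columns are supported:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical data: two input ("feature") columns.
rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 2))

# Two output ("target") columns derived from the features for illustration.
Y = np.column_stack([X.sum(axis=1), X[:, 0] - X[:, 1]])

# Data = (X, Y); Inputs = columns of X; Outputs = columns of Y.
model = DecisionTreeRegressor().fit(X, Y)

pred = model.predict(X[:5])
print(pred.shape)  # (5, 2): one prediction per row for each output column
```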
Advanced Options
| Option | Description |
| --- | --- |
| Maximum depth | Controls how deep the tree can grow, i.e. how many binary decisions (binary = boolean, yes/no, true/false) can be made in a branch. The same limit applies across all branches. Deeper trees fit the training data more closely, but are more likely to overfit. |
| Minimum split samples | The number of rows of data required to add a new split to a branch. Lower values allow branches to grow deeper and increase overfitting. |
| Minimum leaf samples | The number of rows of data required to create a leaf (final) node in a branch. Lower values allow branches to grow deeper and increase overfitting. |
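The three options above correspond to the `max_depth`, `min_samples_split`, and `min_samples_leaf` parameters in scikit-learn (assuming that library; the mapping is illustrative). A quick sketch on noisy synthetic data shows how constraining them shrinks the tree:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.uniform(size=(200, 1))
y = np.sin(6 * X[:, 0]) + rng.normal(scale=0.1, size=200)

# Unconstrained tree: grows until leaves are (nearly) pure, fitting the noise.
deep = DecisionTreeRegressor().fit(X, y)

# Constrained tree: the three options limit growth and reduce overfitting.
shallow = DecisionTreeRegressor(
    max_depth=4,            # Maximum depth
    min_samples_split=10,   # Minimum split samples
    min_samples_leaf=5,     # Minimum leaf samples
).fit(X, y)

print(deep.get_depth(), deep.get_n_leaves())        # large, overfit tree
print(shallow.get_depth(), shallow.get_n_leaves())  # small, regularised tree
```

In practice these values are usually tuned by cross-validation rather than set by hand.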
Examples
This article provides a good and vivid illustration of how a decision tree works: A visual introduction to machine learning.
Decision trees learn chains of multiple simple rules to represent data. These rules can be visualised and interpreted to understand how the model works.
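One way to inspect those rules, assuming scikit-learn, is `export_text`, which prints the learned decision chain as nested if/else conditions (toy data invented for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])

tree = DecisionTreeRegressor(max_depth=2).fit(X, y)

# Each line is one binary decision; leaves show the predicted value.
rules = export_text(tree, feature_names=["x"])
print(rules)
```

For deeper trees, `sklearn.tree.plot_tree` produces the same structure as a diagram.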
Advantages
- Fast to train and scale well with data size.
- Robust to data types, quality, and preparation methods - works well with both continuous and categorical data, and is insensitive to feature scaling or normalisation.
- Relatively easy to interpret.
- Non-parametric - meaning there’s no set underlying mathematical equation predefining the interactions between features (as there is with some other model types, such as linear regression). How features interact is learned from the data, rather than imposed by assumption.
Disadvantages
- Training can be unstable, with small variations in training data resulting in different trees being trained.
- Prone to overfitting the data.
- The predictions aren’t necessarily smooth or continuous, which may poorly represent the underlying physical process being modeled if there is not enough training data.
Most of the limitations of single decision tree models can be mitigated by combining multiple trees into a Random Forest model.
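A small sketch of that mitigation, assuming scikit-learn and synthetic data: averaging many trees trained on resampled data smooths out the noise a single unconstrained tree memorises, which typically lowers error on unseen data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 1))
y = np.sin(6 * X[:, 0]) + rng.normal(scale=0.2, size=300)  # noisy targets

tree = DecisionTreeRegressor().fit(X, y)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Evaluate against the noiseless underlying function on fresh inputs.
X_test = rng.uniform(size=(1000, 1))
y_true = np.sin(6 * X_test[:, 0])
tree_err = np.mean((tree.predict(X_test) - y_true) ** 2)
forest_err = np.mean((forest.predict(X_test) - y_true) ** 2)
print(tree_err, forest_err)  # the forest averages away single-tree overfitting
```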