Description
Polynomial Regression is a machine learning model that can be used to fit a non-linear relationship between an output variable and one or more input variables.
Application
Polynomial Regression is a useful model when the relationship between the outputs and inputs is non-linear, as it can be used to detect and model more complex patterns in the data. See the last section for advantages and disadvantages of this type of model and when to use or not to use it.
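As a rough illustration of what fitting a polynomial means, the sketch below fits a single-input polynomial by least squares in plain Python. It is not the platform's implementation; the function names `fit_polynomial` and `predict` and the sample data are assumptions made for this example only.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit of y = c0 + c1*x + ... + cd*x^d via the normal equations."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Illustrative data generated from y = 1 + 2x + 3x^2 (no noise).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x + 3.0 * x ** 2 for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
```

A straight line could not reproduce this curved relationship, whereas the degree-2 fit recovers coefficients close to 1, 2 and 3.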
How to use
- Under the Data input choose a tabular dataset to build the model on
- Under Inputs select the input features you would like to use for the model
- Under Outputs select the output features you would like to predict with this model
- The Output Polynomial Coefficients? tick box decides whether to provide the coefficients of the fitted model as part of the output. See below for more.
- Under the Name input type the name you would like to give the model being created.
- Click Apply to train the model.
Output of Polynomial Coefficients
If the option Output Polynomial Coefficients? is enabled, further fields become available to customise the output.
Option: Select format for coefficients
You can select between two different output formats:
Inline Equation (Pi Compatible) | The entire polynomial equation is printed in an info box. You can copy and paste that string into your downstream system. Depending on the format that system expects, you might need to adapt the string. |
Table (CSV Exportable) | The polynomial equation is converted into a table. The first column contains the term (the inputs and exponents), the second column contains the coefficients. Export the table with coefficients to a |
Option: Select decimal places to round coefficients to (for display purposes only)
This option controls the precision with which the coefficients are displayed. The default is 10, meaning each coefficient is printed and exported with 10 decimal places. Increase the number if you need higher precision, or decrease it if less precision is sufficient. The example screenshots above were created with a value of 4.
This option does not affect the precision of the coefficients used internally for model training and predictions! That precision is controlled by the hyperparameter Rounding in the Advanced Options, see below.
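The two output formats and the display-only rounding can be sketched as follows. This is a minimal illustration, not the platform's export code: the term notation (`*` and `^`), the CSV column names, the helper functions and the coefficient values are all assumptions made for this example.

```python
import csv
import io

# Hypothetical fitted model: (exponents per input, coefficient) for each term.
terms = [
    ({}, 1.23456789),                  # intercept
    ({"x1": 1}, -0.5),
    ({"x1": 1, "x2": 1}, 0.000123456),
    ({"x2": 2}, 2.0),
]


def term_label(exponents):
    """Render a term such as x1*x2 or x2^2; the intercept is labelled 1."""
    if not exponents:
        return "1"
    parts = []
    for name, power in sorted(exponents.items()):
        parts.append(name if power == 1 else f"{name}^{power}")
    return "*".join(parts)


def inline_equation(terms, decimals):
    """Build a single-line equation string with rounded coefficients."""
    pieces = []
    for exponents, coef in terms:
        label = term_label(exponents)
        value = f"{coef:.{decimals}f}"
        pieces.append(value if label == "1" else f"{value}*{label}")
    return "y = " + " + ".join(pieces)


def coefficient_table_csv(terms, decimals):
    """Build a two-column CSV: term, rounded coefficient."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["term", "coefficient"])
    for exponents, coef in terms:
        writer.writerow([term_label(exponents), f"{coef:.{decimals}f}"])
    return buffer.getvalue()


equation = inline_equation(terms, decimals=4)
table = coefficient_table_csv(terms, decimals=4)
```

Rounding happens only while the strings are built, so the coefficients stored in `terms` keep their full precision, which mirrors the display-only behaviour described above.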
Advanced Options Summary
The polynomial regression model has the following hyperparameters:
Model Type | The model type lets the user choose regularised variants of the polynomial regression model. This is especially important when the data contains many inputs and/or few data points. Regularised versions help prevent the model from overfitting by keeping the coefficients as small as possible. The available options are:
|
Degree of Polynomial | The Degree of Polynomial determines the maximum degree of the polynomial terms in the model. Higher degrees allow more complex relationships to be modelled but also make the model more prone to overfitting. Set this parameter to the highest degree you want to use in your polynomial. |
Max Degree of N | If you assign an input feature to one of these parameters all terms related to this feature are limited to a polynomial degree of at most N. That way the global maximum degree of the polynomial is overridden for the assigned input feature by this parameter. This parameter is available for Use this feature to restrict input features to lower degrees than the globally set Degree of Polynomial. Currently you can only restrict the maximum degree. It is not possible to increase the maximum degree for an input compared to the global setting. That is, if Degree of Polynomial is set to 3 you can restrict certain inputs to a maximum degree of 1 or 2 by means of the corresponding field. You cannot use the field Max Degree of 4 to increase one feature to a higher degree than 3; it would just be ignored. All fields Max Degree of N with |
Rounding | This parameter defines the numerical precision with which the polynomial coefficients are returned. The available options are:
The default selection is None which means the internal precision of the Python framework is used which is It is highly recommended to leave this parameter unchanged unless you have a clear requirement! |
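The interaction between Degree of Polynomial and the Max Degree of N fields can be sketched by enumerating the exponents of each candidate term. This is an illustration under stated assumptions, not the platform's term generator: it assumes that the model includes interaction terms, that the global degree limits the total degree of each term, and that a per-feature limit caps the exponent of that feature. The function `polynomial_terms` and its `max_degree` argument are hypothetical names.

```python
from itertools import product


def polynomial_terms(features, degree, max_degree=None):
    """List exponent tuples whose total degree is at most `degree`.

    `max_degree` optionally maps a feature name to a per-feature cap.
    Caps above the global degree have no effect.
    """
    max_degree = max_degree or {}
    caps = [min(degree, max_degree.get(name, degree)) for name in features]
    terms = []
    for exponents in product(*(range(cap + 1) for cap in caps)):
        if sum(exponents) <= degree:
            terms.append(exponents)
    return terms


features = ["x1", "x2"]
# Global degree 3, no restrictions: 10 terms including the intercept.
unrestricted = polynomial_terms(features, degree=3)
# Restrict x2 to degree 1: terms such as x2^2 or x1*x2^2 are dropped, 7 remain.
restricted = polynomial_terms(features, degree=3, max_degree={"x2": 1})
# A cap of 4 exceeds the global degree of 3 and therefore changes nothing.
ignored_cap = polynomial_terms(features, degree=3, max_degree={"x2": 4})
```

The term count grows quickly with the degree and the number of inputs, which is one reason higher degrees are more prone to overfitting.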
Advantages
- Powerful tool for modeling nonlinear data. It is capable of fitting more complex relationships between the inputs and outputs than linear regression. Additionally, a lot of engineering data follows physical laws that are often polynomial, which makes this family of models often suitable.
- Can handle a wide range of data, including data that is not normally distributed.
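- Can generate predictions beyond the range of the original data points, because the fitted equation can be evaluated for any input values. Extrapolated predictions should be treated with caution, as polynomial terms can grow quickly outside the training range.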
- Is explainable by returning an equation that can help to better understand the product response, but also can be easily exported and embedded outside of the platform. Most of the more complex models (e.g. Neural Network) don't have this degree of explainability.
Disadvantages
- Can be prone to overfitting, which can lead to unreliable results.
- Can be computationally expensive and may require significant computing resources for large datasets.
- May be susceptible to outliers.
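The overfitting risk can be illustrated with an extreme case: when the degree is one less than the number of data points, the least-squares fit passes through every point exactly. The sketch below evaluates that polynomial with Lagrange interpolation in plain Python. The data (a straight-line trend with small alternating noise) and the function name `lagrange_predict` are assumptions made for this example.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the unique polynomial of degree len(xs) - 1 through all points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


# Underlying trend y = x with alternating noise of +/- 0.1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x + 0.1 * (-1) ** i for i, x in enumerate(xs)]

# The degree-5 polynomial reproduces every training point.
train_error = max(abs(lagrange_predict(xs, ys, x) - y) for x, y in zip(xs, ys))

# Outside the training range the prediction is about -25.1,
# far from the underlying trend value of 7.
extrapolated = lagrange_predict(xs, ys, 7.0)
```

A perfect fit on the training data therefore says little about how the model behaves on new inputs, particularly outside the range it was trained on. A lower degree or a regularised model type reduces this effect.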