Performance Metrics for Regression Tasks

Datascience George
Jun 29, 2020

By George Bennett

A regression task is one where a machine learning model predicts a continuous variable. One example is predicting the number of people who will go to the swimming pool based on the morning temperature. The regressor is almost guaranteed not to be one hundred percent accurate, and performance metrics measure how far off its predictions are. These metrics are built on the error term, which is simply the distance between the predicted output and the actual value. In the image cited in the sources¹, the errors appear as the red lines.
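As a minimal sketch of the error term, the snippet below computes the error for each prediction with NumPy. The visitor counts and the names `y_true` and `y_pred` are made up for illustration and do not come from a real model.

```python
import numpy as np

# Hypothetical data: actual pool visitors on four days
# and the numbers a regressor predicted for those days.
y_true = np.array([120.0, 95.0, 150.0, 80.0])
y_pred = np.array([110.0, 100.0, 140.0, 85.0])

# Signed error for each prediction: predicted minus actual.
errors = y_pred - y_true

# Distance between prediction and actual value, ignoring direction.
distances = np.abs(errors)

print(errors)
print(distances)
```

Each entry of `distances` corresponds to the length of one of the error lines described above.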

The most common performance metric for regression is root mean squared error (RMSE for short). RMSE follows the formula below².

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$

Here ŷ (y hat) is the value predicted by our machine learning model, y is the known actual value, and n is the number of predictions, or rows, we have. Because the errors are squared, this metric penalizes large errors much more than small ones. That is usually desirable, but it can be a drawback if your data contains many outliers.
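A short sketch of RMSE in NumPy, assuming the standard definition (the square root of the mean of the squared errors). The function name `rmse` and the sample values are illustrative choices, not part of any library.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    squared_errors = (y_pred - y_true) ** 2
    return float(np.sqrt(np.mean(squared_errors)))

# Hypothetical actual and predicted values.
y_true = [120, 95, 150, 80]
y_pred = [110, 100, 140, 85]

# Squared errors are 100, 25, 100, 25, so the mean is 62.5.
print(rmse(y_true, y_pred))
```

The result is in the same units as the target variable, here the number of visitors.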

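Mean absolute error (MAE for short), on the other hand, is a good choice when your data has outliers. It is less useful when outliers are rare, because then a large error usually signals a real problem with the model that the metric should emphasize. The formula for MAE is below³.

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$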

The variables in this formula are the same as in the RMSE formula above. Notice that there is no squaring: each error counts in proportion to its size, which makes MAE much less sensitive to large errors.
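To make the difference in sensitivity concrete, here is a small sketch that computes both metrics on the same made-up predictions, once without and once with a single large error. The helper names `mae` and `rmse` and all data values are assumptions for illustration only.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute errors."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(diff)))

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared error."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical actual values and two sets of predictions.
y_true = [100, 100, 100, 100]
y_pred_small = [105, 95, 105, 95]      # every prediction is off by 5
y_pred_outlier = [105, 95, 105, 145]   # one prediction is off by 45

# With equal-sized errors the two metrics agree.
print(mae(y_true, y_pred_small), rmse(y_true, y_pred_small))

# With one large error, RMSE grows more than MAE.
print(mae(y_true, y_pred_outlier), rmse(y_true, y_pred_outlier))
```

In this toy case the single large error raises MAE from 5 to 15, while RMSE rises from 5 to roughly 22.9, which is the squaring effect described above.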

Sources

  1. Regression and error image: https://nextjournal.com/intelrefinery/simple-linear-regression
  2. RMSE formula image: https://towardsdatascience.com/what-does-rmse-really-mean-806b65f2e48e
  3. MAE formula image: https://www.easycalculation.com/maths-dictionary/mean_absolute_error.html

Written by Datascience George

Data scientist learning at Flat Iron School
