MSE vs. RMSE: Which Metric Should You Use?

by Tutor Aspire February 17, 2016

Regression models are used to quantify the relationship between one or more predictor variables and a response variable.

Whenever we fit a regression model, we want to understand how well the model is able to use the values of the predictor variables to predict the value of the response variable.

Two metrics we often use to quantify how well a model fits a dataset are the mean squared error (MSE) and the root mean squared error (RMSE), which are calculated as follows:

MSE: A metric that tells us the average squared difference between the predicted values and the actual values in a dataset. The lower the MSE, the better a model fits a dataset.

MSE = Σ(ŷ_i – y_i)² / n

where:

Σ is a symbol that means “sum”
ŷ_i is the predicted value for the i^th observation
y_i is the observed value for the i^th observation
n is the sample size
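As a rough illustration, the formula above translates into a few lines of Python. The function name `mean_squared_error` and the sample values are assumptions made for this sketch; they are not taken from any particular library.

```python
def mean_squared_error(predicted, actual):
    """Average squared difference between predicted and actual values."""
    n = len(predicted)
    if n != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if n == 0:
        raise ValueError("at least one observation is required")
    # Square each error, sum the squares, then divide by the sample size.
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n


# Hypothetical values, chosen only to show the call.
print(mean_squared_error([3, 5, 8], [2, 5, 10]))  # computes (1 + 0 + 4) / 3
```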

RMSE: A metric that tells us the square root of the average squared difference between the predicted values and the actual values in a dataset. The lower the RMSE, the better a model fits a dataset.

It is calculated as:

RMSE = √( Σ(ŷ_i – y_i)² / n )

where:

Σ is a symbol that means “sum”
ŷ_i is the predicted value for the i^th observation
y_i is the observed value for the i^th observation
n is the sample size

Notice that the formulas are nearly identical. In fact, the root mean squared error is just the square root of the mean squared error.
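The relationship between the two metrics can be sketched in Python as follows. The function names and the small dataset are hypothetical, used only to show that squaring the RMSE recovers the MSE.

```python
import math


def mean_squared_error(predicted, actual):
    """Average squared difference between predicted and actual values."""
    n = len(predicted)
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n


def root_mean_squared_error(predicted, actual):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(predicted, actual))


# Hypothetical values, chosen only for illustration.
predicted = [3, 5, 8]
actual = [2, 5, 10]

mse = mean_squared_error(predicted, actual)
rmse = root_mean_squared_error(predicted, actual)

# Squaring the RMSE gives back the MSE (up to floating-point rounding).
print(math.isclose(rmse ** 2, mse))  # True
```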

RMSE vs. MSE: Which Metric Should You Use?

When assessing how well a model fits a dataset, we use the RMSE more often because it is measured in the same units as the response variable.

Conversely, the MSE is measured in squared units of the response variable.

To illustrate this, suppose we use a regression model to predict the number of points that 10 players will score in a basketball game.

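The following table shows the predicted points from the model vs. the actual points the players scored:

Player    Predicted points (ŷ)    Actual points (y)
1         14                      12
2         15                      15
3         18                      20
4         19                      16
5         25                      20
6         18                      19
7         12                      16
8         12                      20
9         15                      16
10        22                      16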

We would calculate the mean squared error (MSE) as:

MSE = Σ(ŷ_i – y_i)² / n
MSE = ((14-12)²+(15-15)²+(18-20)²+(19-16)²+(25-20)²+(18-19)²+(12-16)²+(12-20)²+(15-16)²+(22-16)²) / 10
MSE = 16

The mean squared error is 16. This tells us that the average squared difference between the predicted values made by the model and the actual values is 16.
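The same arithmetic can be checked with a short Python sketch. The variable names are assumptions for this example, and the split into predicted and actual values follows the order of the terms in the calculation above (the first number in each pair is taken as the prediction).

```python
# Values taken from the calculation above: (predicted - actual) for each player.
predicted = [14, 15, 18, 19, 25, 18, 12, 12, 15, 22]
actual = [12, 15, 20, 16, 20, 19, 16, 20, 16, 16]

# Squared error for each of the 10 players.
squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
print(squared_errors)  # [4, 0, 4, 9, 25, 1, 16, 64, 1, 36]

# Average of the squared errors.
mse = sum(squared_errors) / len(squared_errors)
print(mse)  # 16.0
```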

The root mean squared error (RMSE) would simply be the square root of the MSE:

RMSE = √MSE
RMSE = √16
RMSE = 4
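In Python, this last step is a single call to `math.sqrt`. The sketch below recomputes the MSE from the example values so that it runs on its own; the variable names are assumptions.

```python
import math

# Example values from the calculation above.
predicted = [14, 15, 18, 19, 25, 18, 12, 12, 15, 22]
actual = [12, 15, 20, 16, 20, 19, 16, 20, 16, 16]

# MSE is in squared points; taking the square root returns to points.
mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
rmse = math.sqrt(mse)
print(rmse)  # 4.0
```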

The root mean squared error is 4. This tells us that the predicted points scored typically deviate from the actual points scored by about 4 points.

Notice that the interpretation of the root mean squared error is much more straightforward than the mean squared error because we’re talking about ‘points scored’ as opposed to ‘squared points scored.’

How to Use RMSE in Practice

In practice, we typically fit several regression models to a dataset and calculate the root mean squared error (RMSE) of each model.

We then select the model with the lowest RMSE value as the “best” model because it is the one that makes predictions that are closest to the actual values from the dataset.

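Note that we can also compare the MSE values of each model. Because the square root is an increasing function, ranking models by MSE gives the same order as ranking them by RMSE, but RMSE is more straightforward to interpret, so it’s used more often.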
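A minimal sketch of this comparison in Python is shown below. The model names, the observed values, and the predictions are all hypothetical, made up only to illustrate the selection step.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


# Hypothetical observed values and predictions from three hypothetical models.
actual = [10, 12, 15, 11, 14]
predictions = {
    "model_a": [11, 12, 14, 10, 15],
    "model_b": [9, 14, 18, 11, 12],
    "model_c": [10, 13, 15, 12, 14],
}

# RMSE of each model, then the model with the lowest value.
scores = {name: rmse(pred, actual) for name, pred in predictions.items()}
best = min(scores, key=scores.get)
print(best)  # model_c

# Ranking by MSE (the squared RMSE) gives the same order as ranking by RMSE.
ranking_by_rmse = sorted(scores, key=scores.get)
ranking_by_mse = sorted(scores, key=lambda name: scores[name] ** 2)
print(ranking_by_rmse == ranking_by_mse)  # True
```

In this made-up data, `model_c` has the smallest errors, so it is selected under either metric.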

Additional Resources

Introduction to Multiple Linear Regression
RMSE vs. R-Squared: Which Metric Should You Use?
RMSE Calculator
