# How to Perform Multiple Linear Regression in SPSS

Multiple linear regression is a method we can use to understand the relationship between two or more explanatory variables and a response variable.

This tutorial explains how to perform multiple linear regression in SPSS.

### Example: Multiple Linear Regression in SPSS

Suppose we want to know whether the number of hours spent studying and the number of prep exams taken affect the score that a student receives on a certain exam. To explore this, we can perform multiple linear regression using the following variables:

Explanatory variables:

• Hours studied
• Prep exams taken

Response variable:

• Exam score

Use the following steps to perform this multiple linear regression in SPSS.

Step 1: Enter the data.

Enter the following data for the number of hours studied, prep exams taken, and exam score received for 20 students:

Step 2: Perform multiple linear regression.

Click the Analyze tab, then Regression, then Linear:

Drag the variable score into the box labelled Dependent. Drag the variables hours and prep_exams into the box labelled Independent(s). Then click OK.

Step 3: Interpret the output.

Once you click OK, the results of the multiple linear regression will appear in a new window.

The first table we're interested in is titled Model Summary:

Here is how to interpret the most relevant numbers in this table:

• R Square: This is the proportion of the variance in the response variable that can be explained by the explanatory variables. In this example, 73.4% of the variation in exam scores can be explained by hours studied and number of prep exams taken.
• Std. Error of the Estimate: The standard error of the estimate measures the typical distance that the observed values fall from the values predicted by the regression model. Technically, it is the square root of the residual mean square. In this example, the observed exam scores fall roughly 5.3657 points from the predicted scores.
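
Both Model Summary values can be reproduced by hand from observed and predicted values. The sketch below is a minimal illustration, not SPSS's own implementation: the function name `model_summary` and the four data points are hypothetical and are not taken from this tutorial's dataset.

```python
import math

def model_summary(observed, predicted, n_predictors):
    """Return (R Square, Std. Error of the Estimate) for a fitted model."""
    n = len(observed)
    mean_y = sum(observed) / n
    # Residual sum of squares: variation the model leaves unexplained.
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # Total sum of squares: variation of the response around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    r_square = 1 - ss_res / ss_tot
    # Residual degrees of freedom: n minus the predictors minus the constant.
    std_error = math.sqrt(ss_res / (n - n_predictors - 1))
    return r_square, std_error

# Hypothetical values for illustration only (not the tutorial's data).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r_square, std_error = model_summary(observed, predicted, n_predictors=1)
print(round(r_square, 3), round(std_error, 4))
```

For the model in this tutorial, `n_predictors` would be 2 and the lists would hold the 20 observed and fitted exam scores.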

The next table we're interested in is titled ANOVA:

Here is how to interpret the most relevant numbers in this table:

• F: This is the overall F statistic for the regression model, calculated as Mean Square Regression / Mean Square Residual.
• Sig: This is the p-value associated with the overall F statistic. It tells us whether the regression model as a whole is statistically significant. In other words, it tells us whether the two explanatory variables combined have a statistically significant association with the response variable. In this case the p-value is displayed as .000, which means it is smaller than .001. Since this is less than .05, the explanatory variables hours studied and prep exams taken, taken together, have a statistically significant association with exam score.
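
Because both mean squares share the same total sum of squares, the overall F statistic can also be recovered from R Square, the sample size, and the number of predictors. The sketch below assumes the values reported in this tutorial (R Square = .734, 20 students, 2 predictors); the function name `overall_f` is hypothetical. Since R Square is rounded to three decimals here, the result should only approximately match the F value in the ANOVA table.

```python
def overall_f(r_square, n_obs, n_predictors):
    """Overall F statistic from R Square, sample size, and predictor count."""
    df_regression = n_predictors
    df_residual = n_obs - n_predictors - 1
    # Dividing both mean squares by the total sum of squares turns
    # MS Regression / MS Residual into a ratio based on R Square.
    return (r_square / df_regression) / ((1 - r_square) / df_residual)

# Values taken from this tutorial's Model Summary discussion.
f_stat = overall_f(r_square=0.734, n_obs=20, n_predictors=2)
print(round(f_stat, 2))  # about 23.45
```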

The next table we're interested in is titled Coefficients:

Here is how to interpret the most relevant numbers in this table:

• Unstandardized B (Constant): This tells us the average value of the response variable when both predictor variables are zero. In this example, the average exam score is 67.674 when hours studied and prep exams taken are both equal to zero.
• Unstandardized B (hours): This tells us the average change in exam score associated with a one unit increase in hours studied, assuming number of prep exams taken is held constant. In this case, each additional hour spent studying is associated with an increase of 5.556 points in exam score, assuming the number of prep exams taken is held constant.
• Unstandardized B (prep_exams): This tells us the average change in exam score associated with a one unit increase in prep exams taken, assuming number of hours studied is held constant. In this case, each additional prep exam taken is associated with a decrease of .602 points in exam score, assuming the number of hours studied is held constant.
• Sig. (hours): This is the p-value for the explanatory variable hours. Since this value (.000) is less than .05, we can conclude that hours studied has a statistically significant association with exam score.
• Sig. (prep_exams): This is the p-value for the explanatory variable prep_exams. Since this value (.519) is not less than .05, we cannot conclude that number of prep exams taken has a statistically significant association with exam score.
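
The unstandardized B values are least-squares estimates. As a rough illustration of where such numbers come from, the sketch below solves the normal equations in plain Python. It is not how SPSS computes them internally, the function name `fit_ols` is hypothetical, and the six rows of data are made up (a noise-free response built from known coefficients) rather than taken from this tutorial.

```python
def fit_ols(rows, y):
    """Least-squares coefficients [constant, b1, b2, ...] via the normal equations."""
    # Prepend a 1 to each row so the first coefficient is the constant.
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Build the augmented system [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical (hours, prep_exams) pairs; NOT the tutorial's data.
rows = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 4), (6, 3)]
# Response generated exactly as 10 + 2*hours - 1*prep_exams.
y = [11, 13, 14, 16, 16, 19]
constant, b_hours, b_prep = fit_ols(rows, y)
print(round(constant, 3), round(b_hours, 3), round(b_prep, 3))
```

Because the made-up response contains no noise, the fit should recover the constant 10 and the slopes 2 and -1 up to floating-point error. With real data the coefficients are estimates, and perfectly collinear predictors would make the system unsolvable.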

Lastly, we can form a regression equation using the values shown in the table for constant, hours, and prep_exams. In this case, the equation would be:

Estimated exam score = 67.674 + 5.556*(hours) – .602*(prep_exams)

We can use this equation to find the estimated exam score for a student, based on the number of hours they studied and the number of prep exams they took. For example, a student who studies for 3 hours and takes 2 prep exams is expected to receive an exam score of about 83.1:

Estimated exam score = 67.674 + 5.556*(3) – .602*(2) = 83.1
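
The same calculation can be wrapped in a small function, which makes it easy to try other combinations of study hours and prep exams. This is a minimal sketch: the coefficients are the ones reported in this tutorial, while the function and constant names are hypothetical.

```python
# Coefficients as reported in this tutorial's regression equation.
CONSTANT = 67.674
B_HOURS = 5.556
B_PREP_EXAMS = -0.602

def estimated_score(hours, prep_exams):
    """Estimated exam score from the fitted regression equation."""
    return CONSTANT + B_HOURS * hours + B_PREP_EXAMS * prep_exams

# A student who studies 3 hours and takes 2 prep exams.
print(round(estimated_score(3, 2), 1))  # 83.1

# One extra hour with prep exams held constant changes the estimate by B_HOURS.
print(round(estimated_score(4, 2) - estimated_score(3, 2), 3))  # 5.556
```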

Note: Since the explanatory variable prep exams was not found to be statistically significant, we may decide to remove it from the model and instead perform simple linear regression using hours studied as the only explanatory variable.