
A **partial F-test** is used to determine whether there is a statistically significant difference between a regression model and a nested version of the same model.

A *nested* model is simply one that contains a subset of the predictor variables in the overall regression model.

For example, suppose we have the following regression model with four predictor variables:

Y = β_{0} + β_{1}x_{1} + β_{2}x_{2} + β_{3}x_{3} + β_{4}x_{4} + ε

One example of a nested model would be the following model with only two of the original predictor variables:

Y = β_{0} + β_{1}x_{1} + β_{2}x_{2} + ε

To determine if these two models are significantly different, we can perform a partial F-test.

**Partial F-Test: The Basics**

A partial F-test calculates the following F test-statistic:

F = ((RSS_{reduced} - RSS_{full}) / p) / (RSS_{full} / (n - k))

where:

- **RSS_{reduced}:** The residual sum of squares of the reduced (i.e. “nested”) model.
- **RSS_{full}:** The residual sum of squares of the full model.
- **p:** The number of predictors removed from the full model.
- **n:** The total number of observations in the dataset.
- **k:** The number of coefficients (including the intercept) in the full model.

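Note that the residual sum of squares of the full model can never be larger than that of the reduced model, since adding predictors cannot increase the error on the data used to fit the model. In practice it is almost always at least slightly smaller.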
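The behavior of the residual sum of squares can be checked with a small simulation. The sketch below is illustrative only: the data are simulated, and the variable names (`x1`, `noise`, `fit_reduced`, `fit_full`) are assumptions made for this example. It adds a predictor that is unrelated to the response and compares the RSS of the two fits.

```r
set.seed(1)                 # arbitrary seed, used only for reproducibility
n     <- 50
x1    <- rnorm(n)
noise <- rnorm(n)           # unrelated to y by construction
y     <- 1 + 2 * x1 + rnorm(n)

fit_reduced <- lm(y ~ x1)           # nested model
fit_full    <- lm(y ~ x1 + noise)   # same model plus the useless predictor

# For an lm fit, deviance() returns the residual sum of squares
rss_reduced <- deviance(fit_reduced)
rss_full    <- deviance(fit_full)

# Even a pure-noise predictor does not increase the RSS
rss_full <= rss_reduced
```

Because even a useless predictor lowers the RSS a little, a drop in RSS by itself says nothing about whether the predictor belongs in the model, which is why a formal test is needed.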

Thus, a partial F-test essentially tests whether the group of predictors that you removed from the full model are actually useful and need to be included in the full model.

This test uses the following null and alternative hypotheses:

**H_{0}:** All coefficients removed from the full model are zero.

**H_{A}:** At least one of the coefficients removed from the full model is non-zero.

If the p-value corresponding to the F test-statistic is below a chosen significance level (e.g. 0.05), then we can reject the null hypothesis and conclude that at least one of the coefficients removed from the full model is non-zero.
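The test statistic and decision rule can be sketched in a few lines of R. The helper `partial_f()` below is hypothetical (it is not a base R function), and the data are simulated so that the removed predictors really do matter. The p-value is taken from an F distribution with p and n - k degrees of freedom, which is the reference distribution under the null hypothesis.

```r
# Hypothetical helper: partial F-test from summary quantities
partial_f <- function(rss_reduced, rss_full, p, n, k, alpha = 0.05) {
  f_stat  <- ((rss_reduced - rss_full) / p) / (rss_full / (n - k))
  # Upper-tail probability of the F distribution with (p, n - k) df
  p_value <- pf(f_stat, df1 = p, df2 = n - k, lower.tail = FALSE)
  list(F = f_stat, p_value = p_value, reject_null = p_value < alpha)
}

# Simulated data (an assumption for illustration): x2 and x3 affect y
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 1 + 2 * x1 + 1.5 * x2 + 1.5 * x3 + rnorm(n)

fit_full    <- lm(y ~ x1 + x2 + x3)   # k = 4 coefficients
fit_reduced <- lm(y ~ x1)             # p = 2 predictors removed

result <- partial_f(deviance(fit_reduced), deviance(fit_full),
                    p = 2, n = n, k = 4)
result$reject_null
```

Since `x2` and `x3` have sizeable true coefficients in this simulation, the test should reject the null hypothesis here.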

**Partial F-Test: An Example**

In practice, we use the following steps to perform a partial F-test:

**1.** Fit the full regression model and calculate RSS_{full}.

**2.** Fit the nested regression model and calculate RSS_{reduced}.

**3.** Perform an ANOVA to compare the full and reduced models, which produces the F test-statistic needed to compare them.

For example, the code below fits the following two regression models in R using data from the built-in **mtcars** dataset:

**Full model:** mpg = β_{0} + β_{1}disp + β_{2}carb + β_{3}hp + β_{4}cyl

**Reduced model:** mpg = β_{0} + β_{1}disp + β_{2}carb

```r
#fit full model
model_full <- lm(mpg ~ disp + carb + hp + cyl, data = mtcars)

#fit reduced model
model_reduced <- lm(mpg ~ disp + carb, data = mtcars)

#perform ANOVA to test for differences in models
anova(model_reduced, model_full)

Analysis of Variance Table

Model 1: mpg ~ disp + carb
Model 2: mpg ~ disp + carb + hp + cyl
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     29 254.82
2     27 238.71  2    16.113 0.9113  0.414
```

From the output we can see that the F test-statistic from the ANOVA is **0.9113** and the corresponding p-value is **0.414**.
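As a sanity check, the F test-statistic can be recovered by plugging the RSS values from the ANOVA table into the formula, using n = 32 (the number of rows in mtcars), k = 5 coefficients in the full model, and p = 2 removed predictors:

$$
F = \frac{(RSS_{reduced} - RSS_{full}) / p}{RSS_{full} / (n - k)}
  = \frac{(254.82 - 238.71) / 2}{238.71 / (32 - 5)}
  \approx \frac{8.055}{8.841}
  \approx 0.911
$$

This agrees with the reported value of 0.9113 up to the rounding of the printed RSS values.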

Since this p-value is not less than 0.05, we fail to reject the null hypothesis. This means we don’t have sufficient evidence to say that either of the predictor variables *hp* or *cyl* is statistically significant.

In other words, adding *hp* and *cyl* to the regression model does not significantly improve the fit of the model.
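The same result can be reproduced without `anova()` by computing the statistic directly from the two fitted models. This is a minimal sketch; the names `rss_full`, `rss_reduced`, `df_err`, `f_stat`, and `p_value` are introduced here for illustration.

```r
model_full    <- lm(mpg ~ disp + carb + hp + cyl, data = mtcars)
model_reduced <- lm(mpg ~ disp + carb, data = mtcars)

# For an lm fit, deviance() returns the residual sum of squares
rss_full    <- deviance(model_full)
rss_reduced <- deviance(model_reduced)

# p = number of predictors removed; df_err = n - k for the full model
p      <- df.residual(model_reduced) - df.residual(model_full)
df_err <- df.residual(model_full)

f_stat  <- ((rss_reduced - rss_full) / p) / (rss_full / df_err)
p_value <- pf(f_stat, df1 = p, df2 = df_err, lower.tail = FALSE)

round(c(F = f_stat, p_value = p_value), 4)
```

The printed values should match the F test-statistic and p-value in the ANOVA table, which confirms that `anova()` is performing the partial F-test described earlier.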