
A box plot is used to visualize the five-number summary of a dataset, which includes:

- The minimum
- The first quartile
- The median
- The third quartile
- The maximum
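To make the five values concrete, here is a minimal Python sketch that computes them outside SPSS. The helper name `five_number_summary` and the sample values are hypothetical, not the dataset used in this tutorial. Quartile conventions also differ between tools, so the first and third quartiles that SPSS reports may not match these exactly.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # method="inclusive" is one common quartile convention (an assumption
    # here); other conventions can give slightly different Q1 and Q3.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical points-per-game values, for illustration only.
points = [4, 8, 10, 12, 15, 18, 20, 25]
print(five_number_summary(points))  # (4, 9.5, 13.5, 18.5, 25)
```

The box in a box plot spans the first to the third quartile, the line inside the box marks the median, and the whiskers extend toward the minimum and maximum.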

This tutorial explains how to create and modify box plots in SPSS.

**How to Create a Single Box Plot in SPSS**

Suppose we have the following dataset that shows the average points scored per game by 16 basketball players on a certain team:

To create a box plot to visualize the distribution of these data values, we can click the **Analyze** tab, then **Descriptive Statistics**, then **Explore**:

This will bring up the following window:

To create a box plot, drag the variable **points** into the box labelled **Dependent List**. Then make sure **Plots** is selected under the option that says **Display** near the bottom of the box.

Once you click **OK**, the following box plot will appear:

Here's how to interpret this box plot:

**A Note on Outliers**

The interquartile range (IQR) is the distance between the third quartile and the first quartile. SPSS considers a data value to be an outlier if it lies more than 1.5 times the IQR above the third quartile or more than 1.5 times the IQR below the first quartile.
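The 1.5 × IQR rule can be sketched in a few lines of Python. This is an illustration of the rule as described above, not a reproduction of SPSS's internal calculation. The function name `find_outliers`, the quartile convention, and the sample values are all assumptions.

```python
from statistics import quantiles


def find_outliers(values):
    """Return the values that fall outside the 1.5 * IQR fences."""
    # Quartile convention is an assumption; SPSS may compute Q1 and Q3
    # slightly differently.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical values, for illustration only.
print(find_outliers([4, 8, 10, 12, 15, 18, 20, 25]))  # []
print(find_outliers([4, 8, 10, 12, 15, 18, 20, 50]))  # [50]
```

In the second list, the quartiles are 9.5 and 18.5, so the IQR is 9 and the upper fence is 18.5 + 1.5 × 9 = 32. The value 50 lies above that fence and is flagged.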

Outliers are displayed as tiny circles in SPSS. In the previous example there were no outliers, which is why no tiny circles appeared in the box plot. However, if the largest value in the dataset were actually 50, the box plot would show a tiny circle to indicate the outlier:

If an outlier is present in your dataset, you have a few options:

- **Make sure the outlier is not a data entry error.** Sometimes data values are simply recorded incorrectly. If an outlier is present, first verify that the value was entered correctly and that it wasn't an error.
- **Assign a new value to the outlier.** If the outlier turns out to be a result of a data entry error, you may decide to assign a new value to it such as the mean or the median of the dataset.
- **Remove the outlier.** If the value is a true outlier, you may choose to remove it if it will have a significant impact on your overall analysis. Just make sure to mention in your final report or analysis that you removed an outlier.
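For readers who prepare their data outside SPSS, the last two options might look like the following Python sketch. The helper names and the sample values are hypothetical, and the list of flagged values is assumed to have been identified already.

```python
from statistics import median


def replace_with_median(values, flagged):
    """Replace each flagged value with the median of the full dataset."""
    center = median(values)
    return [center if v in flagged else v for v in values]


def drop_values(values, flagged):
    """Return the dataset with the flagged values removed."""
    return [v for v in values if v not in flagged]


# Hypothetical values; 50 is assumed to have been flagged as an outlier.
points = [4, 8, 10, 12, 15, 18, 20, 50]
print(replace_with_median(points, [50]))  # [4, 8, 10, 12, 15, 18, 20, 13.5]
print(drop_values(points, [50]))          # [4, 8, 10, 12, 15, 18, 20]
```

Both helpers return a new list and leave the original data untouched, which makes it easier to report exactly what was changed.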

**How to Create Multiple Box Plots in SPSS**

If you have several variables, SPSS can also create multiple side-by-side box plots. For example, suppose we have the following data on average points scored by 16 players on three different teams:

To create a box plot for each of these variables, we can once again click on the **Analyze** tab, then **Descriptive Statistics**, then **Explore**. We can then drag all three variables into the box labelled **Dependent List**:

Once we click **OK**, the following box plots will appear:

This helps us easily visualize the differences in the distributions between these three teams.

We can also observe the following:

- The median points scored per game is highest for team B and lowest for team C.
- The variation in the number of points scored per game is highest for team B, which can be seen from how much longer its box is than the boxes for team A and team C.
- The player with the highest points per game is on team B and the player with the lowest points per game is on team C.
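The same comparisons can be checked numerically. The sketch below uses made-up values for three teams, chosen only so that they show the pattern described above. They are not the tutorial's data, and the quartile convention is an assumption.

```python
from statistics import quantiles

# Hypothetical points-per-game values, for illustration only.
teams = {
    "A": [8, 10, 11, 12, 13, 14, 15, 16],
    "B": [6, 9, 12, 15, 18, 21, 24, 28],
    "C": [3, 5, 6, 7, 8, 9, 10, 12],
}


def summarize(values):
    """Return the quantities a box plot lets us compare at a glance."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "median": med, "iqr": q3 - q1, "max": max(values)}


summary = {team: summarize(values) for team, values in teams.items()}

# Team with the highest median and team with the longest box (largest IQR).
highest_median = max(summary, key=lambda t: summary[t]["median"])
widest_box = max(summary, key=lambda t: summary[t]["iqr"])

for team, stats in summary.items():
    print(team, stats)
print("Highest median:", highest_median)  # B
print("Longest box:", widest_box)         # B
```

The length of each box corresponds to the IQR, so comparing IQR values is the numeric counterpart of comparing box lengths by eye.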

Box plots are useful because they can provide us with so much information about the distribution of datasets just from a single plot.