
In statistics, a **z-score** tells us how many standard deviations away a value is from the mean. We use the following formula to calculate a z-score:

Z-score = (xᵢ - μ) / σ

where:

- **xᵢ:** A single data value
- **μ:** The mean of the dataset
- **σ:** The standard deviation of the dataset

Z-scores are often used to detect outliers in a dataset. For example, observations with a z-score less than -3 or greater than 3 are often deemed to be outliers.
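As a minimal sketch of this rule, the snippet below computes a z-score for every value and flags those beyond ±3. The data are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the formula above does not say which version of σ is intended.

```python
import statistics

def z_scores(values):
    """Return (x - mean) / standard deviation for each value."""
    mean = statistics.fmean(values)
    # Assumption: population standard deviation.
    # Swap in statistics.stdev for the sample version.
    sd = statistics.pstdev(values)
    return [(x - mean) / sd for x in values]

# Hypothetical data, for illustration only.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
        12, 10, 11, 13, 12, 11, 12, 10, 13, 60]

scores = z_scores(data)
outliers = [x for x, z in zip(data, scores) if abs(z) > 3]
print(outliers)  # [60]
```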

However, the mean and standard deviation used in a z-score are themselves pulled by unusually large or small data values. A more robust way to detect outliers is to use a **modified z-score**, which is calculated as:

Modified z-score = 0.6745(xᵢ - x̃) / MAD

where:

- **xᵢ:** A single data value
- **x̃:** The median of the dataset
- **MAD:** The median absolute deviation of the dataset

A modified z-score is more robust because it is built from the median and the median absolute deviation rather than the mean and the standard deviation, both of which are known to be influenced by outliers.

Iglewicz and Hoaglin recommend that values with modified z-scores less than -3.5 or greater than 3.5 be labeled as potential outliers.
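To illustrate the difference in robustness, here is a rough sketch that applies both rules to a small hypothetical dataset containing one extreme value. The function name `modified_z_scores` and the data are made up for this example, and the ordinary z-score again assumes the population standard deviation.

```python
import statistics

def modified_z_scores(values):
    """Return 0.6745 * (x - median) / MAD for each value."""
    values = list(values)
    median = statistics.median(values)
    mad = statistics.median([abs(x - median) for x in values])
    if mad == 0:
        # Happens when more than half of the values equal the median.
        raise ValueError("MAD is zero; modified z-scores are undefined")
    return [0.6745 * (x - median) / mad for x in values]

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 100]

# Ordinary z-scores (population standard deviation assumed).
mean = statistics.fmean(data)
sd = statistics.pstdev(data)
z_flagged = [x for x in data if abs((x - mean) / sd) > 3]

# Modified z-scores with the 3.5 cutoff.
mod_flagged = [x for x, m in zip(data, modified_z_scores(data)) if abs(m) > 3.5]

print(z_flagged)    # []
print(mod_flagged)  # [100]
```

In this sketch the extreme value inflates the mean and standard deviation so much that its own z-score stays just under 2, while the median-based score flags it clearly.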

The following step-by-step example shows how to calculate modified z-scores for a given dataset.

**Step 1: Create the Data**

Suppose we have the following dataset with 16 values:

**Step 2: Find the Median**

Next, we will find the median. This represents the middle point in the dataset, which turns out to be **16**.

**Step 3: Find the Absolute Difference Between Each Value & the Median**

Next, we will find the absolute difference between each individual data value and the median. For example, the absolute difference between the first data value and the median is calculated as:

Absolute Difference = |6 - 16| = 10

We can use the same formula to calculate the absolute difference between each individual data value and the median:

**Step 4: Find the Median Absolute Deviation**

Next, we’ll find the median absolute deviation. This is the median of the second column, which turns out to be **8**.
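Steps 2 through 4 can be sketched in a few lines of Python. The 16 values below are hypothetical, chosen only so that the first value is 6, the median is 16, and the median absolute deviation is 8, in line with the figures quoted above. They should not be read as the original dataset.

```python
import statistics

# Hypothetical 16-value dataset (illustration only).
data = [6, 7, 8, 8, 11, 13, 15, 16, 16, 17, 19, 24, 25, 27, 30, 32]

# Step 2: the median of the data.
median = statistics.median(data)

# Step 3: absolute difference between each value and the median.
abs_diffs = [abs(x - median) for x in data]

# Step 4: the median of those absolute differences (the MAD).
mad = statistics.median(abs_diffs)

print(median, abs_diffs[0], mad)  # 16.0 10.0 8.0
```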

**Step 5: Find the Modified Z-Score for Each Data Value**

Lastly, we can calculate the modified z-score for each data value using the following formula:

**Modified z-score = 0.6745(xᵢ - x̃) / MAD**

For example, the modified z-score for the first data value is calculated as:

Modified z-score = 0.6745*(6-16) / 8 = **-0.843**

We can repeat this formula for every value in the dataset:

We can see that no value in the dataset has a modified z-score less than -3.5 or greater than 3.5, so we wouldn’t label any value in this dataset as a potential outlier.
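Step 5 and the outlier check can be sketched the same way. The dataset here is hypothetical (16 illustrative values with a median of 16 and a MAD of 8, consistent with the numbers used in this example), so only the first score is expected to match the worked calculation above.

```python
import statistics

# Hypothetical 16-value dataset (illustration only).
data = [6, 7, 8, 8, 11, 13, 15, 16, 16, 17, 19, 24, 25, 27, 30, 32]

median = statistics.median(data)
mad = statistics.median([abs(x - median) for x in data])

# Step 5: modified z-score for every value, rounded for display.
mod_z = [round(0.6745 * (x - median) / mad, 3) for x in data]

# Values beyond the +/-3.5 cutoff are potential outliers.
flagged = [x for x, z in zip(data, mod_z) if abs(z) > 3.5]

print(mod_z[0])  # -0.843
print(flagged)   # []
```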

**How to Handle Outliers**

If an outlier is present in your dataset, you have a few options:

- **Make sure the outlier is not the result of a data entry error.** Sometimes an individual simply enters the wrong data value when recording data. If an outlier is present, first verify that the value was entered correctly and that it wasn’t an error.
- **Assign a new value to the outlier.** If the outlier turns out to be a result of a data entry error, you may decide to assign a new value to it such as the mean or the median of the dataset.
- **Remove the outlier.** If the value is a true outlier, you may choose to remove it if it will have a significant impact on your overall analysis. Just make sure to mention in your final report or analysis that you removed an outlier.
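The last two options could look roughly like the sketch below. The data are hypothetical (the final value stands in for a mis-keyed entry), the helper name `is_outlier` is made up for this example, and replacing with the median is just one of the choices mentioned above.

```python
import statistics

def is_outlier(values, threshold=3.5):
    """Return True/False per value using the modified z-score cutoff."""
    median = statistics.median(values)
    mad = statistics.median([abs(x - median) for x in values])
    return [abs(0.6745 * (x - median) / mad) > threshold for x in values]

# Hypothetical data; the last value stands in for a data entry error.
data = [6, 7, 8, 8, 11, 13, 15, 16, 16, 17, 19, 24, 25, 27, 30, 320]
flags = is_outlier(data)

# Option: assign a new value (here, the median) to each flagged point.
median = statistics.median(data)
replaced = [median if f else x for x, f in zip(data, flags)]

# Option: remove the flagged points entirely.
removed = [x for x, f in zip(data, flags) if not f]

print(replaced[-1])  # 16.0
print(len(removed))  # 15
```

Whichever option is used, it helps to keep the flags alongside the data so the change can be reported in the final analysis.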