
This tutorial provides a simple explanation of the difference between a PDF (probability density function) and a CDF (cumulative distribution function) in statistics.

**Random Variables**

Before we can define a PDF or a CDF, we first need to understand random variables.

A **random variable**, usually denoted as X, is a variable whose values are numerical outcomes of some random process. There are two types of random variables: discrete and continuous.

**Discrete Random Variables**

A **discrete random variable** is one which can take on only a countable number of distinct values, such as 0, 1, 2, 3, and so on. Some examples of discrete random variables include:

- The number of times a coin lands on tails after being flipped 20 times.
- The number of times a dice lands on the number *4* after being rolled 100 times.

**Continuous Random Variables**

A **continuous random variable** is one which can take on any value within a range, so its possible values are infinite and cannot be listed one by one. Some examples of continuous random variables include:

- Height of a person
- Weight of an animal
- Time required to run a mile

For example, the height of a person could be 60.2 inches, 65.2344 inches, 70.431222 inches, etc. There are an infinite number of possible values for height.

Rule of Thumb: If you can *count* the number of outcomes, then you are working with a discrete random variable (e.g. counting the number of times a coin lands on heads). But if you can *measure* the outcome, you are working with a continuous random variable (e.g. measuring height, weight, time, etc.).

**Probability Density Functions**

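A **probability density function** (pdf) describes how probability is distributed over the values a random variable can take. For a discrete random variable, it gives the probability that the variable takes on a certain value; in that case it is more precisely called a probability mass function (pmf).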

For example, suppose we roll a dice one time. If we let *x* denote the number that the dice lands on, then the probability density function for the outcome can be described as follows:

**P(x < 1)** : 0

**P(x = 1)** : 1/6

**P(x = 2)** : 1/6

**P(x = 3)** : 1/6

**P(x = 4)** : 1/6

**P(x = 5)** : 1/6

**P(x = 6)** : 1/6

**P(x > 6)** : 0

Note that this is an example of a discrete random variable, since *x* can only take on integer values.
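The table above can be sketched in a few lines of Python. This is only an illustration, assuming a fair six-sided dice; the names `pmf` and `prob` are made up for this example.

```python
from fractions import Fraction

# Probability of each face of a six-sided dice (assumed to be fair).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(x):
    """Return P(X = x); values that are not faces of the dice get probability 0."""
    return pmf.get(x, Fraction(0))

print(prob(3))            # 1/6
print(prob(7))            # 0
print(sum(pmf.values()))  # 1
```

The probabilities over all possible outcomes add up to 1, which is true of any valid probability function.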

For a continuous random variable, a pdf does not give the probability of an exact value, since the probability that *x* takes on any exact value is zero. Instead, the pdf gives a density, and the probability that *x* falls within a range is the area under the pdf over that range.

For example, suppose we want to know the probability that a burger from a particular restaurant weighs a quarter-pound (0.25 lbs). Since *weight* is a continuous variable, it can take on an infinite number of values.

For example, a given burger might actually weigh 0.250001 pounds, or 0.24 pounds, or 0.2488 pounds. The probability that a given burger weighs exactly 0.25 pounds is zero.
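For a continuous variable, the useful question is how likely the value is to fall within a range. The sketch below assumes, purely for illustration, that burger weights follow a normal distribution with a mean of 0.25 lbs and a standard deviation of 0.01 lbs; neither number comes from real data.

```python
from statistics import NormalDist

# Hypothetical model: burger weights ~ Normal(mean = 0.25 lb, sd = 0.01 lb).
weight = NormalDist(mu=0.25, sigma=0.01)

# The probability of landing in a range is a difference of two cdf values.
p_interval = weight.cdf(0.26) - weight.cdf(0.24)

# The density at a single point is not a probability; here it exceeds 1.
density_at_quarter_pound = weight.pdf(0.25)

print(round(p_interval, 4))                # 0.6827
print(round(density_at_quarter_pound, 2))  # 39.89
```

Under these assumptions, about 68% of burgers weigh between 0.24 and 0.26 pounds, while the density at exactly 0.25 pounds is a height on the curve rather than a probability.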

**Cumulative Distribution Functions**

A **cumulative distribution function** (cdf) tells us the probability that a random variable takes on a value less than or equal to *x*.

For example, suppose we roll a dice one time. If we let *x* denote the number that the dice lands on, then the cumulative distribution function for the outcome can be described as follows:

**P(x ≤ 0)** : 0

**P(x ≤ 1)** : 1/6

**P(x ≤ 2)** : 2/6

**P(x ≤ 3)** : 3/6

**P(x ≤ 4)** : 4/6

**P(x ≤ 5)** : 5/6

**P(x ≤ 6)** : 6/6

**P(x ≤ 7)** : 6/6 (and likewise for any value greater than 6)

Notice that the probability that *x* is less than or equal to *6* is 6/6, which is equal to 1. This is because the dice will land on either 1, 2, 3, 4, 5, or 6 with 100% probability.
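The cumulative probabilities are simply running totals of the individual probabilities. A minimal sketch, assuming a fair dice (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)
point_probs = [Fraction(1, 6)] * 6  # P(x = 1), ..., P(x = 6) for a fair dice

# The running total of the point probabilities gives P(x <= face).
cdf_values = dict(zip(faces, accumulate(point_probs)))

for face, p in cdf_values.items():
    print(f"P(x <= {face}) = {p}")
```

Note that `Fraction` prints values in lowest terms, so 2/6 appears as 1/3, 3/6 as 1/2, and 6/6 as 1.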

This example uses a discrete random variable, but a cumulative distribution function can also be used for a continuous random variable.

Cumulative distribution functions have the following properties:

- The probability that a random variable takes on a value less than the smallest possible value is zero. For example, the probability that a dice lands on a value less than 1 is zero.
- The probability that a random variable takes on a value less than or equal to the largest possible value is one. For example, the probability that a dice lands on a value of 1, 2, 3, 4, 5, or 6 is one. It must land on one of those numbers.
- The cdf is always non-decreasing. That is, the probability that a dice lands on a number less than or equal to 1 is 1/6, the probability that it lands on a number less than or equal to 2 is 2/6, the probability that it lands on a number less than or equal to 3 is 3/6, etc. The cumulative probabilities are always non-decreasing.
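These three properties can be checked directly for the dice example. The function `dice_cdf` below is a hypothetical helper written for this illustration, assuming a fair six-sided dice; it accepts any real number, not just the faces.

```python
import math
from fractions import Fraction

def dice_cdf(x):
    """P(X <= x) for one roll of a fair six-sided dice, for any real x."""
    faces_at_or_below = min(max(math.floor(x), 0), 6)
    return Fraction(faces_at_or_below, 6)

grid = [-2, 0.5, 1, 2.7, 4, 6, 9]  # increasing values to evaluate
values = [dice_cdf(x) for x in grid]

assert dice_cdf(0.99) == 0  # below the smallest possible value
assert dice_cdf(6) == 1     # at the largest possible value
assert all(a <= b for a, b in zip(values, values[1:]))  # never decreases
```

All three assertions pass, and the cdf stays flat between faces (for example, `dice_cdf(2.7)` equals `dice_cdf(2)`), which is why it is described as non-decreasing rather than increasing.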

**Related:** You can use an ogive graph to visualize a cumulative distribution function.

**The Relationship Between a CDF and a PDF**

In technical terms, for a continuous random variable, the probability density function (pdf) is the derivative of the cumulative distribution function (cdf).

Furthermore, the area under the curve of a pdf between negative infinity and *x* is equal to the value of the cdf at *x*.
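Both statements can be checked numerically. The sketch below uses a standard normal distribution, chosen only for illustration, and approximates the derivative with a central difference and the area with a midpoint sum; the point `x = 0.7` and the tolerances are arbitrary choices.

```python
from statistics import NormalDist

dist = NormalDist()  # standard normal, used here only as an example
x, h = 0.7, 1e-5

# A central difference approximates the derivative of the cdf at x.
slope = (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)

# A midpoint sum approximates the area under the pdf from -8 to x
# (the area to the left of -8 is negligible for a standard normal).
lower, n = -8.0, 100_000
width = (x - lower) / n
area = sum(dist.pdf(lower + (i + 0.5) * width) for i in range(n)) * width

print(abs(slope - dist.pdf(x)) < 1e-6)  # True
print(abs(area - dist.cdf(x)) < 1e-6)   # True
```

The slope of the cdf at `x` matches the height of the pdf at `x`, and the accumulated area under the pdf up to `x` matches the cdf at `x`, up to small numerical error.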

For an in-depth explanation of the relationship between a pdf and a cdf, along with the proof for why the pdf is the derivative of the cdf, refer to a statistical textbook.