
To select a random sample in R we can use the **sample()** function, which uses the following syntax:

**sample(x, size, replace = FALSE, prob = NULL)**

where:

**x:** A vector of elements from which to choose.

**size:** The number of elements to choose (the sample size).

**replace:** Whether to sample with replacement or not. Default is FALSE.

**prob:** A vector of probability weights for obtaining the elements of x. Default is NULL, which gives every element equal weight.

This tutorial explains how to use this function to select a random sample in R from both a vector and a data frame.

**Example 1: Random Sample from a Vector**

The following code shows how to select a random sample from a vector **without replacement**:

    #create vector of data (example values, assumed for illustration)
    data <- c(1, 3, 5, 6, 7, 8, 10, 11, 12, 14)

    #select random sample of 5 elements without replacement
    sample(x=data, size=5)

    [1] 10 12  5 14  7
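As a rough sketch of what "without replacement" implies, the snippet below checks that a draw contains no repeated elements and that asking for more elements than the vector holds fails. The vector values and the names `draw`, `no_repeats`, and `too_big` are assumptions made for illustration.

```r
# Illustrative vector of distinct values (assumed, not taken from the original data)
data <- c(1, 3, 5, 6, 7, 8, 10, 11, 12, 14)

# Without replacement, each position is drawn at most once,
# so a sample of 5 from 10 distinct values contains no duplicates
draw <- sample(x = data, size = 5)
no_repeats <- anyDuplicated(draw) == 0

# Requesting more elements than the vector holds raises an error
# when replace = FALSE; tryCatch() captures it so the script keeps running
too_big <- tryCatch(
  sample(x = data, size = length(data) + 1),
  error = function(e) "error"
)
```

Here `no_repeats` is TRUE and `too_big` holds the string "error", since a sample larger than the population is only possible with replacement.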

The following code shows how to select a random sample from a vector **with replacement**:

    #create vector of data (example values, assumed for illustration)
    data <- c(1, 3, 5, 6, 7, 8, 10, 11, 12, 14)

    #select random sample of 5 elements with replacement
    sample(x=data, size=5, replace=TRUE)

    [1] 12  1  1  6 14
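The **prob** argument from the syntax above is not used in these examples, so here is a minimal sketch of weighted sampling. The labels in `outcomes` and the values in `weights` are hypothetical, chosen only to make the effect visible.

```r
# Hypothetical labels and weights, chosen only for illustration
outcomes <- c("A", "B", "C")
weights <- c(0.7, 0.2, 0.1)

# Fix the seed so the draw can be repeated
set.seed(1)

# Draw 1000 labels with replacement, favoring "A" over "B" over "C"
draws <- sample(x = outcomes, size = 1000, replace = TRUE, prob = weights)

# Observed share of each label; these should land near the weights
shares <- prop.table(table(draws))
print(shares)
```

With a large sample drawn with replacement, the observed shares should sit close to the supplied weights, though they will not match exactly.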

**Example 2: Random Sample from a Data Frame**

The following code shows how to select a random sample from a data frame:

    #create data frame
    df <- data.frame(x=c(3, 5, 6, 6, 8, 12, 14),
                     y=c(12, 6, 4, 23, 25, 8, 9),
                     z=c(2, 7, 8, 8, 15, 17, 29))

    #view data frame
    df

       x  y  z
    1  3 12  2
    2  5  6  7
    3  6  4  8
    4  6 23  8
    5  8 25 15
    6 12  8 17
    7 14  9 29

    #select random sample of three rows from data frame
    rand_df <- df[sample(nrow(df), size=3), ]

    #display randomly selected rows
    rand_df

       x  y  z
    4  6 23  8
    7 14  9 29
    1  3 12  2

Here's what's happening in this bit of code:

**1.** To select a subset of a data frame in R, we use the following syntax: df[rows, columns]

**2.** In the code above, sample(nrow(df), size=3) returns 3 random row numbers, which we place in the rows position. Leaving the columns position blank keeps *all* columns.

**3.** The end result is a subset of the data frame with 3 randomly selected rows.
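The three steps above can be sketched by splitting the one-line subset into two statements. The data frame is rebuilt from the values printed earlier, and the intermediate name `row_ids` is introduced here only for illustration.

```r
# Data frame rebuilt from the values shown in the printed output
df <- data.frame(x = c(3, 5, 6, 6, 8, 12, 14),
                 y = c(12, 6, 4, 23, 25, 8, 9),
                 z = c(2, 7, 8, 8, 15, 17, 29))

# Step 1: draw 3 distinct row numbers from 1 to nrow(df)
row_ids <- sample(nrow(df), size = 3)

# Step 2: put them in the rows position; the blank columns position keeps all columns
rand_df <- df[row_ids, ]

# Step 3: the result has 3 rows and the original 3 columns
dim(rand_df)
```

Splitting the call this way makes it easy to inspect which row numbers were drawn before the subset is taken.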

It's important to note that each time we use the **sample()** function, R will generally select a different sample since the function chooses values randomly.

In order to replicate the results of some analysis, be sure to use **set.seed(some number)** so that the sample() function chooses the same random sample each time. For example:

    #make this example reproducible
    set.seed(23)

    #create data frame
    df <- data.frame(x=c(3, 5, 6, 6, 8, 12, 14),
                     y=c(12, 6, 4, 23, 25, 8, 9),
                     z=c(2, 7, 8, 8, 15, 17, 29))

    #select random sample of three rows from data frame
    rand_df <- df[sample(nrow(df), size=3), ]

    #display randomly selected rows
    rand_df

       x  y  z
    5  8 25 15
    2  5  6  7
    6 12  8 17

Each time you run the above code, the same 3 rows of the data frame will be selected.
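One way to check this behavior is to reset the seed before two separate draws and compare the results. This is a small sketch; the range 1:100 and the variable names are arbitrary choices, not part of the example above.

```r
# First draw with a fixed seed
set.seed(23)
first_draw <- sample(1:100, size = 5)

# Resetting the same seed restarts the random number stream
set.seed(23)
second_draw <- sample(1:100, size = 5)

# The two draws match element for element
same_sample <- identical(first_draw, second_draw)
print(same_sample)

# Without resetting the seed, the next call continues the stream
# and will usually return a different sample
third_draw <- sample(1:100, size = 5)
```

Note that the seed must be set immediately before the call to sample() that you want to reproduce, since every random draw in between advances the stream.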

**Additional Resources**

Stratified Sampling in R (With Examples)

Systematic Sampling in R (With Examples)

Cluster Sampling in R (With Examples)