The sample() function in R takes a random sample of elements from a vector or of rows from a dataset, either with or without replacement.
The basic syntax is the following (see ?sample):
sample(x, size, replace = FALSE, prob = NULL)
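The prob argument in the signature above takes a vector of weights, one per element of x. As a minimal sketch (the labels, weights, and seed below are made up for illustration), weighted sampling could look like this:

```r
# Weighted sampling: "a" gets weight 0.7, "b" 0.2, "c" 0.1.
# The labels and weights are illustrative assumptions, not from a real dataset.
set.seed(1)  # arbitrary seed so the draw can be repeated
labels <- c("a", "b", "c")
draws <- sample(labels, size = 1000, replace = TRUE, prob = c(0.7, 0.2, 0.1))

# The observed proportions should be close to the weights, but not exactly equal.
prop.table(table(draws))
```

With prob = NULL, the default, every element is equally likely to be drawn.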
Suppose we have a vector with 10 elements:
vector <- 1:10
# to generate a random sample of 5 elements from the vector
# without replacement:
sample(vector, 5)
## [1] 9 6 1 3 8
# generate another sample of 5 elements:
sample(vector, 5)
## [1] 9 7 8 4 10
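The two calls above return different samples because each call advances R's random number generator. If a draw needs to be repeatable, one option is to call set.seed() before it. A small sketch, where the seed value 42 and the names x, first, and second are arbitrary choices for illustration:

```r
x <- 1:10

set.seed(42)            # fix the generator state (42 is arbitrary)
first <- sample(x, 5)

set.seed(42)            # reset to the same state
second <- sample(x, 5)

# Both draws started from the same state, so they match.
identical(first, second)
## [1] TRUE
```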
We can also set replace = TRUE to sample with replacement, so that the same element can be drawn more than once.
sample(vector, 5, replace = TRUE)
## [1] 7 3 1 9 10
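One practical consequence of replace = TRUE is that size may be larger than the number of elements. The sketch below (the names x, big, and msg are illustrative) contrasts the two cases and uses tryCatch() only to capture the error message instead of stopping the script:

```r
x <- 1:10

# With replacement, size may exceed length(x); some values must repeat.
big <- sample(x, 15, replace = TRUE)
length(big)

# Without replacement the same request is an error, captured here as text.
msg <- tryCatch(
  sample(x, 15),
  error = function(e) conditionMessage(e)
)
msg
```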
We can also draw a random sample of rows from a dataset, for example the built-in iris dataset:
head(iris) # this is to see the first 6 rows of this dataset
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
set.seed(100) # fix the random seed so that this example is reproducible
# now draw 10 random row numbers from the 150 rows of iris
sample_rows <- sample(1:nrow(iris), 10)
print(sample_rows)
## [1] 102 112 4 55 70 98 135 7 43 140
# select the 10 rows of the iris dataset that match the row numbers above
# (stored as iris_sample to avoid reusing the name of the sample() function)
iris_sample <- iris[sample_rows, ]
iris_sample
##     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
## 102          5.8         2.7          5.1         1.9  virginica
## 112          6.4         2.7          5.3         1.9  virginica
## 4            4.6         3.1          1.5         0.2     setosa
## 55           6.5         2.8          4.6         1.5 versicolor
## 70           5.6         2.5          3.9         1.1 versicolor
## 98           6.2         2.9          4.3         1.3 versicolor
## 135          6.1         2.6          5.6         1.4  virginica
## 7            4.6         3.4          1.4         0.3     setosa
## 43           4.4         3.2          1.3         0.2     setosa
## 140          6.9         3.1          5.4         2.1  virginica
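The two steps above (draw row numbers, then subset) can be wrapped into a small helper. This is only a sketch: sample_rows_of() is a hypothetical name and not part of base R, and it uses sample.int(), which draws directly from 1 to n.

```r
# Hypothetical helper (not part of base R): return n randomly chosen rows of df.
sample_rows_of <- function(df, n, replace = FALSE) {
  idx <- sample.int(nrow(df), n, replace = replace)  # random row numbers
  df[idx, , drop = FALSE]                            # keep the result a data frame
}

set.seed(100)  # same seed as in the example above, for repeatability
iris_subset <- sample_rows_of(iris, 10)

nrow(iris_subset)                     # number of rows drawn
anyDuplicated(rownames(iris_subset))  # 0 means no row was drawn twice
```

Because the helper samples row numbers rather than values, it works the same way for any data frame, regardless of what its columns contain.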