This cheatsheet gives basic documentation for the R functions encountered in introductory statistics.

Functions to create vectors

The c() function

It’s sometimes called “combine” or “collect” or “concatenate” or just plain “cram together”. It takes any number of individual values or smaller vectors and joins them, in order, into one larger vector. Here are some examples.

# Create x
x <- c(1,4,5,6,9)
# Display x
x
## [1] 1 4 5 6 9
# Create y
y <- c("a","b","c")
# Display y
y
## [1] "a" "b" "c"
# Create a and b, then combine into ab
# (a variable named c would also work, but is easily confused with the c() function)
a <- c(1,2,3)
b <- c(4,5,6)
ab <- c(a,b)
# Display ab
ab
## [1] 1 2 3 4 5 6
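
A vector holds values of a single type, so c() converts its inputs to a common type when they differ. The short sketch below shows this; the name mixed is chosen here only for illustration.

```r
# Mixing numbers and strings: c() converts everything to character
mixed <- c(1, "a", 2)
# Display mixed
mixed
## [1] "1" "a" "2"
# Check the type of the result
class(mixed)
## [1] "character"
```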

Note that if you run an expression that creates a value without assigning it to a name, R displays the value but does not store it, so it is not available for further use. Here’s an example.

c(12,23,45)
## [1] 12 23 45

But if the expression appears on the right-hand side of an assignment statement, the result is stored and is not automatically displayed.

x <- c(12,23,45)

To display something, just type its name.

x
## [1] 12 23 45
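
As a small shortcut, an assignment wrapped in parentheses stores the value and displays it in one step. The name z below is just an illustrative choice.

```r
# Parentheses around an assignment both store and display the value
(z <- c(12, 23, 45))
## [1] 12 23 45
# z is still available afterwards
length(z)
## [1] 3
```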

The : operator

To create a vector of integers starting with Integer1 and ending with Integer2, just place a : between the two. Here are some examples.

x <- 1:100
x
##   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
##  [18]  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34
##  [35]  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51
##  [52]  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68
##  [69]  69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85
##  [86]  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100
x <- -10:10
x
##  [1] -10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6
## [18]   7   8   9  10
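
Two further details are worth a quick sketch: the : operator counts downward when the first integer is larger, and it is evaluated before arithmetic, so parentheses matter. The names down and n are illustrative.

```r
# Count down from 5 to 1
down <- 5:1
down
## [1] 5 4 3 2 1
# The : operator is applied before subtraction
n <- 5
1:n-1
## [1] 0 1 2 3 4
# Use parentheses to get the integers from 1 to n-1
1:(n-1)
## [1] 1 2 3 4
```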

The rep() function

In its simplest form it takes two arguments: the thing to be repeated and the number of repetitions. Here are some examples.

x <- rep(0,5)
x
## [1] 0 0 0 0 0
x <- rep(c(1,2,3),5)
x
##  [1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
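
The rep() function also accepts the named arguments times and each. As a brief sketch, times repeats the whole vector, while each repeats every element before moving on to the next. The names by_times and by_each are illustrative.

```r
# Repeat the whole vector twice
by_times <- rep(c(1, 2, 3), times = 2)
by_times
## [1] 1 2 3 1 2 3
# Repeat each element twice
by_each <- rep(c(1, 2, 3), each = 2)
by_each
## [1] 1 1 2 2 3 3
```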

Functions to get information about vectors

The length() function

This just tells you how many elements there are in the vector.

x <- -23:45
length(x)
## [1] 69
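
As a quick check on the result above, the number of integers from one value to another is the difference plus one. The names first and last are illustrative.

```r
# Count the integers from first to last by hand
first <- -23
last <- 45
last - first + 1
## [1] 69
# Compare with length()
length(first:last)
## [1] 69
```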

The sum() function

This adds up all the numbers in the vector.

x <- 1:100
sum(x)
## [1] 5050
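
One way to sanity-check this result is the well-known formula n(n+1)/2 for the sum of the first n integers. The names n and total below are illustrative.

```r
# Sum of the integers 1 to n
n <- 100
total <- sum(1:n)
# Compare with the closed-form formula
total == n * (n + 1) / 2
## [1] TRUE
```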

The min() and max() functions

These give you the minimum and maximum values in the vector.

x <- 1:100
min(x)
## [1] 1
max(x)
## [1] 100
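
If you want both values at once, the base R function range() returns the minimum and maximum together as a vector of length two. A minimal sketch (the name r is illustrative):

```r
x <- 1:100
# range() returns c(min(x), max(x))
r <- range(x)
r
## [1]   1 100
```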

Statistical Functions

The mean() function

This gives the usual arithmetic mean, the sum of values divided by the length of the vector.

x <- 1:100
mean(x)
## [1] 50.5
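
The definition can be checked directly with the functions already introduced. The name manual_mean is illustrative.

```r
x <- 1:100
# Sum of the values divided by the number of values
manual_mean <- sum(x) / length(x)
manual_mean
## [1] 50.5
```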

The median() function

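This gives the middle value of the sorted numbers: half of the numbers in the vector lie on either side of it. When the vector has an even number of elements, the median is the average of the two middle values.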

x <- 1:100
median(x)
## [1] 50.5

The mean and the median are measures of central location. They are numbers which describe where the numbers in the vector are located.
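
The small vectors below are made up for illustration. They sketch how median() behaves for odd and even lengths, and how a single extreme value pulls the mean much more than the median.

```r
# Odd length: the middle value after sorting
odd <- c(3, 1, 2)
median(odd)
## [1] 2
# Even length: the average of the two middle values
even <- c(4, 1, 3, 2)
median(even)
## [1] 2.5
# One extreme value moves the mean but not the median
skewed <- c(1, 2, 3, 4, 100)
mean(skewed)
## [1] 22
median(skewed)
## [1] 3
```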

The quantile() function

In its basic form this takes two arguments. The first is a vector of numbers. The second is a fraction between 0 and 1. The function returns a number which has approximately that fraction of the numbers in the vector below it. When the fraction is expressed as a percentage, the result is usually referred to as a percentile.

x <- 1:100
quantile(x,.75)
##   75% 
## 75.25

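Note that there are many different algorithms for determining quantiles, which give slightly different results. The quantile() function implements nine of them, selected with the type argument (the default is type 7). For introductory work you can ignore these differences. You can type help(quantile) into R to read the details.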
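
To see how small these differences typically are, the sketch below computes the same quantile with two of the available algorithms using the type argument of quantile(). The names q7 and q6 are illustrative.

```r
x <- 1:100
# Type 7 is the default algorithm
q7 <- quantile(x, 0.75, type = 7)
q7
##   75% 
## 75.25
# Type 6 gives a slightly different answer
q6 <- quantile(x, 0.75, type = 6)
q6
##   75% 
## 75.75
```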

The sd() function

This gives the standard deviation of the numbers in the vector. It is a measure of the variation in the numbers in the vector.

x <- 1:100
sd(x)
## [1] 29.01149
x <- 1:200
sd(x)
## [1] 57.87918

Note that the numbers in the second version of x are more spread out than those in the first version. Comparing the two values of the standard deviation confirms this.
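
For reference, sd() computes the sample standard deviation, which divides by one less than the number of values. The sketch below reproduces it by hand; the name manual_sd is illustrative.

```r
x <- 1:100
# Square root of the sum of squared deviations divided by (n - 1)
manual_sd <- sqrt(sum((x - mean(x))^2) / (length(x) - 1))
# Compare with sd(), allowing for tiny rounding differences
all.equal(manual_sd, sd(x))
## [1] TRUE
```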

The IQR() function

The interquartile range is the difference between the 75th percentile and the 25th percentile of the numbers in the vector. It is an alternative measure of variation.

x <- 1:100
IQR(x)
## [1] 49.5
x <- 1:200
IQR(x)
## [1] 99.5

Note that the IQR values indicate that the numbers in the second version of x are more spread out than those in the first version.
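
The definition can be checked with quantile(), which accepts a vector of fractions as its second argument. The names q and manual_iqr are illustrative.

```r
x <- 1:100
# 25th and 75th percentiles in one call
q <- quantile(x, c(0.25, 0.75))
# Their difference, with the percentile label removed
manual_iqr <- unname(q[2] - q[1])
manual_iqr
## [1] 49.5
```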

The summary() function

The summary() function reports six basic statistical measures: the minimum, first quartile, median, mean, third quartile, and maximum.

x <- 1:100
summary(x)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00   25.75   50.50   50.50   75.25  100.00
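
The result of summary() can be stored and its entries picked out by the labels shown in the output. A minimal sketch, with the name s chosen for illustration:

```r
x <- 1:100
# Store the summary for later use
s <- summary(x)
# Extract individual entries by their labels
as.numeric(s["Median"])
## [1] 50.5
as.numeric(s["3rd Qu."])
## [1] 75.25
```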