This cheatsheet gives basic documentation for the R functions commonly encountered in introductory statistics.

## Functions to create vectors

### The c() function

It’s sometimes called “combine”, “collect”, “concatenate”, or just plain “cram together”. It takes a sequence of items or smaller vectors and produces one larger vector. Here are some examples.

# Create x
x <- c(1,4,5,6,9)
# Display x
x
##  1 4 5 6 9
# Create y
y <- c("a","b","c")
# Display y
y
##  "a" "b" "c"
# Create a and b, then combine into ab
# (a name other than c avoids confusion with the c() function)
a <- c(1,2,3)
b <- c(4,5,6)
ab <- c(a,b)
# Display ab
ab
##  1 2 3 4 5 6

Note that if you run a command in R that creates something without assigning it to a name, the result is displayed but not stored, so it is not available for further use. Here’s an example.

c(12,23,45)
##  12 23 45

But if the creation happens on the right-hand side of an assignment statement, the result is stored and is not automatically displayed.

x <- c(12,23,45)

To display something, just type its name.

x
##  12 23 45
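
A common shortcut combines the two steps: wrapping an assignment in parentheses makes R both store and display the result. The sketch below assumes an illustrative name, `z`, that does not appear elsewhere in this cheatsheet.

```r
# Wrapping the assignment in parentheses stores the vector in z
# and also displays it.
(z <- c(12, 23, 45))

# z is stored, so it can be used in later commands.
z * 2
```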

### The : operator

To create a vector of consecutive integers starting with Integer1 and ending with Integer2, just place a : between the two. Here are some examples.

x <- 1:100
x
##      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
##    18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34
##    35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51
##    52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68
##    69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85
##    86  87  88  89  90  91  92  93  94  95  96  97  98  99 100
x <- -10:10
x
##   -10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6
##    7   8   9  10
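
The : operator also counts downward when the first integer is larger than the second. The short sketch below shows this, along with how a leading minus sign is read; the names `countdown` and `around_zero` are illustrative only.

```r
# A descending sequence: the first number is larger than the second.
countdown <- 5:1
countdown

# The minus sign is applied to the 2 before the : operator runs,
# so -2:2 means (-2):2.
around_zero <- -2:2
around_zero
```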

### The rep() function

In its basic form it takes two arguments: the thing to be repeated and the number of repetitions. Here are some examples.

x <- rep(0,5)
x
##  0 0 0 0 0
x <- rep(c(1,2,3),5)
x
##   1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
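
Beyond the two-argument form, rep() also accepts an `each` argument, which repeats every element in turn instead of repeating the whole vector. The following is a small sketch of the difference; the names `whole` and `by_element` are illustrative.

```r
# times (the second argument) repeats the whole vector.
whole <- rep(c(1, 2, 3), times = 2)
whole

# each repeats every element before moving on to the next one.
by_element <- rep(c(1, 2, 3), each = 2)
by_element
```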

## Functions to get information about vectors

### The length() function

This just tells you how many elements there are in the vector.

x <- -23:45
length(x)
##  69

### The sum() function

This adds up all the numbers in the vector.

x <- 1:100
sum(x)
##  5050
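
As a quick check of the result, the sum of the integers from 1 to n has the closed form n(n + 1)/2. The sketch below compares that formula with sum(); the names `n`, `total`, and `formula_total` are illustrative.

```r
n <- 100
total <- sum(1:n)

# Closed-form value for 1 + 2 + ... + n
formula_total <- n * (n + 1) / 2

# TRUE when the two agree
total == formula_total
```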

### The min() and max() functions

These give you the minimum and maximum values.

x <- 1:100
min(x)
##  1
max(x)
##  100

### The mean() function

This gives the usual arithmetic mean, the sum of values divided by the length of the vector.

x <- 1:100
mean(x)
##  50.5
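
The definition can be checked by hand with sum() and length(). This is a minimal sketch on a small made-up vector; `values` and `by_hand` are illustrative names.

```r
values <- c(2, 4, 4, 10)

# Sum of the values divided by how many there are
by_hand <- sum(values) / length(values)
by_hand

# mean() gives the same number
mean(values)
```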

### The median() function

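This gives the value which has half of the numbers in the vector on either side. When the vector has an even number of elements, as in the example below, the median is the average of the two middle values.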

x <- 1:100
median(x)
##  50.5
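
The value 50.5 arises because 1:100 has an even number of elements, so there is no single middle value. The sketch below, using small made-up vectors, contrasts the odd and even cases; all variable names are illustrative.

```r
# Odd number of elements: the median is the middle value after sorting.
odd_count <- c(7, 1, 5)
median(odd_count)

# Even number of elements: the median is the average of the two
# middle values after sorting.
even_count <- c(7, 1, 5, 3)
sorted <- sort(even_count)
middle_two <- sorted[2:3]
mean(middle_two)
median(even_count)
```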

The mean and the median are measures of central location: each is a single number that describes where the center of the numbers in the vector lies.

### The quantile() function

In its basic use this takes two arguments. The first is a vector of numbers. The second is a fraction between 0 and 1. The function returns a number which has approximately that fraction of the numbers in the vector to its left. When the fraction is expressed as a percentage, the result is usually referred to as a percentile.

x <- 1:100
quantile(x,.75)
##   75%
## 75.25

Note that there are many different algorithms for determining quantiles, which give slightly different results. For introductory purposes you can ignore these differences. You can type help(quantile) into R to read the details.
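
To see how small the differences are, quantile() has a `type` argument that selects the algorithm, with 7 as the default. The sketch below compares the default with two of the other types; the variable names are illustrative, and only the default result is taken from the example above.

```r
scores <- 1:100

# type = 7 is the default, so these two calls agree.
default_q <- quantile(scores, 0.75)
type7_q <- quantile(scores, 0.75, type = 7)

# Two of the other available definitions, for comparison.
type1_q <- quantile(scores, 0.75, type = 1)
type6_q <- quantile(scores, 0.75, type = 6)

# Show the three results side by side.
c(default = unname(default_q),
  type1 = unname(type1_q),
  type6 = unname(type6_q))
```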

### The sd() function

This gives the sample standard deviation of the numbers in the vector (the version that divides by n - 1). It is a measure of the variation in the numbers in the vector.

x <- 1:100
sd(x)
##  29.01149
x <- 1:200
sd(x)
##  57.87918

Note that the numbers in the second version of x are more spread out than those in the first version. Comparing the two values of the standard deviation confirms this.
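
The calculation behind sd() can be reproduced step by step. This is a minimal sketch on a small made-up vector, assuming the usual sample formula with n - 1 in the denominator; `obs`, `deviations`, and `by_hand_sd` are illustrative names.

```r
obs <- c(2, 4, 4, 10)

# Distance of each value from the mean
deviations <- obs - mean(obs)

# Sum of squared deviations divided by (n - 1), then the square root
by_hand_sd <- sqrt(sum(deviations^2) / (length(obs) - 1))
by_hand_sd

# sd() gives the same number
sd(obs)
```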

### The IQR() function

The interquartile range is the difference between the 75th percentile and the 25th percentile of the numbers in the vector. It is an alternative measure of variation.

x <- 1:100
IQR(x)
##  49.5
x <- 1:200
IQR(x)
##  99.5

Note that the IQR values indicate that the numbers in the second version of x are more spread out than those in the first version.
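
The definition can be verified with quantile(), which accepts a vector of fractions as its second argument. The sketch below recomputes the interquartile range of 1:100 from its quartiles; `x100`, `q`, and `spread` are illustrative names.

```r
x100 <- 1:100

# 25th and 75th percentiles in one call
q <- quantile(x100, c(0.25, 0.75))
q

# Their difference, with the leftover "75%" label removed
spread <- unname(q[2] - q[1])
spread

# IQR() gives the same number
IQR(x100)
```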

### The summary() function

The summary() function reports the minimum, first quartile, median, mean, third quartile, and maximum of the numbers in the vector.

x <- 1:100
summary(x)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    1.00   25.75   50.50   50.50   75.25  100.00
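
Each entry in this output matches one of the functions described earlier. As a rough sketch, the same six numbers can be assembled by hand; the name `pieces` is illustrative.

```r
x <- 1:100

# Build the six summary values from the individual functions.
pieces <- c(min(x), quantile(x, 0.25), median(x),
            mean(x), quantile(x, 0.75), max(x))

# Drop the percentage labels that quantile() attaches.
pieces <- unname(pieces)
pieces

# Compare with the output of summary()
summary(x)
```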