Download R from http://cran.r-project.org/mirrors.html
Recommended supplement but not necessary: RStudio from http://www.rstudio.com/products/rstudio/download/
- RStudio: a (very MATLAB-like) IDE for R
Last Updated: June 24, 2015
Comments are typed with hashtags '#'
# This is a comment
cat("This is not a comment")
## This is not a comment
No block comments. So sad. =(
Examples: 1,2.0,1.1,pi
c(1,2.0,1.1,pi)
## [1] 1.000000 2.000000 1.100000 3.141593
Inf can also be used in calculations
1/Inf
## [1] 0
We can even use complex numbers.
complex(real = 1, imaginary = 2)
## [1] 1+2i
8+6i
## [1] 8+6i
Example: 'One', '1', 'pi'
c('One', '1', 'pi')
## [1] "One" "1" "pi"
Boolean values can take only two values: TRUE (T) or FALSE (F).
c(TRUE, FALSE, TRUE)
## [1] TRUE FALSE TRUE
A factor is a categorical variable that can take on only a finite set of values, e.g. Sex, or Faculty in a university.
factor(c('Male','Female','Male','Male'))
## [1] Male   Female Male   Male  
## Levels: Female Male
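A small sketch of how a factor can be inspected, assuming the same four values as above. The variable name `sex` is introduced here purely for illustration. Levels are sorted alphabetically by default, and each value is stored internally as an integer code pointing into the levels.

```r
# Hypothetical variable name; same data as the factor example above.
sex <- factor(c('Male', 'Female', 'Male', 'Male'))

sex.levels <- levels(sex)      # the finite set of allowed values
sex.codes  <- as.integer(sex)  # internal integer code of each entry
sex.counts <- table(sex)       # number of entries per level

print(sex.levels)
print(sex.codes)
print(sex.counts)
```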
Everything is an object, including functions!
Most common objects are called vectors.
Examples: vector of numbers
a1 <- c(1,2,3)
a1
## [1] 1 2 3
a2 <- c('one','two','three')
a2
## [1] "one" "two" "three"
a3 <- c('1','2','3')
a3
## [1] "1" "2" "3"
You can also create a range of values using start:end
4:10
## [1] 4 5 6 7 8 9 10
4:-3
## [1] 4 3 2 1 0 -1 -2 -3
0.1:4
## [1] 0.1 1.1 2.1 3.1
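The `start:end` form always steps by 1. When another step size is needed, base R's `seq()` can be used instead. A brief sketch (the variable names are illustrative only):

```r
# seq(from, to, by) generalizes start:end to an arbitrary step size.
evens <- seq(2, 10, by = 2)
# length.out asks for a fixed number of evenly spaced points instead.
grid <- seq(0, 1, length.out = 5)

print(evens)
print(grid)
```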
Numerical operations are: +, -, *, /, ^
- These operate elementwise between vectors.
| Operator | Description |
|---|---|
| + | Addition |
| - | Subtraction |
| * | Multiplication |
| / | Division |
| ^ | Power |
c(1,2,3) * c(4,5,6)
## [1] 4 10 18
Note: The vectors don't have to have the same length. If they differ, R recycles the shorter vector along the longer one. The length of the longer vector should be a multiple of the length of the shorter one; otherwise R still recycles but issues a warning.
c(1,2,3) ^ c(1,2,3,4,5,6)
## [1] 1 4 27 1 32 729
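The recycling behaviour can be checked on a smaller case. The sketch below uses made-up vectors and captures the warning that R raises when the longer length is not a multiple of the shorter one.

```r
short <- c(1, 2)
long  <- c(10, 20, 30, 40)

# short is reused as 1, 2, 1, 2 to match the length of long.
recycled <- long + short
print(recycled)

# Lengths 3 and 2 do not divide evenly: R raises a warning,
# which is caught here and returned as a message string.
uneven <- tryCatch(c(1, 2, 3) + c(10, 20),
                   warning = function(w) conditionMessage(w))
print(uneven)
```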
These operators return TRUE or FALSE:
| Operator | Description |
|---|---|
| < | Less than |
| > | Greater than |
| <= | Less than or equal to |
| >= | Greater than or equal to |
| == | Equal to |
| != | Not equal to |
| \| | Elementwise Or |
| & | Elementwise And |
| \|\| | Or |
| && | And |
c(TRUE, FALSE) | c(FALSE, FALSE)
## [1] TRUE FALSE
c(1,2,3) < c(2,1,4)
## [1] TRUE FALSE TRUE
Pro Tip: When combined with numbers, Booleans are converted to integers: FALSE becomes 0 and TRUE becomes 1.
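One common use of this conversion is counting. A minimal sketch with an illustrative vector `flags`:

```r
flags <- c(TRUE, FALSE, TRUE, TRUE)

n.true <- sum(flags)   # TRUE counts as 1, so this counts the TRUE entries
share  <- mean(flags)  # proportion of TRUE entries
both   <- TRUE + TRUE  # arithmetic on Booleans gives an integer

print(n.true)
print(share)
print(both)
```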
To check the type of an object, use is.(typename)
Example: is.vector, is.integer, is.numeric, is.data.frame, is.matrix
To get the type itself, use typeof()
is.numeric(a1)
## [1] TRUE
is.vector(a1)
## [1] TRUE
is.data.frame(a1)
## [1] FALSE
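For comparison, `typeof()` and `class()` report the type rather than testing for it. A short sketch on a fresh vector (the names `nums`, `ints` and `word` are illustrative):

```r
nums <- c(1, 2, 3)
ints <- 1:3
word <- "one"

# typeof() gives the internal storage type.
types <- c(typeof(nums), typeof(ints), typeof(word))
print(types)

# class() gives the higher-level class; plain numbers are "numeric".
print(class(nums))

# c(1, 2, 3) is stored as double, so it is numeric but not integer.
print(is.integer(nums))
print(is.integer(ints))
```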
Assignment can come in 3 forms:
- var_name <- evaluation
- var_name = evaluation
- evaluation -> var_name
x <- 1
x
## [1] 1
Be careful: <- is not the same as < -
x < -1
## [1] FALSE
y = "string"
y
## [1] "string"
"This isn't used much" -> z
z
## [1] "This isn't used much"
They are different vectors! To concatenate two vectors, use c(vector.1, vector.2)
b12 <- c(a1,a2)
b12
## [1] "1" "2" "3" "one" "two" "three"
b23 <- c(a2,a3)
b23
## [1] "one" "two" "three" "1" "2" "3"
b13 <- c(a1,a3)
b13
## [1] "1" "2" "3" "1" "2" "3"
b21 <- c(a2,a1)
b21
## [1] "one" "two" "three" "1" "2" "3"
Notice that when combined with characters, numerics are changed into characters automatically. So b23 == b21.
b23 == b21
## [1] TRUE TRUE TRUE TRUE TRUE TRUE
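The automatic conversion only goes from numeric to character. Going back has to be requested explicitly, for example with `as.numeric()`. A small sketch with illustrative vectors:

```r
digits <- c("1", "2", "3")
# Characters that look like numbers convert cleanly.
shifted <- as.numeric(digits) + 1
print(shifted)

mixed <- c("one", "2")
# "one" cannot be parsed, so it becomes NA (R also raises a warning,
# which is suppressed here to keep the output short).
parsed <- suppressWarnings(as.numeric(mixed))
print(parsed)
```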
b123 <- c(a1,a2,a3)
To take the dot product of two vectors (instead of the elementwise product), use %*%
a1 %*% a1
##      [,1]
## [1,]   14
c(1,4,5) %*% c(6,7,2)
##      [,1]
## [1,]   44
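Note that `%*%` returns a 1-by-1 matrix rather than a plain number. The sketch below, using the same two vectors, shows that the result matches the sum of the elementwise products and that `drop()` turns it into an ordinary scalar.

```r
u <- c(1, 4, 5)
v <- c(6, 7, 2)

dot.matrix <- u %*% v         # 1 x 1 matrix
dot.scalar <- drop(u %*% v)   # same value as a plain number
dot.manual <- sum(u * v)      # elementwise product, then sum

print(dim(dot.matrix))
print(dot.scalar)
print(dot.manual)
```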
- What will the class of c(1,2,"three") be?
- What is c(3,4,5,6) to the power of 4?
- Which entries of c(3,4,5,6) are greater than 4?
What will the class of c(1,2,"three") be?
class(c(1,2,"three"))
## [1] "character"
What is c(3,4,5,6) to the power of 4?
c(3,4,5,6) ^ 4
## [1] 81 256 625 1296
Which entries of c(3,4,5,6) are greater than 4?
c(3,4,5,6) > 4
## [1] FALSE FALSE TRUE TRUE
Unlike vectors, lists allow us to store multiple structures together.
- Useful when we need to store a list of objects with different datatypes
l12 <- list(a1,a2)
l12
## [[1]]
## [1] 1 2 3
##
## [[2]]
## [1] "one"   "two"   "three"
l23 <- list(a2,a3)
l23
## [[1]]
## [1] "one"   "two"   "three"
##
## [[2]]
## [1] "1" "2" "3"
l13 <- list(a1,a3)
l13
## [[1]]
## [1] 1 2 3
##
## [[2]]
## [1] "1" "2" "3"
Notice they are stored in two 'different arrays'
as.vector and as.list convert between lists and vectors. Note that as.vector leaves a list unchanged, because a list is already a (generic) vector.
as.vector(l23)
## [[1]]
## [1] "one"   "two"   "three"
##
## [[2]]
## [1] "1" "2" "3"
as.list(a1)
## [[1]]
## [1] 1
##
## [[2]]
## [1] 2
##
## [[3]]
## [1] 3
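To flatten a list into a single atomic vector, base R provides `unlist()`. A minimal sketch with illustrative data mirroring the lists above:

```r
words.list <- list(c("one", "two", "three"), c("1", "2", "3"))

# unlist() concatenates all list entries into one atomic vector.
flat <- unlist(words.list)
print(flat)

# Round trip: vector -> list -> vector.
back <- unlist(as.list(c(1, 2, 3)))
print(back)
```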
q1 = 1:10 #Question 1
q1c = as.list(as.character(seq(2.1,2.5,0.1)))
q1 + as.numeric(q1c) # Question 2
## [1] 3.1 4.2 5.3 6.4 7.5 8.1 9.2 10.3 11.4 12.5
x1 = rnorm(7,mean = 13, sd = 5) #Question 3
x2 = rnorm(7,mean = 13, sd = 5)
x1 %*% x2 #Question 4
##          [,1]
## [1,] 695.1124
M1 <- matrix(c(1,2,3,4,5,6,7,8,9),nrow=3,ncol=3)
M1
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
M2 <- matrix(9:1 ,3 ,3)
M2
##      [,1] [,2] [,3]
## [1,]    9    6    3
## [2,]    8    5    2
## [3,]    7    4    1
M3 <- matrix(c(a1,a2),2,3)
M3
##      [,1] [,2]  [,3]   
## [1,] "1"  "3"   "two"  
## [2,] "2"  "one" "three"
df1 <- data.frame(a1,a2,a3)
df1
##   a1    a2 a3
## 1  1   one  1
## 2  2   two  2
## 3  3 three  3
| Attribute | Description |
|---|---|
| names | Names of an object |
| dimnames | Names of the dimensions of an object |
| dim | Dimension of an object |
| class | Class of an object |
| length | Length of an object |
length(a1)
## [1] 3
names(a1) = c("a","b","c")
a1
## a b c 
## 1 2 3
names(df1) = c("var_1","var_2","var_3")
df1
##   var_1 var_2 var_3
## 1     1   one     1
## 2     2   two     2
## 3     3 three     3
dim(M1)
## [1] 3 3
Indices work just like in linear algebra: for a vector, specify the entry; for a matrix, specify the row first, then the column.
a1[2] # Second entry
## b 
## 2
M1[1,2] #First row second column
## [1] 4
df1[2,3] # Second row third column
## [1] 2
## Levels: 1 2 3
M1[1,] # First row
## [1] 1 4 7
M1[,3] # Third column
## [1] 7 8 9
You can also use Boolean values to get a subset of values:
a1[a1 <= 2]
## a b 
## 1 2
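Conditions can be combined with the elementwise operators before subsetting, and `which()` gives the positions instead of the values. A short sketch on an illustrative vector `v`:

```r
v <- c(5, 12, 7, 20)

# Elementwise And: TRUE only where both conditions hold.
keep <- v > 6 & v < 15

selected  <- v[keep]      # the values that satisfy the condition
positions <- which(keep)  # their indices within v

print(keep)
print(selected)
print(positions)
```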
Accessing the elements of a list is slightly different. Use double [[]] notation:
l13[[1]]
## [1] 1 2 3
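The difference between single and double brackets on a list can be sketched as follows. The list `items` and its entry names are made up for illustration. Single brackets return a shorter list, double brackets return the entry itself, and named entries can also be reached with `$`.

```r
items <- list(nums = c(1, 2, 3), words = c("one", "two"))

sub.list <- items[1]     # still a list, of length 1
first    <- items[[1]]   # the numeric vector itself
by.name  <- items$words  # same as items[["words"]]

print(class(sub.list))
print(first)
print(by.name)
```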
rownames(M1) <- c('Ein','Zwei','Drei')
colnames(M1) <- c('Un','Deux','Trois')
M1
##      Un Deux Trois
## Ein   1    4     7
## Zwei  2    5     8
## Drei  3    6     9
rownames(df1) <- c('Uno','Dos','Tres')
colnames(df1) <- c('yi','er','san')
df1
##      yi    er san
## Uno   1   one   1
## Dos   2   two   2
## Tres  3 three   3
rbind() adds new rows and cbind() adds new columns to a matrix or data frame.
M1.rbind <- rbind(M1,M1)
M1.rbind
##      Un Deux Trois
## Ein   1    4     7
## Zwei  2    5     8
## Drei  3    6     9
## Ein   1    4     7
## Zwei  2    5     8
## Drei  3    6     9
M2.rbind <- rbind(M1,M2) # Notice the names of columns and rows
M2.rbind
##      Un Deux Trois
## Ein   1    4     7
## Zwei  2    5     8
## Drei  3    6     9
##       9    6     3
##       8    5     2
##       7    4     1
M1.cbind <- cbind(M1,M1)
M1.cbind
##      Un Deux Trois Un Deux Trois
## Ein   1    4     7  1    4     7
## Zwei  2    5     8  2    5     8
## Drei  3    6     9  3    6     9
df1$yi
## [1] 1 2 3
df1$er
## [1] one   two   three
## Levels: one three two
df1$san
## [1] 1 2 3
## Levels: 1 2 3
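The `$` operator can also be used on the left-hand side of an assignment to add or overwrite a column. A minimal sketch on a made-up data frame `people` (not part of the examples above):

```r
# Hypothetical data frame for illustration.
people <- data.frame(name = c("Ann", "Bob"), age = c(31, 45))

# Assigning to a column that does not exist yet creates it.
people$age.months <- people$age * 12

print(people)
print(names(people))
```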
read.file_type(file = "Name.file_type", header = TRUE, sep = "")
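As one concrete instance of this pattern, `read.csv()` reads comma-separated files. The sketch below writes a few rows of the built-in `iris` data to a temporary file and reads them back; the temporary path and the name `iris.head` are illustrative only.

```r
# Write a small CSV to a temporary location so the example is self-contained.
tmp <- tempfile(fileext = ".csv")
write.csv(head(iris), tmp, row.names = FALSE)

# header = TRUE: the first line holds the column names; sep = ",": comma-separated.
iris.head <- read.csv(file = tmp, header = TRUE, sep = ",")
str(iris.head)

unlink(tmp)  # remove the temporary file
```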
attach(iris)
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
summary(iris)
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
##        Species  
##  setosa    :50  
##  versicolor:50  
##  virginica :50  
print(iris)
##     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
## 1            5.1         3.5          1.4         0.2     setosa
## 2            4.9         3.0          1.4         0.2     setosa
## 3            4.7         3.2          1.3         0.2     setosa
## 4            4.6         3.1          1.5         0.2     setosa
## 5            5.0         3.6          1.4         0.2     setosa
## 6            5.4         3.9          1.7         0.4     setosa
## 7            4.6         3.4          1.4         0.3     setosa
## 8            5.0         3.4          1.5         0.2     setosa
## 9            4.4         2.9          1.4         0.2     setosa
## 10           4.9         3.1          1.5         0.1     setosa
## 11           5.4         3.7          1.5         0.2     setosa
## 12           4.8         3.4          1.6         0.2     setosa
## 13           4.8         3.0          1.4         0.1     setosa
## 14           4.3         3.0          1.1         0.1     setosa
## 15           5.8         4.0          1.2         0.2     setosa
## 16           5.7         4.4          1.5         0.4     setosa
## 17           5.4         3.9          1.3         0.4     setosa
## 18           5.1         3.5          1.4         0.3     setosa
## 19           5.7         3.8          1.7         0.3     setosa
## 20           5.1         3.8          1.5         0.3     setosa
## 21           5.4         3.4          1.7         0.2     setosa
## 22           5.1         3.7          1.5         0.4     setosa
## 23           4.6         3.6          1.0         0.2     setosa
## 24           5.1         3.3          1.7         0.5     setosa
## 25           4.8         3.4          1.9         0.2     setosa
## 26           5.0         3.0          1.6         0.2     setosa
## 27           5.0         3.4          1.6         0.4     setosa
## 28           5.2         3.5          1.5         0.2     setosa
## 29           5.2         3.4          1.4         0.2     setosa
## 30           4.7         3.2          1.6         0.2     setosa
## 31           4.8         3.1          1.6         0.2     setosa
## 32           5.4         3.4          1.5         0.4     setosa
## 33           5.2         4.1          1.5         0.1     setosa
## 34           5.5         4.2          1.4         0.2     setosa
## 35           4.9         3.1          1.5         0.2     setosa
## 36           5.0         3.2          1.2         0.2     setosa
## 37           5.5         3.5          1.3         0.2     setosa
## 38           4.9         3.6          1.4         0.1     setosa
## 39           4.4         3.0          1.3         0.2     setosa
## 40           5.1         3.4          1.5         0.2     setosa
## 41           5.0         3.5          1.3         0.3     setosa
## 42           4.5         2.3          1.3         0.3     setosa
## 43           4.4         3.2          1.3         0.2     setosa
## 44           5.0         3.5          1.6         0.6     setosa
## 45           5.1         3.8          1.9         0.4     setosa
## 46           4.8         3.0          1.4         0.3     setosa
## 47           5.1         3.8          1.6         0.2     setosa
## 48           4.6         3.2          1.4         0.2     setosa
## 49           5.3         3.7          1.5         0.2     setosa
## 50           5.0         3.3          1.4         0.2     setosa
## 51           7.0         3.2          4.7         1.4 versicolor
## 52           6.4         3.2          4.5         1.5 versicolor
## 53           6.9         3.1          4.9         1.5 versicolor
## 54           5.5         2.3          4.0         1.3 versicolor
## 55           6.5         2.8          4.6         1.5 versicolor
## 56           5.7         2.8          4.5         1.3 versicolor
## 57           6.3         3.3          4.7         1.6 versicolor
## 58           4.9         2.4          3.3         1.0 versicolor
## 59           6.6         2.9          4.6         1.3 versicolor
## 60           5.2         2.7          3.9         1.4 versicolor
## 61           5.0         2.0          3.5         1.0 versicolor
## 62           5.9         3.0          4.2         1.5 versicolor
## 63           6.0         2.2          4.0         1.0 versicolor
## 64           6.1         2.9          4.7         1.4 versicolor
## 65           5.6         2.9          3.6         1.3 versicolor
## 66           6.7         3.1          4.4         1.4 versicolor
## 67           5.6         3.0          4.5         1.5 versicolor
## 68           5.8         2.7          4.1         1.0 versicolor
## 69           6.2         2.2          4.5         1.5 versicolor
## 70           5.6         2.5          3.9         1.1 versicolor
## 71           5.9         3.2          4.8         1.8 versicolor
## 72           6.1         2.8          4.0         1.3 versicolor
## 73           6.3         2.5          4.9         1.5 versicolor
## 74           6.1         2.8          4.7         1.2 versicolor
## 75           6.4         2.9          4.3         1.3 versicolor
## 76           6.6         3.0          4.4         1.4 versicolor
## 77           6.8         2.8          4.8         1.4 versicolor
## 78           6.7         3.0          5.0         1.7 versicolor
## 79           6.0         2.9          4.5         1.5 versicolor
## 80           5.7         2.6          3.5         1.0 versicolor
## 81           5.5         2.4          3.8         1.1 versicolor
## 82           5.5         2.4          3.7         1.0 versicolor
## 83           5.8         2.7          3.9         1.2 versicolor
## 84           6.0         2.7          5.1         1.6 versicolor
## 85           5.4         3.0          4.5         1.5 versicolor
## 86           6.0         3.4          4.5         1.6 versicolor
## 87           6.7         3.1          4.7         1.5 versicolor
## 88           6.3         2.3          4.4         1.3 versicolor
## 89           5.6         3.0          4.1         1.3 versicolor
## 90           5.5         2.5          4.0         1.3 versicolor
## 91           5.5         2.6          4.4         1.2 versicolor
## 92           6.1         3.0          4.6         1.4 versicolor
## 93           5.8         2.6          4.0         1.2 versicolor
## 94           5.0         2.3          3.3         1.0 versicolor
## 95           5.6         2.7          4.2         1.3 versicolor
## 96           5.7         3.0          4.2         1.2 versicolor
## 97           5.7         2.9          4.2         1.3 versicolor
## 98           6.2         2.9          4.3         1.3 versicolor
## 99           5.1         2.5          3.0         1.1 versicolor
## 100          5.7         2.8          4.1         1.3 versicolor
## 101          6.3         3.3          6.0         2.5  virginica
## 102          5.8         2.7          5.1         1.9  virginica
## 103          7.1         3.0          5.9         2.1  virginica
## 104          6.3         2.9          5.6         1.8  virginica
## 105          6.5         3.0          5.8         2.2  virginica
## 106          7.6         3.0          6.6         2.1  virginica
## 107          4.9         2.5          4.5         1.7  virginica
## 108          7.3         2.9          6.3         1.8  virginica
## 109          6.7         2.5          5.8         1.8  virginica
## 110          7.2         3.6          6.1         2.5  virginica
## 111          6.5         3.2          5.1         2.0  virginica
## 112          6.4         2.7          5.3         1.9  virginica
## 113          6.8         3.0          5.5         2.1  virginica
## 114          5.7         2.5          5.0         2.0  virginica
## 115          5.8         2.8          5.1         2.4  virginica
## 116          6.4         3.2          5.3         2.3  virginica
## 117          6.5         3.0          5.5         1.8  virginica
## 118          7.7         3.8          6.7         2.2  virginica
## 119          7.7         2.6          6.9         2.3  virginica
## 120          6.0         2.2          5.0         1.5  virginica
## 121          6.9         3.2          5.7         2.3  virginica
## 122          5.6         2.8          4.9         2.0  virginica
## 123          7.7         2.8          6.7         2.0  virginica
## 124          6.3         2.7          4.9         1.8  virginica
## 125          6.7         3.3          5.7         2.1  virginica
## 126          7.2         3.2          6.0         1.8  virginica
## 127          6.2         2.8          4.8         1.8  virginica
## 128          6.1         3.0          4.9         1.8  virginica
## 129          6.4         2.8          5.6         2.1  virginica
## 130          7.2         3.0          5.8         1.6  virginica
## 131          7.4         2.8          6.1         1.9  virginica
## 132          7.9         3.8          6.4         2.0  virginica
## 133          6.4         2.8          5.6         2.2  virginica
## 134          6.3         2.8          5.1         1.5  virginica
## 135          6.1         2.6          5.6         1.4  virginica
## 136          7.7         3.0          6.1         2.3  virginica
## 137          6.3         3.4          5.6         2.4  virginica
## 138          6.4         3.1          5.5         1.8  virginica
## 139          6.0         3.0          4.8         1.8  virginica
## 140          6.9         3.1          5.4         2.1  virginica
## 141          6.7         3.1          5.6         2.4  virginica
## 142          6.9         3.1          5.1         2.3  virginica
## 143          5.8         2.7          5.1         1.9  virginica
## 144          6.8         3.2          5.9         2.3  virginica
## 145          6.7         3.3          5.7         2.5  virginica
## 146          6.7         3.0          5.2         2.3  virginica
## 147          6.3         2.5          5.0         1.9  virginica
## 148          6.5         3.0          5.2         2.0  virginica
## 149          6.2         3.4          5.4         2.3  virginica
## 150          5.9         3.0          5.1         1.8  virginica
WARNING: Do not use lapply or sapply under normal circumstances.
sapply(iris$Sepal.Width,floor)
##   [1] 3 3 3 3 3 3 3 3 2 3 3 3 3 3 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 3
##  [36] 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 2 2 2 3 2 2 2 2 3 2 2 2 3 3 2 2 2
##  [71] 3 2 2 2 2 3 2 3 2 2 2 2 2 2 3 3 3 2 3 2 2 3 2 2 2 3 2 2 2 2 3 2 3 2 3
## [106] 3 2 2 2 3 3 2 3 2 2 3 3 3 2 2 3 2 2 2 3 3 2 3 2 3 2 3 2 2 2 3 3 3 3 3
## [141] 3 3 2 3 3 3 2 3 3 3
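lapply(iris$Sepal.Width,floor) # Returns a list with one entry per element (150 entries, output omitted). Column names are case sensitive: iris$Sepal.width is NULL, so lapply would return an empty list()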
floor(iris$Sepal.Width)
##   [1] 3 3 3 3 3 3 3 3 2 3 3 3 3 3 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 3
##  [36] 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 2 2 2 3 2 2 2 2 3 2 2 2 3 3 2 2 2
##  [71] 3 2 2 2 2 3 2 3 2 2 2 2 2 2 3 3 3 2 3 2 2 3 2 2 2 3 2 2 2 2 3 2 3 2 3
## [106] 3 2 2 2 3 3 2 3 2 2 3 3 3 2 2 3 2 2 2 3 3 2 3 2 3 2 3 2 2 2 3 3 3 3 3
## [141] 3 3 2 3 3 3 2 3 3 3
Notice that this returns the same thing as sapply, so in most cases there is no reason to use sapply.
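One situation outside those normal circumstances, sketched here under the assumption that the data is a list of vectors with different lengths: a function such as `mean` cannot be applied to the whole list at once, so `sapply` is a reasonable way to call it once per entry. The list `groups` is made up for illustration.

```r
# Hypothetical list of groups with unequal sizes.
groups <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# mean() works on one vector at a time, so apply it entry by entry;
# sapply() simplifies the result to a named numeric vector.
group.means <- sapply(groups, mean)
print(group.means)
```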
apply applies a function to a matrix by row, by column, or both
apply(M2,1,min)# Minimum for each row
## [1] 3 2 1
apply(M2,2,min)# Minimum for each column
## [1] 7 4 1
apply(M2,c(1,2),min)# Applied to each entry separately, so this returns M2 unchanged (not the same as min(M2))
##      [,1] [,2] [,3]
## [1,]    9    6    3
## [2,]    8    5    2
## [3,]    7    4    1
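For sums and means, base R also ships dedicated functions (`rowSums`, `colSums`, `rowMeans`, `colMeans`) that give the same results as the corresponding `apply` calls. A brief sketch on a matrix with the same entries as M2 (named `M` here so the example stands alone):

```r
M <- matrix(9:1, 3, 3)

row.totals.apply <- apply(M, 1, sum)   # sum of each row via apply
row.totals       <- rowSums(M)         # dedicated equivalent

col.means.apply <- apply(M, 2, mean)   # mean of each column via apply
col.means       <- colMeans(M)         # dedicated equivalent

print(row.totals)
print(col.means)
```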
Functions can predefine default values for their arguments and can take in vectors instead of scalars.
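random.walk <- function(n = 1000, p = 0.5, start = 0, min = 0, max = 1){
  # Draw n uniforms on [min, max] and rescale them to [0, 1]
  rand <- (runif(n = n, min = min, max = max) - min)/(max - min)
  # Step +1 when rand > p and -1 when rand < p
  steps <- sign(2*(rand - p))
  # Running position after each step
  out <- start + cumsum(steps)
  return(out)
}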
plot(random.walk(),type = "l")
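The two points about arguments can be seen in isolation on a much smaller function. The function `scale.values` and its argument `times` are hypothetical, chosen only to illustrate default values and vector arguments.

```r
# Hypothetical helper: multiplies x by `times`, which defaults to 2.
scale.values <- function(x, times = 2) {
  x * times
}

default.call <- scale.values(1:3)                        # uses times = 2
named.call   <- scale.values(1:3, times = 10)            # overrides the default
vector.call  <- scale.values(1:3, times = c(1, 10, 100)) # vector argument, elementwise

print(default.call)
print(named.call)
print(vector.call)
```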
Use the iris dataset in R to build a matrix called iris.matrix as follows:
1. The columns and rows of iris correspond to the columns and rows of iris.matrix
2. Change the Species column of iris.matrix into the following indicator values
- 1 - setosa, 2 - versicolor, 3 - virginica
3. Get the mean of every column except for the Species column
4. Subtract the respective mean from each column except for the Species column
5. Produce the summary of the new matrix
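One possible way to work through the steps above is sketched here; it is not the only solution. It assumes that the factor levels of `iris$Species` are in the order setosa, versicolor, virginica (which is the case for the built-in dataset), so `as.integer()` yields the codes 1, 2, 3. The helper name `col.means` is illustrative.

```r
# Steps 1 and 2: numeric columns as a matrix, Species replaced by integer codes.
iris.matrix <- cbind(as.matrix(iris[, 1:4]),
                     Species = as.integer(iris$Species))

# Step 3: mean of every column except Species.
col.means <- colMeans(iris.matrix[, 1:4])
print(col.means)

# Step 4: subtract each column's mean (sweep subtracts along columns by default).
iris.matrix[, 1:4] <- sweep(iris.matrix[, 1:4], 2, col.means)

# Step 5: summary of the centered matrix.
print(summary(iris.matrix))
```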
- browser() is useful for debugging (more on this later)
- ?function_name is useful to look up what a function does
Interesting reads:
- ggplot2: Plot package based around "the grammar of graphics"
- data.table: Package showcasing a faster version of data.frame
R Programming Reference: http://rpubs.com/uwaterloodatateam/r-programming-reference
Learn more R using swirl: http://swirlstats.com/students.html
Contribute useful code snippets: https://github.com/uWaterlooDataTeam/r-programming-tutorial