Syntax and variable assignment in R:
numbersix <- 6
numbersix
## [1] 6
stringsix <- "six"
Modulo operation:
10%%3
## [1] 1
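A small sketch of how modulo pairs with integer division (`%/%`), in case the remainder alone isn't enough. The variable names are made up for illustration.

```r
# Remainder and whole-number quotient of the same division
remainder <- 10 %% 3    # 1
quotient <- 10 %/% 3    # 3

# The two pieces always rebuild the original number
rebuilt <- quotient * 3 + remainder   # 10

# With a negative left operand, the result takes the sign of the divisor
negative_case <- (-10) %% 3           # 2
```

Note that this differs from Java, where `-10 % 3` gives -1.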
Can check a value's data type with class():
class(numbersix)
## [1] "numeric"
class(stringsix)
## [1] "character"
A vector in R is basically a one-dimensional array. To make one, use c(), the “combine” function:
vec <- c(4, 5, 6)
vec
## [1] 4 5 6
Can give names to vector elements with the “names” function:
cardinals <- c("first", "second", "third")
names(vec) <- cardinals
vec
##  first second  third
##      4      5      6
Adding 2 vectors adds the elements in matching positions, one by one, creating a new vector of sums.
vec2 <- c(1, 2, 3)
vec + vec2
##  first second  third
##      5      7      9
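The element-by-element rule also covers vectors of different lengths: R reuses ("recycles") the shorter one. A rough sketch, with `long` and `short` as made-up names:

```r
vec <- c(4, 5, 6)

# A single number is a length-1 vector, so it is reused for every slot
doubled <- vec * 2                 # 8 10 12

# The shorter vector repeats until it covers the longer one
long <- c(1, 2, 3, 4, 5, 6)
short <- c(10, 20)
recycled <- long + short           # 11 22 13 24 15 26
```

If the longer length is not a multiple of the shorter one, R still recycles but prints a warning.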
A vector can't hold different data types. In this vector, everything gets coerced to a string just because there's one string in there:
vec3 <- c(1, 3, "hi")
vec3
## [1] "1" "3" "hi"
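A quick sketch of the coercion behavior, plus a list as one way to keep mixed types. The names `mixed`, `flags`, `back`, and `items` are just for illustration.

```r
# One string turns the whole vector into character
mixed <- c(1, 3, "hi")
class(mixed)                       # "character"

# Logical mixed with numeric becomes numeric (TRUE turns into 1)
flags <- c(TRUE, 2)
class(flags)                       # "numeric"

# Converting back: anything that isn't a number becomes NA (R also warns,
# which is silenced here with suppressWarnings)
back <- suppressWarnings(as.numeric(mixed))   # 1 3 NA

# A list keeps each element's own type
items <- list(1, 3, "hi")
class(items[[1]])                  # "numeric"
class(items[[3]])                  # "character"
```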
You can access elements of a vector just like an array (although indices start at 1, not 0)
vec[1]
## first
##     4
When you index past the end of a vector, R returns NA instead of raising an error. The name prints as <NA> because vec is a named vector and there is no fourth name.
vec[4]
## <NA>
##   NA
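A few other indexing edge cases, sketched on an unnamed copy of the same vector (variable names are made up):

```r
vec <- c(4, 5, 6)
length(vec)                # 3, so valid positions are 1 through 3

# Past the end: NA rather than an error
past_end <- vec[4]
is.na(past_end)            # TRUE

# A negative index drops that position instead of selecting it
without_first <- vec[-1]   # 5 6

# Index 0 selects nothing and returns an empty vector
empty <- vec[0]
length(empty)              # 0
```

Checking `length()` first is one way to avoid silently picking up an NA.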
Returning a subset of a vector:
subset <- vec[c(2, 3)]
subset
## second  third
##      5      6
Or, using “from:to” syntax (both ends are included):
bigvector <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
subset <- bigvector[3:7]
subset
## [1] 3 4 5 6 7
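The colon operator just builds a vector of positions, so any vector of positions works as an index. A small sketch (the names `backwards` and `every_other` are made up):

```r
bigvector <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# Counting down works too and returns the elements in that order
backwards <- bigvector[7:3]                   # 7 6 5 4 3

# seq() builds positions with a step other than 1
every_other <- bigvector[seq(1, 9, by = 2)]   # 1 3 5 7 9
```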
To get the average of all elements in a vector:
mean(bigvector)
## [1] 5
Comparison operators are the same as in Java. Can use a comparison to create a logical (boolean) vector, then use that vector to index the original one. R keeps only the elements where the logical vector is TRUE (here, those greater than or equal to 6).
subset <- bigvector >= 6
sixormore <- bigvector[subset]
sixormore
## [1] 6 7 8 9
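The same filtering can be done without the intermediate variable, and conditions can be combined. A minimal sketch (variable names are made up):

```r
bigvector <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# Put the comparison straight inside the brackets
sixormore <- bigvector[bigvector >= 6]                 # 6 7 8 9

# Combine conditions element by element with & (and) or | (or)
middle <- bigvector[bigvector > 2 & bigvector < 6]     # 3 4 5

# which() gives the positions of the TRUE values rather than the values
positions <- which(bigvector >= 6)                     # 6 7 8 9

# TRUE counts as 1, so sum() counts how many elements match
how_many <- sum(bigvector >= 6)                        # 4
```

Here the positions happen to equal the values only because bigvector is 1 through 9.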
A matrix is a table with rows and columns. Construction:
mat <- matrix(1:10, byrow = TRUE, nrow = 3)
## Warning: data length [10] is not a sub-multiple or multiple of the number
## of rows [3]
“1:10” is what specifies the items in the matrix (numbers 1-10). “byrow = TRUE” fills by rows, left to right; “byrow = FALSE” (the default) fills by columns, top to bottom. There is no “bycol” argument. “nrow = 3” specifies that the numbers 1-10 will be scrunched into 3 rows.
mat
##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]    5    6    7    8
## [3,]    9   10    1    2
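Note that 1:10 starts repeating because we have extra slots: 3 rows need 4 columns (12 slots) to hold 10 values, so R recycles 1 and 2 to fill the last two slots and prints the warning.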
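To see the fill direction without the recycling warning, here is a sketch using 12 values, which fit 3 rows exactly. The names `by_row` and `by_col` are made up.

```r
# 12 values fill a 3 x 4 matrix with nothing left over
by_row <- matrix(1:12, nrow = 3, byrow = TRUE)
by_col <- matrix(1:12, nrow = 3)    # byrow defaults to FALSE

by_row[1, ]    # 1 2 3 4   (first row filled left to right)
by_col[1, ]    # 1 4 7 10  (columns were filled top to bottom)
dim(by_row)    # 3 4
```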
You can make a matrix from vectors:
malvern <- c("Malvern", "PA")
boston <- c("Boston", "MA")
cambridge <- c("Cambridge", "MA")
towns <- matrix(c(malvern, boston, cambridge), byrow = TRUE, nrow = 3)
towns
##      [,1]        [,2]
## [1,] "Malvern"   "PA"
## [2,] "Boston"    "MA"
## [3,] "Cambridge" "MA"
Basically what we're doing is putting all the items (Malvern, PA, Boston, MA, ...) into one long vector, then smushing them into rows and columns.
If these elements were numbers, could sum up each row with the rowSums() function.
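Another way to get the same table, sketched here as an alternative and not what the notes above use: rbind() stacks each vector as its own row, so there is no need to flatten and reshape.

```r
malvern <- c("Malvern", "PA")
boston <- c("Boston", "MA")
cambridge <- c("Cambridge", "MA")

# Each vector becomes one row; the variable names become the row names
towns <- rbind(malvern, boston, cambridge)

rownames(towns)       # "malvern" "boston" "cambridge"
dim(towns)            # 3 2
towns["boston", 2]    # "MA"
```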
Can name the rows and columns:
toCoffee <- c(4, 5.6, 0.7, 0.5, 0.4, 0.9)
townnames <- c("Malvern", "Boston", "Cambridge")
coltitles <- c("To Starbucks", "To Dunkin")
mat <- matrix(toCoffee, nrow = 3, byrow = TRUE, dimnames = list(townnames, coltitles))
mat
##           To Starbucks To Dunkin
## Malvern            4.0       5.6
## Boston             0.7       0.5
## Cambridge          0.4       0.9
Adding a column to the coffee matrix using cbind():
toPeets <- c(19.4, 3.7, 0.3)
mat <- cbind(mat, toPeets)
mat
##           To Starbucks To Dunkin toPeets
## Malvern            4.0       5.6    19.4
## Boston             0.7       0.5     3.7
## Cambridge          0.4       0.9     0.3
coffeemileage <- rowSums(mat)
coffeemileage
##   Malvern    Boston Cambridge
##      29.0       4.9       1.6
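The new column picked up the variable name toPeets as its title. A sketch of one way to give it a title matching the others, plus colSums() as the column-wise counterpart of rowSums(). The "To Peets" label is my own choice, not from the notes above.

```r
toCoffee <- c(4, 5.6, 0.7, 0.5, 0.4, 0.9)
townnames <- c("Malvern", "Boston", "Cambridge")
coltitles <- c("To Starbucks", "To Dunkin")
mat <- matrix(toCoffee, nrow = 3, byrow = TRUE, dimnames = list(townnames, coltitles))

# Naming the argument sets the title of the new column
toPeets <- c(19.4, 3.7, 0.3)
mat <- cbind(mat, "To Peets" = toPeets)
colnames(mat)                  # "To Starbucks" "To Dunkin" "To Peets"

# Total distance per coffee chain, summed over the 3 towns
chainmileage <- colSums(mat)   # 5.1, 7.0, 23.4
```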
Selecting from a matrix goes like mat[row, col]. Leaving the row or the column blank selects all of them. averagedistance is the average distance to any coffee shop across all 3 towns.
averagedistance <- mean(mat[, ])
averagedistance
## [1] 3.944
avgCambridge <- mean(mat[3, ])
avgCambridge
## [1] 0.5333
avgToDunkin <- mean(mat[, 2])
avgToDunkin
## [1] 2.333
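Since the rows and columns are named, the names can stand in for the numbers. A minimal sketch on the two-column version of the matrix (variable names other than mat are made up):

```r
mat <- matrix(c(4, 5.6, 0.7, 0.5, 0.4, 0.9), nrow = 3, byrow = TRUE,
              dimnames = list(c("Malvern", "Boston", "Cambridge"),
                              c("To Starbucks", "To Dunkin")))

# A single cell, by row name and column name
bostonDunkin <- mat["Boston", "To Dunkin"]   # 0.5

# Several rows at once, all columns
massTowns <- mat[c("Boston", "Cambridge"), ]
dim(massTowns)                               # 2 2

# rowMeans() averages every row in one call
avgPerTown <- rowMeans(mat)                  # 4.8, 0.6, 0.65
```

rowMeans() and colMeans() save writing one mean() call per row or column.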
Arithmetic operators work element by element: doing mat/2 will divide every coffee distance by 2.
mat/2
##           To Starbucks To Dunkin toPeets
## Malvern           2.00      2.80    9.70
## Boston            0.35      0.25    1.85
## Cambridge         0.20      0.45    0.15
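The same slot-by-slot behavior applies when both sides are matrices, which is easy to confuse with real matrix multiplication. A rough sketch on an unnamed copy of the first two columns (variable names are made up):

```r
mat <- matrix(c(4, 5.6, 0.7, 0.5, 0.4, 0.9), nrow = 3, byrow = TRUE)

# Matrix plus matrix adds matching slots, same result as mat * 2
doubled <- mat + mat

# The * operator multiplies matching slots; it is NOT matrix multiplication
squared <- mat * mat
squared[1, 1]                # 16

# %*% is the true matrix product: (2 x 3) times (3 x 2) gives 2 x 2
product <- t(mat) %*% mat
dim(product)                 # 2 2
```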