R Part 1: Basics, Vectors, Matrices

Basics

Syntax and variable assignment in R:

numbersix <- 6
numbersix

## [1] 6


stringsix <- "six"

Modulo operation:

10%%3

## [1] 1
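As a small sketch of how the remainder relates to division, the snippet below pairs %% with the integer-division operator %/%. The variable names (dividend, divisor, quotient, remainder) are made up for illustration and are not part of the original notes.

```r
# Integer division (%/%) gives the whole-number part of a division,
# and modulo (%%) gives what is left over.
dividend <- 10
divisor <- 3

quotient <- dividend %/% divisor    # whole-number part of 10 / 3
remainder <- dividend %% divisor    # leftover after removing 3 * 3

# Together the two pieces recover the original number.
recovered <- quotient * divisor + remainder
recovered == dividend
```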

You can check the data type of a value with the class() function:

class(numbersix)

## [1] "numeric"

class(stringsix)

## [1] "character"
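For comparison, here is a minimal sketch of what class() reports for a few other basic values. The literals used (6L for an integer, TRUE for a logical) are extra examples that do not appear in the notes above.

```r
# class() on a few basic values
class_double  <- class(6)      # "numeric": plain numbers are doubles by default
class_integer <- class(6L)     # "integer": the L suffix asks for an integer
class_logical <- class(TRUE)   # "logical": R's boolean type

# Integers still count as numeric for arithmetic purposes.
integer_is_numeric <- is.numeric(6L)   # TRUE
```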

Vectors

A vector in R is a one-dimensional, ordered collection of values, much like an array in other languages. To make one, use the “combine” function, c():

vec <- c(4, 5, 6)
vec

## [1] 4 5 6

You can give names to vector elements with the names() function:

cardinals <- c("first", "second", "third")
names(vec) <- cardinals
vec

##  first second  third 
##      4      5      6

Adding two vectors of the same length works element by element: the first elements are added together, then the second, and so on, producing a new vector of sums. The result keeps the names of the first vector.

vec2 <- c(1, 2, 3)
vec + vec2

##  first second  third 
##      5      7      9
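The same element-by-element rule applies to the other arithmetic operators. The sketch below uses two unnamed vectors, a and b (hypothetical names), with the same values as above.

```r
a <- c(4, 5, 6)
b <- c(1, 2, 3)

products    <- a * b    # 4 10 18: each pair is multiplied
differences <- a - b    # 3 3 3: each pair is subtracted

# A single number is reused for every element.
shifted <- a + 10       # 14 15 16
```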

A vector can hold only one data type. If you mix types, R converts (coerces) every element to a single common type. In this vector, everything becomes a string because there is one string in it:

vec3 <- c(1, 3, "hi")
vec3

## [1] "1"  "3"  "hi"
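To see the coercion rule from another angle, the sketch below checks the resulting class of a few mixed vectors and shows that a list, unlike a vector, keeps each element's own type. The variable names are illustrative only.

```r
# One string forces the whole vector to character.
mixed <- c(1, 3, "hi")
mixed_class <- class(mixed)                    # "character"

# A logical mixed with a number becomes numeric (TRUE turns into 1).
logical_and_number <- c(TRUE, 2)
logical_class <- class(logical_and_number)     # "numeric"

# A list does not coerce: each element keeps its own type.
items <- list(1, 3, "hi")
first_item_class <- class(items[[1]])          # "numeric"
third_item_class <- class(items[[3]])          # "character"
```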

You can access elements of a vector by index, just like an array, although indices start at 1, not 0:

vec[1]

## first 
##     4

Indexing past the end of a vector does not raise an error; it returns NA:

vec[4]

## <NA> 
##   NA
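Because an out-of-range index fails silently, it can help to check for NA or compare the index against length(). The following is just one possible way to do that, using a hypothetical vector v and index idx.

```r
v <- c(4, 5, 6)

out_of_range <- v[4]
missing_value <- is.na(out_of_range)   # TRUE: nothing is stored at position 4

# Checking the index against the vector's length before using it
idx <- 4
in_bounds <- idx >= 1 && idx <= length(v)   # FALSE, since length(v) is 3
```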

Returning a subset of a vector:

subset <- vec[c(2, 3)]
subset

## second  third 
##      5      6

Or select a contiguous range with the colon operator (from:to):

bigvector <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
subset <- bigvector[3:7]
subset

## [1] 3 4 5 6 7
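The colon operator simply builds a vector of consecutive integers, so indexing with 3:7 is equivalent to listing the positions by hand. The sketch below checks that, and also shows negative indices, which drop positions instead of selecting them (an extra not covered above).

```r
bigvector <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

positions <- 3:7    # the integers 3, 4, 5, 6, 7

# Both forms select the same elements.
same_result <- identical(bigvector[3:7], bigvector[c(3, 4, 5, 6, 7)])   # TRUE

# Negative indices remove the given positions.
without_first_two <- bigvector[-(1:2)]   # 3 4 5 6 7 8 9
```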

To get the average of all elements in a vector:

mean(bigvector)

## [1] 5

Comparison operators are the same as in Java (==, !=, <, >, <=, >=). Applying one to a vector produces a logical (boolean) vector, which you can then use as an index into the original vector. R extracts only the elements whose position is TRUE, here those greater than or equal to 6.

subset <- bigvector >= 6
sixormore <- bigvector[subset]
sixormore

## [1] 6 7 8 9
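The intermediate logical vector is optional; the comparison can go straight inside the brackets. The sketch below uses a made-up vector, scores, so that positions and values differ, and adds which() and sum() as two related helpers.

```r
scores <- c(3, 8, 6, 1, 9)

is_high <- scores >= 6              # FALSE TRUE TRUE FALSE TRUE

high_scores <- scores[scores >= 6]  # 8 6 9: the values that pass the test
high_positions <- which(is_high)    # 2 3 5: where those values sit
high_count <- sum(is_high)          # 3: TRUE counts as 1 when summed
```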

Matrices

A matrix is a table with rows and columns. Construction:

mat <- matrix(1:10, byrow = TRUE, nrow = 3)

## Warning: data length [10] is not a sub-multiple or multiple of the number
## of rows [3]

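“1:10” specifies the items in the matrix (the numbers 1-10). “byrow = TRUE” fills the matrix row by row, left to right; the default, “byrow = FALSE”, fills it column by column, top to bottom (there is no “bycol” argument). “nrow” specifies that the numbers will be arranged into 3 rows. Because 10 is not a multiple of 3, R issues the warning above and builds a 3 x 4 matrix.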

mat

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]    5    6    7    8
## [3,]    9   10    1    2

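Note that the values from 1:10 start repeating in the last row: the matrix has 12 cells but only 10 values, so R recycles the values from the beginning to fill the two leftover cells.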
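To compare the two fill orders without triggering the warning, here is a small sketch with 6 values in 2 rows, so the data divides evenly. The names by_row and by_col are illustrative.

```r
# Fill row by row: values run left to right, then move down.
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

# Default (byrow = FALSE): values run top to bottom, then move right.
by_col <- matrix(1:6, nrow = 2)
by_col
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# dim() reports rows first, then columns.
shape <- dim(by_row)   # 2 3
```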

You can make a matrix from vectors:

malvern <- c("Malvern", "PA")
boston <- c("Boston", "MA")
cambridge <- c("Cambridge", "MA")
towns <- matrix(c(malvern, boston, cambridge), byrow = TRUE, nrow = 3)
towns

##      [,1]        [,2]
## [1,] "Malvern"   "PA"
## [2,] "Boston"    "MA"
## [3,] "Cambridge" "MA"

What happens here is that c() concatenates the three vectors into one long vector of six items ("Malvern", "PA", "Boston", "MA", ...), and matrix() then fills a 3-row matrix with them, row by row.
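The two steps can be looked at separately. The sketch below first inspects the flattened vector, then builds the same table with rbind(), which stacks vectors as rows directly. Using rbind() is an alternative, not what the notes above do, and it labels the rows with the variable names.

```r
malvern <- c("Malvern", "PA")
boston <- c("Boston", "MA")
cambridge <- c("Cambridge", "MA")

# Step 1: c() flattens everything into one vector of six strings.
flat <- c(malvern, boston, cambridge)
flat_length <- length(flat)   # 6

# Step 2: matrix() reshapes that vector into 3 rows.
towns <- matrix(flat, byrow = TRUE, nrow = 3)

# Alternative: rbind() stacks the vectors as rows in one call.
towns_rbind <- rbind(malvern, boston, cambridge)
same_cells <- all(towns_rbind == towns)   # TRUE: same contents, cell by cell
```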

Note: if these elements were numbers, you could sum each row with the rowSums() function.

You can name the rows and columns with the dimnames argument:

toCoffee <- c(4, 5.6, 0.7, 0.5, 0.4, 0.9)
townnames <- c("Malvern", "Boston", "Cambridge")
coltitles <- c("To Starbucks", "To Dunkin")
mat <- matrix(toCoffee, nrow = 3, byrow = TRUE, dimnames = list(townnames, coltitles))
mat

##           To Starbucks To Dunkin
## Malvern            4.0       5.6
## Boston             0.7       0.5
## Cambridge          0.4       0.9

Adding a column to the coffee matrix using cbind():

toPeets <- c(19.4, 3.7, 0.3)
mat <- cbind(mat, toPeets)
mat

##           To Starbucks To Dunkin toPeets
## Malvern            4.0       5.6    19.4
## Boston             0.7       0.5     3.7
## Cambridge          0.4       0.9     0.3

coffeemileage <- rowSums(mat)
coffeemileage

##   Malvern    Boston Cambridge 
##      29.0       4.9       1.6
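rowSums() has column-wise and averaging counterparts. As a sketch, the block below rebuilds the same coffee matrix under the hypothetical name distances and applies colSums() and rowMeans() to it.

```r
distances <- matrix(
  c(4,   5.6, 19.4,
    0.7, 0.5, 3.7,
    0.4, 0.9, 0.3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Malvern", "Boston", "Cambridge"),
                  c("To Starbucks", "To Dunkin", "toPeets"))
)

# Total distance to each chain, summed over the three towns
per_shop_total <- colSums(distances)   # To Starbucks 5.1, To Dunkin 7.0, toPeets 23.4

# Average distance to a coffee shop for each town
per_town_mean <- rowMeans(distances)
```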

Select from a matrix with mat[row, col]. Leaving the row or the column blank selects all rows or all columns. averagedistance is the average distance to any coffee shop across all 3 towns.

averagedistance <- mean(mat[, ])
averagedistance

## [1] 3.944

avgCambridge <- mean(mat[3, ])
avgCambridge

## [1] 0.5333


avgToDunkin <- mean(mat[, 2])
avgToDunkin

## [1] 2.333
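Since the matrix has row and column names, those names can be used in place of numeric positions. The sketch below rebuilds the coffee matrix as distances (a hypothetical name) and shows a few selection forms, including drop = FALSE, which keeps a single row as a matrix instead of a plain vector.

```r
distances <- matrix(
  c(4,   5.6, 19.4,
    0.7, 0.5, 3.7,
    0.4, 0.9, 0.3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Malvern", "Boston", "Cambridge"),
                  c("To Starbucks", "To Dunkin", "toPeets"))
)

# A single cell, by name and by position
boston_dunkin <- distances["Boston", "To Dunkin"]   # 0.5
malvern_peets <- distances[1, 3]                    # 19.4

# Several columns at once gives a smaller matrix (3 rows, 2 columns).
two_shops <- distances[, c("To Starbucks", "To Dunkin")]

# One row comes back as a named vector unless drop = FALSE is given.
row_as_vector <- distances[3, ]
row_as_matrix <- distances[3, , drop = FALSE]
```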

Arithmetic operators work element by element on a matrix: mat/2 divides every coffee distance by 2.

mat/2

##           To Starbucks To Dunkin toPeets
## Malvern           2.00      2.80    9.70
## Boston            0.35      0.25    1.85
## Cambridge         0.20      0.45    0.15
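Other operators behave the same way, and two matrices of the same shape are combined cell by cell. The sketch below assumes, purely for illustration, that the distances are in miles (the notes do not state a unit) and converts them to kilometers; distances is a hypothetical rebuild of the coffee matrix.

```r
distances <- matrix(
  c(4,   5.6, 19.4,
    0.7, 0.5, 3.7,
    0.4, 0.9, 0.3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Malvern", "Boston", "Cambridge"),
                  c("To Starbucks", "To Dunkin", "toPeets"))
)

# Scalar multiplication: every cell is doubled for a round trip.
round_trip <- distances * 2

# Assumption: values are miles; 1 mile = 1.609344 km.
distances_km <- distances * 1.609344

# Matrix minus matrix works cell by cell, so this gives back the one-way trip.
one_way_again <- round_trip - distances
```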