Introduction to R

Environments and Objects

Jason Freels

16 December 2016

OVERVIEW

In this presentation...

R OBJECTS: Elements

In R Elements Have an Atomic Structure

raw \(\rightarrow\) logical \(\rightarrow\) integer \(\rightarrow\) numeric \(\rightarrow\) complex \(\rightarrow\) character


2 < "george"
[1] TRUE
2 > "george"
[1] FALSE
-2 < "-3"
[1] TRUE
-2 < FALSE
[1] TRUE
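
The comparisons above follow from the coercion order: when two modes meet, R converts the operand lower in the chain to the mode of the higher one before it compares them. A minimal sketch, using only base R, that makes those conversions visible (the variable names are illustrative):

```r
# Combining modes in c() coerces everything to the highest mode present
cls_int <- class(c(TRUE, 2L))        # logical + integer   -> "integer"
cls_num <- class(c(2L, 2.5))         # integer + numeric   -> "numeric"
cls_cpx <- class(c(2.5, 1i))         # numeric + complex   -> "complex"
cls_chr <- class(c(2.5, "george"))   # numeric + character -> "character"

# 2 < "george" is really a comparison of the strings "2" and "george"
two_as_text <- as.character(2)

# -2 < FALSE is really -2 < 0, because FALSE is coerced to the number 0
false_as_number <- as.numeric(FALSE)
```

String comparisons depend on the collation order of the current locale, so results such as -2 < "-3" are best not relied on.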

Coercing Elements to Other Atomic Modes

as.complex(-2)
[1] -2+0i
as.character(-2)
[1] "-2"
as.logical(-2) ### Only 0 returns as FALSE
[1] TRUE

Logical operators applied to numeric elements

2 == 3
[1] FALSE
2 > 3
[1] FALSE
2 <= 3
[1] TRUE
2 < 3 | 2 > 3  ### Is EITHER 2<3 OR 2>3 true?
[1] TRUE
2 < 3 & 2 > 3  ### Are BOTH 2<3 AND 2>3 true?
[1] FALSE

Mathematical Functions on Numeric Elements

sqrt(4)
[1] 2
exp(1)
[1] 2.718282
log(1)
[1] 0
log10(1)
[1] 0
ceiling(pi)
[1] 4
floor(pi)
[1] 3
factorial(5)
[1] 120
choose(4,2)
[1] 6

R OBJECTS: Vectors

Vectors are the Base Data Structure in R

The Concatenate Function - c( )

A <- c(1,2,3,4,5) ; A
[1] 1 2 3 4 5
B <- c(6,7,8,9,10) ; B
[1]  6  7  8  9 10
C <- c(A,B,"George") ; C  ### Coerces the numbers in A & B to characters 
 [1] "1"      "2"      "3"      "4"      "5"      "6"      "7"     
 [8] "8"      "9"      "10"     "George"

The Repeat Function - rep( )

D <- rep(1,10); D
 [1] 1 1 1 1 1 1 1 1 1 1
E <- rep(c(1,2,3,4),5); E
 [1] 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
G <- rep(c(1,2,3,4), each=5); G
 [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4

The Sequence Function - seq( )

H <- seq(1,10) ; H
 [1]  1  2  3  4  5  6  7  8  9 10
I <- seq(4,20,by=4) ; I
[1]  4  8 12 16 20
J <- seq(1,2, by=.1) ; J
 [1] 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
K <- seq(10,2, by=-0.5) ; K
 [1] 10.0  9.5  9.0  8.5  8.0  7.5  7.0  6.5  6.0  5.5  5.0  4.5  4.0  3.5
[15]  3.0  2.5  2.0
L <- seq(10,2, length=7) ; L
[1] 10.000000  8.666667  7.333333  6.000000  4.666667  3.333333  2.000000

Vector Structure

str(J)
 num [1:11] 1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 ...

Logical Functions on Vectors

A < B | A > B
[1] TRUE TRUE TRUE TRUE TRUE
A < B || A > B  ### Older R used only the first elements; R >= 4.3.0 gives an error
[1] TRUE
A < B & A > B
[1] FALSE FALSE FALSE FALSE FALSE
A < B && A > B  ### Older R used only the first elements; R >= 4.3.0 gives an error
[1] FALSE
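
The single operators | and & work element by element, while || and && are meant for single TRUE/FALSE values (R 4.3.0 and later raise an error when they receive longer vectors). To collapse an element-wise result into one value, any( ) and all( ) are the usual tools. A small sketch with illustrative vectors x and y:

```r
x <- c(1, 2, 3, 4, 5)
y <- c(6, 7, 8, 9, 10)

elementwise <- x < y | x > y   # one result per pair of elements
any_smaller <- any(x < y)      # TRUE if at least one comparison holds
all_smaller <- all(x < y)      # TRUE only if every comparison holds

# if() needs a single TRUE/FALSE, so reduce the vector first
if (all_smaller) {
  message("every element of x is smaller than its partner in y")
}
```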

Mathematical Functions on Vectors

A + B
[1]  7  9 11 13 15
A * B ## Element-wise multiplication
[1]  6 14 24 36 50
A%*%B ##  Matrix multiplication
     [,1]
[1,]  130
round(sqrt(A),digits = 3)
[1] 1.000 1.414 1.732 2.000 2.236
round(exp(A),digits = 2)
[1]   2.72   7.39  20.09  54.60 148.41
round(log(A),digits = 3)
[1] 0.000 0.693 1.099 1.386 1.609
round(log10(A),digits=3)
[1] 0.000 0.301 0.477 0.602 0.699
sum(A)
[1] 15
cumsum(A)
[1]  1  3  6 10 15
prod(B)
[1] 30240
cumprod(B)
[1]     6    42   336  3024 30240
t(A) ### Returns the transpose of A
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
t(t(A))
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4
[5,]    5
abs(-B/2)
[1] 3.0 3.5 4.0 4.5 5.0
table(c(A,B/2)) ### displays how many times each unique value is observed 

  1   2   3 3.5   4 4.5   5 
  1   1   2   1   2   1   2 
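
The difference between * and %*% is easy to miss, so here is a short sketch that contrasts them. The vectors x and y are illustrative copies of the values used for A and B before any reassignment:

```r
x <- c(1, 2, 3, 4, 5)
y <- c(6, 7, 8, 9, 10)

elementwise <- x * y        # length-5 vector of pairwise products
inner <- x %*% y            # 1 x 1 matrix holding the inner product
outer_prod <- x %*% t(y)    # 5 x 5 matrix (outer product)

# The inner product is the sum of the pairwise products
same_total <- sum(elementwise) == inner[1, 1]
```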

Accessing And Assigning Vector Elements Using Brackets

A ### Recall the value of the vector A defined earlier
[1] 1 2 3 4 5
A[3] ## Returns the 3rd element of A
[1] 3
A[1] <- 0.1 ; A ## Assigns the 1st element of A as 0.1
[1] 0.1 2.0 3.0 4.0 5.0
A[-1] <- 4 ; A ## Assigns all but the 1st element of A as 4 
[1] 0.1 4.0 4.0 4.0 4.0
A < 0.5
[1]  TRUE FALSE FALSE FALSE FALSE
A[4] == 2
[1] FALSE
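
Brackets also accept a logical vector, which selects the elements where the condition is TRUE. A minimal sketch, using an illustrative vector v with the same values A holds at this point:

```r
v <- c(0.1, 4, 4, 4, 4)

small <- v[v < 0.5]     # keep only the elements smaller than 0.5
idx <- which(v == 4)    # positions of the elements equal to 4
v[v == 4] <- 0          # assign to every element that meets the condition
```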

R OBJECTS: Matrices

Matrices Combine Atomic Vectors in a 2-Dimensional Framework

Building Matrices 1 - Specifying Individual Elements

mat <- matrix(data = 1:9, 
              ncol = 3, 
              nrow = 3, 
              byrow = TRUE, 
              dimnames = list(c('A','B','C'),
                              c('D','E','F'))) 
mat
  D E F
A 1 2 3
B 4 5 6
C 7 8 9

Building Matrices 2 - Interactively With edit( )

mat2 <- matrix(1)
mat2 <- edit(mat2)

Building Matrices by Merging Vectors

vec1 <-  1:4
vec2 <-  5:8
vec3 <-  9:12
vec4 <- 13:16
vec.mat <- matrix(data = c(vec1, vec2, vec3, vec4), 
                  ncol = 4) 
vec.mat
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16
rbind(vec1, vec2, vec3, vec4)
     [,1] [,2] [,3] [,4]
vec1    1    2    3    4
vec2    5    6    7    8
vec3    9   10   11   12
vec4   13   14   15   16
cbind(vec1, vec2, vec3, vec4)
     vec1 vec2 vec3 vec4
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

Numerical Operations on Matrices

mat1 <- matrix(sample(1:50,size = 25), 
               nrow = 5, 
               byrow = T) 
mat1
     [,1] [,2] [,3] [,4] [,5]
[1,]   33   14   25   19   49
[2,]    2   37   13   47   36
[3,]   44   21   39    4   46
[4,]    8    9   10   27    7
[5,]   12   15   18   11   17
diag(mat1)  ### Diagonal elements of `mat1`
[1] 33 37 39 27 17
det(mat1)   ### Determinant of `mat1`
[1] 2390956
solve(mat1) ### Returns the inverse of `mat1` if it exists
            [,1]         [,2]        [,3]         [,4]         [,5]
[1,] -0.07517830  0.036648102  0.13213627  0.070335046 -0.247424043
[2,] -0.09558018  0.060913710  0.10881254  0.006456413 -0.150590391
[3,]  0.06416847 -0.066029655 -0.11758058 -0.028874224  0.284919923
[4,]  0.01485264 -0.005037316 -0.01811660  0.037442763  0.001460504
[5,]  0.05984886 -0.006443448 -0.05306413 -0.049000065  0.063724719
t(mat1) ### Returns the transpose of `mat1`
     [,1] [,2] [,3] [,4] [,5]
[1,]   33    2   44    8   12
[2,]   14   37   21    9   15
[3,]   25   13   39   10   18
[4,]   19   47    4   27   11
[5,]   49   36   46    7   17
eigen(mat1) ### Returns the eigenvalues and eigenvectors of `mat1`
$values
[1] 108.921417+0.000000i  30.900944+0.000000i  18.157668+0.000000i
[4]  -2.490014+5.737797i  -2.490014-5.737797i

$vectors
              [,1]           [,2]          [,3]                    [,4]
[1,] -0.5278529+0i  0.31792088+0i -0.3921790+0i -0.50972262+0.24878917i
[2,] -0.4090286+0i -0.81783168+0i  0.8181752+0i -0.38372140+0.29275659i
[3,] -0.6561359+0i  0.47854674+0i -0.1987081+0i  0.57631369+0.00000000i
[4,] -0.2012046+0i -0.03029619+0i -0.3498562+0i  0.04229242-0.07909292i
[5,] -0.2882175+0i -0.01236186+0i  0.1220688+0i  0.13925026-0.29285803i
                        [,5]
[1,] -0.50972262-0.24878917i
[2,] -0.38372140-0.29275659i
[3,]  0.57631369+0.00000000i
[4,]  0.04229242+0.07909292i
[5,]  0.13925026+0.29285803i

Accessing & Assigning matrix components

mat1[4,3]
[1] 10
mat1[5,5]
[1] 17
mat1[7]
[1] 37
mat1[24]
[1] 7
mat1[1,] ### Returns the first row of mat1
[1] 33 14 25 19 49
mat1[,3] ### Returns the third column of mat1
[1] 25 13 39 10 18
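
A single index such as mat1[7] counts down the columns, because R stores matrices in column-major order. The sketch below uses a hypothetical 5 x 5 matrix m (not mat1, whose values come from a random sample) and arrayInd( ) to recover the row and column behind a single index:

```r
m <- matrix(1:25, nrow = 5, byrow = TRUE)  # illustrative 5 x 5 matrix

seventh <- m[7]               # 7th element, counting down the columns
pos <- arrayInd(7, dim(m))    # row and column behind linear index 7
same <- seventh == m[pos[1], pos[2]]
```

For a 5 x 5 matrix, index 7 is the second element of the second column, which is why mat1[7] and mat1[2,2] give the same value.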

R OBJECTS: Arrays

Arrays Are Element Repositories With More Than Two Dimensions
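
As a minimal sketch of the idea, the base R function array( ) builds a three-dimensional array from a vector and a dim argument. The object names are illustrative:

```r
arr <- array(1:24, dim = c(2, 3, 4))  # 2 rows, 3 columns, 4 layers

element <- arr[2, 3, 4]   # single element: row 2, column 3, layer 4
layer1 <- arr[, , 1]      # the first layer is an ordinary 2 x 3 matrix
```

Indexing works as it does for matrices, with one position per dimension.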

R OBJECTS: Data Frames

Data Frames Are The Most Common Data Structure In R

Creating data frames

age<-c(23, 35, 19)              ### Numeric vector
sex<-c("Male", "Female", "Yes") ### Character vector
job<-c(TRUE, TRUE, FALSE)       ### Logical vector
data.frame(age,sex,job, row.names = c("Jim", "Joe", "Ray"))
    age    sex   job
Jim  23   Male  TRUE
Joe  35 Female  TRUE
Ray  19    Yes FALSE
read.csv("http://www.fdic.gov/bank/individual/failed/banklist.csv", header = T)

Accessing Data Frame Components

R installs with several data sets, such as mtcars

head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
class(mtcars)
[1] "data.frame"

Working With Data Frames

str(mtcars)
'data.frame':   32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

The $ operator extracts a data frame column by name, for example mtcars$cyl
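
The $ operator is one of several equivalent ways to reach a column. A short sketch on the built-in mtcars data set (the object names on the left are illustrative):

```r
cyl_dollar <- mtcars$cyl        # by name with $
cyl_double <- mtcars[["cyl"]]   # by name with double brackets
cyl_matrix <- mtcars[, "cyl"]   # matrix style: all rows, column "cyl"

# Rows and columns can be selected together with single brackets
first_rows <- mtcars[1:3, c("mpg", "cyl")]
```

All three column forms return the same numeric vector. The matrix-style form with several columns returns a smaller data frame.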

R OBJECTS: Lists

Lists Are The Most General & Powerful Object Type in R

Example: function outputs. You write a script that takes the cars data set as input and creates several outputs

You can assign each of these objects to a list
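
A rough sketch of that pattern, assuming the built-in mtcars data set stands in for the cars data. The function name summarise_cars and the choice of outputs are hypothetical, chosen only to show objects of different types collected in one list:

```r
# Hypothetical helper: expects a data frame with mpg and wt columns
summarise_cars <- function(df) {
  fit <- lm(mpg ~ wt, data = df)   # linear model of mileage on weight
  list(n_rows   = nrow(df),        # a single number
       mean_mpg = mean(df$mpg),    # another number
       model    = fit)             # a fitted model object
}

out <- summarise_cars(mtcars)
out$n_rows    # components of a named list can be reached with $
names(out)
```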

Atomic Vectors & Lists

V <- c(3,TRUE,"george") ; V
[1] "3"      "TRUE"   "george"
list1 <- list(3,TRUE,"george") ; list1
[[1]]
[1] 3

[[2]]
[1] TRUE

[[3]]
[1] "george"
list2 <- list(head(mtcars),"george", A) ; list2 
[[1]]
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

[[2]]
[1] "george"

[[3]]
[1] 0.1 4.0 4.0 4.0 4.0
str(list2)
List of 3
 $ :'data.frame':   6 obs. of  11 variables:
  ..$ mpg : num [1:6] 21 21 22.8 21.4 18.7 18.1
  ..$ cyl : num [1:6] 6 6 4 6 8 6
  ..$ disp: num [1:6] 160 160 108 258 360 225
  ..$ hp  : num [1:6] 110 110 93 110 175 105
  ..$ drat: num [1:6] 3.9 3.9 3.85 3.08 3.15 2.76
  ..$ wt  : num [1:6] 2.62 2.88 2.32 3.21 3.44 ...
  ..$ qsec: num [1:6] 16.5 17 18.6 19.4 17 ...
  ..$ vs  : num [1:6] 0 0 1 1 0 1
  ..$ am  : num [1:6] 1 1 1 0 0 0
  ..$ gear: num [1:6] 4 4 4 3 3 3
  ..$ carb: num [1:6] 4 4 1 1 2 1
 $ : chr "george"
 $ : num [1:5] 0.1 4 4 4 4

Accessing & Assigning List Components

Objects assigned to a list may be accessed by using double brackets [[ ]]

 list1[[1]]
[1] 3
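
Single and double brackets behave differently on lists: [[ ]] returns the stored object, while [ ] returns a shorter list. A minimal sketch with illustrative names:

```r
lst <- list(3, TRUE, "george")

inner <- lst[[1]]    # the object itself: the number 3
outer <- lst[1]      # a list of length 1 that contains 3
lst[[2]] <- FALSE    # assignment replaces the second component

# Named components can also be reached with $
named <- list(num = 3, name = "george")
who <- named$name
```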

Accessing & Assigning Components Of Objects Inside A List

 list2[[3]][1]
[1] 0.1
 list2[[1]]$mpg
[1] 21.0 21.0 22.8 21.4 18.7 18.1

R OBJECTS: Functions

One Object to Rule Them All

Parts of an R Function

Creating and examining a function

foo <- function(x) { ### Values inside ( ) are interpreted as function arguments
  
  a <- 1 ; b <- 2  ### Statements separated by ";" are interpreted as separate lines 
  c <- a*x+b
  
  return(c)
}                  ### Statements inside { } are interpreted as the function body
foo(2)
[1] 4

Lexical Scoping Rules

b  ### Error: object 'b' not found (b exists only inside foo)

Lexical Scoping Example

Let's demonstrate where foo lives

environment( )
<environment: R_GlobalEnv>
environment(foo)
<environment: R_GlobalEnv>
str(foo) 
function (x)  
 - attr(*, "srcref")=Class 'srcref'  atomic [1:8] 1 8 8 1 8 1 1 8
  .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x000000001b9623b8> 

What about when you're not in a presentation

Are there ways to make a and b available in the current environment?

a <- 1 ; b <- 2 # Defining a and b outside the function, in the global environment
foo <- function(x) { 
        a*x+b
}
foo(2) ; a ; b
[1] 4
[1] 1
[1] 2
foo <- function(x) { 
    a <- 1 ; b <- 2
    c <- a*x+b
    return(list(a,b,c))
}
foo(2)
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 4
foo2<-function(x) { 
    a <<- 1 ; b <<- 2
    c <- a*x+b  ### The value of an assignment is returned invisibly
}
foo2(2) ; a ; b
[1] 1
[1] 2

R Environments

R arrives at the same result regardless of where a and b are defined

This is due to the search procedure defined by the lexical scoping rules

  1. R first searches the environment created by the function call for a and b
  2. If not found there, R searches the function's enclosing environment (where the function was defined)
  3. R then searches each successive parent environment until it finds a and b or reaches the empty environment

The empty environment is the ultimate ancestor of every environment: it is the parent of the base environment, and the global environment reaches it through the attached packages
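
The chain of environments can be inspected directly. The sketch below walks upward from the global environment with parent.env( ) and records each name. The variable names are illustrative, and the packages that appear in the middle of the chain depend on what is attached in the session:

```r
# Collect the names of the environments from the global environment upward
env <- globalenv()
chain <- character(0)
while (!identical(env, emptyenv())) {
  chain <- c(chain, environmentName(env))
  env <- parent.env(env)
}
chain  # starts with "R_GlobalEnv" and ends with "base"
```

The loop stops when it reaches the empty environment, which is the only environment without a parent.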

return( ) With a List Allows Multiple Function Outputs

Use <<- to Assign in Parent Environment

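The deep-assignment operator <<- assigns to an existing variable found in a parent environment, or creates the variable in the global environment if none is found

Use carefully: <<- can overwrite objects outside the function, giving unexpected side effects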
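
One place where <<- is commonly considered safe is inside a closure, where the variable it modifies lives in the enclosing function rather than in the global environment. A minimal sketch, with the hypothetical names make_counter and counter:

```r
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # updates count in the enclosing environment
    count
  }
}

counter <- make_counter()
counter()
counter()
third <- counter()   # count has been incremented three times

# No variable named count was created in the global environment
leaked <- exists("count", envir = globalenv(), inherits = FALSE)
```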

Lexical Scoping Rules Example - 2

Look at the code chunk below

a<-1
b<-2

foo<-function(x) {
    a*x+b
}

bar<-function(x){
  a<-2
  b<-1
  foo(x)
}

bar(2)

What value did you choose?

bar(2)
[1] 4
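
The result is 4 because foo was defined in the global environment, so it looks up a and b there and never sees the values set inside bar. The sketch below repeats the example and adds a hypothetical function baz that defines its helper locally, which changes where the lookup starts:

```r
a <- 1
b <- 2
foo <- function(x) a * x + b           # defined in the global environment

bar <- function(x) {
  a <- 2
  b <- 1
  foo(x)                               # foo still sees the global a and b
}

baz <- function(x) {
  a <- 2
  b <- 1
  foo_local <- function(x) a * x + b   # defined inside baz
  foo_local(x)                         # sees the a and b set in baz
}

res_bar <- bar(2)   # 1 * 2 + 2
res_baz <- baz(2)   # 2 * 2 + 1
```

Where a function is defined, not where it is called, decides which a and b it finds.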

That Was A Lot!