Programming in R

STOR 320
10/04/17

This lecture is about

  • conditionals (if/else)
  • loops (for, while)
  • how to define your own functions
  • basic programming flow

if statements

if (2 > 1) {
  print('fact')
}
[1] "fact"

if statements

if (2 < 1) {
  print('Math is Broken')
}

if/else statements

# Flip a coin 
if (runif(1) < 0.5) {
    print('heads')
} else {
    print('tails')
}
[1] "tails"

else-if statements

r <- runif(1)
# rock paper scissors
if (r < 1/3) {
    print('rock')
} else if (r < 2/3) {
    print('paper')
} else {
    print('scissors')
}
[1] "paper"

Switches

If you have a long chain of if/else-if statements, switch() can be handier. Here is an example from Section 19.4.2 of r4ds. What does it do?

function(x, y, op) {
  switch(op,
    plus = x + y,
    minus = x - y,
    times = x * y,
    divide = x / y,
    stop("Unknown op!")
  )
}
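
To try the switch() example, the function needs a name. A minimal sketch, assuming the hypothetical name calc (the slide leaves the function anonymous):

```r
# calc is a hypothetical name; the body is the switch() example above
calc <- function(x, y, op) {
  switch(op,
    plus = x + y,
    minus = x - y,
    times = x * y,
    divide = x / y,
    stop("Unknown op!")
  )
}

calc(6, 3, "plus")    # 9
calc(6, 3, "divide")  # 2

# An op that matches no name falls through to the unnamed last argument,
# which raises an error; tryCatch() captures the message so the script keeps running
msg <- tryCatch(calc(6, 3, "power"), error = function(e) conditionMessage(e))
msg                   # "Unknown op!"
```

switch() picks the branch whose name matches op, so it replaces a chain of else-if comparisons on the same variable.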

Equality

2*5 == 10
[1] TRUE

Beware of finite precision arithmetic

# oops
sqrt(2)^2 == 2
[1] FALSE

Safety first

dplyr::near(sqrt(2)^2, 2)
[1] TRUE
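
If dplyr is not loaded, base R can do a similar tolerance check. A small sketch; the tolerance 1e-8 in the hand-written version is an arbitrary choice for illustration:

```r
x <- sqrt(2)^2

# all.equal() compares within a small tolerance. Wrap it in isTRUE() because
# it returns a character description, not FALSE, when the values differ.
isTRUE(all.equal(x, 2))   # TRUE

# The same idea written by hand
abs(x - 2) < 1e-8         # TRUE
```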

Vectorized operations

c(1, 1, 1) == c(1, 2, 1)
[1]  TRUE FALSE  TRUE

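When a condition must be a single TRUE or FALSE, as in if or while, use the double forms && and ||. They return one value and short-circuit: the right-hand side is evaluated only when the left-hand side does not already decide the result.

Default to && or || in the condition of an if statement or a while loop

TRUE || FALSE
[1] TRUE

When this lecture was given, the double operators looked only at the first element of each vector, so c(T, T, T) || c(T, F, T) returned TRUE. Since R 4.3.0, passing a vector longer than one is an error.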

Otherwise you get a vector

c(T, T, T) | c(T, F, T)
[1] TRUE TRUE TRUE
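
To collapse an element-wise result into the single TRUE or FALSE that an if condition needs, all() and any() are one option. A short sketch with made-up vectors a and b:

```r
a <- c(TRUE, TRUE, TRUE)
b <- c(TRUE, FALSE, TRUE)

a & b        # element-wise: TRUE FALSE TRUE
all(a & b)   # FALSE, because one pair fails
any(a & b)   # TRUE, because at least one pair passes

if (all(a | b)) {
  print("every position has at least one TRUE")
}
```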

Loops

  • for
  • while

for loops

for (i in 1:10) {
    print("Oh Canada!")
    print(i)
}
[1] "Oh Canada!"
[1] 1
[1] "Oh Canada!"
[1] 2
[1] "Oh Canada!"
[1] 3
[1] "Oh Canada!"
[1] 4
[1] "Oh Canada!"
[1] 5
[1] "Oh Canada!"
[1] 6
[1] "Oh Canada!"
[1] 7
[1] "Oh Canada!"
[1] 8
[1] "Oh Canada!"
[1] 9
[1] "Oh Canada!"
[1] 10

Pre-allocate memory

nums <- vector("double", 10) # or rep(0, 10) or something else
for (i in 1:10) {
  nums[i] <- runif(1)
}

Dynamic allocation is bad: c() copies the whole vector on every iteration

# nums <- c()
# for (i in 1:10) {
#   nums <- c(nums, runif(1))
# }
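
A variation on the pre-allocated loop, as a sketch. The names n and nums2 are only for illustration:

```r
n <- 10
nums <- numeric(n)             # same as vector("double", n)
for (i in seq_along(nums)) {   # seq_along() also behaves when the vector is empty
  nums[i] <- runif(1)
}
length(nums)                   # 10

# Many generators are already vectorized, so this loop can be dropped entirely
nums2 <- runif(n)
```

seq_along(nums) is safer than 1:length(nums): for an empty vector, 1:0 would run the loop body twice.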

while loops

current_position <- 10
n_iter <- 0
while (current_position > 0){
    current_position <- current_position + rnorm(1)
    n_iter <- n_iter + 1
}
print(paste0('you lost all your money after ', n_iter, ' trips to the casino'))
[1] "you lost all your money after 1297 trips to the casino"

Infinite loops

while (TRUE){
    print('Duke sucks')
}
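
A common way to keep a while loop from running forever is an iteration cap plus break. A sketch based on the casino loop; the seed and the cap max_iter are arbitrary assumptions:

```r
set.seed(320)       # arbitrary seed so the run is repeatable
max_iter <- 1000    # hypothetical cap on the number of iterations
position <- 10
n_iter <- 0

while (TRUE) {
  position <- position + rnorm(1)
  n_iter <- n_iter + 1
  # stop when the money is gone or the cap is reached
  if (position <= 0 || n_iter >= max_iter) {
    break
  }
}
n_iter
```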

Vectorization

Try to vectorize anything you can: operate on a whole vector in one call instead of writing an explicit loop over its elements.

sapply(1:10, function(x) x * 2)
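
For simple arithmetic, the operator itself is already vectorized, so sapply() is not needed. A small comparison (the variable names are for illustration only):

```r
x <- 1:10

doubled_sapply <- sapply(x, function(v) v * 2)  # calls the function once per element
doubled_vec <- x * 2                            # one vectorized multiplication

identical(doubled_sapply, doubled_vec)  # TRUE
doubled_vec
# [1]  2  4  6  8 10 12 14 16 18 20
```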

Functions are for humans and computers

  • break long program up into small chunks
  • more readable
  • code reuse
    • catch errors
    • don't have to copy/paste
  • only need to fix code in one place

When to write a function?

You should consider writing a function whenever you’ve copied and pasted a block of code more than twice.

Define a function

power <- function(num, exponent){
    # returns num raised to the exponent
    num ^ exponent
}

power(2, 3)
[1] 8

Default arguments

power <- function(num, exponent=3){
    # returns num raised to the exponent
    num ^ exponent
}

power(2)
[1] 8
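
A default is only used when the caller leaves the argument out. A short sketch of overriding it by position and by name:

```r
power <- function(num, exponent = 3) {
  # returns num raised to the exponent
  num ^ exponent
}

power(2)                      # 8, uses the default exponent
power(2, 4)                   # 16, default overridden by position
power(exponent = 2, num = 5)  # 25, named arguments can come in any order
```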

Return values

random_rps <- function(){
    # randomly returns one of rock, paper or scissors

    r <- runif(1)
    # rock paper scissors
    if (r < 1/3) {
        return('rock')
    } else if (r < 2/3) {
        return('paper')
    } else {
        return('scissors')
    }
}

random_rps()
[1] "paper"
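
The last evaluated expression is returned automatically, so an explicit return() is optional. A shorter sketch of the same idea using sample(); the name random_rps2 is hypothetical:

```r
random_rps2 <- function() {
  # sample() draws one element with equal probability;
  # the value of this last expression is the return value
  sample(c("rock", "paper", "scissors"), size = 1)
}

random_rps2()

# Rough check that the three outcomes are about equally likely
set.seed(1)  # arbitrary seed
table(replicate(3000, random_rps2()))
```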

Write functions in separate scripts and import them

source('fun.R')

helper_fun()
[1] "Im not a very helpful helper function"

Vectors and lists

  • vectors are homogeneous and sequential (1 dimensional)
  • lists are heterogeneous and hierarchical

See Chapter 20 of r4ds.

Vectors have a type

logical (boolean), character, complex, raw, integer and double

Integer vector

c(1L, 2L, 3L) # same as 1:3; c(1, 2, 3) without the L suffix is a double vector
[1] 1 2 3

Boolean vector

# boolean
c(TRUE, FALSE, TRUE) 
[1]  TRUE FALSE  TRUE

Character vector

# string
c('Iain', 'wishes', 'vectors', 'were', 'named', 'lists', 'instead')
[1] "Iain"    "wishes"  "vectors" "were"    "named"   "lists"   "instead"

typeof

typeof(rep(TRUE, 4))
[1] "logical"

Question 1

What are the types of the following vectors?

# a
c(1, 2, 'three')

# b
c(TRUE, TRUE, "FALSE")

# c
c(1, 2, 3.1)

Explicit coercion

as.integer(c('1', '2', '3'))
[1] 1 2 3

Implicit coercion

c(1, 2, TRUE)
[1] 1 2 1

Vectorized operations and implicit coercion

sum(c(-2, -1, 1, 2) > 0)
[1] 2
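
The sum works because each TRUE is coerced to 1 and each FALSE to 0. Spelled out step by step (x is just a name for the same vector):

```r
x <- c(-2, -1, 1, 2)

x > 0               # FALSE FALSE  TRUE  TRUE
as.integer(x > 0)   # 0 0 1 1
sum(x > 0)          # 2, the count of positive entries
mean(x > 0)         # 0.5, the proportion of positive entries
```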

Subsetting a vector

v <- 11:20
v[3]
[1] 13
v[c(1,10)]
[1] 11 20
v[v %% 2 == 0]
[1] 12 14 16 18 20
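
A few more ways to subset the same vector, as a sketch: negative indices drop elements, which() gives positions, and logical conditions can be combined with &.

```r
v <- 11:20

v[-1]                     # drop the first element: 12 13 ... 20
v[-(1:8)]                 # drop the first eight: 19 20
which(v %% 2 == 0)        # positions of the even values: 2 4 6 8 10
v[v > 15 & v %% 2 == 0]   # even values above 15: 16 18 20
```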

Numbers to words

Create the numbers2words function from https://github.com/ateucher/useful_code/blob/master/R/numbers2words.r

numbers2words(312)
[1] "three hundred twelve"

Sub strings

substr('Iain', 2, 3)
[1] "ai"

Question 2

How many numbers from 1 to 4869 are divisible by three and have an English name that starts with the letter n?

sum(1:4869 %% 3 == 0 & substr(numbers2words(1:4869), 1, 1) == 'n')
[1] 39

Lists

Lists can contain objects of multiple types, and their elements can be accessed by name as well as by position.

Make a list

L <- list(number=1, letter='a', bool=TRUE)
L
$number
[1] 1

$letter
[1] "a"

$bool
[1] TRUE

Access elements of a list with [[]]

To access elements of a list use [[]]

L[['number']]
[1] 1

A single bracket returns a list

You can also use a single [], but this returns a list rather than the element itself.

L['number']
$number
[1] 1

See Section 20.5.3 for the difference between [] and [[]].
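
One way to see the difference is to check the type of what each form returns. A small sketch that rebuilds the same list L:

```r
L <- list(number = 1, letter = 'a', bool = TRUE)

typeof(L['number'])      # "list": a smaller list containing the element
typeof(L[['number']])    # "double": the element itself
L$number                 # 1, $ is shorthand for [[ ]] with a name
L[[1]]                   # 1, [[ ]] also accepts a position
length(L[c('number', 'bool')])  # 2, single brackets can select several elements
```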

Lists are hierarchical

LoL <- list(names = list('Iain', 'Brendan', 'Varun'),
            numbers=list(1:3, 1:5, 1:7))

LoL[['numbers']][[2]]
[1] 1 2 3 4 5
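
Access steps can be chained to reach deeper levels. A sketch using the same LoL:

```r
LoL <- list(names = list('Iain', 'Brendan', 'Varun'),
            numbers = list(1:3, 1:5, 1:7))

LoL$names[[1]]                # "Iain"
LoL[['numbers']][[2]][4]      # 4, the fourth entry of the second vector
sapply(LoL$numbers, length)   # 3 5 7, length of each inner vector
str(LoL)                      # prints a compact view of the whole hierarchy
```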