Programming in R

STOR 320
10/04/17

This lecture is about

  • conditionals (if/else)
  • loops (for, while)
  • how to define your own functions
  • basic programming flow

References

if statements

if (2 > 1) {
  print('fact')
}
[1] "fact"

if statements

if (2 < 1) {
  print('Math is Broken')
}

if/else statements

# Flip a coin 
if (runif(1) < 0.5) {
    print('heads')
} else {
    print('tails')
}
[1] "heads"

else-if statements

r <- runif(1)
# rock paper scissors
if (r < 1/3) {
    print('rock')
} else if (r < 2/3) {
    print('paper')
} else {
    print('scissors')
}
[1] "scissors"

Switches

If you have a long chain of if/else statements, switch() can be handier. Here is an example from Section 19.4.2 of r4ds. What does it do?

function(x, y, op) {
  switch(op,
    plus = x + y,
    minus = x - y,
    times = x * y,
    divide = x / y,
    stop("Unknown op!")
  )
}
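To experiment with the function above it helps to give it a name. The sketch below assumes the name calc, which is hypothetical (the slide leaves the function anonymous), and calls it with a few operations, including one that is not listed.

```r
# 'calc' is a hypothetical name; the body is the function shown above
calc <- function(x, y, op) {
  switch(op,
    plus = x + y,
    minus = x - y,
    times = x * y,
    divide = x / y,
    stop("Unknown op!")
  )
}

calc(6, 3, "plus")    # 9
calc(6, 3, "divide")  # 2

# an unlisted op falls through to the unnamed last argument, stop()
result <- tryCatch(calc(6, 3, "modulo"),
                   error = function(e) conditionMessage(e))
result                # "Unknown op!"
```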

Equality

2*5 == 10
[1] TRUE

Beware of finite precision arithmetic

# oops
sqrt(2)^2 == 2
[1] FALSE

Safety first

dplyr::near(sqrt(2)^2, 2)
[1] TRUE
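If dplyr is not loaded, the same check can be sketched in base R. The tolerance 1e-8 below is an arbitrary choice for illustration, not a value taken from the lecture.

```r
x <- sqrt(2)^2

x - 2                     # tiny but nonzero rounding error
abs(x - 2) < 1e-8         # TRUE: compare against a small tolerance
isTRUE(all.equal(x, 2))   # TRUE: base R's approximate comparison
```

all.equal() returns a description of the difference rather than FALSE when the values differ, which is why it is wrapped in isTRUE().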

Vectorized operations

c(1, 1, 1) == c(1, 2, 1)
[1]  TRUE FALSE  TRUE

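When a condition must be a single TRUE or FALSE, as in if or while, use the double symbols && and ||. They compare single values and stop evaluating as soon as the result is known.

default to && or || in an if or while condition

TRUE || FALSE
[1] TRUE

In versions of R before 4.3.0, || and && silently looked at only the first element of each vector, so c(T, T, T) || c(T, F, T) returned TRUE. Since R 4.3.0, passing a vector of length greater than one to either operator is an error.

The single symbols | and & are vectorized and return a vector

c(TRUE, TRUE, TRUE) | c(TRUE, FALSE, TRUE)
[1] TRUE TRUE TRUE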
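One practical use of the double symbols is guarding a test that would otherwise fail. The sketch below is an illustration with made-up variable names (x, msg, flags): because && stops at the first FALSE, the comparison on the right is never run when x is NULL.

```r
x <- NULL

# && stops as soon as the answer is known, so x > 0 is never evaluated here
if (!is.null(x) && x > 0) {
  msg <- "positive"
} else {
  msg <- "missing or not positive"
}
msg

# to reduce a whole logical vector to a single value, use any() or all()
flags <- c(TRUE, FALSE, TRUE)
any(flags)   # TRUE
all(flags)   # FALSE
```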

Loops

  • for
  • while

for loops

for (i in 1:10) {
    print("Oh Canada!")
    print(i)
}
[1] "Oh Canada!"
[1] 1
[1] "Oh Canada!"
[1] 2
[1] "Oh Canada!"
[1] 3
[1] "Oh Canada!"
[1] 4
[1] "Oh Canada!"
[1] 5
[1] "Oh Canada!"
[1] 6
[1] "Oh Canada!"
[1] 7
[1] "Oh Canada!"
[1] 8
[1] "Oh Canada!"
[1] 9
[1] "Oh Canada!"
[1] 10

Pre-allocate memory

nums <- vector("double", 10) # or rep(0, 10) or something else
for (i in 1:10) {
  nums[i] <- runif(1)
}

Dynamic allocation bad

# nums <- c()
# for (i in 1:10) {
#   nums <- c(nums, runif(1))
# }
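For this particular task the loop can be avoided altogether, since runif() can draw all ten numbers in one call. The sketch below compares the pre-allocated loop with the single call; the seed 320 is arbitrary and only there so both versions draw the same numbers.

```r
set.seed(320)  # arbitrary seed
nums_loop <- vector("double", 10)
for (i in seq_along(nums_loop)) {
  nums_loop[i] <- runif(1)
}

set.seed(320)  # reset so the second version sees the same random stream
nums_vec <- runif(10)  # one call, no loop

identical(nums_loop, nums_vec)  # TRUE
```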

while loops

current_position <- 10
n_iter <- 0
while (current_position > 0){
    current_position <- current_position + rnorm(1)
    n_iter <- n_iter + 1
}
print(paste0('you lost all your money after ', n_iter, ' trips to the casino'))
[1] "you lost all your money after 1205 trips to the casino"

Infinite loops

while (TRUE){
    print('Duke sucks')
}
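A while (TRUE) loop is still usable if something inside it eventually calls break. A minimal sketch, reusing the coin flip from earlier (n_flips is a name made up for this example):

```r
n_flips <- 0
while (TRUE) {
  n_flips <- n_flips + 1
  if (runif(1) < 0.5) {
    break   # leave the loop on the first "heads"
  }
}
n_flips     # number of flips needed to see the first heads
```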

Vectorization

Try to vectorize anything you can (once you learn what that means…)

sapply(1:10, function(x) x * 2)
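The sapply() call above applies a function to each element in turn. Arithmetic in R already works on whole vectors, so the same result can be had without sapply(). A short comparison (variable names are illustrative):

```r
x <- 1:10

doubled_sapply <- sapply(x, function(v) v * 2)  # calls the function once per element
doubled_vec <- x * 2                            # one vectorized multiplication

identical(doubled_sapply, doubled_vec)          # TRUE
```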

Functions are for humans and computers

  • break long program up into small chunks
  • more readable
  • code reuse
    • catch errors
    • don't have to copy/paste
  • only need to fix code in one place

When to write a function?

You should consider writing a function whenever you’ve copied and pasted a block of code more than twice

Define a function

power <- function(num, exponent){
    # returns num raised to the exponent
    num ^ exponent
}

power(2, 3)
[1] 8

Default arguments

power <- function(num, exponent=3){
    # returns num raised to the exponent
    num ^ exponent
}

power(2)
[1] 8
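A default is only used when the caller leaves the argument out. The calls below sketch how to override it, by position or by name.

```r
power <- function(num, exponent = 3) {
  # returns num raised to the exponent
  num ^ exponent
}

power(2)                      # 8, uses the default exponent
power(2, 2)                   # 4, default overridden by position
power(exponent = 2, num = 5)  # 25, named arguments can come in any order
```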

Return values

random_rps <- function(){
    # randomly returns one of rock, paper or scissors

    r <- runif(1)
    # rock paper scissors
    if (r < 1/3) {
        return('rock')
    } else if (r < 2/3) {
        return('paper')
    } else {
        return('scissors')
    }
}

random_rps()
[1] "paper"

Write functions in separate scripts and import them

source('fun.R')

helper_fun()
[1] "Im not a very helpful helper function"
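The contents of fun.R are not shown here, so the sketch below uses a stand-in: it writes a one-function script to a temporary file and sources it. The function say_hello() and the temporary file are hypothetical and exist only for this illustration.

```r
# write a tiny script to a temporary file (a stand-in for a file like fun.R)
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "say_hello <- function(name) {",
  "  paste0('Hello, ', name, '!')",
  "}"
), script_path)

source(script_path)     # runs the script, which defines say_hello()
say_hello("STOR 320")   # "Hello, STOR 320!"
```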

Vectors and lists

  • vectors are homogeneous and sequential (1 dimensional)
  • lists are heterogeneous and hierarchical

See Section 20 of r4ds.

Vectors have a type

logical (boolean), character, complex, raw, integer and double

Integer vector

c(1L, 2L, 3L) # or 1:3; without the L suffix, c(1, 2, 3) is a double vector
[1] 1 2 3

Boolean vector

# boolean
c(TRUE, FALSE, TRUE) 
[1]  TRUE FALSE  TRUE

Character vector

# string
c('Iain', 'wishes', 'vectors', 'were', 'named', 'lists', 'instead')
[1] "Iain"    "wishes"  "vectors" "were"    "named"   "lists"   "instead"

typeof

typeof(rep(TRUE, 4))
[1] "logical"

Question 1

What are the types of the following vectors?

# a
c(1, 2, 'three')

# b
c(TRUE, TRUE, "FALSE")

# c
c(1, 2, 3.1)

Explicit coercion

as.integer(c('1', '2', '3'))
[1] 1 2 3

Implicit coercion

c(1, 2, TRUE)
[1] 1 2 1

Vectorized operations and implicit coercion

sum(c(-2, -1, 1, 2) > 0)
[1] 2
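The line above works because the comparison produces a logical vector, and sum() coerces TRUE to 1 and FALSE to 0. The same idea gives a proportion with mean(). A short sketch:

```r
x <- c(-2, -1, 1, 2)

x > 0         # FALSE FALSE  TRUE  TRUE
sum(x > 0)    # 2: how many are positive
mean(x > 0)   # 0.5: what proportion are positive
```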

Subsetting a vector

v <- 11:20
v[3]
[1] 13
v[c(1,10)]
[1] 11 20
v[v %% 2 == 0]
[1] 12 14 16 18 20
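A few more subsetting forms on the same vector, as a sketch: negative indices drop elements, a logical condition keeps the matching values, and which() returns positions instead of values.

```r
v <- 11:20

v[-1]           # everything except the first element
v[v > 15]       # 16 17 18 19 20
which(v > 15)   # 6 7 8 9 10, the positions rather than the values
v[length(v)]    # 20, the last element
```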

Numbers to words

Create the numbers2words function from https://github.com/ateucher/useful_code/blob/master/R/numbers2words.r

Copy this code to an R-script, name it, save it, source it and run the function with a few different arguments.

numbers2words(312)
[1] "three hundred twelve"

Sub strings

substr('Iain', 2, 3)
[1] "ai"
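substr() is vectorized, so it can take the first letter of every string in a vector at once. The vector words below is made up for illustration.

```r
words <- c('five', 'nine', 'fifteen')

first_letters <- substr(words, 1, 1)
first_letters               # "f" "n" "f"
sum(first_letters == 'f')   # 2
```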

Question

How many numbers below 4869 are divisible by 5 and start with the letter “f”?

Use the two previous slides and submit your answer on Sakai.

Question (Solution)

How many numbers below 4869 are divisible by three and start with the letter n?

sum(1:4868 %% 3 == 0 & substr(numbers2words(1:4868), 1, 1) == 'n')
[1] 39

Lists

Lists can contain objects of multiple types, and their elements can be accessed by name as well as by position.

Make a list

L <- list(number=1, letter='a', bool=TRUE)
L
$number
[1] 1

$letter
[1] "a"

$bool
[1] TRUE

Access elements of a list with [[]]

To access elements of a list use [[]]

L[['number']]
[1] 1

A single bracket returns a list

You can also use a single []; this returns a list rather than the element itself.

L['number']
$number
[1] 1

See Section 20.5.3 of r4ds for the difference between [] and [[]].

Single Brackets versus Double Brackets Example

a <- list(a = 1:3, b = "a string", c = pi, d = list(-1, -5))

Now look at a, a[2] and a[[2]] and note the differences.
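One way to check what you see is to ask for the type of each result. A sketch using typeof() on the same list:

```r
a <- list(a = 1:3, b = "a string", c = pi, d = list(-1, -5))

typeof(a[2])     # "list": a smaller list holding the element named b
typeof(a[[2]])   # "character": the element itself
a[[2]]           # "a string"
a[[4]][[1]]      # -1, reaching into the nested list d
```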

Lists are hierarchical

LoL <- list(names = list('Iain', 'Brendan', 'Varun'),
            numbers=list(1:3, 1:5, 1:7))

LoL[['numbers']][[2]]
[1] 1 2 3 4 5
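Named elements can also be reached with $, which is shorthand for [[ ]] with a name. A short sketch on the same list:

```r
LoL <- list(names = list('Iain', 'Brendan', 'Varun'),
            numbers = list(1:3, 1:5, 1:7))

LoL$numbers[[2]]              # same as LoL[['numbers']][[2]]
LoL[['names']][[3]]           # "Varun"
sapply(LoL$numbers, length)   # 3 5 7, the length of each inner vector
```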