R Programming for Biologists: Workshop 2

Gavin Douglas
Aug. 14th, 2018

Warm-up questions


  • Assign x to be the vector c("a", "cool", "cat"). Remove the 2nd element from this vector and assign it to be a new object called y.

  • What's the difference between a dataframe and a matrix?

  • What would which(c(10, 4, 2, 30) >= 4) return?

  • What would table(c("cat", "cat", "dog")) return?

Changing your working directory


getwd()


setwd("path/to/my/files")
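
As a minimal sketch, the two calls above can be combined so that the original directory is restored afterwards. Here tempdir() is only a stand-in for the real "path/to/my/files" (an assumption for illustration), and the names old_wd, target and new_wd are hypothetical.

```r
# Remember where we started so we can go back afterwards.
old_wd <- getwd()

# tempdir() is a stand-in for "path/to/my/files" (assumption for illustration).
target <- tempdir()

# Only change directory if the folder actually exists.
if (dir.exists(target)) {
  setwd(target)
}

new_wd <- getwd()
print(new_wd)

# Return to the original working directory.
setwd(old_wd)
```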

Assignment 1: Helpful or harmful?

R Control Structures


  • if-else

  • for

  • while

  • break

  • next

  • return

if-else control structure

x <- 1

if(x >= 3) {
  y <- "large"
} else if(x == 2) {
  y <- "medium"
} else {
  y <- "small"
}

y
[1] "small"

There can be any number of “else if” statements, but the else statement needs to be last.

Only the if statement is required.
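
A small sketch of an if statement with no else branch. The starting value "unchanged" is a made-up placeholder used only to show that nothing happens when the condition is FALSE.

```r
x <- 1
y <- "unchanged"   # placeholder starting value (assumption for illustration)

# No else branch: when the condition is FALSE the body is simply skipped.
if (x >= 3) {
  y <- "large"
}

print(y)
```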

A single statement can have multiple conditions

& is the operator for “and” and | is the operator for “or”.

x <- 5; y <- 2

if((x >= 3) & (y >= 3)) {
  print("HUGE!")
} else {
  print("small")
}
[1] "small"
if((x >= 3) | (y >= 3)) {
  print("HUGE!")
} else {
  print("small")
}
[1] "HUGE!"
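
R also has the double forms && and ||, which are worth knowing alongside & and |. The sketch below, with made-up vectors a and b, suggests the difference: & and | compare vectors element by element, while && and || expect single values and stop evaluating as soon as the answer is known.

```r
a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

# Element-wise: one result per position in the vectors.
both <- a & b
either <- a | b
print(both)
print(either)

# && evaluates left to right and stops early, so the right-hand
# side is never evaluated when the left-hand side is FALSE.
x <- NULL
safe <- !is.null(x) && x > 3
print(safe)
```

Because an if condition must be a single TRUE or FALSE, the double forms are often preferred inside if(), while the single forms are useful for filtering vectors.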

Looping over a vector with "for"

The vector being looped over is often numeric, but not always:

breed <- c("greyhound", "terrier", "bulldog")
for(dog in breed) {
  print(dog)
}
[1] "greyhound"
[1] "terrier"
[1] "bulldog"
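
When the position of each element is needed as well as its value, one common approach is to loop over the indices instead. This is a sketch, and the labels vector is a hypothetical name introduced for illustration.

```r
breed <- c("greyhound", "terrier", "bulldog")

# Pre-allocate a character vector to hold one label per breed.
labels <- character(length(breed))

# seq_along(breed) gives 1, 2, 3: one index per element.
for (i in seq_along(breed)) {
  labels[i] <- paste(i, breed[i])
  print(labels[i])
}
```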

Nested control structures:

for(i in 1:4) {
  x <- i*100
  if(x > 200) {
    print(x)
  }
}
[1] 300
[1] 400

Nested for loops iterating over a matrix

tmp <- matrix(c(10, 70, 50, 20, 30, 90), nrow=2, ncol=3)

for(i in 1:nrow(tmp)) {
  for(j in 1:ncol(tmp)) {
    print(c(i, j, tmp[i, j]))
  }
}
[1]  1  1 10
[1]  1  2 50
[1]  1  3 30
[1]  2  1 70
[1]  2  2 20
[1]  2  3 90
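
The output above shows 50 at row 1, column 2 because matrix() fills column by column by default. The sketch below contrasts this with byrow=TRUE; the names vals, by_col and by_row are hypothetical.

```r
vals <- c(10, 70, 50, 20, 30, 90)

# Default: values fill down each column first.
by_col <- matrix(vals, nrow=2, ncol=3)
print(by_col)

# byrow=TRUE: values fill across each row first.
by_row <- matrix(vals, nrow=2, ncol=3, byrow=TRUE)
print(by_row)
```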

"while" looping

x <- 1

while(x < 5) {
  x <- x + 1
  print(x)
}
[1] 2
[1] 3
[1] 4
[1] 5

Why wasn't the last value printed 4?
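
One way to explore this question is to swap the order of the two statements inside the loop. In this sketch the value is printed before it is incremented; the printed vector is a hypothetical helper that records what was shown.

```r
x <- 1
printed <- c()

while (x < 5) {
  print(x)                   # print first...
  printed <- c(printed, x)
  x <- x + 1                 # ...then increment
}
```

The condition is only checked at the top of each iteration, so whatever happens after the increment still runs with the new value of x.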

"next" in loops

Using next will skip to the next iteration of the loop.

for(x in c("cow", "deer", "dog", "cat", "lion", "giraffe")) {
  if(x != "cat") {
    next
  }
  print(c("yay this was printed ok!", x))
}
[1] "yay this was printed ok!" "cat"                     
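
A typical use of next is skipping values that should not be processed. The sketch below uses made-up numbers and adds up only the non-missing ones.

```r
values <- c(4, NA, 7, NA, 1)   # made-up data for illustration
total <- 0

for (v in values) {
  if (is.na(v)) {
    next   # skip missing values and move on to the next element
  }
  total <- total + v
}

print(total)
```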

"break" in loops

Using break will stop the loop.

for(x in c("cow", "deer", "dog", "cat", "lion", "giraffe")) {
  if(x == "cat") {
    print("Uh oh, time to stop!")
    break
  }
  print(c("yay this was printed ok!", x))
}
[1] "yay this was printed ok!" "cow"                     
[1] "yay this was printed ok!" "deer"                    
[1] "yay this was printed ok!" "dog"                     
[1] "Uh oh, time to stop!"
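
A sketch of using break to stop searching once a match has been found. The position variable is a hypothetical name and stays NA if nothing matches.

```r
animals <- c("cow", "deer", "dog", "cat", "lion", "giraffe")
position <- NA

for (i in seq_along(animals)) {
  if (animals[i] == "cat") {
    position <- i   # record where the match was found
    break           # no need to check the remaining elements
  }
}

print(position)
```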

Basic R function

multiply <- function(x, y) {
  product <- x * y
  return(product)
}

multiply(10, 5)
[1] 50

Neither argument has a default value here, so both must be supplied explicitly:

multiply(60)
Error in multiply(60) : argument "y" is missing, with no default
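
Arguments can be matched by position or by name. The sketch below repeats the multiply function so that it runs on its own; by_position and by_name are hypothetical variable names.

```r
multiply <- function(x, y) {
  product <- x * y
  return(product)
}

# Matched by position: x = 10, y = 5.
by_position <- multiply(10, 5)

# Matched by name: the order no longer matters.
by_name <- multiply(y=5, x=10)

print(by_position)
print(by_name)
```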

Setting default values in a function

square_vec <- function(x, remove_NA=TRUE) {

  if(remove_NA) {
    x <- as.vector(na.omit(x))
  }

  return(x^2)
}

square_vec(c(7, 5, NA, 2))
[1] 49 25  4
square_vec(c(7, 5, NA, 2), remove_NA=FALSE)
[1] 49 25 NA  4

Lexical scoping 1/2

A function can use variables that are defined in the environment where the function itself was defined.

add_nums <- function(x) {

  return(x + y)

}
add_nums(1)
Error in add_nums(1) : object 'y' not found
y <- 10
add_nums(1)
[1] 11
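
R looks for a variable inside the function first and only then in the enclosing environment. As a sketch, this variant of add_nums defines its own y (the value 100 is made up), which takes precedence and leaves the outer y untouched.

```r
y <- 10

add_nums <- function(x) {
  y <- 100          # local y, visible only inside the function
  return(x + y)
}

result <- add_nums(1)
print(result)   # computed with the local y
print(y)        # the outer y is unchanged
```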

Lexical scoping 2/2

A function can return a different function (example from Coursera).

make.power <- function(n) {
  pow <- function(x) {
    return(x^n)
  }
  return(pow)
}

cube <- make.power(3)
cube(10)
[1] 1000
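
Each function returned by make.power keeps its own copy of n. The sketch below repeats the definition so it runs on its own, adds a hypothetical square function, and looks up the stored n with environment().

```r
make.power <- function(n) {
  pow <- function(x) {
    return(x^n)
  }
  return(pow)
}

square <- make.power(2)
cube <- make.power(3)

print(square(10))
print(cube(10))

# The value of n lives in the environment where pow was created.
stored_n <- get("n", envir=environment(square))
print(stored_n)
```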

Recursive functions

calc_factorial <- function(x) {
  if(x == 0) {
    return(1)
  } else {
    return(x * calc_factorial(x - 1))
  }
}

calc_factorial(6)
[1] 720
calc_factorial(3)
[1] 6
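
A recursive function needs a base case that is always reached. As a sketch, the variant below adds an input check, which is an addition for illustration and not part of the original example, and compares the result with R's built-in factorial().

```r
calc_factorial <- function(x) {
  # Guard (added for illustration): a negative or fractional x
  # would never reach the base case x == 0.
  if (x < 0 || x != round(x)) {
    stop("x must be a non-negative whole number")
  }
  if (x == 0) {
    return(1)   # base case
  }
  return(x * calc_factorial(x - 1))   # recursive case
}

print(calc_factorial(6))

# Compare with the built-in function.
print(isTRUE(all.equal(calc_factorial(6), factorial(6))))
```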