For loops

Mike McCann
22-23 January 2015

For loops

Loops are an important programming tool. The first loop we will learn is a for loop.

A for loop runs for a set number of steps, which you define. At each step, the statements inside the loop are executed.

The basic syntax is:

for (some sequence of steps) {
  execute some statements
}

Why use a for loop?

  1. We have a repeated process with identical formatting, but different values.

  2. To avoid laborious typing into R.

Our first loop

for (i in 1:5) {

  • i starts at 1. R will execute some statements;
  • i is increased to i = 2 and statements are executed again;
  • i is increased to i = 3 and statements are executed again;
  • and so on, until i = 5, at which point the loop executes the set of statements for the last time.

}

Our first loop

for (i in 1:5){
    print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

Syntax

Be sure you distinguish between:

  • curly braces { }
  • parentheses ( )
  • square brackets [ ].

Syntax

Square brackets [ ] are used to access elements of vectors, matrices, and data frames.

x <- 1:10

x[6]
[1] 6
x[5:7]
[1] 5 6 7

Syntax

Parentheses ( ) are used to specify arguments to functions.

x <- 1:10

sum(x)
[1] 55
mean(x)
[1] 5.5

Syntax

Finally, curly braces { } enclose statements to be executed within the body of a loop.

for (i in 1:3) {
  print(i)
}
[1] 1
[1] 2
[1] 3

Using a for loop

You can perform operations on i.

for (i in 1:4){
    print(i^2)
}
[1] 1
[1] 4
[1] 9
[1] 16

Using a for loop

Assignments can occur in a loop.

x <- 2

for (i in 1:4){
    x <- x^2
}

Notice: i is not used inside the loop body; it only counts the iterations.

The operation x <- x^2 will be done four times.
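
To see what this loop leaves in x, it can help to trace the value at each pass. The sketch below repeats the same loop and adds a cat() call, which is an addition for illustration only and not part of the original example.

```r
x <- 2

for (i in 1:4) {
  x <- x^2                               # square the current value of x
  cat("after step", i, "x is", x, "\n")  # show the value at this step
}

x  # 65536, because 2 -> 4 -> 16 -> 256 -> 65536
```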

Try It!

  1. Create a for loop that prints numbers 1 to 100.
  2. Create a for loop that prints numbers 100 to 1.
  3. Create a for loop that adds 1 to numbers 1:5.
  4. Create a for loop that divides all even numbers from 0 to 20 by 10 (consider using seq()).
  5. Bonus! What would be the final value here (without trying it)? Why?
dogs <- 10
for (i in 1:5){
    dogs <- dogs + 1
}

For Loops and Vectors

In the above examples, we used i directly in mathematical operations. It is more common to loop over elements of a vector to accomplish some particular task.

nameVector <- c("Charlie", "Helga", "Clancy")

for (i in 1:length(nameVector)){
    print(paste("Hi,", nameVector[i], sep=" "))
}
[1] "Hi, Charlie"
[1] "Hi, Helga"
[1] "Hi, Clancy"
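
A common alternative to 1:length(nameVector) is seq_along(nameVector). This is a minimal sketch, assuming the same nameVector as above; emptyVector is a hypothetical name used only to show where the two forms differ.

```r
nameVector <- c("Charlie", "Helga", "Clancy")

# seq_along() gives the positions 1, 2, 3 of the vector
for (i in seq_along(nameVector)) {
  print(paste("Hi,", nameVector[i]))
}

# The two forms differ only when the vector is empty
emptyVector <- character(0)   # hypothetical empty vector
1:length(emptyVector)         # 1 0, so a loop would run twice
seq_along(emptyVector)        # integer(0), so a loop would not run at all
```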

What did that really do?

Consider the loop in pieces

length(nameVector) # The # of positions in nameVector
[1] 3
nameVector[1] # The 1st position in nameVector
[1] "Charlie"
# Combine text and index of a vector
paste("Hi,", nameVector[1], sep=" ") 
[1] "Hi, Charlie"

What did that really do?

Values computed inside a loop are not printed automatically, so use print() to view them on your console.

nameVector <- c("Charlie", "Helga", "Clancy")

for (i in 1:length(nameVector)){
    print(paste("Hi,", nameVector[i], sep=" "))
}
[1] "Hi, Charlie"
[1] "Hi, Helga"
[1] "Hi, Clancy"

What did that really do?

Without print() the results are not displayed, and without an assignment <- they are not kept.

nameVector <- c("Charlie", "Helga", "Clancy")

for (i in 1:length(nameVector)){
    paste("Hi,", nameVector[i], sep=" ")
}

Try It!

1.) Create a vector of the names of people in your row, then write them a nice message using a loop.
2.) Explain why the following code is wrong:

for (x in 1:10) {
  print(sum(i))
}

Lists

Lists are another of the 5 basic data structures in R.

Unlike a vector, the elements of a list can be any type… including other lists!

Constructing Lists

You construct a list with list(), instead of the c() used for vectors.

Syntax of lists

x <- c(1,2,3,4,5) # Create a vector

a <- list(1,2,3,4,5) # Create a List

x[3] # Index a vector 
[1] 3
a[[3]] # Index a list 
[1] 3
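
The difference in element types can be checked directly. In this sketch (the names mixedVector and mixedList are made up for illustration), c() converts everything to a single type, while list() keeps each element as it was.

```r
mixedVector <- c("a", 1, TRUE)     # everything is coerced to character
mixedList   <- list("a", 1, TRUE)  # each element keeps its own type

class(mixedVector)      # "character"
class(mixedList[[2]])   # "numeric"
class(mixedList[[3]])   # "logical"
```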

Why use a list? A Brief Tangent

A list is a generic vector containing other objects. For example, x is a list containing three vectors (n, s, and b) and the number 3.

n <- c(2, 3, 5) 
s <- c("aa", "bb", "cc", "dd", "ee") 
b <- c(TRUE, FALSE, TRUE, FALSE, FALSE) 
x <- list(n, s, b, 3) # combine n, s, b, 3
x
[[1]]
[1] 2 3 5

[[2]]
[1] "aa" "bb" "cc" "dd" "ee"

[[3]]
[1]  TRUE FALSE  TRUE FALSE FALSE

[[4]]
[1] 3

List Slicing

We retrieve a list slice with the single square bracket [ ] operator.

x[2] 
[[1]]
[1] "aa" "bb" "cc" "dd" "ee"
# With an index vector, we can retrieve a slice with multiple members.
x[c(2, 4)] 
[[1]]
[1] "aa" "bb" "cc" "dd" "ee"

[[2]]
[1] 3

List Reference

In order to reference a list member directly, we have to use the double square bracket [[ ]] operator.

x[[2]] # 2nd element of list 
[1] "aa" "bb" "cc" "dd" "ee"
x[[2]][2] # 2nd element of 2nd element of list 
[1] "bb"
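
One way to see the difference between the two operators is to ask R what each one returns. This sketch rebuilds the same x as above so that it runs on its own.

```r
n <- c(2, 3, 5)
s <- c("aa", "bb", "cc", "dd", "ee")
b <- c(TRUE, FALSE, TRUE, FALSE, FALSE)
x <- list(n, s, b, 3)

class(x[2])      # "list": a slice is still a list
class(x[[2]])    # "character": the member itself
length(x[2])     # 1, a list holding one member
length(x[[2]])   # 5, the five strings in s
```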

List Reference

We can modify list contents directly

x[[2]][2] <- "ta" 
x[[2]] 
[1] "aa" "ta" "cc" "dd" "ee"

How do we save loop outputs?

Instead of printing to the screen, we usually want to create an object that holds the outputs of the loop. In general, we do this with either a vector or a list.

outputs <- list() # Create an empty list for the output

for (x in 1:3){
  outputs[[x]] <- x*10
}
outputs
[[1]]
[1] 10

[[2]]
[1] 20

[[3]]
[1] 30
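
The same pattern works with a vector. A minimal sketch, assuming we know in advance how many results to expect; outputVector is a hypothetical name, and numeric(3) creates a vector of three zeros to fill in.

```r
outputVector <- numeric(3)  # a numeric vector of three zeros

for (x in 1:3) {
  outputVector[x] <- x * 10  # single brackets for a vector
}

outputVector  # 10 20 30
```

Note the single brackets [ ] for the vector, where the list version used double brackets [[ ]].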

Try It!

1.) Compute x*2 for 1:100. Place the output in a vector.
2.) Compute x*2 for 1:100. Place the output in a list.
3.) How do we get the 47th position in question 1 and 2?
4.) What does this code do?

output <- list()
for(x in 1:10) {
  output[1] <- sum(x + x^2)
}

5.) What does this code do?

output2 <- list()
for(x in 1:11) {
  output2[x+1] <- sum(x + x^2)
}

if Statements

if (3 > 2){
  print("Yes")
}
[1] "Yes"

Flow Statements

Often, we want to control what a for loop does depending on variables, options, and logical conditions.

Let's use an if statement:

for (x in 1:5){
  if(x > 3){
    print(paste(x,"is greater than 3"))
  }
  if(x <= 3){
    print(paste(x,"is less than or equal to 3"))
  }
}
[1] "1 is less than or equal to 3"
[1] "2 is less than or equal to 3"
[1] "3 is less than or equal to 3"
[1] "4 is greater than 3"
[1] "5 is greater than 3"
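
When the two conditions are exact opposites, as they are here, the same result can be written with if and else. This is a sketch of that alternative, not a replacement for the version above; label is a made-up variable name.

```r
for (x in 1:5) {
  if (x > 3) {
    label <- "is greater than 3"
  } else {
    label <- "is less than or equal to 3"  # runs whenever x > 3 is FALSE
  }
  print(paste(x, label))
}
```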

Controlling Flow: Break

Often we need to handle logical cases within a loop.

We can stop a running loop with an if statement and break.

for (x in 1:5){
  if(x > 3){break}
  if(x <= 3){print(paste(x,"is less than or equal to 3"))}
}
[1] "1 is less than or equal to 3"
[1] "2 is less than or equal to 3"
[1] "3 is less than or equal to 3"

Controlling Flow: Next

Sometimes we don't want to end the loop, just skip a troublesome object that we know will cause an error.

We can skip to the next iteration of a loop with an if statement and next. Here we want to skip 4.

for (x in 1:5){
  if(x == 4){next}
  if(x > 3){print(paste(x,"is greater than 3"))}
  if(x <= 3){print(paste(x,"is less than or equal to 3"))}
}
[1] "1 is less than or equal to 3"
[1] "2 is less than or equal to 3"
[1] "3 is less than or equal to 3"
[1] "5 is greater than 3"
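
As an illustration of skipping a value we do not want to process, this sketch uses a hypothetical vector, measurements, that contains a missing value (NA). The next statement skips it and the loop carries on with the remaining values.

```r
measurements <- c(4, NA, 9)  # hypothetical data with one missing value
roots <- c()                 # collects results for the values we keep

for (m in measurements) {
  if (is.na(m)) {next}       # skip the missing value
  roots <- c(roots, sqrt(m))
}

roots  # 2 3
```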

Try It!

1.) Create a for loop that computes x * 2 for 1:100. Place the output in a vector. However, calculate x * 3 for x = 32 and x = 67.
2.) Create a for loop that computes x * 2 for 1:100. Place the output in a list. However, break the loop after 51 iterations.
3.) Create a for loop that computes x * 2 for 1:100. Place the output in a list. However, skip x = 71 and x = 74. How can you show that you succeeded?

Try It!

1.) Why does this code fail?

for (x in 1:5){
  if(x <= 4){
    print(paste(x,"is less than or equal to 4"))
  }
  # Kill the loop after x=3
  if(x > 3){break}
}

2.) Create a for loop that computes x * 2 for 1:100, and place the output in a list. However, break the loop when the square root of the output of a statement is greater than 8.4. What is the last x value reported?

Alternatives to for loops: the apply family

The apply family of functions allows you to process whole rows, columns, or lists in a single call.

This is often loosely called vectorization.

This can replace for loops and usually makes the code shorter, though it is not always faster.

apply()

  • apply(M, 1, fun) = apply fun to the rows of M
  • apply(M, 2, fun) = apply fun to the columns of M

apply()

Use apply() to take the median of the rows/columns.

M <- matrix(rnorm(9), ncol=3, nrow=3)
M
          [,1]       [,2]      [,3]
[1,] -1.227423 -2.2275760 0.7494961
[2,] -1.207342  0.8004867 0.4152786
[3,]  0.889925  1.1353026 1.5251520
apply(M, 1, median) # median of rows 
[1] -1.2274229  0.4152786  1.1353026
apply(M, 2, median) # median of columns 
[1] -1.2073425  0.8004867  0.7494961

apply()

The for loop version (medians of the columns)

M
          [,1]       [,2]      [,3]
[1,] -1.227423 -2.2275760 0.7494961
[2,] -1.207342  0.8004867 0.4152786
[3,]  0.889925  1.1353026 1.5251520
for (i in 1:3){
    print(median(M[,i]))
}
[1] -1.207342
[1] 0.8004867
[1] 0.7494961
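
To check that the loop and apply() agree, the loop results can be stored in a vector and compared. This sketch uses its own random matrix; set.seed(1) and the name colMedians are assumptions added here so the example is repeatable.

```r
set.seed(1)                             # assumed seed, only for repeatability
M <- matrix(rnorm(9), ncol = 3, nrow = 3)

colMedians <- numeric(ncol(M))          # one slot per column
for (i in 1:ncol(M)) {
  colMedians[i] <- median(M[, i])
}

all.equal(colMedians, apply(M, 2, median))  # TRUE
```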

The apply family

Other versions of apply() for other data types:

  • lapply()
  • tapply()
  • sapply()
  • vapply()
  • mapply()


The apply family

For example, lapply() applies a function to each element of a list.

mylist <- list(1, 2, 9) # make a list 

sqrt(mylist) # throws an error: sqrt() needs numeric input, not a list

lapply(mylist, sqrt) # applies function to list
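
lapply() always returns a list. When a plain vector is more convenient, sapply() from the list above does the same calculation and simplifies the result where it can. A minimal sketch, with roots as a made-up name:

```r
mylist <- list(1, 2, 9)

roots <- sapply(mylist, sqrt)  # same calculation, simplified to a vector
roots

class(lapply(mylist, sqrt))    # "list"
class(roots)                   # "numeric"
```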

Questions?