Mike McCann
22-23 January 2015
Loops are an important programming tool. The first loop we will learn is a for loop.
For loops run for a certain number of steps, which you define, during which any statements in the loop are executed.
The basic syntax is:
for (some sequence of steps)
{
execute some statements
}
We have a repeated process with identical formatting, but different values.
A loop saves us from laboriously typing each repetition into R.
for (i in 1:5) {
- i starts at 1. R will execute some statements;
- i is increased to i = 2 and statements are executed again;
- i is increased to i = 3 and statements are executed again;
- and so on, until i = 5, at which point the loop executes the set of statements for the last time.
}
for (i in 1:5){
print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
Be sure you distinguish between:
Brackets [ ] are used to access elements of vectors, matrices, and dataframes.
x <- 1:10
x[6]
[1] 6
x[5:7]
[1] 5 6 7
Parentheses ( ) are used to specify arguments to functions.
x <- 1:10
sum(x)
[1] 55
mean(x)
[1] 5.5
Finally, curly braces { } enclose statements to be executed within the body of a loop.
for (i in 1:3) {
print(i)
}
[1] 1
[1] 2
[1] 3
You can perform operations on i.
for (i in 1:4){
print(i^2)
}
[1] 1
[1] 4
[1] 9
[1] 16
Assignments can occur in a loop.
x <- 2
for (i in 1:4){
x <- x^2
}
Notice: i is not directly called in the equation.
The operation x <- x^2 will be done four times.
dogs <- 10
for (i in 1:5){
dogs <- dogs + 1
}
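To see what repeated assignment does, a small sketch can record the value after every pass. The history vector is a name introduced here for illustration only; it is not part of the examples above.

```r
# Track how x changes when the same assignment runs four times.
x <- 2
history <- numeric(4)   # hypothetical helper vector, one slot per pass
for (i in 1:4) {
  x <- x^2              # i only counts the passes; it is not used in the maths
  history[i] <- x
}
history   # 4 16 256 65536
x         # 65536

# The same idea with addition: five passes add 5 in total.
dogs <- 10
for (i in 1:5) {
  dogs <- dogs + 1
}
dogs      # 15
```

Each pass starts from the value left behind by the previous one, which is why x grows so quickly.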
In some of the above examples, we used i directly in mathematical operations; in others, i only counted the repetitions. It is more common to loop over the elements of a vector to accomplish some particular task.
nameVector <- c("Charlie", "Helga", "Clancy")
for (i in 1:length(nameVector)){
print(paste("Hi,", nameVector[i], sep=" "))
}
[1] "Hi, Charlie"
[1] "Hi, Helga"
[1] "Hi, Clancy"
Consider the loop in pieces
length(nameVector) # The # of positions in nameVector
[1] 3
nameVector[1] # The 1st position in nameVector
[1] "Charlie"
# Combine text and index of a vector
paste("Hi,", nameVector[1], sep=" ")
[1] "Hi, Charlie"
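As a possible alternative to 1:length(nameVector), the sketch below uses seq_along() and also loops over the elements themselves. The names name and emptyVector are assumptions made for this illustration.

```r
nameVector <- c("Charlie", "Helga", "Clancy")

# seq_along() gives 1, 2, 3 here, the same positions as 1:length(nameVector)
for (i in seq_along(nameVector)) {
  print(paste("Hi,", nameVector[i]))
}

# Looping over the elements themselves, without an index
for (name in nameVector) {
  print(paste("Hi,", name))
}

# With an empty vector the two index styles differ:
emptyVector <- character(0)   # hypothetical empty input
1:length(emptyVector)         # 1 0  (a loop over this would run twice)
seq_along(emptyVector)        # integer(0)  (a loop over this would not run)
```

For vectors that are never empty the two styles behave the same, so the choice mostly matters when the input length is not known in advance.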
Values computed inside a loop are not printed automatically, so use print() to view them on your console.
nameVector <- c("Charlie", "Helga", "Clancy")
for (i in 1:length(nameVector)){
print(paste("Hi,", nameVector[i], sep=" "))
}
[1] "Hi, Charlie"
[1] "Hi, Helga"
[1] "Hi, Clancy"
Without print() or an assignment (<-), results are computed but neither displayed nor saved.
nameVector <- c("Charlie", "Helga", "Clancy")
for (i in 1:length(nameVector)){
paste("Hi,", nameVector[i], sep=" ")
}
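The assignment route can be sketched as follows. The greetings vector is a name chosen for this illustration; each pass stores its result in one position so that the messages are still available after the loop ends.

```r
nameVector <- c("Charlie", "Helga", "Clancy")
greetings <- character(length(nameVector))   # empty strings, one slot per name
for (i in 1:length(nameVector)) {
  greetings[i] <- paste("Hi,", nameVector[i], sep = " ")
}
greetings   # "Hi, Charlie" "Hi, Helga" "Hi, Clancy"
```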
1.) Create a vector of names of people in your row, then write them a nice message using a loop.
2.) Explain why the following code is wrong:
for (x in 1:10) {
print(sum(i))
}
Lists are another of the 5 basic data structures in R.
Unlike a vector, the elements of a list can be any type… including other lists!
You construct a list with list(), instead of the c() used for vectors.
x <- c(1,2,3,4,5) # Create a vector
a <- list(1,2,3,4,5) # Create a List
x[3] # Index a vector
[1] 3
a[[3]] # Index a list
[1] 3
A list is a generic vector containing other objects. For example, x is a list containing three vectors n, s, b, and a numeric 3.
n <- c(2, 3, 5)
s <- c("aa", "bb", "cc", "dd", "ee")
b <- c(TRUE, FALSE, TRUE, FALSE, FALSE)
x <- list(n, s, b, 3) # combine n, s, b, 3
x
[[1]]
[1] 2 3 5
[[2]]
[1] "aa" "bb" "cc" "dd" "ee"
[[3]]
[1] TRUE FALSE TRUE FALSE FALSE
[[4]]
[1] 3
We retrieve a list slice with the single square bracket [ ] operator.
x[2]
[[1]]
[1] "aa" "bb" "cc" "dd" "ee"
# With an index vector, we can retrieve a slice with multiple members.
x[c(2, 4)]
[[1]]
[1] "aa" "bb" "cc" "dd" "ee"
[[2]]
[1] 3
In order to reference a list member directly, we have to use the double square bracket [[ ]] operator.
x[[2]] # 2nd element of list
[1] "aa" "bb" "cc" "dd" "ee"
x[[2]][2] # 2nd element of 2nd element of list
[1] "bb"
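One way to see the difference between the two operators is to ask R what each one returns. This sketch rebuilds the same list so that it runs on its own.

```r
n <- c(2, 3, 5)
s <- c("aa", "bb", "cc", "dd", "ee")
b <- c(TRUE, FALSE, TRUE, FALSE, FALSE)
x <- list(n, s, b, 3)

class(x[2])      # "list": a one-element list that contains s
class(x[[2]])    # "character": the vector s itself
length(x[2])     # 1
length(x[[2]])   # 5
```

Single brackets keep the result wrapped in a list, while double brackets hand back the member itself.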
We can modify list contents directly
x[[2]][2] <- "ta"
x[[2]]
[1] "aa" "ta" "cc" "dd" "ee"
Instead of printing to the screen, we usually want to create an object with the outputs of the loop. In general, we do this either with a vector or a list.
outputs <- list() # Create a blank output list
for (x in 1:3){
outputs[[x]] <- x*10
}
outputs
[[1]]
[1] 10
[[2]]
[1] 20
[[3]]
[1] 30
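The vector route might look like the sketch below. The name outputVector is an assumption made for this illustration, and the vector is created at its full length before the loop starts.

```r
outputVector <- numeric(3)     # numeric vector of zeros: 0 0 0
for (x in 1:3) {
  outputVector[x] <- x * 10    # single brackets, because this is a vector
}
outputVector   # 10 20 30
```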
1.) Compute x*2 for 1:100. Place the output in a vector.
2.) Compute x*2 for 1:100. Place the output in a list.
3.) How do we get the 47th position in question 1 and 2?
4.) What does this code do?
output <- list()
for(x in 1:10) {
output[1] <- sum(x + x^2)
}
5.) What does this code do?
output2 <- list()
for(x in 1:11) {
output2[x+1] <- sum(x + x^2)
}
if (3 > 2){
print("Yes")
}
[1] "Yes"
Often, we want to control what a for loop does depending on variables, options, and logical conditions.
Let's use an if statement:
for (x in 1:5){
if(x > 3){
print(paste(x,"is greater than 3"))
}
if(x <= 3){
print(paste(x,"is less than or equal to 3"))
}
}
[1] "1 is less than or equal to 3"
[1] "2 is less than or equal to 3"
[1] "3 is less than or equal to 3"
[1] "4 is greater than 3"
[1] "5 is greater than 3"
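When the two conditions are exact opposites, as they are here, an if ... else pair is a common alternative to two separate if statements. The sketch below also stores the text in a vector named messages, which is a name introduced for this illustration.

```r
messages <- character(5)   # hypothetical vector to keep the results
for (x in 1:5) {
  if (x > 3) {
    messages[x] <- paste(x, "is greater than 3")
  } else {
    messages[x] <- paste(x, "is less than or equal to 3")
  }
}
print(messages)
```

The else branch runs whenever the if condition is FALSE, so the second test does not have to be written out.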
Often we need to handle logical cases within a loop.
We can stop a running loop with an if and a break statement.
for (x in 1:5){
if(x > 3){break}
if(x <= 3){print(paste(x,"is less than or equal to 3"))}
}
[1] "1 is less than or equal to 3"
[1] "2 is less than or equal to 3"
[1] "3 is less than or equal to 3"
Sometimes we don't want to break out of the loop, just skip a troublesome element that we know will cause an error.
We can continue within a loop based on an if and next statement. Here we want to skip 4.
for (x in 1:5){
if(x == 4){next}
if(x > 3){print(paste(x,"is greater than 3"))}
if(x <= 3){print(paste(x,"is less than or equal to 3"))}
}
[1] "1 is less than or equal to 3"
[1] "2 is less than or equal to 3"
[1] "3 is less than or equal to 3"
[1] "5 is greater than 3"
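A sketch of skipping an element that would otherwise stop the loop with an error. The inputs list and the roots list are made-up data and names for this illustration.

```r
inputs <- list(4, "oops", 9)   # hypothetical data; the 2nd element is not a number
roots <- list()
for (i in 1:length(inputs)) {
  if (!is.numeric(inputs[[i]])) {next}   # sqrt("oops") would stop with an error
  roots[[i]] <- sqrt(inputs[[i]])
}
roots[[1]]   # 2
roots[[2]]   # NULL, because this position was skipped
roots[[3]]   # 3
```

The skipped position is left as NULL in the output list, which is one way to check that next did its job.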
1.) Create a for loop that computes x * 2 for 1:100. Place the output in a vector. However, calculate x * 3 for x=32 and x=67.
2.) Create a for loop that computes x * 2 for 1:100. Place the output in a list. However, break the loop after 51 iterations.
3.) Create a for loop that computes x * 2 for 1:100. Place the output in a list. However, skip x = 71 and x = 74. How can you show that you succeeded?
1.) Why does this code fail?
for (x in 1:5){
if(x <= 4){
print(paste(x,"is less than or equal to 4"))
}
# Kill the loop after x=3
if(x > 3){break}
}
2.) Create a for loop that computes x * 2 for 1:100, and place the output in a list. However, break the loop when the square root of the output of a statement is greater than 8.4. What is the last x value reported?
The apply family of functions allows you to process whole rows, columns, or lists with a single call.
This style is often loosely called vectorization.
It can replace for loops and is usually more concise, though not necessarily faster.
Use apply() to take the median of the rows/columns.
M <- matrix(rnorm(9), ncol=3, nrow=3)
M
[,1] [,2] [,3]
[1,] -1.227423 -2.2275760 0.7494961
[2,] -1.207342 0.8004867 0.4152786
[3,] 0.889925 1.1353026 1.5251520
apply(M, 1, median) # median of rows
[1] -1.2274229 0.4152786 1.1353026
apply(M, 2, median) # median of columns
[1] -1.2073425 0.8004867 0.7494961
The for loop version (medians of the columns)
M
[,1] [,2] [,3]
[1,] -1.227423 -2.2275760 0.7494961
[2,] -1.207342 0.8004867 0.4152786
[3,] 0.889925 1.1353026 1.5251520
for (i in 1:3){
print(median(M[,i]))
}
[1] -1.207342
[1] 0.8004867
[1] 0.7494961
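Because rnorm() gives different numbers on every run, the comparison is easier to check with fixed values. The sketch below uses a small made-up matrix M2 and a results vector colMedians, both names introduced here, and stores the loop results instead of printing them.

```r
# Hypothetical 3 x 3 matrix with fixed values, filled column by column
M2 <- matrix(c(1, 5, 3, 8, 2, 7, 4, 9, 6), nrow = 3, ncol = 3)

colMedians <- numeric(ncol(M2))   # one slot per column
for (i in 1:ncol(M2)) {
  colMedians[i] <- median(M2[, i])
}
colMedians             # 3 7 6
apply(M2, 2, median)   # 3 7 6, the same result in one line
apply(M2, 1, median)   # 4 5 6, medians of the rows
```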
Other versions of apply() for other data types
mylist <- list(1, 2, 9) # make a list
sqrt(mylist) # returns an error
lapply(mylist, sqrt) # applies function to list
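As a brief sketch of how two members of the family differ: lapply() always returns a list, while sapply() tries to simplify the result, here to a numeric vector. The names rootsList and rootsVector are assumptions for this illustration.

```r
mylist <- list(1, 2, 9)

rootsList <- lapply(mylist, sqrt)     # a list with 3 elements
rootsVector <- sapply(mylist, sqrt)   # a numeric vector of length 3

class(rootsList)     # "list"
class(rootsVector)   # "numeric"
rootsVector[3]       # 3
```

Which one to use depends on whether the next step expects a list or a vector.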