Section 7: Loops

Like many programming languages, R allows you to build loops to accomplish tasks in bulk. In this section, we’ll learn how to make a for loop.

sequence <- seq(1, 100, by=2)  # create a vector of the odd numbers from 1 to 99

sequence

##  [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45
## [24] 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91
## [47] 93 95 97 99

sequence_squared <- NULL  # Create an empty object in which you can store your results

for (i in seq_along(sequence)) {   # i counts through the positions of the vector
    sequence_squared[i] <- sequence[i]^2  # Square each element and store it in the new vector
}

# Let's see what it returned...
sequence_squared

##  [1]    1    9   25   49   81  121  169  225  289  361  441  529  625  729
## [15]  841  961 1089 1225 1369 1521 1681 1849 2025 2209 2401 2601 2809 3025
## [29] 3249 3481 3721 3969 4225 4489 4761 5041 5329 5625 5929 6241 6561 6889
## [43] 7225 7569 7921 8281 8649 9025 9409 9801
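
Because arithmetic in R is vectorized, the same result can also be produced without a loop. The sketch below is only an illustration: it preallocates the result vector instead of growing it from NULL, then checks that the loop and the vectorized expression agree. The names squared_loop and squared_vec are made up for this example.

```r
sequence <- seq(1, 100, by = 2)

# Preallocate the result so the loop fills existing slots
# instead of growing the vector on every iteration
squared_loop <- numeric(length(sequence))
for (i in seq_along(sequence)) {
  squared_loop[i] <- sequence[i]^2
}

# Vectorized version: ^ is applied to every element at once
squared_vec <- sequence^2

identical(squared_loop, squared_vec)  # TRUE
```

For simple element-wise arithmetic like this, the vectorized form is usually shorter and faster; the loop becomes more useful when each step depends on a condition or on earlier results.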

You can also add if-else statements to your loops to treat each element differently depending on a condition.

nina <- "has never run a marathon"
mike <- "has never run a marathon"
julie <- "runs marathons"

# One-row data frame with one column per person
folks <- data.frame(nina, mike, julie, stringsAsFactors = FALSE)

for (i in seq_along(folks)) {   # Loop over the columns
  if (folks[[i]] != "runs marathons") {
    new_statement <- paste(names(folks)[i], "should probably go for a jog or something", sep = " ")
    print(new_statement)
  } else {
    new_statement <- paste(names(folks)[i], "is plenty fit already", sep = " ")
    print(new_statement)
  }
}

## [1] "nina should probably go for a jog or something"
## [1] "mike should probably go for a jog or something"
## [1] "julie is plenty fit already"
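
The same conditional logic can be sketched without an explicit loop using ifelse(), which evaluates a condition for every element of a vector at once. This is an alternative illustration, not the approach used above; folks_vec and advice are hypothetical names, and the data is stored in a named character vector rather than a data frame.

```r
# Hypothetical named character vector standing in for the one-row data frame
folks_vec <- c(nina  = "has never run a marathon",
               mike  = "has never run a marathon",
               julie = "runs marathons")

# ifelse() picks the first message where the test is TRUE, the second otherwise
advice <- ifelse(folks_vec != "runs marathons",
                 paste(names(folks_vec), "should probably go for a jog or something"),
                 paste(names(folks_vec), "is plenty fit already"))

unname(advice)  # drop the names so only the three messages are shown
```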

You can use loops to do things like recode variables or subset your data. Let’s look at an example using the diamonds data.

library(ggplot2)

set.seed(423)
diamonds_part<-diamonds[sample(1:nrow(diamonds), 1000, replace=FALSE),]

new_diamonds <- NULL  # Create an empty object in which to store the rows we keep

for (i in seq_len(nrow(diamonds_part))) {
  if (diamonds_part$cut[i] %in% c("Ideal", "Premium")) {
    # Append the row only when its cut is Ideal or Premium
    new_diamonds <- rbind(new_diamonds, diamonds_part[i, ])
  }
}

new_diamonds$cut[1:100]  # First 100 cuts; the exact sequence depends on the rows sampled

##   [1] Ideal   Ideal   Ideal   Ideal   Ideal   Premium Premium Premium
##   [9] Premium Ideal   Ideal   Ideal   Ideal   Premium Premium Ideal  
##  [17] Ideal   Premium Premium Ideal   Ideal   Ideal   Ideal   Ideal  
##  [25] Premium Premium Premium Premium Premium Ideal   Ideal   Ideal  
##  [33] Ideal   Ideal   Ideal   Premium Premium Premium Ideal   Premium
##  [41] Premium Premium Premium Premium Premium Premium Premium Ideal  
##  [49] Ideal   Ideal   Ideal   Ideal   Ideal   Ideal   Ideal   Premium
##  [57] Premium Ideal   Ideal   Ideal   Ideal   Premium Premium Premium
##  [65] Premium Premium Premium Premium Premium Ideal   Premium Premium
##  [73] Premium Premium Premium Ideal   Ideal   Premium Premium Premium
##  [81] Premium Premium Ideal   Ideal   Ideal   Ideal   Ideal   Ideal  
##  [89] Premium Ideal   Premium Ideal   Ideal   Ideal   Ideal   Premium
##  [97] Ideal   Premium Ideal   Premium
## Levels: Fair < Good < Very Good < Premium < Ideal

which(new_diamonds$cut %in% c("Very Good", "Good", "Fair"))  # Make sure we don't have the omitted cuts in our data...

## integer(0)
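
For a subset like this one, logical indexing gives the same kind of result without a loop. The sketch below uses a small made-up data frame (toy, with invented prices) in place of the diamonds sample, so that it runs without ggplot2.

```r
# Toy stand-in for diamonds_part (hypothetical values, not the real data)
toy <- data.frame(
  cut   = c("Ideal", "Good", "Premium", "Fair", "Ideal", "Very Good"),
  price = c(326, 327, 334, 337, 340, 336),
  stringsAsFactors = FALSE
)

keep <- toy$cut %in% c("Ideal", "Premium")  # TRUE for the rows we want
toy_subset <- toy[keep, ]                   # keep only those rows

nrow(toy_subset)        # 3
unique(toy_subset$cut)  # "Ideal" "Premium"
```

Growing a data frame with rbind() inside a loop copies it on every iteration, so the indexing form tends to scale better on large data.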

You can also accomplish loop-like tasks using the apply() command, which applies a function to the rows or columns of a matrix or data frame. There are several related commands for other situations, such as lapply(), which works with lists, and tapply(), which applies a function to groups of values defined by a factor. Since we’re still working with the diamonds data, we’ll just use apply().

# Let's make a new diamonds subset, with just numeric columns to keep it simple
diamonds_part <- diamonds[, c("carat", "depth", "price")]

apply(diamonds_part, 2, mean)  # 2 means "apply the function to each column"

##        carat        depth        price 
##    0.7979397   61.7494049 3932.7997219

apply(diamonds_part, 2, which.max)  # Row position of the maximum in each column

## carat depth price 
## 27416 52861 27750
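
To make the difference between the related commands concrete, here is a small sketch of lapply(), sapply(), and tapply() on a made-up data frame. The object toy and its columns group, x, and y are assumptions for illustration only.

```r
# Hypothetical data: a grouping column and two numeric columns
toy <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  x     = c(1, 2, 3, 4, 5),
  y     = c(10, 20, 30, 40, 50)
)

# lapply() walks over a list (a data frame is a list of columns) and returns a list
col_means_list <- lapply(toy[, c("x", "y")], mean)

# sapply() does the same but simplifies the result to a named vector when it can
col_means_vec <- sapply(toy[, c("x", "y")], mean)

# tapply() splits a vector by a grouping variable and applies the function per group
group_means <- tapply(toy$x, toy$group, mean)

col_means_vec  # x = 3, y = 30
group_means    # a = 1.5, b = 4
```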

Some general tips on building loops:

Remember to initialize an object outside of your loop for the loop to fill with data.
You can use the stop() command inside your loop to halt it with an error message if a certain condition is met.
You can also use the cat() command to print updates on the loop’s progress. Just insert cat("your message here") in whatever section of the loop you wish.
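
The three tips above can be combined in one short sketch. The input vector values and the result vector roots are hypothetical; the loop takes square roots, stops with an error if it meets a negative number, and reports its progress with cat().

```r
values <- c(4, 9, 16, 25)          # hypothetical input
roots  <- numeric(length(values))  # initialize the result outside the loop

for (i in seq_along(values)) {
  if (values[i] < 0) {
    # stop() halts the loop and signals an error with this message
    stop("negative value at position ", i)
  }
  roots[i] <- sqrt(values[i])
  cat("finished iteration", i, "of", length(values), "\n")  # progress update
}

roots  # 2 3 4 5
```

Because none of the example values is negative, stop() is never triggered here; changing one of them to a negative number would end the loop at that position.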