Like many programming languages, R allows you to build loops to accomplish tasks in bulk. In this section, we’ll learn how to make a for loop.

sequence <- seq(1, 100, by=2)  # create a vector of values 

sequence
##  [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45
## [24] 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91
## [47] 93 95 97 99
sequence_squared <- NULL  # Create an empty object in which to store your results

for (i in seq_along(sequence)){   # i counts through the positions in the vector
    sequence_squared[i] <- sequence[i]^2  # Square each element and store it in the new vector
}

# Let's see what it returned...
sequence_squared
##  [1]    1    9   25   49   81  121  169  225  289  361  441  529  625  729
## [15]  841  961 1089 1225 1369 1521 1681 1849 2025 2209 2401 2601 2809 3025
## [29] 3249 3481 3721 3969 4225 4489 4761 5041 5329 5625 5929 6241 6561 6889
## [43] 7225 7569 7921 8281 8649 9025 9409 9801
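
As a side note, here is a minimal sketch of two variations on the loop above. The names squared_loop and squared_vec are made up for this illustration. The first version preallocates the result vector instead of growing it from NULL, and the second relies on the fact that arithmetic in R is vectorized, so a simple element-wise calculation like this one does not strictly need a loop.

```r
sequence <- seq(1, 100, by = 2)

# Preallocate the result so R does not have to grow the vector on each pass
squared_loop <- numeric(length(sequence))
for (i in seq_along(sequence)) {
  squared_loop[i] <- sequence[i]^2
}

# Arithmetic in R is vectorized, so the same calculation works without a loop
squared_vec <- sequence^2

# Check that both approaches agree
all.equal(squared_loop, squared_vec)
```

The loop is still worth learning, because many tasks depend on earlier iterations or have side effects that cannot be vectorized this easily.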

You can also add if-else statements to your loops to treat each element differently depending on a condition.

nina<-"has never run a marathon"
mike<-"has never run a marathon"
julie<-"runs marathons"

folks<-data.frame(nina, mike, julie)  # One column per person


for(i in seq_along(folks)){   # Loop over the columns of the data frame
  if (folks[[i]]!="runs marathons"){   # [[i]] pulls out the value in column i
    new_statement<-paste(names(folks)[i], "should probably go for a jog or something", sep=" ")
    print(new_statement)
  }
  else{
    new_statement<-paste(names(folks)[i], "is plenty fit already", sep=" ")
    print(new_statement)
  }
}
## [1] "nina should probably go for a jog or something"
## [1] "mike should probably go for a jog or something"
## [1] "julie is plenty fit already"
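
For comparison, the same if-else logic can be sketched without a loop using ifelse(), which tests every element of a vector at once. The named vector status and the helper names advice and messages are hypothetical, introduced only for this illustration.

```r
# Hypothetical named vector holding the same information as the example above
status <- c(nina = "has never run a marathon",
            mike = "has never run a marathon",
            julie = "runs marathons")

# ifelse() picks the second argument where the test is TRUE, the third where it is FALSE
advice <- ifelse(status == "runs marathons",
                 "is plenty fit already",
                 "should probably go for a jog or something")

# paste() is vectorized too, so it builds all three sentences in one call
messages <- paste(names(status), advice)
print(messages)
```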

You can use loops to do things like recode variables or subset your data. Let’s look at an example using the diamonds data.

library(ggplot2)
set.seed(423)
diamonds_part<-diamonds[sample(1:nrow(diamonds), 1000, replace=FALSE),]

new_diamonds<-NULL # Create a blank object in which to store the rows we keep

for (i in seq_len(nrow(diamonds_part))){
  if(diamonds_part$cut[i] %in% c("Ideal", "Premium")){
    # Append inside the if, so rows that fail the test are skipped
    # instead of re-adding the previous match
    new_diamonds<-rbind(new_diamonds, diamonds_part[i,])
  }
}

new_diamonds$cut[1:100]
##   [1] Ideal   Ideal   Ideal   Ideal   Ideal   Premium Premium Premium
##   [9] Premium Ideal   Ideal   Ideal   Ideal   Premium Premium Ideal  
##  [17] Ideal   Premium Premium Ideal   Ideal   Ideal   Ideal   Ideal  
##  [25] Premium Premium Premium Premium Premium Ideal   Ideal   Ideal  
##  [33] Ideal   Ideal   Ideal   Premium Premium Premium Ideal   Premium
##  [41] Premium Premium Premium Premium Premium Premium Premium Ideal  
##  [49] Ideal   Ideal   Ideal   Ideal   Ideal   Ideal   Ideal   Premium
##  [57] Premium Ideal   Ideal   Ideal   Ideal   Premium Premium Premium
##  [65] Premium Premium Premium Premium Premium Ideal   Premium Premium
##  [73] Premium Premium Premium Ideal   Ideal   Premium Premium Premium
##  [81] Premium Premium Ideal   Ideal   Ideal   Ideal   Ideal   Ideal  
##  [89] Premium Ideal   Premium Ideal   Ideal   Ideal   Ideal   Premium
##  [97] Ideal   Premium Ideal   Premium
## Levels: Fair < Good < Very Good < Premium < Ideal
which(new_diamonds$cut=="Very Good" | new_diamonds$cut=="Good" | new_diamonds$cut=="Fair")  # Make sure we don't have the omitted cuts in our data...
## integer(0)
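
Subsetting like this can also be done without a loop by indexing with a logical vector. The sketch below uses a small made-up data frame called gems (not the real diamonds data) so that it runs without ggplot2. The prices are invented for illustration.

```r
# Hypothetical stand-in for diamonds_part, with an ordered cut column
gems <- data.frame(
  cut = factor(c("Ideal", "Good", "Premium", "Fair", "Ideal"),
               levels = c("Fair", "Good", "Very Good", "Premium", "Ideal"),
               ordered = TRUE),
  price = c(326, 327, 334, 337, 340)
)

# TRUE for every row whose cut is one of the values we want to keep
keep <- gems$cut %in% c("Ideal", "Premium")

# Use the logical vector to select rows; all columns are retained
top_gems <- gems[keep, ]

# Count how many rows of each cut remain
table(top_gems$cut)
```

Assuming the same column names, the equivalent call on the real data would be diamonds_part[diamonds_part$cut %in% c("Ideal", "Premium"), ].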

You can also accomplish loop-like tasks using the apply() family of functions. apply() itself works over the rows or columns of a matrix or data frame. Other variants handle other data structures, such as lapply(), which works over the elements of a list, and tapply(), which applies a function to groups of values defined by a factor. Since we’re still working with the diamonds data, we’ll just use apply().

# Let's make a new diamonds subset, with just numeric columns to keep it simple
diamonds_part<-diamonds[,c("carat","depth","price")]

apply(diamonds_part, 2, mean)  # 2 means "apply over columns" (1 would mean rows)
##        carat        depth        price 
##    0.7979397   61.7494049 3932.7997219
apply(diamonds_part, 2, which.max)  # Row number of the largest value in each column
## carat depth price 
## 27416 52861 27750
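
To illustrate the other members of the apply family mentioned above, here is a minimal sketch using a tiny made-up data frame. The name scores and its columns are assumptions for this example only. It also uses sapply(), a close relative of lapply() that simplifies its result to a vector where it can.

```r
# Hypothetical data: two numeric columns and a grouping column
scores <- data.frame(
  group = c("a", "a", "b", "b"),
  x = c(1, 2, 3, 4),
  y = c(10, 20, 30, 40)
)

# lapply() treats the data frame as a list of columns and returns a list
col_means_list <- lapply(scores[, c("x", "y")], mean)

# sapply() does the same but simplifies the result to a named numeric vector
col_means_vec <- sapply(scores[, c("x", "y")], mean)

# tapply() splits x by group and applies mean() to each piece
group_means <- tapply(scores$x, scores$group, mean)

print(col_means_vec)
print(group_means)
```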

Some general tips on building loops: