Fun with Froot Loops

An Introductory Session on Loops in R

Introduction

What is the big idea behind loops?

“Lulu asks Toucan Sam to pass her a froot loop, one by one, until they’re done!”

A loop in a computer program is an instruction that repeats until a specified condition is reached. Each round in which Lulu asks and Toucan Sam responds is called an iteration.

Loop Structure:

  1. Lulu asks a question.
  2. If the answer requires an action, Toucan Sam does that action.
  3. Lulu asks the same question… again and again…
  4. Toucan Sam does the required action… again and again…
  5. Until Toucan Sam satisfies some condition (i.e., all of the froot loops are gone).
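
The five steps above can be sketched in R as a while loop. This is only an illustration: the variable names froot_loops and iterations, and the starting count of 5, are made up for the example.

```r
# Hypothetical bowl: start with 5 froot loops (an assumed number).
froot_loops <- 5
iterations <- 0

# Lulu's question: "Are there any froot loops left?"
while (froot_loops > 0) {
  # Toucan Sam's action: pass one froot loop to Lulu.
  froot_loops <- froot_loops - 1
  iterations <- iterations + 1
  print(paste("Froot loops left:", froot_loops))
}

# The loop stops once the condition (froot_loops > 0) is no longer TRUE.
print(paste("Number of iterations:", iterations))
```

Each pass through the body is one iteration, so an assumed bowl of 5 froot loops gives 5 iterations.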


Why are they useful?

“Iteration leads to automation!”

Using loops in your programs can save time, space, and money. They are especially useful when you want to apply the same commands or functions across multiple datasets.


What are the types of loops?

High-level programming languages support several types of loops.

Types of Loops:

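  • for loop - runs once for each element of a sequence, so the number of iterations is set in advance
  • while loop - repeats as long as an expression is true; the expression is checked before each iteration
  • do while loop or repeat until loop - runs the body first and checks the expression afterwards, so the body always runs at least once; a do while loop repeats while the expression is true, and a repeat until loop repeats until it becomes true (R has no built-in do while loop; its repeat loop with break fills this role)
  • indefinite loop or endless loop - repeats indefinitely because it has no terminating condition
  • nested loop - a loop placed inside the body of another for, while, or do while loop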
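
Two of these types have not yet appeared in code, so here is a minimal sketch of each in R. The names count and pairs are placeholders chosen for the illustration; repeat and break are R's own keywords.

```r
# repeat: the body always runs at least once; break supplies the exit.
count <- 0
repeat {
  count <- count + 1
  if (count >= 3) {
    break  # without this, the loop would be endless
  }
}

# Nested loop: the inner loop runs in full for every pass of the outer loop.
pairs <- character(0)
for (row in 1:2) {
  for (col in c("a", "b")) {
    pairs <- c(pairs, paste0(row, col))
  }
}

print(count)
print(pairs)
```

Removing the break statement would turn the repeat loop into an endless loop, which is why it needs an explicit exit.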


How can we employ them at PP?

Lots and lots of ways!

For example, consider a survey with a lot of questions on a 4-pt Agree/Disagree scale where the responses are the words “Strongly Agree”, “Agree”, “Disagree”, “Strongly Disagree”. Instead of recoding the responses in each column one by one, you can write a loop that recodes every column according to the same set of rules (e.g., 1 = “Strongly Disagree”).
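
A rough sketch of that idea in base R follows. The data frame survey, its columns Q1 and Q2, and the responses in it are invented for the illustration. Only 1 = “Strongly Disagree” comes from the text; the other three codes are assumed.

```r
# Hypothetical survey data: two questions on the 4-point scale.
survey <- data.frame(
  Q1 = c("Strongly Agree", "Agree", "Disagree"),
  Q2 = c("Strongly Disagree", "Agree", "Strongly Agree"),
  stringsAsFactors = FALSE
)

# Lookup table: names are the response labels, values are the codes.
# Codes 2 to 4 are assumed; only 1 = "Strongly Disagree" is given.
scale_codes <- c("Strongly Disagree" = 1, "Disagree" = 2,
                 "Agree" = 3, "Strongly Agree" = 4)

# Recode every question column with the same lookup.
for (question in c("Q1", "Q2")) {
  survey[[question]] <- unname(scale_codes[survey[[question]]])
}

print(survey)
```

With many question columns, only the vector of column names in the for statement needs to grow; the body stays the same.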


For Loop

Logic Model


R Syntax
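
As a minimal sketch, a for loop in R generally has the shape for (variable in sequence) { body }. The names values, value, and total below are placeholders chosen for the illustration.

```r
# General shape: for (variable in sequence) { body }
values <- c(2, 4, 6)
total <- 0

for (value in values) {
  # The body runs once per element; `value` takes each element in turn.
  total <- total + value
}

print(total)
```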


Example


Say we want output like so:
print(paste("The year is", 2015))
## [1] "The year is 2015"
print(paste("The year is", 2016))
## [1] "The year is 2016"
print(paste("The year is", 2017))
## [1] "The year is 2017"
print(paste("The year is", 2018))
## [1] "The year is 2018"
print(paste("The year is", 2019))
## [1] "The year is 2019"


We can produce the same output more quickly and efficiently with a for loop!
for (year in c(2015, 2016, 2017, 2018, 2019)){
  print(paste("The year is", year))
}
## [1] "The year is 2015"
## [1] "The year is 2016"
## [1] "The year is 2017"
## [1] "The year is 2018"
## [1] "The year is 2019"


If the sequence is longer, we can loop an index over a range written with the colon operator (1989:2019) instead of typing out every value.
for (i in 1989:2019){
  print(paste("The year is", i))
}
## [1] "The year is 1989"
## [1] "The year is 1990"
## [1] "The year is 1991"
## [1] "The year is 1992"
## [1] "The year is 1993"
## [1] "The year is 1994"
## [1] "The year is 1995"
## [1] "The year is 1996"
## [1] "The year is 1997"
## [1] "The year is 1998"
## [1] "The year is 1999"
## [1] "The year is 2000"
## [1] "The year is 2001"
## [1] "The year is 2002"
## [1] "The year is 2003"
## [1] "The year is 2004"
## [1] "The year is 2005"
## [1] "The year is 2006"
## [1] "The year is 2007"
## [1] "The year is 2008"
## [1] "The year is 2009"
## [1] "The year is 2010"
## [1] "The year is 2011"
## [1] "The year is 2012"
## [1] "The year is 2013"
## [1] "The year is 2014"
## [1] "The year is 2015"
## [1] "The year is 2016"
## [1] "The year is 2017"
## [1] "The year is 2018"
## [1] "The year is 2019"
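
An index is also handy when you want to keep the results rather than just print them. The sketch below assumes a vector named messages (a made-up name) and uses base R's seq_along() to step through the positions of years.

```r
years <- 1989:2019

# Pre-allocate a character vector with one slot per year.
messages <- character(length(years))

# seq_along(years) yields the positions 1, 2, ..., length(years).
for (i in seq_along(years)) {
  messages[i] <- paste("The year is", years[i])
}

print(length(messages))
print(messages[1])
```

Here i is a position in the vector rather than the year itself, so years[i] retrieves the value and messages[i] stores the result in the matching slot.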


While Loop

Logic Model


R Syntax
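
As a minimal sketch, a while loop in R generally has the shape while (condition) { body }. The names n and steps below are placeholders chosen for the illustration.

```r
# General shape: while (condition) { body }
n <- 20
steps <- 0

while (n > 1) {
  # The condition is checked before every pass, so the body must
  # eventually make it FALSE or the loop never ends.
  n <- n / 2
  steps <- steps + 1
}

print(steps)
```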


Example


Remember what our last for loop looked like?
for (i in 1989:2019){
  print(paste("The year is", i))
}
## [1] "The year is 1989"
## [1] "The year is 1990"
## [1] "The year is 1991"
## [1] "The year is 1992"
## [1] "The year is 1993"
## [1] "The year is 1994"
## [1] "The year is 1995"
## [1] "The year is 1996"
## [1] "The year is 1997"
## [1] "The year is 1998"
## [1] "The year is 1999"
## [1] "The year is 2000"
## [1] "The year is 2001"
## [1] "The year is 2002"
## [1] "The year is 2003"
## [1] "The year is 2004"
## [1] "The year is 2005"
## [1] "The year is 2006"
## [1] "The year is 2007"
## [1] "The year is 2008"
## [1] "The year is 2009"
## [1] "The year is 2010"
## [1] "The year is 2011"
## [1] "The year is 2012"
## [1] "The year is 2013"
## [1] "The year is 2014"
## [1] "The year is 2015"
## [1] "The year is 2016"
## [1] "The year is 2017"
## [1] "The year is 2018"
## [1] "The year is 2019"


Here is how we would generate the same output with a while loop:
i <- 1989
while (i < 2020){
  print(paste("The year is", i))
  i <- i + 1  # advance the counter so the condition eventually becomes FALSE
}
## [1] "The year is 1989"
## [1] "The year is 1990"
## [1] "The year is 1991"
## [1] "The year is 1992"
## [1] "The year is 1993"
## [1] "The year is 1994"
## [1] "The year is 1995"
## [1] "The year is 1996"
## [1] "The year is 1997"
## [1] "The year is 1998"
## [1] "The year is 1999"
## [1] "The year is 2000"
## [1] "The year is 2001"
## [1] "The year is 2002"
## [1] "The year is 2003"
## [1] "The year is 2004"
## [1] "The year is 2005"
## [1] "The year is 2006"
## [1] "The year is 2007"
## [1] "The year is 2008"
## [1] "The year is 2009"
## [1] "The year is 2010"
## [1] "The year is 2011"
## [1] "The year is 2012"
## [1] "The year is 2013"
## [1] "The year is 2014"
## [1] "The year is 2015"
## [1] "The year is 2016"
## [1] "The year is 2017"
## [1] "The year is 2018"
## [1] "The year is 2019"


How is the While Loop Different from the For Loop?

For loops run through the loop a set number of times. Thus, it is best to use a for loop when you know in advance when the loop should stop.

While loops allow more flexibility in the stopping condition. In particular, if you want to use an existing parameter in your dataset as your loop condition, it is best to use a while loop.
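
To illustrate the difference, here is a sketch of a loop whose stopping point depends on the data rather than on a preset count. The scores vector and the target of 10 are made-up values for the example.

```r
# Hypothetical scores; the values and the target are assumptions.
scores <- c(2.1, 3.7, 4.0, 3.9, 3.6, 1.7)
target <- 10

running_total <- 0
rows_used <- 0

# Stop as soon as the running total reaches the target,
# or when the data runs out, whichever comes first.
while (running_total < target && rows_used < length(scores)) {
  rows_used <- rows_used + 1
  running_total <- running_total + scores[rows_used]
}

print(rows_used)
```

The number of iterations is not written anywhere in the code; it falls out of the values in scores, which is the situation where a while loop fits better than a for loop.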


Relevant Examples

Data Frame

library(readxl)  # provides read_excel()
mydata <- read_excel("dataframe.xlsx", col_names = TRUE)
head(mydata)
## # A tibble: 6 x 6
##   ID    Male  Female   Age EnglishScore MathScore
##   <chr> <chr> <chr>  <dbl>        <dbl>     <dbl>
## 1 id1   Male  <NA>       6          2.1       1.9
## 2 id2   <NA>  Female     7          3.7       3.7
## 3 id3   <NA>  Female     8          4         4  
## 4 id4   <NA>  Female     9          3.9       3.5
## 5 id5   Male  <NA>       7          3.6       3.9
## 6 id6   <NA>  Female     7          1.7       1.7


Revalue For Loop
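# Recode the Male and Female columns (columns 2 and 3) in place.
# revalue() is from the plyr package; the new values are strings because both columns are character.
for (i in colnames(mydata)[2:3]){
  mydata[[i]] <- plyr::revalue(mydata[[i]], replace=c("Male"="0","Female"="1"), warn_missing=FALSE)
}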
head(mydata)
## # A tibble: 6 x 6
##   ID    Male  Female   Age EnglishScore MathScore
##   <chr> <chr> <chr>  <dbl>        <dbl>     <dbl>
## 1 id1   0     <NA>       6          2.1       1.9
## 2 id2   <NA>  1          7          3.7       3.7
## 3 id3   <NA>  1          8          4         4  
## 4 id4   <NA>  1          9          3.9       3.5
## 5 id5   0     <NA>       7          3.6       3.9
## 6 id6   <NA>  1          7          1.7       1.7


New Variable
library(dplyr)  # provides %>%, rowwise(), and mutate()
mydata <- mydata %>%
  rowwise() %>%
  mutate(Gender = sum(as.double(Male), as.double(Female), na.rm=TRUE))
head(mydata)
## Source: local data frame [6 x 7]
## Groups: <by row>
## 
## # A tibble: 6 x 7
##   ID    Male  Female   Age EnglishScore MathScore Gender
##   <chr> <chr> <chr>  <dbl>        <dbl>     <dbl>  <dbl>
## 1 id1   0     <NA>       6          2.1       1.9      0
## 2 id2   <NA>  1          7          3.7       3.7      1
## 3 id3   <NA>  1          8          4         4        1
## 4 id4   <NA>  1          9          3.9       3.5      1
## 5 id5   0     <NA>       7          3.6       3.9      0
## 6 id6   <NA>  1          7          1.7       1.7      1
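
rowwise() makes mutate() work on one row at a time, which is itself a form of iteration. As a rough sketch, the same idea can be written as an explicit for loop in base R. The small data frame toy is made up here to stand in for mydata.

```r
# Made-up stand-in for mydata after the recoding step.
toy <- data.frame(
  Male = c("0", NA, NA),
  Female = c(NA, "1", "1"),
  stringsAsFactors = FALSE
)

# Start with an empty numeric column, then fill it row by row.
toy$Gender <- NA_real_
for (r in seq_len(nrow(toy))) {
  # na.rm = TRUE ignores whichever of the two columns is missing.
  toy$Gender[r] <- sum(as.double(toy$Male[r]),
                       as.double(toy$Female[r]), na.rm = TRUE)
}

print(toy)
```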


Plot For Loop
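# Arrange the plots in a 2 x 2 grid (room for four age groups per page).
par(mfrow = c(2, 2))

# Draw one scatterplot of English vs. Math scores for each age in the data.
for (j in unique(mydata$Age)){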
  mydata_sub <- subset(mydata, Age == j)
  plot(mydata_sub$EnglishScore, mydata_sub$MathScore,
       xlab="English Scores", ylab="Math Scores", main=paste(j,"years old"))
}