An Introductory Session on Loops in R
“Lulu asks Toucan Sam to pass her a froot loop, one by one, until they’re done!”
A loop in a computer program is an instruction that repeats until a specified condition is reached. Each time Lulu asks and Toucan Sam passes her a froot loop is called an iteration.
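The froot loop story can be written as a small loop. This is only a sketch: the variable names `froot_loops` and `iterations` and the starting count of 3 are made up for illustration.

```r
# Hypothetical starting count of froot loops in the bowl.
froot_loops <- 3
iterations <- 0

# Keep passing loops until none are left (the stopping condition).
while (froot_loops > 0) {
  froot_loops <- froot_loops - 1   # Toucan Sam passes one froot loop
  iterations <- iterations + 1     # one more iteration completed
  print(paste("Iteration", iterations, "- loops left:", froot_loops))
}
## [1] "Iteration 1 - loops left: 2"
## [1] "Iteration 2 - loops left: 1"
## [1] "Iteration 3 - loops left: 0"
```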
Loop Structure:
“Iteration leads to automation!”
Using loops in your programs can save you time, space, and money. They are also useful when you want to apply the same commands or functions across multiple datasets.
High-level programming languages support several types of loops.
Types of Loops:
Lots and lots of ways!
For example, consider a survey with many questions on a 4-point Agree/Disagree scale, where the responses are the words “Strongly Agree”, “Agree”, “Disagree”, and “Strongly Disagree”. Instead of recoding the responses in each column one by one, you can write a loop that recodes every response according to a set of conditions (e.g., 1 = “Strongly Disagree”).
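A minimal sketch of that recoding idea follows. The data frame `survey`, its column names `Q1` to `Q3`, and the responses are hypothetical. Only 1 = “Strongly Disagree” comes from the text above; the codes 2 to 4 are an assumed ordering.

```r
# Hypothetical survey data: three questions, three respondents.
survey <- data.frame(
  Q1 = c("Strongly Agree", "Disagree", "Agree"),
  Q2 = c("Agree", "Agree", "Strongly Disagree"),
  Q3 = c("Disagree", "Strongly Agree", "Agree"),
  stringsAsFactors = FALSE
)

# Lookup table: names are the old labels, values are the new codes.
# Only 1 = "Strongly Disagree" is given in the text; 2-4 are assumed.
codes <- c("Strongly Disagree" = 1, "Disagree" = 2,
           "Agree" = 3, "Strongly Agree" = 4)

# Recode every column with the same lookup instead of one by one.
for (q in colnames(survey)) {
  survey[[q]] <- unname(codes[survey[[q]]])
}

print(survey)
##   Q1 Q2 Q3
## 1  4  3  2
## 2  2  3  4
## 3  3  1  3
```

Adding a fourth question would only mean adding a column; the loop itself would not change.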
print(paste("The year is", 2015))
## [1] "The year is 2015"
print(paste("The year is", 2016))
## [1] "The year is 2016"
print(paste("The year is", 2017))
## [1] "The year is 2017"
print(paste("The year is", 2018))
## [1] "The year is 2018"
print(paste("The year is", 2019))
## [1] "The year is 2019"
for (year in c(2015, 2016, 2017, 2018, 2019)) {
  print(paste("The year is", year))
}
## [1] "The year is 2015"
## [1] "The year is 2016"
## [1] "The year is 2017"
## [1] "The year is 2018"
## [1] "The year is 2019"
for (i in 1989:2019) {
  print(paste("The year is", i))
}
## [1] "The year is 1989"
## [1] "The year is 1990"
## [1] "The year is 1991"
## [1] "The year is 1992"
## [1] "The year is 1993"
## [1] "The year is 1994"
## [1] "The year is 1995"
## [1] "The year is 1996"
## [1] "The year is 1997"
## [1] "The year is 1998"
## [1] "The year is 1999"
## [1] "The year is 2000"
## [1] "The year is 2001"
## [1] "The year is 2002"
## [1] "The year is 2003"
## [1] "The year is 2004"
## [1] "The year is 2005"
## [1] "The year is 2006"
## [1] "The year is 2007"
## [1] "The year is 2008"
## [1] "The year is 2009"
## [1] "The year is 2010"
## [1] "The year is 2011"
## [1] "The year is 2012"
## [1] "The year is 2013"
## [1] "The year is 2014"
## [1] "The year is 2015"
## [1] "The year is 2016"
## [1] "The year is 2017"
## [1] "The year is 2018"
## [1] "The year is 2019"
i <- 1989
while (i < 2020) {
  print(paste("The year is", i))
  i <- i + 1
}
## [1] "The year is 1989"
## [1] "The year is 1990"
## [1] "The year is 1991"
## [1] "The year is 1992"
## [1] "The year is 1993"
## [1] "The year is 1994"
## [1] "The year is 1995"
## [1] "The year is 1996"
## [1] "The year is 1997"
## [1] "The year is 1998"
## [1] "The year is 1999"
## [1] "The year is 2000"
## [1] "The year is 2001"
## [1] "The year is 2002"
## [1] "The year is 2003"
## [1] "The year is 2004"
## [1] "The year is 2005"
## [1] "The year is 2006"
## [1] "The year is 2007"
## [1] "The year is 2008"
## [1] "The year is 2009"
## [1] "The year is 2010"
## [1] "The year is 2011"
## [1] "The year is 2012"
## [1] "The year is 2013"
## [1] "The year is 2014"
## [1] "The year is 2015"
## [1] "The year is 2016"
## [1] "The year is 2017"
## [1] "The year is 2018"
## [1] "The year is 2019"
For loops run through the loop body a set number of times. Thus, it is best to use a for loop when you know in advance when the loop should stop.
While loops allow more flexibility in the stopping condition. In particular, if you want to use an existing parameter in your dataset as the loop condition, it is best to use a while loop.
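As a rough illustration of the difference, the sketch below uses made-up numbers (the `scores` vector and the `target` of 10 are assumptions, not part of the session's data). The for loop runs once per element, so its length is known up front. The while loop keeps going until a running total reaches the target, so the number of iterations depends on the data.

```r
# Hypothetical values for illustration only.
scores <- c(2, 4, 4, 3)
target <- 10

# for loop: the number of iterations is fixed by length(scores).
for (k in seq_along(scores)) {
  print(paste("Score", k, "is", scores[k]))
}
## [1] "Score 1 is 2"
## [1] "Score 2 is 4"
## [1] "Score 3 is 4"
## [1] "Score 4 is 3"

# while loop: stop once the running total reaches the target
# (or when there are no scores left to add).
total <- 0
n <- 0
while (total < target && n < length(scores)) {
  n <- n + 1
  total <- total + scores[n]
}
print(paste("Needed", n, "scores to reach a total of", total))
## [1] "Needed 3 scores to reach a total of 10"
```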
library(readxl)  # provides read_excel()
mydata <- read_excel("dataframe.xlsx", col_names = TRUE)
head(mydata)
## # A tibble: 6 x 6
## ID Male Female Age EnglishScore MathScore
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 id1 Male <NA> 6 2.1 1.9
## 2 id2 <NA> Female 7 3.7 3.7
## 3 id3 <NA> Female 8 4 4
## 4 id4 <NA> Female 9 3.9 3.5
## 5 id5 Male <NA> 7 3.6 3.9
## 6 id6 <NA> Female 7 1.7 1.7
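library(plyr)  # provides revalue()
# Recode the labels in columns 2 and 3 to "0"/"1" (the columns stay character).
for (i in colnames(mydata)[2:3]) {
  mydata[[i]] <- revalue(mydata[[i]], replace = c("Male" = "0", "Female" = "1"), warn_missing = FALSE)
}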
head(mydata)
## # A tibble: 6 x 6
## ID Male Female Age EnglishScore MathScore
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 id1 0 <NA> 6 2.1 1.9
## 2 id2 <NA> 1 7 3.7 3.7
## 3 id3 <NA> 1 8 4 4
## 4 id4 <NA> 1 9 3.9 3.5
## 5 id5 0 <NA> 7 3.6 3.9
## 6 id6 <NA> 1 7 1.7 1.7
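library(dplyr)  # provides %>%, rowwise(), and mutate()
# Combine the two indicator columns into one numeric Gender column (0 = Male, 1 = Female).
mydata <- mydata %>%
  rowwise() %>%
  mutate(Gender = sum(as.double(Male), as.double(Female), na.rm = TRUE))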
head(mydata)
## Source: local data frame [6 x 7]
## Groups: <by row>
##
## # A tibble: 6 x 7
## ID Male Female Age EnglishScore MathScore Gender
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 id1 0 <NA> 6 2.1 1.9 0
## 2 id2 <NA> 1 7 3.7 3.7 1
## 3 id3 <NA> 1 8 4 4 1
## 4 id4 <NA> 1 9 3.9 3.5 1
## 5 id5 0 <NA> 7 3.6 3.9 0
## 6 id6 <NA> 1 7 1.7 1.7 1
par(mfrow = c(2, 2))  # arrange the plots in a 2 x 2 grid
# Draw one English-vs-Math scatterplot for each distinct age.
for (j in unique(mydata$Age)) {
  mydata_sub <- subset(mydata, Age == j)
  plot(mydata_sub$EnglishScore, mydata_sub$MathScore,
       xlab = "English Scores", ylab = "Math Scores", main = paste(j, "years old"))
}