In this walkthrough I’ll show some basic for() loops in R using toy examples, then build some for() loops to address practical problems, including running a series of regression models. I’ll also show how to debug a loop by putting it into a function().
I’ll start off with some toy examples to illustrate basic functionality.
for(i in 1:10){
  print(i^2)
}
## [1] 1
## [1] 4
## [1] 9
## [1] 16
## [1] 25
## [1] 36
## [1] 49
## [1] 64
## [1] 81
## [1] 100
Square each value and save it into a vector
First, a vector to hold the data
#replicate the value NA 10 times
storage.vector <- rep(NA,10)
For each value from 1 to 10, square it and save it into storage.vector
for(i in 1:10){
  storage.vector[i] <- i^2
}
Look at the output
storage.vector
## [1] 1 4 9 16 25 36 49 64 81 100
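As a side note, the same result can be had without a loop, because `^` is vectorized in R. The sketch below compares the two approaches; the names `squares.loop` and `squares.vec` are made up for this illustration and are not part of the original example.

```r
# Preallocate a numeric vector of length 10 (filled with 0 rather than NA)
squares.loop <- numeric(10)

# seq_along() generates the index values 1, 2, ..., length(squares.loop)
for (i in seq_along(squares.loop)) {
  squares.loop[i] <- i^2
}

# Vectorized version: square every element of 1:10 in one step
squares.vec <- (1:10)^2

# Both approaches give the same values
all.equal(squares.loop, squares.vec)
```

The loop is still worth writing out when each iteration needs several steps; the vectorized form is just shorter for simple element-wise arithmetic.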
Draw values from a dataframe, square them, then store them to a dataframe.
First, make a dataframe to hold the output of the loop
A list of values (technically a vector)
values <- c(1,3,6,12,24,48)
A dataframe holding these values, and a blank column to hold the output.
output.df <- data.frame(value = values,
                        value.squared = NA)
output.df
## value value.squared
## 1 1 NA
## 2 3 NA
## 3 6 NA
## 4 12 NA
## 5 24 NA
## 6 48 NA
Equivalently we could do this
output.df <- data.frame(value = c(1,3,6,12,24,48),
value.squared = NA)
The number of values
n.values <- length(values)
Or equivalently, using the dataframe
n.values <- dim(output.df)[1]
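A third option is nrow(), which returns the same number as dim()[1]. The sketch below also shows why seq_len() can be a safer way to build the loop index than 1:n when the dataframe might be empty. The objects `demo.df` and `empty.df` are hypothetical and exist only for this illustration.

```r
# A small dataframe with the same layout as output.df
demo.df <- data.frame(value = c(1, 3, 6, 12, 24, 48),
                      value.squared = NA)

# nrow() is shorthand for dim()[1]
n.rows <- nrow(demo.df)
n.rows == dim(demo.df)[1]

# Index values for a loop over the rows
seq_len(n.rows)

# With zero rows, 1:n counts DOWN from 1 to 0,
# so a loop written with 1:n would run twice
empty.df <- demo.df[0, ]
1:nrow(empty.df)

# seq_len(0) is an empty vector, so the loop body would never run
seq_len(nrow(empty.df))
```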
Loop over the rows and square each value
for (i in 1:n.values){
  #get value in row i of the "value" column
  i.current <- output.df[i,"value"]
  #square it
  i.squared <- i.current^2
  #save it to dataframe
  output.df[i,"value.squared"] <- i.squared
}
output.df
## value value.squared
## 1 1 1
## 2 3 9
## 3 6 36
## 4 12 144
## 5 24 576
## 6 48 2304
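For a calculation this simple, the whole column can also be filled in one vectorized step. This is only a sketch for comparison; `squares.df` is a made-up name so the original output.df is left alone.

```r
# Same input values as the example above
squares.df <- data.frame(value = c(1, 3, 6, 12, 24, 48))

# Square the entire "value" column at once and store it as a new column
squares.df$value.squared <- squares.df$value^2

squares.df
```

The explicit loop becomes more useful when each row needs several steps, as in the regression example later on.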
The same for loop, wrapped in a function
loop.in.function <- function(){
  for (i in 1:n.values){
    #get value in row i of the "value" column
    i.current <- output.df[i,"value"]
    #square it
    i.squared <- i.current^2
    #save it to dataframe
    output.df[i,"value.squared"] <- i.squared
  }
  #a helpful reminder
  cat("NOTE: If you want to save the output, must assign it to an object\n")
  #MUST INCLUDE return() to send output out of function land
  return(output.df)
}
Run the loop
loop.in.function()
## NOTE: If you want to save the output, must assign it to an object
## value value.squared
## 1 1 1
## 2 3 9
## 3 6 36
## 4 12 144
## 5 24 576
## 6 48 2304
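The reason return() matters is that assignments inside a function change a local copy, not the object in the workspace. A minimal sketch of that behavior, using the hypothetical names `scope.df` and `square.in.function` rather than the objects above:

```r
# A dataframe in the workspace with an empty output column
scope.df <- data.frame(value = c(1, 3, 6),
                       value.squared = NA)

square.in.function <- function(){
  for (i in 1:nrow(scope.df)){
    # This assignment modifies a copy of scope.df that lives
    # only inside the function
    scope.df[i, "value.squared"] <- scope.df[i, "value"]^2
  }
  # Send the modified copy back out
  return(scope.df)
}

# Capture the returned dataframe
result <- square.in.function()

# The returned copy has the squares filled in
result$value.squared

# The workspace copy is unchanged: still all NA
scope.df$value.squared
```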
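If you need to debug, call debugonce() on the function; the next time the function is called, R steps through it one line at a time.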
debugonce(loop.in.function)
For each species in a dataset, how many rows does that species occur in? There are easier ways to answer this question; this is just for illustration.
Load data for Lloyd et al 2016 PeerJ paper.
dat <- read.csv(file = "PALO_data.csv")
head(dat)
## Location Species Year Session spp.ann.tot net.hours tot.years.obs
## 1 PALO BAWW 1997 NA 0 NA 4
## 2 PALO BAWW 2007 NA 0 NA 4
## 3 PALO BAWW 2002 1 1 941.0 4
## 4 PALO BAWW 2010 1 1 1095.0 4
## 5 PALO BAWW 2008 1 1 591.5 4
## 6 PALO BAWW 2003 NA 0 NA 4
## caps.per.1K.nethours net.hours.log
## 1 NA NA
## 2 NA NA
## 3 1.062699 6.846943
## 4 0.913242 6.998510
## 5 1.690617 6.382662
## 6 NA NA
unique() extracts each unique species name
species.vector <- unique(dat$Species)
The vector has one element for each species; the printed output further down lists 20. Use length() to determine the number of elements and save it.
n.spp <- length(species.vector)
A vector of index values, 1 through the number of species
index.values <- 1:n.spp
Dataframe to hold output
output <- data.frame(species.vector,
N.rows = NA)
We can loop over each species and determine how many rows of that species there are
for(i in index.values){
  #extract current species
  current.spp <- species.vector[i]
  #determine which rows of the dataset have a $Species value
  #that matches the current species
  index.current.spp <- which(dat$Species == current.spp)
  #determine the length of index.current.spp
  #to count up the number of rows
  n.current.spp <- length(index.current.spp)
  #save the output to the row of the dataframe
  #corresponding to the current species
  output$N.rows[i] <- n.current.spp
}
I could write this with less preliminary code
#vector of each species
species.vector <- unique(dat$Species)
#dataframe to hold output
output <- data.frame(species.vector,
N.rows = NA)
for(i in 1:length(species.vector)){
index.current.spp <- which(dat$Species == species.vector[i])
output$N.rows[i] <- length(index.current.spp)
}
output
## species.vector N.rows
## 1 BAWW 11
## 2 BCPT 11
## 3 BITH 11
## 4 BTBW 11
## 5 GABU 11
## 6 GAEL 11
## 7 GTGT 11
## 8 HHTA 11
## 9 HIEM 11
## 10 HIPE 11
## 11 HISP 11
## 12 HITR 11
## 13 HIWO 11
## 14 LATH 11
## 15 NBTO 11
## 16 OVEN 11
## 17 RLTH 11
## 18 RTSO 11
## 19 SSHA 11
## 20 WCHT 11
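One of the easier ways mentioned earlier is table(), which counts the rows for every species in a single call. Because PALO_data.csv may not be at hand, the sketch below uses a small made-up dataframe, `toy.dat`; its species codes and counts are invented for illustration and are not the Lloyd data.

```r
# Hypothetical data: 6 rows, 3 species
toy.dat <- data.frame(Species = c("BAWW", "BAWW", "OVEN",
                                  "BITH", "OVEN", "BAWW"),
                      Year = c(1997, 2002, 1997, 2002, 2002, 2007))

# Count the number of rows for each species
counts <- table(toy.dat$Species)
counts

# Convert to a dataframe laid out like the loop's output
counts.df <- as.data.frame(counts, stringsAsFactors = FALSE)
names(counts.df) <- c("species.vector", "N.rows")
counts.df
```

Note that table() sorts the species alphabetically, while the loop keeps them in the order returned by unique().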
function.for <- function() {
for(i in 1:length(species.vector)){
index.current.spp <- which(dat$Species == species.vector[i])
output$N.rows[i] <- length(index.current.spp)
}
# must return output of the function!
return(output)
}
Run the for loop by calling the function
function.for()
## species.vector N.rows
## 1 BAWW 11
## 2 BCPT 11
## 3 BITH 11
## 4 BTBW 11
## 5 GABU 11
## 6 GAEL 11
## 7 GTGT 11
## 8 HHTA 11
## 9 HIEM 11
## 10 HIPE 11
## 11 HISP 11
## 12 HITR 11
## 13 HIWO 11
## 14 LATH 11
## 15 NBTO 11
## 16 OVEN 11
## 17 RLTH 11
## 18 RTSO 11
## 19 SSHA 11
## 20 WCHT 11
To save the output of the function, you must assign it to an object
output <- function.for()
output
## species.vector N.rows
## 1 BAWW 11
## 2 BCPT 11
## 3 BITH 11
## 4 BTBW 11
## 5 GABU 11
## 6 GAEL 11
## 7 GTGT 11
## 8 HHTA 11
## 9 HIEM 11
## 10 HIPE 11
## 11 HISP 11
## 12 HITR 11
## 13 HIWO 11
## 14 LATH 11
## 15 NBTO 11
## 16 OVEN 11
## 17 RLTH 11
## 18 RTSO 11
## 19 SSHA 11
## 20 WCHT 11
Call debugonce() on the function
debugonce(function.for)
The Lloyd data contains time series of counts on 23 species of birds. The following loop runs a regression on each species and saves the output.
species.vector <- unique(dat$Species)
output.df <- data.frame(Species = species.vector,
intercept = NA,
year = NA)
regression.loop <- function(){
  for(i in 1:length(species.vector)){
    spp.index <- which(dat$Species ==
                         species.vector[i])
    data.subset <- dat[spp.index,]
    glm.out <- glm(spp.ann.tot ~ Year,
                   data.subset,
                   family = "poisson",
                   offset = net.hours.log)
    output.df[i,c("intercept",
                  "year")] <- coef(glm.out)
  }
  return(output.df)
}
Run it
output <- regression.loop()
head(output)
## Species intercept year
## 1 BAWW -53.89253 0.02355041
## 2 BCPT -72.89729 0.03340652
## 3 BITH 10.34299 -0.00777472
## 4 BTBW -93.09204 0.04367571
## 5 GABU 24.57528 -0.01528563
## 6 GAEL 137.06044 -0.07159340
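The function above relies on dat, species.vector and output.df already existing in the workspace. One possible variation, sketched below, passes the data in as an argument and builds the output dataframe inside the function, so it can be run on any dataframe with the same columns. The function name `fit.by.species` and the simulated `toy` data are assumptions made for this illustration; the counts are random numbers, not the Lloyd data, and the offset is written with offset() in the formula instead of the offset argument.

```r
# Simulated stand-in for the real data: 3 made-up species, 10 years each
set.seed(1)
toy <- data.frame(Species = rep(c("AAAA", "BBBB", "CCCC"), each = 10),
                  Year = rep(2001:2010, times = 3),
                  net.hours.log = log(runif(30, min = 500, max = 1000)))
toy$spp.ann.tot <- rpois(30, lambda = 3)

fit.by.species <- function(dat){
  # one row of output per species
  species.vector <- unique(dat$Species)
  output.df <- data.frame(Species = species.vector,
                          intercept = NA,
                          year = NA)
  for(i in seq_along(species.vector)){
    # rows belonging to the current species
    data.subset <- dat[dat$Species == species.vector[i], ]
    # Poisson regression of count on year, with log net hours as an offset
    glm.out <- glm(spp.ann.tot ~ Year + offset(net.hours.log),
                   data = data.subset,
                   family = poisson)
    # store the intercept and slope in row i
    output.df[i, c("intercept", "year")] <- coef(glm.out)
  }
  return(output.df)
}

toy.output <- fit.by.species(toy)
toy.output
```

Calling debugonce(fit.by.species) before running it works the same way as for the functions above.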