Time Comparison

I wanted to compare the runtimes of a for loop, a while loop, and R's built-in functions after reading that loops are not particularly efficient in R. I wrote a wrapper function that times how long another function takes to run, and then three summing functions: one using a for loop, one using a while loop, and one using the built-in sum function.

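# Time a single call to func(...) and return the elapsed time in seconds.
timer <- function(func, ...){
  start <- Sys.time()
  func(...)
  # Fix the units to seconds; a bare difftime picks its own units (secs, mins, ...).
  as.numeric(difftime(Sys.time(), start, units = "secs"))
}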
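
As an aside, base R also ships system.time(), which does the same kind of measurement. The sketch below is only an illustration of that alternative, not the wrapper used for the results in this post; the name timeElapsed is made up for the example.

```r
# Hypothetical alternative to timer(), built on base R's system.time().
timeElapsed <- function(func, ...) {
  timing <- system.time(func(...))
  # "elapsed" is the wall-clock time in seconds.
  timing[["elapsed"]]
}

# Example: time the built-in sum over the first one million numbers.
elapsed <- timeElapsed(function(n) sum(as.numeric(1:n)), 1e6)
print(elapsed)
```

Unlike a raw difference of two Sys.time() values, the elapsed entry is always reported in seconds.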

# Sum the integers 1..x with a for loop.
forLoop <- function(x){
  total <- 0
  for(i in 1:x){
    total <- total + i
  }
  total  # return the sum so the result can be checked
}

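# Sum the integers 1..x with a while loop.
whileLoop <- function(x){
  i <- 1  # start at 1 and run through x so the result matches forLoop
  total <- 0
  while(i <= x){
    total <- total + i
    i <- i + 1
  }
  total  # return the sum so the result can be checked
}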

sumLoop <- function(x){
  sum(as.numeric(1:x))
}
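
Before timing anything, it is worth checking that a summing function gives the right answer. The sum of the first n integers has the closed form n(n+1)/2, so a quick sanity check might look like the sketch below. The helper expectedSum is a name invented for this example.

```r
# Hypothetical helper (not part of the benchmark): closed-form sum of 1..n.
expectedSum <- function(n) {
  n * (n + 1) / 2
}

n <- 1e6
builtIn <- sum(as.numeric(1:n))

# Both values are whole numbers far below 2^53, so an exact comparison is safe here.
print(builtIn == expectedSum(n))
```

The same comparison can be applied to the result of any of the three summing functions.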

I then ran each function on ten input sizes, summing the first 1 million numbers, then the first 2 million, and so on up to the first 10 million, and stored the runtime of each run. They are plotted below.

# One slot per input size (10 runs each), stored as numbers rather than logicals.
forVector <- numeric(10)
whileVector <- numeric(10)
sumVector <- numeric(10)
con <- 1000000
for(i in 1:10){
  forVector[i] <- timer(forLoop, i*con)
  whileVector[i] <- timer(whileLoop, i*con)
  sumVector[i] <- timer(sumLoop, i*con)
}
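
Each size is timed only once here, so a single measurement can be thrown off by whatever else the machine is doing. One possible refinement, sketched below under the assumption that a few repeats per size are affordable, is to take the median of several runs. The helper medianTime and its reps argument are hypothetical and were not used for the plot.

```r
# Hypothetical helper: median elapsed time (seconds) over `reps` runs of func(...).
medianTime <- function(func, ..., reps = 5) {
  times <- numeric(reps)
  for (k in seq_len(reps)) {
    times[k] <- system.time(func(...))[["elapsed"]]
  }
  median(times)
}

# Example: median of 3 runs of the built-in sum over the first million numbers.
m <- medianTime(function(n) sum(as.numeric(1:n)), 1e6, reps = 3)
print(m)
```

Note that reps has to be passed by name, because it comes after the ... in the argument list.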

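s <- seq(1,10,1)
# Take ylim from all three series so no line is clipped, and label the plot.
plot(whileVector~s, type='l', col='green', ylim=range(whileVector, forVector, sumVector),
     xlab='n (millions)', ylab='runtime')
lines(forVector~s, col='red')
lines(sumVector~s, col='blue')
legend('topleft', legend=c('while','for','sum'), col=c('green','red','blue'), lty=1)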

The time difference is amazing! It is pretty clear that when working with large data sets it is important to make wise decisions about how to process the data. It appears that the built-in R function is the superior choice. I was also taken aback by the difference between the while loop and the for loop. I can only speculate that this is because the while loop performs two assignments and a comparison on every iteration, while the for loop performs only one assignment.