In this document, we use four different methods: loop, reduce, recursion, and memorization, to compute ‘factorial’ first. In the end, we use microbenchmark package, which is a standard metric in R, to evaluate their performances in time.
library(purrr)
library(microbenchmark)
Note that purrr package will be used in reduce method.
We use a ‘for loop’ to compute the result directly:
factorial_loop <- function(n){
if (n ==0){
1
}else{
y <- 1
for(i in 1:n){
y <-y*i
}
y
}
}
‘Reduce’ is a method that can propagate the result from right to left.
factorial_reduce <- function(n){
if(n ==0){
1
}else{
reduce(c(1:n), function(x, y){
#message("x is ", x)
#message("y is ", y)
#message("")
x * y
})
}
}
‘Recusrion’ is a simple way to solve the problem by calling its sub-problem repeatly until the initial condition is matched.
factorial_func <- function(n){
if (n == 0){
1
}else{
n * factorial_func(n-1)
}
}
Different from Recursion, memorization will keep the results by extra memory space.
fac_tbl <- c (1, rep(NA, 24))
factorial_mem <- function(n){
stopifnot(n >=0)
if (n == 0){
1
}else{
if(!is.na(fac_tbl[n])){
fac_tbl[n]
} else {
fac_tbl[n-1] <<- factorial_mem(n-1)
n * fac_tbl[n-1]
}
}
}
We use microbenchmark package to evaluate these four methods’ performance in time. Note that the unit is in micro-second.
evaluate_by_microbenchmark <- function(n){
record_temp_perf <- summary(microbenchmark(factorial_loop(n),
factorial_reduce(n),
factorial_func(n),
factorial_mem(n)))
record_temp_perf.df <- data.frame(record_temp_perf)
print(record_temp_perf.df)
}
For example, we can use four methods to compute factorial number and evaluate them:
evaluate_by_microbenchmark(3)
## expr min lq mean median uq max
## 1 factorial_loop(n) 1.550 1.9355 2.73572 2.1720 2.3580 45.946
## 2 factorial_reduce(n) 11.631 12.8825 17.27129 14.1715 15.2665 264.492
## 3 factorial_func(n) 2.655 3.0415 4.16900 3.2505 3.5475 24.488
## 4 factorial_mem(n) 18.091 19.3935 21.95915 20.4660 21.8890 63.460
## neval
## 1 100
## 2 100
## 3 100
## 4 100
Note that the output gives summary statistics (min, lq, mean, median, uq, and max) describing the time it took to run the two function over the 100 iterations of each function call so ‘neval’ is equal to 100. Moreover, ‘lq’ and ‘uq’ means 25% and 75% of the distribution, respectively.
We then compute factorial number from 1 to 12 and evaluate them respectively. Note that in my computer, factorial number 13 will cause overflow.
for (i in 1:12){
evaluate_by_microbenchmark(i)
}
## expr min lq mean median uq max neval
## 1 factorial_loop(n) 1.149 1.5170 2.24940 1.7920 2.1430 10.808 100
## 2 factorial_reduce(n) 8.451 9.4375 19.87388 10.2415 12.3555 620.166 100
## 3 factorial_func(n) 1.209 1.5475 2.21252 1.9060 2.3080 10.048 100
## 4 factorial_mem(n) 8.307 9.6540 16.12930 10.6590 19.1825 77.068 100
## expr min lq mean median uq max neval
## 1 factorial_loop(n) 1.423 1.8275 2.61825 2.1400 2.9350 9.890 100
## 2 factorial_reduce(n) 10.874 11.8085 16.65233 12.8725 14.9400 90.638 100
## 3 factorial_func(n) 2.003 2.3915 3.25551 2.5715 2.8195 29.530 100
## 4 factorial_mem(n) 7.955 9.5520 13.31883 10.1725 10.9125 48.513 100
## expr min lq mean median uq max neval
## 1 factorial_loop(n) 1.575 2.0110 2.19815 2.1905 2.312 6.990 100
## 2 factorial_reduce(n) 11.791 12.9190 14.31068 14.1230 14.659 59.775 100
## 3 factorial_func(n) 2.642 2.9770 3.20049 3.1700 3.364 5.897 100
## 4 factorial_mem(n) 18.426 19.3735 20.77403 20.2045 20.951 43.628 100
## expr min lq mean median uq max neval
## 1 factorial_loop(n) 1.774 2.2095 2.51121 2.4080 2.5565 16.580 100
## 2 factorial_reduce(n) 13.340 14.2705 15.52745 15.3300 15.8095 48.341 100
## 3 factorial_func(n) 3.476 3.8380 4.08287 4.0180 4.2390 7.209 100
## 4 factorial_mem(n) 18.191 19.1935 20.77293 20.1255 21.0680 52.603 100
## expr min lq mean median uq max
## 1 factorial_loop(n) 2.043 2.5640 3.75356 2.9550 4.3925 20.916
## 2 factorial_reduce(n) 14.796 16.3675 22.99093 17.7460 20.7055 130.597
## 3 factorial_func(n) 4.165 4.7680 6.02899 5.1355 5.6030 11.485
## 4 factorial_mem(n) 18.782 20.5885 28.41639 21.8765 41.7235 87.003
## neval
## 1 100
## 2 100
## 3 100
## 4 100
## expr min lq mean median uq max neval
## 1 factorial_loop(n) 2.261 2.7120 3.04454 2.9620 3.1965 8.318 100
## 2 factorial_reduce(n) 15.951 17.1070 18.98521 18.2090 18.8830 55.206 100
## 3 factorial_func(n) 4.869 5.3215 5.69003 5.6230 5.8260 14.508 100
## 4 factorial_mem(n) 18.333 19.6220 22.15418 20.5635 21.1710 93.079 100
## expr min lq mean median uq max neval
## 1 factorial_loop(n) 2.469 2.9495 3.29283 3.1395 3.3620 16.507 100
## 2 factorial_reduce(n) 17.176 18.6690 19.93469 19.5535 20.2340 51.081 100
## 3 factorial_func(n) 5.745 6.0945 6.48984 6.4150 6.7040 10.069 100
## 4 factorial_mem(n) 17.961 19.3035 20.85047 20.5600 21.1975 51.354 100
## expr min lq mean median uq max
## 1 factorial_loop(n) 2.745 3.2870 4.02985 3.5050 3.7765 19.735
## 2 factorial_reduce(n) 18.754 20.6295 23.32073 21.5390 22.4615 80.784
## 3 factorial_func(n) 6.408 6.8370 8.06142 7.2595 7.7770 28.331
## 4 factorial_mem(n) 18.647 19.9920 22.63953 20.9055 21.6000 104.610
## neval
## 1 100
## 2 100
## 3 100
## 4 100
## expr min lq mean median uq max
## 1 factorial_loop(n) 3.020 3.6195 4.92024 3.9350 6.5775 9.732
## 2 factorial_reduce(n) 20.427 22.2935 30.16569 23.3285 34.0840 99.081
## 3 factorial_func(n) 7.134 7.9330 10.87285 8.5605 14.0570 26.381
## 4 factorial_mem(n) 18.517 20.8025 32.16427 22.8065 42.4115 159.118
## neval
## 1 100
## 2 100
## 3 100
## 4 100
## expr min lq mean median uq max
## 1 factorial_loop(n) 3.217 3.8530 4.62227 4.2020 4.4800 17.430
## 2 factorial_reduce(n) 22.560 24.2090 30.21288 25.0180 26.7560 167.815
## 3 factorial_func(n) 7.996 8.7460 10.00581 9.0820 9.3840 34.991
## 4 factorial_mem(n) 18.424 20.8415 26.19596 21.7095 22.7685 180.932
## neval
## 1 100
## 2 100
## 3 100
## 4 100
## expr min lq mean median uq max
## 1 factorial_loop(n) 3.332 3.9100 5.06758 4.3540 4.8475 33.651
## 2 factorial_reduce(n) 22.085 25.3845 30.59537 26.6275 28.2010 84.275
## 3 factorial_func(n) 8.609 9.6160 11.14002 10.0200 10.5135 24.144
## 4 factorial_mem(n) 18.510 20.6015 26.57481 21.7240 23.7470 115.039
## neval
## 1 100
## 2 100
## 3 100
## 4 100
## expr min lq mean median uq max
## 1 factorial_loop(n) 3.645 4.1595 4.52455 4.4165 4.7395 8.207
## 2 factorial_reduce(n) 23.597 26.4670 30.32473 27.5030 28.9965 131.069
## 3 factorial_func(n) 9.234 10.1055 11.20232 10.5680 10.9850 31.583
## 4 factorial_mem(n) 18.211 20.1095 22.68465 21.0175 22.3090 88.959
## neval
## 1 100
## 2 100
## 3 100
## 4 100
With these data, we finally draw the trends by their means
time_data <- data.frame(factorial_loop=c(2.1395,1.97,2.37,3.215,4.03,3.11,3.74,4.49,4.35,5.25,4.40,4.70),
factorial_reduce=c(11.38,12.38,17.51,21.186,25.29,20.62,23.40,26.49,28.44,28.88,26.36,30.48),
factorial_func=c(2.414,2.48,3.79,5.41,6.82,6.06,6.89,9.65,9.67,9.85,10.03,11.26),
factorial_mem=c(11.2,9.69,23.96,28.39,31.47,22.24,24.87,27.715,25.39,24.73,21.44,23.23),
number=c(1,2,3,4,5,6,7,8,9,10,11,12))
and then draw the line graph.
library(ggplot2)
library(reshape2)
data_long <- melt(time_data, id="number")
ggplot(data=data_long,
aes(x=number, y=value, colour=variable)) +
geom_line()
According to the graph shown above, Loop generally has the best performance and its increase is close to constant. The main reason is that it just scans the input one time and then obtains the answer.
Recursion is the next but its increase is dramatic sine it will call its sub-problem and save no intermediate result.
Memorization has the worst performance in the beginning since it is very close to what Recursion does but still has extra I/O time to keep intermediate results. However, its benefit will be gained when n is larger since what it has to do is to obtain the saved result and operate a multiply operation.
Reduce’s spirit are accumulation and propagation. It has to scan 1 to n and accumulates all intermediates. Since it has most operations and costs most space, it will have worst performance when n is large.
The main drawback is that my computer can not compute the result when n is greater than 13. If this problem can be overcome, I think the trend will become obvious.