1 Goal


The goal of this tutorial is to learn how to create vectors efficiently, using as little computational power as possible.


2 Data Preparation


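Let’s see three different methods for creating the same vector: the colon operator, seq() (with and without the by argument), and a for loop that fills the vector element by element: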

x <- 1:5
x_seq <- seq(from = 1, to = 5)
x_seq1 <- seq(from = 1, to = 5, by = 1)
x_loop <- vector(mode = "integer", length = 0)

for (i in 1:5) {
  x_loop[i] <- i
}
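
Before timing anything, it may be worth checking that the four vectors really hold the same values. The following is a minimal sketch that repeats the definitions above so it can run on its own. It relies only on base R functions (all.equal(), identical(), typeof()); the list names passed to sapply() are just labels chosen for this illustration.

```r
# Same definitions as above, repeated so the block runs on its own
x      <- 1:5
x_seq  <- seq(from = 1, to = 5)
x_seq1 <- seq(from = 1, to = 5, by = 1)
x_loop <- vector(mode = "integer", length = 0)
for (i in 1:5) {
  x_loop[i] <- i
}

# all.equal() compares the values with a numeric tolerance
same_values <- c(
  x_seq  = isTRUE(all.equal(x, x_seq)),
  x_seq1 = isTRUE(all.equal(x, x_seq1)),
  x_loop = isTRUE(all.equal(x, x_loop))
)
print(same_values)

# typeof() shows how each vector is stored internally
storage <- sapply(
  list(x = x, x_seq = x_seq, x_seq1 = x_seq1, x_loop = x_loop),
  typeof
)
print(storage)
```

If the storage types differ between the vectors, identical() would report them as different even though all.equal() treats the values as equal, which is why both checks can be useful.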

3 Using the microbenchmark function


It seems that all the methods lead to the same values, and that is true (although seq() with by = 1 stores them as doubles rather than integers). But do they all have the same computational efficiency? Let’s check it with a package that runs each expression many times and measures how long it takes to execute.

library(microbenchmark)
microbenchmark(
  x <- 1:5,
  x_seq <- seq(from = 1, to = 5),
  x_seq1 <- seq(from = 1, to = 5, by = 1),
  x_loop <- vector(mode = "integer", length = 0),
  for (i in 1:5) {
    x_loop[i] <- i
  }
)
## Unit: nanoseconds
##                                            expr     min        lq
##                                        x <- 1:5       0       0.0
##                  x_seq <- seq(from = 1, to = 5)    5531    6716.0
##         x_seq1 <- seq(from = 1, to = 5, by = 1)   15802   19160.5
##  x_loop <- vector(mode = "integer", length = 0)       0     395.0
##           for (i in 1:5) {     x_loop[i] <- i } 1409187 1461927.5
##        mean  median        uq     max neval
##      359.60     395     790.0    1580   100
##    10168.88    8296   13037.0   38321   100
##    28452.43   26469   34172.5   63210   100
##     1189.25    1185    1580.0   11062   100
##  1596208.69 1526125 1596643.0 2552496   100

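As can be seen, the colon operator is by far the fastest option here: its median time is 395 ns, against roughly 8 µs for seq(), 26 µs for seq() with by, and about 1.5 ms for the loop. When creating a simple vector, it is better to avoid wrapper functions and loops as much as possible, because they run more internal code to produce the same result.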
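
Note that the call above times the initialisation of x_loop and the for loop as two separate expressions. One possible variant, sketched here, wraps both in braces so that the whole loop-based method is timed as a single unit. The labels colon, seq, seq_by and loop are names chosen for this illustration, and times = 100 simply makes the number of repetitions explicit. The timings will differ from machine to machine, so no output is shown.

```r
library(microbenchmark)

res <- microbenchmark(
  colon  = 1:5,
  seq    = seq(from = 1, to = 5),
  seq_by = seq(from = 1, to = 5, by = 1),
  # Initialisation and loop are timed together as one expression
  loop   = {
    x_loop <- vector(mode = "integer", length = 0)
    for (i in 1:5) {
      x_loop[i] <- i
    }
  },
  times = 100
)

# Prints min, lower quartile, mean, median, upper quartile and max per expression
print(res)
```

Naming the expressions also keeps the printed table narrow, which makes the columns easier to compare.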


4 Conclusion


Although different methods can reach the same result, you should look for the most efficient one: running inefficient code on large amounts of data will make you wait much longer than running efficient code.
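
To get a feeling for how this plays out with more data, the rough sketch below builds a longer vector in three ways using base R only. The helper functions grow_loop() and prealloc_loop() and the size n are hypothetical, introduced just for this illustration. Preallocating the vector to its final length is a common alternative that the tutorial above does not cover. system.time() is much coarser than microbenchmark, so the numbers should be read as an indication only.

```r
n <- 1e5  # hypothetical vector length, chosen for illustration

# Starts from an empty vector and lets it grow on every assignment
grow_loop <- function(n) {
  out <- vector(mode = "integer", length = 0)
  for (i in seq_len(n)) out[i] <- i
  out
}

# Allocates the full length once, then fills it in
prealloc_loop <- function(n) {
  out <- vector(mode = "integer", length = n)
  for (i in seq_len(n)) out[i] <- i
  out
}

# Elapsed wall-clock time in seconds for each approach
elapsed <- c(
  seq_len  = system.time(seq_len(n))[["elapsed"]],
  prealloc = system.time(prealloc_loop(n))[["elapsed"]],
  grow     = system.time(grow_loop(n))[["elapsed"]]
)
print(elapsed)
```

All three approaches produce the same integer vector, so any difference in elapsed time comes purely from how the vector is built.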