set.seed(2611)
The sizes and types of the array on which the match is performed: an array full of NA values, or an array sampled from LETTERS.
(experiment <- expand.grid(size = 2 ^ (10:18), type = c("NA", "LETTERS")))
## size type
## 1 1024 NA
## 2 2048 NA
## 3 4096 NA
## 4 8192 NA
## 5 16384 NA
## 6 32768 NA
## 7 65536 NA
## 8 131072 NA
## 9 262144 NA
## 10 1024 LETTERS
## 11 2048 LETTERS
## 12 4096 LETTERS
## 13 8192 LETTERS
## 14 16384 LETTERS
## 15 32768 LETTERS
## 16 65536 LETTERS
## 17 131072 LETTERS
## 18 262144 LETTERS
A trivial lookup table
(lookup_table <- setNames(letters, LETTERS))
## A B C D E F G H I J K L M N O P Q R
## "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r"
## S T U V W X Y Z
## "s" "t" "u" "v" "w" "x" "y" "z"
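To make the two experiment types concrete, the following small sketch shows what indexing the lookup table by name returns for valid keys and for a missing key. The `keys` vector is a hypothetical example and not part of the original script.

```r
lookup_table <- setNames(letters, LETTERS)

# Hypothetical keys: two valid names and one missing value
keys <- c("B", "Z", NA_character_)

# Indexing a named vector by character looks up each key by name;
# an NA key yields an NA element in the result
values <- unname(lookup_table[keys])
values
```

The experiment below times exactly this kind of lookup, once with keys that are all valid and once with keys that are all NA.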
Iterating over all experiments: for each size, a matches vector of the corresponding type is created, and the lookup lookup_table[matches] is timed with system.time().
library(plyr)
result <- adply(
  experiment,
  1,
  function(x) {
    # Build the index vector for this experiment row
    matches <- switch(
      as.character(x$type),
      `NA` = rep(NA_character_, x$size),
      LETTERS = sample(LETTERS, size = x$size, replace = TRUE),
      stop("unknown type: ", x$type)
    )
    # Time only the lookup itself
    system.time(lookup_table[matches])
  }
)
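For readers without plyr, the same loop can be sketched in base R. This is only an approximation of the script above under stated assumptions: `time_lookup` and `result_base` are hypothetical names, and the sizes are reduced to keep the run short.

```r
set.seed(2611)
lookup_table <- setNames(letters, LETTERS)

# Smaller grid than in the original experiment, to keep the run short
experiment <- expand.grid(
  size = 2^(10:12),
  type = c("NA", "LETTERS"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build the index vector and time one lookup
time_lookup <- function(size, type, lookup_table) {
  matches <- switch(
    type,
    `NA` = rep(NA_character_, size),
    LETTERS = sample(LETTERS, size = size, replace = TRUE),
    stop("unknown type: ", type)
  )
  system.time(lookup_table[matches])
}

# mapply() returns one column per experiment; transpose to one row each
timings <- t(mapply(
  time_lookup,
  experiment$size,
  experiment$type,
  MoreArgs = list(lookup_table = lookup_table)
))
result_base <- cbind(experiment, timings)
result_base
```

The resulting data frame has the same columns as the plyr version: the experiment parameters followed by the five components of `system.time()`.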
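The run time seems to be quadratic for NA index vectors and linear for LETTERS index vectors: doubling the size roughly quadruples the time for NA (e.g. 22.5 s to 91.5 s), but only doubles it for LETTERS (e.g. 0.006 s to 0.011 s).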
result
## size type user.self sys.self elapsed user.child sys.child
## 1 1024 NA 0.001 0.000 0.001 0 0
## 2 2048 NA 0.005 0.000 0.005 0 0
## 3 4096 NA 0.019 0.004 0.023 0 0
## 4 8192 NA 0.088 0.000 0.088 0 0
## 5 16384 NA 0.362 0.000 0.361 0 0
## 6 32768 NA 1.415 0.000 1.417 0 0
## 7 65536 NA 5.759 0.004 5.857 0 0
## 8 131072 NA 22.535 0.000 22.596 0 0
## 9 262144 NA 91.531 0.004 91.533 0 0
## 10 1024 LETTERS 0.000 0.000 0.000 0 0
## 11 2048 LETTERS 0.000 0.000 0.000 0 0
## 12 4096 LETTERS 0.001 0.000 0.000 0 0
## 13 8192 LETTERS 0.000 0.000 0.000 0 0
## 14 16384 LETTERS 0.001 0.000 0.001 0 0
## 15 32768 LETTERS 0.001 0.000 0.001 0 0
## 16 65536 LETTERS 0.003 0.000 0.002 0 0
## 17 131072 LETTERS 0.006 0.000 0.006 0 0
## 18 262144 LETTERS 0.011 0.000 0.010 0 0
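One way to check the quadratic versus linear impression is to estimate the exponent k in time ~ size^k as the slope of a log-log regression. The sketch below hard-codes the user.self timings from the table above; the helper `slope` and the choice to use only the three largest LETTERS sizes (the smaller ones are at or below timer resolution) are assumptions of this example.

```r
# Timings copied from the result table above (user.self, in seconds)
size <- 2^(10:18)
user_na <- c(0.001, 0.005, 0.019, 0.088, 0.362, 1.415, 5.759, 22.535, 91.531)
user_letters <- c(0.000, 0.000, 0.001, 0.000, 0.001, 0.001, 0.003, 0.006, 0.011)

# Hypothetical helper: slope of log(time) against log(size),
# which estimates the exponent k in time ~ size^k
slope <- function(time, size) {
  unname(coef(lm(log(time) ~ log(size)))[2])
}

# All NA timings are well above zero, so use every size
k_na <- slope(user_na, size)

# LETTERS timings are only meaningful for the three largest sizes
k_letters <- slope(user_letters[7:9], size[7:9])

c(NA_type = k_na, LETTERS_type = k_letters)
```

With these numbers the estimated exponent comes out close to 2 for NA and close to 1 for LETTERS, although the LETTERS estimate rests on only three points measured at millisecond resolution.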
This plot shows the per-element run time (user.self / size) on log-log axes; it is roughly constant for LETTERS and grows with size for NA.
library(ggplot2)
ggplot(result, aes(x = size, y = user.self / size, color = type)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()