IS607 Week 13 Assignment Part 3

This script compares the performance of gsub() (base R) with str_replace_all() (stringr).

Getting started

Import the necessary packages:

require(stringr)
## Loading required package: stringr
require(microbenchmark)
## Loading required package: microbenchmark

Generate random character data (this will take some time):

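# Preallocate the result so the vector is not regrown on every iteration
ex.data <- character(100000)

for (i in seq_along(ex.data)) {
  # Each element is a random string of 10 lowercase letters
  ex.data[i] <- paste(sample(letters, 10, replace = TRUE), collapse = '')
}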
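
The loop above calls sample() and paste() once per string, which is where most of the time goes. As a rough sketch of an alternative (not the method used in this script), all letters can be drawn in one call and joined column-wise. The names n, len, chars and ex.small, the seed, and the smaller size are assumptions chosen for illustration.

```r
# Hypothetical sizes; the script above uses 100000 strings of length 10
n <- 1000
len <- 10

set.seed(607)  # arbitrary seed so the sketch is reproducible

# Draw every letter at once and arrange them as n rows of len letters
chars <- matrix(sample(letters, n * len, replace = TRUE), nrow = n)

# Paste the columns together element-wise: one string per row
ex.small <- do.call(paste0, as.data.frame(chars, stringsAsFactors = FALSE))

head(ex.small)
```

The strings differ from those the loop would produce for the same seed, because the letters are drawn in a different order, but they follow the same distribution.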

Performance

Now we have everything we need to compare the two functions.

# stringr replace
microbenchmark(str_replace_all(ex.data, 'a', ''))
## Unit: milliseconds
##                               expr   min    lq  mean median    uq   max
##  str_replace_all(ex.data, "a", "") 50.36 52.36 56.68  53.83 56.04 112.4
##  neval
##    100
# base replace
microbenchmark(gsub('a', '', ex.data))
## Unit: milliseconds
##                    expr   min    lq  mean median    uq   max neval
##  gsub("a", "", ex.data) 49.71 51.57 54.49  52.57 54.14 76.89   100

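The difference in performance is small: the median times are 53.83 ms for str_replace_all() and 52.57 ms for gsub(), a gap of about 2%. This is unsurprising for stringr releases before 1.0.0, where str_replace_all() is a thin wrapper around gsub(). From 1.0.0 onward stringr is built on stringi, so the two functions no longer share an implementation and the timings may differ.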
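
A comparison like the one above is easier to read when both expressions are timed in a single microbenchmark() call, after checking that they return the same result. The following is a minimal sketch under assumptions: ex.small is a hypothetical small sample standing in for ex.data, and the seed, the labels stringr and base, and times = 20 are arbitrary choices made to keep the run short.

```r
library(stringr)
library(microbenchmark)

# Hypothetical small sample so the sketch runs quickly; not the original ex.data
set.seed(607)
ex.small <- replicate(
  1000,
  paste(sample(letters, 10, replace = TRUE), collapse = '')
)

# Confirm both functions return the same strings before timing them
out.stringr <- str_replace_all(ex.small, 'a', '')
out.base <- gsub('a', '', ex.small)
stopifnot(identical(out.stringr, out.base))

# Time both expressions in one call so they appear in a single table
bm <- microbenchmark(
  stringr = str_replace_all(ex.small, 'a', ''),
  base = gsub('a', '', ex.small),
  times = 20
)
print(bm)
```

Timing both expressions in one call puts them in the same table and lets microbenchmark() interleave the runs, which reduces the effect of background load on either one.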