R is widely seen as being 'slow' (see the Julia web page).
But with a few specific tools this becomes largely irrelevant, because several R packages provide very fast implementations.
December 9, 2016
Pure R, when the most efficient vectorized code is used, appears to run at about half the speed of the most efficient C++.
See Hadley Wickham's page on Rcpp (scroll down to "Vector input, vector output"…), noting that if it took 10 minutes to write the C++ code, you would need to run it about 150,000 times to make the rewrite worth it.
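The break-even arithmetic behind that figure can be checked in a few lines. This is only a sketch: it reads 150,000 as a number of function calls, takes the 10 minutes of development time from the text, and uses illustrative variable names.

```r
# Break-even check: time spent writing C++ versus time saved per call.
dev_time_s <- 10 * 60   # 10 minutes of development time, in seconds
n_calls    <- 150000    # number of calls quoted as the break-even point

# Saving per call implied by those two numbers
saving_per_call_s  <- dev_time_s / n_calls
saving_per_call_ms <- saving_per_call_s * 1000
print(saving_per_call_ms)  # milliseconds that must be saved on each call
```

In other words, the C++ version only pays for itself if each call saves a few milliseconds and the function is called very many times.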
Spatial simulation means doing the same thing over and over and over … so we need speed
We will show how to profile your code at the end of this section.
# Instead of
a <- vector()
for (i in 1:1000) {
a[i] <- rnorm(1)
}
# use vectorized version, which is built into the functions
a <- rnorm(1000)
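The same idea applies beyond random number generation. As a hedged sketch (the variable names are illustrative, not from the original slides), the loop below is compared with a vectorized equivalent. The loop also pre-allocates its result, which helps when a loop cannot be avoided.

```r
set.seed(42)
x <- rnorm(1000)

# Loop version: square the positive values, leave the rest at 0.
# Pre-allocating with numeric(length(x)) avoids growing the vector.
out_loop <- numeric(length(x))
for (i in seq_along(x)) {
  if (x[i] > 0) out_loop[i] <- x[i]^2
}

# Vectorized version: one call operates on the whole vector at once.
out_vec <- ifelse(x > 0, x^2, 0)

# Check that both approaches agree
print(all.equal(out_loop, out_vec))
```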
data.frame
base package – everything matrix or vector is 'fast'
raster - for spatially referenced matrices
sp - equivalent of vector shapefiles in a GIS
see also sf
data.table
data.frame type data (i.e., columns of data)
data.frame is small (<100,000 rows)
SpaDES – many functions; will be moved into a separate package soon
Rcpp
SpaDES functions quickly, because there are fewer tutorials online for these
raster, sp, data.table, Rcpp
SpaDES functions
?`spades-package` # section 2 shows many functions
# e.g.,
?spread
?move
?cir
?adj
?distanceFromEachPoint
raster tutorials
sp tutorials
sf
data.table package
From every data.table user ever:
install.packages('data.table')
(at least for large tables!)
data.table tutorials
raster and data.table together
The current implementation of LANDIS-SpaDES uses a "reduced" data structure throughout
Instead of keeping rasters of everything (one can imagine that there is redundancy, i.e., 2 pixels next to each other may be identical)
We make one raster of "id" and one data.table with a column called "id"
Then we can have as many columns as we want of information about each of these places
Like "polygons", but for rasters, and dynamic… can change over time
This may be useful for your own module
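To make the idea concrete, here is a minimal base-R sketch of the "reduced" structure. It is an illustration only: a plain matrix stands in for the "id" raster, a data.frame stands in for the data.table, and the column names (`biomass`, `age`) are hypothetical.

```r
# "id" map: each cell stores the id of the group it belongs to.
# A matrix stands in for the raster here.
idMap <- matrix(c(1, 1, 2,
                  1, 3, 2,
                  3, 3, 2), nrow = 3, byrow = TRUE)

# One row per id, with as many attribute columns as needed.
# A data.frame stands in for the data.table; the columns are hypothetical.
attrs <- data.frame(id = c(1, 2, 3),
                    biomass = c(10.5, 20.0, 7.25),
                    age = c(40, 80, 15))

# Expand one attribute back to a full map by looking up each cell's id.
biomassMap <- matrix(attrs$biomass[match(idMap, attrs$id)],
                     nrow = nrow(idMap), ncol = ncol(idMap))

# Updating the table changes every cell with that id; the id map is untouched.
attrs$age <- attrs$age + 1
ageMap <- matrix(attrs$age[match(idMap, attrs$id)],
                 nrow = nrow(idMap), ncol = ncol(idMap))
```

Only the small table is modified over time, which is what makes the structure dynamic without rewriting a full raster for every attribute.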
raster and data.table together
?rasterizeReduced
What does this do?
Rcpp package
From every Rcpp user ever:
install.packages('Rcpp')
After that, you can use 2 great tools:
profvis package (built into the latest RStudio previews, but not the official release version)
microbenchmark package
microbenchmark::microbenchmark(
loop = {
a <- vector()
for (i in 1:1000) a[i] <- rnorm(1)
},
vectorized = { a <- rnorm(1000) }
)
## Unit: microseconds
##        expr      min       lq      mean    median       uq       max neval
##        loop 4359.033 4769.864 5861.9357 5217.3890 6005.141 43314.569   100
##  vectorized   92.191   93.715  102.4152   95.3465  101.329   175.683   100
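If microbenchmark is not installed, a rougher comparison is possible with base R's `system.time()`. This is a sketch, not a replacement: it times a single run, so the numbers are noisier than the repeated measurements above. The larger `n` is an arbitrary choice to make the difference visible.

```r
n <- 20000  # arbitrary size, chosen only so the loop takes measurable time

# Growing a vector one element at a time
t_loop <- system.time({
  a <- vector()
  for (i in seq_len(n)) a[i] <- rnorm(1)
})

# Vectorized equivalent
t_vec <- system.time({
  b <- rnorm(n)
})

# "elapsed" is wall-clock time in seconds
print(c(loop = t_loop[["elapsed"]], vectorized = t_vec[["elapsed"]]))
```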
If you have RStudio version >=0.99.1208, then it has profiling as a menu item.
Alternatively, wrap any block of code in profvis()
This can be a spades() call, so it will show you the entire model:
profvis::profvis({a <- rnorm(10000000)})
spades call
Try it:
library(SpaDES) # provides simInit() and spades()
mySim <- simInit(
times = list(start = 0.0, end = 2.0, timeunit = "year"),
params = list(
.globals = list(stackName = "landscape", burnStats = "nPixelsBurned")
),
modules = list("randomLandscapes", "fireSpread", "caribouMovement"),
paths = list(modulePath = system.file("sampleModules", package = "SpaDES"))
)
profvis::profvis({spades(mySim)})
If you have used these tools, then:
SpaDES model call
Building models with modules …