INTRODUCTION

The R spatial environment for manipulating spatial objects is mostly made of two suites of packages, one for vector data (or shapefiles) (sp, rgdal, rgeos) and one for raster data (raster). Vector data correspond to coordinates that make either points, lines or polygons, while raster data mostly correspond to regularly gridded data.

More recently, the new package sf (for simple features) has been developped to simplify and improve manipulations and operations on spatial vector data. Here, I’ve chosen to focus on spatial objects from the sp package as a lot of other packages cannot deal with sf objects yet. Also, once sp objects are understood, it is really straigthforward to move to using sf instead.























SHAPEFILES

The main packages used for handling shapefiles and doing spatial operations are listed below.

Package Use
sp Defines S4 spatial classes and methods for manipulating spatial objects
rgdal Provides bindings with the GDAL and PROJ.4 libraries for reading, projections, coordinate transformations, etc.
rgeos Provides bindings for the GEOS library for spatial operations (intersections, buffers, etc.)
maptools Tools for reading and handling spatial objects
sf New simpler S3 spatial classes and methods for manipulating spatial objects as simple features























Reading and writing

Reading

First, let’s look in the folder where the shapefile is located.

list.files("C:/Users/God/Documents/ML")
##  [1] "carreteras.dbf"                     
##  [2] "carreteras.prj"                     
##  [3] "carreteras.sbn"                     
##  [4] "carreteras.sbx"                     
##  [5] "carreteras.shp"                     
##  [6] "carreteras.shx"                     
##  [7] "cat.txt"                            
##  [8] "cdem_dem_021E.tif"                  
##  [9] "cdem_dem_031H.tif"                  
## [10] "crap.dbf"                           
## [11] "crap.prj"                           
## [12] "crap.shp"                           
## [13] "crap.shx"                           
## [14] "Extended Raster"                    
## [15] "Extended Raster.zip"                
## [16] "Landcover_2015_extended.tif"        
## [17] "LandEcoCorrected (1).txt"           
## [18] "LandEcoCorrected.txt"               
## [19] "LandEcoCorrected_addedvalue (1).txt"
## [20] "LandEcoCorrected_addedvalue.csv"    
## [21] "LandEcoCorrected_addedvalue.txt"    
## [22] "poblacion.dbf"                      
## [23] "poblacion.prj"                      
## [24] "poblacion.sbn"                      
## [25] "poblacion.sbx"                      
## [26] "poblacion.shp"                      
## [27] "poblacion.shx"                      
## [28] "pop.dbf"                            
## [29] "pop.prj"                            
## [30] "pop.shp"                            
## [31] "pop.shx"                            
## [32] "roads.dbf"                          
## [33] "roads.prj"                          
## [34] "roads.shp"                          
## [35] "roads.shx"                          
## [36] "test.dbf"                           
## [37] "test.prj"                           
## [38] "test.shp"                           
## [39] "test.shx"























Reading shapefiles is done with the function readOGR from the rgdal package.

library(sp)
library(rgdal)
roads <- readOGR(dsn = "C:/Users/God/Documents/ML", layer = "carreteras", 
    encoding = "UTF-8")
## OGR data source with driver: ESRI Shapefile 
## Source: "C:/Users/God/Documents/ML", layer: "carreteras"
## with 1171 features
## It has 4 fields
class(roads)
## [1] "SpatialLinesDataFrame"
## attr(,"package")
## [1] "sp"























This object can be simply plotted with the function plot which has a method for spatial objects.

plot(roads, col = gray(0.75), lwd = 4, axes = TRUE)























What’s contained in this object?

head(roads)
##   CLASIF   LENGTH DESCRIPCIO      PAíS
## 0     C4  747.849    Veredas Guatemala
## 1     C4 1303.025    Veredas    Belice
## 2     C4 2055.814    Veredas    Belice
## 3     C4 1807.620    Veredas    Belice
## 4     C4 2661.517    Veredas    Belice
## 5     C4  547.114    Veredas    Mexico























Writing

writeOGR(roads, dsn = "C:/Users/God/Documents/ML", layer = "test", driver = "ESRI Shapefile", 
    overwrite = TRUE)
list.files("C:/Users/God/Documents/ML")
##  [1] "carreteras.dbf"                     
##  [2] "carreteras.prj"                     
##  [3] "carreteras.sbn"                     
##  [4] "carreteras.sbx"                     
##  [5] "carreteras.shp"                     
##  [6] "carreteras.shx"                     
##  [7] "cat.txt"                            
##  [8] "cdem_dem_021E.tif"                  
##  [9] "cdem_dem_031H.tif"                  
## [10] "crap.dbf"                           
## [11] "crap.prj"                           
## [12] "crap.shp"                           
## [13] "crap.shx"                           
## [14] "Extended Raster"                    
## [15] "Extended Raster.zip"                
## [16] "Landcover_2015_extended.tif"        
## [17] "LandEcoCorrected (1).txt"           
## [18] "LandEcoCorrected.txt"               
## [19] "LandEcoCorrected_addedvalue (1).txt"
## [20] "LandEcoCorrected_addedvalue.csv"    
## [21] "LandEcoCorrected_addedvalue.txt"    
## [22] "poblacion.dbf"                      
## [23] "poblacion.prj"                      
## [24] "poblacion.sbn"                      
## [25] "poblacion.sbx"                      
## [26] "poblacion.shp"                      
## [27] "poblacion.shx"                      
## [28] "pop.dbf"                            
## [29] "pop.prj"                            
## [30] "pop.shp"                            
## [31] "pop.shx"                            
## [32] "roads.dbf"                          
## [33] "roads.prj"                          
## [34] "roads.shp"                          
## [35] "roads.shx"                          
## [36] "test.dbf"                           
## [37] "test.prj"                           
## [38] "test.shp"                           
## [39] "test.shx"























Building shapefiles

From a data.frame

cat <- read.table("C:/Users/God/Documents/ML/cat.txt", header = TRUE, stringsAsFactors = FALSE)
head(cat)
##           X        Y Attack
## 1 -89.34188 18.49952      0
## 2 -89.41309 18.47991      0
## 3 -89.26248 18.60019      1
## 4 -89.19920 18.59252      1
## 5 -89.30334 18.59921      1
## 6 -89.23133 18.59242      1

This is a simple data.frame with X and Y columns representing longitudes and latitudes.























To transform this data.frame to a spatial object, we just have to do this:

library(sp)
coordinates(cat) <- ~X + Y
class(cat)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"

The object returned is now a SpatialPointsDataFrame.























Here is what it looks like.

library(scales)  # to use the alpha function for adding transparency

plot(cat, col = alpha(ifelse(cat$Attack == 0, "blue", "red"), 0.4), pch = 16, 
    cex = 1.5, axes = TRUE)























From scratch

Shapefiles can also be built from scratch and turned to a spatial object with the SpatialPointsDataFrame function.

set.seed(123)
n <- 10
x1 <- rnorm(n, 0, 1)
y1 <- rnorm(n, 0, 1)
id <- 1:n
x <- SpatialPointsDataFrame(cbind(x1, y1), data = data.frame(id))
head(x)
##   id
## 1  1
## 2  2
## 3  3
## 4  4
## 5  5
## 6  6
plot(x, axes = TRUE)























Creating a shapefile of polygons is slightly more complex and requires a couple more steps. We will first show the use of pipes ( %>% )from the magrittr package to make this easier. Pipes are used to apply a series of operations on an object. The first argument of a function in a pipe sequence is by default the result of the previous operation. It is meant to increase the readability of code.

library(magrittr)

temp <- 1:10

sum(rep(mean(temp), 50))
## [1] 275
temp %>% 
mean %>% 
rep(50) %>% 
sum
## [1] 275























To create a polygon, we need to have a matrix of points and in wich the first point is the same as the last to close the polygon.

m <- rbind(c(0, 0), c(0, 2), c(2, 2), c(3, 2), c(2, 1))
m <- rbind(m, m[1, ])

Then we can use pipes to go through the series of steps to build a SpatialPolygons. Type ?SpatialPolygons to better understand their hierarchical structure.

pol <- m %>% 
     Polygon %>% 
     list %>% 
     Polygons(ID = 1) %>% 
     list %>% 
     SpatialPolygons

plot(pol,axes=TRUE)























Structure of spatial objects

Spatial objects from the sp package are made with S4 classes. S4 classes are formal classes that can be tricky and hard to understand. Here, I won’t go into the details of how S4 classes work, but I will describe the structure of spatial objects.

library(sp)
getClass("Spatial")
## Class "Spatial" [package "sp"]
## 
## Slots:
##                               
## Name:         bbox proj4string
## Class:      matrix         CRS
## 
## Known Subclasses: 
## Class "SpatialPoints", directly
## Class "SpatialMultiPoints", directly
## Class "SpatialGrid", directly
## Class "SpatialLines", directly
## Class "SpatialPolygons", directly
## Class "SpatialPointsDataFrame", by class "SpatialPoints", distance 2
## Class "SpatialPixels", by class "SpatialPoints", distance 2
## Class "SpatialMultiPointsDataFrame", by class "SpatialMultiPoints", distance 2
## Class "SpatialGridDataFrame", by class "SpatialGrid", distance 2
## Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2
## Class "SpatialPixelsDataFrame", by class "SpatialPoints", distance 3
## Class "SpatialPolygonsDataFrame", by class "SpatialPolygons", distance 2























Indexing

Spatial objects either come with or without attributes. For example, points can be stored in a SpatialPoints object or in a SpatialPointsDataFrame object. When spatial objects don’t have attributes, they behave like vectors and can be indexed as such. Here, with a SpatialPoints object.

length(pol)
## [1] 1
dim(pol)
## NULL
pol[1]
## class       : SpatialPolygons 
## features    : 1 
## extent      : 0, 3, 0, 2  (xmin, xmax, ymin, ymax)
## coord. ref. : NA























When they have attributes (SpatialPointsDataFrame), they can be indexed just like a data.frame.

length(cat)
## [1] 101
dim(cat)
## [1] 101   1
cat[1:3, ]
## class       : SpatialPointsDataFrame 
## features    : 3 
## extent      : -89.41309, -89.26248, 18.47991, 18.60019  (xmin, xmax, ymin, ymax)
## coord. ref. : NA 
## variables   : 1
## names       : Attack 
## min values  :      0 
## max values  :      1

Columns of the attribute table can also be extracted just like in a data.frame.

cat$Attack
##   [1] 0 0 1 1 1 1 0 0 0 0 1 0 1 0 1 0 1 0 0 1 1 1 1 0 0 0 1 1 0 0 1 1
##  [33] 1 1 1 1 1 1 1 1 0 1 0 0 1 0 1 1 1 1 0 0 0 0 1 1 1 1 1 0 1 1 1 1
##  [65] 1 1 1 0 0 0 0 0 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 0
##  [97] 0 0 1 0 1























Accessing slots

The function names and the $ applied on a spatial object behave the same way as on a data.frame, while the @ operator allows to extract objects stored in the slots of the object. Different methods are also available to extract different information from spatial objects such bounding boxes (bbox), centroid coordinates (coordinates), etc. Type ?Spatial to learn more about them.

Names of the columns in the data.frame of attributes.

names(cat)
## [1] "Attack"

Look at the attribute table which corresponds to a data.frame.

head(cat@data)
##   Attack
## 1      0
## 2      0
## 3      1
## 4      1
## 5      1
## 6      1























Names of the different slots composing the object.

slotNames(cat)
## [1] "data"        "coords.nrs"  "coords"      "bbox"       
## [5] "proj4string"























Extract coordinates.

head(cat@coords)
##           X        Y
## 1 -89.34188 18.49952
## 2 -89.41309 18.47991
## 3 -89.26248 18.60019
## 4 -89.19920 18.59252
## 5 -89.30334 18.59921
## 6 -89.23133 18.59242

Slot containing the bounding box.

cat@bbox
##         min       max
## X -90.35930 -88.93847
## Y  17.85815  19.12909























– Exercice 1 –

Build a SpatialPointsDataFrame with 100 random locations centered on Sherbrooke, plot the results and write the results to a shapefile. Hint: use the function runif or rnorm to generate random values within a given range.























– Solution –

Here is a solution

# Find coordinates using Google maps and generate random values for lat
# lon coordinates and an id
n <- 100
lon <- runif(n, -71.958, -71.822)
lat <- runif(n, 45.383, 45.433)
id <- 1:n

# Build a data.frame and turn it to a SpatialPointsDataFrame
d <- data.frame(lon, lat, id)
coordinates(d) <- ~lon + lat

# Plot the result
plot(d, axes = TRUE)























# Write points to a shapefile
writeOGR(d, dsn = ".", layer = "sherby", driver = "ESRI Shapefile", overwrite_layer = TRUE)

# Check if the shapefile has been written
list.files()
##  [1] "CAN_msk_alt.grd"       "CAN_msk_alt.gri"      
##  [3] "CAN_msk_alt.vrt"       "GADM_2.8_CAN_adm1.rds"
##  [5] "GADM_2.8_CAN_adm2.rds" "GADM_2.8_NLD_adm2.rds"
##  [7] "map.gfw"               "map.gif"              
##  [9] "map.prj"               "README.md"            
## [11] "rsconnect"             "sherby.dbf"           
## [13] "sherby.shp"            "sherby.shx"           
## [15] "spatialR.html"         "spatialR.nb.html"     
## [17] "spatialR.pdf"          "spatialR.Rmd"         
## [19] "spatialR.Rproj"        "spatialR_cache"       
## [21] "spatialR_files"        "USA1_msk_alt.grd"     
## [23] "USA1_msk_alt.gri"      "USA1_msk_alt.vrt"     
## [25] "USA2_msk_alt.grd"      "USA2_msk_alt.gri"     
## [27] "USA2_msk_alt.vrt"      "USA3_msk_alt.grd"     
## [29] "USA3_msk_alt.gri"      "USA3_msk_alt.vrt"     
## [31] "USA4_msk_alt.grd"      "USA4_msk_alt.gri"     
## [33] "USA4_msk_alt.vrt"      "wc0.5"                
## [35] "wc10"                  "wc2-5"























Spatial operations

Most spatial operations are done using the package rgeos. One of the requirement of functions in rgeos is that objects need to be projected.

Overlay

The function over from package sp is used to determine whether different entities overlap. It returns a vector or a data.frame with the identities or the characteristics of overlapping elements.

plot(x, axes = TRUE, xlim = c(-3, 3), ylim = c(-3, 3))
plot(pol, add = TRUE)























The points are stored in a SpatialPointsDataFrame and the polygon in a SpatialPolygons object.

class(x)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
class(pol)
## [1] "SpatialPolygons"
## attr(,"package")
## [1] "sp"























As can be seen in the help files for ?over, if the second object is a spatial object without attributes, a vector with the same length as the first element is returned with the first overlapping id in the second element.If the second element has attribute data, attribute data are returned.

o <- over(x, pol)
o
##  1  2  3  4  5  6  7  8  9 10 
## NA NA NA  1 NA  1  1 NA NA NA
plot(x, axes = TRUE, xlim = c(-3, 3), ylim = c(-3, 3))
plot(x[!is.na(o), ], add = TRUE, col = "red")
plot(pol, add = TRUE)

o <- over(pol, x)
o
##   id
## 1  4























Cut or delimit

Here are some common operations that can be done on points (but also on lines or on polygons). These operations are usually done using the package rgeos.

library(rgeos)

# list of functions to apply
op <- c("gCentroid", "gEnvelope", "gConvexHull", "gBuffer", "gDelaunayTriangulation")
# set the graphic window
par(mfrow = c(2, 3), mar = c(0, 0, 3, 0))
# apply all operations
for (i in seq_along(op)) {
    plot(x, main = op[i], xlim = c(-2, 2), ylim = c(-3, 3))
    plot(get(op[i])(x), col = gray(0.5, 0.5), add = TRUE, cex = 3)
}























Here are some common operations that are done on polygons.

# construct two buffers
x1 <- gBuffer(x[1, ], width = 1)
x2 <- gBuffer(x[2, ], width = 1)

# list of functions to apply
op <- c("gIntersection", "gDifference", "gSymdifference", "gUnion")
# set the graphic window
par(mfrow = c(2, 2), mar = c(0, 0, 3, 0))
# apply all operations
for (i in seq_along(op)) {
    plot(x1, main = op[i], xlim = c(-2, 2), ylim = c(-3, 3))
    plot(x2, add = TRUE)
    plot(get(op[i])(x1, x2), add = TRUE, col = "red")
}























Measure distances

Here, let’s measure the distance between each points and their centroid.

cx <- gCentroid(x)
plot(x, axes = TRUE)
plot(cx, add = TRUE, col = "red", lwd = 3)
text(x, label = round(gDistance(x, cx, byid = TRUE), 2), adj = c(0, 1.3), 
    cex = 0.8)























Measure areas

Here, let’s calculate the area of each polygon of a Delaunay triangulation. Also, let’s use the polygonsLabel function from rgeos to plot the values in a good location in each polygon.

ch <- gDelaunayTriangulation(x)
area <- gArea(ch, byid = TRUE)
plot(ch, axes = TRUE)
invisible(polygonsLabel(ch, labels = round(area, 2), cex = 0.6, gridpoints = 1000, 
    method = "centroid"))  # plot are in each polygon























Sample

Sample points in polygons in different ways using the spsample function.

set.seed(12345)

n <- 50

# list sampling types
type <- c("random", "stratified", "regular", "hexagonal")

# set graphic window
par(mfrow = c(2, 2), mar = c(0, 0, 2, 0))  # build graphical display

# plot the different sampling types
for (i in seq_along(type)) {
    s <- spsample(pol, n, type = type[i])  # get random points
    plot(pol, main = type[i], font = 2)  # plot polygon
    plot(s, pch = 1, add = TRUE)  # plot random points
}























Random points can also be distributed within random polygons

set.seed(12345)

N <- 10  # number of random polygons
n <- 5  # number of random points within each polygons

plot(pol)
# sample points and build buffers around them
s1 <- gBuffer(spsample(pol, N, type = "random"), byid = TRUE, width = 0.1)
# plot buffers
plot(s1, add = TRUE)
# sample points in each polygon
s2 <- sapply(s1@polygons, spsample, n = n, type = "random")
# bind points in the same spatial object
s2 <- do.call("rbind", s2)
# plot points
plot(s2, add = TRUE)























Combine operations

Here is an example were we combine several operations to determine the length of roads within each random buffer in a region using the previous road data.

Let’s build the random buffers and see the results.

library(scales)
set.seed(1234)  # set the seed to obtain the same result each time
n <- 10
b <- gEnvelope(roads)  # determine the bounding box of the road system
s <- spsample(b, n = n, type = "random")  # throw random points within the bounding box
buffs <- gBuffer(s, width = 10000, byid = TRUE, id = 1:n)  # turn the points to buffers

plot(b)
plot(roads, add = TRUE, col = gray(0.7))
text(coordinates(s), labels = names(buffs))
plot(buffs, add = TRUE, border = NA, col = gray(0, 0.1))























Extract all road sections that intersect with a buffer.

int <- gIntersection(buffs, roads, byid = TRUE)
plot(int, col = "red", lwd = 4, add = TRUE)























Then determine the list of road sections touching each buffer.

o <- over(buffs, int, returnList = TRUE)
o
## $`1`
## 1 104 1 105 1 107 1 108 
##     1     2     3     4 
## 
## $`2`
##  2 200  2 201  2 202  2 203  2 395  2 667  2 701  2 703  2 783  2 784 
##      5      6      7      8      9     10     11     12     13     14 
##  2 801  2 802 2 1001 2 1003 2 1017 2 1018 2 1019 2 1020 2 1021 2 1024 
##     15     16     17     18     19     20     21     22     23     24 
## 2 1025 2 1096 2 1097 2 1100 2 1101 2 1102 2 1103 2 1104 2 1105 2 1106 
##     25     26     27     28     29     30     31     32     33     34 
## 2 1109 
##     35 
## 
## $`3`
##  3 196  3 226  3 229  3 391  3 393  3 629  3 630  3 631  3 637  3 638 
##     36     37     38     39     40     41     42     43     44     45 
##  3 639  3 646  3 647  3 648  3 649  3 650  3 651  3 688  3 689  3 698 
##     46     47     48     49     50     51     52     53     54     55 
##  3 699  3 700  3 772  3 773  3 774  3 775  3 776  3 975  3 976  3 977 
##     56     57     58     59     60     61     62     63     64     65 
##  3 978  3 979  3 980  3 981  3 982  3 983  3 984  3 985  3 986  3 988 
##     66     67     68     69     70     71     72     73     74     75 
## 3 1009  9 648  9 651  9 772  9 976 9 1009 
##     76    146    147    152    164    170 
## 
## $`4`
## named integer(0)
## 
## $`5`
## 5 731 5 732 5 733 5 734 5 735 5 736 5 737 5 811 5 812 5 813 5 814 
##    77    78    79    80    81    82    83    84    85    86    87 
## 5 815 5 816 5 817 5 818 5 819 5 820 5 821 5 822 5 823 5 824 5 825 
##    88    89    90    91    92    93    94    95    96    97    98 
## 5 826 5 827 5 828 5 829 5 832 5 836 5 837 
##    99   100   101   102   103   104   105 
## 
## $`6`
## named integer(0)
## 
## $`7`
## named integer(0)
## 
## $`8`
## 8 136 8 141 
##   106   107 
## 
## $`9`
##  3 648  3 651  3 772  3 976 3 1009   9 12   9 13  9 196  9 197  9 198 
##     49     52     58     64     76    108    109    110    111    112 
##  9 199  9 238  9 239  9 240  9 241  9 242  9 243  9 244  9 253  9 257 
##    113    114    115    116    117    118    119    120    121    122 
##  9 258  9 264  9 265  9 266  9 267  9 283  9 284  9 285  9 286  9 291 
##    123    124    125    126    127    128    129    130    131    132 
##  9 298  9 299  9 300  9 388  9 601  9 602  9 603  9 604  9 632  9 633 
##    133    134    135    136    137    138    139    140    141    142 
##  9 634  9 635  9 636  9 648  9 651  9 652  9 768  9 769  9 771  9 772 
##    143    144    145    146    147    148    149    150    151    152 
##  9 965  9 966  9 967  9 968  9 969  9 970  9 971  9 972  9 973  9 974 
##    153    154    155    156    157    158    159    160    161    162 
##  9 975  9 976 9 1004 9 1005 9 1006 9 1007 9 1008 9 1009 9 1010 
##    163    164    165    166    167    168    169    170    171 
## 
## $`10`
## named integer(0)























Finally, calculate the total length of roads inside each buffer.

o
dist <- sapply(o, function(i) {
    sum(gLength(int[i], byid = TRUE))
})
text(coordinates(buffs), labels = round(dist/1000, 1), col = "black", adj = c(-1, 
    -2))













































– Exercice 2 –

Calculate the proportion of forest in 10 random buffers across a landscape. Let’s first generate the landscape.

set.seed(123)

x<-runif(100, 0, 100) # generate random points
y<-runif(100, 0, 100)

h <- cbind(x, y) %>% # build a matrix
     SpatialPoints %>% # turn it to spatial points  
     gBuffer(width = rpois(length(x), 4), byid = TRUE) %>% # build random poisson buffers around them 
     gEnvelope(byid = TRUE) %>% # take their envelope
     gUnaryUnion # put them all in a single polygon
     

plot(h, col = "green4", border = NA, axes = TRUE)

– Solution –

# First throw 10 random points within the envelope of the landscape and then generates buffers around these points
b<-h %>% 
   gEnvelope %>% 
   spsample(10, type = "random") %>% 
   gBuffer(width = 10, byid = TRUE)

# Plot the buffers
plot(h, col = "green4", border = NA, axes = TRUE)
plot(b, add = TRUE)

# Intersect the buffers with the forest and plot the result
int<-gIntersection(b, h, byid = TRUE)
plot(int, add = TRUE, col = "red", border = NA)

# Compute the proportion of forested areas within each buffer
pa <- gArea(int, byid = TRUE) / gArea(b, byid = TRUE)

# Add the value on the plot
text(coordinates(b), label = round(pa, 2))

COORD. REF. SYS.

As always, when playing with spatial data, one needs to be aware of coordinate reference systems (CRS). Assigning and transforming CRS in R is not too difficult. It is mostly done with the proj4string, CRS and spTransform functions from the rgdal package.

First, let’s get some data from GBIF using the rgbif package. We will get occurence data for a species of birds, the Bicknell’s Thrush (Catharus bicknelli).

library(rgbif)

bick <- occ_search(scientificName = "Catharus bicknelli", hasCoordinate = TRUE, 
    country = "CA")
bick <- as.data.frame(bick$data)
coordinates(bick) <- ~decimalLongitude + decimalLatitude
plot(bick, axes = TRUE)























Assign a CRS

Assigning a coordinate reference system is done with the functions CRS (Coordinate Reference System) and proj4string.

proj4string(bick)
## [1] NA
proj4string(bick) <- CRS("+proj=longlat +datum=WGS84 +ellps=WGS84")
proj4string(bick)
## [1] "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
plot(bick, axes = TRUE)























Change a CRS

# canada<-readOGR('C:/Users/rouf1703/Documents/UdeS/Formation/Canada',layer='Canada')
library(raster)

canada <- getData("GADM", country = "CAN", level = 1)

crs <- c("+proj=longlat +datum=WGS84 +ellps=WGS84", "+proj=utm +zone=18 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0", 
    "+proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0", 
    "+proj=laea +lat_0=0 +lon_0=0 +ellps=GRS80")

par(mfrow = c(2, 2), mar = c(2, 3, 2, 0))

for (i in 1:length(crs)) {
    canada2 <- spTransform(canada, CRS(crs[i]))
    plot(canada2, axes = TRUE, main = crs[i], cex.main = 0.7)
}























Geographic vs. Projected CRS

Geographic coordinates are coordinates on a sphere or on an ellipse, while projected coordinates are defined on a flat 2D surface. Usually, geographic coordinates are in latitudes/longitudes and projected coordinates are in meters. As seen earlier, this is important when doing certain spatial operations. All functions from package rgeos require that spatial objects are projected. Otherwise, functions do not work properly because they assume that calculations are done on cartesian coordinates. Usually, a warning will be returned.

bbox(canada)
##          min       max
## x -141.00687 -52.61889
## y   41.67693  83.11042
bbox(spTransform(canada, CRS(crs[2])))
##        min     max
## x -3108601 2180270
## y  4640497 9231806























CRS and EPSG

Projections can also be given with their EPSG codes. More info on projections can be found here epsg.io where a description of the different EPSG is available. For example, submit the EPSG 4326 and check the corresponding string with the PROJ.4 library.

canada2 <- spTransform(x, CRS("+init=epsg:4326"))
## Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'spTransform' for signature '"numeric", "CRS"'
proj4string(canada2)
## [1] "+proj=laea +lat_0=0 +lon_0=0 +ellps=GRS80"

The EPSG code 4326 is equivalent to giving the "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0" CRS to the data.























– Exercice 3 –

Generate a set of 50 200km random buffers across Canada and measure the proportion of overlap between each pair of buffers.

Note: The width argument in gBuffer takes the values in the coordinate reference system. If coordinates are in latlon, a value of 1 means 1 degre. If values are in meters, a value of 1 means 1 meter.

can <- raster::getData("GADM", country = "CAN", level = 1)























– Solution –

# first set the seed from the random generator to get reproducible
# results
set.seed(123)

# get shapefile for Canada
can <- raster::getData("GADM", country = "CAN", level = 1)
can <- spTransform(can, CRS("+proj=laea +lat_0=60 +lon_0=-100 +ellps=GRS80"))
plot(can, border = gray(0.25))

# throw random points
n <- 50
p <- spsample(can, n, type = "random")

# get buffers
b <- gBuffer(p, width = 200 * 1000, byid = TRUE)
b <- SpatialPolygonsDataFrame(b, data.frame(id = 1:n))

# plot them
plot(b, add = TRUE)
text(coordinates(b), label = b$id)

# build a data.frame to store each pairs
e <- expand.grid(id1 = b$id, id2 = b$id)

# compute the percent of overlap between each pair
o <- sapply(1:nrow(e), function(i) {
    x1 <- b[b$id == e[i, 1], ]
    x2 <- b[b$id == e[i, 2], ]
    g <- gIntersection(x1, x2)
    if (is.null(g)) {
        # if both buffers do not overlap, return 0
        0
    } else {
        gArea(g)/gArea(x1)  # otherwise, return the proportion between the overlapping area and the given buffer
    }
    # 
})
e$o <- o

# plot results
plot(e[, 1], e[, 2], cex = 2 * e$o, xlab = "id 1", ylab = "id 2", asp = 1)

This is probably not the most efficient way. Using gIntersection(b, b, byid = TRUE) returns the 132 overlapping pairs faster, but ids need to be constructed to associate the different intersections to the correct pairs. Also, since the buffers all have the same dimension, we don’t need the proportion of overlap for each ids as it is symmetric within each pair.























RASTERS

What is a raster?

A raster is a regular grid of pixel with values. Here is an example of building a simple raster with random values.

library(raster)

n <- 200
rast <- raster(nrow = n, ncol = n, ext = extent(canada))
rast <- setValues(rast, runif(n^2, 0, 1))
proj4string(rast) <- proj4string(canada)
ncell(rast)
## [1] 40000
plot(rast)
plot(canada, add = TRUE)























Read and write

Writing a raster to a file is done with the writeRaster function.

writeRaster(rast, "C:/Users/God/Documents/ML/rast.tif", overwrite = TRUE)
list.files("C:/Users/God/Documents/ML")
##  [1] "carreteras.dbf"                     
##  [2] "carreteras.prj"                     
##  [3] "carreteras.sbn"                     
##  [4] "carreteras.sbx"                     
##  [5] "carreteras.shp"                     
##  [6] "carreteras.shx"                     
##  [7] "cat.txt"                            
##  [8] "cdem_dem_021E.tif"                  
##  [9] "cdem_dem_031H.tif"                  
## [10] "crap.dbf"                           
## [11] "crap.prj"                           
## [12] "crap.shp"                           
## [13] "crap.shx"                           
## [14] "Extended Raster"                    
## [15] "Extended Raster.zip"                
## [16] "Landcover_2015_extended.tif"        
## [17] "LandEcoCorrected (1).txt"           
## [18] "LandEcoCorrected.txt"               
## [19] "LandEcoCorrected_addedvalue (1).txt"
## [20] "LandEcoCorrected_addedvalue.csv"    
## [21] "LandEcoCorrected_addedvalue.txt"    
## [22] "poblacion.dbf"                      
## [23] "poblacion.prj"                      
## [24] "poblacion.sbn"                      
## [25] "poblacion.sbx"                      
## [26] "poblacion.shp"                      
## [27] "poblacion.shx"                      
## [28] "pop.dbf"                            
## [29] "pop.prj"                            
## [30] "pop.shp"                            
## [31] "pop.shx"                            
## [32] "rast.tif"                           
## [33] "roads.dbf"                          
## [34] "roads.prj"                          
## [35] "roads.shp"                          
## [36] "roads.shx"                          
## [37] "test.dbf"                           
## [38] "test.prj"                           
## [39] "test.shp"                           
## [40] "test.shx"























Reading can be done directly with the raster function (or the stack or brick functions).

r1 <- raster("C:/Users/God/Documents/ML/cdem_dem_021E.tif")
r2 <- raster("C:/Users/God/Documents/ML/cdem_dem_031H.tif")

par(mfrow = 1:2)
plot(r1)
plot(r2)























Aggregate, merge and crop

Rasters can be aggregated, merged or cropped. An aggregation reduces the number of cells by a certain factor, while a merge brings two rasters together.

ncell(r1)
## [1] 46080000
r1 <- aggregate(r1, fact = 10)
r2 <- aggregate(r2, fact = 10)

ncell(r1)
## [1] 460800
r <- merge(r1, r2)
plot(r)























Rasters can be cropped with a specific region using the extent argument.

e <- extent(-72.75, -70.25, 45, 46)
r <- crop(r, e)
plot(r)























Layers, stacks and bricks

Rasters can have a single layer (RasterLayer) or multiple layers (RasterStack, RasterBrick).

r
## class       : RasterLayer 
## dimensions  : 480, 1200, 576000  (nrow, ncol, ncell)
## resolution  : 0.002083333, 0.002083333  (x, y)
## extent      : -72.7501, -70.2501, 45.0001, 46.0001  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs 
## data source : in memory
## names       : layer 
## values      : 29.01, 1160.79  (min, max)























Let’s get some data from the worldclim database to see what a stack looks like.

temp <- raster::getData("worldclim", var = "tmean", res = 10)
temp
## class       : RasterStack 
## dimensions  : 900, 2160, 1944000, 12  (nrow, ncol, ncell, nlayers)
## resolution  : 0.1666667, 0.1666667  (x, y)
## extent      : -180, 180, -60, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
## names       : tmean1, tmean2, tmean3, tmean4, tmean5, tmean6, tmean7, tmean8, tmean9, tmean10, tmean11, tmean12 
## min values  :   -513,   -473,   -443,   -357,   -206,   -116,   -114,   -110,   -165,    -284,    -409,    -487 
## max values  :    338,    333,    333,    342,    360,    384,    392,    382,    358,     327,     328,     330
plot(temp/10)  # the dot is not included before the decimal so we need to divide by 10 to get the real value























temp <- brick(temp)
temp
## class       : RasterBrick 
## dimensions  : 900, 2160, 1944000, 12  (nrow, ncol, ncell, nlayers)
## resolution  : 0.1666667, 0.1666667  (x, y)
## extent      : -180, 180, -60, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
## data source : in memory
## names       : tmean1, tmean2, tmean3, tmean4, tmean5, tmean6, tmean7, tmean8, tmean9, tmean10, tmean11, tmean12 
## min values  :   -513,   -473,   -443,   -357,   -206,   -116,   -114,   -110,   -165,    -284,    -409,    -487 
## max values  :    338,    333,    333,    342,    360,    384,    392,    382,    358,     327,     328,     330























Raster operations

Extract

Let’s extract elevation data for the Bicknell’s Thrush occurences we got previously.

plot(r)
plot(bick, add = TRUE)























e <- extract(r, bick)
hist(e, xlab = "Altitude", main = "")























velox

velox is an R package designed for faster extractions using C++ and rasters held in memory. For quicker extractions on relatively small datasets it can be much faster than the function extract.

library(velox)
library(microbenchmark)

rv <- velox(r)

buff <- gBuffer(bick, byid = TRUE, width = 0.05)
plot(r)
plot(buff, add = TRUE)























system.time(extract(r, buff, fun = mean))
##    user  system elapsed 
##   10.88    0.04   10.93
system.time(rv$extract(buff, fun = mean))
##    user  system elapsed 
##    0.84    0.00    0.85























Reclassify

mn <- c(0, 100, 200, 400, 600, 800)
mx <- c(100, 200, 400, 600, 800, 1200)
mat <- cbind(mn, mx, lab = mx)
mat
##       mn   mx  lab
## [1,]   0  100  100
## [2,] 100  200  200
## [3,] 200  400  400
## [4,] 400  600  600
## [5,] 600  800  800
## [6,] 800 1200 1200
rc <- reclassify(r, mat)
plot(rc, col = terrain.colors(nrow(mat)))

br>




















Rasterize

Here is an example of building a raster where each pixel value is determined by the number of points in each cell. For this, we use the rasterize function.

rbick <- raster(ncol = 30, nrow = 30, ext = extent(bick))

rbick <- rasterize(bick, rbick, field = 1, fun = "count", background = 0)

rbick[rbick == 0] <- NA  # replace cells with 0 observation with NA's to better visualize data

rbick
## class       : RasterLayer 
## dimensions  : 30, 30, 900  (nrow, ncol, ncell)
## resolution  : 0.5580082, 0.2445775  (x, y)
## extent      : -76.8885, -60.14826, 43.40956, 50.74688  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
## data source : in memory
## names       : layer 
## values      : 1, 54  (min, max)























Now let’s plot the results.

plot(bick, col = "white")  # first plot white points to get the correct extent around locations
plot(rbick, col = rev(heat.colors(100)), add = TRUE, legend.args = list(text = "Nb of obs.", 
    side = 3, line = 0.25), horizontal = TRUE)
plot(canada, border = gray(0.7), add = TRUE)
box()























Build a rasters from distances

Measure distances from roads and store the values in a raster.

library(RColorBrewer)
library(FRutils)

roads2 <- spTransform(roads, CRS("+proj=utm +zone=18 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"))
rast <- raster(ncol = 200, nrow = 200, ext = extent(roads2))
rast_centroids <- xyFromCell(rast, 1:ncell(rast), spatial = TRUE)
g <- gDistance(rast_centroids, gLineMerge(roads2), byid = TRUE)

class(g)
## [1] "matrix"
dim(g)
## [1]     1 40000
rast <- setValues(rast, g[1, ])
val <- pretty(seq(min(minValue(rast)), max(maxValue(rast)), length.out = 100), 
    10)

lab <- list(at = val, labels = val)
plot(rast, col = colo.scale(1:99, rev(brewer.pal(9, "YlOrRd"))), breaks = 99, 
    axis.args = lab)
plot(roads2, add = TRUE)























Rasterize with lengths

An example on using rasterize to compute the length of roads within each pixel of a raster.

test <- rasterize(roads2, aggregate(rast, 10), fun = "length")
plot(test)
plot(roads2, add = TRUE)













































Contours

Convert rasters values to contours in a SpatialLines object.

con <- rasterToContour(rast)
con
## class       : SpatialLinesDataFrame 
## features    : 9 
## extent      : -1163363, -964632, 2018427, 2198389  (xmin, xmax, ymin, ymax)
## coord. ref. : NA 
## variables   : 1
## names       : level 
## min values  : 10000 
## max values  :  5000
plot(rast)
plot(con, add = TRUE)























Polygonize a raster

Turn all cells that are at least 25km from the roads to polygons.

pr <- rasterToPolygons(rast, fun = function(x) {
    x > 25000
}, dissolve = TRUE)
plot(rast)
plot(pr, col = "blue", border = NA, add = TRUE)























– Exercice 4 –

Get data for Trillium cernuum using package rgbif and calculate the mean annual temperature and annual precipitation in the species range. Bioclimatic variables are described in the WorldClim pages.

Here is how to download the data. Note that the object returned by occ_search is a gbif object from which the data element need to be extracted. This element is a tibble that needs to be turned to a data.frame in order to convert it to a spatial object.

# download temperature data
r <- raster::getData("worldclim", var = "bio", res = 2.5, lon = -70, lat = 65)

# download records
x <- occ_search(scientificName = "Trillium erectum", limit = 1000, hasCoordinate = TRUE, 
    country = "CA")
x <- as.data.frame(x$data)























– Solution –

# dowload climatic data
r <- raster::getData("worldclim", var = "bio", res = 10)
r <- subset(r, c(1, 12))
names(r) <- c("temp", "prec")

# download Trillium data
x <- occ_search(scientificName = "Trillium cernuum", limit = 1000, hasCoordinate = TRUE, 
    country = "CA")

# get the records in a spatial object
x <- as.data.frame(x$data)
coordinates(x) <- ~decimalLongitude + decimalLatitude

# check the CRS of r
proj4string(r)
## [1] "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
# use the same projection for both object
proj4string(x) <- CRS(proj4string(r))

# reduce raster size to the bounding box of the records
r <- crop(r, extent(x))

# get convex hull of distribution
a <- gConvexHull(x)

# check data
par(mfrow = c(2, 1), mar = c(2, 3, 1, 3))
plot(r[[1]])
plot(x, add = TRUE)
plot(a, add = TRUE)
plot(r[[2]])
plot(x, add = TRUE)
plot(a, add = TRUE)

# extract values within the range of the species
e <- extract(r, a)

# check distribution of values for mean annual temperature and
# precipitation
par(mfrow = c(1, 2))
hist(e[[1]][, 1]/10, xlab = "")
hist(e[[1]][, 2], xlab = "")

MAPPING

STATIC

plot

data(meuse)
coordinates(meuse) <- ~x + y
proj4string(meuse) <- CRS("+init=epsg:28992")
head(meuse)
##   cadmium copper lead zinc  elev       dist   om ffreq soil lime
## 1    11.7     85  299 1022 7.909 0.00135803 13.6     1    1    1
## 2     8.6     81  277 1141 6.983 0.01222430 14.0     1    1    1
## 3     6.5     68  199  640 7.800 0.10302900 13.0     1    1    1
## 4     2.6     81  116  257 7.655 0.19009400  8.0     1    2    0
## 5     2.8     48  117  269 7.480 0.27709000  8.7     1    2    0
## 6     3.0     61  137  281 7.791 0.36406700  7.8     1    2    0
##   landuse dist.m
## 1      Ah     50
## 2      Ah     30
## 3      Ah    150
## 4      Ga    270
## 5      Ah    380
## 6      Ga    470
plot(meuse)

spplot

spplot(meuse, z = c("cadmium", "copper", "lead", "zinc"))

levelplot

library(rasterVis)

r <- raster::getData("worldclim", var = "tmean", res = 10)
r <- crop(r, extent(-150, -50, 43, 85))

levelplot(r/10, cuts = 99)























ggspatial

library(ggspatial)

ggspatial(meuse)























OpenStreetMap

library(dismo)

test <- gmap(meuse, type = "roadmap", filename = "map.gmap", scale = 2)
p <- spTransform(meuse, CRS(proj4string(test)))
test <- crop(test, p)
plot(test)
plot(p, add = TRUE)























ggmap

library(ggmap)

test <- gmap("Sherbrooke", type = "satellite", scale = 2, zoom = 13, rgb = FALSE)
plot(test)

qmap("Sherbrooke", zoom = 14, maptype = "satellite")























INTERACTIVE

Drawing polygons

select, drawPoly from package raster to interactively select elements

plot(x)

# pol<-drawPoly() plot(pol,add=TRUE,col='red')























leaflet

Here present leaflet and leafletExtras

library(leaflet)
library(leaflet.extras)

leaflet(cat) %>% 
  addTiles() %>%
  addCircleMarkers(data = cat, group = 'cat') %>%
  addDrawToolbar(targetGroup = 'cat', editOptions = editToolbarOptions(selectedPathOptions = selectedPathOptions())) %>%
  addLayersControl(overlayGroups = c('cat'),options =layersControlOptions(collapsed=FALSE)) %>%
  addMeasurePathToolbar(options =measurePathOptions(imperial = TRUE,minPixelDistance = 100,showDistances = FALSE))























tmap

cat$attack <- as.factor(cat$Attack)

library(tmap)

tmap_mode("view")

tm_shape(roads) + tm_lines() + tm_shape(cat) + tm_dots("attack") + tm_layout(basemaps = c("Esri.WorldImagery", 
    "Esri.WorldShadedRelief", "Esri.NatGeoWorldMap"))























mapedit

Here is a test with using mapedit functionality.

library(mapview)
library(mapedit)

x <- mapview() %>% editMap()
plot(x)
# x <- as(x[[1]],'Spatial')
plot(x)























MORE POSSIBILITIES

The main book for learning to use R for spatial data is probably Applied Spatial Data Analysis with R by Bivand et al. (2013)

Package Use
spatstat Huge package mainly for analysing spatial point patterns
adehabitat A collection of packages for studying habitat selection
gstat Variograms, geostatistics, kriging, etc.
SDMTools Tools for Species Distribution Modeling
spdep Tools for studying spatial dependence























Spatial Task View

The Spatial Task View on maintained on CRAN is worth a visit if you are searching for more possibilities.

rspatial.org

A good online reference for doing spatial analyses with R is rspatial.org which provides lots of examples on diferent types of analyses.























Special packages

Landscape metrics with SDMTools or spatialEco

library(SDMTools)
library(spatialEco)


r <- raster(nrows = 200, ncols = 200, xmn = 0, xmx = 100, ymn = 0, ymx = 100)

x <- runif(100, 0, 100)
y <- runif(100, 0, 100)

h <- gUnaryUnion(gEnvelope(gBuffer(SpatialPoints(cbind(x, y)), width = rpois(length(x), 
    4), byid = TRUE), byid = TRUE))

r <- rasterize(h, r)

plot(r, col = "green4")

x <- sampleRandom(r, 1, na.rm = TRUE, sp = TRUE)
pol <- SpatialPolygonsDataFrame(gBuffer(x, width = 20), data.frame(id = 1), 
    match.ID = FALSE)

a <- land.metrics(x = pol, y = r, metrics = c(2:38))
## Processing observation - 1
plot(pol, add = TRUE)
# plot(rasterToPolygons(r),add=TRUE)
plot(x, add = TRUE)

Scalebar(x = 90, y = 3, distance = 20, unit = "m", scale = 1)

a
##   class n.patches total.area prop.landscape patch.density total.edge
## 1     1         9     496.25              1    0.01813602        282
##   edge.density landscape.shape.index largest.patch.index
## 1     0.568262              3.133333           0.4211587
##   mean.patch.area sd.patch.area min.patch.area max.patch.area
## 1        55.13889      77.87421           2.25            209
##   perimeter.area.frac.dim mean.perim.area.ratio sd.perim.area.ratio
## 1                1.133278              1.463638           0.9954999
##   min.perim.area.ratio max.perim.area.ratio mean.shape.index
## 1            0.3668639             3.111111         1.295516
##   sd.shape.index min.shape.index max.shape.index mean.frac.dim.index
## 1      0.3475696               1        2.153846            1.202472
##   sd.frac.dim.index min.frac.dim.index max.frac.dim.index
## 1         0.2068328                  1           1.672263
##   total.core.area prop.landscape.core mean.patch.core.area
## 1          369.75           0.7450882             41.08333
##   sd.patch.core.area min.patch.core.area max.patch.core.area
## 1           65.26628                   0               168.5
##   prop.like.adjacencies aggregation.index lanscape.division.index
## 1             0.8673565          95.05155               0.6918843
##   splitting.index effective.mesh.size patch.cohesion.index
## 1        3.245535            152.9024             9.367108























sf: SIMPLE FEATURES FOR R

Vignette 1

  • S3 class
  • Interacts with the tidyverse and dplyr
  • Easy conversion from sp to sf