Introduction to Spatial Data Types in R

1. Spatial objects in R
- Conceptualizing a spatial Object
  - Exercise 1
2. Creating a spatial object
3. Getting spatial data in and out of R
- How to work with rgdal
  - Exercise 4
- Raster data
  - Exercise 5

Make sure you have the sp and raster libraries installed. [Note that the latest raster(2.5-2) depends on sp (≥ 1.2-0), so make sure that your versions align.]

Mac users

Since there is now a Mac binary package for rgdal available on CRAN a installing sp with the default settings will also install this rgdal as one of its dependencies, which includes a built-in basic GDAL that lacks all the extra GDAL formats can handle. There is an alternative rgdal distributed by kyngchaos that you can use instead. Formats not included in the CRAN distribution but included in the kyngchaos distribution are currently: DODS, Geomedia, Interlis 1, Interlis 2, LIBKML, MSSQLSpatial, NAS, ODBC, OGDI, PGeo, PostgreSQL, OSI, Walk, XLS. (You mostly might care about having the PostgreSQL driver at some point.)

If you DON’T care, simply say:

install.packages(c("sp", "raster"))

If you DO care, follow the instructions below.

Install the raster library:
```
install.packages("raster"))
```

Don’t install sp with dependencies:

install.packages("sp", dependencies = F)

Download the latest GDAL complete from this site
Doubleclick and install the downloaded .dmg file as you are used to on a Mac.
Make sure you have R Version 3.2 or later installed – if not update it.
Download a different rgdal from this site.
Doubleclick to open the .dmg file
Move rgdal_*.tgz to your Desktop folder

Install the local package with:

install.packages("~/Desktop/rgdal_*.tgz", repos = NULL, type = .Platform$pkgType)

Windows users

install.packages(c("sp", "raster"))

Mac and Windows

Test if all went well:

library (rgdal)

1. Spatial objects in R

Conceptualizing a spatial Object

In vector GIS we deal with, points, lines, and polygons:

Exercise 1

Discuss with your neighbor: What do we need to specify in order to define spatial vector data?

lat/lon coordinates
projection
attribute data
if polygon, is it a hole or not
… ?

In R the sp package provides classes and methods for spatial data types¹.

Development of the sp began in the early 2000s in an attempt to standardize how spatial data would be treated in R and to allow for better interoparbility between different analysis packages that use spatial data. The package provides classes and methods to create points, lines, polygons, and grids and to operate on them. It is one of the most important packages that you will need to use if dealing with spatial data in R. Many of the spatial analysis packages now use the spatial data types that are implemented in sp i.e. they “depend” on the sp package.

In sp spatial objects are conceptualized in the following way²:

The foundational structure for any spatial object in sp is the Spatial class. It has two slots (new-style class objects in R have pre-defined components called slots):

a bounding box
a CRS class object to define the Coordinate Reference System

2. Creating a spatial object

In order to create a spatial object manually the basic steps are:

I. Create a bunch of points, lines, or polygons (details below)

Convert those to a Spatial* object (* stands for Points, Lines, or Polygons). This steps adds the bounding box (automatically) and the slot for the Coordinate Reference System or CRS (which needs to be filled manually).

(Optional:) Add a data frame with attribute data, which will turn your Spatial* object into a Spatial*DataFrame object.

I. Create geometric objects (topology)

Points (which may have 2 or 3 dimensions) as the most basic spatial data object. They are generated out of either a single coordinate or a set of coordinates³, like a two-column matrix or a dataframe with a column for latitude and one for longitude.

Lines are generated out of Line objects. A Line object is a spaghetti collection of 2D coordinates and is generated out of a two-column matrix or a dataframe with a column for latitude and one for longitude. A Lines object is a list of Line objects, for example all the contours at a single elevation.

Polygons are generated out Polygon objects. A Polygon object is a spaghetti collection of 2D coordinates with equal first and last coordinates and is generated out of a two-column matrix or a dataframe with a column for latitude and one for longitude. A Polygons object is a list of Polygon objects, for example islands belonging to the same country.

II. Create spatial objects

Both Line/Lines and Polygon/Polygons objects are part of the R’s basic graphics package. So we need to turn those into spatial, i.e. “geographically aware” objects. SpatialPoints are directly made out of the coordinates. SpatialLines and SpatialPolygons objects are made using lists of Lines or Polygons objects respectively.

III. Add attributes

The points in a SpatialPoints object may be associated with a row of attributes to create a SpatialPointsDataFrame object. The coordinates and attributes may, but do not have to be keyed to each other using ID values.

SpatialLinesDataFrame and SpatialPolygonsDataFrame objects are defined using SpatialLines and SpatialPolygons objects and standard data frames, and the ID fields are here required to match the data frame row names.

Simplified diagram of how spatial objects are created in sp

Spatial methods

A number of spatial methods are available for the classes in sp. Among the ones I often use are:

method/class	and what it does
`bbox(x)`	returns the bounding box coordinates
`proj4string(x)`	sets or retrieves projection attributes using the CRS object.
`CRS()`	creates an object of class of coordinate reference system arguments
`spplot(x)`	plots a separate map of all the attributes unless specified otherwise
`coordinates(x)`	returns a matrix with the spatial coordinates. For spatial polygons it returns the centroids.
`over(x, y)`	used for example to retrieve the polygon or grid indexes on a set of points - we’ll come back to that one later
`spsample(x)`	sampling of spatial points within the spatial extent of objects

Creating a spatial object from scratch

Exercise 2

In this example we will manually generate a point vector object and plot it.

xy <- matrix(runif(20), ncol = 2)  # a matrix with some arbitrary points as coordinates..
xy.sp <- SpatialPoints(xy)  # ..converted into a spatial object
plot(xy.sp, pch = 2)

Test out some commands:

coordinates(xy.sp)

      coords.x1  coords.x2
 [1,] 0.1813925 0.86973823
 [2,] 0.2350274 0.41771681
 [3,] 0.1972356 0.22239799
 [4,] 0.1488553 0.62568307
 [5,] 0.8633978 0.00683451
 [6,] 0.9122698 0.63056878
 [7,] 0.9174440 0.40440935
 [8,] 0.6594771 0.53791701
 [9,] 0.6489924 0.85572014
[10,] 0.5936272 0.55002504

bbox(xy.sp)

                 min       max
coords.x1 0.14885531 0.9174440
coords.x2 0.00683451 0.8697382

str(xy.sp)

Formal class 'SpatialPoints' [package "sp"] with 3 slots
  ..@ coords     : num [1:10, 1:2] 0.181 0.235 0.197 0.149 0.863 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "coords.x1" "coords.x2"
  ..@ bbox       : num [1:2, 1:2] 0.14886 0.00683 0.91744 0.86974
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "coords.x1" "coords.x2"
  .. .. ..$ : chr [1:2] "min" "max"
  ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
  .. .. ..@ projargs: chr NA

summary(xy.sp)

Object of class SpatialPoints
Coordinates:
                 min       max
coords.x1 0.14885531 0.9174440
coords.x2 0.00683451 0.8697382
Is projected: NA 
proj4string : [NA]
Number of points: 10

Add attributes:

df <- data.frame(attr1 = LETTERS[1:10], attr2 = c(10:1))
xy.spdf <- SpatialPointsDataFrame(xy.sp, df)
summary(xy.spdf)

Object of class SpatialPointsDataFrame
Coordinates:
                 min       max
coords.x1 0.14885531 0.9174440
coords.x2 0.00683451 0.8697382
Is projected: NA 
proj4string : [NA]
Number of points: 10
Data attributes:
     attr1       attr2      
 A      :1   Min.   : 1.00  
 B      :1   1st Qu.: 3.25  
 C      :1   Median : 5.50  
 D      :1   Mean   : 5.50  
 E      :1   3rd Qu.: 7.75  
 F      :1   Max.   :10.00  
 (Other):4

Some subsetting:

xy.spdf[1:2, ]  # row 1 and 2 only

class       : SpatialPointsDataFrame 
features    : 2 
extent      : 0.1813925, 0.2350274, 0.4177168, 0.8697382  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
variables   : 2
names       : attr1, attr2 
min values  :     A,     9 
max values  :     B,    10

xy.spdf[["attr2"]]  # column with 'attr2' only

 [1] 10  9  8  7  6  5  4  3  2  1

Creating a spatial object from a lat/lon table

A SpatialPointsDataFrame object can be created directly from data.frames by specifying which columns contain the coordinates. This is interesting, for example if you have a spreadsheet that contains latitude, longitude and some values. You an read this into a data frame with read.table, and then create the object from the data frame in one step by using the coordinates() command. That automatically turns the dataframe object into a SpatialPointsDataFrame:

Exercise 3

If you haven’t already, create a new directory R_Workshop on your Desktop.
Download and unzip RSpatialDataTypes.zip in this directory.
Set your working directory to R_Workshop
Use read.csv() to read PhiladelphiaZIPHousing.csv into a dataframe in R and name it ph.df.
Use head() to examine the first few lines of the dataframe.
Use class() to examine which object class the table belongs to.
Convert the table into a spatial object with:

coordinates(ph.df) <- c("lon", "lat")
Use class()again to examine which object class the table belongs to now:
What to you observe?
Plot, using the attributes from the table:
bubble(ph.df, "price") or
spplot(ph.df, "use")

A brief, but important word about projection.

Note that the Spatial object you just created does not have a projection defined. It is ok to plot, but be aware that for any meaningful spatial operation you will need to define a projection.

This is how it’s done:

is.projected(ph.df)  # see if a projection is defined  
proj4string(ph.df) <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")  # this is WGS84
is.projected(ph.df)  # voila

The following example⁴ shows how a set of polygons are built from scratch. Note that the coordinates of the Sr4 polygon move in the opposite direction (anti-clockwise) than the other three (clockwise); Sr4 is meant to represent a hole in the Sr3 polygon. The default value for the hole colour is “transparent”. Note that the Polygons objects have to be given character ID values, and that these values must be unique for Polygons objects combined in a SpatialPolygons object.

# create polyon objects from coordinates
Sr1 <- Polygon(cbind(c(2, 4, 4, 1, 2), c(2, 3, 5, 4, 2)))
Sr2 <- Polygon(cbind(c(5, 4, 2, 5), c(2, 3, 2, 2)))
Sr3 <- Polygon(cbind(c(4, 4, 5, 10, 4), c(5, 3, 2, 5, 5)))
Sr4 <- Polygon(cbind(c(5, 6, 6, 5, 5), c(4, 4, 3, 3, 4)), hole = TRUE)

# create lists of polygon objects from polygon objects and unique ID
Srs1 <- Polygons(list(Sr1), "s1")
Srs2 <- Polygons(list(Sr2), "s2")
Srs3 <- Polygons(list(Sr3, Sr4), "s3/4")

# create spatial polygons object from lists
SpP <- SpatialPolygons(list(Srs1, Srs2, Srs3), 1:3)
plot(SpP)

In order to add attributes to the polygons the row.names of the attributes data frame are matched with the ID slots of the SpatialPolygons object, and the rows of the data frame will be re-ordered if necessary.

Add attributes:

attr <- data.frame(numbers = 1:3, characters = LETTERS[3:1], row.names = c("s3/4", 
    "s1", "s2"))
SpDf <- SpatialPolygonsDataFrame(SpP, attr)
spplot(SpDf)

Let’s look at its structure:

str(SpDf)

Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
  ..@ data       :'data.frame': 3 obs. of  2 variables:
  .. ..$ numbers   : int [1:3] 2 3 1
  .. ..$ characters: Factor w/ 3 levels "A","B","C": 2 1 3
  ..@ polygons   :List of 3
  .. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
  .. .. .. ..@ Polygons :List of 1
  .. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
  .. .. .. .. .. .. ..@ labpt  : num [1:2] 2.7 3.55
  .. .. .. .. .. .. ..@ area   : num 5.5
  .. .. .. .. .. .. ..@ hole   : logi FALSE
  .. .. .. .. .. .. ..@ ringDir: int 1
  .. .. .. .. .. .. ..@ coords : num [1:5, 1:2] 2 1 4 4 2 2 4 5 3 2
  .. .. .. ..@ plotOrder: int 1
  .. .. .. ..@ labpt    : num [1:2] 2.7 3.55
  .. .. .. ..@ ID       : chr "s1"
  .. .. .. ..@ area     : num 5.5
  .. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
  .. .. .. ..@ Polygons :List of 1
  .. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
  .. .. .. .. .. .. ..@ labpt  : num [1:2] 3.67 2.33
  .. .. .. .. .. .. ..@ area   : num 1.5
  .. .. .. .. .. .. ..@ hole   : logi FALSE
  .. .. .. .. .. .. ..@ ringDir: int 1
  .. .. .. .. .. .. ..@ coords : num [1:4, 1:2] 5 2 4 5 2 2 3 2
  .. .. .. ..@ plotOrder: int 1
  .. .. .. ..@ labpt    : num [1:2] 3.67 2.33
  .. .. .. ..@ ID       : chr "s2"
  .. .. .. ..@ area     : num 1.5
  .. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
  .. .. .. ..@ Polygons :List of 2
  .. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
  .. .. .. .. .. .. ..@ labpt  : num [1:2] 6.13 3.93
  .. .. .. .. .. .. ..@ area   : num 10
  .. .. .. .. .. .. ..@ hole   : logi FALSE
  .. .. .. .. .. .. ..@ ringDir: int 1
  .. .. .. .. .. .. ..@ coords : num [1:5, 1:2] 4 10 5 4 4 5 5 2 3 5
  .. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
  .. .. .. .. .. .. ..@ labpt  : num [1:2] 5.5 3.5
  .. .. .. .. .. .. ..@ area   : num 1
  .. .. .. .. .. .. ..@ hole   : logi TRUE
  .. .. .. .. .. .. ..@ ringDir: int -1
  .. .. .. .. .. .. ..@ coords : num [1:5, 1:2] 5 5 6 6 5 4 3 3 4 4
  .. .. .. ..@ plotOrder: int [1:2] 1 2
  .. .. .. ..@ labpt    : num [1:2] 6.13 3.93
  .. .. .. ..@ ID       : chr "s3/4"
  .. .. .. ..@ area     : num 10
  ..@ plotOrder  : int [1:3] 1 2 3
  ..@ bbox       : num [1:2, 1:2] 1 2 10 5
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "x" "y"
  .. .. ..$ : chr [1:2] "min" "max"
  ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
  .. .. ..@ projargs: chr NA

Whoa.

Note that we can access the information in the slots using the @. For example

SpDf@bbox

  min max
x   1  10
y   2   5

However, it is strongly encouraged to use the provided functions and methods instead, like

bbox(SpDf)

  min max
x   1  10
y   2   5

To look at the attribute table:

as.data.frame(SpDf)  # instad of: SpDf@data

     numbers characters
s1         2          B
s2         3          A
s3/4       1          C

3. Getting spatial data in and out of R

The good news is that typically we do not have to create Spatial family objects as tediously as we did above. It is much more common that we work with already existing spatial data.

How to work with `rgdal`

In order to read those into R and turn them into Spatial* family objects we rely on the rgdal package. It provides us direct access to the powerful GDAL library from within R.

We can read in and write out spatial data using:

readOGR() and writeOGR() (for vector)

readGDAL() and writeGDAL() (for raster/grids)

The parameters provided for each function vary depending on the exact spatial file type you are reading. We will take an ESRI shapefile as an example. A shapefile - as you know - consists of various files, and R expects all those files to be in one directory.

When reading in a shapefile, readOGR() expects at least the following two arguments:

datasource name (dsn)  # the path to the folder that contains the files
                       # Note that this is a path to the folder, not a filename!
layer name (layer)     # the shapefile name without extension
                       # Note that this is not a path but just the name of the file!

For example, if I have a shapefile called Philadelphia.shp and all its associated files (like .dbf, .prj, .shx) in a directory called PH on my desktop, and I have my working directory set to my desktop folder, my command to read this shapefile would look like this:

readOGR(dsn = "PH", layer = "Philadelphia")

or in short:

readOGR("PH", "Philadelphia")

Exercise 4

Load the rgdal package.
Make sure your working directory is set to the R_Workshop folder and it contains the materials you downloaded and unzipped earlier.
Read PhillyTotalPopHHinc.shp into an object called philly. Make sure you understand the directory structure.
Examine the file, for example with summary(), class(), str("philly", max.level = 2),
Take a look at the column names of the attribute data with names()
Take a look at the attribute data with head()
Note that subsetting works here: philly[philly$totalPop > 5000,]
Plot some attribute data: spplot(philly, "medHHinc")

GDAL supports over 200 raster formats and vector formats. Use
ogrDrivers() and gdalDrivers()
(without arguments) to find out which formats your rgdal install can handle.

Raster data

Rasters are much more compact that vectors. Because of their regular structure the coordinates do not need to be recorded for each pixel or cell in the rectangular extent. A raster is defined by:

a CRS
coordinates of its origin
a distance or cell size in each direction
a dimension or numbers of cells in each direction
an array of values

If necessary, the coordinates for any cell can be computed.

In sp the GridTopology class is the key element of raster representations⁵. It contains

the center coordinate pair of the south-west raster cell,
the two cell sizes in the metric of the coordinates, giving the step to successive centres, and
the numbers of cells for each dimension.

A simple grid can be built like this:

# specify the grid topology with the following parameters:
# - the smallest coordinates for each dimension, here: 0,0
# - cell size in each dimension, here: 1,1 
# - number of cells in each dimension, here: 5,5
gtopo <- GridTopology(c(0,0), c(1,1), c(5,5)) # create the grid
datafr <- data.frame(runif(25)) # make up some data
SpGdf <- SpatialGridDataFrame(gtopo, datafr) # create the grid data frame
summary(SpGdf)

Object of class SpatialGridDataFrame
Coordinates:
      min max
[1,] -0.5 4.5
[2,] -0.5 4.5
Is projected: NA 
proj4string : [NA]
Grid attributes:
  cellcentre.offset cellsize cells.dim
1                 0        1         5
2                 0        1         5
Data attributes:
   runif.25.      
 Min.   :0.03807  
 1st Qu.:0.31346  
 Median :0.51471  
 Mean   :0.55202  
 3rd Qu.:0.75978  
 Max.   :0.95975

And it can be plotted like this:

spplot(SpGdf, sp.layout = c("sp.points", SpatialPoints(coordinates(SpGdf))))

Alternatively you can use the raster package, which works slightly differently.

The raster package is a major extension of spatial data classes to access large rasters and in particular to process very large files. It includes object classes for RasterLayer, RasterStacks, and RasterBricks, functions for converting among these classes, and operators for computations on the raster data. Conversion from sp type objects into raster type objects is easy.

If we wanted to do the same as above, namely creating the same raster object from scratch we would do the following:

# specify the RasterLayer with the following parameters:
# - minimum x coordinate (left border)
# - minimum y coordinate (bottom border)
# - maximum x coordinate (right border)
# - maximum y coordinate (top border)
# - resolution (cell size) in each dimension
r <- raster(xmn=-0.5, ymn=-0.5, xmx=4.5, ymx=4.5, resolution=c(1,1))
r

class       : RasterLayer 
dimensions  : 5, 5, 25  (nrow, ncol, ncell)
resolution  : 1, 1  (x, y)
extent      : -0.5, 4.5, -0.5, 4.5  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0

So here we have created an object of type RasterLayer, as compared to above, where we created an object of type GridTopology.

Compare this to the output from above and note something important here: Different from the grid object we generated from scratch, this raster object has a CRS defined! If the crs argument is missing when creating the Raster object, the x coordinates are within -360 and 360 and the y coordinates are within -90 and 90, the WGS84 projection is used by default!

Good to know.

To add some values to the cells we could the following. Be aware that different from the GridTopology object above, which is converted to a SpatialGridDataFrame when adding values, this here object remains a RasterLayer.

class(r)

[1] "RasterLayer"
attr(,"package")
[1] "raster"

r <- setValues(r, runif(25))
class(r)

[1] "RasterLayer"
attr(,"package")
[1] "raster"

plot(r); points(coordinates(r), pch=3)

(See the rasterVis package for more advanced plotting of Raster* objects.)

Exercise 5

We can use readGDAL from the sp package to read in a raster file. It will require one parameter as a minimum, which is the filename of the raster. So let’s do this.

Make sure your working directory is set to the R_Workshop folder and it contains the materials you downloaded and unzipped earlier.
Read in with: dem <- readGDAL("DEM_10m/bushkill_pa.dem")⁶
Examine with summary()

Alternatively we can use the raster package to read in the same file:

Load the raster library
Read in with: dem.r <- raster("DEM_10m/bushkill_pa.dem")
Examine with dem.r and compare to the above
Extract contour lines and plot them: contour(dem.r)
Note that this works too: contour(dem)

RasterLayer objects can also be created from a simple matrix.

class(volcano)

[1] "matrix"

volcano.r <- raster(volcano)
class(volcano.r)

[1] "RasterLayer"
attr(,"package")
[1] "raster"

This package becomes interesting when you need to work for example with Landsat images that have multiple bands and want to perform map algebra.

There are currently about 140 R packages on CRAN for reading, visualising, and analysing (geographical) spatial data. I recommend to visit that website if you are exploring spatial analysis with R.

From R Bivand (2011) Introduction to representing spatial objects in R ↩
Note that this is not the only way spatial objects are conceptualized in R. Other spatial packages may use their own class definitions for spatial data (for example spatstat). sp provides conversion functions for many of those formats.↩
Coordinates should be of type double and will be promoted if not.↩
from Edzer Pebesma Roger S. Bivand Feb 2005: S Classes and Methods for Spatial Data: the sp Package ↩
There is also a SpatialPixels object which stores grid topology and coordinates of the actual points.↩
From (https://www.e-education.psu.edu/geog482fall2/c7_p8.html)↩

Introduction to Spatial Data Types in R

claudia a engel

Last updated: April 18, 2016

1. Spatial objects in R

Conceptualizing a spatial Object

Exercise 1

2. Creating a spatial object

I. Create geometric objects (topology)

II. Create spatial objects

III. Add attributes

Spatial methods

Creating a spatial object from scratch

Exercise 2

Creating a spatial object from a lat/lon table

Exercise 3

A brief, but important word about projection.

3. Getting spatial data in and out of R

How to work with `rgdal`

Exercise 4

Raster data

Exercise 5

Introduction to Spatial Data Types in R

claudia a engel

Last updated: April 18, 2016

1. Spatial objects in R

Conceptualizing a spatial Object

Exercise 1

2. Creating a spatial object

I. Create geometric objects (topology)

II. Create spatial objects

III. Add attributes

Spatial methods

Creating a spatial object from scratch

Exercise 2

Creating a spatial object from a lat/lon table

Exercise 3

A brief, but important word about projection.

3. Getting spatial data in and out of R

How to work with rgdal

Exercise 4

Raster data

Exercise 5

How to work with `rgdal`