Make sure you have the sp
and raster
libraries installed. [Note that the latest raster
(2.5-2) depends on sp
(≥ 1.2-0), so make sure that your versions align.]
Mac users
Since there is now a Mac binary package for rgdal
available on CRAN a installing sp
with the default settings will also install this rgdal
as one of its dependencies, which includes a built-in basic GDAL that lacks all the extra GDAL formats can handle. There is an alternative rgdal
distributed by kyngchaos that you can use instead. Formats not included in the CRAN distribution but included in the kyngchaos distribution are currently: DODS, Geomedia, Interlis 1, Interlis 2, LIBKML, MSSQLSpatial, NAS, ODBC, OGDI, PGeo, PostgreSQL, OSI, Walk, XLS. (You mostly might care about having the PostgreSQL driver at some point.)
If you DON’T care, simply say:
install.packages(c("sp", "raster"))
If you DO care, follow the instructions below.
Install the raster library:
install.packages("raster"))
Don’t install sp
with dependencies:
install.packages("sp", dependencies = F)
Download the latest GDAL complete from this site
Doubleclick and install the downloaded .dmg
file as you are used to on a Mac.
Make sure you have R Version 3.2 or later installed – if not update it.
Download a different rgdal from this site.
Doubleclick to open the .dmg
file
Move rgdal_*.tgz
to your Desktop folder
Install the local package with:
install.packages("~/Desktop/rgdal_*.tgz", repos = NULL, type = .Platform$pkgType)
Windows users
install.packages(c("sp", "raster"))
Mac and Windows
Test if all went well:
library (rgdal)
In vector GIS we deal with, points, lines, and polygons:
Discuss with your neighbor: What do we need to specify in order to define spatial vector data?
In R the sp
package provides classes and methods for spatial data types1.
Development of the sp
began in the early 2000s in an attempt to standardize how spatial data would be treated in R and to allow for better interoparbility between different analysis packages that use spatial data. The package provides classes and methods to create points, lines, polygons, and grids and to operate on them. It is one of the most important packages that you will need to use if dealing with spatial data in R. Many of the spatial analysis packages now use the spatial data types that are implemented in sp
i.e. they “depend” on the sp
package.
In sp
spatial objects are conceptualized in the following way2:
The foundational structure for any spatial object in sp
is the Spatial
class. It has two slots (new-style class objects in R have pre-defined components called slots):
a bounding box
a CRS class object to define the Coordinate Reference System
In order to create a spatial object manually the basic steps are:
I. Create a bunch of points, lines, or polygons (details below)
- Convert those to a
Spatial*
object (*
stands for Points, Lines, or Polygons). This steps adds the bounding box (automatically) and the slot for the Coordinate Reference System or CRS (which needs to be filled manually).
- (Optional:) Add a data frame with attribute data, which will turn your
Spatial*
object into aSpatial*DataFrame
object.
Points (which may have 2 or 3 dimensions) as the most basic spatial data object. They are generated out of either a single coordinate or a set of coordinates3, like a two-column matrix or a dataframe with a column for latitude and one for longitude.
Lines are generated out of Line
objects. A Line
object is a spaghetti collection of 2D coordinates and is generated out of a two-column matrix or a dataframe with a column for latitude and one for longitude. A Lines
object is a list of Line
objects, for example all the contours at a single elevation.
Polygons are generated out Polygon
objects. A Polygon
object is a spaghetti collection of 2D coordinates with equal first and last coordinates and is generated out of a two-column matrix or a dataframe with a column for latitude and one for longitude. A Polygons
object is a list of Polygon
objects, for example islands belonging to the same country.
Both Line/Lines and Polygon/Polygons
objects are part of the R’s basic graphics
package. So we need to turn those into spatial, i.e. “geographically aware” objects. SpatialPoints
are directly made out of the coordinates. SpatialLines
and SpatialPolygons
objects are made using lists of Lines
or Polygons
objects respectively.
The points in a SpatialPoints
object may be associated with a row of attributes to create a SpatialPointsDataFrame
object. The coordinates and attributes may, but do not have to be keyed to each other using ID values.
SpatialLinesDataFrame
and SpatialPolygonsDataFrame
objects are defined using SpatialLines
and SpatialPolygons
objects and standard data frames, and the ID fields are here required to match the data frame row names.
A number of spatial methods are available for the classes in sp
. Among the ones I often use are:
method/class | and what it does |
---|---|
bbox(x) |
returns the bounding box coordinates |
proj4string(x) |
sets or retrieves projection attributes using the CRS object. |
CRS() |
creates an object of class of coordinate reference system arguments |
spplot(x) |
plots a separate map of all the attributes unless specified otherwise |
coordinates(x) |
returns a matrix with the spatial coordinates. For spatial polygons it returns the centroids. |
over(x, y) |
used for example to retrieve the polygon or grid indexes on a set of points - we’ll come back to that one later |
spsample(x) |
sampling of spatial points within the spatial extent of objects |
In this example we will manually generate a point vector object and plot it.
xy <- matrix(runif(20), ncol = 2) # a matrix with some arbitrary points as coordinates..
xy.sp <- SpatialPoints(xy) # ..converted into a spatial object
plot(xy.sp, pch = 2)
Test out some commands:
coordinates(xy.sp)
coords.x1 coords.x2
[1,] 0.1813925 0.86973823
[2,] 0.2350274 0.41771681
[3,] 0.1972356 0.22239799
[4,] 0.1488553 0.62568307
[5,] 0.8633978 0.00683451
[6,] 0.9122698 0.63056878
[7,] 0.9174440 0.40440935
[8,] 0.6594771 0.53791701
[9,] 0.6489924 0.85572014
[10,] 0.5936272 0.55002504
bbox(xy.sp)
min max
coords.x1 0.14885531 0.9174440
coords.x2 0.00683451 0.8697382
str(xy.sp)
Formal class 'SpatialPoints' [package "sp"] with 3 slots
..@ coords : num [1:10, 1:2] 0.181 0.235 0.197 0.149 0.863 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:2] "coords.x1" "coords.x2"
..@ bbox : num [1:2, 1:2] 0.14886 0.00683 0.91744 0.86974
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2] "coords.x1" "coords.x2"
.. .. ..$ : chr [1:2] "min" "max"
..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
.. .. ..@ projargs: chr NA
summary(xy.sp)
Object of class SpatialPoints
Coordinates:
min max
coords.x1 0.14885531 0.9174440
coords.x2 0.00683451 0.8697382
Is projected: NA
proj4string : [NA]
Number of points: 10
Add attributes:
df <- data.frame(attr1 = LETTERS[1:10], attr2 = c(10:1))
xy.spdf <- SpatialPointsDataFrame(xy.sp, df)
summary(xy.spdf)
Object of class SpatialPointsDataFrame
Coordinates:
min max
coords.x1 0.14885531 0.9174440
coords.x2 0.00683451 0.8697382
Is projected: NA
proj4string : [NA]
Number of points: 10
Data attributes:
attr1 attr2
A :1 Min. : 1.00
B :1 1st Qu.: 3.25
C :1 Median : 5.50
D :1 Mean : 5.50
E :1 3rd Qu.: 7.75
F :1 Max. :10.00
(Other):4
Some subsetting:
xy.spdf[1:2, ] # row 1 and 2 only
class : SpatialPointsDataFrame
features : 2
extent : 0.1813925, 0.2350274, 0.4177168, 0.8697382 (xmin, xmax, ymin, ymax)
coord. ref. : NA
variables : 2
names : attr1, attr2
min values : A, 9
max values : B, 10
xy.spdf[["attr2"]] # column with 'attr2' only
[1] 10 9 8 7 6 5 4 3 2 1
A SpatialPointsDataFrame
object can be created directly from data.frames
by specifying which columns contain the coordinates. This is interesting, for example if you have a spreadsheet that contains latitude, longitude and some values. You an read this into a data frame with read.table
, and then create the object from the data frame in one step by using the coordinates()
command. That automatically turns the dataframe object into a SpatialPointsDataFrame
:
R_Workshop
on your Desktop.RSpatialDataTypes.zip
in this directory.R_Workshop
read.csv()
to read PhiladelphiaZIPHousing.csv
into a dataframe in R and name it ph.df
.head()
to examine the first few lines of the dataframe.class()
to examine which object class the table belongs to.Convert the table into a spatial object with:
coordinates(ph.df) <- c("lon", "lat")
Use class()
again to examine which object class the table belongs to now:
What to you observe?
Plot, using the attributes from the table:
bubble(ph.df, "price")
or
spplot(ph.df, "use")
Note that the Spatial object you just created does not have a projection defined. It is ok to plot, but be aware that for any meaningful spatial operation you will need to define a projection.
is.projected(ph.df) # see if a projection is defined
proj4string(ph.df) <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs") # this is WGS84
is.projected(ph.df) # voila
The following example4 shows how a set of polygons are built from scratch. Note that the coordinates of the Sr4 polygon move in the opposite direction (anti-clockwise) than the other three (clockwise); Sr4 is meant to represent a hole in the Sr3 polygon. The default value for the hole colour is “transparent”. Note that the Polygons objects have to be given character ID values, and that these values must be unique for Polygons
objects combined in a SpatialPolygons
object.
# create polyon objects from coordinates
Sr1 <- Polygon(cbind(c(2, 4, 4, 1, 2), c(2, 3, 5, 4, 2)))
Sr2 <- Polygon(cbind(c(5, 4, 2, 5), c(2, 3, 2, 2)))
Sr3 <- Polygon(cbind(c(4, 4, 5, 10, 4), c(5, 3, 2, 5, 5)))
Sr4 <- Polygon(cbind(c(5, 6, 6, 5, 5), c(4, 4, 3, 3, 4)), hole = TRUE)
# create lists of polygon objects from polygon objects and unique ID
Srs1 <- Polygons(list(Sr1), "s1")
Srs2 <- Polygons(list(Sr2), "s2")
Srs3 <- Polygons(list(Sr3, Sr4), "s3/4")
# create spatial polygons object from lists
SpP <- SpatialPolygons(list(Srs1, Srs2, Srs3), 1:3)
plot(SpP)
In order to add attributes to the polygons the row.names
of the attributes data frame are matched with the ID slots of the SpatialPolygons object, and the rows of the data frame will be re-ordered if necessary.
Add attributes:
attr <- data.frame(numbers = 1:3, characters = LETTERS[3:1], row.names = c("s3/4",
"s1", "s2"))
SpDf <- SpatialPolygonsDataFrame(SpP, attr)
spplot(SpDf)
Let’s look at its structure:
str(SpDf)
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
..@ data :'data.frame': 3 obs. of 2 variables:
.. ..$ numbers : int [1:3] 2 3 1
.. ..$ characters: Factor w/ 3 levels "A","B","C": 2 1 3
..@ polygons :List of 3
.. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
.. .. .. ..@ Polygons :List of 1
.. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
.. .. .. .. .. .. ..@ labpt : num [1:2] 2.7 3.55
.. .. .. .. .. .. ..@ area : num 5.5
.. .. .. .. .. .. ..@ hole : logi FALSE
.. .. .. .. .. .. ..@ ringDir: int 1
.. .. .. .. .. .. ..@ coords : num [1:5, 1:2] 2 1 4 4 2 2 4 5 3 2
.. .. .. ..@ plotOrder: int 1
.. .. .. ..@ labpt : num [1:2] 2.7 3.55
.. .. .. ..@ ID : chr "s1"
.. .. .. ..@ area : num 5.5
.. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
.. .. .. ..@ Polygons :List of 1
.. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
.. .. .. .. .. .. ..@ labpt : num [1:2] 3.67 2.33
.. .. .. .. .. .. ..@ area : num 1.5
.. .. .. .. .. .. ..@ hole : logi FALSE
.. .. .. .. .. .. ..@ ringDir: int 1
.. .. .. .. .. .. ..@ coords : num [1:4, 1:2] 5 2 4 5 2 2 3 2
.. .. .. ..@ plotOrder: int 1
.. .. .. ..@ labpt : num [1:2] 3.67 2.33
.. .. .. ..@ ID : chr "s2"
.. .. .. ..@ area : num 1.5
.. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
.. .. .. ..@ Polygons :List of 2
.. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
.. .. .. .. .. .. ..@ labpt : num [1:2] 6.13 3.93
.. .. .. .. .. .. ..@ area : num 10
.. .. .. .. .. .. ..@ hole : logi FALSE
.. .. .. .. .. .. ..@ ringDir: int 1
.. .. .. .. .. .. ..@ coords : num [1:5, 1:2] 4 10 5 4 4 5 5 2 3 5
.. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
.. .. .. .. .. .. ..@ labpt : num [1:2] 5.5 3.5
.. .. .. .. .. .. ..@ area : num 1
.. .. .. .. .. .. ..@ hole : logi TRUE
.. .. .. .. .. .. ..@ ringDir: int -1
.. .. .. .. .. .. ..@ coords : num [1:5, 1:2] 5 5 6 6 5 4 3 3 4 4
.. .. .. ..@ plotOrder: int [1:2] 1 2
.. .. .. ..@ labpt : num [1:2] 6.13 3.93
.. .. .. ..@ ID : chr "s3/4"
.. .. .. ..@ area : num 10
..@ plotOrder : int [1:3] 1 2 3
..@ bbox : num [1:2, 1:2] 1 2 10 5
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2] "x" "y"
.. .. ..$ : chr [1:2] "min" "max"
..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
.. .. ..@ projargs: chr NA
Whoa.
Note that we can access the information in the slots using the @
. For example
SpDf@bbox
min max
x 1 10
y 2 5
However, it is strongly encouraged to use the provided functions and methods instead, like
bbox(SpDf)
min max
x 1 10
y 2 5
To look at the attribute table:
as.data.frame(SpDf) # instad of: SpDf@data
numbers characters
s1 2 B
s2 3 A
s3/4 1 C
The good news is that typically we do not have to create Spatial
family objects as tediously as we did above. It is much more common that we work with already existing spatial data.
rgdal
In order to read those into R and turn them into Spatial*
family objects we rely on the rgdal
package. It provides us direct access to the powerful GDAL library from within R.
We can read in and write out spatial data using:
readOGR()
and writeOGR()
(for vector)
readGDAL()
and writeGDAL()
(for raster/grids)
The parameters provided for each function vary depending on the exact spatial file type you are reading. We will take an ESRI shapefile as an example. A shapefile - as you know - consists of various files, and R expects all those files to be in one directory.
When reading in a shapefile, readOGR()
expects at least the following two arguments:
datasource name (dsn) # the path to the folder that contains the files
# Note that this is a path to the folder, not a filename!
layer name (layer) # the shapefile name without extension
# Note that this is not a path but just the name of the file!
For example, if I have a shapefile called Philadelphia.shp
and all its associated files (like .dbf, .prj, .shx) in a directory called PH
on my desktop, and I have my working directory set to my desktop folder, my command to read this shapefile would look like this:
readOGR(dsn = "PH", layer = "Philadelphia")
or in short:
readOGR("PH", "Philadelphia")
rgdal
package.R_Workshop
folder and it contains the materials you downloaded and unzipped earlier.PhillyTotalPopHHinc.shp
into an object called philly
. Make sure you understand the directory structure.summary()
, class()
, str("philly", max.level = 2)
,names()
head()
philly[philly$totalPop > 5000,]
spplot(philly, "medHHinc")
GDAL supports over 200 raster formats and vector formats. Use
ogrDrivers()
and gdalDrivers()
(without arguments) to find out which formats your rgdal
install can handle.
Rasters are much more compact that vectors. Because of their regular structure the coordinates do not need to be recorded for each pixel or cell in the rectangular extent. A raster is defined by:
If necessary, the coordinates for any cell can be computed.
In sp
the GridTopology
class is the key element of raster representations5. It contains
A simple grid can be built like this:
# specify the grid topology with the following parameters:
# - the smallest coordinates for each dimension, here: 0,0
# - cell size in each dimension, here: 1,1
# - number of cells in each dimension, here: 5,5
gtopo <- GridTopology(c(0,0), c(1,1), c(5,5)) # create the grid
datafr <- data.frame(runif(25)) # make up some data
SpGdf <- SpatialGridDataFrame(gtopo, datafr) # create the grid data frame
summary(SpGdf)
Object of class SpatialGridDataFrame
Coordinates:
min max
[1,] -0.5 4.5
[2,] -0.5 4.5
Is projected: NA
proj4string : [NA]
Grid attributes:
cellcentre.offset cellsize cells.dim
1 0 1 5
2 0 1 5
Data attributes:
runif.25.
Min. :0.03807
1st Qu.:0.31346
Median :0.51471
Mean :0.55202
3rd Qu.:0.75978
Max. :0.95975
And it can be plotted like this:
spplot(SpGdf, sp.layout = c("sp.points", SpatialPoints(coordinates(SpGdf))))
Alternatively you can use the raster
package, which works slightly differently.
The raster
package is a major extension of spatial data classes to access large rasters and in particular to process very large files. It includes object classes for RasterLayer
, RasterStacks
, and RasterBricks
, functions for converting among these classes, and operators for computations on the raster data. Conversion from sp
type objects into raster
type objects is easy.
If we wanted to do the same as above, namely creating the same raster object from scratch we would do the following:
# specify the RasterLayer with the following parameters:
# - minimum x coordinate (left border)
# - minimum y coordinate (bottom border)
# - maximum x coordinate (right border)
# - maximum y coordinate (top border)
# - resolution (cell size) in each dimension
r <- raster(xmn=-0.5, ymn=-0.5, xmx=4.5, ymx=4.5, resolution=c(1,1))
r
class : RasterLayer
dimensions : 5, 5, 25 (nrow, ncol, ncell)
resolution : 1, 1 (x, y)
extent : -0.5, 4.5, -0.5, 4.5 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
So here we have created an object of type RasterLayer
, as compared to above, where we created an object of type GridTopology
.
Compare this to the output from above and note something important here: Different from the grid object we generated from scratch, this raster object has a CRS defined! If the crs argument is missing when creating the Raster object, the x coordinates are within -360 and 360 and the y coordinates are within -90 and 90, the WGS84 projection is used by default!
Good to know.
To add some values to the cells we could the following. Be aware that different from the GridTopology
object above, which is converted to a SpatialGridDataFrame
when adding values, this here object remains a RasterLayer
.
class(r)
[1] "RasterLayer"
attr(,"package")
[1] "raster"
r <- setValues(r, runif(25))
class(r)
[1] "RasterLayer"
attr(,"package")
[1] "raster"
plot(r); points(coordinates(r), pch=3)
(See the rasterVis
package for more advanced plotting of Raster*
objects.)
We can use readGDAL
from the sp
package to read in a raster file. It will require one parameter as a minimum, which is the filename of the raster. So let’s do this.
R_Workshop
folder and it contains the materials you downloaded and unzipped earlier.dem <- readGDAL("DEM_10m/bushkill_pa.dem")
6summary()
Alternatively we can use the raster
package to read in the same file:
raster
librarydem.r <- raster("DEM_10m/bushkill_pa.dem")
dem.r
and compare to the abovecontour(dem.r)
contour(dem)
RasterLayer objects can also be created from a simple matrix.
class(volcano)
[1] "matrix"
volcano.r <- raster(volcano)
class(volcano.r)
[1] "RasterLayer"
attr(,"package")
[1] "raster"
This package becomes interesting when you need to work for example with Landsat images that have multiple bands and want to perform map algebra.
There are currently about 140 R packages on CRAN for reading, visualising, and analysing (geographical) spatial data. I recommend to visit that website if you are exploring spatial analysis with R.
From R Bivand (2011) Introduction to representing spatial objects in R↩
Note that this is not the only way spatial objects are conceptualized in R. Other spatial packages may use their own class definitions for spatial data (for example spatstat
). sp
provides conversion functions for many of those formats.↩
Coordinates should be of type double and will be promoted if not.↩
from Edzer Pebesma Roger S. Bivand Feb 2005: S Classes and Methods for Spatial Data: the sp Package↩
There is also a SpatialPixels
object which stores grid topology and coordinates of the actual points.↩
From (https://www.e-education.psu.edu/geog482fall2/c7_p8.html)↩