An introduction to R for Spatial data

Robin Lovelace - see rpubs.com/robinlovelace for slides

University of Leeds, 2016-04-20

Introduction

cdrc

This course is brought to you by the Consumer Data Research Centre (CDRC) and is funded by the ESRC’s Big Data Network.

Packages

This R code installs and loads all the packages you need:

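pkgs <- c(
  "sp",      # core classes and methods for spatial data
  "rgdal",   # reads and writes spatial data via GDAL (can be tricky to install)
  "rgeos",   # geometry operations via GEOS
  "ggmap",
  "tmap",
  "leaflet",
  "dplyr"
)
install.packages(pkgs)
lapply(pkgs, library, character.only = TRUE)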
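Running install.packages() in every session re-downloads packages that are already present. As a minimal sketch using only base R, the snippet below reports which packages are still missing so that only those need installing. The helper name missing_pkgs and the placeholder package name are illustrative assumptions, not part of the course code.

```r
# Hypothetical helper (not from the course materials): return the subset of
# `pkgs` that is not yet installed in any library on this machine.
missing_pkgs <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# "stats" ships with R; the second name is a made-up placeholder.
to_install <- missing_pkgs(c("stats", "notARealPackage123"))
print(to_install)

# With the course's `pkgs` vector one could then run:
# install.packages(missing_pkgs(pkgs))
```

The library() calls still need to run in every session; only the installation step is skipped.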

R code = reproducible!

A bit about R

Why R?

R is up and coming I

scholar-searches1

Source: r4stats.com

II - Increasing popularity in academia

scholar-searches2

Source: r4stats.com

III - R vs Python

Source: Data Camp

IV - employment market

jobs

Source: Revolution Analytics

Why R for spatial data?

“With the advent of ‘modern’ GIS software, most people want to point and click their way through life. That’s good, but there is a tremendous amount of flexibility and power waiting for you with the command line. Many times you can do something on the command line in a fraction of the time you can do it with a GUI.” (Sherman 2008, p. 283)

Why R for spatial data II

It can take data in a wide range of formats. For example, a MySQL database dump gives you this:

LINESTRING(-1.81 52.55,-1.81 52.55)
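The string above is in well-known text (WKT) format. As a rough sketch of how such text can become usable coordinates, the base R snippet below parses a single LINESTRING into a two-column matrix. The function name parse_linestring is a hypothetical helper written for this illustration, and it assumes simple 2D input with no nesting.

```r
# Hypothetical helper: parse a 2D WKT LINESTRING into an x/y matrix (base R only).
parse_linestring <- function(wkt) {
  # Keep only the text between the outer brackets.
  inner <- sub("^[[:space:]]*LINESTRING[[:space:]]*\\((.*)\\)[[:space:]]*$", "\\1", wkt)
  # Split into "x y" pairs, then split each pair on whitespace.
  pairs <- trimws(strsplit(inner, ",")[[1]])
  xy <- do.call(rbind, lapply(strsplit(pairs, "[[:space:]]+"), as.numeric))
  colnames(xy) <- c("x", "y")
  xy
}

xy <- parse_linestring("LINESTRING(-1.81 52.55,-1.81 52.55)")
print(xy)
```

With the packages loaded earlier, sp::Line(xy) would turn this matrix into a line object, and rgeos::readWKT() can read WKT strings directly.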

And if you get stuck? Just ask!

Example: I could not load a MultiFeature GeoJSON. So I asked: http://stackoverflow.com/q/29066198/1694378

Visualisation

Why focus on visualisation?

If you cannot visualise your data, it is very difficult to understand your data. Conversely, visualisation will greatly aid in communicating your results.

“Human beings are remarkably adept at discerning relationships from visual representations. A well-crafted graph can help you make meaningful comparisons among thousands of pieces of information, extracting patterns not easily found through other methods. … Data analysts need to look at their data, and this is one area where R shines.” (Kabacoff, 2009, p. 45)

Maps, the ‘base graphics’ way

base graphics

Source: Cheshire and Lovelace (2014) - available online

The ‘ggplot2’ way

Source: This tutorial!

R in the wild 1: Maps of all census variables for local authorities

census

R in the wild 2: Global shipping routes in the late 1700s

Source: R-Bloggers

R in the wild 3: The national propensity to cycle tool (NPCT)

See https://github.com/npct/pct-shiny

R in the wild 4: Mapping bicycle crashes in West Yorkshire I

Source: Lovelace, Roberts and Kellar (2015)

R in the wild 4: Mapping bicycle crashes in West Yorkshire II

R in the wild 5: Infographic of housing project finances

Flexibility of ggplot2 - see robinlovelace.net

Getting up and running for the data labs

Before progressing further: Any questions?

Data and code are all available online from a GitHub repository. Click “Download ZIP” to download all the test data, ready to proceed:

https://github.com/StudentDataLabs/VisionZeroInnovationLab

R resources

Reproducible code for an academic paper on cycle crashes: https://github.com/Robinlovelace/bikeR

General tutorial: Introduction to Visualising Spatial Data with R: https://github.com/Robinlovelace/Creating-maps-in-R

Early preview of book on Efficient R Programming: https://csgillespie.github.io/efficientR/

Reading in data

There are a number of ways to read in a shapefile, but not all are the same!

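# Three ways to read the same shapefile:
lnd <- rgdal::readOGR(dsn = "data", layer = "london_sport")  # folder + layer name (no extension)
lnd2 <- raster::shapefile("data/london_sport.shp")           # path to the .shp file
lnd3 <- tmap::read_shape("data/london_sport.shp")            # path to the .shp file
proj4string(lnd)        # coordinate reference system of the result
identical(lnd3, lnd2)   # do these two readers return exactly the same object?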

The structure of spatial data in R

lnd <- tmap::read_shape("data/london_sport.shp")
## NOTE: rgdal::checkCRSArgs: no proj_defs.dat in PROJ.4 shared files
# str(lnd[1,])
slotNames(lnd)
## [1] "data"        "polygons"    "plotOrder"   "bbox"        "proj4string"
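To see where those five slots come from without needing the London shapefile, here is a minimal sketch that builds a one-polygon object by hand with sp. The unit square, the ID "a" and the name column are made-up illustrative values.

```r
library(sp)

# One ring: a closed unit square (first and last vertex are the same).
sq <- Polygon(cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)))

# Geometry only: a list of features, each with an ID.
polys <- SpatialPolygons(list(Polygons(list(sq), ID = "a")))

# Geometry + attributes: data frame rows are matched to feature IDs by row name.
spdf <- SpatialPolygonsDataFrame(
  polys,
  data = data.frame(name = "square", row.names = "a")
)

slotNames(spdf)
```

The attribute table lives in the data slot and the geometry in polygons; the remaining slots hold plotting order, bounding box and coordinate reference system.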

Classes of spatial data in R

class(lnd)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"

There is also:

Finding the centrepoint

coords <- coordinates(lnd)
plot(lnd[1:2,])
points(coords)

Bonus points: What class is coords? How can you make it Spatial?
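One possible answer, sketched on a hand-made square rather than on lnd so that it runs without the course data: coordinates() returns a plain matrix, and SpatialPoints() wraps it in a Spatial class. The square and its ID are illustrative assumptions.

```r
library(sp)

# Illustrative geometry: a single closed unit square.
sq <- Polygon(cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)))
polys <- SpatialPolygons(list(Polygons(list(sq), ID = "square")))

# One row per feature, two columns (x and y): a plain matrix, not a Spatial object.
coords <- coordinates(polys)
is.matrix(coords)

# Wrap the matrix in a Spatial class, reusing the polygons' CRS object.
pts <- SpatialPoints(coords, proj4string = polys@proj4string)
class(pts)
```

For the course data, the same pattern would be SpatialPoints(coordinates(lnd), proj4string = lnd@proj4string).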

Extracting coordinates

We can see the individual vertices of a polygon:

v <- lnd@polygons[[1]]@Polygons[[1]]@coords
plot(v)
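The line above pulls out the first ring of the first feature only. As a sketch of how the same nested slots could be walked for every feature, the snippet below uses two hand-made shapes with made-up coordinates and IDs in place of lnd.

```r
library(sp)

# Two illustrative rings: a unit square and a triangle (both closed).
sq  <- Polygon(cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)))
tri <- Polygon(cbind(c(2, 3, 2.5, 2), c(0, 0, 1, 0)))
polys <- SpatialPolygons(list(
  Polygons(list(sq),  ID = "square"),
  Polygons(list(tri), ID = "triangle")
))

# Each feature (Polygons) holds one or more rings (Polygon), and each ring
# stores its vertices in the `coords` slot; stack the rings per feature.
vertices <- lapply(polys@polygons, function(feature) {
  do.call(rbind, lapply(feature@Polygons, function(ring) ring@coords))
})
names(vertices) <- sapply(polys@polygons, function(feature) feature@ID)

# Number of vertices per feature.
sapply(vertices, nrow)
```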

Leaflet - saving maps

library(leaflet)
lnd84 <- readRDS('data/lnd84.Rds')

m <- leaflet() %>% addTiles() %>% addPolygons(data = lnd84)
htmlwidgets::saveWidget(m, file = "~/repos/robinlovelace.github.io/m.html")

This makes the map available here: http://robinlovelace.net/m.html
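The same pattern can be tried without lnd84.Rds or the author's repository path. The sketch below draws a made-up triangle near London and saves it to a temporary folder; the coordinates and output location are illustrative assumptions.

```r
library(leaflet)

# Illustrative triangle in longitude/latitude (made-up coordinates).
m <- leaflet() %>%
  addTiles() %>%
  addPolygons(lng = c(-0.2, 0.0, -0.1), lat = c(51.4, 51.4, 51.6))

# Save to a temporary location; selfcontained = FALSE writes the JavaScript
# and CSS to a folder next to the HTML file instead of embedding them.
out <- file.path(tempdir(), "m.html")
htmlwidgets::saveWidget(m, file = out, selfcontained = FALSE)
file.exists(out)
```

Opening the saved file in a browser shows the interactive map; putting the file and its accompanying folder on a web server publishes it.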

Reading in student labs data