The following R mapping packages are used in this tutorial.
In addition the usual packages dplyr, ggplot2, tidr are all used extensively.
A more complete list of packages can be found on the Cran repository. y
Data for the tutorial is loaded from the Robin Lovelace tutorial folder.
The initial examples are all based on the London_sports shapefile
setwd("~/SpatialAnalysis/IntroToSpatialR/Creating-maps-in-R-master")
ldn = readOGR(dsn='data', 'london_sport' )Gographic objects can be subsetted using brackets in the same way other objects can. For example the code to find the high sports-participation areas of london is as follows.
First we create a new filter where the sports participation rate exceeds 25% of the population.
sel = ldn$Partic_Per > 25Next we plot a grey filled background map of London and then overlay a turquoise filled plot of London zones where the participation levels exceed 25%.
In this example London is split into quadrants on the basis of latitude and longitude. London Wards are coloured based on which quadrant the centre of each shape falls into.
To produce the graphic above the coordinates of the centre of london were used to generate a dividing north-south line and a dividing east-west line.
Two vectors were created which contained the results of a simple test against each ward, whether the centre of each ward lay to the east of the centre line and whether the ward lay to the north of the centre of london.
The two computed vectors were used to create a new field in the Spatial object @data slot with a description of the ward location; NE, NW, SE, SW.
Finally the wards were plotted on the outline of London. A colour map was used to fill each ward based on whether its centre lies to the nort or east of the centre of london.
Reprojecting and joining/clipping are fundamental GIS operations. Firstly we join non-spatial data to spatial data so that it can be mapped. Finally, we will cover spatial joins, whre information from two spatial objects is combined based on spatial location.
Spatial data types are based on the sp package which provides a number of classes and methods for dealing with spatial data types. Attributes can be attached to the spatial objects in ___DataFrame variants of each object class.
| data type | class | attributes | contains |
|---|---|---|---|
| points | SpatialPoints | No Spatial | |
| points | SpatialPointsDataFrame | data.frame | SpatialPoints |
| multipoints | SpatialMultiPoints | No | Spatial |
| multipoints | SpatialMultiPointsDataFrame | data.frame | SpatialMultiPoints |
| pixels | SpatialPixels | No | SpatialPoints |
| pixels | SpatialPixelsDataFrame | data.frame | SpatialPixels SpatialPointsDataFrame |
| full grid | SpatialGrid | No | SpatialPixels |
| full grid | SpatialGridDataFrame | data.frame | SpatialGrid |
| line | Line | No | |
| lines | Lines | No | Line list |
| lines | SpatialLines | No | Spatial, Lines list |
| lines | SpatialLinesDataFrame | data.frame | SpatialLines |
| polygons | Polygon | No | Line |
| polygons | Polygons | No | Polygon list |
| polygons | SpatialPolygons | No | Spatial, Polygons list |
A SpatialPoints class can only be created from a matrix, so even if the original data is a data.frame this must be converted as in the example below.
df = data.frame(x=1:3, y=c(1/2, 2/3, 3/4))
mat = as.matrix(df)
sp1 = SpatialPoints(coords = mat)Attributes can be attached to each of the spatial objects with an associated data frame.
df2 = data.frame(name=c('one','two','three'))
spdf = SpatialPointsDataFrame(sp1, df2)
summary(spdf)The proj4string identifies the coordinate reference system used by the dataset. The CRS can be manipulated by manually setting the string value. It should be noted however that simply changing the CRS does not reproject the data.
Atribute joins link additional pieces of information to the polygons via the spatial object dataframe. In this example recorded crimes are linked to London boroughs.
There are three types of spatial joins: many-to-one, one-to-many, one-to-one. This example uses many-to-one spatial joins to link transport infrastructure points to each London borough.
#Load london stations dataset
setwd("~/SpatialAnalysis/IntroToSpatialR/Creating-maps-in-R-master")
stations = readOGR(dsn = 'data', layer = 'lnd-stns')Check the coordinate systems of london stations and the london boroughs datasets.
proj4string(stations)
proj4string(ldn)The problem is that both datasets use different CRS. Since OSGB is the official coordinate reference system for the UK the stations dataset will be reprojected into that CRS.
stations27700 = spTransform(stations, CRSobj = CRS(proj4string(ldn)))
stations = stations27700
rm(stations27700)
#plot london and then overlay station points
plot(ldn)
plot(stations, add=TRUE) The grqph clearly shows that there are far more stations identified than lie within the boroughs. to simplify hte dataset points are trimmed to only those that fall within greater London. There are two functions that can be used to clipp to the greader london boundary: sp::over, rgeos::gIntersects.
#Clip to greater london using normal square bracket notation
stations_backup = stations
stations = stations_backup[ldn,]
#replot to demonstrate that the clipping has worked
plot(ldn)
plot(stations, add=TRUE)Now that the station points have been clipped to the greater london area aggregations can be performed on the datasets. These can either be done using highly summarised notation e.g. aggregate or in a slightly longer, but easier to read form using dplyr and by creating intermediate join tables.
The following code uses the over function to create a join table based on spatial overlap. Dplyr can then be used as normal to handle aggregation functions. Finally the results are joined back onto a master data.frame associated with the spatial points.
In the example below spatial joins using the over function are used to count the number of stations appearing in each london borough.
#set the borough name for each station point
stnbor = over(stations,ldn)
#the station order didn't change so re-add the station code to the dataset
stnbor = mutate(stnbor[1], stations$NUMBER)
colnames(stnbor) = c('ons_label', 'StationNumber')
#aggregate counts over borough and join back to borough data frame in london dataset
stn_agg = stnbor %>% group_by(ons_label) %>% summarise(StationCount = n())
ldn@data = left_join(ldn@data, stn_agg)
#now plot a density map using the tmap function
qtm(ldn, 'StationCount', text = 'name')ggmap requires spatial data to be supplied as a data.frame using fortify() from the package. In fortifying the data the attribute information is lost. It must therefore be added back by joining per the example below.
#Load mapTools to use fortify
ldn_f = fortify(ldn)
#data lost during the fortification can be joined back onto new dataframe.
#fortify gives each row of the old object an ID and order is preserved.
#It therefore works to create a new ID to join on in the original dataset based on row number
ldn$id = row.names(ldn)
ldn_f = left_join(ldn_f, ldn@data, by="id")
head(ldn_f)
#Now we can map using ggplot2
map = ggplot(ldn_f, aes(long, lat, group=group, fill=Partic_Per)) +
geom_polygon() +
coord_equal() +
labs(x = 'Easting (m)', y='Northin (m)',
fill='% Sports \nParticipation') +
ggtitle("London Sports Participation") +
scale_fill_gradient(low='grey', high='light blue')+
theme_minimal()
mapThe R package leaflet provides R bindings to the Leaflet javascript library. This allows the creation and development of interactive web maps.
library(leaflet)
#leaflet requires an object with WG84 projection
#get EPSG codes
EPSG = make_EPSG()
EPSG[grepl("WGS 84$", EPSG$note), ] #Search for WGS 84 code
#Store proj4 string
WGS84Str = paste0("+init=epsg:" ,
EPSG[grepl("WGS 84$", EPSG$note), ])
#reproject london to WGS84
ldn84 = spTransform(ldn, CRS(WGS84Str))
#create and display the leaflet object
pal = colorQuantile(palette ="YlOrRd",
domain = ldn84$Partic_Per,
n= 3)
m = leaflet(ldn84) %>%
#addTiles() %>%
addProviderTiles("Stamen.TonerLite", group = "Toner Lite") %>%
addPolygons(
stroke = FALSE,
color = ~pal(Partic_Per)
) %>%
addLegend("bottomright", pal = pal, values = ~Partic_Per,
title = "London Sports participation rates",
labFormat = labelFormat(prefix = "%"),
opacity = 1)
m1: question to colour wards within 10km of centre of london or where part of their mass lies within 10km of London. see page 8 of tutorial.
Only the london data is required for this exercise
dsn = '~/SpatialAnalysis/IntroToSpatialR/Creating-maps-in-R-master/data'
setwd(dsn)
ldn = readOGR(dsn='london_sport.shp', layer= 'london_sport')To get coordinates for the centre of London we first merge (union) all of the shapes together to retrieve the overall shape. Once all shapes have been merged we can get the coordinates of the centre of the shape.
ldn_bounds = gUnaryUnion(ldn)
ldn_centre = gCentroid(ldn_bounds)
plot(ldn_bounds)
plot(ldn_centre, add=TRUE)There are several ways to define ‘within’ a 10km radius of the centre of London. Perhaps simplest is to test whether the borough centroid is within a 10km radius. This is done with the funciton rgeos::gWithinDistance.
Note that all distances are relative to the coordinate system in use. In this case the default coordinate system is transverse mercator projection. The important part of the return string below is the +units=m part which shows units in use. In this case meters.
ldn@proj4stringThe distance between the centre of london and Bromley, the first geometry in the dataset is found to be 18.17km, hence the gWithinDistance function returns a false for this test.
p1 = ldn[1,] #get first polygon from dataset
gDistance(ldn_centre, gCentroid(p1))
gWithinDistance(ldn_centre,gCentroid(p1), dist = 10000)
plot(ldn_bounds)
plot(ldn_centre, add=TRUE)
plot(p1, add=TRUE, col="salmon")
plot(gCentroid(p1), add=TRUE)This method is applied to all polygons in the dataset. Any boroughs where the centroid is within a 10km radius of the geographic centroid of London are coloured green.
max_dist =10000 #radius in metres
within10k = c(gWithinDistance(ldn_centre, gCentroid(ldn, byid = TRUE), dist=max_dist, byid = TRUE))
plot(ldn_bounds)
plot(ldn, add=TRUE)
plot(ldn[c(within10k), ], add=TRUE, col='lightblue')
plot(gCentroid(ldn, byid = TRUE), add=TRUE)
plot(ldn_centre, col='red', add=TRUE)All London boroughs within 10km of the geographic centre. Using centroid method.
As well as testing based on distance calculations it is also possible to create a buffer and test for inclusion within this geographic shape.
A buffer facilitates alternative approaches including testing for any part of a polygon within the 10km radius or testing for complete inclusion within the radius. Both approaches are shown below for comparison.
ctr_buffer = gBuffer(ldn_centre, width=max_dist) #create a 10km buffer around the centre
within10k_any = ldn[ctr_buffer, ] #test to see if any points lie within the 10km radius
within10k_whole = as.vector(gCovers(ctr_buffer, ldn, byid = T)) #Use gCovers to avoid problems with boundaries
outside10k = as.vector(gDisjoint(ctr_buffer, ldn, byid=T)) #Test for boroughs where entire shape is outside centre
plot(ldn)
plot(ldn[outside10k,], add=T, col='salmon')
plot(within10k_any,col='lightgrey', add=TRUE)
plot(ldn[c(within10k), ], col='lightblue', add=TRUE)
plot(ldn[within10k_whole, ],col='darkblue', add=TRUE)
plot(gCentroid(ldn, byid=TRUE), add=T)
plot(ctr_buffer, add=T, border='red', lwd=2)
plot(ldn_centre, add=T, col="red", lwd=3)rgdal also requires installing the following on linux: