In this practical you will be putting into practice the various data handling, topographic and thematic mapping skills you have learned over the last few weeks to carry out a piece of spatial analysis in ArcGIS or R (the choice is yours!). I’ve been deliberately vague with some of the instructions as you should have picked up all of the basics in previous weeks.
I will post the answers to the task on Moodle in a couple of days.
You all remember the CASA Treasure Hunt, right? Here’s a refresher…
\(~\)
Well, I asked groups to record their routes as they went around. Sadly, none of the recorded routes I received came out that well this year - boo!
Fortunately the winning group from last year (Team 7) tracked their route and the data were useable. Here’s where they went:
library(tmap)
library(geojsonio)
#Here's their data:
hunt <- geojson_read("https://www.dropbox.com/s/wa2ip35tcmt93g3/Team7.geojson?raw=1", what = "sp")
#And here's where they went...
tmap_mode("view")
tm_shape(hunt) +
  tm_lines(col = "green", lwd = 4)
\(~\)
How far did this group travel (according to the trace they recorded - I won’t expect you to try and clip this to public transport routes, although you can try if you like!)?
How many TfL stations did their route pass within 100 metres of?
How many points did they score (I’m not including bonus points here, just total points) based on treasure hunt locations they managed to get within 300 metres of?
Which Wards did they pass through that had a) the lowest and b) the highest Male Life Expectancy (according to the data you have been using for the last few weeks)?
Taking all of the Wards that the group passed through along the whole route, what was the average life expectancy at birth for babies born in those wards? You can answer this for both Male and Female life expectancies.
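A hint for those of you working in R: once the route is loaded (as in the snippet above), the distance question boils down to a projection and a length calculation. Here is a minimal sketch, assuming you are happy to use sf and to project to British National Grid (EPSG:27700) so that your units are metres:
library(sf)
#convert the sp object to sf and project to British National Grid
hunt_sf <- st_transform(st_as_sf(hunt), 27700)
#total length of the recorded route, in metres
sum(st_length(hunt_sf))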
GeoJSON - https://www.dropbox.com/s/wa2ip35tcmt93g3/Team7.geojson?raw=1
Shapefile - https://www.dropbox.com/s/chzovfrvw9qsrsi/Team7.zip?dl=0
These can be obtained from here: https://www.doogal.co.uk/london_stations.php
#You can have this for free, R fans. Reading XML can be a pain in R; you will need the 'layer' information, which is contained in the <name> tag in a KML file...
library(rgdal)
tubestations <- readOGR("https://www.doogal.co.uk/LondonStationsKML.ashx", "London stations with zone information")
## OGR data source with driver: KML
## Source: "https://www.doogal.co.uk/LondonStationsKML.ashx", layer: "London stations with zone information"
## with 641 features
## It has 2 fields
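And while we’re at it, here’s a sketch of one way to tackle the stations question, assuming hunt_sf is the projected route from the earlier hint:
#convert the stations to sf and match the route's projection
stations_sf <- st_transform(st_as_sf(tubestations), 27700)
#which stations fall within 100 metres of the route?
near <- st_is_within_distance(stations_sf, hunt_sf, dist = 100)
#count the stations with at least one match
sum(lengths(near) > 0)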
A raw list with the points can be downloaded from here: https://www.dropbox.com/s/v66l4cx7aia9jlo/huntLocations.csv?raw=1
You will notice that the raw list doesn’t contain any spatial information at all, so you will need to add this. You could do it by hand, but below is quite a cool way of doing it automatically using the Google Maps Geocoding API.
library(tidyverse)
huntaddresses <- read_csv("https://www.dropbox.com/s/v66l4cx7aia9jlo/huntLocations.csv?raw=1")
#code here lifted directly from - https://gist.github.com/josecarlosgonz/6417633
#load the required packages
library(RCurl)
library(RJSONIO)
library(plyr)
library(tidyverse)
#highlight this whole block and create this function to access the Google Places API
url <- function(address, return.call = "json", sensor = "false") {
  root <- "http://maps.google.com/maps/api/geocode/"
  u <- paste(root, return.call, "?address=", address, "&sensor=", sensor, sep = "")
  return(URLencode(u))
}
#highlight this whole block and create this function to geocode some places just from a random list of treasure hunt locations
geoCode <- function(address, verbose = FALSE) {
  if (verbose) cat(address, "\n")
  u <- url(address)
  doc <- getURL(u)
  x <- fromJSON(doc, simplify = FALSE)
  if (x$status == "OK") {
    #pull the coordinates and the formatted address out of the JSON response
    lat <- x$results[[1]]$geometry$location$lat
    lng <- x$results[[1]]$geometry$location$lng
    location_type <- x$results[[1]]$geometry$location_type
    formatted_address <- x$results[[1]]$formatted_address
    return(c(lat, lng, location_type, formatted_address))
  } else {
    #if the API didn't return a result, fill the row with NAs
    return(c(NA, NA, NA, NA))
  }
}
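You can sanity-check the function on a single place before looping over the whole list (note that Google has since started requiring an API key for this endpoint, so without one this may just return NAs):
#quick test on one location
geoCode("Trafalgar Square, London")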
#now use the geoCode() function (which calls the URL function) to geocode our list of places
#for loop to cycle through every treasure hunt location
for (i in 1:nrow(huntaddresses)) {
  # Every nine records, pause 3 seconds so that the API doesn't kick us off...
  if (i %% 9 == 0) Sys.sleep(3)
  #geocode this location - note the double square brackets, which pull the raw
  #value out of the tibble rather than returning a 1x1 tibble
  tempdf <- as.list(geoCode(huntaddresses[[i, 1]]))
  #and write the useful elements back into our dataframe
  huntaddresses[i, 3] <- tempdf[1]
  huntaddresses[i, 4] <- tempdf[2]
  huntaddresses[i, 5] <- tempdf[4]
}
# rename the columns
names(huntaddresses) <- c("Location","Points","lat","lon","GoogleAddress")
head(huntaddresses)
#write a new .csv file to your working directory
write_csv(huntaddresses, "huntaddresses.csv")
If you don’t fancy running the geocoder yourself, a ready-geocoded version of the file can be downloaded from here: https://www.dropbox.com/s/2cbu2ux9ddy9c0l/huntaddresses.csv?dl=0
huntaddresses <- read_csv("https://www.dropbox.com/s/2cbu2ux9ddy9c0l/huntaddresses.csv?raw=1")
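Whichever route you take, you will want the locations as a spatial object before you can measure anything. A minimal sketch using the lon/lat and Points columns created above (this assumes every location geocoded successfully - drop any rows with NA coordinates first if not, and note hunt_sf is the projected route from the earlier hint):
#turn the geocoded table into an sf points object, then project to match the route
huntpoints <- st_as_sf(huntaddresses, coords = c("lon", "lat"), crs = 4326)
huntpoints <- st_transform(huntpoints, 27700)
#sum the points for locations within 300 metres of the route
visited <- lengths(st_is_within_distance(huntpoints, hunt_sf, dist = 300)) > 0
sum(huntpoints$Points[visited])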
Tools you might need to use (when I say might, I mean almost certainly!):
Things you will need to consider:
Think carefully about choosing the appropriate ward boundaries to match your data, as there will be several candidates for you to choose from. Don’t use generalised or clipped boundaries, and select the appropriate time period.
Your data does not contain any information for the wards in the City of London, so you will need to edit your ward boundaries (using the editor toolbar – see the appendix of this document) and merge all of the wards in the City of London into a new City of London object. While your editing session is open, you will also be able to edit the attribute table so that the code for the City of London matches the code in your data.
When you carry out a spatial join, you can also specify a merge rule by right-clicking on the field you are interested in. The default is ‘First’, which is not much good for most joins – Mean and Sum will probably be more useful in this task! It might also be useful to join only the fields you need, deleting the others with the ‘X’ button.
Convert all of your spatial objects to sf (Simple Features) objects - https://cran.r-project.org/web/packages/sf/ - it will make some of the aggregation easier and you will be able to make use of some of the neat geos_unary functions such as st_buffer() for drawing buffers, or the geos_binary_pred functions such as st_intersects() for detecting where your points intersect polygons (see the short toy example after the list of guides below).
Use some of these online guides to help you:
https://bookdown.org/robinlovelace/geocompr/
https://cran.r-project.org/web/packages/sf/vignettes/sf1.html
https://cran.r-project.org/web/packages/sf/vignettes/sf2.html
https://cran.r-project.org/web/packages/sf/vignettes/sf3.html
https://cran.r-project.org/web/packages/sf/vignettes/sf4.html
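To see what those two functions actually do, here is a toy example (the coordinates are made up, in British National Grid):
library(sf)
#two made-up points, 50 metres apart
p1 <- st_sfc(st_point(c(530000, 180000)), crs = 27700)
p2 <- st_sfc(st_point(c(530050, 180000)), crs = 27700)
#st_buffer (geos_unary) draws a 100 metre buffer around the first point
buf <- st_buffer(p1, dist = 100)
#st_intersects (geos_binary_pred) tests whether the second point falls inside it
st_intersects(p2, buf)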
\(~\)
library(sf)
#Download the Ward Boundaries from Moodle
LondonWards <- readOGR("LondonWardsBoundaries/LondonWardsNew.shp")
## OGR data source with driver: ESRI Shapefile
## Source: "LondonWardsBoundaries/LondonWardsNew.shp", layer: "LondonWardsNew"
## with 649 features
## It has 7 fields
LondonWardsSF <- st_as_sf(LondonWards)
#cut the city out - this just happens to be the first 25 rows, usefully!
city <- LondonWardsSF[1:25,]
city$agg <- 1
#merge all of the boundaries together so we've got a single object
cityagg <- city %>% group_by(agg) %>% summarise()
plot(cityagg)
#dissolve the ward boundaries into the single City of London object, keeping the first value of each attribute field
LondonWards_dis <- aggregate(LondonWardsSF, by = cityagg, FUN = first)
#merge them back together into a new object
LondonWards_new <- rbind(LondonWards_dis, LondonWardsSF[26:649,])
plot(LondonWards_new)
#finally, update the codes for the City of London so that they match your data
LondonWards_new[1,2] <- "E09000001"
LondonWards_new[1,3] <- "00AA"
LondonWards_new[1,4] <- "City of London"
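With the boundaries fixed up, the ward questions reduce to a spatial filter plus a summary. Here is a sketch, assuming you have already joined your life expectancy data onto LondonWards_new, that the male life expectancy column is called MaleLE (yours will almost certainly be named differently), and that hunt_sf is the projected route from the earlier hint:
#make sure the wards share the route's projection
wards_bng <- st_transform(LondonWards_new, 27700)
#which wards does the route pass through?
crossed <- wards_bng[lengths(st_intersects(wards_bng, hunt_sf)) > 0, ]
#lowest and highest male life expectancy among those wards
crossed[which.min(crossed$MaleLE), ]
crossed[which.max(crossed$MaleLE), ]
#and the average across every ward along the route
mean(crossed$MaleLE)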