Practical 5 - Introduction to Spatial Analysis Methodologies

In this practical you will be putting into practice the various data handling, topographic and thematic mapping skills you have learned over the last few weeks to carry out a piece of spatial analysis in ArcGIS or R (the choice is yours!). I’ve been deliberately vague with some of the instructions as you should have picked up all of the basics in previous weeks.

I will post the answers to the task on Moodle in a couple of days.

Task - Analysing the CASA Treasure Hunt!

You all remember the CASA Treasure Hunt, right? Here’s refresher…

\(~\)

Well, I asked groups to record their routes as they went around. Sadly, none of recorded routes I received came out that well this year - boo!.

Fortunately the winning group from last year (Team 7) tracked their route and the data were useable. Here’s where they went:

Team 7’s Route in 2016

library(tmap)
library(geojsonio)

#Here's their data:
hunt <- geojson_read("https://www.dropbox.com/s/wa2ip35tcmt93g3/Team7.geojson?raw=1", what = "sp")

#And here's where they went...
tmap_mode("view")
tm_shape(hunt) +
  tm_lines(col = "green", lwd = 4)

\(~\)

I would like you to analyse the route using a variety of spatial analysis functions available in either R or ArcGIS

Questions to Answer

How far did this group travel (according to the trace they recorded - I won’t expect you to try and clip this to public transport routes, although you can try if you like!)?
How many TFL stations did their route pass within 100 metres of?
How many points did they score (I’m not including bonus points here, just total points) based on treasure hunt locations they managed to get within 300 metres of?
Which Wards did they pass through that had a) the lowest and b) the highest rates of Male Life Expectancy (according to the data you have been using for the last few weeks)?
Taking the average of all Wards that the group passed through, what was the average life expectancy at birth for babies born in those wards along the whole route? This can be both Male and Female life expectancies.

How to tackle the challenge

1 - Choose which side of the force you are on (Dark Side or Light Side):

Or

2 - Download your data

Team 7’s Route

GeoJSON - https://www.dropbox.com/s/wa2ip35tcmt93g3/Team7.geojson?raw=1

Shapefile - https://www.dropbox.com/s/chzovfrvw9qsrsi/Team7.zip?dl=0

The location of Tube / Rail Stations in London

These can be obtained from here: https://www.doogal.co.uk/london_stations.php

#You can have this for free, R fans. Reading XML can be a pain in R, you will need the 'layer' information, which is contained in the <name> tag in a KML file...
library(rgdal)
tubestations <- readOGR("https://www.doogal.co.uk/LondonStationsKML.ashx", "London stations with zone information")

## OGR data source with driver: KML 
## Source: "https://www.doogal.co.uk/LondonStationsKML.ashx", layer: "London stations with zone information"
## with 641 features
## It has 2 fields

The treasure hunt locations

A raw list with the points can be downloaded from here: https://www.dropbox.com/s/v66l4cx7aia9jlo/huntLocations.csv?raw=1

You will notice that the raw list doesn’t contain any spatial information at all. You will need to add this. Now, you can do it yourself, however below is quite a cool way for doing it automatically using the Google Places API

R Code for GeoCoding a list of places with no spatial reference:

First read a list of treasure hunt locations and their points scores from my dropbox

library(tidyverse)

huntaddresses <- read_csv("https://www.dropbox.com/s/v66l4cx7aia9jlo/huntLocations.csv?raw=1")

Now geocode them using the Google Maps API and a neat bit of code plundered from here: https://gist.github.com/josecarlosgonz/6417633

#code here lifted directly from - https://gist.github.com/josecarlosgonz/6417633 

#library the required packages
library(RCurl)
library(RJSONIO)
library(plyr)
library(tidyverse)

#highlight this whole block and create this function to access the Google Places API
url <- function(address, return.call = "json", sensor = "false") {
  root <- "http://maps.google.com/maps/api/geocode/"
  u <- paste(root, return.call, "?address=", address, "&sensor=", sensor, sep = "")
  return(URLencode(u))
}

#highlight this whole block and create this function to geocode some places just from a random list of treasure hunt locations
geoCode <- function(address,verbose=FALSE) {
  if(verbose) cat(address,"\n")
  u <- url(address)
  doc <- getURL(u)
  x <- fromJSON(doc,simplify = FALSE)
  if(x$status=="OK") {
    lat <- x$results[[1]]$geometry$location$lat
    lng <- x$results[[1]]$geometry$location$lng
    location_type  <- x$results[[1]]$geometry$location_type
    formatted_address  <- x$results[[1]]$formatted_address
    return(c(lat, lng, location_type, formatted_address))
    Sys.sleep(0.5)
  } else {
    return(c(NA,NA,NA, NA))
  }
}

#now use the geoCode() function (which calls the URL function) to geocode our list of places

#for loop to cycle through every treasure hunt location
i=1
for(i in 1:nrow(huntaddresses)){
  # Every nine records, pause 3 seconds so that the API doesn't kick us off...
  if(i %% 9 == 0) Sys.sleep(3)
  #now create a temporary list of useful elements 
  tempdf <- as.list(geoCode(huntaddresses[i,1]))
  #and write these back into our dataframe
  huntaddresses[i,3] <- tempdf[1]
  huntaddresses[i,4] <- tempdf[2]
  huntaddresses[i,5] <- tempdf[4]
}

# rename the columns
names(huntaddresses)  <- c("Location","Points","lat","lon","GoogleAddress")
head(huntaddresses)

#write a new .csv file to your working directory
write_csv(huntaddresses, "huntaddresses.csv")

Now, for whatever reason, sometimes the API doesn’t return all of the addresses (seems to depend on the time of day you run the query!) If you have not managed to get all of the addresses from the first run of this code, then you can download my effort from here:

https://www.dropbox.com/s/2cbu2ux9ddy9c0l/huntaddresses.csv?dl=0

huntaddresses <- read_csv("https://www.dropbox.com/s/2cbu2ux9ddy9c0l/huntaddresses.csv?raw=1")

3 - Tips for Success

a) Tips for the (D) Side

Tools you might need to use: When I say might, I mean almost certainly!

Editor toolbar Spatial Join (one-to-many and one-to-one)
KML to Layer
Buffer
Repair Geometry
Intersect
Dissolve
Project
Select by Attribute

Things you will need to consider:

Think carefully about choosing the appropriate ward boundaries to match your data as there will be several candidates for you to choose from. Don’t use generalised or clipped boundaries and select the appropriate time period
Your data does not contain any information for the wards in the City of London so you will need to either edit your ward boundaries (using the editor toolbar – see the appendix of this document) and merge all of the wards in the city of London into a new City of London object. While your editing session is open, you will also be able to edit the attribute table so that the code for the City of London matches the code in your data.
In order to obtain variable summaries, you will probably need to make use of the ‘Summarize’ facility when right-clicking a field in an attribute table
If you can’t find a tool, use the search facility in Arc
When you carry out a spatial join, think about whether you should use a one-to-one or one-to-many join. You will probably need to use both in this task.
When you carry out a spatial join, you can also specify a merge rule by right-clicking on the field you are interested in. The default is ‘First’ which is not that good for most joins – Mean and Sum will probably be more useful in this task! It might also be useful to not join all fields, but only those of use, deleting others with the ‘X’ button.

b) Tips for the ebels

Convert all of your spatial objects to sf (Simple Features) objects - https://cran.r-project.org/web/packages/sf/ it will make some of the aggregation easier and you will be able to make use of some of the neat geos_uniary functions such as st_buffer() for drawing buffers or the geos_binary_pred functions such as st_intersects for detecting where your points interset polygons, for example.
Use some of these online guides to help you:

http://strimas.com/r/tidy-sf/

https://bookdown.org/robinlovelace/geocompr/

https://cran.r-project.org/web/packages/sf/vignettes/sf1.html

https://cran.r-project.org/web/packages/sf/vignettes/sf2.html

https://cran.r-project.org/web/packages/sf/vignettes/sf3.html

https://cran.r-project.org/web/packages/sf/vignettes/sf4.html

\(~\)

Appendix – a) Merging the City of London Wards into a single zone: D(Arc)side GIS Style

Start ArcMap and add your 2011 Ward boundary data (downloaded from either the UK Data Service or ONS Geoportal – don’t get the Census Merged Wards, just Wards E+W) to a new map document. Or the clean set I’ve put on Moodle…
Right-click on your layer and select Data > Export Data. Save your data as a new file geodatabase feature class and give it an appropriate name – we are doing this as we will be messing around with boundary polygons and it’s good practice not to ruin the original file!
Add your newly saved feature class your map. Open up the attribute table and sort the data by the column WD11CDO (right-click the header column and sort ascending). As the borough code for the City of London is 00AA, all of the Wards within will now be sorted to the top of the file.
Left-click the small box to the left of the first row – this should highlight the first row in blue (and the polygon on the map). Scroll down a few rows until you find the last City of London ward – this should be 00AAGB, Walbrook. Hold down shift and click the small box to the left of this row. All rows above should now be highlighted in blue (and this should show on the map).
We are now ready to dissolve these boundaries into a single City of London zone. If you have not already done so, activate the Editor toolbar: Customize > Toolbars > Editor. On the Editor toolbar, select Start Editing from the dropdown menu.
If you have other layers open in your map, you may have to select the correct layer to edit.
In the editor dropdown menu, ‘Merge’ should now be active. Click Merge. Select the first feature as your merge feature and click ‘OK’.
You should now see on your map that the ward boundaries have dissolved and in your attribute table, only the first row – Aldersgate, should be left.
In your attribute table, Click in the cells for WD11CD, WD11CDO and WD11NM and manually edit the codes so that for this new polygon, WD11CD = E09000001, WD11CDO = 00AA and WD11NM = City of London.
We are now ready to join your London data to your new boundaries
Add your LondonData.xls layer to your map (not the .csv file – this, for some unknown reason might cause problems!). Join the xls data to your newly edited boundary data, keeping only the matching records. Export your new layer as a new geodatabase feature class in a new file geodatabase for this week’s practical.
In the Editor drop down menu, select Save Edits and then Stop Editing.

Appendix – b) Merging the City of London Wards into a single zone: R(ebel) Style

library(sf)
#Download the Ward Boundaries from Moodle
LondonWards <- readOGR("LondonWardsBoundaries/LondonWardsNew.shp")

## OGR data source with driver: ESRI Shapefile 
## Source: "LondonWardsBoundaries/LondonWardsNew.shp", layer: "LondonWardsNew"
## with 649 features
## It has 7 fields

LondonWardsSF <- st_as_sf(LondonWards)

#cut the city out - this just happens to be the first 25 rows, usefully!
city <- LondonWardsSF[1:25,]
city$agg <- 1

#merge all of the boundaries together so we've got a single object
cityagg <- city %>% group_by(city$agg) %>% summarise()
plot(cityagg)

#disolve the ward boundaries and just leave the first one as the city of London
LondonWards_dis <- aggregate(LondonWardsSF, by = cityagg, FUN = first)

#merge them back together into a new object
LondonWards_new <- rbind(LondonWards_dis, LondonWardsSF[26:649,])
plot(LondonWards_new)

#finally, update the codes for the city of London so that they match your 
LondonWards_new[1,2] <- as.character("E09000001")
LondonWards_new[1,3] <- as.character("00AA")
LondonWards_new[1,4] <- as.character("City of London")

Practical 5 - Introduction to Spatial Analysis Methodologies

Adam Dennett

27 October 2017

Task - Analysing the CASA Treasure Hunt!

Team 7’s Route in 2016

I would like you to analyse the route using a variety of spatial analysis functions available in either R or ArcGIS

Questions to Answer

How to tackle the challenge

1 - Choose which side of the force you are on (Dark Side or Light Side):

Or

2 - Download your data

Team 7’s Route

The location of Tube / Rail Stations in London

The treasure hunt locations

R Code for GeoCoding a list of places with no spatial reference:

3 - Tips for Success

a) Tips for the (D) Side

b) Tips for the ebels

Appendix – a) Merging the City of London Wards into a single zone: D(Arc)side GIS Style

Appendix – b) Merging the City of London Wards into a single zone: R(ebel) Style