In 1854 a cholera epidemic struck London. At the time, the common theory was that cholera is transmitted through the air. The English physician John Snow doubted this theory and was perhaps the first to perform a geospatial analysis. He plotted the locations of cholera deaths and water pumps and found the link between water and cholera. We will take a look at his results and learn how to use the leaflet package.
You need to download the Cholera Pumps & Deaths data from the link in the bibliography at the end of this article and save the .csv file locally.
# Import Data
deaths_pumps <- read.csv("./R_ultimate/data/Cholera Pumps & Deaths.csv")
For our tutorial we need two packages:
leaflet for geospatial plotting
dplyr for data preparation
head(deaths_pumps)
## count geometry
## 1 3 <Point><coordinates>-0.13793,51.513418</coordinates></Point>
## 2 2 <Point><coordinates>-0.137883,51.513361</coordinates></Point>
## 3 1 <Point><coordinates>-0.137853,51.513317</coordinates></Point>
## 4 1 <Point><coordinates>-0.137812,51.513262</coordinates></Point>
## 5 4 <Point><coordinates>-0.137767,51.513204</coordinates></Point>
## 6 2 <Point><coordinates>-0.137537,51.513184</coordinates></Point>
library(leaflet)
library(dplyr)
The geometry variable in deaths_pumps wraps each coordinate pair in <Point><coordinates> tags, so longitude and latitude have to be extracted. The tags are removed with gsub. Then the longitude and latitude values are extracted with strsplit, unlist and as.numeric. The coordinates (long and lat) end up in one long vector, coords, but they are easy to split: all longitudes here are negative (the code uses the looser cut-off coords < 2) and all latitudes are above 50.
Then the data frame deaths_pumps is separated into deaths and pumps. Deaths are indicated by a count above 0 and pumps by a count below 0.
deaths_pumps$geometry <- gsub("<Point><coordinates>", "", deaths_pumps$geometry)
deaths_pumps$geometry <- gsub("</coordinates></Point>", "", deaths_pumps$geometry)
coords <- deaths_pumps$geometry %>% strsplit(., ",") %>% unlist() %>% as.numeric()
deaths_pumps$long <- coords[coords < 2]
deaths_pumps$lat <- coords[coords > 50]
deaths <- deaths_pumps %>% filter(count > 0)
pumps <- deaths_pumps %>% filter(count < 0)
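The threshold split works for this data set because every longitude is negative and every latitude is above 50. As a hedged alternative, the sketch below splits each coordinate pair by position instead, so it does not depend on the value ranges. The sample strings are copied from the head() output above; the names sample_geometry, cleaned, parts, sample_long and sample_lat are illustrative and not part of the original script.

```r
# Two sample strings copied from the head() output shown earlier
sample_geometry <- c(
  "<Point><coordinates>-0.13793,51.513418</coordinates></Point>",
  "<Point><coordinates>-0.137883,51.513361</coordinates></Point>"
)

# Remove the opening and closing Point/coordinates tags in one pass
cleaned <- gsub("</?Point>|</?coordinates>", "", sample_geometry)

# Split each "long,lat" string at the comma and stack the pieces
# into a two-column character matrix (one row per observation)
parts <- do.call(rbind, strsplit(cleaned, ",", fixed = TRUE))

# Column 1 holds longitudes, column 2 holds latitudes
sample_long <- as.numeric(parts[, 1])
sample_lat  <- as.numeric(parts[, 2])
```

Because each row keeps its own pair, a malformed entry would show up as an NA in that row rather than silently shifting every later value.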
# The map needs a center point. We use the median longitude and the
# median latitude of all observations in 'deaths'.
median_location <- data.frame(long = median(deaths$long),
lat = median(deaths$lat))
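Note that the median above gives every address the same weight, whatever its death count. If you would rather center on the typical death than on the typical address, one option is to repeat each coordinate count times before taking the median. A minimal sketch, assuming the six rows from the head() output above as sample data (sample_deaths and weighted_center are illustrative names, not part of the original script):

```r
# Sample rows taken from the head() output shown earlier
sample_deaths <- data.frame(
  count = c(3, 2, 1, 1, 4, 2),
  long  = c(-0.13793, -0.137883, -0.137853, -0.137812, -0.137767, -0.137537),
  lat   = c(51.513418, 51.513361, 51.513317, 51.513262, 51.513204, 51.513184)
)

# Repeat each coordinate once per death, then take the median
weighted_center <- data.frame(
  long = median(rep(sample_deaths$long, times = sample_deaths$count)),
  lat  = median(rep(sample_deaths$lat,  times = sample_deaths$count))
)
```

For the full data you would pass deaths instead of sample_deaths. For a map center the difference may well be small, so the plain median used in this article remains a reasonable default.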
We create an object lf that holds all the information for the plot. First, we call leaflet() and then make extensive use of the pipe operator to add further layers. We use the Stadia.StamenToner provider tiles (the Stamen Toner style) to get a classic black-and-white look. With setView we define the center and zoom of our map. Finally, we add circles for deaths and pumps.
Deaths are marked in red, and the radius grows with the death count. Pumps are marked in green.
lf <- leaflet() %>%
  # map style, here black and white; other providers offer a more conventional street-map look
  addProviderTiles(providers$Stadia.StamenToner) %>%
  # center the view on the median location via the lng and lat parameters
  setView(lng = median_location$long, lat = median_location$lat, zoom = 17) %>%
  # deaths: red circles whose radius scales with the death count
  addCircles(lng = deaths$long, lat = deaths$lat, radius = deaths$count * 2,
             stroke = FALSE, color = 'red', fillOpacity = 0.8, popup = paste("Deaths:", deaths$count)) %>%
  # pumps: small green circles of fixed radius
  addCircles(lng = pumps$long, lat = pumps$lat, radius = 2,
             color = 'green', fillOpacity = 1, popup = "Pump")
lf
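The colors are explained only in the text, so a reader of the map itself has to click a circle to learn what it means. One possible refinement is a legend. The sketch below uses leaflet's addLegend with manually supplied colors and labels; the helper add_cholera_legend and the object legend_demo are hypothetical names introduced here for illustration, and the label wording and legend position are assumptions.

```r
library(leaflet)

# Hypothetical helper (not part of leaflet): adds a two-entry legend
# matching the circle colors used in this article
add_cholera_legend <- function(map) {
  addLegend(map,
            position = "bottomright",
            colors = c("red", "green"),
            labels = c("Deaths (radius grows with count)", "Pump"),
            opacity = 0.8)
}

# Demonstrated on an empty map so the block runs on its own
legend_demo <- add_cholera_legend(leaflet())
```

With the map built above, lf %>% add_cholera_legend() should display the same circles plus the legend.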
With this visualisation Snow was able to detect certain hotspots. In Broadwick Street (then Broad Street) there is a pump and many cholera-related deaths, while in other areas there are pumps and no deaths at all. Based on this he was able to refute the theory of airborne cholera transmission. He put his theory to the test by having the pump in Broad Street disabled, which saved many lives.
We learned how easy it is to reproduce his results with leaflet.
John Snow Wikipedia https://en.wikipedia.org/wiki/John_Snow
Cholera Pumps and Deaths Data https://fusiontables.google.com/DataSource?docid=147wlDisDp6NnpNxHQpbnjAQ-iW4dR2MAmFdQxYc#rows:id=1