Introduction

Police forces in Britain not only record every crime they encounter, with coordinates, but also publish the data free of charge on the web. This project uses the street-level crime data reported by the Metropolitan Police Service for May 2014.

Data Preparation

The original dataset provided by the British police is in CSV format. The data scientist Fabio Veronesi made the data available on his site and analyses it in a blog post.

dat <- read.csv("http://www.fabioveronesi.net/Blog/2014-05-metropolitan-street.csv")

Here is the structure of the dataset:

str(dat)
## 'data.frame':    79832 obs. of  12 variables:
##  $ Crime.ID             : Factor w/ 55285 levels "","0000782cea7b25267bfc4d22969498040d991059de4ebc40385be66e3ecc3c73",..: 1 1 1 1 1 2926 28741 19664 45219 21769 ...
##  $ Month                : Factor w/ 1 level "2014-05": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Reported.by          : Factor w/ 1 level "Metropolitan Police Service": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Falls.within         : Factor w/ 1 level "Metropolitan Police Service": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Longitude            : num  0.141 0.137 0.14 0.136 0.135 ...
##  $ Latitude             : num  51.6 51.6 51.6 51.6 51.6 ...
##  $ Location             : Factor w/ 20462 levels "No Location",..: 15099 14596 1503 1919 12357 1503 8855 14060 8855 8855 ...
##  $ LSOA.code            : Factor w/ 4864 levels "","E01000002",..: 24 24 24 24 24 24 24 24 24 24 ...
##  $ LSOA.name            : Factor w/ 4864 levels "","Barking and Dagenham 001A",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ Crime.type           : Factor w/ 14 levels "Anti-social behaviour",..: 1 1 1 1 1 3 3 5 7 7 ...
##  $ Last.outcome.category: Factor w/ 23 levels "","Awaiting court outcome",..: 1 1 1 1 1 21 8 21 8 8 ...
##  $ Context              : logi  NA NA NA NA NA NA ...

The dataset provides several useful pieces of information about each crime: its location (longitude and latitude in degrees), the address (if available), the type of crime, and the court outcome (if available). For this project, only the coordinates and the type of crime are used.
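As a minimal sketch of narrowing a data frame to those three columns, assuming the column names shown in the str() output: the toy data frame and its values are invented for illustration and are not the police data.

```r
# Toy stand-in for `dat`; all values are made up for illustration.
toy <- data.frame(
  Crime.ID   = c("a1", "b2", "c3"),
  Longitude  = c(0.141, 0.137, 0.140),
  Latitude   = c(51.6, 51.6, 51.6),
  Location   = c("No Location", "No Location", "No Location"),
  Crime.type = c("Anti-social behaviour", "Burglary", "Drugs"),
  stringsAsFactors = TRUE
)

# Columns used by the project: the coordinates and the crime type.
used_cols <- c("Longitude", "Latitude", "Crime.type")

# Keep only those columns; the rows are left unchanged.
toy_small <- toy[, used_cols]
str(toy_small)
```

The mapping code does not require this step, since leaflet only reads the columns it is asked for; the subset just makes the working data easier to inspect.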

For some incidents the coordinates are not provided; these rows are removed from the data:

dat <- dat[!is.na(dat$Longitude)&!is.na(dat$Latitude),]

This removes 870 entries from the data, leaving 78,962 rows.
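The number of removed rows can be checked by comparing row counts before and after filtering. The sketch below uses an invented toy data frame rather than the police data, and swaps in complete.cases() for the pair of is.na() tests; both approaches keep only rows where neither coordinate is missing.

```r
# Toy stand-in for `dat`; all values are made up for illustration.
toy <- data.frame(
  Longitude  = c(0.141, NA, 0.140, 0.136),
  Latitude   = c(51.6, 51.6, NA, 51.6),
  Crime.type = c("Burglary", "Robbery", "Drugs", "Burglary")
)

n_before <- nrow(toy)

# TRUE for rows where both Longitude and Latitude are present.
keep <- complete.cases(toy[, c("Longitude", "Latitude")])
toy_clean <- toy[keep, ]

# Number of rows dropped because a coordinate was missing.
n_removed <- n_before - nrow(toy_clean)
n_removed
```

Run against dat before and after the filtering step, the same difference should match the 870 removed entries.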

Mapping Clusters

There are so many points that it does not make sense to plot every marker individually. Instead, the markers are grouped into clusters with addMarkers(clusterOptions = markerClusterOptions()). As you zoom in, each cluster separates until the individual markers are visible. The crime type is added as a popup to each marker: click on any blue marker to see what crime was committed.

library(leaflet)
dat %>%
  leaflet() %>%
  addTiles() %>%
  addMarkers(popup = ~Crime.type, clusterOptions = markerClusterOptions())
## Assuming 'Longitude' and 'Latitude' are longitude and latitude, respectively
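The message above appears because leaflet guesses the coordinate columns from their names. A minimal sketch of naming them explicitly through the lng and lat arguments of addMarkers() is shown below; the toy data frame and its coordinates are invented for illustration, and the map object name m is arbitrary.

```r
library(leaflet)

# Toy stand-in for `dat`; coordinates and crime types are made up.
toy <- data.frame(
  Longitude  = c(-0.1276, -0.1419, -0.0759),
  Latitude   = c(51.5072, 51.5014, 51.5081),
  Crime.type = c("Burglary", "Robbery", "Drugs"),
  stringsAsFactors = FALSE
)

# The formulas (~column) are evaluated against the data passed to leaflet().
m <- leaflet(toy) %>%
  addTiles() %>%
  addMarkers(
    lng = ~Longitude,
    lat = ~Latitude,
    popup = ~Crime.type,
    clusterOptions = markerClusterOptions()
  )

m  # printing the object renders the interactive map
```

Naming the columns explicitly makes the map independent of leaflet's name matching, which matters if the coordinate columns are ever renamed.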

Conclusion

In this project, open crime data was displayed using Leaflet, one of the most popular JavaScript libraries for creating interactive maps. With only a few lines of code, the leaflet R package made it possible to create a Leaflet map without needing to know any JavaScript!