library(tidyverse)
library(leaflet)
The dataset we will be using is publicly available at data.boston.gov.
The description on the website reads:
Crime incident reports are provided by Boston Police Department (BPD) to document the initial details surrounding an incident to which BPD officers respond. This is a dataset containing records from the new crime incident report system, which includes a reduced set of fields focused on capturing the type of incident as well as when and where it occurred. Records in the new system begin in June of 2015.
For this project, we will use leaflet to build an interactive map of the dataset and see where in Boston these crimes occurred.
dat <- read.csv("https://data.boston.gov/dataset/6220d948-eae2-4e4b-8723-2dc8e67722a3/resource/313e56df-6d77-49d2-9c49-ee411f10cf58/download/tmphpcrbasd.csv")
str(dat)
## 'data.frame': 23403 obs. of 17 variables:
## $ INCIDENT_NUMBER : chr "S87066666" "225520077" "222924960" "222648862" ...
## $ OFFENSE_CODE : int 3301 3115 3301 3831 724 301 619 3126 801 611 ...
## $ OFFENSE_CODE_GROUP : logi NA NA NA NA NA NA ...
## $ OFFENSE_DESCRIPTION: chr "VERBAL DISPUTE" "INVESTIGATE PERSON" "VERBAL DISPUTE" "M/V - LEAVING SCENE - PROPERTY DAMAGE" ...
## $ DISTRICT : chr "B2" "D14" "C11" "B2" ...
## $ REPORTING_AREA : int 300 786 355 288 200 NA 778 NA 235 77 ...
## $ SHOOTING : int 0 0 0 0 0 0 0 0 0 0 ...
## $ OCCURRED_ON_DATE : chr "2022-04-07 19:30:00" "2022-02-02 00:00:00" "2022-04-09 16:30:00" "2022-02-05 18:25:00" ...
## $ YEAR : int 2022 2022 2022 2022 2022 2022 2022 2022 2022 2022 ...
## $ MONTH : int 4 2 4 2 1 3 2 3 2 2 ...
## $ DAY_OF_WEEK : chr "Thursday" "Wednesday" "Saturday" "Saturday" ...
## $ HOUR : int 19 0 16 18 0 13 12 10 22 10 ...
## $ UCR_PART : logi NA NA NA NA NA NA ...
## $ STREET : chr "THORNTON PLACE" "WASHINGTON ST" "GIBSON ST" "WASHINGTON ST" ...
## $ Lat : num 0 42.3 42.3 42.3 42.3 ...
## $ Long : num 0 -71.1 -71.1 -71.1 -71.1 ...
## $ Location : chr "(0, 0)" "(42.34308127134165, -71.14172267328729)" "(42.29755532959655, -71.05970910242573)" "(42.329748204791635, -71.08454011649543)" ...
Some records have missing coordinates, either recorded as 0 (as in the first row above) or as NA, so we drop those rows before passing the data to leaflet.
dat_clean <- dat[!(dat$Lat==0 | dat$Long==0),]
dat_clean <- dat_clean[!is.na(dat_clean$Long)&!is.na(dat_clean$Lat),]
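Since tidyverse is already loaded, the same cleaning step could also be written with dplyr's `filter()`. The following is only a sketch on a small made-up data frame (`toy` and `toy_clean` are hypothetical names, and the coordinates are illustrative, not taken from the dataset). It relies on `filter()` dropping rows where the condition is `NA`, so a separate `is.na()` check is not needed.

```r
library(dplyr)

# Toy stand-in for `dat`; coordinate values are made up for illustration only.
toy <- data.frame(
  OFFENSE_DESCRIPTION = c("VERBAL DISPUTE", "INVESTIGATE PERSON",
                          "VERBAL DISPUTE", "M/V - LEAVING SCENE - PROPERTY DAMAGE"),
  Lat  = c(0, 42.343, NA, 42.330),
  Long = c(0, -71.142, -71.060, NA)
)

# filter() keeps only rows where every condition is TRUE.
# A comparison against NA yields NA, so rows with a missing
# coordinate are dropped along with the (0, 0) rows.
toy_clean <- toy %>%
  filter(Lat != 0, Long != 0)

toy_clean
```

On the toy data, only the "INVESTIGATE PERSON" row has both coordinates present and nonzero, so it is the only row kept.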
dat_clean %>%
leaflet() %>%
addTiles() %>%
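# The dataset has no Crime.type column, so the popup shows OFFENSE_DESCRIPTION from the piped data
addMarkers(lng = ~Long, lat = ~Lat, popup = ~OFFENSE_DESCRIPTION, clusterOptions = markerClusterOptions())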
The map suggests that Downtown Boston and the South End have the highest concentrations of reported crimes.
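A visual impression from clustered markers can be cross-checked with a simple tally. The sketch below counts incidents per `DISTRICT` on a small made-up data frame (`toy_clean` and `district_counts` are hypothetical names, and the rows are illustrative only). It assumes the police district code is an acceptable proxy for area; district codes may not line up exactly with neighborhood names such as Downtown or the South End.

```r
library(dplyr)

# Toy stand-in for `dat_clean`; the rows are made up for illustration only.
toy_clean <- data.frame(
  DISTRICT = c("B2", "D14", "C11", "B2"),
  Lat  = c(42.33, 42.34, 42.30, 42.32),
  Long = c(-71.08, -71.14, -71.06, -71.09)
)

# Tally incidents per district, largest count first.
district_counts <- toy_clean %>%
  count(DISTRICT, sort = TRUE)

district_counts
```

Running the same `count()` call on the real `dat_clean` would give a table of districts ordered by number of reports, which can then be compared against what the map shows.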