I finally signed the lease for my future apartment. Now it’s time to see whether my new neighborhood is safe or not!
I used this dataset, which contains information about breaking & entering in Montreal.
Read the data and group it by PDQ (police station):
library(readr)
library(dplyr)
library(maptools)
library(rgeos)
library(ggplot2)
crime_df <- read_csv("donneesouvertes-citoyens.csv")
crime_grouped_df <- crime_df %>%
  group_by(PDQ) %>%
  summarise(N = n())
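To make the grouping step concrete, here is a minimal sketch on made-up data. The `toy_*` names and the numbers are hypothetical and not part of the real dataset; only the `PDQ` column name comes from the code above.

```r
library(dplyr)

# Hypothetical toy data: one row per incident, PDQ is the police station number
toy_crime_df <- data.frame(PDQ = c(23, 23, 26, 38, 38, 38))

# Same pattern as above: one row per PDQ with its number of incidents
toy_grouped <- toy_crime_df %>%
  group_by(PDQ) %>%
  summarise(N = n())

# count() is a dplyr shorthand for group_by() followed by summarise(n())
toy_counted <- toy_crime_df %>% count(PDQ, name = "N")
```

Both versions should give one row per station, so either can feed the merge step later on.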
Download the corresponding shapefile from here.
We can then read and clean it:
sh <- readShapePoly("Limites_PDQ_2016_Lat_Long.shp")
sh@data$NOM_LIEU <- parse_number(sh@data$NOM_LIEU)
Then we merge crime data with shapefile data:
sh@data <- sh@data %>% left_join(crime_grouped_df, by=c("NOM_LIEU"="PDQ"))
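One thing worth checking after a merge like this is whether every district actually found a match. The sketch below uses small hypothetical stand-ins (`toy_shape_data`, `toy_grouped`) for `sh@data` and `crime_grouped_df`, so the values are invented; it assumes one row per PDQ on the crime side, which is what the grouping step produces.

```r
library(dplyr)

# Hypothetical stand-ins for sh@data and crime_grouped_df
toy_shape_data <- data.frame(NOM_LIEU = c(23, 26, 38, 44))
toy_grouped <- data.frame(PDQ = c(23, 26, 38), N = c(120L, 45L, 80L))

# left_join keeps every polygon row in its original order;
# districts without a match get N = NA
toy_merged <- toy_shape_data %>%
  left_join(toy_grouped, by = c("NOM_LIEU" = "PDQ"))

# Districts with no match in the crime data
unmatched <- toy_merged %>% filter(is.na(N))

# PDQs in the crime data that have no polygon (the join drops them silently)
orphans <- toy_grouped %>%
  anti_join(toy_shape_data, by = c("PDQ" = "NOM_LIEU"))
```

Keeping the row order matters here because the rows of `sh@data` have to stay aligned with the polygons of the shapefile.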
Finally we can plot:
sh@data$ntile <- ntile(sh@data$N, 4)
sh@data$frequency <- sh@data$N / 15
colors <- c("khaki2", "goldenrod2", "darkorange1", "firebrick2")
plot_colors <- colors[sh@data$ntile]
plot(sh, col=plot_colors)
# Keep the subtitle on two lines with a single "\n" (no raw line break inside the string)
title("Montreal map: breaking & entering",
      sub="Data: from 201501 to 201603\n(source: http://donnees.ville.montreal.qc.ca/dataset/actes-criminels)",
      cex.sub=0.8)
legend("topleft", legend=c("Very low risk", "Low risk", "Medium risk", "High risk"),
       fill=colors, cex=0.75)
centroids <- gCentroid(sh, byid=TRUE)
indexes <- sh@data$NOM_LIEU %in% c(23, 26, 38, 44)
text(centroids$x[indexes], centroids$y[indexes], sh@data$NOM_LIEU[indexes], cex=0.6)
Here is the meaning of the risk levels:
ggplot(data=sh@data %>% na.omit(), aes(factor(ntile), frequency)) +
  geom_boxplot(fill="lavenderblush") +
  labs(x="Risk level", y="Monthly incident frequency",
       title="Risk levels in terms of monthly frequency of breaking & entering") +
  scale_x_discrete(labels=c("Very low risk", "Low risk", "Medium risk", "High risk"))
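To see where the four levels and the monthly frequency come from, here is a small sketch on invented counts (the `toy` data frame and `n_months` are hypothetical names). It assumes the divisor 15 used above is the number of months from 201501 through 201603 inclusive.

```r
library(dplyr)

# Hypothetical incident counts for eight districts over the whole period
toy <- data.frame(N = c(30, 45, 60, 90, 120, 150, 210, 300))

# Months covered: January 2015 through March 2016 inclusive
n_months <- 12 + 3

toy <- toy %>%
  mutate(ntile = ntile(N, 4),        # rank-based quartile, 1 = fewest incidents
         frequency = N / n_months)   # average number of incidents per month
```

Since `ntile()` splits the districts into groups of roughly equal size by rank, a level like "Very low risk" is relative to the other districts rather than an absolute threshold.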
As for the red areas on the map, they are:
Oh and in case you’re wondering…my neighborhood is “Very low risk” :)