PNP Crime Data

The Philippine National Police (PNP) publishes some crime reports on the Bantay Krimen website. I downloaded these reports and loaded them into R. They cover the National Capital Region (NCR) only.

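library(rjson)
# Parse the JSON file into a list with one element per incident
y <- fromJSON(file="~/projects/pnp-crime-stats/data/crime_stats.json")
# Flatten into a data frame; this assumes every record has the same fields in the same order
df <- data.frame(matrix(unlist(y), nrow = length(y), byrow = TRUE))
names(df) <- names(y[[1]])
# Keep a proper Date version of the date column
df$date2 <- as.Date(df$date)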
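
The matrix-based conversion above relies on every record listing the same fields in the same order. As a hedged alternative, the sketch below builds one single-row data frame per record and stacks them, which should stop with an error rather than silently shift values if a record is missing a field. The `records` list and its values are made up for illustration and only mimic the shape of the parsed JSON.

```r
# Hypothetical records shaped like the parsed JSON (values are made up).
records <- list(
  list(lng = "121.03", lat = "14.55", date = "2015-07-21", crime = "THEFT"),
  list(lng = "121.05", lat = "14.60", date = "2016-06-01", crime = "ROBBERY")
)

# Turn each record into a one-row data frame, then stack the rows.
rows <- lapply(records, function(r) as.data.frame(r, stringsAsFactors = FALSE))
crimes <- do.call(rbind, rows)

# Convert the text columns to the types needed later on.
crimes$lng  <- as.numeric(crimes$lng)
crimes$lat  <- as.numeric(crimes$lat)
crimes$date <- as.Date(crimes$date)

str(crimes)
```

For a few thousand records the row-by-row approach is slower than the matrix trick, but the difference is unlikely to matter at this size.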

The data represent a total of 3298 incidents between 2015-07-21 and 2016-06-01. For each incident we have the following fields:

lng, lat, date, customdate, time, customtime, crime, location, region, province, modus, crimetype, station, moduscode, date2
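
The incident count, date range, and field list above can be read straight off the data frame. A minimal sketch, using a made-up stand-in (`crimes`) rather than the real `df`:

```r
# Hypothetical stand-in for df; only the date2 column is needed here.
crimes <- data.frame(
  date2 = as.Date(c("2015-07-21", "2015-12-25", "2016-06-01"))
)

n_incidents <- nrow(crimes)        # number of incidents
date_span   <- range(crimes$date2) # earliest and latest incident date

cat(n_incidents, "incidents between",
    format(date_span[1]), "and", format(date_span[2]), "\n")

# Column names give the list of available fields.
names(crimes)
```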

All Crimes by Type:

How often does each crime appear in the dataset? Here we plot the number of incidents for each crime type.

suppressMessages(library(ggvis))
suppressMessages(library(dplyr))

df %>% 
  group_by(crime) %>% 
  summarize(count = n()) %>% 
  ggvis(~crime, ~count) %>% 
  layer_bars() %>% 
  add_axis("x", properties = axis_props(
    labels = list(angle = 45, align = "left", fontSize = 12)
  ))
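
To check the numbers behind the bar chart, the same counts can be tabulated in base R. This is only a sketch on a made-up `crime` vector, not the real data; with the real data frame one would pass `df$crime` instead.

```r
# Hypothetical crime labels standing in for df$crime.
crime <- c("THEFT", "ROBBERY", "THEFT", "THEFT", "ROBBERY",
           "ANTI-CARNAPPING ACT (R.A. 6539) MV")

# Count incidents per crime type, most frequent first.
counts <- sort(table(crime), decreasing = TRUE)
counts

# Share of each crime type as a percentage of all incidents.
round(100 * prop.table(counts), 1)
```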

There are four types of crimes represented in the database:
1. ANTI-CARNAPPING ACT (R.A. 6539) MC (Motorcycle Napping)
2. ANTI-CARNAPPING ACT (R.A. 6539) MV (Carnapping)
3. ROBBERY
4. THEFT

Mapping Crime Data

We can map the crimes as points using the leaflet package. We’ll color-code the points by the four crime types and add a popup showing the crime type for each point (the modus text variable could be used instead for a fuller description).

library(leaflet)
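# colorFactor() has no n argument; with domain = NULL the levels come from the data passed to pal()
pal <- colorFactor(c("#000000", "#0000AA", "#AA0000", "#00AA00"), domain = NULL)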
df$lat <- as.numeric(as.character(df$lat))
df$lng <- as.numeric(as.character(df$lng))
leaflet(df) %>% 
  addTiles() %>% 
  addCircleMarkers(~lng,
                   ~lat,
                   color = ~pal(crime),
                   radius = 3,
                   popup = ~crime)
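
The `as.numeric(as.character(...))` step above matters when the coordinate columns are stored as factors, which is an assumption here since it depends on how the data frame was built and on the R version. A small sketch with made-up latitudes shows the difference:

```r
# Hypothetical latitudes stored as a factor.
lat <- factor(c("14.55", "14.60", "14.58"))

# Converting the factor directly returns the internal level codes.
codes <- as.numeric(lat)

# Going through character first recovers the actual coordinates.
values <- as.numeric(as.character(lat))

codes
values
```

If the columns are already character vectors, `as.character()` is a harmless no-op, so the two-step conversion is safe either way.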

It may be useful to know where crimes are concentrated. We can use the default clustering options from leaflet to group nearby points, with the grouping adapting to the zoom level.

leaflet(df) %>% 
  addTiles() %>% 
  addCircleMarkers(~lng,
                   ~lat,
                   color = ~pal(crime),
                   radius = 3,
                   popup = ~crime,
                   clusterOptions = markerClusterOptions())

The clustering above is a good start, but it leaves something to be desired. I would like to know where specific types of crime occur, for example to avoid parking in a carnapping hotspot.
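
One possible way to get there, sketched under assumptions rather than taken from the original analysis, is to put each crime type in its own clustered layer and add a layer control so types can be toggled on and off. The `crimes` data frame and its coordinates below are made up for illustration.

```r
library(leaflet)

# Hypothetical sample standing in for df; coordinates are made up.
crimes <- data.frame(
  lng   = c(121.03, 121.05, 120.98, 121.01),
  lat   = c(14.55, 14.60, 14.58, 14.65),
  crime = c("ROBBERY", "THEFT", "ROBBERY", "THEFT"),
  stringsAsFactors = FALSE
)
crime_types <- unique(crimes$crime)

# Add one clustered marker layer per crime type, tagged with a group name.
m <- leaflet() %>% addTiles()
for (type in crime_types) {
  m <- m %>% addCircleMarkers(
    data = crimes[crimes$crime == type, ],
    lng = ~lng, lat = ~lat,
    radius = 3,
    popup = ~crime,
    group = type,
    clusterOptions = markerClusterOptions()
  )
}

# Checkbox control to show or hide each crime type.
m <- m %>% addLayersControl(
  overlayGroups = crime_types,
  options = layersControlOptions(collapsed = FALSE)
)
m
```

Because each type is clustered separately, unticking every layer but one leaves clusters that reflect only that crime type.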