After joining the University of Chicago in Spring 2020, I decided to explore the number of crimes around the campus.

The data comes from the City of Chicago Data Portal and covers 2020 only.

# Load packages
library(tidyverse) 
library(leaflet)
library(leaflet.extras) 
library(RColorBrewer)
library(forcats)
library(ggthemes)

# Read data as RDS
crimes <- readRDS(file = "crimes2020.rds")
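
The post does not show how `crimes2020.rds` was produced. One plausible route, assuming the portal export was a CSV file, is to read it once and cache it as RDS. The sketch below uses base R and a tiny made-up CSV written to a temporary file; the file contents and the names `csv_path`, `rds_path`, `raw`, and `restored` are illustrative assumptions:

```r
# Write a tiny stand-in CSV to a temporary file (the real export is much larger)
csv_path <- tempfile(fileext = ".csv")
writeLines(c('ID,"Primary Type","Community Area"',
             '1,THEFT,41',
             '2,BATTERY,42'), csv_path)

# check.names = FALSE keeps column names with spaces, such as "Primary Type"
raw <- read.csv(csv_path, check.names = FALSE, stringsAsFactors = FALSE)

# Save once as RDS, then read it back in later sessions
rds_path <- tempfile(fileext = ".rds")
saveRDS(raw, file = rds_path)
restored <- readRDS(file = rds_path)

print(names(restored))
```

Keeping the original column names is what makes the backtick-quoted names such as `Primary Type` necessary in the later steps.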

# Inspect column names
names(crimes)
##  [1] "ID"                   "Case Number"          "Date"                
##  [4] "Block"                "IUCR"                 "Primary Type"        
##  [7] "Description"          "Location Description" "Arrest"              
## [10] "Domestic"             "Beat"                 "District"            
## [13] "Ward"                 "Community Area"       "FBI Code"            
## [16] "X Coordinate"         "Y Coordinate"         "Year"                
## [19] "Updated On"           "Latitude"             "Longitude"           
## [22] "Location"
# Select necessary columns
crimes <- crimes %>% select(Date, Block, `Primary Type`, Description, `Location Description`, 
                            Arrest, Domestic, District, Ward, `Community Area`, Latitude, Longitude)

# Rename some columns
crimes <- crimes %>% 
          rename(Type = `Primary Type`, Location = `Location Description`, Community = `Community Area`)

I decided to analyze three areas around the campus - Hyde Park, Washington Park, and Woodlawn.

According to Wikipedia, the Hyde Park community area is number 41, Washington Park is number 40, and Woodlawn is number 42.
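
These area numbers can be kept in a small named lookup so that later output shows names instead of codes. A minimal sketch in base R on a few made-up rows; the `area_names` vector and the `toy` data frame are illustrative assumptions, not part of the original analysis:

```r
# Lookup from community area number to name (numbers as given above)
area_names <- c("40" = "Washington Park", "41" = "Hyde Park", "42" = "Woodlawn")

# Made-up rows standing in for the real data
toy <- data.frame(Community = c(41, 40, 42, 41, 8))

# Keep only the three areas, then attach the readable name
toy <- toy[toy$Community %in% as.numeric(names(area_names)), , drop = FALSE]
toy$Area <- unname(area_names[as.character(toy$Community)])

# Number of rows per area
area_counts <- table(toy$Area)
print(area_counts)
```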

In total, 1469 crimes were registered in these three community areas during the first quarter of 2020, and 13 of them have no geolocation.

crimes <- crimes %>% filter(Community %in% c(40, 41, 42))
dim(crimes)
## [1] 1469   12
# Check NAs in Longitude and Latitude
sum(is.na(crimes$Longitude))
## [1] 13
sum(is.na(crimes$Latitude))
## [1] 13
# In total 13 crimes do not have exact location records.
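
Equal NA counts in the two columns do not by themselves show that the same rows are affected. A quick way to check that, and to keep only rows with both coordinates, is sketched here on made-up data (the `toy` data frame and the names `same_rows` and `located` are assumptions for illustration):

```r
# Made-up coordinates; two rows have no location
toy <- data.frame(
  Latitude  = c(41.79, NA, 41.78, NA),
  Longitude = c(-87.60, NA, -87.61, NA)
)

# TRUE when latitude and longitude are missing in exactly the same rows
same_rows <- identical(is.na(toy$Latitude), is.na(toy$Longitude))

# Keep only rows where both coordinates are present
located <- toy[complete.cases(toy[, c("Latitude", "Longitude")]), ]

print(same_rows)
print(nrow(located))
```

On the real data, the same two calls would run on `crimes` instead of `toy`.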

We can plot every registered crime on a marker map and on a heatmap.

# Create a popup 
my_popup <- paste0("<br><strong>Date: </strong>", 
                   crimes$Date,
                   "<br><strong>Block: </strong>", 
                   crimes$Block,
                   "<br><strong>Primary Type: </strong>", 
                   crimes$Type,
                   "<br><strong>Description: </strong>", 
                   crimes$Description,
                   "<br><strong>Location Description: </strong>", 
                   crimes$Location,
                   "<br><strong>Arrest: </strong>", 
                   crimes$Arrest,
                   "<br><strong>Domestic: </strong>", 
                   crimes$Domestic,
                   "<br><strong>District: </strong>", 
                   crimes$District,
                   "<br><strong>Ward: </strong>", 
                   crimes$Ward,
                   "<br><strong>Community Area: </strong>", 
                   crimes$Community,
                   "<br><strong>Latitude: </strong>", 
                   crimes$Latitude,
                   "<br><strong>Longitude: </strong>", 
                   crimes$Longitude)

# Create a map 
leaflet(crimes) %>% 
  addProviderTiles(providers$Esri) %>%
  addMarkers(~Longitude, ~Latitude, popup = ~my_popup, clusterOptions = markerClusterOptions()) 
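# Create a heatmap (rows without coordinates are dropped first; every crime gets equal weight)
crimes_located <- crimes %>% filter(!is.na(Longitude), !is.na(Latitude))
leaflet(crimes_located) %>% 
  addProviderTiles(providers$Esri) %>%
  addHeatmap(lng = ~Longitude, lat = ~Latitude, 
             blur = 20, max = 0.05, radius = 15)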

If you zoom in on the first map, you can see the exact location and additional details for every registered crime. The campus itself looks safe, in my opinion.

Which crimes are the most frequent around UChicago?

# Store the crime type as a factor column `type` with levels ordered by decreasing count
crimes <- within(crimes, type <- factor(Type, levels = names(sort(table(Type), decreasing = TRUE))))

# Visualize crimes by type in the area
ggplot(crimes, aes(x = reorder(Type, Type, function(x) -length(x)), fill = Type)) +
  geom_bar() +
  coord_flip() +
  theme(legend.position = "none", axis.title.y = element_blank()) +
  ggtitle("Crimes in Hyde Park, Washington Park, and Woodlawn") +
  labs(subtitle = "Source: City of Chicago Data Portal")
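
The ranking shown in the bar chart can also be read off as numbers. A minimal base R sketch on made-up crime types (the `toy_types` vector is illustrative, not taken from the data):

```r
# Made-up crime types standing in for crimes$Type
toy_types <- c("THEFT", "BATTERY", "THEFT", "ASSAULT", "THEFT", "BATTERY")

# Counts per type, most frequent first
type_counts <- sort(table(toy_types), decreasing = TRUE)

# Share of each type in percent, rounded to one decimal
type_share <- round(100 * prop.table(type_counts), 1)

print(type_counts)
print(type_share)
```

On the real data, `crimes$Type` would take the place of `toy_types`.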