This report provides insight into Denver’s crime statistics and trends from 2018 to 2022. Crime has been on the rise in most major cities since the start of the pandemic, and Denver is no exception. Crime in the city has risen by roughly 23% since the end of 2019, and if the trend continues, 2022 is expected to have the highest number of reported crimes of the past five years. The objective of this report is to identify crime trends, the most frequently reported offenses, and crime activity by neighborhood. The Denver Police Department can use this report to identify neighborhoods and times that are at a higher risk of crime activity and allocate resources accordingly to prevent and suppress future crime. Denver citizens can also use it to recognize crime trends in the city and to understand the importance of staying vigilant and taking preventative measures to keep themselves and the people around them safe.
setwd("/Users/patrickdawson/Documents/Data Visualization")
filename = "Denver Crime//Crime.csv"
df = read.csv(filename)
library(lubridate)
library(dplyr)
library(ggplot2)
library(scales)
library(RColorBrewer)
library(ggthemes)
library(ggrepel)
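# plyr is attached after dplyr and masks summarise(), count() and mutate();
# use the dplyr:: prefix wherever the dplyr version is intended
library(plyr)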
library(plotly)
library(leaflet)
library(zoo)
library(rmdformats)
ColsToDrop = c("GEO_X","GEO_Y","GEO_LON","GEO_LAT")
df = df[, !names(df) %in% ColsToDrop]
date_time = df$FIRST_OCCURRENCE_DATE
x = mdy_hms(date_time)
df$Year = year(x)
df$Month = month(x)
df$MonthName = months(x, abbreviate = TRUE)
df$Day = weekdays(x)
df$Hour = hour(x)
df$AM_or_PM = substring(df$FIRST_OCCURRENCE_DATE, nchar(as.character(df$FIRST_OCCURRENCE_DATE))-1)
df$Weekend = wday(x) %in% c(1, 7)
ColsToDrop = c("Weekend")
df = df[, !names(df) %in% ColsToDrop]
df2 = df[!(df$Year == "2017"),]
The dataset comprises over 300,000 crimes reported to the Denver Police Department from January 2018 to August 2022. The original dataset was sourced from a public data archive on Denvergov.org. Each record includes crime characteristics such as offense ID, occurrence date, reported date, location, and neighborhood. Each entry is assigned to one of 14 unique crime categories, such as Theft from Motor Vehicle, Public Disorder, and Larceny. The dataset covers a four-and-a-half-year period, which provides a large enough sample to identify trends and support meaningful analysis. The crime dataset was last updated on September 1, 2022.
The line graph below shows the trend in reported criminal activity in Denver over time. The visualization shows that crime has been trending upwards since the beginning of 2018. Between 2018 and the end of 2021, the number of reported crimes increased by a staggering 21%. The city experienced a decrease in criminal activity of about 2% between 2018 and 2019, but crime has risen by roughly 23% since the end of 2019 due to a post-pandemic crime surge and a rise in the city’s population. The data also suggests that criminal activity in Denver is seasonal. The number of reported crimes drops in the winter, starts to climb in the spring, and peaks in the summer months. The lowest monthly total occurred in February 2019 and the highest in May 2022, a difference of roughly 3,000 reported crimes.
CrimeTrend = df2 %>%
select(Month, Year) %>%
group_by(Month, Year) %>%
dplyr::summarise(n = length(Month), .groups = "keep") %>%
data.frame()
CrimeTrend$MonthYear <- as.yearmon(paste(CrimeTrend$Year, CrimeTrend$Month), "%Y %m")
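# Drop the partial September 2022 month (assumed to be what row 45 held) so it does not distort the trend line
CrimeTrend = CrimeTrend[!(CrimeTrend$Year == 2022 & CrimeTrend$Month == 9),]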
hi_lo2 = CrimeTrend %>%
filter(n == min(n) | n == max(n)) %>%
data.frame()
ggplot(CrimeTrend, aes(x = MonthYear, y = n)) +
geom_line(color = "NavyBlue", size=1, group = 1) +
labs(title = "Crimes Reported by Year", x = "Year", y = "Crimes Reported") +
theme_light() +
theme(plot.title = element_text(hjust = .5)) +
scale_y_continuous(labels = comma) +
geom_point(data = hi_lo2, aes(x = MonthYear, y= n), shape =21, size =3, fill = 'white', color ='NavyBlue') +
geom_label_repel(aes(label = ifelse(n == min(n), scales::comma(n), "")), box.padding = 1.7, point.padding = .5, size =4, color ='Black', segment.color = 'Black') +
geom_label_repel(aes(label = ifelse(n == max(n), scales::comma(n), "")), box.padding = 1, point.padding = .5, size =4, color ='Black', segment.color = 'Black')
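The percentage changes quoted for the trend can be checked with a few lines of arithmetic. The sketch below is illustrative only: the yearly totals are made-up round numbers, not values from the Denver dataset, and the helper name `pct_between` is hypothetical.

```r
# Hypothetical yearly totals, NOT figures from the Denver dataset.
yearly <- data.frame(
  Year = c(2018, 2019, 2020, 2021),
  n    = c(1000, 980, 1100, 1210)
)

# Percent change versus the previous year; the first year has no predecessor.
yearly$pct_change <- c(NA, diff(yearly$n) / head(yearly$n, -1) * 100)

# Percent change between two chosen years (illustrative helper).
pct_between <- function(data, from, to) {
  start <- data$n[data$Year == from]
  end   <- data$n[data$Year == to]
  (end - start) / start * 100
}

print(yearly)
print(pct_between(yearly, 2018, 2021))
```

With the report's data, a comparable yearly table could be built with `dplyr::count(df2, Year)`, leaving out 2022 because it covers only part of the year.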
The figure below shows the top 10 offenses reported in the Denver area between 2018 and 2022. Theft from Motor Vehicle was the most reported offense during this period, with over 50,000 thefts reported. Public Disorder and Larceny were the second and third most reported crimes, with over 90,000 occurrences between them. The All-Other Crimes category is high on the list as well. This category consists of criminal activities such as Trespassing, Vehicle Eluding Police, Violation of Restraining Order, and Unlawful Possession of Weapon; these activities were reported more than 45,000 times. Auto Theft rounds out the top 5 most reported crimes with over 42,000 occurrences. Together, Theft from Motor Vehicle and Auto Theft make up roughly a third of all reported crimes in the city, which indicates that vehicles are an easy target for criminals. This type of criminal activity should be a top concern for law enforcement and civilians. To prevent break-ins, car owners in the Denver area should take preventive measures such as keeping car doors locked, securing car keys, removing any valuables, and parking in a well-lit, secure area.
CrimeCount = data.frame(dplyr::count(df2, OFFENSE_CATEGORY_ID))
CrimeCount = CrimeCount[order(CrimeCount$n, decreasing = TRUE),]
top_10crimes = CrimeCount$OFFENSE_CATEGORY_ID[1:10]
agg_crime10 = df2 %>%
filter(OFFENSE_CATEGORY_ID %in% top_10crimes) %>%
select(OFFENSE_CATEGORY_ID) %>%
group_by(OFFENSE_CATEGORY_ID) %>%
dplyr::summarise(n = length(OFFENSE_CATEGORY_ID), .groups = "keep") %>%
data.frame()
agg_crime10 = agg_crime10[order(agg_crime10$n, decreasing = TRUE),]
Top10CrimeIDbyYear2 = df2 %>%
filter(OFFENSE_CATEGORY_ID %in% top_10crimes) %>%
select(OFFENSE_CATEGORY_ID, Year) %>%
group_by(OFFENSE_CATEGORY_ID, Year) %>%
dplyr::summarise(n = length(OFFENSE_CATEGORY_ID), .groups = "keep") %>%
data.frame()
max_y = round_any(max(agg_crime10$n), 10000, ceiling)
Top10CrimeIDbyYear2$Year = as.factor(Top10CrimeIDbyYear2$Year)
ggplot(Top10CrimeIDbyYear2, aes( x = reorder(OFFENSE_CATEGORY_ID, n, sum), y = n, fill = OFFENSE_CATEGORY_ID)) +
geom_bar(stat="identity", position = position_stack(reverse = TRUE)) +
coord_flip() +
labs(title = "Top 10 Reported Crimes", x = "", y = "Number of Crimes", fill = "Offense ID") +
theme_light() +
theme(legend.position = "none") +
theme(plot.title = element_text(hjust = 0.5)) +
scale_fill_brewer(palette = "Paired", guide = guide_legend(reverse = TRUE)) +
geom_text(data = agg_crime10, aes(x = OFFENSE_CATEGORY_ID, y = n, label = scales::comma(n), fill = NULL), hjust = -0.1, size = 4) +
scale_y_continuous(labels = comma, breaks = seq(0, max_y, by = 10000), limits = c(0, max_y))
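The "roughly a third" figure for vehicle-related offenses is a share of total reported crime. A minimal sketch of that calculation follows; the category labels and counts are placeholders chosen for easy arithmetic, not the report's actual values.

```r
# Hypothetical counts per offense category, NOT the report's figures.
counts <- c(
  "theft-from-motor-vehicle" = 50,
  "auto-theft"               = 40,
  "larceny"                  = 45,
  "public-disorder"          = 47,
  "other"                    = 118
)

# Each category as a percentage of all reported crimes.
share <- counts / sum(counts) * 100

# Combined share of the two vehicle-related categories.
vehicle_share <- sum(share[c("theft-from-motor-vehicle", "auto-theft")])

print(round(share, 1))
print(vehicle_share)
```

Note that the denominator here is every reported crime, not only the top 10 categories; dividing by the top 10 total would overstate the share.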
The line graph below shows crime occurrence by day of the week, broken out by year. The graph indicates that over the four-and-a-half-year period, the most crime in Denver occurred on Fridays. In 2021, roughly 1,000 more crimes occurred on Friday than on any other day of the week. In most years, Sunday through Tuesday saw the least crime. On average, about 3% fewer crimes occurred during this three-day period than from Thursday through Saturday. The rise in crime between Thursday and Saturday can likely be linked to more people being outside and an increase in tourism on the weekends, which creates more opportunities for crime to occur.
CrimeDay = df2 %>%
select(FIRST_OCCURRENCE_DATE) %>%
mutate(Year = year(mdy_hms(FIRST_OCCURRENCE_DATE)), Day = weekdays(mdy_hms(FIRST_OCCURRENCE_DATE), abbreviate = TRUE)) %>%
group_by(Year, Day) %>%
dplyr::summarise(n = length(Year), .groups = 'keep') %>%
data.frame()
CrimeDay$Year = as.factor(CrimeDay$Year)
day_order = factor(CrimeDay$Day, levels = c('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'))
ggplot(CrimeDay, aes(x = day_order, y = n, group = Year)) +
geom_line(aes(color = Year), size=2) +
labs(title = "Crimes Reported by Day and Year", x = "Day of Week", y = "Crimes Reported") +
theme_light() +
theme(plot.title = element_text(hjust = .5)) +
geom_point(shape =21, size=4, color="black", fill = "white") +
scale_y_continuous(labels = comma) +
scale_color_brewer(palette = "Paired", guide = guide_legend(reverse = TRUE))
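The comparison between Sunday through Tuesday and Thursday through Saturday depends on mapping each date to a weekday in a fixed order. The sketch below shows one way to do that in base R without relying on locale-specific day names; the dates are made up for illustration and are not taken from the dataset.

```r
# Hypothetical occurrence dates, NOT records from the Denver dataset.
dates <- as.Date(c("2022-05-01", "2022-05-02", "2022-05-05",
                   "2022-05-06", "2022-05-06", "2022-05-07"))

# "%u" gives the ISO weekday number (1 = Monday ... 7 = Sunday),
# which does not change with the system locale.
labels <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
dow    <- as.integer(format(dates, "%u"))
day    <- factor(labels[dow], levels = labels)

# Counts per weekday, in Monday-to-Sunday order (zero counts are kept).
counts <- table(day)

# Average daily count for each three-day window, and the relative difference.
early    <- mean(as.vector(counts[c("Sun", "Mon", "Tue")]))
late     <- mean(as.vector(counts[c("Thu", "Fri", "Sat")]))
pct_diff <- (early - late) / late * 100

print(counts)
print(pct_diff)
```

Building the factor from a weekday number keeps the plot order stable even on systems where `weekdays()` returns names in another language.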
The line graph below plots the number of criminal offenses by hour. This visualization indicates that most crime takes place during daylight hours, with roughly 60% of crimes occurring between 6 a.m. and 6 p.m. The number of crime occurrences increases hourly from 6 a.m. through the afternoon, peaks at 5 p.m., and then drops to a low point at 5 a.m. The data also indicates that more than 22% of all crimes occur within the 4-hour period between 4 p.m. and 7 p.m. In comparison, only 8% of all crime occurs within the 4-hour period between 3 a.m. and 6 a.m.
CrimeHour = data.frame(plyr::count(df2, "Hour"))
CrimeHour$Hour = as.factor(CrimeHour$Hour)
hi_lo = CrimeHour %>%
filter(freq == min(freq) | freq == max(freq)) %>%
data.frame()
hour_order = factor(CrimeHour$Hour, levels = c('6', '7', '8', '9', '10', '11', '12','13', '14', '15', '16', '17', '18', '19',
'20', '21', '22', '23', '0', '1', '2','3','4','5'))
ggplot(CrimeHour, aes(x = hour_order, y = freq)) +
geom_line(color = 'black', size = 1, group = 1) +
geom_point(shape = 21, size = 4, color ='NavyBlue', fill = 'white') +
labs(x ="Hour", y= "Crime Reported", title = "Crime Reported by Hour") +
scale_y_continuous(labels = comma) +
theme_light() +
theme(plot.title = element_text(hjust = 0.5)) +
geom_point(data = hi_lo, aes(x = Hour, y= freq), shape =21, size =2, fill = 'NavyBlue', color ='NavyBlue') +
geom_label_repel(aes(label = ifelse(freq == max(freq) | freq == min(freq), scales::comma(freq), "")), box.padding = 1.5, point.padding = 1, size =4, color ='Black', segment.color = 'Black')
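The hourly percentages above are shares of all reported crimes that fall inside a window of hours. A minimal sketch of that calculation follows, assuming each crime is bucketed by the hour in which it started and that a window includes both its first and last hour bucket. The hours are made-up values and the helper name `share_between` is hypothetical.

```r
# Hypothetical hours of occurrence (0-23), NOT values from the Denver dataset.
hours <- c(0, 5, 8, 9, 12, 16, 17, 17, 18, 22)

# Count per hour; fixing the levels keeps hours with zero crimes in the table.
hour_counts <- table(factor(hours, levels = 0:23))

# Percentage of all crimes whose hour bucket lies in start..end (inclusive).
share_between <- function(counts, start, end) {
  idx <- as.character(start:end)
  sum(counts[idx]) / sum(counts) * 100
}

print(share_between(hour_counts, 6, 17))   # 6:00 a.m. through 5:59 p.m.
print(share_between(hour_counts, 16, 19))  # 4:00 p.m. through 7:59 p.m.
```

Whether the closing hour is included changes the result, so it helps to state the convention next to any quoted percentage.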
The graph below shows the Denver neighborhoods that experienced the most crime activity between 2018 and 2022. Based on this visualization, Five Points and Central Park experienced the most crime during this period, with over 33,000 crimes reported. Capitol Hill and Central Business District have also experienced a significant amount of crime. On average, 3,000 crimes are reported in these neighborhoods each year. The high level of crime in these neighborhoods is likely tied to their location in the city; most of them are in highly populated areas in downtown Denver.
NeighbCrime = data.frame(plyr::count(df2, "NEIGHBORHOOD_ID"))
NeighbCrime = NeighbCrime[order(NeighbCrime$freq, decreasing = TRUE),]
top_10NeighbCrime = NeighbCrime$NEIGHBORHOOD_ID[1:10]
Top10NeighbCrime = df2 %>%
filter(NEIGHBORHOOD_ID %in% top_10NeighbCrime) %>%
select(Year, NEIGHBORHOOD_ID) %>%
group_by(NEIGHBORHOOD_ID, Year) %>%
dplyr::summarise(n = length(NEIGHBORHOOD_ID), .groups = "keep") %>%
data.frame()
Top10NeighbCrime$Year = as.factor(Top10NeighbCrime$Year)
ggplot(Top10NeighbCrime, aes( x = reorder(NEIGHBORHOOD_ID, n, sum), y = n, fill = Year)) +
geom_bar(stat="identity", position = position_stack(reverse = TRUE)) +
coord_flip() +
labs(title = "Denver Neighborhoods with Highest Crime Rate", x = "", y = "Number of Crimes", fill = "Year") +
theme_light() +
theme(plot.title = element_text(hjust = 0.5)) +
scale_fill_brewer(palette = "Dark2", guide = guide_legend(reverse = TRUE)) +
scale_y_continuous(labels = comma)
The bar graph below shows the neighborhoods where the fewest crimes were reported. These neighborhoods include Wellshire, Indian Creek, Country Club, and Rosedale. The data suggests that these four neighborhoods are among Denver’s safest areas, making up less than 1% of the city’s total reported crime. Wellshire has an astonishingly low level of crime, with an average of fewer than 100 crimes reported per year. It should be noted that most of these neighborhoods are located in suburban parts of Denver, outside of the downtown area.
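# NeighbCrime is sorted by descending count, so the last 10 rows are the lowest-crime neighborhoods
bottom_10NeighbCrime = tail(NeighbCrime$NEIGHBORHOOD_ID, 10)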
Bottom10NeighbCrime = df2 %>%
filter(NEIGHBORHOOD_ID %in% bottom_10NeighbCrime) %>%
select(Year, NEIGHBORHOOD_ID) %>%
group_by(NEIGHBORHOOD_ID, Year) %>%
dplyr::summarise(n = length(NEIGHBORHOOD_ID), .groups = "keep") %>%
data.frame()
Bottom10NeighbCrime$Year = as.factor(Bottom10NeighbCrime$Year)
ggplot(Bottom10NeighbCrime, aes( x = reorder(NEIGHBORHOOD_ID, n, sum), y = n, fill = Year)) +
geom_bar(stat="identity", position = position_stack(reverse = TRUE)) +
coord_flip() +
labs(title = "Denver Neighborhoods with Lowest Crime Rate", x = "", y = "Number of Crimes", fill = "Year") +
theme_light() +
theme(plot.title = element_text(hjust = 0.5)) +
scale_fill_brewer(palette = "Dark2", guide = guide_legend(reverse = TRUE)) +
scale_y_continuous(labels = comma)
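The top and bottom neighborhood lists, and the "less than 1%" share, all come from one ranked table of counts. The sketch below illustrates the idea with placeholder neighborhood names and counts that are not from the dataset.

```r
# Hypothetical neighborhood counts, NOT the report's figures.
neigh <- data.frame(
  NEIGHBORHOOD_ID = c("a", "b", "c", "d", "e"),
  freq            = c(150, 500, 10, 300, 40)
)

# Rank from most to least reported crime.
neigh <- neigh[order(neigh$freq, decreasing = TRUE), ]

# head() and tail() pick the extremes without hard-coding row numbers.
top2    <- head(neigh, 2)
bottom2 <- tail(neigh, 2)

# Share of all reported crime that falls in the lowest-ranked neighborhoods.
bottom_share <- sum(bottom2$freq) / sum(neigh$freq) * 100

print(top2)
print(bottom_share)
```

Using `head()` and `tail()` keeps the selection correct if the number of neighborhoods in the data changes.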
In conclusion, crime has been trending upwards in Denver over the past two years. Crime in the city has increased by roughly 23% since the end of 2019, due primarily to a post-pandemic crime surge and a rise in the city’s population. A third of all crimes reported in the city are linked to Theft from Motor Vehicle and Auto Theft. The data indicates that most crimes occur during daylight hours, with roughly 60% of crimes occurring between 6 a.m. and 6 p.m. Lastly, the data suggests that neighborhoods located in downtown Denver experience more crime than neighborhoods in the suburbs. To combat the rise in crime, Denver law enforcement should use data to identify areas and times that are at a higher risk of crime activity. Understanding the data will allow police departments to position personnel strategically and make faster, more accurate decisions. Denver citizens also play a major part in suppressing crime throughout the city. They should remain vigilant and take proper preventative measures to keep themselves and others safe.