1 Introduction

In her article “Denver’s violent crime is on the rise”, Allison Sherry writes that although Denver’s overall crime rate remained relatively flat between 2017 and 2018, the rate of violent crimes, such as homicides, rapes, and robberies, has risen. In January, the Denver Police Department reported that offenses such as burglary and arson have fallen, but drug, narcotics, and illegal weapon possession violations have increased. According to Denver Police Chief Paul Pazen, more subject stops and traffic stops (referred to as proactive policing) have been shown to lower overall crime rates.

In the fall of 2012, Washington and Colorado became the first U.S. states to legalize cannabis for recreational use. Recreational sales of marijuana in Colorado started on January 1st, 2014. Sales rose to significant levels in 2015, and some lawmakers question whether marijuana is the cause of the crime increase.

1.1 Purpose

The City and County of Denver, Colorado has made some of its police data public. This is a large, well-defined data set, and the list of potential insights one could gather from it is endless. The goal of this analysis is to validate claims made about drugs and crime in Denver using data analysis.

1.2 The Data

The data for this analysis comes from Denver Crime Data, a publicly available dataset.

This dataset includes criminal offenses in the City and County of Denver for the previous five calendar years plus the current year to date. The data is based on the National Incident Based Reporting System (NIBRS), which includes all victims of person crimes and all crimes within an incident. The data is dynamic, which allows for additions, deletions, and/or modifications at any time, resulting in more accurate information in the database. Due to continuous data entry, the number of records in subsequent extractions is subject to change. Crime data is updated Monday through Friday.

1.3 Research Questions

We’ll investigate the claims outlined in the January article referenced above.

  • Denver crime rates have remained relatively flat
  • Violent crime rate has risen
  • Burglary and arson have fallen
  • Drug, narcotics, and illegal weapon possession violations have increased

Lastly, we’ll investigate whether marijuana could be the cause of the increased crime rate in Denver.

2 Data Pre-processing

2.1 Load Packages

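# Install pacman if it is missing, then load (installing as needed) the packages below
if (!requireNamespace("pacman", quietly = TRUE)) install.packages("pacman")
pacman::p_load(tidyverse, data.table, lubridate, scales, knitr, kableExtra, sf, leaflet, rgdal, raster)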

# Set default ggplot theme
theme_set(theme_minimal())

2.2 Load Crime Data

# Crimes
## Define Connection
crime_conn <- url("https://www.denvergov.org/media/gis/DataCatalog/crime/csv/crime.csv", "rb")
## Read data
crime <- read_csv(crime_conn, 
                  col_types =  cols(INCIDENT_ID = col_character(),
                                    OFFENSE_ID = col_character(),
                                    OFFENSE_CODE = col_character(),
                                    OFFENSE_CODE_EXTENSION = col_character(),
                                    OFFENSE_TYPE_ID = col_character(),
                                    OFFENSE_CATEGORY_ID = col_character(),
                                    # %I (12-hour clock) is the hour specifier that pairs with the AM/PM marker %p
                                    FIRST_OCCURRENCE_DATE = col_datetime(format = "%m/%d/%Y %I:%M:%S %p"),
                                    LAST_OCCURRENCE_DATE = col_datetime(format = "%m/%d/%Y %I:%M:%S %p"),
                                    REPORTED_DATE = col_datetime(format = "%m/%d/%Y %I:%M:%S %p"),
                                    INCIDENT_ADDRESS = col_character(),
                                    GEO_X = col_integer(),
                                    GEO_Y = col_integer(),
                                    GEO_LON = col_double(),
                                    GEO_LAT = col_double(),
                                    DISTRICT_ID = col_character(),
                                    PRECINCT_ID = col_character(),
                                    NEIGHBORHOOD_ID = col_character(),
                                    IS_CRIME = col_logical(),
                                    IS_TRAFFIC = col_logical()
                                    )
                  ) %>% 
  # readr parses datetimes as UTC, so compare against a UTC datetime rather than a Date
  filter(REPORTED_DATE < as.POSIXct("2019-04-01", tz = "UTC"))
# Close Connection
close(crime_conn)

3 Data Contents

3.1 Summary

At the time of this analysis, the dataset contains 471,137 observations of offenses reported from January 2, 2014 to March 31, 2019. Of these, 74.0% are categorized as crimes and 26.1% as traffic incidents; the shares sum to slightly more than 100% because a small number of offenses carry both flags. The number of criminal offenses in Denver is striking, but this is an aggregate figure, so we will investigate how the frequency of criminal and traffic offenses has changed over time.

3.2 Variables

Some of these variables are not as intuitive as you’d think. After spending some time on Kaggle I discovered Niel Oza’s Kernel: he reached out to the city of Denver and got some clarification for several columns. His descriptions are below:

  • OFFENSE_ID is a unique identifier for each offense. It is generated by concatenating INCIDENT_ID, OFFENSE_CODE, and OFFENSE_CODE_EXTENSION
  • INCIDENT_ID is an identifier for an occurrence of offenses. Most OFFENSE_IDs have unique INCIDENT_IDs, but when a person commits multiple offenses at once, e.g. liquor possession and heroin possession, multiple OFFENSE_IDs will be generated from the same INCIDENT_ID
  • OFFENSE_CODE is a unique identifier for a particular type of offense. Things such as criminal mischief, trespassing, larceny, etc. all have different OFFENSE_CODE values to identify them
  • OFFENSE_CODE_EXTENSION is used to describe a subtype of an offense. For example, criminal_mischief-motor vehicle and criminal_mischief-other have the same OFFENSE_CODE but different extensions to differentiate them
  • OFFENSE_TYPE_ID provides the basic name for the offense. Each combination of OFFENSE_CODE and OFFENSE_CODE_EXTENSION references a unique crime. Contents of this column include things such as theft-shoplift, criminal-trespassing, and threats-to-injure
  • OFFENSE_CATEGORY_ID provides a more general categorization for crimes. For example, theft-shoplift and theft-from-bldg are both forms of larceny
  • FIRST_OCCURRENCE_DATE is the first possible date/time of the offense. If the time of the offense is known, LAST_OCCURRENCE_DATE will be missing (NA in R). If the time is not known, FIRST_OCCURRENCE_DATE will note the first possible time of the offense, and LAST_OCCURRENCE_DATE will be the last possible time of the offense. This commonly occurs with burglaries, where the exact time of the offense may not be known, but a range of time is known
  • LAST_OCCURRENCE_DATE will be missing if the exact time of the offense is known and will be an actual time if only a range of possible times is known. In the latter case, it will be the last possible time the offense could have occurred
  • REPORTED_DATE is the time at which the offense was reported to the police
  • INCIDENT_ADDRESS provides the location of the offense. Not all entries have a value for this column for privacy reasons
  • GEO_LON and GEO_LAT are the longitude and latitude of the location of the offense
  • GEO_X and GEO_Y are the state plane coordinates (city of Denver standard projection) of the offense location. Functionally similar to GEO_LON and GEO_LAT
  • DISTRICT_ID is the district in charge of handling the offense
  • PRECINCT_ID is the precinct in charge of handling the offense
  • NEIGHBORHOOD_ID is the neighborhood the offense occurred in
  • IS_CRIME states whether the offense was a crime
  • IS_TRAFFIC states whether the offense was a traffic incident
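
The description of OFFENSE_ID above can be illustrated with a small base R sketch. The identifier values below are made up for illustration and are not taken from the Denver data; the only thing assumed from the descriptions is that the key is the plain concatenation of the three fields.

```r
# Toy rows with made-up identifiers (not taken from the Denver data)
offenses <- data.frame(
  INCIDENT_ID            = c("2016000001", "2016000001", "2016000002"),
  OFFENSE_CODE           = c("3501", "3562", "2399"),
  OFFENSE_CODE_EXTENSION = c("0", "0", "1"),
  stringsAsFactors = FALSE
)

# Build the offense key by concatenating the three fields, as described above
offenses$OFFENSE_ID <- paste0(
  offenses$INCIDENT_ID,
  offenses$OFFENSE_CODE,
  offenses$OFFENSE_CODE_EXTENSION
)

# The key should be unique per row even though INCIDENT_ID repeats
key_is_unique <- !any(duplicated(offenses$OFFENSE_ID))
print(offenses$OFFENSE_ID)
print(key_is_unique)
```

The first two rows share an INCIDENT_ID but still receive distinct keys, which is the one-to-many relationship between incidents and offenses that the descriptions imply.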

4 Data Exploration

4.1 Crime vs. Traffic Offenses

              not-Traffic   Traffic
non-Criminal            0    122656
Criminal           348235       246

The IS_CRIME and IS_TRAFFIC fields classify observations into three main categories: non-criminal traffic offenses, non-traffic criminal offenses, and criminal traffic offenses. Criminal traffic offenses are the least common (246 observations), whereas non-traffic criminal offenses make up close to 74% of our observations and non-criminal traffic offenses about 26%.
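
As a quick arithmetic check, the shares can be recomputed from the four cell counts in the table above. This is a minimal base R sketch that uses only those counts; with the full data, the same matrix would presumably come from something like `table(crime$IS_CRIME, crime$IS_TRAFFIC)`.

```r
# Counts copied from the cross-tabulation above (matrix fills column by column)
counts <- matrix(
  c(0, 348235,      # not-Traffic column: non-Criminal, Criminal
    122656, 246),   # Traffic column:     non-Criminal, Criminal
  nrow = 2,
  dimnames = list(
    crime   = c("non-Criminal", "Criminal"),
    traffic = c("not-Traffic", "Traffic")
  )
)

total         <- sum(counts)                         # all observations
crime_share   <- sum(counts["Criminal", ]) / total   # flagged as a crime
traffic_share <- sum(counts[, "Traffic"]) / total    # flagged as traffic
cell_shares   <- round(100 * prop.table(counts), 1)  # each cell as a percentage

print(total)
print(round(100 * c(crime = crime_share, traffic = traffic_share), 1))
print(cell_shares)
```

The two flag-level shares overlap by the 246 offenses that are both criminal and traffic-related, which is why they add up to slightly more than 100%.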

4.2 Incident vs. Offense

The relationship between incidents and offenses is one-to-many: multiple offense IDs are generated from one incident ID when several crimes are committed together. The majority of incidents have a single offense. About 6.5% of incidents have multiple offenses, so we will consider each observation as its own crime. In future analyses it would be worthwhile to investigate the incidents with multiple offenses to determine which violations tend to be committed together.
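
One way to compute the share of incidents with multiple offenses is to count rows per incident ID and take the proportion of counts above one. The sketch below uses hypothetical incident IDs rather than the Denver records, so the resulting percentage is illustrative only.

```r
# Toy incident IDs (hypothetical); incident "B" carries two offenses
incident_id <- c("A", "B", "B", "C", "D")

# Number of offenses recorded under each incident
offenses_per_incident <- table(incident_id)

# Share of incidents with more than one offense, in percent
pct_multi <- 100 * mean(offenses_per_incident > 1)
print(pct_multi)
```

Applied to the full data, `incident_id` would be replaced by `crime$INCIDENT_ID`.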

4.4 Overall Offense Counts

Let’s look at the top category and type of offenses. The offense type is derivative of the offense category.

Traffic accidents are, incidentally, the most common offense.

4.4.1 Offense Categories

4.4.2 Offense Types

5 Data Exploration

5.1 Date Aggregations

We found that the rate of crime has been increasing for the last few years. Let’s see whether crime is uniformly distributed across months of the year.

Let’s perform a Chi-squared Test for the crime counts by month.

## 
##  Chi-squared test for given probabilities
## 
## data:  table(month(crime$FIRST_OCCURRENCE_DATE, label = TRUE))
## X-squared = 1947.5, df = 11, p-value < 2.2e-16

There is enough evidence to suggest that crime is not uniformly distributed across the months of the year. In other words, crimes are more likely to be committed in certain months than in others.
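
For reference, a goodness-of-fit test of this kind can be sketched in base R as follows. The dates are simulated, not the Denver records, so the statistics will not match the output above. The second test is an optional variation that was not part of the original analysis: it makes the expected probabilities proportional to the number of days in each month, since an equal-probability null slightly penalizes short months such as February.

```r
set.seed(42)  # reproducible simulated data, not the Denver records

# Simulate 5,000 occurrence dates drawn evenly from the days of 2018
all_days <- seq(as.Date("2018-01-01"), as.Date("2018-12-31"), by = "day")
dates    <- sample(all_days, size = 5000, replace = TRUE)

# Count offenses per calendar month (1 = January, ..., 12 = December)
month_num    <- factor(as.integer(format(dates, "%m")), levels = 1:12)
month_counts <- table(month_num)

# Test 1: equal probability for every month (the chisq.test default)
test_uniform <- chisq.test(month_counts)

# Test 2: probability proportional to the number of days in each month
days_in_month <- as.vector(
  table(factor(as.integer(format(all_days, "%m")), levels = 1:12))
)
test_by_days <- chisq.test(month_counts, p = days_in_month / sum(days_in_month))

print(test_uniform)
print(test_by_days)
```

With twelve categories, both tests have 11 degrees of freedom, matching the `df = 11` in the output above.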

What about days of the week?

Is crime occurrence equally likely throughout the week?

## 
##  Chi-squared test for given probabilities
## 
## data:  table(wday(crime$FIRST_OCCURRENCE_DATE, label = TRUE))
## X-squared = 2105.4, df = 6, p-value < 2.2e-16

Doesn’t look like it… Weekends sure look like the most likely time for crimes to be committed. I wonder how this changes for different crime categories.

6 Drugs and Narcotics

Let’s look at how the frequency of drug-related offenses has changed over time. We’ll filter for the offense category labeled drug-alcohol, then only look at drug-related instances, and then clean up the offense type by removing the appended offense characteristic (selling, manufacturing, possession, cultivation).
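
The filtering and clean-up steps described above might look roughly like the following base R sketch. The rows and the exact label format (`drug-<name>-<action>`) are assumptions made for illustration; the labels in the actual data may differ.

```r
# Toy rows; the category and type labels are assumed for illustration
offenses <- data.frame(
  OFFENSE_CATEGORY_ID = c("drug-alcohol", "drug-alcohol", "drug-alcohol",
                          "drug-alcohol", "larceny"),
  OFFENSE_TYPE_ID     = c("drug-heroin-possess", "drug-heroin-sell",
                          "drug-cocaine-possess", "liquor-possession",
                          "theft-shoplift"),
  stringsAsFactors = FALSE
)

# 1. Keep the drug-alcohol category
drug_alcohol <- offenses[offenses$OFFENSE_CATEGORY_ID == "drug-alcohol", ]

# 2. Keep only drug-related types (those starting with "drug-")
drugs <- drug_alcohol[grepl("^drug-", drug_alcohol$OFFENSE_TYPE_ID), ]

# 3. Strip the "drug-" prefix and the trailing action (possess, sell, ...)
drugs$drug <- sub("^drug-(.*)-[^-]+$", "\\1", drugs$OFFENSE_TYPE_ID)

# 4. Count offenses per drug, most frequent first
drug_counts <- sort(table(drugs$drug), decreasing = TRUE)
print(drug_counts)
```

Because the pattern removes only the last hyphen-separated token, multi-word drug names such as synth-narcotic would be kept intact.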

Drug n
Methamphetamine 5973
Cocaine 4317
Heroin 3220
Marijuana 2878
Hallucinogen 211
Synth-narcotic 178
Opium-or-deriv 162
Barbiturate 51

Methamphetamine is the most prevalent drug in Denver. Has it always been this way? Let’s first look at the yearly prevalence of drug-related crime.

The rate of drug-related crime has been growing. Let’s see what we can learn from looking at the four most prominent drugs in Denver: meth, cocaine, heroin, and marijuana.

There is quite a spike in marijuana-related offenses in the first few months of 2015. Recall that recreational sales of marijuana started on January 1st, 2014. This spike may be the joint result of marijuana abuse and stricter DPD enforcement aimed at preventing indecency and abuse of marijuana. On the other hand, since the legalization of recreational marijuana, methamphetamine offenses have risen sharply. One can argue that meth, not marijuana, is the cause of the increased crime rate. However, it’s possible that the legalization of recreational marijuana led to increased usage of meth; these data alone cannot establish either causal claim.

7 Violent Crime

The Denver Police Department has reported that violent crime has increased. These crimes include murder, aggravated assault, sexual assault, and illegal possession of weapons. Let’s see how the instances of these crimes have changed in the last 5 years.
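
A monthly time series of this kind can be built by filtering to the violent categories and truncating each date to its month. The sketch below is base R on made-up records; the category labels and the choice of which labels count as violent are assumptions for illustration, and weapons offenses are left out of the toy data for brevity.

```r
# Toy records; category labels and dates are made up for illustration
offenses <- data.frame(
  category = c("murder", "aggravated-assault", "aggravated-assault",
               "sexual-assault", "larceny", "aggravated-assault"),
  first_occurrence = as.Date(c("2016-07-03", "2016-07-15", "2016-08-01",
                               "2016-08-20", "2016-08-21", "2016-08-30")),
  stringsAsFactors = FALSE
)

# Assumed set of labels treated as violent crime
violent_labels <- c("murder", "aggravated-assault", "sexual-assault")
violent <- offenses[offenses$category %in% violent_labels, ]

# Truncate each date to the first day of its month
violent$month <- as.Date(format(violent$first_occurrence, "%Y-%m-01"))

# Monthly totals overall, and split by category
monthly_total       <- table(violent$month)
monthly_by_category <- table(violent$month, violent$category)
print(monthly_total)
print(monthly_by_category)
```

The overall table supports a single trend line, while the per-category table is the shape needed to check whether one specific offense drives a spike.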

Sure enough, the number of instances of violent crime has been rising steadily since 2014. There were major spikes in the summers of 2016 and 2018. Let’s look at whether a specific violent crime was the cause of the spikes.

Since 2014 there have been very few murders, but aggravated and sexual assault have become more prevalent in the last two years. It looks like assault has gradually increased, but it’s difficult to say whether murder and illegal possession of weapons are actually increasing.

8 Future Analyses: Maps

In this analysis we looked solely at crimes and when they happened, but we never looked into where crimes occurred. It’s possible that certain categories of crimes are more prevalent in particular districts or precincts.

The DenverGov website has a Denver Crime Map for users to investigate crime maps. I briefly looked into the shapefiles myself, but I’d love to look into this more in the future.

9 References