Crime Rates in Cincinnati

Introduction

Before the riot of 2001, Cincinnati’s overall crime rate was dropping dramatically and had reached its lowest point since 1992. After the riot violent crime increased. Reflecting national trends, crime rates in Cincinnati have dropped in all categories from 2006 to 2010.

In 2011, using FBI data, the CQ Press ranked Cincinnati the 16th most dangerous city in the United States.However, the FBI web site recommends against using its data for rankings, since there are many factors that influence crime rates. Also, the CQ Press did not include Chicago, Illinois in its ranking.[6]

According to the Hamilton County Prosecutor, the vast majority of people being murdered are inside the city and are African American. In 2009, 44 black men and 11 black women were murdered, but no whites.[8] According to the Prosecutor, people almost always commit murders inside their racial classifications, and there is a subset in the underclass of Cincinnati which “commit a lot of violent crime and tend to be black,”[8] similar to national trends.

We have analysed the data to understand how crime rates have changed by time and location.

The final conclusions of the analysis will answer:

  • The number of crimes
  • Violence of offense
  • Weapons Used
  • Days of the week
  • Most dangerous locations
  • Gender which is targetted the most
  • Age bracket

We will be leveraging this information using various libraries in R and answer the question: Is Cincinnati a safe place as it was earlier?

Packages Required

The following packages are used :

library(tidyr) #for data cleaning
library(tidyverse) #to install the required packages
library(readr) #to import the csv file
library(dplyr) #to manipulate the data
library(ggplot2) #for plotting different graphs
library(knitr) #for R markdown
library(leaflet) #for visulaizing the data on map

Data Cleaning and Preparation

DATA SOURCE

I am using the crime dataset for the city of Cincinnati to explore the situation of crimes in the city

The link to the data source is here

PURPOSE OF THE DATA

The purpose is to identify crime rate and analyze different factors that lead to the crime. The original data contains 314000 rows and 39 columns and for the years 1991 to 2018.

DATA CLEANING

The first step is to import the data from a CSV file. #Importing data

#Import CSV file
crime_data <- read.csv('crimedata.csv')

On initial analysis on the data I found that I could use the date reported column and split it into two columns. The two columns would represent the date of crime and the time of occurrence.

#Date of crime and time of crime

Date_Rpt <- substr(crime_data$DATE_FROM, 1, 22)
Date_Rpt <- (strptime(Date_Rpt, '%m/%d/%Y %I:%M:%S %p'))


Time_of_rpt <- substr(crime_data$DATE_FROM, 12, 22)
Time_of_rpt<- as.difftime(Time_of_rpt,'%I:%M:%S %p', units="hours")
Date_Rpt <- substr(crime_data$DATE_FROM, 1, 10)

crime_data$DATE_FROM <- Date_Rpt
crime_data <- cbind(crime_data,  Time_of_rpt)

Now I converted the data into a table and filtered out the required columns for analysis

#Converting to a table
crime_data <- tbl_df(crime_data)
#selecting the required columns
cd_cleandata <- select(crime_data,
                   INSTANCEID,  
                   DATE_FROM,
                   Time_of_rpt,
                   OFFENSE,
                   DAYOFWEEK,
                   SNA_NEIGHBORHOOD,    
                   WEAPONS,
                   LONGITUDE_X, 
                   LATITUDE_X,  
                   VICTIM_AGE,  
                   VICTIM_GENDER
                  
)

Providing the summary of the cleaned dataset

summary(cd_cleandata)
##                                 INSTANCEID      DATE_FROM        
##  C2CA9A5A-5C3C-419F-BFE1-F6650C936EFF:   127   Length:355379     
##  932627CA-318A-4B93-8FE3-57914736D1F6:   126   Class :character  
##  456C6412-EC72-4F59-B1B4-5980631026AD:    84   Mode  :character  
##  3EFCADA0-F622-403C-9B64-50E23EC5F00F:    80                     
##  46504773-3D1B-48A2-96F6-74BD8EBE5F73:    64                     
##  1AF3B2F3-411A-4A01-A3A3-1E0F0DDAD73E:    60                     
##  (Other)                             :354838                     
##  Time_of_rpt                                OFFENSE      
##  Length:355379     THEFT                        :120315  
##  Class :difftime   CRIMINAL DAMAGING/ENDANGERING: 43536  
##  Mode  :numeric    ASSAULT                      : 38225  
##                    BURGLARY                     : 28817  
##                    AGGRAVATED ROBBERY           : 18516  
##                    BREAKING AND ENTERING        : 17148  
##                    (Other)                      : 88822  
##      DAYOFWEEK            SNA_NEIGHBORHOOD 
##  FRIDAY   :52163   N/A            : 44776  
##  SATURDAY :51460   WESTWOOD       : 32163  
##  SUNDAY   :50299   EAST PRICE HILL: 23103  
##  MONDAY   :50214   WEST PRICE HILL: 18588  
##  WEDNESDAY:49728   AVONDALE       : 17911  
##  TUESDAY  :49450   CUF            : 14680  
##  (Other)  :52065   (Other)        :204158  
##                                              WEAPONS      
##  99 - NONE                                       :141022  
##  99--NONE                                        : 70866  
##  40 - PERSONAL WEAPONS (HANDS, FEET, TEETH, ETC.): 41219  
##  U - UNKNOWN                                     : 24493  
##  40--PERSONAL WEAPONS (HANDS, FEET, TEETH, ETC.) : 16932  
##  12 - HANDGUN                                    : 15743  
##  (Other)                                         : 45104  
##   LONGITUDE_X       LATITUDE_X      VICTIM_AGE   
##  Min.   :-84.82   Min.   :39.05   18-25  :72619  
##  1st Qu.:-84.57   1st Qu.:39.12   31-40  :58768  
##  Median :-84.52   Median :39.14   UNKNOWN:57537  
##  Mean   :-84.52   Mean   :39.14   41-50  :43554  
##  3rd Qu.:-84.49   3rd Qu.:39.16   26-30  :41978  
##  Max.   :-84.25   Max.   :39.36   51-60  :35961  
##  NA's   :45864    NA's   :45864   (Other):44962  
##               VICTIM_GENDER   
##                      : 54361  
##  F - FEMALE          :    25  
##  FEMALE              :164757  
##  M - MALE            :    18  
##  MALE                :135901  
##  NON-PERSON (BUSINESS:    51  
##  UNKNOWN             :   266

Removing missing values and blank fields from the dataset.

#cleaning to remove na
cd_cleandata <- na.omit(cd_cleandata)
cd_cleandata <- with(cd_cleandata, cd_cleandata[!(INSTANCEID == "" | is.na(INSTANCEID)), ])
cd_cleandata <- with(cd_cleandata, cd_cleandata[!(DAYOFWEEK == "" | is.na(DAYOFWEEK)), ])
cd_cleandata <- with(cd_cleandata, cd_cleandata[!(SNA_NEIGHBORHOOD == "" | is.na(SNA_NEIGHBORHOOD)), ])
cd_cleandata <- with(cd_cleandata, cd_cleandata[!(OFFENSE == "" | is.na(OFFENSE)), ])
cd_cleandata <- with(cd_cleandata, cd_cleandata[!(VICTIM_AGE == "" | is.na(VICTIM_AGE)), ])
cd_cleandata <- with(cd_cleandata, cd_cleandata[!(VICTIM_AGE == "UNKNOWN" | is.na(VICTIM_AGE)), ])
cd_cleandata <- with(cd_cleandata, cd_cleandata[!(VICTIM_GENDER == "" | is.na(VICTIM_GENDER)), ])

Combining the similar kinds of crimes to on and combinig the weapons used to find the unique weapons used for the crime. Also, grouped juvenile and under 18 years of age into one group called Under 18.

#cleaning offense column
cd_cleandata$OFFENSE <- gsub(".*HARRASS.*", "HARRASS", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*MENACING.*", "MENACING", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*FORGERY.*", "FORGERY", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*ROBBERY.*", "ROBBERY", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*MURDER.*", "MURDER", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*ABDUCTION.*", "ABDUCTION", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*SEX.*", "SEXUAL CRIME", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*INTIMID.*", "INTIMIDATION", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*KIDNAPPING.*", "KIDNAPPING", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*PUBLIC INDECENCY.*", "PUBLIC INDECENCY", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*ARSON.*", "ARSON", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*UNAUTHORIZED USE.*", "UNAUTHORIZED USE", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*PATIENT ABUSE.*", "PATIENT ABUSE", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*TELEPHONE HARRASSMENT.*", "TELEPHONE HARRASSMENT", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*VIOL.*", "VIOLATE PROTECTION ORDER", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*CREDIT CARD.*", "CREDIT CARD FRAUD", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*VANDALISM.*", "VANDALISM", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*THEFT.*", "THEFT", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*CRIMINAL.*", "CRIMINAL MISCHIEF", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*DISORDERLY CONDUCT.*", "DISORDERLY CONDUCT", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*ENDANGERING CHILDREN.*", "ENDANGERING CHILDREN", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*ASSAULT.*", "ASSAULT", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*BURGLARY.*", "BURGLARY", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$OFFENSE <- gsub(".*RAPE.*", "RAPE", (cd_cleandata$OFFENSE), perl = FALSE)
cd_cleandata$VICTIM_AGE <- gsub(".*JUVENILE.*", "UNDER 18", (cd_cleandata$VICTIM_AGE), perl = FALSE)

#Cleaning weapons column

cd_cleandata$WEAPONS <- gsub(".*11.*", "FIREARM", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*12.*", "HANDGUN", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*13.*", "RIFLE", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*14.*", "SHOTGUN", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*15.*", "FIREARM", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*16.*", "FIREARM", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*17.*", "FIREARM", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*18.*", "BB AND PELLET GUNS", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*20.*", "KNIFE/CUTTING INSTRUMENT", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*30.*", "BLUNT OBJECT", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*35.*", "MOTOR VEHICLE", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*40.*", "PERSONAL WEAPON", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*60.*", "EXPLOSIVES", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*70.*", "DRUGS", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*80.*", "OTHER WEAPONS", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*U.*", "UNKNOWN", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*65.*", "FIRE/INCENDIARY DEVICE", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*50.*", "POISON", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*99.*", "NONE", (cd_cleandata$WEAPONS), perl = FALSE)
cd_cleandata$WEAPONS <- gsub(".*85.*", "ASPHYXIATION", (cd_cleandata$WEAPONS), perl = FALSE)

Final Dataset

The data set I used to analyze the problem is as follows:

kable(cd_cleandata[1:10,], caption = "Preview - Clean Data")
Preview - Clean Data
INSTANCEID DATE_FROM Time_of_rpt OFFENSE DAYOFWEEK SNA_NEIGHBORHOOD WEAPONS LONGITUDE_X LATITUDE_X VICTIM_AGE VICTIM_GENDER
D6D7D173-E416-4571-AF34-A767ACF810D9 03/16/2015 15.033333 hours MENACING MONDAY WALNUT HILLS NONE -84.49080 39.11996 26-30 FEMALE
2A569674-9603-4ED9-82BB-783B843B94C9 06/23/2011 10.333333 hours ASSAULT THURSDAY AVONDALE PERSONAL WEAPON -84.49155 39.14167 41-50 MALE
5B95E5DC-E35D-4F86-B77C-02F25C950329 06/25/2012 15.583333 hours ASSAULT MONDAY COLLEGE HILL FIREARM -84.54596 39.20102 18-25 MALE
B908DB49-C51D-442B-ADFD-B6E64696071B 03/13/2018 16.750000 hours ASSAULT TUESDAY WEST END PERSONAL WEAPON -84.52521 39.11082 41-50 FEMALE
66F569F5-4C8E-4848-BD42-D0F8DE4D42B0 05/19/2013 14.500000 hours BURGLARY SUNDAY MADISONVILLE FIREARM -84.39035 39.15479 18-25 FEMALE
3B4450F2-F94B-4BFB-8B0A-3FF04DBD73C4 07/24/2015 0.000000 hours ENDANGERING CHILDREN FRIDAY EVANSTON UNKNOWN -84.47835 39.13468 UNDER 18 MALE
B095D1B2-9089-4CA4-85B2-A2657FB82793 09/05/2012 1.366667 hours ROBBERY WEDNESDAY OVER-THE-RHINE NONE -84.51276 39.11108 31-40 MALE
FD5706CD-7E2F-4CED-8C73-2E3AB142EF8C 09/03/2015 1.166667 hours ASSAULT THURSDAY OVER-THE-RHINE PERSONAL WEAPON -84.51640 39.11505 41-50 FEMALE
417FB2EC-1405-4F31-802B-2D3D7C9A47C3 08/31/2014 18.000000 hours THEFT SUNDAY MT. AUBURN NONE -84.51008 39.11733 18-25 MALE
A6B8356E-5A44-425A-8076-80BA963D09AC 09/25/2010 14.500000 hours ASSAULT SATURDAY WALNUT HILLS PERSONAL WEAPON -84.49454 39.12494 31-40 MALE

Exploratory Analysis

The exploratory data analysis is used to find the following answers:

  • The number of crimes
  • Violence of offense
  • Weapons Used
  • Days of the week
  • Most dangerous locations
  • Gender which is targetted the most
  • Age bracket
#total number of crimes
length(unique(cd_cleandata$INSTANCEID))
## [1] 201198
#Violence of offense
cd_cleandata    %>% count(OFFENSE,  sort    = TRUE) %>% top_n(5)
## Selecting by n
## # A tibble: 5 x 2
##   OFFENSE               n
##   <chr>             <int>
## 1 THEFT             76142
## 2 ASSAULT           42003
## 3 CRIMINAL MISCHIEF 32744
## 4 BURGLARY          30155
## 5 ROBBERY           18469
#Weapons used
cd_cleandata    %>% count(WEAPONS,  sort    = TRUE) %>% filter(WEAPONS != "NONE" & WEAPONS!= "UNKNOWN") %>% top_n(5)
## Selecting by n
## # A tibble: 5 x 2
##   WEAPONS                                         n
##   <chr>                                       <int>
## 1 PERSONAL WEAPON                             46461
## 2 FIREARM                                      7918
## 3 OTHER WEAPONS                                2474
## 4 PERSONAL WEAPONS (HANDS, FEET, TEETH, ETC.)  1094
## 5 MOTOR VEHICLE                                 829
#unsafe days of the week
cd_cleandata    %>% count(DAYOFWEEK,    sort    = TRUE) %>% top_n(5)
## Selecting by n
## # A tibble: 5 x 2
##   DAYOFWEEK     n
##   <fct>     <int>
## 1 SATURDAY  37445
## 2 FRIDAY    37005
## 3 SUNDAY    37004
## 4 MONDAY    36286
## 5 WEDNESDAY 35881
#unsafe neighborhoods
cd_cleandata    %>% 
  count(SNA_NEIGHBORHOOD, sort = TRUE) %>% top_n(5)
## Selecting by n
## # A tibble: 5 x 2
##   SNA_NEIGHBORHOOD     n
##   <fct>            <int>
## 1 WESTWOOD         23857
## 2 EAST PRICE HILL  19441
## 3 WEST PRICE HILL  16044
## 4 AVONDALE         15413
## 5 CUF              12816
#Which gender is targeted more
cd_cleandata    %>% count(VICTIM_GENDER,    sort    = TRUE) %>% top_n(1)
## Selecting by n
## # A tibble: 1 x 2
##   VICTIM_GENDER      n
##   <fct>          <int>
## 1 FEMALE        141323
#age group
cd_cleandata    %>% count(VICTIM_AGE,   sort    = TRUE) %>% top_n(5)
## Selecting by n
## # A tibble: 5 x 2
##   VICTIM_AGE     n
##   <chr>      <int>
## 1 18-25      62000
## 2 31-40      50983
## 3 41-50      37963
## 4 26-30      36088
## 5 51-60      31584

Creating a bar plot to see the distribution of the crime over the years. The plot shows that the crime rates were high in the years 2010 to 2013 and after that it decreased in the further years.

#adding year column to the dataset
year = (substr(cd_cleandata$DATE_FROM,7,11))
cd_cleandata <- cbind(cd_cleandata,  year)

#plotting barplot
barplot(table(substr(cd_cleandata$DATE_FROM, 7, 11)),
        main = "Crimes committed by Year",
        xlab = "Year",
        ylab = "Total Crimes",
        col = "cyan")

Filtering the offenses by type.

#5 offense types

cd_cleandata %>%
  filter(cd_cleandata$OFFENSE == "THEFT" |
           cd_cleandata$OFFENSE == "MENACING"|
           cd_cleandata$OFFENSE == "ASSAULT"|
           cd_cleandata$OFFENSE == "BURGLARY"|
           cd_cleandata$OFFENSE == "CRIMINAL MISCHIEF") %>%
  ggplot(aes(factor(OFFENSE), fill = OFFENSE)) + 
  geom_bar(stat="count", position = "dodge")+
  theme(axis.text.x = element_text(angle = 45, hjust = 1))+
  labs( y = "Count of crimes",
        x = "Offense type",
        title = "Top 5 offenses in Cincinnati")+
  guides(fill=guide_legend(title="Crime in Cincinnati"))

Here we can see that menacing has the lowest count and theft has the highest number of occurrence.

Using the leaflet package I am using the map functionality to find out the count of crimes per location.

#changing neighborhood to factor

cd_cleandata$SNA_NEIGHBORHOOD <- as.factor(cd_cleandata$SNA_NEIGHBORHOOD)

#creating function for visualization by location


mapforlocation <- function(location){
  subdata <- filter(cd_cleandata, location == cd_cleandata$SNA_NEIGHBORHOOD)
  occurence_map <- leaflet() %>%
    addTiles()%>%
    addMarkers(lng = subdata$LONGITUDE_X,
               lat = subdata$LATITUDE_X, 
               clusterOptions = markerClusterOptions()
    )
  
}

map <- mapforlocation(cd_cleandata$SNA_NEIGHBORHOOD)
map

Here we clearly see that the neighborhood Over the Rhine has the highest number of crimes.

Next I tried to find the trend to see the top 5 SNA neighborhoods with the highest number of crimes.

# finding most dangerous crimes by SNA neighborhood
cd_cleandata %>%
  filter(cd_cleandata$SNA_NEIGHBORHOOD == "WESTWOOD" |
           cd_cleandata$SNA_NEIGHBORHOOD == "WEST PRICE HILL"|
           cd_cleandata$SNA_NEIGHBORHOOD == "EAST PRICE HILL"|
           cd_cleandata$SNA_NEIGHBORHOOD == "OVER-THE-RHINE"|
           cd_cleandata$SNA_NEIGHBORHOOD == "AVONDALE") %>%
  ggplot(aes(factor(SNA_NEIGHBORHOOD), fill = SNA_NEIGHBORHOOD)) + 
  geom_bar(stat="count", position = "dodge",color ="blue")+
  theme(axis.text.x = element_text(angle = 30, hjust = 1))+
  labs( y = "Total number of crimes",
        x = "Neighborhood",
        title = "Unsafe Neighborhoods")

In the below plot we can see that the age group 31-40 is the most unsafe age group and the neighbohood has the highest number of crimes in that ge group.

#graph for age group and neighborhood

cd_cleandata %>% filter(cd_cleandata$SNA_NEIGHBORHOOD == "WESTWOOD"|
                       cd_cleandata$SNA_NEIGHBORHOOD == "WEST PRICE HILL"|
                       cd_cleandata$SNA_NEIGHBORHOOD == "EAST PRICE HILL"|
                       cd_cleandata$SNA_NEIGHBORHOOD == "OVER-THE-RHINE"|
                       cd_cleandata$SNA_NEIGHBORHOOD == "AVONDALE") %>%
  group_by(VICTIM_AGE, SNA_NEIGHBORHOOD)%>%
  tally()%>%
  ggplot(aes(x = VICTIM_AGE, y = n, group = SNA_NEIGHBORHOOD, color = SNA_NEIGHBORHOOD)) + 
  geom_line()+
  labs(y = "Number of Crimes",
       x = "Age Group",
       title = "Most Dangerous locations",
       color = "Neighborhoods")

Summary

I worked to analyze the Cincinnati crime dataset to find trends in the data and determine the parameters which lead to the criminal activity in the city.

I addressed this problem by using the data columns like neighborhood and the number of crimes in that area. I also use the offense types to find the fators which lead to those crimes.

The methodology I used was that I created a barplot, map, histogram and trend lines to compare the various factors and thereby gaining the insights.

This analysis helped to find the most dangerous locations in the city which can help the police department to increase the security measures in those areas. We also found out the criminal activity against the different age groups which can help the police to focus people in that age group which seem to be the working class age. We also see that theft is the most comon type of crime and thereby security measure against thefts can be focused more. We also see that the crimes has been reducing over the years which is a good indicator for the police department.

Through my analysis the police department can enforce more security in the nieghborhood Avondale. The city can conduct more awareness for the people in the age group 31-40 about how to be safe and avoid such situations. To reduce the number of thefts various programs can be conducted to make common people aware about how to take care of their belongings.

This project can be further improved by working on the less utilized columns like gender of the victim, weapons used in the crime, bias of the criminal and times of the crime to get more deeper insights.