Introduction

In this case study, we analyze crime trends in the city of Chicago over the period 2001-2018. The dataset was downloaded from the City of Chicago data portal, https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2. The data at this link is updated continuously, but for the present analysis we cap it at the end of 2018. Every record describes a criminal incident through a set of features, including the date of the incident, the type of crime, a description of the location, whether the incident was domestic, and the police district.

A few questions worth investigating include, but are not limited to: the yearly trend of crime rates in Chicago, the types of crime that are most prevalent, and the most notorious crime locations. We plan to start the analysis with these goals, but it is worth digressing if an interesting phenomenon is observed.

Preprocessing the Data

The original dataset had close to 7 million records and 22 features. One of the first things to check is the completeness of the records and how to handle missing values. Approximately half a million rows were not completely populated. Since the original data has about 7 million records, we can afford to delete all rows with missing values without losing much information.
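The row-wise deletion described above can be sketched in base R as follows. This is only an illustration on a made-up toy data frame; the column names follow the tables shown later in this report, and the values are invented.

```r
# Toy stand-in for the crime data; all values are made up for illustration.
# For the real CSV, read.csv(..., na.strings = c("", "NA")) would make empty
# fields count as missing as well.
crimes <- data.frame(
  Primary.Type = c("THEFT", "BATTERY", NA, "ASSAULT"),
  Location.Description = c("STREET", NA, "RESIDENCE", "STREET"),
  District = c(7, 8, 5, NA),
  stringsAsFactors = FALSE
)

# Count the rows that have at least one missing field.
n_incomplete <- sum(!complete.cases(crimes))

# Keep only the fully populated rows.
crimes_clean <- crimes[complete.cases(crimes), ]

cat("Incomplete rows:", n_incomplete, "\n")
cat("Rows kept:", nrow(crimes_clean), "\n")
```

Checking the count of incomplete rows before dropping them makes it easy to confirm that the loss is small relative to the full dataset.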

Another transformation extracts the day of the week and the day of the month from the date of the incident, which helps us identify weekly trends in crime rates. Along the same lines, we use the date column to identify the time of day when the crime occurred. As a result, we added columns for the day of the week, the day of the month, and the time of day.
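A minimal sketch of this date transformation is given below. It assumes the date strings look like "01/05/2003 08:30:00 AM"; that format, the example dates, and the column names weekday, day_of_month, and hour are assumptions made for illustration.

```r
# Two made-up incident timestamps in the assumed "MM/DD/YYYY HH:MM:SS AM/PM" format.
dates <- c("01/05/2003 08:30:00 AM", "07/19/2003 11:15:00 PM")

# Parse into POSIXlt so that the calendar components are directly accessible.
# Note: %p (AM/PM) is parsed according to the current locale.
parsed <- as.POSIXlt(dates, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# wday runs from 0 (Sunday) to 6 (Saturday); a fixed lookup vector avoids
# locale-dependent day names.
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")

features <- data.frame(
  weekday = day_names[parsed$wday + 1],
  day_of_month = parsed$mday,
  hour = parsed$hour,
  stringsAsFactors = FALSE
)
print(features)
```

Here the hour of the day stands in for the "time of day" column; a coarser bucketing (morning, afternoon, and so on) could be derived from it if needed.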

Data Exploration

To begin the EDA, we look at the yearly trend in the number of crimes committed.

It is easy to see that 2003 was the most notorious year in the recent history of Chicago. The good news is that the crime rate seems to decrease as time progresses, so Chicago residents should feel relatively safer in recent years. The year 2019 had just started when the data was retrieved, which explains its small numbers.

Earlier, we saw that 2003 was the worst year for crime in Chicago since 2001. The number of crimes jumped from about 350,000 in 2002 to more than 450,000 in 2003. It is worth investigating which types of crime were most frequent in 2003.

We observe that Theft and Battery were the two most frequent crimes in 2003. Before we dig into these two types of crime in 2003, it is interesting to compare the counts of the various crime types between 2002 and 2003. This may help us see which crimes drove this significant increase.

We observe that the count has increased for every category from 2002 to 2003. It will be more useful to check the percentage change for each crime type. Since we have nineteen crime types, a dotchart is preferred to a barchart.
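The percentage change per crime type could be computed along the following lines. This is a hedged sketch on invented toy data; the Year column name is an assumption, and the real analysis would use all nineteen crime types.

```r
# Toy records; the years and crime types are made up for illustration.
crimes <- data.frame(
  Year = c(2002, 2002, 2002, 2003, 2003, 2003, 2003, 2003),
  Primary.Type = c("THEFT", "THEFT", "BATTERY",
                   "THEFT", "THEFT", "THEFT", "BATTERY", "BATTERY"),
  stringsAsFactors = FALSE
)

# Cross-tabulate crime type against year.
counts <- table(crimes$Primary.Type, crimes$Year)

# Percentage change from 2002 to 2003 for each crime type.
pct_change <- 100 * (counts[, "2003"] - counts[, "2002"]) / counts[, "2002"]
pct_change <- sort(pct_change)
print(pct_change)

# A dot chart stays readable with many categories.
dotchart(as.numeric(pct_change), labels = names(pct_change),
         xlab = "Percent change, 2002 to 2003")
```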

It is unfortunate to see that almost all types of crime increased significantly in 2003, making it the worst year in this dataset. We now return to our previous goal: examining the behavior (i.e., trend) of all types of crime in 2003 only. One approach is to look at the daily trend combined with the time of day.

This graph is loaded with information, as expected. Here are a few fairly obvious observations:

Notorious Locations of the Activities

Earlier, we identified the times at which different criminal activities occurred. It is also useful to identify the locations where these activities occurred. Some crimes may occur more frequently at a few spots than at other locations, just as some crimes are more likely to occur at certain times of the day. To this end, we first look at the raw count of crimes committed at different locations.

There are 180 distinct locations in total, so the plot is too crowded to read. To extract more information, we select the top 10 locations by crime count.
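Selecting the most frequent locations can be sketched with a sorted frequency table. The location values below are invented, and the sketch keeps the top 2 rather than the top 10 only because the toy data is small.

```r
# Made-up location descriptions for illustration.
locations <- c("STREET", "STREET", "STREET", "RESIDENCE", "RESIDENCE",
               "SIDEWALK", "APARTMENT")

# Count crimes per location and sort from most to least frequent.
location_counts <- sort(table(locations), decreasing = TRUE)

# Keep the top N locations (N = 10 in the analysis, 2 here).
top_n <- 2
top_locations <- head(location_counts, top_n)
print(top_locations)
```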

Next, we identify the most frequent crimes committed at these locations.

We see that Street was one of the most notorious spots in 2003. All types of crime occur on the street. It is followed by Residence, where Theft is the most frequent criminal activity.

Criminal Activities at Residence

For convenience, we group the crimes committed at the following locations under the umbrella of Residence: RESIDENCE PORCH/HALLWAY, RESIDENCE-GARAGE, and RESIDENTIAL YARD (FRONT/BACK).
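One way to perform this grouping is to relabel the listed location descriptions. The sketch below uses a made-up vector of locations; the three labels are the ones named above.

```r
# Location labels to fold into the RESIDENCE umbrella (taken from the text above).
residence_labels <- c("RESIDENCE PORCH/HALLWAY", "RESIDENCE-GARAGE",
                      "RESIDENTIAL YARD (FRONT/BACK)")

# Made-up location descriptions for illustration.
locations <- c("RESIDENCE", "RESIDENCE-GARAGE", "STREET",
               "RESIDENTIAL YARD (FRONT/BACK)", "RESIDENCE PORCH/HALLWAY")

# Relabel the listed locations and leave everything else untouched.
grouped <- ifelse(locations %in% residence_labels, "RESIDENCE", locations)
print(table(grouped))
```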

As we did for Street, we first identify the top 10 crimes that occurred at residences in 2003.

Top 10 Types of Crimes Occurring at Residence in 2003
Primary.Type crime_count
BATTERY 22309
OTHER OFFENSE 16247
CRIMINAL DAMAGE 13870
BURGLARY 13743
THEFT 13541
ASSAULT 7071
CRIMINAL TRESPASS 3131
DECEPTIVE PRACTICE 2895
NARCOTICS 2754
OFFENSE INVOLVING CHILDREN 1697

As with the Street related crimes, we look at the percentage of each crime that occurred on a given day and at a given time of day. To this end, we group the data by crime type, weekday, and time of day. We do this aggregation only for the top 10 residence related crimes. Then we compute the percentage of a particular type of crime committed at a particular time and day.
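The grouping and percentage computation can be sketched in base R as follows. The data is invented, and the column names weekday and hour are assumptions; each percentage is taken relative to the total count of that crime type.

```r
# Made-up residence crimes with the derived weekday and hour columns.
crimes <- data.frame(
  Primary.Type = c("BURGLARY", "BURGLARY", "BURGLARY", "BURGLARY",
                   "THEFT", "THEFT"),
  weekday = c("Monday", "Monday", "Monday", "Saturday", "Monday", "Sunday"),
  hour = c(9, 9, 10, 22, 9, 14),
  stringsAsFactors = FALSE
)

# Count incidents per (crime type, weekday, hour) combination.
crimes$n <- 1
counts <- aggregate(n ~ Primary.Type + weekday + hour, data = crimes, FUN = sum)

# Share of each combination within its crime type, in percent.
type_totals <- ave(counts$n, counts$Primary.Type, FUN = sum)
counts$crime_percent <- 100 * counts$n / type_totals
print(counts)
```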

From the above figure, a few observations stand out. Burglary and Theft were frequent at residences on weekday mornings, which is expected since residents are out at work at that time. Besides these two crimes, we do not see much of a pattern in the other criminal activities. Another question we can answer is how, on a given weekday, the percentage of residence related crimes changes with time. To this end, we first find the total crime count at a given time and day. Then we do the same computation for residence related criminal activities and join the two results on time and day.

Once we have the merged data, we compute the percentage of residence related crimes at a given time on a particular day. Then we visualize it using an area chart.
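A possible shape for this join, using base R's merge on toy data, is shown below. The weekday and hour column names and all values are assumptions for illustration.

```r
# Made-up incidents with weekday, hour, and location.
crimes <- data.frame(
  weekday = c("Monday", "Monday", "Monday", "Monday", "Saturday", "Saturday"),
  hour = c(9, 9, 9, 9, 22, 22),
  Location.Description = c("RESIDENCE", "RESIDENCE", "RESIDENCE",
                           "STREET", "STREET", "STREET"),
  stringsAsFactors = FALSE
)
crimes$n <- 1

# Total crime count per (weekday, hour).
all_counts <- aggregate(n ~ weekday + hour, data = crimes, FUN = sum)

# Residence-only crime count per (weekday, hour).
residence <- crimes[crimes$Location.Description == "RESIDENCE", ]
res_counts <- aggregate(n ~ weekday + hour, data = residence, FUN = sum)

# Left join on weekday and hour; slots without residence crimes get a count of 0.
merged <- merge(all_counts, res_counts, by = c("weekday", "hour"),
                all.x = TRUE, suffixes = c("_all", "_res"))
merged$n_res[is.na(merged$n_res)] <- 0

merged$residence_percent <- 100 * merged$n_res / merged$n_all
print(merged)
```

The left join (all.x = TRUE) keeps time slots in which no residence crime was recorded, so they appear as 0 percent rather than silently dropping out of the chart.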

Crimes at residential properties peak in the mornings, mostly on weekdays. Other than that, we do not see any noticeable trends. It is more interesting to look at daily trends for the most frequent residence related crimes.

A few observations from the above line plot: Burglary and Theft peak between 7 a.m. and 2 p.m., since people are at work, and Criminal Trespassing also peaks during the morning, since houses are relatively empty.

An interesting observation is that Battery related crimes are significantly more frequent than other crimes on residential properties. Legally, Battery is broadly defined as any unlawful, harmful or offensive physical contact with another person. One question to consider is whether the Battery crimes are related to Domestic violence, which mostly involves close family members living together. Before checking that, we should answer the following question: do most of the crimes occurring at Residence result from Domestic violence?

Percent of Domestic Violence Crimes at Residences in 2003
Location.Description Domestic crime_percent
RESIDENCE false 76.44
RESIDENCE true 23.56

Only about 24 percent of the crimes occurring at residences are related to Domestic violence. Now we return to the previous question: what percentage of Battery related crimes results from Domestic problems? Rather than analyzing this only for Battery, we compute the share of Domestic incidents for each of the top 10 residential crimes in 2003.
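The per-type Domestic share can be sketched with tapply. The toy data below is invented; the Domestic flag is assumed to be stored as the strings "true" and "false", as in the table above.

```r
# Made-up residence crimes with a Domestic flag stored as text.
crimes <- data.frame(
  Primary.Type = c("BATTERY", "BATTERY", "BATTERY", "BATTERY", "THEFT", "THEFT"),
  Domestic = c("true", "true", "true", "false", "false", "false"),
  stringsAsFactors = FALSE
)

# Convert the text flag to a logical vector.
is_domestic <- crimes$Domestic == "true"

# The mean of a logical vector is the fraction of TRUE values,
# so this gives the Domestic share per crime type in percent.
domestic_percent <- 100 * tapply(is_domestic, crimes$Primary.Type, mean)
print(domestic_percent)
```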

It is informative to see that domestic problems contributed significantly to a few types of crime, such as Offense Involving Children and Assault. Offenses against children are likely to arise from domestic problems, so it makes sense that almost half of the offenses involving children are related to Domestic problems.

Now that we have identified the crime trend at Residence, it is useful to identify the areas of Chicago with widespread Residence related crimes in 2003. We use the same approach as for the Street related crimes above.

Was a particular type of criminal activity more prevalent on Residential properties in a particular region of Chicago in 2003? To this end we perform faceting by type of crime.

We observe that Burglary and Theft are spread across residences throughout Chicago. Assault is mostly concentrated in residences of Southern Chicago. To confirm our observations, we identify the districts of Chicago with a heavy concentration of Battery related crimes in residences.

Count of Battery Related Crimes at Residences Across Various Districts in 2003
District crime_count
7 2123
8 1908
5 1900
4 1631
6 1606
11 1508
9 1459
25 1299
15 1133
22 1122

The residences in districts 7, 8, and 5 witnessed a heavy concentration of Battery related crimes in 2003. On the Chicago district map, these districts are located in Central and South Chicago. This result corroborates our observations from the above figure.

To summarize, Battery and Theft were the most prominent crimes that occurred on Residential properties in 2003. First, the majority of the Battery related crimes occurred because of Domestic issues, which may include physical or verbal harassment of other residents. Second, Battery activities were concentrated mostly in South and Central Chicago. Third, regarding the timing of the activities, Theft and Burglary occurred mostly during the morning hours on weekdays, since residents are mostly at work.

Distribution of Different Types of Crimes in 2003

In the previous two sections, we explored the trend of criminal activities at the two most prominent locations in Chicago in 2003: Street and Residence. In this section, we broadly classify the criminal activities into two categories, Violent and Property crimes. Violent crimes include Battery, Assault, and Robbery. Property crimes include Theft, Burglary, and Motor Vehicle Theft. Once we have classified these activities, we compare the proportions of the two categories using a stacked bar chart.
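This classification can be sketched as a simple lookup. The crime types below are a made-up sample; types outside the two lists are left unclassified (NA) and drop out of the proportions.

```r
# Category definitions as stated in the text above.
violent_types <- c("BATTERY", "ASSAULT", "ROBBERY")
property_types <- c("THEFT", "BURGLARY", "MOTOR VEHICLE THEFT")

# Made-up sample of crime types for illustration.
types <- c("BATTERY", "THEFT", "THEFT", "ASSAULT", "BURGLARY",
           "NARCOTICS", "ROBBERY", "MOTOR VEHICLE THEFT")

# Assign each record to a category; anything else becomes NA.
category <- ifelse(types %in% violent_types, "Violent",
                   ifelse(types %in% property_types, "Property", NA))

# Percentage share of each category (table() ignores NA by default).
shares <- 100 * prop.table(table(category))
print(shares)

# A single stacked bar showing the two shares.
barplot(matrix(as.numeric(shares), ncol = 1,
               dimnames = list(names(shares), "2003")),
        legend.text = TRUE, ylab = "Percent of classified crimes")
```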

Roughly equal proportions of crimes in 2003 belong to the two categories, Property and Violent, with Property crimes leading by a small margin. Next, we explore which crime contributes the most within each category.

To this end, we create two subsets of the data, one for Property crimes and the other for Violent crimes. Then we compute the percentage of each crime type within its category.

Earlier, we observed that Theft and Battery were the two most frequent crimes in 2003. The above stacked bar chart confirms this, showing Theft as the strongest contributor to Property crimes and Battery as the strongest contributor to Violent crimes. Now that we have analyzed the crime trend for 2003 in significant detail, it is time to move on to another interesting question. We saw that Streets and Residences were relatively less safe than other locations in 2003. But what matters is whether the crime rate in the city of Chicago is falling with time.

Are Chicago Streets Getting Safer?

Earlier, we saw that Street was the most notorious spot for crimes in 2003. It exceeded other crime locations by a significant margin. Now we check the yearly trend of the crime rate on the streets of Chicago. One way to do so is to compute the percentage of crimes committed on streets each year.

For a few crimes, the location description is not available, so they are filtered out of the present analysis. Next, we filter the Street related crimes and visualize their behavior using a line chart.
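The filtering and the yearly Street share can be sketched as follows, on invented data. The Year column name is an assumption, and both NA and empty strings are treated as a missing location description.

```r
# Made-up records; some have no location description.
crimes <- data.frame(
  Year = c(2003, 2003, 2003, 2003, 2004, 2004, 2004, 2004, 2004),
  Location.Description = c("STREET", "STREET", "RESIDENCE", NA,
                           "STREET", "RESIDENCE", "RESIDENCE", "", "SIDEWALK"),
  stringsAsFactors = FALSE
)

# Drop records without a usable location description.
has_location <- !is.na(crimes$Location.Description) &
  crimes$Location.Description != ""
known <- crimes[has_location, ]

# Percentage of crimes committed on the street, per year.
street_percent <- 100 * tapply(known$Location.Description == "STREET",
                               known$Year, mean)
print(street_percent)

# Line chart of the yearly Street share.
plot(as.integer(names(street_percent)), as.numeric(street_percent),
     type = "l", xlab = "Year", ylab = "Percent of crimes on streets")
```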

Overall, the streets of Chicago are getting safer. Another way to check is to plot the monthly average number of crimes occurring on the streets of Chicago.

This graph pretty much corroborates our earlier finding: street crimes are decreasing in Chicago. As a digression, Chicago is known for some of the most severe winters in the Midwest. Therefore, it is worth checking the crime rate on the streets of Chicago on a monthly basis.

The crime rate is comparatively lower during the winter, as expected. This can be attributed to less activity on the streets than in the summer. We saw that streets in Chicago are becoming relatively safer. Are particular types of street related crime declining the most, or are all of them decreasing? We try to answer this question for the top 10 crimes that occurred on the streets of Chicago in 2003.

Above, we see that, except for Theft, all types of criminal activity on the streets of Chicago have declined with time. That explains the steady decline of overall crime on Chicago streets, making them relatively safer. That is good news for Chicago residents.

Visualization of Crime Distribution Using Heatmap

In the above sections, we found interesting facts that make sense: most of the observations agree with the actual behavior of different types of crime in the city of Chicago. Next, we plan to spice things up. It is interesting to estimate the probability of a particular crime type occurring at a given time of day, at a given location, and so on. We customize the appearance using our own palette, my_palette <- colorRampPalette(c("red", "yellow", "green"))(n = 299).

To begin with, we present the likelihood of a particular type of crime occurring at a particular time of day. Note that this part of the analysis uses the entire dataset. We write a function to compute the percentage of crimes of a particular type, grouped by a particular feature such as time or location.
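Such a function might look like the sketch below. The function name crime_percent_by, the hour column, and the toy data are hypothetical; the sketch uses base R's heatmap for plotting, which is not necessarily the function used in the original analysis, together with the palette defined above.

```r
# Hypothetical helper: for each crime type, the percentage of its incidents
# that fall into each level of the chosen feature (rows sum to 100).
crime_percent_by <- function(data, feature) {
  counts <- table(data$Primary.Type, data[[feature]])
  unclass(100 * prop.table(counts, margin = 1))
}

# Made-up incidents for illustration.
crimes <- data.frame(
  Primary.Type = c("PROSTITUTION", "PROSTITUTION", "PROSTITUTION",
                   "THEFT", "THEFT", "THEFT", "THEFT"),
  hour = c(22, 23, 23, 9, 9, 14, 22),
  stringsAsFactors = FALSE
)

pct_by_hour <- crime_percent_by(crimes, "hour")
print(round(pct_by_hour, 1))

# Palette as defined in the text.
my_palette <- colorRampPalette(c("red", "yellow", "green"))(n = 299)

# Plain heat map without clustering or rescaling.
heatmap(pct_by_hour, Rowv = NA, Colv = NA, scale = "none", col = my_palette)
```

Normalizing within each row makes crime types with very different total counts comparable, since every row shows how a single crime type is distributed over the chosen feature.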

We first visualize the distribution of different types of crime with respect to the time of day, using a heat map.

As we saw earlier, Prostitution related activities were rampant at night. At the same time, liquor law and weapons law violations were most frequent during late night hours. Along the same lines, we would like to see which crimes are more prevalent on weekdays than on weekends.

As expected, liquor law is violated most often toward the weekend, especially on Friday and Saturday. Similarly, Burglary and Robbery are more frequent on weekdays, since they are more prevalent on residential properties and residents are out at work. Last, we analyze how probable a given crime type is to occur at a given location. Since there are 180 distinct locations, the full heatmap would be difficult to visualize, so we select the top 20 locations by crime count.

The above heatmap supports our earlier observations: Prostitution was prevalent on Streets, as expected; motor vehicles parked on streets are much more prone to theft; and offenses involving children are concentrated on Residential properties. In addition, Burglary related activities occurred mostly in apartments and residences.

Discussion

We have analyzed the trend of criminal activities in the city of Chicago during the period 2001 to 2018. According to this dataset, 2003 was the worst year. Theft and Battery were the most frequent criminal activities in 2003, although all types of criminal activity registered a significant increase from 2002 to 2003. Street turned out to be the most unsafe place in terms of crime count, followed by Residence.

Theft and Narcotics related crimes were the leading crimes on the streets of Chicago in 2003, followed by Battery incidents. Digging a bit deeper, we found that Narcotics related activities dominated the streets of Western Chicago. Battery related activities were also prominent on Residential properties, with the majority of Battery incidents occurring in residences of South and Central Chicago. According to this data, Domestic violence contributed significantly to residential crimes.

Though Chicago streets were highly unsafe in 2003, residents should be feeling safer lately. The overall crime rate on streets has been declining at a decent pace. Moreover, street crimes peaked during the summer months, while the lower winter counts can possibly be attributed to severe weather conditions.

Toward the end, we identified the likelihood of a particular type of crime occurring at a given time of day or at a given location. Prostitution and Liquor law violations, as expected, are more likely to occur on the streets late in the evening. Similarly, Burglary and Robbery are more likely to occur on weekdays, since they are more prevalent on residential properties and residents are out at work.

For future analysis, it would be worthwhile to identify unsafe communities in the city of Chicago. This could help people looking to rent or buy property, and it could also help property owners price their property reasonably. Moreover, it would be useful to have a column in the dataset indicating whether the crime committed was a hate crime.