I’m Vivek Mangipudi. This is my data analysis and visualization submission for the NNSC Data Analytics Intern position.
Crime in Chicago has been tracked by the Chicago Police Department’s Bureau of Records since the beginning of the 20th century. The city’s violent crime rate is substantially higher than the US average. Although national crime rates in the United States have stayed near historic lows, Chicago was reportedly responsible for nearly half of 2016’s increase in homicides in the United States. As of 2017, Chicago’s homicide rate is significantly higher than those of the larger American cities of New York and Los Angeles, but lower than those of smaller American cities. The reasons for the violence, which is localized to some areas of the city, remain unclear; possible explanations include changes in police tactics and an increase in gang rivalry.
In this analysis, I explore the Chicago Crimes Data Set for the years 2001 to 2017, hoping to provide at least some answers to the following questions:
1. How has the number of various crimes changed over time in Chicago?
2. How has the number of arrests corresponding to the crimes changed over time in Chicago?
3. Are there any trends in the crimes being committed?
4. Which crimes are most frequently committed?
5. Which locations are these frequent crimes being committed in?
6. Are there certain high crime neighbourhoods?
7. How has the number of Homicides changed over the years in Chicago?
8. Can interesting, intuitive and interactive visualizations be created to convey the story hidden in the data?
By identifying high crime neighbourhoods, we can work toward a solution by combining the crime data with external demographic, socio-economic, cultural and ethnic data. Such data could help determine whether the violence is being perpetrated by violent gangs, whether these gangs are forming on account of a negligent and incapable government, whether poverty and poor education are indirectly leading to these crimes, or whether social media such as Facebook and YouTube and mainstream cinema and TV shows are to blame for culturally influencing individuals to take up arms. The recurring themes in TV shows, movies and rap songs these days seem to be guns, gangs, gold and girls, all the while glorifying an unrealistic “hustling” lifestyle.
## [1] 7941285 24
| X1 | X1_1 | ID | Case Number | Date | Block | IUCR | Primary Type | Description | Location Description |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 879 | 4786321 | HM399414 | 01/01/2004 12:01:00 AM | 082XX S COLES AVE | 0840 | THEFT | FINANCIAL ID THEFT: OVER $300 | RESIDENCE |
| 2 | 2544 | 4676906 | HM278933 | 03/01/2003 12:00:00 AM | 004XX W 42ND PL | 2825 | OTHER OFFENSE | HARASSMENT BY TELEPHONE | RESIDENCE |
| 3 | 2919 | 4789749 | HM402220 | 06/20/2004 11:00:00 AM | 025XX N KIMBALL AVE | 1752 | OFFENSE INVOLVING CHILDREN | AGG CRIM SEX ABUSE FAM MEMBER | RESIDENCE |
| 4 | 2927 | 4789765 | HM402058 | 12/30/2004 08:00:00 PM | 045XX W MONTANA ST | 0840 | THEFT | FINANCIAL ID THEFT: OVER $300 | OTHER |
| 5 | 3302 | 4677901 | HM275615 | 05/01/2003 01:00:00 AM | 111XX S NORMAL AVE | 0841 | THEFT | FINANCIAL ID THEFT:$300 &UNDER | RESIDENCE |
| 6 | 3633 | 4838048 | HM440266 | 08/01/2004 12:01:00 AM | 012XX S HARDING AVE | 0841 | THEFT | FINANCIAL ID THEFT:$300 &UNDER | APARTMENT |
| 7 | 3756 | 4791194 | HM403711 | 01/01/2001 11:00:00 AM | 114XX S ST LAWRENCE AVE | 0266 | CRIM SEXUAL ASSAULT | PREDATORY | RESIDENCE |
| 8 | 4502 | 4679521 | HM216293 | 03/15/2003 12:00:00 AM | 090XX S RACINE AVE | 5007 | OTHER OFFENSE | OTHER WEAPONS VIOLATION | RESIDENCE PORCH/HALLWAY |
| 9 | 4564 | 4792195 | HM405396 | 09/16/2004 10:00:00 AM | 003XX W HUBBARD ST | 0890 | THEFT | FROM BUILDING | RESIDENCE |
| 10 | 4904 | 4680124 | HM282389 | 01/01/2003 12:00:00 AM | 009XX S SPAULDING AVE | 0840 | THEFT | FINANCIAL ID THEFT: OVER $300 | RESIDENCE |
| Arrest | Domestic | Beat | District | Ward | Community Area | FBI Code | X Coordinate | Y Coordinate | Year |
|---|---|---|---|---|---|---|---|---|---|
| False | False | 424 | 4 | 7 | 46 | 06 | NA | NA | 2004 |
| False | True | 935 | 9 | 11 | 61 | 26 | 1173974 | 1876757 | 2003 |
| False | False | 1413 | 14 | 35 | 22 | 20 | NA | NA | 2004 |
| False | False | 2521 | 25 | 31 | 20 | 06 | NA | NA | 2004 |
| False | False | 2233 | 22 | 34 | 49 | 06 | 1174948 | 1831051 | 2003 |
| False | False | 1011 | 10 | 24 | 29 | 06 | NA | NA | 2004 |
| True | True | 531 | 5 | 9 | 50 | 02 | 1182247 | 1829375 | 2001 |
| False | False | 2222 | 22 | 21 | 73 | 26 | 1169911 | 1844832 | 2003 |
| False | False | 1831 | 18 | 42 | 8 | 06 | NA | NA | 2004 |
| False | False | 1134 | 11 | 24 | 29 | 06 | 1154521 | 1895755 | 2003 |
## X1 X1_1 ID Case Number
## Min. : 1 Min. : 0 Min. : 634 Length:7941285
## 1st Qu.:1985322 1st Qu.:1160283 1st Qu.: 3853210 Class :character
## Median :3970643 Median :2282371 Median : 6165079 Mode :character
## Mean :3970643 Mean :2673858 Mean : 5926071
## 3rd Qu.:5955964 3rd Qu.:4185491 3rd Qu.: 7716590
## Max. :7941285 Max. :6254267 Max. :10827880
## NA's :1
## Date Block IUCR
## Length:7941285 Length:7941285 Length:7941285
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
##
## Primary Type Description Location Description
## Length:7941285 Length:7941285 Length:7941285
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
##
## Arrest Domestic Beat District
## Length:7941285 Length:7941285 Min. : 111 Min. : 1.00
## Class :character Class :character 1st Qu.: 623 1st Qu.: 6.00
## Mode :character Mode :character Median :1111 Median :10.00
## Mean :1198 Mean :11.31
## 3rd Qu.:1732 3rd Qu.:17.00
## Max. :2535 Max. :31.00
## NA's :1 NA's :92
## Ward Community Area FBI Code X Coordinate
## Min. : 1.0 Min. : 0.0 Length:7941285 Min. : 0
## 1st Qu.:10.0 1st Qu.:23.0 Class :character 1st Qu.:1152887
## Median :22.0 Median :32.0 Mode :character Median :1165910
## Mean :22.6 Mean :37.7 Mean :1164456
## 3rd Qu.:34.0 3rd Qu.:58.0 3rd Qu.:1176336
## Max. :50.0 Max. :77.0 Max. :1205119
## NA's :700225 NA's :702092 NA's :105574
## Y Coordinate Year Updated On Latitude
## Min. : 0 Min. :2001 Length:7941285 Min. :36.62
## 1st Qu.:1858997 1st Qu.:2005 Class :character 1st Qu.:41.77
## Median :1890072 Median :2008 Mode :character Median :41.85
## Mean :1885555 Mean :2008 Mean :41.84
## 3rd Qu.:1909373 3rd Qu.:2010 3rd Qu.:41.91
## Max. :1951622 Max. :2017 Max. :42.02
## NA's :105575 NA's :2 NA's :105575
## Longitude Location
## Min. :-91.69 Length:7941285
## 1st Qu.:-87.71 Class :character
## Median :-87.67 Mode :character
## Mean :-87.67
## 3rd Qu.:-87.63
## Max. :-87.52
## NA's :105576
## [1] "THEFT"
## [2] "OTHER OFFENSE"
## [3] "OFFENSE INVOLVING CHILDREN"
## [4] "CRIM SEXUAL ASSAULT"
## [5] "MOTOR VEHICLE THEFT"
## [6] "SEX OFFENSE"
## [7] "DECEPTIVE PRACTICE"
## [8] "BATTERY"
## [9] "BURGLARY"
## [10] "WEAPONS VIOLATION"
## [11] "PUBLIC PEACE VIOLATION"
## [12] "NARCOTICS"
## [13] "GAMBLING"
## [14] "PROSTITUTION"
## [15] "LIQUOR LAW VIOLATION"
## [16] "INTERFERENCE WITH PUBLIC OFFICER"
## [17] "CRIMINAL DAMAGE"
## [18] "ASSAULT"
## [19] "STALKING"
## [20] "ARSON"
## [21] "CRIMINAL TRESPASS"
## [22] "HOMICIDE"
## [23] "ROBBERY"
## [24] "OBSCENITY"
## [25] "KIDNAPPING"
## [26] "INTIMIDATION"
## [27] "RITUALISM"
## [28] "DOMESTIC VIOLENCE"
## [29] "OTHER NARCOTIC VIOLATION"
## [30] "PUBLIC INDECENCY"
## [31] "IUCR"
## [32] "NON-CRIMINAL"
## [33] "HUMAN TRAFFICKING"
## [34] "CONCEALED CARRY LICENSE VIOLATION"
## [35] "NON - CRIMINAL"
## [36] "NON-CRIMINAL (SUBJECT SPECIFIED)"
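The listing above shows 36 distinct values, but two of them are not genuine categories: "IUCR" is presumably a stray header row read as data, and "NON - CRIMINAL" is a spacing variant of "NON-CRIMINAL". Discounting these leaves 34 different primary types of crime.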
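The raw listing of primary types contains a spacing variant ("NON - CRIMINAL") and a value that looks like a column name ("IUCR"), so a small clean-up step is useful before counting categories. The following is a minimal base R sketch, not the code used for this report. The toy vector stands in for the real `Primary Type` column, and treating "IUCR" as a header row read as data is an assumption.

```r
# Toy stand-in for the real `Primary Type` column; values come from the listing above.
primary_type <- c("THEFT", "NON-CRIMINAL", "NON - CRIMINAL", "IUCR", "BATTERY", "THEFT")

# Collapse spacing around hyphens so "NON - CRIMINAL" matches "NON-CRIMINAL".
normalized <- gsub("\\s*-\\s*", "-", primary_type)

# Drop the stray "IUCR" value (assumed to be a header row read as data).
normalized <- normalized[normalized != "IUCR"]

# Number of distinct primary types after cleaning.
n_types <- length(unique(normalized))
print(sort(unique(normalized)))
print(n_types)
```

On the full data set the same two steps would be applied to the real column before any grouping or plotting, so that variant spellings do not appear as separate categories.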
I cannot fully explain why the number of arrests does not equal the number of crimes; there are many possible reasons.
Based on the first plot, we will try to understand how the counts of the top 5 most frequent crimes in Chicago have changed over the years.
Residences are number two on the list, which is rather scary, shocking and surprising!
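The gap between crimes and arrests can be quantified per year. The sketch below uses base R on a small made-up data frame; it assumes, as the summary output suggests, that `Arrest` is stored as the text values "True" and "False". The data frame name `crimes` and the derived column names are illustrative only.

```r
# Toy stand-in for the real data; column names follow the table shown earlier.
crimes <- data.frame(
  Year   = c(2003, 2003, 2004, 2004, 2004),
  Arrest = c("False", "True", "False", "False", "True"),
  stringsAsFactors = FALSE
)

# Arrest is stored as text, so convert it to a logical flag first.
crimes$arrested <- crimes$Arrest == "True"

# Count reported crimes and arrests for each year.
crimes_per_year  <- tapply(crimes$arrested, crimes$Year, length)
arrests_per_year <- tapply(crimes$arrested, crimes$Year, sum)

by_year <- data.frame(
  Year    = as.integer(names(crimes_per_year)),
  crimes  = as.vector(crimes_per_year),
  arrests = as.vector(arrests_per_year)
)

# Share of reported crimes that have an arrest recorded.
by_year$arrest_rate <- by_year$arrests / by_year$crimes
print(by_year)
```

Plotting `arrest_rate` against `Year` would show whether the gap is widening or narrowing over time.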
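One way to obtain yearly counts for the most frequent crime types is to rank the types by total count and then cross-tabulate year against type. This is a hedged base R sketch on made-up rows, not the code behind the plots; the data frame name `crimes` is an assumption, while the column names match the table shown earlier.

```r
# Toy stand-in for the real data; check.names = FALSE keeps the space in the column name.
crimes <- data.frame(
  Year = c(2003, 2003, 2003, 2004, 2004, 2004),
  `Primary Type` = c("THEFT", "THEFT", "BATTERY", "THEFT", "NARCOTICS", "BATTERY"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Rank crime types by total count and keep (up to) the five most frequent.
totals <- sort(table(crimes[["Primary Type"]]), decreasing = TRUE)
top_types <- names(head(totals, 5))

# Cross-tabulate year by type, restricted to the top types.
top <- crimes[crimes[["Primary Type"]] %in% top_types, ]
counts_by_year <- table(top$Year, top[["Primary Type"]])
print(counts_by_year)
```

Each column of `counts_by_year` is then a yearly series for one crime type, ready to be plotted as a line.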
Observations
Conclusion
Bringing peace back to Chicago is no easy task. It is possible only through dedicated and coordinated efforts between law enforcement agencies and the communities involved.
Additional Notes
All the plots are interactive.
This analysis provides a comprehensive overview based on historical data.
A more in-depth analysis can be carried out depending on the questions at hand.
For instance, if we are trying to reduce homicides, we can analyse the data to look at the places and location types where most homicides take place, who the victims were, what kind of people the perpetrators were, and at what time of day and week these crimes were committed.
Based on this information we could then arrive at a practical solution.
Analyses based on day of week and hour of day were left out as they are rather simple to achieve. A future extension would be to integrate these plots into a website or build a tool using R Shiny or the like.
A separate document/link is provided for interactively exploring high crime neighbourhoods, as there seems to be a conflict between the leaflet package, xts time series objects and some other package.
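For reference, the day of week and hour of day can be derived from the `Date` strings roughly as follows. This is a minimal base R sketch using three timestamps copied from the sample rows shown earlier. Parsing in UTC is a convenience assumption that sidesteps daylight-saving gaps and leaves the recorded clock times unchanged, and `%p` relies on a locale that recognises AM/PM.

```r
# Three timestamps taken from the sample rows of the data set.
dates <- c("01/01/2004 12:01:00 AM",
           "06/20/2004 11:00:00 AM",
           "12/30/2004 08:00:00 PM")

# Parse the 12-hour timestamps; UTC is used only to avoid daylight-saving gaps.
parsed <- as.POSIXlt(dates, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

hour_of_day <- parsed$hour  # 0 to 23
day_of_week <- parsed$wday  # 0 = Sunday, ..., 6 = Saturday

# Frequency of incidents by weekday and by hour.
print(table(day_of_week))
print(table(hour_of_day))
```

The two frequency tables can be fed directly into bar plots, or combined with `table(day_of_week, hour_of_day)` for a heat map.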
Vivek Mangipudi. July 28, 2017.
Links to mapping (part 2):
http://rpubs.com/stanspwan/chicagoNarco
http://rpubs.com/stanspwan/chicagobattery
http://rpubs.com/stanspwan/chicagoAssault
http://rpubs.com/stanspwan/chicagoDAMAGE
All these maps are interactive and zoomable up to street level.
Link to data : goo.gl/sRKHwe