Introduction:

I’m Vivek Mangipudi. This is my data analysis and visualization submission for the NNSC Data Analytics Intern position.

Preliminary :

Crime in Chicago has been tracked by the Chicago Police Department’s Bureau of Records since the beginning of the 20th century. The city’s violent crime rate is substantially higher than the US average. Although national crime rates in the United States have stayed near historic lows, Chicago was reportedly responsible for nearly half of 2016’s increase in homicides in the United States. As of 2017, Chicago’s homicide rate is significantly higher than those of the larger American cities of New York and Los Angeles, but lower than those of some smaller American cities. The reasons for the violence, which is localized to some areas of the city, remain unclear; suggested causes include changes in police tactics and increased gang rivalry.

In this analysis, I explore the Chicago Crimes data set for the years 2001 to 2017, hoping to provide at least some answers to the following questions:
1. How has the number of various crimes changed over time in Chicago?
2. How has the number of arrests corresponding to the crimes changed over time in Chicago?
3. Are there any trends in the crimes being committed?
4. Which crimes are most frequently committed?
5. Which locations are these frequent crimes being committed in?
6. Are there certain high crime neighbourhoods?
7. How has the number of Homicides changed over the years in Chicago?
8. Can interesting, intuitive and interactive visualizations be created to convey the story hidden in the data?

By identifying high crime neighbourhoods, we can work towards a solution by combining the crime data with external demographic, socio-economic, cultural and ethnic data. Such data could help determine whether the violence is being perpetrated by violent gangs, whether these gangs form on account of a negligent and incapable government, whether poverty and poor education indirectly lead to these crimes, or whether social media such as Facebook and YouTube and mainstream cinema and TV shows are to blame for culturally influencing individuals to take up arms. Recurring themes in TV shows, movies and rap songs these days seem to be guns, gangs, gold and girls, all the while glorifying an unrealistic “hustling” lifestyle.

Importance :

  1. The high number of homicides in Chicago does not reflect well on the law enforcement agencies and the government.
  2. By carrying out an in-depth analysis, one can try to identify the root cause of the homicides and violence and then arrive at a solution that reduces bloodshed and helps bring peace to a city that has long been traumatized by violent crime.
  3. Once a solution is found, it could, to some degree, be adopted by other cities that are prone to high rates of crime.

Data Exploration :

Size of Data :

## [1] 7941285      24

Sneak Peek into Data :

X1 | X1_1 | ID | Case Number | Date | Block | IUCR | Primary Type | Description | Location Description
1 | 879 | 4786321 | HM399414 | 01/01/2004 12:01:00 AM | 082XX S COLES AVE | 0840 | THEFT | FINANCIAL ID THEFT: OVER $300 | RESIDENCE
2 | 2544 | 4676906 | HM278933 | 03/01/2003 12:00:00 AM | 004XX W 42ND PL | 2825 | OTHER OFFENSE | HARASSMENT BY TELEPHONE | RESIDENCE
3 | 2919 | 4789749 | HM402220 | 06/20/2004 11:00:00 AM | 025XX N KIMBALL AVE | 1752 | OFFENSE INVOLVING CHILDREN | AGG CRIM SEX ABUSE FAM MEMBER | RESIDENCE
4 | 2927 | 4789765 | HM402058 | 12/30/2004 08:00:00 PM | 045XX W MONTANA ST | 0840 | THEFT | FINANCIAL ID THEFT: OVER $300 | OTHER
5 | 3302 | 4677901 | HM275615 | 05/01/2003 01:00:00 AM | 111XX S NORMAL AVE | 0841 | THEFT | FINANCIAL ID THEFT:$300 &UNDER | RESIDENCE
6 | 3633 | 4838048 | HM440266 | 08/01/2004 12:01:00 AM | 012XX S HARDING AVE | 0841 | THEFT | FINANCIAL ID THEFT:$300 &UNDER | APARTMENT
7 | 3756 | 4791194 | HM403711 | 01/01/2001 11:00:00 AM | 114XX S ST LAWRENCE AVE | 0266 | CRIM SEXUAL ASSAULT | PREDATORY | RESIDENCE
8 | 4502 | 4679521 | HM216293 | 03/15/2003 12:00:00 AM | 090XX S RACINE AVE | 5007 | OTHER OFFENSE | OTHER WEAPONS VIOLATION | RESIDENCE PORCH/HALLWAY
9 | 4564 | 4792195 | HM405396 | 09/16/2004 10:00:00 AM | 003XX W HUBBARD ST | 0890 | THEFT | FROM BUILDING | RESIDENCE
10 | 4904 | 4680124 | HM282389 | 01/01/2003 12:00:00 AM | 009XX S SPAULDING AVE | 0840 | THEFT | FINANCIAL ID THEFT: OVER $300 | RESIDENCE


Arrest | Domestic | Beat | District | Ward | Community Area | FBI Code | X Coordinate | Y Coordinate | Year
False | False | 424 | 4 | 7 | 46 | 06 | NA | NA | 2004
False | True | 935 | 9 | 11 | 61 | 26 | 1173974 | 1876757 | 2003
False | False | 1413 | 14 | 35 | 22 | 20 | NA | NA | 2004
False | False | 2521 | 25 | 31 | 20 | 06 | NA | NA | 2004
False | False | 2233 | 22 | 34 | 49 | 06 | 1174948 | 1831051 | 2003
False | False | 1011 | 10 | 24 | 29 | 06 | NA | NA | 2004
True | True | 531 | 5 | 9 | 50 | 02 | 1182247 | 1829375 | 2001
False | False | 2222 | 22 | 21 | 73 | 26 | 1169911 | 1844832 | 2003
False | False | 1831 | 18 | 42 | 8 | 06 | NA | NA | 2004
False | False | 1134 | 11 | 24 | 29 | 06 | 1154521 | 1895755 | 2003

Data Distribution and Summary :

##        X1               X1_1               ID           Case Number       
##  Min.   :      1   Min.   :      0   Min.   :     634   Length:7941285    
##  1st Qu.:1985322   1st Qu.:1160283   1st Qu.: 3853210   Class :character  
##  Median :3970643   Median :2282371   Median : 6165079   Mode  :character  
##  Mean   :3970643   Mean   :2673858   Mean   : 5926071                     
##  3rd Qu.:5955964   3rd Qu.:4185491   3rd Qu.: 7716590                     
##  Max.   :7941285   Max.   :6254267   Max.   :10827880                     
##                                      NA's   :1                            
##      Date              Block               IUCR          
##  Length:7941285     Length:7941285     Length:7941285    
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##  Primary Type       Description        Location Description
##  Length:7941285     Length:7941285     Length:7941285      
##  Class :character   Class :character   Class :character    
##  Mode  :character   Mode  :character   Mode  :character    
##                                                            
##                                                            
##                                                            
##                                                            
##     Arrest            Domestic              Beat         District    
##  Length:7941285     Length:7941285     Min.   : 111   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.: 623   1st Qu.: 6.00  
##  Mode  :character   Mode  :character   Median :1111   Median :10.00  
##                                        Mean   :1198   Mean   :11.31  
##                                        3rd Qu.:1732   3rd Qu.:17.00  
##                                        Max.   :2535   Max.   :31.00  
##                                        NA's   :1      NA's   :92     
##       Ward        Community Area     FBI Code          X Coordinate    
##  Min.   : 1.0     Min.   : 0.0     Length:7941285     Min.   :      0  
##  1st Qu.:10.0     1st Qu.:23.0     Class :character   1st Qu.:1152887  
##  Median :22.0     Median :32.0     Mode  :character   Median :1165910  
##  Mean   :22.6     Mean   :37.7                        Mean   :1164456  
##  3rd Qu.:34.0     3rd Qu.:58.0                        3rd Qu.:1176336  
##  Max.   :50.0     Max.   :77.0                        Max.   :1205119  
##  NA's   :700225   NA's   :702092                      NA's   :105574   
##   Y Coordinate          Year       Updated On           Latitude     
##  Min.   :      0   Min.   :2001   Length:7941285     Min.   :36.62   
##  1st Qu.:1858997   1st Qu.:2005   Class :character   1st Qu.:41.77   
##  Median :1890072   Median :2008   Mode  :character   Median :41.85   
##  Mean   :1885555   Mean   :2008                      Mean   :41.84   
##  3rd Qu.:1909373   3rd Qu.:2010                      3rd Qu.:41.91   
##  Max.   :1951622   Max.   :2017                      Max.   :42.02   
##  NA's   :105575    NA's   :2                         NA's   :105575  
##    Longitude        Location        
##  Min.   :-91.69   Length:7941285    
##  1st Qu.:-87.71   Class :character  
##  Median :-87.67   Mode  :character  
##  Mean   :-87.67                     
##  3rd Qu.:-87.63                     
##  Max.   :-87.52                     
##  NA's   :105576

The various types of crimes are:

##  [1] "THEFT"                            
##  [2] "OTHER OFFENSE"                    
##  [3] "OFFENSE INVOLVING CHILDREN"       
##  [4] "CRIM SEXUAL ASSAULT"              
##  [5] "MOTOR VEHICLE THEFT"              
##  [6] "SEX OFFENSE"                      
##  [7] "DECEPTIVE PRACTICE"               
##  [8] "BATTERY"                          
##  [9] "BURGLARY"                         
## [10] "WEAPONS VIOLATION"                
## [11] "PUBLIC PEACE VIOLATION"           
## [12] "NARCOTICS"                        
## [13] "GAMBLING"                         
## [14] "PROSTITUTION"                     
## [15] "LIQUOR LAW VIOLATION"             
## [16] "INTERFERENCE WITH PUBLIC OFFICER" 
## [17] "CRIMINAL DAMAGE"                  
## [18] "ASSAULT"                          
## [19] "STALKING"                         
## [20] "ARSON"                            
## [21] "CRIMINAL TRESPASS"                
## [22] "HOMICIDE"                         
## [23] "ROBBERY"                          
## [24] "OBSCENITY"                        
## [25] "KIDNAPPING"                       
## [26] "INTIMIDATION"                     
## [27] "RITUALISM"                        
## [28] "DOMESTIC VIOLENCE"                
## [29] "OTHER NARCOTIC VIOLATION"         
## [30] "PUBLIC INDECENCY"                 
## [31] "IUCR"                             
## [32] "NON-CRIMINAL"                     
## [33] "HUMAN TRAFFICKING"                
## [34] "CONCEALED CARRY LICENSE VIOLATION"
## [35] "NON - CRIMINAL"                   
## [36] "NON-CRIMINAL (SUBJECT SPECIFIED)"

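As seen, the output lists 36 distinct values. Excluding the stray "IUCR" entry (apparently a header row read in as data) and treating "NON - CRIMINAL" as a spelling variant of "NON-CRIMINAL" leaves 34 different primary types of crime.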
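
The listing above contains near-duplicate spellings such as "NON-CRIMINAL" and "NON - CRIMINAL", and an "IUCR" entry that looks like a column name rather than a crime type. As a minimal sketch of how such values could be cleaned before counting categories, the following Python snippet normalizes the spacing around hyphens and drops header-like values. It is a standalone illustration, not the code used for this report; the function names `normalize_type` and `distinct_types` and the list of header-like values are assumptions.

```python
import re

def normalize_type(raw):
    """Trim, upper-case and remove spaces around hyphens so spelling variants match."""
    return re.sub(r"\s*-\s*", "-", raw.strip().upper())

def distinct_types(values, header_names=("IUCR", "PRIMARY TYPE")):
    """Return the sorted distinct crime types, dropping values that look like stray headers.

    header_names is an assumed list of column names that may leak into the data.
    """
    cleaned = {normalize_type(v) for v in values}
    return sorted(cleaned - set(header_names))

# Small illustrative sample, not the full data set.
sample = ["THEFT", "NON-CRIMINAL", "NON - CRIMINAL", "IUCR", "BATTERY", "theft "]
print(distinct_types(sample))
```

Whether "NON-CRIMINAL (SUBJECT SPECIFIED)" should also be merged into "NON-CRIMINAL" is a judgment call that this sketch leaves alone.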


Crimes

Most Frequently committed crimes :

TIME SERIES ANALYSIS

ALL CRIMES:

ALL ARRESTS

ALL CRIMES VS ALL ARRESTS :

I cannot fully explain why the number of arrests differs from the number of crimes. There are many possible reasons.
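
To look at the gap between crimes and arrests year by year, the two counts can be tallied side by side. The sketch below is a standalone Python illustration, assuming each record is a dictionary with the `Year` and `Arrest` columns shown in the data preview; the function name `crimes_and_arrests_by_year` is hypothetical. The sample rows are the ten preview rows from the "Sneak Peek" section.

```python
from collections import defaultdict

def crimes_and_arrests_by_year(rows):
    """Return {year: (crime_count, arrest_count)} for an iterable of record dicts."""
    crimes = defaultdict(int)
    arrests = defaultdict(int)
    for row in rows:
        year = row.get("Year")
        if year in (None, "", "NA"):
            continue  # skip records with a missing year
        year = int(year)
        crimes[year] += 1
        # The Arrest column holds the strings "True" / "False".
        if str(row.get("Arrest", "")).strip().lower() == "true":
            arrests[year] += 1
    return {y: (crimes[y], arrests[y]) for y in sorted(crimes)}

# (Arrest, Year) pairs of the ten preview rows.
preview = [("False", 2004), ("False", 2003), ("False", 2004), ("False", 2004),
           ("False", 2003), ("False", 2004), ("True", 2001), ("False", 2003),
           ("False", 2004), ("False", 2003)]
rows = [{"Arrest": a, "Year": y} for a, y in preview]
print(crimes_and_arrests_by_year(rows))
```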


TIME SERIES OF MOST FREQUENT CRIMES 2001-2017: THEFT, BATTERY, CRIMINAL DAMAGE, NARCOTICS, OTHER OFFENSE.

Based on the first plot, we now look at how the counts of the five most frequent crimes in Chicago have changed over the years.
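
As a rough sketch of the computation behind such a plot, the most frequent types can be ranked first and then counted per year. This is a standalone Python illustration, not the code that produced the plots; `top_types` and `yearly_series` are hypothetical helper names, and the sample records are made up for the example.

```python
from collections import Counter

def top_types(rows, n=5):
    """Return the n most frequent values of the 'Primary Type' column."""
    counts = Counter(row["Primary Type"] for row in rows)
    return [crime_type for crime_type, _ in counts.most_common(n)]

def yearly_series(rows, types):
    """Return {type: {year: count}} restricted to the given crime types."""
    series = {t: Counter() for t in types}
    for row in rows:
        crime_type = row["Primary Type"]
        if crime_type in series:
            series[crime_type][int(row["Year"])] += 1
    return {t: dict(sorted(c.items())) for t, c in series.items()}

# Made-up sample records for illustration only.
sample = [
    {"Primary Type": "THEFT", "Year": 2003},
    {"Primary Type": "THEFT", "Year": 2004},
    {"Primary Type": "THEFT", "Year": 2004},
    {"Primary Type": "BATTERY", "Year": 2003},
    {"Primary Type": "BATTERY", "Year": 2003},
    {"Primary Type": "ARSON", "Year": 2004},
]
top = top_types(sample, n=2)
print(top)
print(yearly_series(sample, top))
```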

CRIMES PER CATEGORY PER YEAR

HEATMAP

Certain crimes appear white in the heatmap. Either data was not collected for those types or the data is missing.
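
One way to see where the white cells come from is to build the category-by-year count matrix explicitly and leave combinations with no records empty. The sketch below is a standalone Python illustration with a hypothetical function name, `category_year_matrix`, and made-up sample records. A `None` cell only says that no record exists for that type in that year; it cannot by itself tell whether the data was not collected or no such incident was recorded.

```python
from collections import Counter

def category_year_matrix(rows):
    """Return (types, years, matrix) where matrix[i][j] is the count for
    types[i] in years[j], or None when no record exists for that pair."""
    counts = Counter((row["Primary Type"], int(row["Year"])) for row in rows)
    types = sorted({t for t, _ in counts})
    years = sorted({y for _, y in counts})
    # Counter.get returns None for a missing key, which marks an empty cell.
    matrix = [[counts.get((t, y)) for y in years] for t in types]
    return types, years, matrix

# Made-up sample records for illustration only.
sample = [
    {"Primary Type": "THEFT", "Year": 2003},
    {"Primary Type": "THEFT", "Year": 2004},
    {"Primary Type": "THEFT", "Year": 2004},
    {"Primary Type": "BATTERY", "Year": 2003},
    {"Primary Type": "BATTERY", "Year": 2003},
    {"Primary Type": "ARSON", "Year": 2004},
]
types, years, matrix = category_year_matrix(sample)
for crime_type, row in zip(types, matrix):
    print(crime_type, row)
```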


STACKED COLUMN

LOCATIONS:

Location Description

Residences are number two on the list, which is rather scary, shocking and surprising!


Time series analysis of Crimes in top 4 locations - Street, Residence, Sidewalk and Other.

HIGH CRIME DISTRICTS

High Crime Neighbourhoods :



BARS

TIME SERIES

HEATMAP

Observations

  • Crimes (especially homicides) rose sharply in 2016 compared to 2015, which is bad news.
  • The areas near the airport and near the harbor seem to be two very dangerous areas, as a lot of crimes have occurred there.
  • Based on this data, the number of crimes is far higher than the number of arrests. (There are many possible reasons why.)
  • The number of crimes on sidewalks also seems to have fallen greatly. This could be owed to increased police patrolling.
  • There is a huge increase in the number of homicides in Chicago in 2016 compared to previous years.

Conclusion

Bringing peace back to Chicago is no easy task. It is possible only through dedicated and coordinated efforts between law enforcement agencies and the communities involved.

Additional Notes
All the plots are interactive.
This analysis provides a comprehensive overview based on historical data.
A more in depth analysis can be carried out depending upon the questions in context.
For instance, if we are trying to reduce homicides, we can analyse the data to look at the places and locations where most homicides take place, who the victims were, what kind of people the perpetrators were, and at what time of day or week these crimes were committed.
Based on this information we could then arrive at a practical solution.

Analyses based on day of week and hour of day were left out as they are rather simple to achieve. A future extension would be to integrate these plots into a website or build a tool using R Shiny or the like.
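
For reference, a day-of-week and hour-of-day breakdown could be sketched as follows. This is a standalone Python illustration, assuming the `Date` column uses the `MM/DD/YYYY HH:MM:SS AM/PM` layout seen in the data preview; the function name `weekday_hour_counts` is hypothetical. The sample timestamps are taken from the preview rows.

```python
from collections import Counter
from datetime import datetime

# Layout of the Date column as it appears in the preview, e.g. "01/01/2004 12:01:00 AM".
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def weekday_hour_counts(date_strings):
    """Return (counts per weekday name, counts per hour 0-23)."""
    by_day = Counter()
    by_hour = Counter()
    for text in date_strings:
        stamp = datetime.strptime(text, DATE_FORMAT)
        by_day[WEEKDAYS[stamp.weekday()]] += 1  # weekday(): Monday is 0
        by_hour[stamp.hour] += 1
    return by_day, by_hour

sample = ["01/01/2004 12:01:00 AM", "06/20/2004 11:00:00 AM", "12/30/2004 08:00:00 PM"]
by_day, by_hour = weekday_hour_counts(sample)
print(dict(by_day))
print(dict(by_hour))
```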

A separate document/link is provided for interactively exploring high crime neighbourhoods, as there seems to be a conflict between the leaflet package, xts time series objects and some other package.

Vivek Mangipudi. July 28, 2017.

Links to mapping - part 2 :

http://rpubs.com/stanspwan/chicagoNarco

http://rpubs.com/stanspwan/chicagobattery

http://rpubs.com/stanspwan/chicagoAssault

http://rpubs.com/stanspwan/chicagoDAMAGE

All these maps are interactive and zoomable up to street level.

Link to data : goo.gl/sRKHwe