Baltimore 911 Calls Dataset

This dataset consists of police emergency and non-emergency calls to 911. Its fields include recordId, callDateTime, priority, district, description, callNumber, and incidentLocation, along with the latitude and longitude of the location where each call was made.

As shown below, many records are missing latitude and longitude: 1,012,972 of the 4,265,563 records (about 24%). callDateTime is also missing in 14 records. The records without coordinates are still useful for any analysis that does not need those two fields, so they were kept rather than discarding a large chunk of the data.

##         recordId     callDateTime         priority         district 
##                0               14                0                0 
##      description       callNumber incidentLocation              lat 
##                0                0                0          1012972 
##             long             year            month             hour 
##          1012972                0                0                0 
##          weekStr          seasons 
##                0                0
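
The per-column counts above look like the output of counting NA values column by column. A minimal sketch of that check in base R follows, assuming the data sits in a data frame (here called calls, a hypothetical name, filled with toy values rather than the real data):

```r
# Toy stand-in for the calls data frame (hypothetical values, not the real data)
calls <- data.frame(
  recordId = 1:4,
  priority = c("Medium", "Low", "High", "Medium"),
  lat      = c(39.29, NA, 39.31, NA),
  long     = c(-76.61, NA, -76.62, NA),
  stringsAsFactors = FALSE
)

# Count the missing values in each column
missing_counts <- colSums(is.na(calls))
print(missing_counts)

# Share of rows that lack at least one coordinate
missing_share <- mean(is.na(calls$lat) | is.na(calls$long))

# Keep every row for general analysis; subset only when coordinates are needed
calls_geo <- calls[!is.na(calls$lat) & !is.na(calls$long), ]
```

Keeping the full data frame and building a coordinate-only subset on demand matches the decision described above to retain the incomplete records.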

Descriptive Statistics

The summary below covers each field in the dataset. Most fields are categorical, so the mean, median, and quartiles are reported only for the numeric and date fields, and even there they say little. For example, the median month (6) and median hour (14) mark the middle of the distribution, not the busiest month or hour. The frequency tables that follow the summary are therefore more informative, and the distributions are visualized graphically later. Note that several new fields were added: lat, long, year, month, hour, weekStr, and seasons.

The tables show that January has the most calls and that call volume peaks around 17:00. The data runs from 2015-01-01 through 2019-02-26, so 2019 is a partial year, and January and February include a fifth year of data that the other months lack. It is also interesting that the number of 911 calls decreases every year from 2015 to 2018. That alone does not mean much, since there could be multiple reasons for the decline, so further analysis is needed to explain it.

##     recordId        callDateTime                   priority        
##  Min.   :      1   Min.   :2015-01-01 00:01:00   Length:4265563    
##  1st Qu.:1066392   1st Qu.:2015-12-30 01:07:00   Class :character  
##  Median :2132782   Median :2017-01-05 19:16:00   Mode  :character  
##  Mean   :2137396   Mean   :2017-01-16 23:53:46                     
##  3rd Qu.:3199478   3rd Qu.:2018-01-30 19:22:00                     
##  Max.   :4301195   Max.   :2019-02-26 22:00:00                     
##                    NA's   :14                                      
##    district         description         callNumber       
##  Length:4265563     Length:4265563     Length:4265563    
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##  incidentLocation        lat               long              year     
##  Length:4265563     Min.   :25.7      Min.   :-149.4    Min.   :2015  
##  Class :character   1st Qu.:39.3      1st Qu.: -76.7    1st Qu.:2015  
##  Mode  :character   Median :39.3      Median : -76.6    Median :2017  
##                     Mean   :39.3      Mean   : -76.7    Mean   :2017  
##                     3rd Qu.:39.3      3rd Qu.: -76.6    3rd Qu.:2018  
##                     Max.   :60.1      Max.   : -70.2    Max.   :2019  
##                     NA's   :1012972   NA's   :1012972                 
##      month             hour         weekStr            seasons         
##  Min.   : 1.000   Min.   : 0.00   Length:4265563     Length:4265563    
##  1st Qu.: 3.000   1st Qu.: 9.00   Class :character   Class :character  
##  Median : 6.000   Median :14.00   Mode  :character   Mode  :character  
##  Mean   : 6.301   Mean   :13.53                                        
##  3rd Qu.: 9.000   3rd Qu.:19.00                                        
##  Max.   :12.000   Max.   :23.00                                        
## 
## 
##                     Emergency           High            Low         Medium 
##           6656           1392         700484         942105        2009290 
##  Non-Emergency Out of Service 
##         604618           1018
## 
##     CD     CW     ED   EVT1   EVT2   EVT3   FIR1     HP   INFO     ND 
## 470069  54681 393557    105     43     10      2      2    259 430250 
##     NE     NW     SD     SE     SS     SW    TRU     WD 
## 607636 422470 467609 467007  12671 473885  53290 412017
## 
##    2015    2016    2017    2018    2019 
## 1071776 1048633 1003446  954487  187221
## 
##      1      2      3      4      5      6      7      8      9     10 
## 423278 392621 331131 347673 360963 361844 355045 361827 342605 348510 
##     11     12 
## 318730 321336
## 
##      0      1      2      3      4      5      6      7      8      9 
## 154529 122374 101460  76928  61739  57765  69612 105448 153607 174099 
##     10     11     12     13     14     15     16     17     18     19 
## 192421 208490 216199 219221 238360 242221 252851 265670 260025 243923 
##     20     21     22     23 
## 235247 222256 201854 189264
## 
##    Fri    Mon    Sat    Sun    Thu    Tue    Wed 
## 642995 602426 602078 551848 626788 622383 617045
## 
##    Fall  Spring  Summer  Winter 
## 1009845 1039767 1078716 1137235
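
The derived fields (year, month, hour, weekStr, seasons) can all be computed from callDateTime. The sketch below shows one way to do this in base R on a few hypothetical timestamps. The season mapping (December to February as Winter, and so on) is an assumption, although it is consistent with the monthly and seasonal counts above: the January, February, and December counts sum to the Winter total.

```r
# Hypothetical timestamps; the real column is assumed to be named callDateTime
calls <- data.frame(
  callDateTime = as.POSIXct(
    c("2015-01-01 00:01:00", "2016-06-15 14:30:00", "2017-10-03 17:45:00",
      "2018-01-20 09:00:00", "2019-02-26 22:00:00"),
    tz = "UTC"
  )
)

# as.POSIXlt exposes the date parts as list components
parts <- as.POSIXlt(calls$callDateTime)
calls$year  <- parts$year + 1900   # years are stored as an offset from 1900
calls$month <- parts$mon + 1       # months are stored as 0-11
calls$hour  <- parts$hour

# Fixed English abbreviations, indexed by wday (0 = Sunday)
day_names <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
calls$weekStr <- day_names[parts$wday + 1]

# Assumed season mapping, indexed by month number (1 = January)
season_by_month <- c("Winter", "Winter", "Spring", "Spring", "Spring",
                     "Summer", "Summer", "Summer", "Fall", "Fall", "Fall",
                     "Winter")
calls$seasons <- season_by_month[calls$month]

# A frequency table identifies the busiest month; the median does not
month_counts  <- table(calls$month)
busiest_month <- as.integer(names(which.max(month_counts)))
```

Using table() and which.max() answers the question "which month has the most calls" directly, which the mean and median in summary() cannot do.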

Bar plot of the incident locations with the most 911 calls

This graph looks at whether particular incident locations contribute the most 911 calls from year to year. In the graph below, the address "100", which is probably invalid but is included here for completeness, has the most 911 calls from 2015 to 2018. The graph also shows that in some years no 911 calls came from certain locations, which is interesting. For example, 400 N FRONT ST saw 911 calls only in 2015 and none afterward; this could be for many reasons and would require further analysis.
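
The counts behind a plot like this can be sketched as a location-by-year table restricted to the busiest locations. The example below uses toy rows (the address 1 MAIN ST and all counts are hypothetical) and is not necessarily how the original plot was built:

```r
# Toy rows; "1 MAIN ST" and all counts are hypothetical
calls <- data.frame(
  incidentLocation = c("100", "100", "400 N FRONT ST", "400 N FRONT ST",
                       "100", "1 MAIN ST", "100"),
  year = c(2015, 2015, 2015, 2015, 2016, 2016, 2017),
  stringsAsFactors = FALSE
)

# Rank locations by total call count and keep the top two
loc_totals <- sort(table(calls$incidentLocation), decreasing = TRUE)
top_locs   <- names(head(loc_totals, 2))

# Cross-tabulate the top locations by year; zero cells are years with no calls
top <- calls[calls$incidentLocation %in% top_locs, ]
loc_by_year <- table(top$incidentLocation, top$year)
print(loc_by_year)
```

A zero in this table corresponds to the pattern noted above, where a location appears in one year and then disappears.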

Histogram on number of 911 calls in a given season based on priority

This graph compares the number of 911 calls each year and shows whether one priority level accounts for more calls than the others. In the graph below, Medium-priority calls make up the largest share of 911 calls each year.
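
To check the same comparison numerically, the priority counts can be converted to within-year shares. This is a minimal sketch on toy data (the values are hypothetical), using prop.table() with margin = 1 so that each year's row sums to 1:

```r
# Toy data; the values are hypothetical
calls <- data.frame(
  year     = c(2015, 2015, 2015, 2016, 2016, 2016, 2016),
  priority = c("Medium", "Medium", "Low", "Medium", "High", "Medium", "Low"),
  stringsAsFactors = FALSE
)

# Rows are years, columns are priority levels
counts <- table(calls$year, calls$priority)

# Row-wise shares: each year's priorities sum to 1
shares <- prop.table(counts, margin = 1)
print(round(shares, 2))

# Most frequent priority level in each year
top_priority <- colnames(counts)[apply(counts, 1, which.max)]
```

Shares make years comparable even when their totals differ, which matters here because 2019 is only partially covered.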

Bar graph on number of 911 calls in the four seasons based on year

The graph below breaks the data down by the four seasons (Fall, Spring, Summer, and Winter). The goal is to see whether a particular season has more 911 calls and how the seasons compare from year to year. No particular season appears to contribute noticeably more 911 calls: the counts look evenly spread, and no year has an unusually high number of calls in any one season.

Bar graph of top descriptions of the 911 calls

This graph shows the most common types of 911 calls in each year. What stands out is that 911/NO VOICE calls far outnumber every other type of call. Beyond that, it is not surprising that traffic stops, common assault, and narcotics (given Baltimore's reputation for drug activity) are among the top 911 calls. Comparing the years, the only notable change is that there appear to be fewer 911/NO VOICE calls in 2018 than in previous years.

Conclusion

Overall, crime does not seem to have improved much over the years, and more work needs to be done by law enforcement and local government officials to ensure the safety of the citizens of Baltimore. There seem to be some slight improvements, but not enough. More focus is needed on assaults, auto accidents, and narcotics, which account for too many incidents. With more time, further analysis could use the longitude and latitude information to see whether certain crimes are concentrated in certain areas.
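
As a starting point for that follow-up, the sketch below drops records with missing or implausible coordinates and counts calls per grid cell and description. The bounding box around Baltimore and the 0.01-degree cell size (roughly 1 km) are assumptions chosen for illustration, and the rows are toy values:

```r
# Toy rows; coordinates and descriptions are hypothetical
calls <- data.frame(
  description = c("NARCOTICS", "NARCOTICS", "COMMON ASSAULT",
                  "NARCOTICS", "AUTO ACCIDENT"),
  lat  = c(39.2904, 39.2911, 39.3302, 25.7000, NA),
  long = c(-76.6122, -76.6130, -76.6005, -80.2000, NA),
  stringsAsFactors = FALSE
)

# Assumed rough bounding box around Baltimore; adjust as needed
in_box <- !is.na(calls$lat) & !is.na(calls$long) &
  calls$lat  >= 39.19  & calls$lat  <= 39.38 &
  calls$long >= -76.72 & calls$long <= -76.52
geo <- calls[in_box, ]

# Snap each call to a 0.01-degree grid cell
geo$cell <- paste(round(geo$lat, 2), round(geo$long, 2))

# Count calls per cell and description, dropping empty combinations
cell_counts <- as.data.frame(table(cell = geo$cell,
                                   description = geo$description))
cell_counts <- cell_counts[cell_counts$Freq > 0, ]
print(cell_counts)
```

The bounding-box filter matters because the summary shows latitudes from 25.7 to 60.1 and longitudes down to -149.4, which fall far outside the city.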