Where Police have killed Americans in 2015

Police brutality against African Americans has been in the spotlight during 2020, drawing some attention away from the COVID-19 pandemic and motivating protests and riots all across America. The death of George Floyd in police custody acted as a catalyst for the protests. These were exacerbated by a somewhat callous response from the federal government, which moved to suppress the protests instead of trying to offer a solution to the crisis.

The prevailing perception is that police are more likely to kill an African American suspect than a suspect from any other ethnic group, especially whites. Among the datasets available for HW1, I found one that lists police killings in 2015 and wanted to examine the data to see whether perception corresponds to reality.

The data was taken from https://data.fivethirtyeight.com/ and copied to my personal GitHub repo for easy examination.

Getting the Data:

library(tidyverse)
## -- Attaching packages ------------------------------------------------------ tidyverse 1.3.0 --
## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.3     v dplyr   1.0.0
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## -- Conflicts --------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
theURL <- "https://raw.githubusercontent.com/georg4re/DS607/master/data/2015-Police%20Killings.csv"
killings <- read.csv(file=theURL, fileEncoding="UTF-8-BOM")

head(killings)
##                 name age gender   raceethnicity    month day year
## 1 A'donte Washington  16   Male           Black February  23 2015
## 2     Aaron Rutledge  27   Male           White    April   2 2015
## 3        Aaron Siler  26   Male           White    March  14 2015
## 4       Aaron Valdez  25   Male Hispanic/Latino    March  11 2015
## 5       Adam Jovicic  29   Male           White    March  19 2015
## 6      Adam Reinhart  29   Male           White    March   7 2015
##            streetaddress         city state latitude  longitude state_fp
## 1           Clearview Ln    Millbrook    AL 32.52958  -86.36283        1
## 2 300 block Iris Park Dr    Pineville    LA 31.32174  -92.43486       22
## 3   22nd Ave and 56th St      Kenosha    WI 42.58356  -87.83571       55
## 4      3000 Seminole Ave   South Gate    CA 33.93930 -118.21946        6
## 5         364 Hiwood Ave Munroe Falls    OH 41.14857  -81.42988       39
## 6    18th St and Palm Ln      Phoenix    AZ 33.46938 -112.04332        4
##   county_fp tract_ce      geo_id county_id             namelsad
## 1        51    30902  1051030902      1051  Census Tract 309.02
## 2        79    11700 22079011700     22079     Census Tract 117
## 3        59     1200 55059001200     55059      Census Tract 12
## 4        37   535607  6037535607      6037 Census Tract 5356.07
## 5       153   530800 39153530800     39153    Census Tract 5308
## 6        13   111602  4013111602      4013 Census Tract 1116.02
##              lawenforcementagency   cause   armed  pop share_white share_black
## 1     Millbrook Police Department Gunshot      No 3779        60.5        30.5
## 2 Rapides Parish Sheriff's Office Gunshot      No 2769        53.8        36.2
## 3       Kenosha Police Department Gunshot      No 4079        73.8         7.7
## 4    South Gate Police Department Gunshot Firearm 4343         1.2         0.6
## 5          Kent Police Department Gunshot      No 6809        92.5         1.4
## 6       Phoenix Police Department Gunshot      No 4682           7         7.7
##   share_hispanic p_income h_income county_income comp_income county_bucket
## 1            5.6    28375    51367         54766   0.9379359             3
## 2            0.5    14678    27972         40930   0.6834107             2
## 3           16.8    25286    45365         54930   0.8258693             2
## 4           98.8    17194    48295         55909   0.8638144             3
## 5            1.7    33954    68785         49669   1.3848678             5
## 6             79    15523    20833         53596   0.3887044             1
##   nat_bucket  pov      urate    college
## 1          3 14.1 0.09768638 0.16850951
## 2          1 28.8 0.06572379 0.11140236
## 3          3 14.6 0.16629314 0.14731227
## 4          3 11.7 0.12482727 0.05013293
## 5          4  1.9 0.06354983 0.40395421
## 6          1   58 0.07365145 0.10295519
# summary(killings)

Manipulating the Data

For the purposes of my study, address information beyond the state is not needed, and neither is geolocation information or most of the census data; I keep only the three ethnicity share columns for later comparison. The initial examination will look only at race, gender, and whether the individual was armed.

keeps <- c("name", "age", "gender", "raceethnicity", "state", "cause", "armed", "share_white", "share_black","share_hispanic")
subKillings <- killings[, keeps, drop=FALSE]
head(subKillings)
##                 name age gender   raceethnicity state   cause   armed
## 1 A'donte Washington  16   Male           Black    AL Gunshot      No
## 2     Aaron Rutledge  27   Male           White    LA Gunshot      No
## 3        Aaron Siler  26   Male           White    WI Gunshot      No
## 4       Aaron Valdez  25   Male Hispanic/Latino    CA Gunshot Firearm
## 5       Adam Jovicic  29   Male           White    OH Gunshot      No
## 6      Adam Reinhart  29   Male           White    AZ Gunshot      No
##   share_white share_black share_hispanic
## 1        60.5        30.5            5.6
## 2        53.8        36.2            0.5
## 3        73.8         7.7           16.8
## 4         1.2         0.6           98.8
## 5        92.5         1.4            1.7
## 6           7         7.7             79

Cleaning up some data

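The age column and the ethnicity share columns (share_white, share_black, share_hispanic) are typed as character when they should be numeric. Some ages are “unknown”, but I’d rather have them as NA, so the conversion coerces any non-numeric value to NA and suppresses the resulting warnings.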

subKillings$age <- suppressWarnings(as.numeric(subKillings$age))
subKillings$share_white <- suppressWarnings(as.numeric(subKillings$share_white))
subKillings$share_black <- suppressWarnings(as.numeric(subKillings$share_black))
subKillings$share_hispanic <- suppressWarnings(as.numeric(subKillings$share_hispanic))
summary(subKillings)
##      name                age           gender          raceethnicity     
##  Length:467         Min.   :16.00   Length:467         Length:467        
##  Class :character   1st Qu.:28.00   Class :character   Class :character  
##  Mode  :character   Median :35.00   Mode  :character   Mode  :character  
##                     Mean   :37.37                                        
##                     3rd Qu.:45.00                                        
##                     Max.   :87.00                                        
##                     NA's   :4                                            
##     state              cause              armed            share_white   
##  Length:467         Length:467         Length:467         Min.   : 0.00  
##  Class :character   Class :character   Class :character   1st Qu.:26.20  
##  Mode  :character   Mode  :character   Mode  :character   Median :56.50  
##                                                           Mean   :51.92  
##                                                           3rd Qu.:77.50  
##                                                           Max.   :99.60  
##                                                           NA's   :2      
##   share_black    share_hispanic
##  Min.   : 0.00   Min.   : 0.0  
##  1st Qu.: 1.40   1st Qu.: 3.5  
##  Median : 7.40   Median :10.9  
##  Mean   :17.94   Mean   :22.0  
##  3rd Qu.:23.70   3rd Qu.:32.9  
##  Max.   :99.80   Max.   :98.8  
##  NA's   :2       NA's   :2
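
The same coercion can be written once for all four columns. The following is a minimal sketch on a small stand-in data frame, not the real dataset: the name `toy` and the placeholder strings "Unknown" and "-" are assumptions made for illustration.

```r
# Hypothetical stand-in for subKillings; "Unknown" and "-" are assumed placeholders.
toy <- data.frame(
  age            = c("16", "27", "Unknown"),
  share_white    = c("60.5", "53.8", "-"),
  share_black    = c("30.5", "36.2", "-"),
  share_hispanic = c("5.6", "0.5", "-"),
  stringsAsFactors = FALSE
)

# Pick age plus every column whose name starts with "share_".
num_cols <- c("age", grep("^share_", names(toy), value = TRUE))

# as.numeric() turns anything it cannot parse into NA (with a warning, suppressed here).
toy[num_cols] <- lapply(toy[num_cols], function(x) suppressWarnings(as.numeric(x)))

# Count the NAs introduced in each converted column.
colSums(is.na(toy[num_cols]))
```

Counting the NAs afterwards is a quick check that only the expected placeholder values were lost in the conversion.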

Making sense of it all

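It would be good to aggregate the data to see what percentage of the killings each ethnicity accounts for, and to compare that percentage with the mean of the matching share column (share_white, share_black, share_hispanic) from the summary above.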

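# Total number of killings (a one-row data frame with a single count column)
allKillings <- count(subKillings)
# Number of killings per ethnicity
killingsByRace <- count(subKillings, raceethnicity)
names(killingsByRace)[2] <- "killings"
# The two data frames share no column names, so merge() attaches the total to every row
Temp<-merge(x=killingsByRace,y=allKillings,all.x= TRUE)
names(Temp)[3] <- "allKillings"
# Percentage of all killings accounted for by each ethnicity
Temp$PctKillingsByRace<-round((Temp$killings/Temp$allKillings)*100,3)
Temp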
##            raceethnicity killings allKillings PctKillingsByRace
## 1 Asian/Pacific Islander       10         467             2.141
## 2                  Black      135         467            28.908
## 3        Hispanic/Latino       67         467            14.347
## 4        Native American        4         467             0.857
## 5                Unknown       15         467             3.212
## 6                  White      236         467            50.535
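
To make the comparison explicit, the percentages can be set next to the mean shares. This is a rough sketch that re-enters the counts from the table above and the share means from the earlier summary() output by hand; the names `by_race`, `mean_share`, and `gap` are introduced here only for illustration.

```r
# Counts copied from the table above; mean shares copied from summary(subKillings).
by_race <- data.frame(
  raceethnicity = c("Black", "Hispanic/Latino", "White"),
  killings      = c(135, 67, 236),
  mean_share    = c(17.94, 22.0, 51.92)
)
total_killings <- 467

# Percentage of all killings for each group, rounded as in the table above.
by_race$pct_killings <- round(by_race$killings / total_killings * 100, 3)

# Positive gap: the group is over-represented relative to its mean share.
by_race$gap <- round(by_race$pct_killings - by_race$mean_share, 1)
by_race
```

A positive gap means the group accounts for a larger share of the killings than its mean share, and a negative gap means a smaller one.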

Conclusions

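Based on this data, almost 29% of all police killings in 2015 were of African Americans. Compared with a mean population share of about 18% (the mean of share_black), this shows a disparity: police killings disproportionately affected African Americans. Whites, at 50.5% of killings against a mean share of 51.9%, were very close to their mean population share, and Hispanics, at 14.3% against 22.0%, were well below theirs for the same period. These means are averages of the share columns over the records in this dataset, so the comparison is only a rough one.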

This study would benefit from data for other years, more comparisons and analysis of armed/unarmed, age, and other factors that might help better qualify the information being processed.
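
As a possible starting point for the armed/unarmed comparison, a cross-tabulation of ethnicity against the armed column could look like the sketch below. It uses only the six rows printed by head(subKillings), re-entered by hand, so the counts say nothing about the full dataset; the names `first_six` and `armed_by_race` are illustrative.

```r
# The six rows shown by head(subKillings), re-entered by hand.
first_six <- data.frame(
  raceethnicity = c("Black", "White", "White", "Hispanic/Latino", "White", "White"),
  armed         = c("No", "No", "No", "Firearm", "No", "No"),
  stringsAsFactors = FALSE
)

# Rows are ethnicities, columns are the values of the armed column.
armed_by_race <- table(first_six$raceethnicity, first_six$armed)
armed_by_race
```

Running the same table() call on subKillings$raceethnicity and subKillings$armed would give the counts for all 467 records.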