Police brutality against African Americans has been in the spotlight during 2020, drawing some attention away from the COVID-19 pandemic and motivating protests and riots all across America. The death of George Floyd in police custody acted as a catalyst for the protests. These were exacerbated by a somewhat callous response from the federal government, which moved to suppress the protests instead of trying to offer a solution to the crisis.
The main perception is that police are more likely to kill an African American suspect than a suspect from any other ethnic group, especially whites. Among the datasets available for HW1, I found this one, which lists police killings in 2015, and wanted to examine the data to see whether perception corresponds to reality.
The data was taken from https://data.fivethirtyeight.com/ and copied to my personal GitHub repository for easier access.
library(tidyverse)
## -- Attaching packages ------------------------------------------------------ tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.3 v dplyr 1.0.0
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts --------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
theURL <- "https://raw.githubusercontent.com/georg4re/DS607/master/data/2015-Police%20Killings.csv"
killings <- read.csv(file=theURL, fileEncoding="UTF-8-BOM")
head(killings)
## name age gender raceethnicity month day year
## 1 A'donte Washington 16 Male Black February 23 2015
## 2 Aaron Rutledge 27 Male White April 2 2015
## 3 Aaron Siler 26 Male White March 14 2015
## 4 Aaron Valdez 25 Male Hispanic/Latino March 11 2015
## 5 Adam Jovicic 29 Male White March 19 2015
## 6 Adam Reinhart 29 Male White March 7 2015
## streetaddress city state latitude longitude state_fp
## 1 Clearview Ln Millbrook AL 32.52958 -86.36283 1
## 2 300 block Iris Park Dr Pineville LA 31.32174 -92.43486 22
## 3 22nd Ave and 56th St Kenosha WI 42.58356 -87.83571 55
## 4 3000 Seminole Ave South Gate CA 33.93930 -118.21946 6
## 5 364 Hiwood Ave Munroe Falls OH 41.14857 -81.42988 39
## 6 18th St and Palm Ln Phoenix AZ 33.46938 -112.04332 4
## county_fp tract_ce geo_id county_id namelsad
## 1 51 30902 1051030902 1051 Census Tract 309.02
## 2 79 11700 22079011700 22079 Census Tract 117
## 3 59 1200 55059001200 55059 Census Tract 12
## 4 37 535607 6037535607 6037 Census Tract 5356.07
## 5 153 530800 39153530800 39153 Census Tract 5308
## 6 13 111602 4013111602 4013 Census Tract 1116.02
## lawenforcementagency cause armed pop share_white share_black
## 1 Millbrook Police Department Gunshot No 3779 60.5 30.5
## 2 Rapides Parish Sheriff's Office Gunshot No 2769 53.8 36.2
## 3 Kenosha Police Department Gunshot No 4079 73.8 7.7
## 4 South Gate Police Department Gunshot Firearm 4343 1.2 0.6
## 5 Kent Police Department Gunshot No 6809 92.5 1.4
## 6 Phoenix Police Department Gunshot No 4682 7 7.7
## share_hispanic p_income h_income county_income comp_income county_bucket
## 1 5.6 28375 51367 54766 0.9379359 3
## 2 0.5 14678 27972 40930 0.6834107 2
## 3 16.8 25286 45365 54930 0.8258693 2
## 4 98.8 17194 48295 55909 0.8638144 3
## 5 1.7 33954 68785 49669 1.3848678 5
## 6 79 15523 20833 53596 0.3887044 1
## nat_bucket pov urate college
## 1 3 14.1 0.09768638 0.16850951
## 2 1 28.8 0.06572379 0.11140236
## 3 3 14.6 0.16629314 0.14731227
## 4 3 11.7 0.12482727 0.05013293
## 5 4 1.9 0.06354983 0.40395421
## 6 1 58 0.07365145 0.10295519
# summary(killings)
For the purposes of my study, address information beyond the state is not needed, and neither is geolocation information or most of the census data; of the census columns I keep only the population shares (share_white, share_black, share_hispanic). The initial examination will only look at race, gender, and whether the individual was armed.
keeps <- c("name", "age", "gender", "raceethnicity", "state", "cause", "armed", "share_white", "share_black","share_hispanic")
subKillings <- killings[, keeps, drop=FALSE]
head(subKillings)
## name age gender raceethnicity state cause armed
## 1 A'donte Washington 16 Male Black AL Gunshot No
## 2 Aaron Rutledge 27 Male White LA Gunshot No
## 3 Aaron Siler 26 Male White WI Gunshot No
## 4 Aaron Valdez 25 Male Hispanic/Latino CA Gunshot Firearm
## 5 Adam Jovicic 29 Male White OH Gunshot No
## 6 Adam Reinhart 29 Male White AZ Gunshot No
## share_white share_black share_hispanic
## 1 60.5 30.5 5.6
## 2 53.8 36.2 0.5
## 3 73.8 7.7 16.8
## 4 1.2 0.6 98.8
## 5 92.5 1.4 1.7
## 6 7 7.7 79
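The age and share_* columns (share_white, share_black, share_hispanic) are typed as character when they should be numeric. Some ages are recorded as “unknown”, but I’d rather have them as NA, so I convert the columns and let the non-numeric entries become NA.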
subKillings$age <- suppressWarnings(as.numeric(subKillings$age))
subKillings$share_white <- suppressWarnings(as.numeric(subKillings$share_white))
subKillings$share_black <- suppressWarnings(as.numeric(subKillings$share_black))
subKillings$share_hispanic <- suppressWarnings(as.numeric(subKillings$share_hispanic))
summary(subKillings)
## name age gender raceethnicity
## Length:467 Min. :16.00 Length:467 Length:467
## Class :character 1st Qu.:28.00 Class :character Class :character
## Mode :character Median :35.00 Mode :character Mode :character
## Mean :37.37
## 3rd Qu.:45.00
## Max. :87.00
## NA's :4
## state cause armed share_white
## Length:467 Length:467 Length:467 Min. : 0.00
## Class :character Class :character Class :character 1st Qu.:26.20
## Mode :character Mode :character Mode :character Median :56.50
## Mean :51.92
## 3rd Qu.:77.50
## Max. :99.60
## NA's :2
## share_black share_hispanic
## Min. : 0.00 Min. : 0.0
## 1st Qu.: 1.40 1st Qu.: 3.5
## Median : 7.40 Median :10.9
## Mean :17.94 Mean :22.0
## 3rd Qu.:23.70 3rd Qu.:32.9
## Max. :99.80 Max. :98.8
## NA's :2 NA's :2
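The conversion above relies on as.numeric() turning anything non-numeric into NA and on suppressWarnings() hiding the resulting warning. A more explicit variant is sketched below: it names the placeholder strings first and only then converts, so an unexpected value would still raise a warning. The toy data frame, the helper name to_numeric, and the placeholder strings "Unknown" and "-" are assumptions made for illustration; they are not taken from the data set.

```r
# Toy rows standing in for subKillings; the values are illustrative only
toy <- data.frame(
  age = c("16", "27", "Unknown"),
  share_white = c("60.5", "53.8", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace known placeholder strings with NA, then convert.
# The placeholder strings are assumptions and should be checked against the data.
to_numeric <- function(x, placeholders = c("Unknown", "-")) {
  x[x %in% placeholders] <- NA
  as.numeric(x)
}

# Apply the helper to every column that should be numeric
numeric_cols <- c("age", "share_white")
toy[numeric_cols] <- lapply(toy[numeric_cols], to_numeric)
str(toy)
```

Because the placeholders are replaced before the conversion, no warning is raised for them, and any other stray text in these columns would still be reported by as.numeric().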
It would be good to aggregate the data to see what percentage of killings each ethnicity accounts for and compare it with the mean population share of that ethnicity (the share_* columns summarized above).
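# Total number of killings (a one-row data frame with a single column, n)
allKillings <- count(subKillings)
# Number of killings per race/ethnicity
killingsByRace <- count(subKillings, raceethnicity)
names(killingsByRace)[2] <- "killings"
# The two data frames share no columns, so merge() attaches the total to every row
Temp <- merge(x = killingsByRace, y = allKillings, all.x = TRUE)
names(Temp)[3] <- "allKillings"
# Share of all killings accounted for by each group, in percent
Temp$PctKillingsByRace <- round((Temp$killings / Temp$allKillings) * 100, 3)
Temp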
## raceethnicity killings allKillings PctKillingsByRace
## 1 Asian/Pacific Islander 10 467 2.141
## 2 Black 135 467 28.908
## 3 Hispanic/Latino 67 467 14.347
## 4 Native American 4 467 0.857
## 5 Unknown 15 467 3.212
## 6 White 236 467 50.535
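The same percentages can be obtained without the intermediate merge. The sketch below uses base R's table() and prop.table() on a small made-up vector that stands in for subKillings$raceethnicity; the values are illustrative and do not come from the data set.

```r
# Toy vector standing in for subKillings$raceethnicity (illustrative values only)
race <- c("Black", "White", "White", "Hispanic/Latino", "White", "Black")

counts <- table(race)                      # killings per group
pct <- round(prop.table(counts) * 100, 3)  # share of all killings, in percent

# Assemble a table with the same column names used in the analysis
result <- data.frame(
  raceethnicity = names(counts),
  killings = as.vector(counts),
  PctKillingsByRace = as.vector(pct)
)
result
```

On the real data, replacing race with subKillings$raceethnicity should give the same counts and percentages as the table above, since both approaches divide each group's count by the total number of rows.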
Based on this data, almost 29% of all police killings in 2015 were of African Americans. Compared to a mean population share of about 18% (the mean of share_black above is 17.94), this shows a disparity: police killings disproportionately affected African Americans. Whites were very close to their mean population share (50.5% of killings against 51.9%), and Hispanics were well below theirs (14.3% of killings against 22.0%) for the same period.
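To make this comparison explicit, the two sets of figures can be placed side by side and divided. The sketch below hard-codes the percentages from the aggregation table and the means from the summary output; the data frame and its column names (comparison, pct_killings, mean_share, ratio) are introduced here for illustration only.

```r
# Figures copied from the outputs above: share of killings per group and the
# mean population share of the same group (means of the share_* columns)
comparison <- data.frame(
  group = c("Black", "White", "Hispanic/Latino"),
  pct_killings = c(28.908, 50.535, 14.347),
  mean_share = c(17.94, 51.92, 22.0)
)

# Ratio above 1: the group's share of killings exceeds its mean population share
comparison$ratio <- round(comparison$pct_killings / comparison$mean_share, 2)
comparison
```

A ratio near 1 means a group's share of killings roughly matches its mean population share, while values above or below 1 indicate over- or under-representation. This is a descriptive comparison only, not a statistical test.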
This study would benefit from data for other years, more comparisons, and analysis of armed/unarmed status, age, and other factors that might help better qualify the findings.