Policing in the community has been a hot-button topic for almost a decade, since the shooting of Trayvon Martin in 2012. Martin was a 17-year-old African American who was fatally shot by George Zimmerman, the neighborhood watch coordinator in his community. Although Zimmerman was not a police officer, the case is tied to the systemic racism involved in law enforcement. It was one of the first dominoes to fall in what would become eight years of reflection on how society views African Americans today.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.0.2
## -- Attaching packages ------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.0     v purrr   0.3.4
## v tibble  3.0.1     v dplyr   0.8.5
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## Warning: package 'tidyr' was built under R version 4.0.2
## Warning: package 'forcats' was built under R version 4.0.2
## -- Conflicts ---------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggplot2)
projectoneWD <- "C:/Users/Joeyc/Documents/School/Fall 2020/Data 110"
setwd(projectoneWD)
getwd()
## [1] "C:/Users/Joeyc/Documents/School/Fall 2020/Data 110"
The dataset was already clean, so very little cleaning was needed. I changed the dates in the data set you provided to show only the year, which makes it easy to break down the shootings by year.
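The year extraction described above could look roughly like the sketch below. It is only an illustration: the column name `date`, the `YYYY-MM-DD` format, and the sample rows are assumptions, not details taken from the actual file.

```r
# Hypothetical sample rows; the real column name and date format are assumptions.
shootings <- data.frame(
  date = c("2015-01-02", "2016-07-15", "2020-03-30"),
  stringsAsFactors = FALSE
)

# Parse the text as a Date, then keep only the four-digit year as an integer.
shootings$year <- as.integer(format(as.Date(shootings$date), "%Y"))

print(shootings)
```

Storing the year as an integer keeps it usable as a numeric x-axis in later plots.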
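# Read the cleaned data set from the working directory.
fatal_shooting_data <- read.csv("fatal_police_shootings_clean.csv", header = TRUE, sep = ",")
# Open the data in the viewer for a quick visual check (interactive sessions only).
view(fatal_shooting_data)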
fatal_shooting_data %>% 
  ggplot() +
  geom_histogram(aes(x = year, fill = race), binwidth = 1) +
  xlab("Year") +
  ylab("Frequency") +
  ggtitle("Frequency of Fatal Police Shootings per Year")

This histogram shows the distribution of fatal shootings by police officers for each year over the last five and a half years. The count for 2020 is much lower than for the previous five years, partly because the data cover only part of that year and possibly because many of us are stuck at home due to Covid. The histogram is further broken down by race, as shown by the legend to the right of the plot. N/A marks cases where the victim's race was not known. Close to 50% of the fatalities are White Non-Hispanic, around 25% are Black Non-Hispanic, 20% are Hispanic, and the remaining 5% belong to the other races. At first glance, it might seem that White Americans are the most victimized group in police shootings. We have to take a deeper dive and compare these shares to the population of the United States: roughly 60% of the US population is White Non-Hispanic (about 76% is White when Hispanic White residents are included), while about 13% is Black Non-Hispanic. African Americans therefore account for almost double their population share of the fatalities, while the share of fatalities among White Non-Hispanic Americans falls below their share of the population.
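The comparison between fatality shares and population shares can be made explicit with a short calculation. The sketch below uses a made-up vector of race labels in place of `fatal_shooting_data$race`, and the population percentages are rough figures entered by hand as an assumption rather than values from the dataset.

```r
# Hypothetical race labels standing in for fatal_shooting_data$race.
race <- c(rep("White", 10), rep("Black", 5), rep("Hispanic", 4), "Other")

# Share of fatalities per group, in percent.
fatality_share <- prop.table(table(race)) * 100

# Approximate US population shares in percent (an assumption, not from the data).
population_share <- c(White = 60, Black = 13)

# A ratio above 1 means the group is over-represented among fatalities
# relative to its share of the population.
ratio <- as.numeric(fatality_share[names(population_share)]) / population_share

print(round(ratio, 2))
```

With the real data, the same ratio could be computed per year to see whether the gap changes over time.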
An interesting addition to this analysis would be to compare the share of fatalities with each race's share of police interactions, instead of with each race's share of the US population. I did not have access to those numbers, and I am not sure they exist. That could be a more comparable baseline, and I assume the two shares would be more closely aligned than they are with the US population. Even so, the interactions themselves could reflect racial profiling, which would lead to more police interactions for minorities than for White Non-Hispanic Americans.
ggplot(fatal_shooting_data, aes(age, race)) + 
  geom_boxplot(fill = 3) +
  theme_minimal()
## Warning: Removed 233 rows containing non-finite values (stat_boxplot).

Lastly, we can look at how the ages of people killed in police shootings are distributed within each race. We can ignore the N/A and Other categories. Compare the median age (the center line of each box) for Black Non-Hispanic and White Non-Hispanic victims: it is around 37.5 for White Non-Hispanics and around 30 for Black Non-Hispanics. A larger share of the deaths among African Americans occurs at a much younger age than among White Americans. One exception is that White Non-Hispanics have a lower minimum age than every other group, even though a large majority of their fatalities are above 35 years old.
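The values read off the boxplot can be checked numerically. The sketch below computes the median age per race on a small made-up data frame; the column names `race` and `age` match the plot code, but the rows are hypothetical.

```r
# Hypothetical rows; the real data has many more records and race categories.
ages <- data.frame(
  race = c("White", "White", "White", "Black", "Black", "Black"),
  age  = c(30, 38, NA, 25, 30, 41)
)

# The formula interface of aggregate() drops rows with a missing age by default,
# which mirrors the rows ggplot removes before drawing the boxplot.
median_age <- aggregate(age ~ race, data = ages, FUN = median)

print(median_age)
```

Replacing `median` with `mean` or `min` in the call gives the other summaries discussed above.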