We will take an in-depth look at trends in the data to see how policing affects different groups of people. The data we will be working with today covers citizens fatally shot by police officers over roughly the last five years. The variables include the victim's race, age, and gender, whether or not they had a weapon, the state the shooting occurred in, and whether the victim was fleeing or not.
The dataset was already clean, so little preparation was needed. The one change I made was to reduce the dates in the data set you provided to just the year, so that we could easily break down shootings by year.
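As a rough illustration of the date-to-year step, here is a minimal sketch in base R. The column name `date`, the ISO date format, and the three sample rows are assumptions for illustration; they are not taken from the dataset.

```r
# Hypothetical stand-in for the raw data; the `date` column and its
# YYYY-MM-DD format are assumptions, not the actual file layout.
raw <- data.frame(date = c("2015-01-02", "2017-06-30", "2020-03-15"))

# Parse the strings as dates, then keep only the four-digit year as an integer.
raw$year <- as.integer(format(as.Date(raw$date), "%Y"))

print(raw)
```

Storing the year as an integer keeps it usable as a numeric axis in the plots that follow.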
library(tidyverse)  # provides %>%, ggplot2, and view()
fatal_shooting_data <- read.csv("fatal_police_shootings_clean.csv", header = TRUE, sep = ",")
view(fatal_shooting_data)
# binwidth = 1 gives one bar per year instead of the default 30 bins
fatal_shooting_data %>%
  ggplot() +
  geom_histogram(aes(x = year, fill = race), binwidth = 1) +
  xlab("Year") +
  ylab("Frequency") +
  ggtitle("Frequency of Fatal Shootings by Police per Year")

Here a histogram shows the distribution of fatal shootings by police officers for each year over the last five and a half years. The count for 2020 is much lower than in the previous five years, most likely because the data covers only part of that year; everyone being stuck at home due to Covid may also play a role. The histogram is further broken down by race, as shown in the legend to the right of the plot. N/A marks cases where the victim's race was not known. As we can see, close to 50% of the fatalities are White Non-Hispanic, around 25% are Black Non-Hispanic, 20% are Hispanic, and the remaining 5% covers all other races. At first glance, it looks as though Caucasians are the group most affected by police shootings. To judge that, we have to compare these shares with the population of the United States: roughly 60% of the US population is White Non-Hispanic (the often-quoted 76% counts everyone who identifies as White, including Hispanic residents), while about 13% is Black Non-Hispanic. African Americans therefore account for almost double their share of the population among the fatalities, while White Non-Hispanic Americans fall below their share.
An interesting addition to this analysis would be to compare the share of police interactions involving each race, rather than each race's share of the U.S. population, with the share of fatalities. I did not have access to those numbers, and I am not sure they exist. They would arguably be a fairer baseline, and I assume they would align more closely with the fatality shares than the population figures do. Even so, the interactions themselves could reflect racial profiling, which would lead to more police interactions for minorities than for White Non-Hispanics.
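The comparison between fatality share and population share can be made explicit as a simple ratio. The sketch below uses the approximate fatality shares read off the histogram (50% and 25%) and the 13% Black Non-Hispanic population share quoted above. The 60% White Non-Hispanic population share is an assumption based on Census estimates, and the function name `representation_ratio` is made up for this example.

```r
# Ratio of a group's share of fatalities to its share of the population.
# A value above 1 means the group is over-represented among fatalities.
# (`representation_ratio` is a hypothetical helper, not part of any package.)
representation_ratio <- function(fatality_share, population_share) {
  fatality_share / population_share
}

# Approximate shares; the 0.60 population figure is an assumption (Census estimate).
fatality_share   <- c(white = 0.50, black = 0.25)
population_share <- c(white = 0.60, black = 0.13)

ratio <- representation_ratio(fatality_share, population_share)
print(round(ratio, 2))
```

With these inputs the ratio is below 1 for White Non-Hispanics and close to 2 for Black Non-Hispanics, which is the "almost double" pattern described in the text.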
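# One horizontal boxplot of age per race; rows with a missing age are dropped with a warning
ggplot(fatal_shooting_data, aes(x = age, y = race)) +
  geom_boxplot(fill = 3) +
  theme_minimal()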
## Warning: Removed 233 rows containing non-finite values (stat_boxplot).

Lastly, we can look at how the age distribution of police shooting fatalities differs by race. We can set aside N/A and Other. Compare the median age (the centre line of each box) for Black Non-Hispanics and White Non-Hispanics: it is around 37.5 for White Non-Hispanics and around 30 for Black Non-Hispanics. A larger share of deaths occurs at a much younger age for African Americans than for Caucasians. One thing to note is that Caucasians have a lower minimum age than every other group, even though most of their fatalities are above 35 years old.
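The centre line of a boxplot marks the median rather than the mean, so the exact figures are easier to check numerically than to read off the plot. A minimal sketch of one way to do that with base R's `tapply()` follows. The toy data frame, its group labels, and its ages are made up for illustration and are not taken from the dataset.

```r
# Made-up ages for two hypothetical groups; NOT values from the dataset.
toy <- data.frame(
  race = c("A", "A", "A", "B", "B", "B"),
  age  = c(25, 30, NA, 35, 40, 45)
)

# Median and mean age per group; na.rm = TRUE skips missing ages,
# mirroring the rows the boxplot drops with a warning.
median_age <- tapply(toy$age, toy$race, median, na.rm = TRUE)
mean_age   <- tapply(toy$age, toy$race, mean, na.rm = TRUE)

print(median_age)
print(mean_age)
```

Applied to the real data, the same two calls on the `age` and `race` columns would show how far the mean and median differ within each group.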