Project 2

Author

Tikki Dibonge

U.S. Fatal Police Shootings from 2015 - 2024

Source: https://giffords.org/issues/police-shootings/

Introduction:

The topic of my project is fatal police shootings in the U.S. from 2015 to 2024. Police violence against civilians has been increasing steadily. Because the government and police departments themselves have failed to track this data accurately, the public has taken the issue into its own hands. Since 2015, the number of days on which police did not kill anyone in the United States has fallen almost every year, and in the years when it rose slightly, it remained alarmingly low. In this project, I present visualizations of trends in fatal police shootings in the U.S., using a data set collected and provided by The Washington Post. I chose this topic because, although the data show some patterns, police violence and fatal shootings affect everyone, regardless of state, race, gender, or age. My visualizations highlight those patterns and the increase in fatal police shootings from 2015 to 2024. My map visualization also shows how widespread fatal police shootings are and what the victims did or did not do before being shot. The variables used from this data set are:

date (the day of the fatal shooting)

body_camera (whether news reports indicated the officer was wearing a body camera that recorded any part of the incident)

city (the city where the fatal shooting took place)

state (the state in which the fatal shooting took place)

latitude and longitude (the latitude and longitude of the location of the fatal shooting)

name (the name of the victim)

age (the age of victim at time of shooting)

gender (the gender of victim. Options include male, female, non-binary, unknown)

race (the race and ethnicity of the victim. Options include: W = White, B = Black, A = Asian, N = Native American, H = Hispanic, O = Other, and “–” = Unknown. Multiple values are separated by a semicolon, e.g. “B;H”)

threat_type (actions the victim took leading up to fatal shooting. Options include shoot, point, attack, threat, move, flee, accident, undetermined)

armed_with (what, if anything, the victim was armed with. Options include gun, knife, blunt_object, other (BB guns, tasers, pepper spray, etc.), replica, undetermined, unknown (there was a weapon involved but kind unknown), unarmed, vehicle)

flee_status (how, if at all, was the victim moving relative to officers leading up to the shooting. Options include foot, car, other (via another vehicle), not)

#| message: false
#| warning: false
# Load necessary libraries, set working directory, and load dataset
library(tidyverse) # Attaches dplyr, tidyr, ggplot2, and readr
library(lubridate)
library(leaflet)
library(ggnewscale)
setwd("~/Downloads/DATA110")
# read_csv() already creates the fatalshootings object, so no data() call is needed
fatalshootings <- read_csv("fatal-police-shootings-data.csv")
# Clean the fatalshootings dataset
fatal_clean <- fatalshootings |>
  mutate(
    date = as.Date(date), # Convert date column to date type
    year = year(date) # Get the year from the date
  ) |>
  drop_na(latitude, longitude) # Remove any rows with missing coordinates
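
Because fatal_clean is reused for every chart that follows, the drop_na() step removes those incidents from the histogram, bar charts, and yearly counts as well as from the map. A minimal sketch of how one might count the rows per year that lack coordinates before dropping them, assuming a small made-up data frame (`toy_shootings` and all of its values are hypothetical, not taken from the Washington Post data):

```r
library(dplyr)

# Hypothetical toy data standing in for fatalshootings; all values are made up
toy_shootings <- tibble(
  year      = c(2015, 2015, 2016, 2016, 2016),
  latitude  = c(38.9, NA, 40.7, 34.1, NA),
  longitude = c(-77.0, -118.2, NA, -118.2, NA)
)

# Count, per year, all rows and the rows missing a latitude or a longitude
missing_by_year <- toy_shootings |>
  group_by(year) |>
  summarise(
    total = n(),
    missing_coords = sum(is.na(latitude) | is.na(longitude))
  )

missing_by_year
```

If the missing share turned out to be large or uneven across years, one option would be to keep the full data for the charts and use the coordinate-filtered copy only for the map.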

This visualization shows the age distribution of the victims from 2015 to 2024. Fatal police shootings peak among victims in their mid-to-late thirties and decline as age increases beyond that. At the younger end, shootings rise with age between 18 and 25. I find it interesting to see which age groups are more likely to be fatally shot because I had never really thought about it, and it would be worth exploring what drives this pattern.

# Create age distribution plot
ggplot(fatal_clean, aes(x = age)) +
  # 30 bins, pink fill, white bar borders
  # https://r-charts.com/distribution/histogram-binwidth-ggplot2/
  geom_histogram(bins = 30, fill = "pink", color = "white") +
  labs(
    title = "Age Distribution of Fatal Police Shooting Victims (2015-2024)",
    x = "Age",
    y = "Number of Victims"
  ) +
  theme_minimal() + # Minimal theme for plot
  theme(
    plot.title = element_text(hjust = 0.5)) # Center title

This next visualization shows the distribution of fatal police shootings by race from 2015 to 2024. I found it interesting that, over this period, white people account for the most fatal shootings. It makes sense that Black people are next, followed by Hispanic people, but I had expected either Black or Hispanic victims to be at the top. This may be because fatal shootings of Black and Hispanic people are underreported or not reported at all, but it could also simply be because white people are the largest racial group in America, so their raw count will naturally be higher. I was also surprised by the extremely low number of fatal police shootings of mixed-race victims, and I wonder whether that again reflects underreporting or simply a smaller population.

# Summarize counts by race for plot annotations
fatal_summary <- fatal_clean |>
  group_by(race) |>
  summarise(count = n())

# Create bar chart plot of shootings by race with annotations, caption, and theme
ggplot(fatal_clean, aes(x = race, fill = race)) +
  geom_bar() +
  scale_fill_brewer(palette = "Paired") +
  labs(
    title = "Fatal Police Shootings by Race (2015-2024)",
    x = "Race of Victims",
    y = "Number of Fatal Shootings",
    fill = "Race",
    caption = "Data Source: The Washington Post"
  ) +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5)) +
  # Add annotations with numeric labels above the bars
  geom_segment(
    data = fatal_summary, 
    aes(x = race, xend = race, y = count, yend = count + 2),
    color = "black"
    ) +
  geom_text(data = fatal_summary, aes(x = race, y = count + 3, label = count), vjust = 0)
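
The raw counts in this chart can also be expressed as shares of all recorded shootings, which makes the groups easier to compare. A minimal sketch, assuming a made-up set of race codes (`toy_victims` is hypothetical and its values are not from the data set). Comparing these shares with each group's share of the U.S. population would additionally require census figures, which are not part of this project:

```r
library(dplyr)

# Hypothetical race codes; values are made up for illustration only
toy_victims <- tibble(race = c("W", "W", "W", "B", "B", "H", NA, "W"))

# Count each race code and convert the counts to shares of the total
race_share <- toy_victims |>
  count(race, name = "count") |>
  mutate(share = count / sum(count)) |>
  arrange(desc(count))

race_share
```

Missing race values are kept as their own row here, so the shares sum to 1 across all recorded victims.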

This visualization shows the victim’s reported threat type by race from 2015 to 2024. By this point it is clear that white people are the largest racial group in this data set, which I believe is because they are the largest racial group in America. Regardless, it is interesting to see how the victims are distributed across threat types, by count and by race. The most prevalent threat types were pointing a weapon at someone (point), firing a weapon (shoot), attacking with another weapon or physical force (attack), and having some kind of weapon visible to the police (threat). These make sense to me (without justifying the fatal shootings), since the officers could themselves have been shot or attacked and their protocol is to respond accordingly. The threat category is surprising, though: the count for white victims is close to or above one thousand, while every other race is below 600, and I wonder what this could be connected to.

# Create bar chart showing reported threat type by victim's race
ggplot(fatal_clean, aes(x = threat_type, fill = race)) +
  geom_bar(position = "dodge") + # Separate bars for each race https://ggplot2.tidyverse.org/reference/position_dodge.html
  scale_fill_brewer(palette = "Paired") +
  labs(
    title = "Victim's Reported Threat Type by Race (2015-2024)",
    x = "Threat Type",
    y = "Number of Fatal Shootings",
    fill = "Race"
  ) +
  theme_classic() +
  theme(
    plot.title = element_text(hjust = 0.5)
  )

The final chart is simply the yearly count of fatal shootings from 2015 to 2024. As I stated in my introduction, the count has been increasing year by year. The period from the beginning of 2020 to early 2022 is an exception: because of the COVID shutdown, far fewer people were outside and interacting with police. As soon as things opened back up, however, the numbers rose well past pre-pandemic levels, including 2015. I find this visualization a little scary because it shows such a large increase right after the pandemic, to levels not seen in 2015, 2016, and the years that followed.

# Create line chart plot showing the yearly count of fatal shootings
fatal_clean |>
  count(year) |>
  ggplot(aes(x = year, y = n)) +
  geom_line(linewidth = 1.2, color = "pink") + # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
  geom_point(size = 3, color = "darkred") +
  scale_x_continuous(
    breaks = 2015:2024, # Set x axis ticks for each year. https://ggplot2.tidyverse.org/reference/scale_continuous.html
    labels = 2015:2024
  ) +
  labs(
    title = "Yearly Count of Fatal Shootings",
    x = "Year",
    y = "Number of Shootings"
  ) +
  theme_classic() +
  theme(
    plot.title = element_text(hjust = 0.5)
  )
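
To back up a statement about year-by-year increases with numbers, the yearly counts can be compared with the previous year. A minimal sketch, assuming made-up yearly counts (`toy_yearly` and its numbers are hypothetical and do not come from the data set):

```r
library(dplyr)

# Hypothetical yearly counts; these numbers are made up for illustration
toy_yearly <- tibble(
  year = 2015:2018,
  n    = c(100L, 90L, 110L, 121L)
)

# Difference and percent change relative to the previous year
yearly_change <- toy_yearly |>
  arrange(year) |>
  mutate(
    change     = n - lag(n),
    pct_change = round(100 * change / lag(n), 1)
  )

yearly_change
```

The first year has no previous year to compare against, so its change values are NA. With the real data, the same steps could be applied to the result of `count(year)`.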

# Create a weapon category variable for the leaflet map
fatal_clean <- fatal_clean |>
  mutate(
    deadly_category = case_when( # https://dplyr.tidyverse.org/reference/case_when.html
      armed_with %in% c("gun", "knife") ~ "deadly",
      armed_with %in% c("blunt_object", "other", "replica") ~ "not deadly",
      armed_with %in% c("unarmed", "undetermined") ~ "none",
      TRUE ~ "none" # Everything else (including vehicle, unknown, and missing values) is also coded as "none"
    )
  )

Finally, my leaflet map visualization. I did not want to focus on a specific area or state; I wanted to show visually just how many people were fatally shot from 2015 to 2024. Seeing the data summarized in bar and line charts is interesting, but seeing every single dot really puts the issue into perspective. And these are only the shootings that were reported and collected by The Washington Post, so the true number is probably higher. Red dots indicate that the victim had a deadly weapon, yellow dots a non-deadly weapon, and green dots no weapon. Although there are times when shooting at someone can be somewhat justified, killing them is not. Even a person who poses a serious threat does not need to be fatally shot; I think there are other ways to de-escalate the situation. This makes me think about factors like police training, police bias, gun violence, and even mental health and drug use. There are many clusters on the East Coast and in the South, which I find really interesting and which raises a lot of questions. Overall, this data and map provided a lot of interesting information about these victims and how the state of our country has changed from 2015 to 2024.

# Interactive leaflet map of shootings from 2015-2024
leaflet(fatal_clean) |>
  addTiles() |>
  addCircleMarkers(
    lng = ~longitude,
    lat = ~latitude,
    radius = 5,
    color = ~case_when(
      deadly_category == "deadly" ~ "red", # Deadly weapons in red
      deadly_category == "not deadly" ~ "yellow", # Non-deadly weapons in yellow (label must match deadly_category exactly)
      deadly_category == "none" ~ "darkgreen", # No weapon in green
      TRUE ~ "gray" # Fallback for any unexpected value
    ),
    popup = ~paste0(
      "<b>Name:</b> ", name, "<br>",
      "<b>Age:</b> ", age, "<br>",
      "<b>Gender:</b> ", gender, "<br>",
      "<b>Race:</b> ", race, "<br>",
      "<b>Body Cam:</b> ", body_camera, "<br>",
      "<b>Threat Type:</b> ", threat_type, "<br>",
      "<b>Flee Status:</b> ", flee_status, "<br>",
      "<b>Armed With:</b> ", armed_with, "<br>",
      "<b>Weapon:</b> ", deadly_category, "<br>",
      "<b>Date:</b> ", date, "<br>",
      "<b>City, State:</b> ", city, ", ", state
    ),
    fillOpacity = 0.7
  )
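
Since the meaning of the dot colors is only explained in the text, a legend on the map itself could make it easier to read. A minimal sketch, assuming a small made-up data frame (`toy_points`, its coordinates, and the name `weapon_pal` are hypothetical), that uses leaflet's colorFactor() to share one palette between the markers and addLegend():

```r
library(leaflet)

# Hypothetical toy points; coordinates and categories are made up
toy_points <- data.frame(
  longitude = c(-77.0, -118.2, -87.6),
  latitude  = c(38.9, 34.1, 41.9),
  deadly_category = c("deadly", "not deadly", "none")
)

# One palette function shared by the markers and the legend
weapon_pal <- colorFactor(
  palette = c("red", "yellow", "darkgreen"),
  domain  = NULL,
  levels  = c("deadly", "not deadly", "none")
)

toy_map <- leaflet(toy_points) |>
  addTiles() |>
  addCircleMarkers(
    lng = ~longitude,
    lat = ~latitude,
    radius = 5,
    color = ~weapon_pal(deadly_category), # Color each dot by weapon category
    fillOpacity = 0.7
  ) |>
  addLegend(
    position = "bottomright",
    pal = weapon_pal,
    values = ~deadly_category,
    title = "Weapon category"
  )

toy_map
```

Defining the colors once in a palette function also keeps the category labels in a single place, which avoids mismatches between the labels created in the data and the labels checked when coloring the markers.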