library(tidyverse)
library(sf)
library(tmap)
library(leaflet)
library(here)
library(tidycensus)
library(readr)
library(skimr)
library(ggplot2)
library(lubridate)
library(dplyr)
library(htmlwidgets)
data <- read_csv("fe.csv")
head(data)
## # A tibble: 6 × 36
## `Unique ID` Name Age Gender Race Race with imputation…¹
## <dbl> <chr> <dbl> <chr> <chr> <chr>
## 1 31495 Ashley McClendon 28 Female Afric… African-American/Black
## 2 31496 Name withheld by police NA Female Race … <NA>
## 3 31497 Name withheld by police NA Male Race … <NA>
## 4 31491 Johnny C. Martin Jr. 36 Male Race … <NA>
## 5 31492 Dennis McHugh 44 Male Europ… <NA>
## 6 31493 Ny'Darius McKinney 21 Male Race … <NA>
## # ℹ abbreviated name: ¹`Race with imputations`
## # ℹ 30 more variables: `Imputation probability` <chr>,
## # `URL of image (PLS NO HOTLINKS)` <chr>,
## # `Date of injury resulting in death (month/day/year)` <chr>,
## # `Location of injury (address)` <chr>, `Location of death (city)` <chr>,
## # State <chr>, `Location of death (zip code)` <chr>,
## # `Location of death (county)` <chr>, `Full Address` <chr>, Latitude <chr>, …
skim(data)
| Name | data |
| Number of rows | 31498 |
| Number of columns | 36 |
| _______________________ | |
| Column type frequency: | |
| character | 28 |
| logical | 2 |
| numeric | 6 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| Name | 0 | 1.00 | 4 | 82 | 0 | 29859 | 0 |
| Gender | 144 | 1.00 | 4 | 11 | 0 | 3 | 0 |
| Race | 1 | 1.00 | 14 | 57 | 0 | 11 | 0 |
| Race with imputations | 868 | 0.97 | 14 | 23 | 0 | 9 | 0 |
| Imputation probability | 884 | 0.97 | 1 | 19 | 0 | 6613 | 0 |
| URL of image (PLS NO HOTLINKS) | 16773 | 0.47 | 23 | 373 | 0 | 14667 | 0 |
| Date of injury resulting in death (month/day/year) | 0 | 1.00 | 10 | 10 | 0 | 7736 | 0 |
| Location of injury (address) | 556 | 0.98 | 3 | 74 | 0 | 28891 | 0 |
| Location of death (city) | 36 | 1.00 | 3 | 30 | 0 | 6335 | 0 |
| State | 1 | 1.00 | 2 | 2 | 0 | 51 | 0 |
| Location of death (zip code) | 182 | 0.99 | 5 | 5 | 0 | 11037 | 0 |
| Location of death (county) | 15 | 1.00 | 3 | 33 | 0 | 1532 | 0 |
| Full Address | 1 | 1.00 | 2 | 103 | 0 | 29691 | 0 |
| Latitude | 1 | 1.00 | 2 | 17 | 0 | 29514 | 0 |
| Agency or agencies involved | 78 | 1.00 | 13 | 266 | 0 | 6824 | 0 |
| Highest level of force | 4 | 1.00 | 5 | 33 | 0 | 18 | 0 |
| Name Temporary | 25969 | 0.18 | 6 | 58 | 0 | 5283 | 0 |
| Armed/Unarmed | 14419 | 0.54 | 4 | 19 | 0 | 9 | 0 |
| Alleged weapon | 14421 | 0.54 | 4 | 35 | 0 | 268 | 0 |
| Aggressive physical movement | 14418 | 0.54 | 4 | 42 | 0 | 31 | 0 |
| Fleeing/Not fleeing | 14419 | 0.54 | 4 | 42 | 0 | 25 | 0 |
| Description Temp | 27431 | 0.13 | 40 | 2239 | 0 | 3869 | 0 |
| URL Temp | 28281 | 0.10 | 1 | 723 | 0 | 3065 | 0 |
| Brief description | 2 | 1.00 | 7 | 2239 | 0 | 29882 | 0 |
| Dispositions/Exclusions INTERNAL USE, NOT FOR ANALYSIS | 3 | 1.00 | 7 | 89 | 0 | 153 | 0 |
| Intended use of force (Developing) | 3 | 1.00 | 2 | 22 | 0 | 8 | 0 |
| Supporting document link | 2 | 1.00 | 21 | 438 | 0 | 29268 | 0 |
| Foreknowledge of mental illness? INTERNAL USE, NOT FOR ANALYSIS | 62 | 1.00 | 2 | 19 | 0 | 4 | 0 |
Variable type: logical
| skim_variable | n_missing | complete_rate | mean | count |
|---|---|---|---|---|
| …33 | 31498 | 0 | NaN | : |
| …34 | 31498 | 0 | NaN | : |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| Unique ID | 1 | 1.00 | 15749.00 | 9092.55 | 1.00 | 7875 | 15749.00 | 23623.00 | 31497.00 | ▇▇▇▇▇ |
| Age | 1223 | 0.96 | 35.28 | 13.83 | 0.08 | 25 | 33.00 | 44.00 | 107.00 | ▂▇▃▁▁ |
| Longitude | 1 | 1.00 | -95.40 | 16.30 | -165.59 | -111 | -90.56 | -82.57 | -67.27 | ▁▁▅▇▇ |
| UID Temporary | 25969 | 0.18 | 15464.08 | 6559.72 | 9759.00 | 11156 | 12549.00 | 19240.00 | 30340.00 | ▇▁▁▁▂ |
| Unique ID formula | 31496 | 0.00 | 29497.00 | 2828.43 | 27497.00 | 28497 | 29497.00 | 30497.00 | 31497.00 | ▇▁▁▁▇ |
| Unique identifier (redundant) | 1 | 1.00 | 15749.00 | 9092.55 | 1.00 | 7875 | 15749.00 | 23623.00 | 31497.00 | ▇▇▇▇▇ |
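The summary above lists two columns (`…33` and `…34`) that contain no values at all. A minimal sketch, assuming `dplyr` is available, of how such all-missing columns could be dropped before plotting. The tibble `toy` and its column names are hypothetical stand-ins for `data`, not part of the original analysis.

```r
library(dplyr)

# Hypothetical stand-in for `data`: `empty_col` is entirely NA
toy <- tibble(
  id = 1:3,
  Gender = c("Male", "Female", NA),
  empty_col = c(NA, NA, NA)
)

# Keep only the columns that contain at least one non-missing value
toy_clean <- toy %>%
  select(where(~ !all(is.na(.x))))

names(toy_clean)
```

Applied to the real data, the same `select(where(...))` call would remove only columns that are completely empty; partially filled columns such as `Name Temporary` would be kept.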
p_distribution <- ggplot(data = data) +
  geom_bar(aes(x = "", fill = Gender), stat = "count", position = "fill") +
  coord_polar("y", start = 0) +
  scale_fill_manual(values = c("#00008B", "#AA0000", "#333333", "#999999")) +
  # Apply the complete dark theme first; a theme() call placed before it would be overridden
  ggdark::dark_theme_classic() +
  theme(panel.background = element_rect(fill = "black"))
## Inverted geom defaults of fill and color/colour.
## To change them back, use invert_geom_defaults().
print(p_distribution)
- The pie chart above shows the distribution of police killings by gender.
- The large majority of victims were male, accounting for nearly ninety percent of all incidents.
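The shares read off the pie chart can also be computed directly. The following is a rough sketch, assuming `dplyr`; the tibble `toy` is hypothetical and would be replaced by `data` in the notebook.

```r
library(dplyr)

# Hypothetical toy data; in the notebook this would be `data`
toy <- tibble(Gender = c("Male", "Male", "Male", "Female", NA))

gender_share <- toy %>%
  count(Gender) %>%            # one row per gender, including missing values
  mutate(share = n / sum(n))   # proportion of all records

gender_share
```

Because `count()` keeps `NA` as its own row, the shares are proportions of all records, which is the same denominator the chart uses with `position = "fill"`.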
p_age <- ggplot(data) +
geom_histogram(mapping = aes(x = Age, fill = Gender),
bins = 60,
color="black")+
scale_fill_manual(values = c("#00008B","#AA0000", "#333333","#999999"))+
ggdark::dark_theme_classic()
print(p_age)
## Warning: Removed 1223 rows containing non-finite values (`stat_bin()`).
- The stacked histogram shows the distribution of police killings by age, with each bar split by gender.
- The highest number of incidents occurred in the 25-40 age group.
- There is a noticeable disparity in police violence across age ranges and genders, with young adult females being particularly affected.
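To check a statement such as "the highest number of incidents occurred in the 25-40 group" numerically, ages can be binned and counted. This is only a sketch under assumptions: the break points and labels below are illustrative choices, not the bins used in the histogram, and `toy` is a hypothetical stand-in for `data`.

```r
library(dplyr)

# Hypothetical toy data with the same column names as `data`
toy <- tibble(
  Age = c(19, 27, 33, 38, 52, NA),
  Gender = c("Male", "Male", "Female", "Male", "Male", "Female")
)

age_band_counts <- toy %>%
  filter(!is.na(Age)) %>%                       # the histogram also drops missing ages
  mutate(age_band = cut(Age,
                        breaks = c(0, 25, 40, 60, Inf),
                        right = FALSE,          # intervals are [lower, upper)
                        labels = c("<25", "25-39", "40-59", "60+"))) %>%
  count(age_band, Gender)

age_band_counts
```

Counting by band and gender together makes it possible to compare where each gender's distribution peaks, rather than reading it off stacked bars.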
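# Rename the date column by name rather than by position, then parse it and derive the year
data_time <- data %>%
  rename(time = `Date of injury resulting in death (month/day/year)`) %>%
  mutate(time = as.Date(time, format = "%m/%d/%Y"),
         year = year(time))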
# Count the number of incidents per year and gender
incident_counts <- data_time %>%
group_by(year, Gender) %>%
count()
# Named palette for the observed gender levels; missing values are handled through na.value
color_palette <- c("Female" = "#00008B", "Male" = "#AA0000", "Transgender" = "#333333")
p_trend <- ggplot(data = incident_counts) +
  geom_line(aes(x = year, y = n, color = Gender), linewidth = 2) +
  scale_color_manual(values = color_palette, na.value = "#00000000") +
  labs(x = "Year", y = "Number of Incidents") +
  ggdark::dark_theme_classic()
print(p_trend)
- The line chart above shows the trend in police killings over time.
- Police killings show a clear upward trend from 2000 to 2021.
- The number of incidents peaked in 2020.
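The peak year can be identified from the counts rather than by eye. A minimal sketch, assuming `dplyr` and `lubridate`; the dates in `toy` are made up for illustration and the column name `incidents` is a hypothetical choice.

```r
library(dplyr)
library(lubridate)

# Hypothetical dates in the same month/day/year text format as the source column
toy <- tibble(time = c("01/15/2019", "03/02/2020", "07/19/2020", "11/30/2021"))

yearly <- toy %>%
  mutate(time = as.Date(time, format = "%m/%d/%Y"),
         year = year(time)) %>%
  count(year, name = "incidents")   # total incidents per year, all genders combined

# Year with the largest total
peak_year <- yearly$year[which.max(yearly$incidents)]
peak_year
```

Summing across genders before taking the maximum avoids reading the peak off a single gender's line.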
data_race <- read_csv("fe-cleaned.csv")
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
## dat <- vroom(...)
## problems(dat)
## Rows: 31495 Columns: 36
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (27): Name, Gender, Race, Race with imputations, Imputation probability,...
## dbl (6): Unique ID, Age, Latitude, Longitude, UID Temporary, Unique identif...
## lgl (3): Column, Column2, Unique ID formula
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
p_race <- ggplot(data = data_race) +
  geom_bar(mapping = aes(x = Gender, fill = Race), position = "fill") +
  # Distinct fill colours so that each race category is distinguishable
  scale_fill_manual(values = c("#AA0000", "#FF8C00", "#006400", "#000080", "#008B8B", "#8B008B", "#333333")) +
  ggdark::dark_theme_classic()
print(p_race)
- The stacked bar chart above shows the racial composition of police killings within each gender group.
- Among female victims, Asian and White victims make up a larger share than they do among other genders.
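The bar heights in a `position = "fill"` chart are within-group proportions, which can be tabulated as follows. This is a sketch assuming `dplyr`; `toy` and its race labels `"A"` and `"B"` are hypothetical placeholders for `data_race`.

```r
library(dplyr)

# Hypothetical toy data; "A" and "B" stand in for real race categories
toy <- tibble(
  Gender = c("Female", "Female", "Male", "Male", "Male", "Male"),
  Race   = c("A", "B", "A", "A", "A", "B")
)

race_share <- toy %>%
  count(Gender, Race) %>%
  group_by(Gender) %>%
  mutate(share = n / sum(n)) %>%   # proportion of each race within a gender
  ungroup()

race_share
```

These shares describe the composition of recorded incidents within each gender; they are not population-adjusted rates.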
save.image('1005.RData')
There is a significant gender disparity in police killings, with males accounting for nearly 90% of the total incidents. This suggests that males are disproportionately affected by police violence.
The age group most affected by police killings is 25-40, indicating that young adults are at a higher risk. This is a concerning trend that needs to be addressed.
There is a noticeable disparity in police violence across different age ranges and genders, with young adult females being particularly affected. This highlights the intersectionality of age, gender, and violence, and underscores the need for interventions that take these factors into account.
The number of police killings has been increasing over time, peaking in 2020. This upward trend is alarming and suggests that current measures to prevent police violence may be insufficient.
Among female victims, Asian and White individuals make up a larger share of those killed than they do among victims of other genders. This disparity points to the need for targeted interventions to protect these groups.
These conclusions underscore the urgent need for comprehensive reforms and interventions to address police violence. It’s crucial that these efforts consider the intersecting factors of gender, age, and race to effectively reduce and prevent such incidents.