Main idea
“Black Lives Matter” was sparked by “incidents of police brutality and
racially motivated violence against black people” (source: Wikipedia).
From the Fatal Encounters dataset, I want to find out 1) how
disproportionate police killings of black people are relative to their
share of the population and 2) whether that disproportionality differs
across states.
Load packages and print the session info
## Load packages
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(sf)
## Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE
library(here)
## here() starts at C:/Users/iskim/Dropbox (GaTech)/2022-2023/2022 Fall/CP 8883 Intro to Urban Analytics/UA_module5
library(tmap)
library(tidycensus)
## print the session info
sessionInfo()
## R version 4.2.1 (2022-06-23 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22621)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] tidycensus_1.2.3 tmap_3.3-3 here_1.0.1 sf_1.0-8
## [5] forcats_0.5.2 stringr_1.4.1 dplyr_1.0.10 purrr_0.3.4
## [9] readr_2.1.2 tidyr_1.2.1 tibble_3.1.8 ggplot2_3.3.6
## [13] tidyverse_1.3.2
##
## loaded via a namespace (and not attached):
## [1] fs_1.5.2 lubridate_1.8.0 RColorBrewer_1.1-3
## [4] httr_1.4.4 rprojroot_2.0.3 tools_4.2.1
## [7] backports_1.4.1 bslib_0.4.0 rgdal_1.5-32
## [10] utf8_1.2.2 R6_2.5.1 KernSmooth_2.23-20
## [13] DBI_1.1.3 colorspace_2.0-3 raster_3.5-29
## [16] sp_1.5-0 withr_2.5.0 tidyselect_1.1.2
## [19] leaflet_2.1.1 compiler_4.2.1 leafem_0.2.0
## [22] cli_3.4.0 rvest_1.0.3 xml2_1.3.3
## [25] sass_0.4.2 scales_1.2.1 classInt_0.4-7
## [28] proxy_0.4-27 rappdirs_0.3.3 digest_0.6.29
## [31] foreign_0.8-82 rmarkdown_2.16 base64enc_0.1-3
## [34] dichromat_2.0-0.1 pkgconfig_2.0.3 htmltools_0.5.3
## [37] dbplyr_2.2.1 fastmap_1.1.0 htmlwidgets_1.5.4
## [40] rlang_1.0.5 readxl_1.4.1 rstudioapi_0.14
## [43] jquerylib_0.1.4 generics_0.1.3 jsonlite_1.8.0
## [46] crosstalk_1.2.0 googlesheets4_1.0.1 magrittr_2.0.3
## [49] Rcpp_1.0.9 munsell_0.5.0 fansi_1.0.3
## [52] abind_1.4-5 terra_1.6-17 lifecycle_1.0.2
## [55] stringi_1.7.8 leafsync_0.1.0 yaml_2.3.5
## [58] tmaptools_3.1-1 maptools_1.1-4 grid_4.2.1
## [61] parallel_4.2.1 crayon_1.5.1 lattice_0.20-45
## [64] haven_2.5.1 stars_0.5-6 hms_1.1.2
## [67] knitr_1.40 pillar_1.8.1 uuid_1.1-0
## [70] codetools_0.2-18 reprex_2.0.2 XML_3.99-0.10
## [73] glue_1.6.2 evaluate_0.16 modelr_0.1.9
## [76] png_0.1-7 vctrs_0.4.1 tzdb_0.3.0
## [79] cellranger_1.1.0 gtable_0.3.1 assertthat_0.2.1
## [82] cachem_1.0.6 xfun_0.32 lwgeom_0.2-8
## [85] broom_1.0.1 e1071_1.7-11 class_7.3-20
## [88] googledrive_2.0.0 viridisLite_0.4.1 gargle_1.2.1
## [91] tigris_1.6.1 units_0.8-0 ellipsis_0.3.2
Load data and map the variable of interest
The “Race” column records the race of individuals killed by police.
Because the column has 11 categories, including “Race unspecified”, I
reduced them to four: Black, White, Other, and Unspecified. The table
below shows the distribution of race (counts and percentages) after
reclassification; 22.26% are black and 33.81% are white. Since more than
a quarter of the individuals in the dataset (27.88%) have no race
information, it is better to compare the black-to-white ratio in the
dataset with the black-to-white ratio in the population than to compare
the share of black people in the dataset with their share of the
population. The map following the table shows the spatial distribution
of police killings of civilians, with dots colored by race.
## Load data
df <- read_rds(here("data", "fe_data_sf_XYswap.rds"))
## Recode "Race" variable
df <- df %>%
  mutate(Race_reclassified = case_when(
    str_detect(Race, "African|Black") ~ "Black",
    str_detect(Race, "White") ~ "White",
    str_detect(Race, "Race unspecified") ~ "Unspecified",
    TRUE ~ "Other"
  )
  )
## Check the distribution of the recoded variable (Race_reclassified)
df %>%
  st_drop_geometry() %>%
  count(Race_reclassified) %>%
  mutate(percentage = round(n/nrow(df)*100, 2)) %>%
  rename(count = n)
## Race_reclassified count percentage
## 1 Black 7012 22.26
## 2 Other 5054 16.05
## 3 Unspecified 8780 27.88
## 4 White 10650 33.81
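The ratio comparison described above can be sketched from the printed counts alone. The snippet below is a minimal illustration, not part of the original analysis: it computes the black-to-white ratio among cases with known race, and defines a hypothetical helper, `disparity_index()`, that would divide that ratio by the corresponding ratio in the population. The helper name and its `pop_black` / `pop_white` arguments are assumptions; no population figures are supplied here.

```r
## Counts copied from the table printed above
n_black <- 7012
n_white <- 10650

## Black-to-white ratio among police killings with known race
ratio_killings <- n_black / n_white
round(ratio_killings, 3)

## Hypothetical helper (not in the original analysis):
## ratio in the dataset divided by the ratio in the population.
## A value above 1 would mean black people are over-represented
## among police killings relative to white people.
disparity_index <- function(n_black, n_white, pop_black, pop_white) {
  (n_black / n_white) / (pop_black / pop_white)
}
```

Because both the numerator and the denominator are ratios between the same two groups, the “Unspecified” and “Other” categories drop out of the comparison, which is the reason for preferring it over raw shares.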
## Create dummy variables indicating the race
df <- df %>%
  mutate(black = case_when(Race_reclassified == "Black" ~ 1, TRUE ~ 0),
         white = case_when(Race_reclassified == "White" ~ 1, TRUE ~ 0))
## Map the recoded variable (Race_reclassified)
tmap_mode('view') %>% suppressMessages()
tm_shape(df) + tm_dots(col = "Race_reclassified")