Main idea

“Black Lives Matter” was prompted by “incidents of police brutality and racially motivated violence against black people” (source: Wikipedia). From the Fatal Encounters dataset, I want to learn 1) how disproportionate police killings of black people are relative to their share of the population and 2) whether this disproportionateness differs across states.

Load packages and print the session info

## Load packages
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.2      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(sf)
## Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE
library(here)
## here() starts at C:/Users/iskim/Dropbox (GaTech)/2022-2023/2022 Fall/CP 8883 Intro to Urban Analytics/UA_module5
library(tmap)
library(tidycensus)

## print the session info
sessionInfo()
## R version 4.2.1 (2022-06-23 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22621)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] tidycensus_1.2.3 tmap_3.3-3       here_1.0.1       sf_1.0-8        
##  [5] forcats_0.5.2    stringr_1.4.1    dplyr_1.0.10     purrr_0.3.4     
##  [9] readr_2.1.2      tidyr_1.2.1      tibble_3.1.8     ggplot2_3.3.6   
## [13] tidyverse_1.3.2 
## 
## loaded via a namespace (and not attached):
##  [1] fs_1.5.2            lubridate_1.8.0     RColorBrewer_1.1-3 
##  [4] httr_1.4.4          rprojroot_2.0.3     tools_4.2.1        
##  [7] backports_1.4.1     bslib_0.4.0         rgdal_1.5-32       
## [10] utf8_1.2.2          R6_2.5.1            KernSmooth_2.23-20 
## [13] DBI_1.1.3           colorspace_2.0-3    raster_3.5-29      
## [16] sp_1.5-0            withr_2.5.0         tidyselect_1.1.2   
## [19] leaflet_2.1.1       compiler_4.2.1      leafem_0.2.0       
## [22] cli_3.4.0           rvest_1.0.3         xml2_1.3.3         
## [25] sass_0.4.2          scales_1.2.1        classInt_0.4-7     
## [28] proxy_0.4-27        rappdirs_0.3.3      digest_0.6.29      
## [31] foreign_0.8-82      rmarkdown_2.16      base64enc_0.1-3    
## [34] dichromat_2.0-0.1   pkgconfig_2.0.3     htmltools_0.5.3    
## [37] dbplyr_2.2.1        fastmap_1.1.0       htmlwidgets_1.5.4  
## [40] rlang_1.0.5         readxl_1.4.1        rstudioapi_0.14    
## [43] jquerylib_0.1.4     generics_0.1.3      jsonlite_1.8.0     
## [46] crosstalk_1.2.0     googlesheets4_1.0.1 magrittr_2.0.3     
## [49] Rcpp_1.0.9          munsell_0.5.0       fansi_1.0.3        
## [52] abind_1.4-5         terra_1.6-17        lifecycle_1.0.2    
## [55] stringi_1.7.8       leafsync_0.1.0      yaml_2.3.5         
## [58] tmaptools_3.1-1     maptools_1.1-4      grid_4.2.1         
## [61] parallel_4.2.1      crayon_1.5.1        lattice_0.20-45    
## [64] haven_2.5.1         stars_0.5-6         hms_1.1.2          
## [67] knitr_1.40          pillar_1.8.1        uuid_1.1-0         
## [70] codetools_0.2-18    reprex_2.0.2        XML_3.99-0.10      
## [73] glue_1.6.2          evaluate_0.16       modelr_0.1.9       
## [76] png_0.1-7           vctrs_0.4.1         tzdb_0.3.0         
## [79] cellranger_1.1.0    gtable_0.3.1        assertthat_0.2.1   
## [82] cachem_1.0.6        xfun_0.32           lwgeom_0.2-8       
## [85] broom_1.0.1         e1071_1.7-11        class_7.3-20       
## [88] googledrive_2.0.0   viridisLite_0.4.1   gargle_1.2.1       
## [91] tigris_1.6.1        units_0.8-0         ellipsis_0.3.2

Load data and map the variable of interest

The “Race” column contains the race of individuals killed by police. Because the column has 11 categories, including “Race unspecified”, I reduced the number of categories to four: Black, White, Other, and Unspecified. The table below shows the distribution of race (both counts and percentages) after the reclassification; 22.26% are black and 33.81% are white. Since more than a quarter of the individuals in the dataset (27.88%) have no race information attached to them, it is better to compare the ratio of black to white in the dataset with the same ratio in the population than to compare the share of black people in the dataset with their share in the population. The map following the table shows the spatial distribution of police killings of civilians, with dots colored by race.

## Load data
df <- read_rds(here("data", "fe_data_sf_XYswap.rds"))

## Recode "Race" variable
df <- df %>% 
  mutate(Race_reclassified = case_when(
    str_detect(Race, "African|Black")      ~ "Black",
    str_detect(Race, "White")              ~ "White",
    str_detect(Race, "Race unspecified")   ~ "Unspecified",
    TRUE                                   ~ "Other"
    )
  )

## Check the distribution of the recoded variable (Race_reclassified)
df %>% 
  st_drop_geometry() %>% 
  count(Race_reclassified) %>% 
  mutate(percentage = round(n/nrow(df)*100, 2)) %>% 
  rename(count = n)
##   Race_reclassified count percentage
## 1             Black  7012      22.26
## 2             Other  5054      16.05
## 3       Unspecified  8780      27.88
## 4             White 10650      33.81
## Create dummy variables indicating the race
df <- df %>% 
  mutate(black = case_when(Race_reclassified == "Black" ~ 1, TRUE ~ 0),
         white = case_when(Race_reclassified == "White" ~ 1, TRUE ~ 0))
## Map the recoded variable (Race_reclassified)
tmap_mode('view') %>% suppressMessages()
tm_shape(df) + tm_dots(col = "Race_reclassified")
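
The reasoning above, that a black-to-white ratio is less sensitive to missing race information than a raw share, can be checked against the counts printed in the table. The following is a minimal sketch in base R; the vector `race` is rebuilt from those counts purely for illustration and is not the original data.

```r
## Rebuild a race vector from the counts printed in the table above
## (illustration only; this is not the original Fatal Encounters data)
counts <- c(Black = 7012, Other = 5054, Unspecified = 8780, White = 10650)
race <- rep(names(counts), times = counts)

## The mean of a 0/1 dummy equals the share of that category
black <- as.numeric(race == "Black")
white <- as.numeric(race == "White")
share.black <- round(mean(black) * 100, 2)  # matches the 22.26 in the table
share.white <- round(mean(white) * 100, 2)  # matches the 33.81 in the table

## Dropping the "Unspecified" rows changes the shares ...
known <- race != "Unspecified"
share.black.known <- round(mean(black[known]) * 100, 2)

## ... but leaves the black-to-white ratio unchanged
ratio.all   <- mean(black) / mean(white)
ratio.known <- mean(black[known]) / mean(white[known])
```

The ratio is still only as reliable as the assumption that cases with unspecified race are not skewed toward one of the two groups.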

Download Census data

The total population, black population, and white population of each state (plus D.C. and Puerto Rico) were downloaded from the 2020 ACS 5-year data.

## Download the population by race for each state
census_api_key(read_rds(here("data","census_api_key.rds"))[1])
## To install your API key for use in future sessions, run this function with `install = TRUE`.
state <- get_acs(geography = "state", 
                 variables = c(tot.pop = "B01001_001", 
                               black.pop = "B01001B_001", 
                               white.pop = "B01001A_001"),
                 year = 2020,
                 survey = "acs5", 
                 geometry = TRUE,  # also download state boundaries as sf geometry
                 output = "wide"   # one row per state, one column per variable
                 ) %>% 
  suppressMessages() %>% 
  st_transform(4326) %>% 
  mutate(p.black = black.popE/tot.popE,
         p.white = white.popE/tot.popE)
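
With `output = "wide"`, tidycensus appends `E` (estimate) and `M` (margin of error) to each variable name, which is why the code above refers to `black.popE` and `tot.popE`. Below is a minimal sketch of the share calculation on a hand-made data frame; the two rows and their population figures are hypothetical, not ACS values.

```r
## Hypothetical wide-format table mimicking the column naming of
## get_acs(output = "wide"); the numbers are made up for illustration
toy <- data.frame(
  NAME       = c("State A", "State B"),
  tot.popE   = c(1000000, 500000),
  black.popE = c(250000, 20000),
  white.popE = c(600000, 450000)
)

## Shares of black and white population, mirroring the mutate() call above
toy$p.black <- toy$black.popE / toy$tot.popE
toy$p.white <- toy$white.popE / toy$tot.popE

toy[, c("NAME", "p.black", "p.white")]
```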

Get the ratios

Spatially joining the Fatal Encounters data (points) to the ACS population data (multipolygons) creates a new dataset that contains, for each state, the share of black people in the population (p.black), the share of white people in the population (p.white), the share of black people among those killed (p.black.killed), and the share of white people among those killed (p.white.killed). The ratio of the first two is the ratio of black to white in the population (ratio.pop.black.white), and the ratio of the last two is the ratio of black to white among those killed by police (ratio.killed.black.white). Lastly, the index of disproportionateness (ratio.killed.pop) is calculated by dividing ratio.killed.black.white by ratio.pop.black.white. An index value larger than one indicates that the black-to-white ratio among those killed by police is higher than the black-to-white ratio in the population. The larger the value, the more disproportionate the police killings of black people are relative to the share of black people in the population.

## Do spatial join of the state population data (multipolygon) and fatal encounters data (points)
summary <- state %>% 
  st_join(df, join = st_intersects) %>% 
  st_drop_geometry() %>% 
  group_by(GEOID) %>% 
  summarise(pop = mean(tot.popE),
            p.black = mean(p.black),
            p.white = mean(p.white),
            p.black.killed = mean(black),
            p.white.killed = mean(white)
            )

## Calculate the ratio of black population and white population and the ratio of killed black and killed white    
summary <- summary %>% 
  mutate(ratio.pop.black.white = p.black/p.white,
         ratio.killed.black.white = p.black.killed/p.white.killed,
         ratio.killed.pop = ratio.killed.black.white/ratio.pop.black.white)

## Attach the geometry information to the new dataset
summary <- state %>% 
  select(GEOID, NAME) %>%
  left_join(summary, by = "GEOID")
## Print the resulting dataset after removing (1) Puerto Rico with no fatal encounters data and (2) Alaska and Hawaii lying outside of mainland United States
summary <- summary %>% 
  filter(NAME != "Puerto Rico", NAME != "Alaska", NAME != "Hawaii") %>%
  arrange(GEOID)

summary %>% st_drop_geometry() %>% as_tibble()
## # A tibble: 49 × 10
##    GEOID NAME        pop p.black p.white p.bla…¹ p.whi…² ratio…³ ratio…⁴ ratio…⁵
##    <chr> <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
##  1 01    Alabama  4.89e6  0.266    0.675  0.265   0.338   0.394    0.784    1.99
##  2 04    Arizona  7.17e6  0.0453   0.738  0.0710  0.350   0.0614   0.202    3.30
##  3 05    Arkansas 3.01e6  0.152    0.754  0.184   0.417   0.202    0.442    2.19
##  4 06    Califor… 3.93e7  0.0572   0.561  0.146   0.223   0.102    0.656    6.42
##  5 08    Colorado 5.68e6  0.0415   0.815  0.0722  0.388   0.0509   0.186    3.65
##  6 09    Connect… 3.57e6  0.107    0.742  0.271   0.382   0.144    0.709    4.92
##  7 10    Delaware 9.68e5  0.220    0.674  0.391   0.344   0.326    1.14     3.48
##  8 11    Distric… 7.02e5  0.454    0.411  0.641   0.0435  1.11    14.8     13.3 
##  9 12    Florida  2.12e7  0.159    0.716  0.301   0.368   0.222    0.817    3.67
## 10 13    Georgia  1.05e7  0.316    0.572  0.341   0.244   0.551    1.40     2.54
## # … with 39 more rows, and abbreviated variable names ¹​p.black.killed,
## #   ²​p.white.killed, ³​ratio.pop.black.white, ⁴​ratio.killed.black.white,
## #   ⁵​ratio.killed.pop
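
The index calculation can be checked by hand for a single row. The sketch below uses the rounded values printed for Alabama in the table above, so the results agree with the table only up to rounding.

```r
## Rounded values for Alabama, copied from the printed table above
p.black        <- 0.266
p.white        <- 0.675
p.black.killed <- 0.265
p.white.killed <- 0.338

## Black-to-white ratio in the population and among those killed
ratio.pop.black.white    <- p.black / p.white
ratio.killed.black.white <- p.black.killed / p.white.killed

## Index of disproportionateness: ratio among killed over ratio in population
ratio.killed.pop <- ratio.killed.black.white / ratio.pop.black.white

round(c(ratio.pop.black.white, ratio.killed.black.white, ratio.killed.pop), 2)
```

The three numbers correspond to the 0.394, 0.784, and 1.99 shown for Alabama: the black-to-white ratio among those killed is about twice the ratio in the population.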

Visualize the ratios (mapping)

Below are three maps showing the spatial distributions of 1) the ratio of black to white in the population, 2) the ratio of black to white among those killed by police, and 3) the index of disproportionateness. The break points for the colors on each map are quintiles. The first map illustrates that the ratio of black to white is generally high for states in the Southeast. Even though the second map looks similar to the first one, some states in the Midwest and Northeast also have a high ratio of black to white among those killed. The third map shows that the disproportionateness of police killings of black people is more prominent in the Midwest and Northeast, while the Southeast is the least disproportionate region. Surprisingly, California and Utah have high index values.

tm_shape(summary) + tm_polygons(col = "ratio.pop.black.white", style = "quantile") +
  tm_text("NAME", size = 0.8)
tm_shape(summary) + tm_polygons(col = "ratio.killed.black.white", style = "quantile") +
  tm_text("NAME", size = 0.8)
tm_shape(summary) + tm_polygons(col = "ratio.killed.pop", style = "quantile") +
  tm_text("NAME", size = 0.8)
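
To make the quintile breaks concrete, the sketch below computes them in base R for the ten index values printed in the table earlier (not all 49 rows). It only illustrates the idea behind `style = "quantile"`; tmap's own class-interval routine may treat ties and boundaries slightly differently.

```r
## Index values for the ten rows printed in the table earlier (rounded)
index <- c(1.99, 3.30, 2.19, 6.42, 3.65, 4.92, 3.48, 13.3, 3.67, 2.54)

## Quintile break points: the 0%, 20%, ..., 100% quantiles
breaks <- quantile(index, probs = seq(0, 1, by = 0.2))

## Assign each value to one of the five classes
class.id <- cut(index, breaks = breaks, include.lowest = TRUE, labels = FALSE)
table(class.id)
```

Because the breaks are quantiles, each color class holds roughly the same number of states, so a single extreme value such as D.C. does not stretch the color scale.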

Visualize the ratios (scatter plots)

To investigate the result further, I created a scatter plot of the ratio of black to white in the population against the ratio of black to white among those killed by police. According to the plot, there is a positive correlation between the two ratios, which is expected. One outlier in the graph is the District of Columbia (D.C.). Its ratio of black to white among those killed is about 14.8 and its index value is about 13.3, indicating an enormously disproportionate rate of police killings of black people.
To better examine the other states, I drew another scatter plot with D.C. excluded. This plot also includes a red line with the equation “y = x”. Only two states, Wyoming and North Dakota, are below the line, meaning that their ratio of black to white among those killed is smaller than their ratio of black to white in the population. In fact, black people make up about 1% and 3% of the population in Wyoming and North Dakota, respectively, and the Fatal Encounters dataset records no black people killed in either state. Considering the share of cases with unspecified race in the dataset and the low shares of black population in the two states, it is quite probable that they are above the line in reality.

## Draw a scatter plot of all states and D.C., labeling the five highest population ratios
outliers <- summary %>% 
  st_drop_geometry() %>% 
  arrange(desc(ratio.pop.black.white)) %>% 
  slice(1:5)

ggplot(data = summary %>% st_drop_geometry(), 
       mapping = aes(x=ratio.pop.black.white, y=ratio.killed.black.white)
       ) + 
  geom_point() +
  geom_point(data = outliers, size = 3, shape = 1, color = "black") + 
  ggrepel::geom_label_repel(data = outliers, mapping = aes(label = NAME)) + 
  labs(x = "Ratio of black and white (population)",
       y = "Ratio of black and white (fatal encounters)",
       title = "Ratio of black and white (population vs fatal encounters)") +
  theme_light()

## Draw a scatter plot excluding D.C., with the line y = x in red
outliers <- summary %>% 
  st_drop_geometry() %>% 
  filter(NAME != "District of Columbia") %>% 
  arrange(desc(ratio.pop.black.white)) %>% 
  slice(1:4)

below.the.line <- summary %>% 
  st_drop_geometry() %>% 
  filter(ratio.killed.pop < 1)

high.ratio.fatal <- summary %>% 
  st_drop_geometry() %>% 
  filter(NAME %in% c("Illinois", "New Jersey"))

ggplot(data = summary %>% st_drop_geometry() %>% filter(NAME != "District of Columbia"), 
       mapping = aes(x=ratio.pop.black.white, y=ratio.killed.black.white)
       ) + 
  geom_point() +
  geom_point(data = outliers, size = 3, shape = 1, color = "black") + 
  ggrepel::geom_label_repel(data = outliers, mapping = aes(label = NAME)) + 
  geom_point(data = below.the.line, size = 3, shape = 1, color = "black") + 
  ggrepel::geom_label_repel(data = below.the.line, mapping = aes(label = NAME)) + 
  geom_point(data = high.ratio.fatal, size = 3, shape = 1, color = "black") + 
  ggrepel::geom_label_repel(data = high.ratio.fatal, mapping = aes(label = NAME)) + 
  labs(x = "Ratio of black and white (population)",
       y = "Ratio of black and white (fatal encounters)",
       title = "Ratio of black and white (population vs fatal encounters) \n(excluding D.C.)") +
  theme_light() + 
  geom_abline(intercept = 0, slope = 1, color="red") 
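
The two observations drawn from the plots, a positive correlation and the meaning of the line y = x, can be sketched numerically. The example below uses only the ten rows printed in the table earlier (rounded values, not all 49 rows), so its numbers illustrate the calculation rather than reproduce the full result.

```r
## Ratios for the ten rows printed in the table earlier (rounded; not all 49 rows)
toy <- data.frame(
  NAME = c("Alabama", "Arizona", "Arkansas", "California", "Colorado",
           "Connecticut", "Delaware", "District of Columbia", "Florida", "Georgia"),
  ratio.pop.black.white    = c(0.394, 0.0614, 0.202, 0.102, 0.0509,
                               0.144, 0.326, 1.11, 0.222, 0.551),
  ratio.killed.black.white = c(0.784, 0.202, 0.442, 0.656, 0.186,
                               0.709, 1.14, 14.8, 0.817, 1.40)
)

## Pearson correlation with and without D.C.
cor.all   <- cor(toy$ratio.pop.black.white, toy$ratio.killed.black.white)
no.dc     <- toy[toy$NAME != "District of Columbia", ]
cor.no.dc <- cor(no.dc$ratio.pop.black.white, no.dc$ratio.killed.black.white)

## A point lies below the line y = x exactly when the index is below one
below.the.line <- toy$NAME[toy$ratio.killed.black.white < toy$ratio.pop.black.white]
```

In this ten-row subset both correlations are positive and no state falls below the line; Wyoming and North Dakota are not among the ten rows shown.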

Conclusions

One way to check the disproportionateness is to divide the ratio of black to white among those killed by the ratio of black to white in the population. According to the maps and plots, police killings of black people are clearly disproportionate to the population share of black people across the US, because almost all states have an index value larger than one. The disproportionateness is more serious in the Midwest and Northeast and less serious in the Southeast, where the share of black people in the population is larger. Wisconsin, Illinois, Ohio, Pennsylvania, New Jersey, Missouri, Nebraska, Utah, and California are states with high disproportionateness. Washington D.C. has an exceptionally high index value of about 13.3. This exceptional value might stem from the fact that Washington D.C. is small and mostly urbanized compared to the states in the US.