My question is:

From 2015 to 2021, what proportion of fatal police shootings had body camera footage of the incident? Has that proportion changed over time?

Introduction

I will be using the Fatal Police Shootings data set from the openintro.org website. It has 6421 observations and 12 variables. I will use two of those variables, date and body_camera, to answer my question about whether the proportion of body camera use changed over time. The data were collected by the Washington Post from 2015 through 2021 and are periodically updated as more information becomes available.

I chose it because I am curious to see how quickly new technology is being adopted. I wanted to know if, as I suspect, there was an increase in body camera usage over time.

Data Analysis

I am going to use the date and body_camera columns to answer the question. I will look at the dimensions and head of the data and check for missing values. Then I will use group_by() and summarise() to count incidents by year and compute the proportion in which body cameras were used.

Then I will create tables, display the relevant numbers, create a scatterplot of the results and discuss them.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(corrplot)
## corrplot 0.95 loaded
library(lubridate)


setwd("~/Downloads/Data 101 Course materials/Data Sets")
f_p_s <- read.csv("fatal_police_shootings.csv")
str(f_p_s)
## 'data.frame':    6421 obs. of  12 variables:
##  $ date                   : chr  "2015-01-02" "2015-01-02" "2015-01-03" "2015-01-04" ...
##  $ manner_of_death        : chr  "shot" "shot" "shot and Tasered" "shot" ...
##  $ armed                  : chr  "gun" "gun" "unarmed" "toy weapon" ...
##  $ age                    : int  53 47 23 32 39 18 22 35 34 47 ...
##  $ gender                 : chr  "M" "M" "M" "M" ...
##  $ race                   : chr  "A" "W" "H" "W" ...
##  $ city                   : chr  "Shelton" "Aloha" "Wichita" "San Francisco" ...
##  $ state                  : chr  "WA" "OR" "KS" "CA" ...
##  $ signs_of_mental_illness: chr  "True" "False" "False" "True" ...
##  $ threat_level           : chr  "attack" "attack" "other" "attack" ...
##  $ flee                   : chr  "Not fleeing" "Not fleeing" "Not fleeing" "Not fleeing" ...
##  $ body_camera            : chr  "False" "False" "False" "False" ...
head(f_p_s)
##         date  manner_of_death      armed age gender race          city state
## 1 2015-01-02             shot        gun  53      M    A       Shelton    WA
## 2 2015-01-02             shot        gun  47      M    W         Aloha    OR
## 3 2015-01-03 shot and Tasered    unarmed  23      M    H       Wichita    KS
## 4 2015-01-04             shot toy weapon  32      M    W San Francisco    CA
## 5 2015-01-04             shot   nail gun  39      M    H         Evans    CO
## 6 2015-01-04             shot        gun  18      M    W       Guthrie    OK
##   signs_of_mental_illness threat_level        flee body_camera
## 1                    True       attack Not fleeing       False
## 2                   False       attack Not fleeing       False
## 3                   False        other Not fleeing       False
## 4                    True       attack Not fleeing       False
## 5                   False       attack Not fleeing       False
## 6                   False       attack Not fleeing       False
colSums(is.na(f_p_s))
##                    date         manner_of_death                   armed 
##                       0                       0                       0 
##                     age                  gender                    race 
##                     285                       0                       0 
##                    city                   state signs_of_mental_illness 
##                       0                       0                       0 
##            threat_level                    flee             body_camera 
##                       0                       0                       0

Although there are 285 NAs in the age column, this is not an issue because I am not using the age variable.
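The missing-value check above can also be restricted to the two columns the analysis uses. The sketch below is only an illustration: it rebuilds the six rows shown by head(f_p_s) rather than reading the CSV, and the names f_p_s_head, cam and missing_counts are made up for this example. It also counts empty strings, because read.csv() by default leaves blank character fields as "" rather than NA, so is.na() alone can miss them.

```r
# First six rows of the relevant columns, copied from the head(f_p_s) output above
f_p_s_head <- data.frame(
  date = c("2015-01-02", "2015-01-02", "2015-01-03",
           "2015-01-04", "2015-01-04", "2015-01-04"),
  age = c(53, 47, 23, 32, 39, 18),
  body_camera = c("False", "False", "False", "False", "False", "False")
)

# Keep only the two columns used in the analysis
cam <- f_p_s_head[, c("date", "body_camera")]

# Count NAs and empty strings in each kept column
missing_counts <- colSums(is.na(cam) | cam == "")
missing_counts
```

Running the same two steps on the full f_p_s data frame would confirm that neither variable has hidden blanks.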

f_p_s$year <- format(as.Date(f_p_s$date), "%Y")

Extracting the year from each date lets the incidents be grouped by year and by whether body camera footage was reported.

df <- f_p_s |>
group_by(year, body_camera) |>
summarise(number_incidents = n()
)
## `summarise()` has regrouped the output.
## ℹ Summaries were computed grouped by year and body_camera.
## ℹ Output is grouped by year.
## ℹ Use `summarise(.groups = "drop_last")` to silence this message.
## ℹ Use `summarise(.by = c(year, body_camera))` for per-operation grouping
##   (`?dplyr::dplyr_by`) instead.
df
## # A tibble: 14 × 3
## # Groups:   year [7]
##    year  body_camera number_incidents
##    <chr> <chr>                  <int>
##  1 2015  False                    918
##  2 2015  True                      75
##  3 2016  False                    814
##  4 2016  True                     145
##  5 2017  False                    878
##  6 2017  True                     108
##  7 2018  False                    869
##  8 2018  True                     121
##  9 2019  False                    863
## 10 2019  True                     136
## 11 2020  False                    847
## 12 2020  True                     174
## 13 2021  False                    365
## 14 2021  True                     108

False means no camera footage was reported. True means that news sources mentioned a recording.

total<- xtabs(~year, data=f_p_s)
total
## year
## 2015 2016 2017 2018 2019 2020 2021 
##  993  959  986  990  999 1021  473

This shows the total number of fatal police shootings that took place by year.

df_true<- f_p_s |>
group_by(year, body_camera) |>
summarise(number_incidents = n()
) |>
  filter(body_camera=="True")
## `summarise()` has regrouped the output.
## ℹ Summaries were computed grouped by year and body_camera.
## ℹ Output is grouped by year.
## ℹ Use `summarise(.groups = "drop_last")` to silence this message.
## ℹ Use `summarise(.by = c(year, body_camera))` for per-operation grouping
##   (`?dplyr::dplyr_by`) instead.
df_true
## # A tibble: 7 × 3
## # Groups:   year [7]
##   year  body_camera number_incidents
##   <chr> <chr>                  <int>
## 1 2015  True                      75
## 2 2016  True                     145
## 3 2017  True                     108
## 4 2018  True                     121
## 5 2019  True                     136
## 6 2020  True                     174
## 7 2021  True                     108

This shows the total fatal police shootings that were captured by body camera by year.

#2015
75/993
## [1] 0.0755287
#2016
145/959
## [1] 0.1511992
#2017
108/986
## [1] 0.1095335
#2018
121/990
## [1] 0.1222222
#2019
136/999
## [1] 0.1361361
#2020
174/1021
## [1] 0.1704212
#2021
108/473
## [1] 0.2283298
years <- c(2015, 2016, 2017, 2018, 2019, 2020, 2021)
proportion_body_camera <- c(0.0755287, 0.1511992, 0.1095335, 0.1222222, 0.1361361, 0.1704212, 0.2283298)
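The proportions typed in by hand above can also be computed directly from the grouped counts, which avoids copying errors. This is a minimal sketch, not the method used in the report: it rebuilds the year-by-body_camera counts from the table printed earlier instead of reading the CSV, and the names counts, props and proportion are chosen for this example.

```r
library(dplyr)

# Counts copied from the year-by-body_camera table above
counts <- data.frame(
  year = rep(2015:2021, each = 2),
  body_camera = rep(c("False", "True"), times = 7),
  number_incidents = c(918, 75, 814, 145, 878, 108, 869, 121,
                       863, 136, 847, 174, 365, 108)
)

# Within each year, divide each count by that year's total,
# then keep only the rows where footage was reported
props <- counts |>
  group_by(year) |>
  mutate(proportion = number_incidents / sum(number_incidents)) |>
  ungroup() |>
  filter(body_camera == "True")

props
```

With the full data frame, the same result should be obtainable in one step by grouping by year and taking mean(body_camera == "True").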


plot(years, proportion_body_camera)
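The default scatterplot has no title or readable axis labels. A possible labelled version is sketched below using only base R graphics; the label text, the y-axis range and the choice to connect the points are suggestions for this example, not part of the original analysis.

```r
years <- 2015:2021
proportion_body_camera <- c(0.0755287, 0.1511992, 0.1095335, 0.1222222,
                            0.1361361, 0.1704212, 0.2283298)

# type = "b" draws both points and connecting lines so the year-to-year
# changes are easier to follow; ylim starts at 0 to avoid exaggerating them
plot(
  years, proportion_body_camera,
  type = "b", pch = 19,
  ylim = c(0, 0.25),
  xlab = "Year",
  ylab = "Proportion with body camera footage",
  main = "Fatal police shootings with body camera footage, 2015-2021"
)
```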

Discussion of the results:

2021 had the highest proportion of fatal shooting incidents captured on body camera, at almost 23%, a notable increase over previous years. 2015 had the lowest, at around 7.5%. In 2016 the proportion doubled to about 15%. In 2017 it declined to almost 11%, and then it increased each year through 2021.

In conclusion, the proportion of fatal shooting incidents captured on body camera increased between 2015 and 2021. The largest single-year increase took place between 2015 and 2016, and despite a decrease in 2017, the proportion rose each year after that through 2021.
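The conclusion above rests on comparing the yearly proportions by eye. One way to check it formally is a chi-squared test for trend in proportions, available in base R as prop.trend.test(). This test is not part of the original analysis and is offered only as a sketch; the counts are copied from the tables above, and the variable names are chosen for this example.

```r
# Incidents with reported body camera footage, and total incidents, by year (2015-2021)
body_cam_true   <- c(75, 145, 108, 121, 136, 174, 108)
total_incidents <- c(993, 959, 986, 990, 999, 1021, 473)

# Chi-squared test for a linear trend in the proportions across years;
# by default the years are scored 1, 2, ..., 7
trend <- prop.trend.test(x = body_cam_true, n = total_incidents)
trend
```

A small p-value would indicate that the yearly proportions follow a linear trend rather than fluctuating at random. It would not by itself show that more cameras were in use, since the variable only records footage that was mentioned in news reports.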

It would be interesting to compare these results to current usage to determine if this trend has continued. It would also be worthwhile to look at whether the increase in body camera usage has led to fewer fatal police shootings: whether the camera acts as a deterrent for those who would threaten to harm police officers, and whether it brings more transparency and accountability to the actions officers take.

It is important to note that the body camera variable in this data set is based on news reports, i.e. whether it was made public that there was body camera footage of a fatal police shooting. We do not know whether more footage exists than has been reported. It would be helpful to know if there is a requirement for this information to be shared, so that we could tell whether we are seeing the complete picture of body camera usage during fatal police shootings. Although the Washington Post updates this data set as more information becomes available, more investigation of this point is required.

Additionally, the Covid-19 pandemic and the resulting lockdowns and social distancing may have contributed to the lower number of fatal police shootings recorded in 2021 and may have affected the proportion of body camera use that year. However, one would expect the same effect in 2020, and the numbers for that year do not show it. The 2021 count (473) is roughly half that of the other years, so it may instead reflect that the data set covers only part of 2021; checking the latest date in the data would settle this.

Sources

https://www.openintro.org/data/index.php?data=fatal_police_shootings

https://www.statology.org/italics-in-r/ (For the scatterplot)

For further reading:

https://www.jstor.org/stable/resrep71359?seq=8