This is an investigation into Lyman Stone’s Above the Law: The Data Are In on Police, Killing, and Race (Above the Law) published in Public Discourse. I read Above the Law while searching for non-partisan, non-sensational commentary on police violence and racial bias. Stone did a nice job reasoning through the issues and supporting his conclusions, which may be summarized in this excerpt:

Police violence in America is extraordinary in its intensity. It is disproportionate to the actual threats facing police officers, and it has risen significantly in recent years without apparent justification.

I dug into some of the data used in Above the Law to reach my own conclusions. Here is what I found.

  • Police killings have been flat at around 1,200 per year for the last two decades. Expressed in per-capita terms, police killings have been falling. (Above the Law found that police killings doubled over the period to 1,700.)
  • Police killings are infrequent, occurring at around the same rate as deaths due to childbirth. Police violence may be a social problem, but the media spotlight probably creates an unrealistic picture of the magnitude of the issue. You are almost 20 times as likely to be killed by your neighbor as by your police department.
  • Blacks are the victims of police killings at a rate disproportionate to their share of the population. But measured against their share of violent offenders, it is unclear whether blacks are disproportionately killed by police.

The analysis below is less a critique of Above the Law, and more of a second exploration of the data sources to reach my own conclusions. I coded this analysis in R Markdown and embedded my source code at most (but not all) steps for reference.

The Rise in Fatal Encounters with Police

Above the Law begins with a characterization of the magnitude and trend of police violence. There are several possible data sources, and they all have limitations. Stone relies mostly on the Fatal Encounters (FE) database. Its data is likely the most accurate. However, it relies on scraping internet reports, so it is vulnerable to under-reporting older incidents. The raw FE data is conveniently available as a Google Sheets file.

library(tidyverse) # provides %>%, dplyr, tidyr, ggplot2, forcats, and purrr, all used below

# help with googlesheets4 at https://googlesheets4.tidyverse.org/
googlesheets4::gs4_deauth() # not using a private sheet, so no need for token
fe_raw <- googlesheets4::read_sheet("https://docs.google.com/spreadsheets/d/1dKmaV_JiWcG8XBoRgP8b4e9Eopkpgt7FL7nyspvzAsE/edit#gid=0")

fe <- fe_raw %>%
  janitor::clean_names() %>%
  mutate(subjects_race = factor(subjects_race),
         subjects_race_with_imputations = factor(subjects_race_with_imputations),
         subjects_race_with_imputations = fct_lump_prop(subjects_race_with_imputations, 0.10))

Here is the FE data series shown in Stone’s Figure 1. Stone explains that multiple academic studies have confirmed the accuracy of the data, but I’m a little skeptical. For one thing, he cites only a single study, and that study confirmed only that the data was accurate; it did not confirm that the data was complete. A search on Google Scholar returned a couple of other studies. One mentioned historical coverage as a possible limitation because internet data is generally sparser the further back you look. According to FE, fatal police encounters doubled from ~850 in 2000 to ~1,800 in 2019. The increase is astonishing!

fe %>%
  filter(date_year < 2020) %>%
  count(date_year) %>%
  ggplot(aes(x = date_year, y = n)) +
  geom_line(color = "#868B8E") +
  theme_minimal() +
  scale_y_continuous(limits = c(0, NA), labels = scales::number_format(big.mark = ",")) +
  labs(title = "Killings by Police",
       x = "", y = "",
       caption = "Source: Fatal Encounters, fatalencounters.org.")

Fatal Encounters Database data series from Figure 1 of Above the Law.

I doubt the level of police killings is quite this high. For one thing, not all fatal encounters are police killings. The victim might commit suicide, die in a car crash during a pursuit, and so on. Fortunately, FE includes a field to characterize the incidents as “Deadly force”, “Less-than-lethal force”, “No [force]”, “Suicide”, and a few others. Here is a breakdown of the frequencies by type of force.

fe %>% 
  # small bit of data cleaning
  mutate(intentional_use_of_force_developing = case_when(
    intentional_use_of_force_developing == "Intentional use of force" ~ "Less-than-lethal force",
    intentional_use_of_force_developing == "Np" ~ "No",
    intentional_use_of_force_developing == "Vehicle" ~ "Pursuit",
    intentional_use_of_force_developing == "Vehic/Purs" ~ "Pursuit",
    intentional_use_of_force_developing == "Unknown" ~ "Undetermined",
    TRUE ~ intentional_use_of_force_developing
  )) %>%
  group_by(intentional_use_of_force_developing) %>%
  replace_na(list(intentional_use_of_force_developing = "Undetermined", 
                  cause_of_death = "Unknown/Unspecified")) %>%
  # within each force category, lump causes of death under 10% of records into "Other"
  mutate(cause_of_death = fct_lump_prop(cause_of_death, prop = 0.10)) %>%
  ungroup() %>%
  count(intentional_use_of_force_developing, cause_of_death) %>%
  arrange(intentional_use_of_force_developing, desc(n)) %>%
  janitor::adorn_totals() %>%
  flextable::flextable() %>%
  flextable::colformat_int(j = "n") %>%
  flextable::set_caption("FE data, 2000 through mid-Sept 2020. Selected grouping (see code).") %>%
  flextable::autofit()
FE data, 2000 through mid-Sept 2020. Selected grouping (see code).

| intentional_use_of_force_developing | cause_of_death | n |
|---|---|---|
| Deadly force | Gunshot | 17,368 |
| Deadly force | Other | 19 |
| Less-than-lethal force | Tasered | 911 |
| Less-than-lethal force | Other | 312 |
| Less-than-lethal force | Asphyxiated/Restrained | 267 |
| Less-than-lethal force | Beaten/Bludgeoned with instrument | 174 |
| No | Medical emergency | 242 |
| No | Drowned | 168 |
| No | Gunshot | 145 |
| No | Drug overdose | 131 |
| No | Other | 124 |
| Pursuit | Vehicle | 5,887 |
| Pursuit | Other | 3 |
| Suicide | Gunshot | 2,883 |
| Suicide | Other | 109 |
| Undetermined | Undetermined | 45 |
| Undetermined | Other | 5 |
| Total | - | 28,793 |

Even within the “Deadly force” and “Less-than-lethal force” categories there are causes of death, like “Drug overdose”, that would not qualify as a police killing. I think Stone should have at least limited his Figure 1 to “Deadly force” and “Less-than-lethal force” cases. At the time of this analysis, those sum to 19,051.
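The 19,051 figure can be reproduced from the table above. Here is a quick base R check; the counts are typed in by hand from the table rather than pulled from the live sheet, and the variable names are mine.

```r
# Counts copied by hand from the force-type table (FE data as of mid-Sept 2020).
force_counts <- c(
  "Deadly force / Gunshot"                                     = 17368,
  "Deadly force / Other"                                       = 19,
  "Less-than-lethal force / Tasered"                           = 911,
  "Less-than-lethal force / Other"                             = 312,
  "Less-than-lethal force / Asphyxiated/Restrained"            = 267,
  "Less-than-lethal force / Beaten/Bludgeoned with instrument" = 174
)
total_records <- 28793  # "Total" row of the same table

force_total <- sum(force_counts)             # records involving police force
force_share <- force_total / total_records   # share of all fatal encounters

print(force_total)
print(round(force_share, 2))
```

So roughly two thirds of the records in the database involve police force; the rest are pursuits, suicides, and other non-force encounters.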

Filtering out non-force police encounters does not change the trend though. Police killings still more than doubled over 20 years. Here are the fatal encounters broken down by presence of police force.

fe_smry <- fe %>%
  filter(date_year < 2020) %>%
  mutate(force = if_else(intentional_use_of_force_developing %in%
                           c("Deadly force", 
                             "Intentional use of force", 
                             "Less-than-lethal force"), 1, 0),
         any = 1,
         nonforce = any - force) %>%
  group_by(date_year) %>%
  summarize(.groups = "drop",
            `Police Force` = sum(force), 
            `Non-Force` = sum(nonforce), 
            `Any Cause` = sum(any)) %>%
  pivot_longer(-date_year, names_to = "grp", values_to = "n") %>%
  mutate(grp = factor(grp, levels = c("Any Cause", "Police Force", "Non-Force")))

fe_smry %>%
  ggplot(aes(x = date_year, y = n, color = grp)) +
  geom_line() +
  geom_smooth(method = "lm", se = FALSE, linetype = 3, size = .5) +
  scale_color_manual(values = c(`Any Cause` = "#868B8E", `Police Force` = "#EF7C8E", `Non-Force` = "#18A558")) +
  theme_minimal() +
  scale_y_continuous(limits = c(0, NA), labels = scales::number_format(big.mark = ",")) +
  theme(legend.position = "top") +
  labs(title = "Fatal Encounters with Police",
       x = "", y = "", color = "",
       caption = "Source: Fatal Encounters, fatalencounters.org.")

I am skeptical about the slopes of the lines. Non-Force fatal encounters nearly doubled too. That is strange. My hunch is that this is due to decreasing availability of documentation as you look backward in time. I trust the recent data much more than the earlier data.

One way to get a better sense of the trend in fatal encounters with police force is by assuming fatal encounters without police force are flat. That is, I can lift up the “Non-Force” line on the left side so it is approximately level, then lift up the “Police Force” line by the same proportion. The remaining slope in the “Police Force” line would yield a better picture of the trend.

A linear trend line fitted through each series shows that non-force fatal encounters in 2000 were 48% of their 2019 value, while police force fatal encounters were 46% of their 2019 value. That suggests nearly all of the upward slope in force-related fatal encounters could be explained by better reporting. In the figure below, I scaled the pre-2019 values to adjust for the slope of the non-force fatal encounters line.
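The code that builds the scaled series is one of the steps not embedded here, so the following is only a sketch of how such a scaling could work, not the exact code behind the figure. It uses made-up yearly counts (not the real FE numbers) and assumes the adjustment is a simple division by the fitted non-force trend, expressed relative to its 2019 level. All variable names are hypothetical.

```r
# Hypothetical yearly counts standing in for the FE summary; NOT real data.
years        <- 2000:2019
wiggle       <- rep(c(-10, 10), times = 10)   # deterministic noise
non_force    <- round(seq(240, 500, length.out = 20)) + wiggle
police_force <- round(seq(550, 1200, length.out = 20)) + 2 * wiggle

# 1. Fit a linear trend through the non-force series.
fit   <- lm(non_force ~ years)
trend <- unname(fitted(fit))

# 2. Assumed reporting factor: fitted non-force level relative to 2019.
#    If true non-force deaths were flat, this is the share of incidents
#    that made it into the database in each year.
reporting <- trend / trend[length(trend)]

# 3. Scale the police force counts up by the inverse of that factor.
police_force_scaled <- police_force / reporting

result <- data.frame(
  year                = years,
  police_force        = police_force,
  reporting           = round(reporting, 2),
  police_force_scaled = round(police_force_scaled)
)
print(head(result, 3))
print(tail(result, 3))
```

By construction the 2019 value is unchanged and earlier years are lifted. Whatever slope remains in the scaled series is the part that the assumed reporting improvement does not explain.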

fe_mdl %>%
  select(date_year, `Police Force`, `Police Force (scaled)`) %>%
  pivot_longer(-date_year) %>%
  mutate(name = factor(name, levels = c("Police Force", "Police Force (scaled)"))) %>%
  ggplot(aes(x = date_year, y = value, color = name)) +
  geom_line() +
  geom_smooth(method = "lm", se = FALSE, linetype = 3, size = .5) +
  scale_color_manual(values = c(`Police Force` = "#EF7C8E", `Police Force (scaled)` = "#18A558")) +
  theme_minimal() +
  scale_y_continuous(limits = c(0, NA), labels = scales::number_format(big.mark = ",")) +
  theme(legend.position = "top") +
  labs(title = "Fatal Encounters with Police Using Force",
       subtitle = "Pre-2019 values scaled for improved reporting rates.",
       x = "", y = "", color = "",
       caption = "Source: Fatal Encounters, fatalencounters.org.")

The green line is probably a much better picture of the level of fatal encounters with police force over the last two decades. Stone found police killings doubling to a current annual level of around 1,700 per year. I think police killings have been flat at about 1,200.

It’s probably also a good idea to normalize the data for population changes. In 2000, the U.S. population was 281 million. By 2019 it had increased 16% to 327 million. Here is the prior figure expressed per-capita. Now the slope is slightly negative.

fe_mdl_pop <- fe_mdl
fe_mdl_pop$Population <- census_extrap$Total
fe_mdl_pop <- fe_mdl_pop %>%
  mutate(`Police Force (scaled, per-capita)` = `Police Force (scaled)` / Population * 1e6)

fe_mdl_pop %>%
  ggplot(aes(x = date_year, y = `Police Force (scaled, per-capita)`)) +
  geom_line(color = "#18A558") +
  geom_smooth(method = "lm", se = FALSE, linetype = 3, size = .5, color = "#18A558") +
#  scale_color_manual(values = c(`Police Force` = "#EF7C8E", `Police Force (scaled)` = "#18A558")) +
  theme_minimal() +
  scale_y_continuous(limits = c(0, NA), labels = scales::number_format(big.mark = ",")) +
  # theme(legend.position = "top") +
  labs(title = "Per Capita Fatal Encounters with Police Using Force",
       subtitle = "Pre-2019 values scaled for improved reporting rates.",
       x = "", y = "per million inhabitants",
       caption = "Source: Fatal Encounters, fatalencounters.org.")

The rate of fatal encounters involving police force was about 4 per million population in 2000, and has fallen to around 3.5 per million over 20 years. My adjustments to Stone’s Figure 1 are fairly crude, but I am not surprised to learn that deadly police encounters per capita are falling given the intense media scrutiny on the issue. Whether the level (~3.5 per million) is too high, or the reduction over time (-0.5 in 20 years) too small, is a separate question.
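The per-capita arithmetic is easy to check by hand. The sketch below uses the two population figures quoted above and a hypothetical flat count of 1,200 per year. The flat count is a round illustrative number, not the scaled FE series, so the rates it produces will not match the chart exactly.

```r
pop_2000 <- 281e6   # U.S. population in 2000, as quoted in the text
pop_2019 <- 327e6   # U.S. population in 2019, as quoted in the text

pop_growth <- pop_2019 / pop_2000 - 1          # relative population growth

flat_count <- 1200                             # hypothetical flat annual count
rate_2000  <- flat_count / pop_2000 * 1e6      # deaths per million, 2000
rate_2019  <- flat_count / pop_2019 * 1e6      # deaths per million, 2019

# Relative change in the per-capita rate; this does not depend on flat_count.
rate_change <- rate_2019 / rate_2000 - 1

print(round(c(pop_growth = pop_growth, rate_change = rate_change), 3))
print(round(c(rate_2000 = rate_2000, rate_2019 = rate_2019), 2))
```

A perfectly flat count combined with 16% population growth implies a per-capita decline of about 14%, which is in line with the slightly negative slope in the chart.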

I was a little surprised the levels are that high. I live in Cleveland in Cuyahoga County where the population is about 1.2 million. So you might expect about 4 fatalities due to police force per year. To sanity check the results, I pulled the records for the last 3 years (2017-2019). There were 10 results. All look legitimate.

fe %>%
  filter (date_year %in% c(2017:2019) &
            location_of_death_state == "OH" &
            location_of_death_county == "Cuyahoga" &
            intentional_use_of_force_developing %in% c("Deadly force",
                             "Intentional use of force",
                             "Less-than-lethal force")) %>%
  mutate(Date = lubridate::ymd(date_of_injury_resulting_in_death_month_day_year)) %>%
  select(Date, 
         Victim = subjects_name, 
         Description = a_brief_description_of_the_circumstances_surrounding_the_death) %>%
  arrange(Date) %>%
  flextable::flextable() %>%
  flextable::theme_zebra() %>%
  flextable::autofit() %>%
  flextable::set_caption("Fatal Encounters with police in Cuyahoga County, OH, 2017-2019.")
Fatal Encounters with police in Cuyahoga County, OH, 2017-2019.

| Date | Victim | Description |
|---|---|---|
| 2017-03-07 | Roy Dale Evans Jr. | Strongsville police attempted to stop a vehicle for a traffic violation. The driver failed to stop and continued to speed, police said. The pursuit ended at I-271 and I-71. Strongsville Officer Jason Miller fired his weapon at least one time, killing the driver, Roy Dale Evans Jr. |
| 2017-03-13 | Luke O. Stewart | A Euclid police officer fatally shot Luke O. Stewart during a confrontation, police said. Police were investigating a report of a suspicious vehicle parked on South Lakeshore. Police did not say what precipitated the shooting. |
| 2017-04-10 | Jeffrey James Findlay | Police were called after a neighbor said she saw Findlay pointing something at his ex-girlfriend. Police said Findlay was in the driveway in front of the home with a gun when they arrived, and he ignored orders to drop the gun before he was shot and killed. |
| 2017-10-25 | Antonio Levison | Police were called for a report of shots fired. Officers encountered two "suspicious men" who they tried to stop, police said. One of the men showed a handgun. Both of the men ran away, and a gun fell from one of the men who fell down as police gave chase. Officers continued to chase the second man and gave commands for him to stop and put his hands up before they ran in the backyard of a home. The man allegedly showed a gun and the officer shot and killed him. |
| 2018-01-13 | Thomas Yatsko | An off-duty police officer shot and killed Thomas Yatsko about 11 p.m. at the Corner Alley bowling alley, police said. A fight broke out inside the bar and the off-duty police officer, who was working part-time security at the business, escorted several men outside. Yatsko allegedly came back to the bowling alley and attacked the officer, who shot and killed him. |
| 2018-05-24 | Brett Luengo | Brett Luengo crashed his car along Interstate 90 East. Witnesses said he went after anyone who stopped to help. When a Cuyahoga County Sheriff's deputy pulled up, he again became aggressive. Video shows the man on the ground flailing while ignoring the deputy's commands before getting up and going after him, still shouting. The deputy repeatedly ordered him to get on the ground, and shocked him with a Taser. The deputy shot and killed him when the man allegedly lunged at him. |
| 2018-06-20 | Jonathan Legg | An officer stopped the car Jonathan Legg was driving because a license plate check showed the license plate on the Chevrolet Impala did not belong to that car, police said. He was shot and killed when he allegedly shot one of the officers. |
| 2019-07-09 | Shawn M. Toney | Shawn Toney was experiencing some kind of medical or mental crisis when he was shot and killed while in a standoff with police. He allegedly fired one shot, as did an officer, but it was not immediately reported who owned the bullet that killed him. |
| 2019-11-05 | Maurice Brown | Maurice Brown allegedly shot a woman inside her apartment then followed her outside and refused officers' orders to drop the gun, and was shot and killed by police as he raised the gun to the woman's head. |
| 2019-11-15 | Mark Sheppard | A homicide suspect, Mark Sheppard, was holding a shotgun when a Cleveland officer shot and killed him in the driveway of an East Side home, police said. Neither Sheppard nor the officer's partner fired any shots during the incident. |
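The Cuyahoga County sanity check can be put in numbers. The sketch below assumes the national rate of about 3.5 per million applies uniformly to the county and that yearly counts behave like a Poisson process. Both are simplifying assumptions of mine, not something the FE data establishes.

```r
county_pop_millions <- 1.2   # Cuyahoga County population, as quoted in the text
rate_per_million    <- 3.5   # approximate national rate from the per-capita chart
n_years             <- 3     # 2017 through 2019
observed            <- 10    # records returned by the query above

expected_per_year <- county_pop_millions * rate_per_million
expected_total    <- expected_per_year * n_years

# Probability of seeing 10 or fewer deaths if the county matched the
# national rate (Poisson assumption).
p_at_most_observed <- ppois(observed, lambda = expected_total)

print(c(expected_per_year = expected_per_year, expected_total = expected_total))
print(round(p_at_most_observed, 2))
```

The observed 10 deaths sit a little below the 12.6 expected over three years, so the county records are at least consistent in magnitude with the national per-capita figure.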

It is difficult to conceptualize a proportion like 3.5 per million of population. To get some sense of the risk, I looked up some other fatality statistics.

pop2017 <- census_extrap %>% filter(Year == 2017) %>% select(Total) %>% pluck(1)
pop2018 <- census_extrap %>% filter(Year == 2018) %>% select(Total) %>% pluck(1)
pop2019 <- census_extrap %>% filter(Year == 2019) %>% select(Total) %>% pluck(1)
fe2019 <- fe_smry %>% filter(date_year == 2019 & grp == "Police Force") %>% select(n) %>% pluck(1)
tribble(
  ~Cause, ~`Deaths per 1 million population`, ~Reference,
  "Police Force", fe2019 / pop2019 * 1e6, "(2019 data from fatalencounters.org)",
  "Assault (homicide)", 19510 / pop2017 * 1e6, "https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_09-508.pdf",
  "Alcohol-Impaired Driving Fatalities", 10511 / pop2018 * 1e6, "https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/812826",
  "Accidental discharge of firearm", 486 / pop2017 * 1e6, "https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_09-508.pdf",
  "Pregnancy, childbirth and the puerperium", 1208 / pop2017 * 1e6, "https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_09-508.pdf",
  "Complications of medical and surgical care", 4459 / pop2017 * 1e6, "https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_09-508.pdf"
) %>%
  flextable::flextable() %>%
  flextable::colformat_num(j = 2, digits = 1) %>%
  flextable::autofit()

| Cause | Deaths per 1 million population | Reference |
|---|---|---|
| Police Force | 3.5 | (2019 data from fatalencounters.org) |
| Assault (homicide) | 59.9 | https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_09-508.pdf |
| Alcohol-Impaired Driving Fatalities | 32.1 | https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/812826 |
| Accidental discharge of firearm | 1.5 | https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_09-508.pdf |
| Pregnancy, childbirth and the puerperium | 3.7 | https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_09-508.pdf |
| Complications of medical and surgical care | 13.7 | https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_09-508.pdf |

I cherry-picked the causes. Homicide and drunk driving accidents seemed like good comparisons. I picked “Accidental discharge of firearm” and “pregnancy” because they have rates in the same ballpark as police killings. I don’t mean to trivialize death by police force; death during childbirth is not preventable, but death from police force probably is. Nevertheless, it does seem like we have as much to fear from ourselves, our neighbors, and our medical professionals as we have to fear from our police.
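To make the comparison concrete, the rates in the table can be expressed as multiples of the police force rate. The values below are copied by hand from the rounded table, so the ratios are approximate. The short names are mine.

```r
# Deaths per 1 million population, copied from the table above.
rates <- c(
  police_force          = 3.5,
  homicide              = 59.9,
  alcohol_driving       = 32.1,
  accidental_firearm    = 1.5,
  pregnancy_childbirth  = 3.7,
  medical_complications = 13.7
)

# Each cause as a multiple of the police force rate.
ratio_to_police <- round(rates / rates[["police_force"]], 1)
print(ratio_to_police)
```

On these rounded numbers the homicide rate is about 17 times the police force rate, which is where the "almost 20 times" comparison comes from.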

Measuring Racial Bias

Compared to population proportions, the victims of fatal encounters with police are disproportionately black. When compared to the population of violent offenders, it’s a close call. Here are the fatal encounters data grouped by race/ethnicity, with population percentages and violent offender percentages. The violent offender percentages are from Race and Hispanic Origin of Victims and Offenders, 2012-15 (Table 1) published by the U.S. Department of Justice.

avg_pop <- census_extrap %>% select(-source, -Total) %>% pivot_longer(-Year) %>%
  group_by(name) %>%
  summarize(sum_pop = sum(value), .groups = "drop") %>%
  mutate(pop_pct = sum_pop / sum(sum_pop) * 100) %>%
  select(-sum_pop)
violent_off <- tribble(
  ~`race/hisp`, ~viol_pct,
  "African-American/Black", 0.227*100,
  "European-American/White", 0.438*100,
  "Hispanic/Latino", 0.144*100,
  "Other", (0.022+.06+.028+.08)*100)
fe %>%
  filter(intentional_use_of_force_developing %in% c("Deadly force", 
                                                    "Intentional use of force", 
                                                    "Less-than-lethal force")) %>%
  count(subjects_race_with_imputations) %>%
  ungroup() %>%
  mutate(pct = n / sum(n) * 100) %>%
  inner_join(avg_pop, by = c("subjects_race_with_imputations" = "name")) %>%
  inner_join(violent_off, by = c("subjects_race_with_imputations" = "race/hisp")) %>%
  janitor::adorn_totals() %>%
  flextable::flextable() %>%
  flextable::colformat_int(j = "n") %>%
  flextable::colformat_num(j = c("pct", "pop_pct", "viol_pct"), digits = 0, suffix = "%") %>%
  flextable::autofit()

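| subjects_race_with_imputations | n | pct (share of fatalities) | pop_pct (share of population) | viol_pct (share of violent offenders) |
|---|---|---|---|---|
| African-American/Black | 5,407 | 28% | 12% | 23% |
| European-American/White | 8,903 | 47% | 64% | 44% |
| Hispanic/Latino | 3,434 | 18% | 16% | 14% |
| Other | 1,307 | 7% | 8% | 19% |
| Total | 19,051 | 100% | 100% | 100% |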

African-American/Black subjects made up 28% of the fatalities, 12% of the U.S. population, and 23% of violent offenders (at least from 2012-2015). There is a problem with the table though. The “Other” category for violent offenders is so high (19%) because it is composed of four sub-categories:

  • 2.2% are offenders perceived to be American Indian or Alaska Native or Asian, Native Hawaiian, or Other Pacific Islander
  • 6% are single offenders of two or more races
  • 2.8% are multiple offenders of various races
  • 8% are unknown race or number of offenders

The 2.2% are definitely not African-American/Black, but it seems likely that some of the 16.8% of mixed or unknown race would be part of the African-American/Black fatal encounter figure. Stone says only about 22 percent of violent offenders are black, but does not say where that statistic comes from. We need to get this number right because 22% is already pretty close to the 28% victimization number.
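A rough sensitivity check shows why the offender share matters so much. The sketch below compares the 28% fatality share against two bounds for the black share of violent offenders: the published 22.7%, and 22.7% plus all 16.8% of mixed or unknown race. The upper bound is deliberately extreme and is my assumption, used only to bracket the answer.

```r
fatal_share    <- 28     # % of force-related fatalities, African-American/Black
pop_share      <- 12     # % of U.S. population
offender_share <- 22.7   # % of violent offenders (DOJ Table 1, 2012-15)

# Mixed-race single offenders + multiple offenders of various races + unknown.
mixed_or_unknown <- 6 + 2.8 + 8

# Disparity ratios: a value above 1 means over-representation among fatalities.
ratio_vs_population <- fatal_share / pop_share
ratio_vs_offenders  <- fatal_share / c(
  none_reassigned = offender_share,                     # lower bound on share
  all_reassigned  = offender_share + mixed_or_unknown   # extreme upper bound
)

print(round(ratio_vs_population, 2))
print(round(ratio_vs_offenders, 2))
```

Against population the ratio is well above 1, but against violent offenders it ranges from above 1 to below 1 depending on how the mixed and unknown offenders are assigned.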

My conclusion is that, measured against their share of violent offenders, it is unclear whether blacks are more likely to be the victims of fatal police force than other racial/ethnic groups.