Traffic Stops in Nashville, TN from 2010 to 2018: A Statistical Analysis of Bias

Abstract

Is there bias in traffic stops in Nashville, TN? This paper examines traffic stop records in Nashville from 2010 to 2018 to determine whether there is evidence of racial bias. The analysis begins by comparing these records to U.S. Census data for the city, followed by several statistical tests.

The investigation starts with an inference for a single proportion, focusing on the number of stops involving Black drivers—revealing substantial bias. A subsequent Veil of Darkness analysis yields inconclusive results. A chi-square test of independence then shows significant differences in search rates across racial groups, and a difference in two proportions analysis confirms racial disparities in who is subjected to searches.

Logistic regression further demonstrates that Black and Hispanic drivers are significantly more likely to be searched. Finally, an analysis of arrest rates when contraband is found reveals additional racial bias. These findings indicate clear patterns of biased practices in traffic stops, searches, and arrests conducted by the Nashville Police Department from 2010 to 2018. While further analysis by region, officer, or other variables may provide additional insights, the existing data demonstrate systemic racial bias in Nashville’s traffic enforcement.

Part 1 - Introduction

Early 20th-century sociologist Max Weber defined the state as an entity that “successfully upholds a claim to the monopoly of the legitimate use of physical force.”[^1] In the United States, this monopoly is primarily granted to the police.

As Lord Acton wrote, “power tends to corrupt.”[^2] Therefore, it is vital that the power and authority bestowed upon the police be reviewed regularly.

With this in mind, this paper intends to explore the following question:

“Is there an indication of racial bias in a data set of traffic stops that took place in Nashville, TN from 2010 to 2018?”

This analysis draws on a data set of vehicle traffic stops conducted in Nashville, TN between 2009 and 2019, which is narrowed down to the years 2010 through 2018. Racial demographic data from the U.S. Census is also used.

Part 2 - Data

Importing the data

The following code imports the traffic stop data and filters stops to between 2010 and 2018.

# load data

#import the nashville traffic stop data
nashville_raw <- read_csv("tn_nashville_2020_04_01.csv")

## Rows: 3092351 Columns: 42
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (13): raw_row_number, location, subject_race, subject_sex, officer_id_h...
## dbl   (6): lat, lng, precinct, reporting_area, zone, subject_age
## lgl  (21): arrest_made, citation_issued, warning_issued, contraband_found, c...
## date  (1): date
## time  (1): time
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

#filter only dates from 2010 through 2018
nash <- nashville_raw |> 
  filter(between(date, as.Date("2010-01-01"), as.Date("2018-12-31"))) |>
  mutate(date = as.Date(date), time = as_hms(time))

The following code imports the census data.

#import census data
census2010_raw <- read_csv("us_census_2010_nashville_tn.csv")

## Rows: 73 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Label (Grouping), United States
## num (2): Knoxville city, Tennessee, Nashville-Davidson metropolitan governme...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

census2020_raw <- read_csv("us_census_2020_nashville_tn.csv")

## Rows: 73 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Label (Grouping)
## num (3): United States, Knoxville city, Tennessee, Nashville-Davidson metrop...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

The following code cleans the census data.

#consolidate census2010 data into race types that match the Nashville traffic stop data
census2010 <- census2010_raw[2:11,] |> 
  clean_names() |> 
  select(!united_states:knoxville_city_tennessee)
  
census2010 <- census2010 |> 
  rename(race = label_grouping, nashville_population_2010 = nashville_davidson_metropolitan_government_balance_tennessee) |>
  mutate(race = str_remove_all(race, "alone"),
         race = tolower(str_trim(race)))
  
census2010 <- census2010 |> bind_rows(census2010 |> 
              filter(race %in% c("asian", "native hawaiian and other pacific islander")) |>  
                          summarise(across(where(is.numeric), sum)) |>  
                          mutate(race = "asian/pacific islander")) |>   
  mutate(race = str_replace(race, "some other race", "other"),
         race = str_replace_all(race, ":", " "),
         race = str_replace(race, "black or african american", "black"),
         race = str_replace(race, "hispanic or latino", "hispanic"),
         race = str_trim(race))

#consolidate census2020 data into race types that match the Nashville traffic stop data
census2020 <- census2020_raw[2:11,] |> clean_names() |>
  select(!united_states:knoxville_city_tennessee)

census2020 <- census2020 |> rename(race = label_grouping,   nashville_population_2020 = nashville_davidson_metropolitan_government_balance_tennessee) |>
  mutate(race = str_remove_all(race, "alone"),
         race = tolower(str_trim(race)))

census2020 <- census2020 |> bind_rows(census2020 |>  
              filter(race %in% c("asian", "native hawaiian and other pacific islander")) |>  
                          summarise(across(where(is.numeric), sum)) |>  
                          mutate(race = "asian/pacific islander")) |>  
  mutate(race = str_replace(race, "some other race", "other"),
         race = str_replace_all(race, ":", " "),
         race = str_replace(race, "black or african american", "black"),
         race = str_replace(race, "hispanic or latino", "hispanic"),
         race = str_trim(race),
         race = str_remove(race, "population of "))

The following code continues cleaning the census data.

census2010 <- census2010 |> 
  filter(race != "asian", race != "native hawaiian and other pacific islander", race != "not hispanic", race != "population of one race") |>
  arrange(desc(nashville_population_2010))

census2020 <- census2020 |> 
  filter(race != "asian", race != "native hawaiian and other pacific islander", race != "not hispanic", race != "one race") |>
  arrange(desc(nashville_population_2020))

The following code joins the two sets of census data.

census <- full_join(census2010, census2020, by = join_by(race))

Population Data

An aspect of this data is that, in both instances, it is purported to be population data.

The traffic stop data is the entire collection of vehicle traffic stops made between 2010 and 2018 in Nashville, TN. While it does not include all police activity in that period of time it represents all traffic stops.

The US Census is also an attempt at counting the full population of the United States. As a result the population data is able to be treated as full population data.

Undoubtedly the size of the task of counting every person, and recording every traffic stop, is a process with potential errors. For analysis, in this paper, both values will be treated as very large samples when necessary.

Traffic Stop Data

The data used in this paper was distributed by the The Standford Open Policing Project and was collected by them from the Nashville Police Department.[^3] The Stanford Open Policing Project is a group of researchers and journalists at Stanford University.

For this analysis the data has been filtered include only complete years, specifically 2010 to 2018. This results in a total of 3,078,116 traffic stops.

The data includes information about each traffic stop, including:

date
time
location
officer

It also includes demographics information about the individual who was stopped:

age
race
sex

Additionally, it records actions taken during the stop, including:

arrests
searches
frisks
outcomes: warning, citation, arrest

A preview of the data:

head(nash)

US Census data

Data from the US Census US Census is also used.[^4] This data is collected by the US government about the population of the United States.

The data from the US census is demographic data regarding the race of the population of Nashville, TN. This data is cleaned to match the racial demographic notation as listed in the traffic stop data set. To achieve this several changes were necessary.

Demographic categories “Asian” and “Native Hawaiian and other Pacific Islander” were combined. While this paper acknowledges the distinct nature of these separate groups, the Nashville traffic data set has them combined into a single category. Since they cannot be separated out in that data set, they have been combined in the Census data.

*Demographic designation “Black or African American” has been shortened to match the traffic data set as “Black”

*Multi-racial categories have been combined into a single count, rather than having separate combined categorizations for 2, 3, 4, or more racial groups. Additionally, since multiple races are not defined and could mean a wide range of things, they will not be included in analysis

*Since “other” is not defined in either the census or traffic stop data, it will not be included in analysis

*Hispanic or latino is shortened to Hispanic to match the traffic stop data

A glimpse at the census data:

census |> arrange(desc(nashville_population_2010))

Part 3 - Exploratory data analysis

In Nashville the US Census population demographics are as follows:

census |> 
  mutate(percent_change = round((nashville_population_2020 - nashville_population_2010)/nashville_population_2010, digits = 2)) |> arrange(desc(nashville_population_2010))

Looking at the percent change over time there were notable increases in most racial demographics between the two census’. There are several possible explanations for these increases, including migration within the United Sates, immigration, population increase and cultural changes in the way people self-identify.

The total populations of Nashville in the US Census 2010 and 2020:

census |> summarize(nashville_population_2010 = sum(nashville_population_2010), nashville_population_2020 = sum(nashville_population_2020)) |> mutate(percent_change = round( (nashville_population_2020 - nashville_population_2010)/nashville_population_2010, digits = 2))

census |> 
  mutate(proportion_2010 = round(nashville_population_2010 / sum(nashville_population_2010), digits = 2),
         proportion_2020 = round(nashville_population_2020 / sum(nashville_population_2020), digits = 2)
         ) |> arrange(desc(nashville_population_2010))

census_long <- census |> 
  pivot_longer(
    cols = !race,
    names_to = c("city", "dimension", "year"),
    names_sep = "_"
  )

Without knowing how the population changed year over year, for the remainder of this analysis, this paper will use an estimated proportion adjusted average of the census data between 2010 and 2018, calculated using the following code.

The code estimates the population in 2018 based on a constant linear change over time and then averages the 2010 and estimated 2018 population. This is the reference population that will be used for the remainder of this analysis.

avg_census <- census |> mutate(
  est_2018 = (nashville_population_2020 - nashville_population_2010) / 10 * 8 + nashville_population_2010) |> 
  mutate(avg_est_census_2010_to_2018 = (est_2018 + nashville_population_2010) / 2, 
    avg_est_census_2010_to_2018 = round(avg_est_census_2010_to_2018, digits = 0)
    ) |> 
  select(race, avg_est_census_2010_to_2018) |> 
  arrange(desc(avg_est_census_2010_to_2018)) |> 
  mutate(est_census_proportion = round(avg_est_census_2010_to_2018 / sum(avg_est_census_2010_to_2018), digits = 2)) 

avg_census

avg_census |> summarize(total_population = sum(avg_est_census_2010_to_2018))

When considering the traffic stops data the totals and proportions by race are as follows:

race <- nash |> group_by(subject_race) |> summarise(n = n(),
stop_proportion = round(n / nrow(nash),2)) |> arrange(desc(n)) |> 
  rename(stops = n)

race

The table below shows a comparison of the proportion of stops to the average population proportion by race:

prop_table <- inner_join(race, avg_census, by = c("subject_race" = "race")) |> 
  select(subject_race, stop_proportion, est_census_proportion) |> 
  mutate(stops_less_proportion = round(stop_proportion - est_census_proportion, digits = 2)) |> filter(subject_race != "other")

prop_table

prop_table_ggplot <- prop_table |> select(subject_race:est_census_proportion) |> pivot_longer(
    cols = !subject_race,
    names_to = "proportion_type",
    values_to = "proportion"
  )

ggplot(prop_table_ggplot, aes(x = subject_race, y = proportion, fill = proportion_type)) + 
  geom_bar(position = "dodge", stat = "identity") +
  xlab("Race") +
  ylab("Proportion") +
  scale_fill_manual(
    name = "Proportion Type", 
    labels = c("Census Proportion", "Traffic Stop Proportion"),
    values = c("est_census_proportion" = "cadetblue", "stop_proportion" = "cornsilk3")
  )

The immediate take away from the above table is that while White, Asian/Pacific Islanders, and Hispanic race types are stopped at a rate slightly less than their proportion among the population, Black drivers are stopped at a rate about 40% more than their representation within the population.

Additionally, there may be additional value in examining if there are higher rates of stops by race in different precincts:

precinct_plot <- nash |> filter(precinct != "NA" & subject_race != "NA" & subject_race != "unknown" & subject_race != "other") |> group_by(subject_race, precinct) |> 
  summarize(stops = n(), .groups = "drop") |> 
  group_by(precinct) |> 
  mutate(stop_proportion = round(stops / sum(stops),2)) |> 
  arrange(precinct, desc(stop_proportion))

precinct_plot

ggplot(precinct_plot, aes(x = precinct, y = stop_proportion, fill = subject_race)) +
  geom_bar(position = "stack", stat = "identity")+
  scale_fill_manual(values= wes_palette("Zissou1", n = 4))

While there is some variation in the proportion of stops by race, it is notable that 75% of traffic stops in precinct six are Black drivers. Nashville Police Precinct six is also referred to as Midtown Hills and encompasses some of the most affluent neighborhoods in the city.

Part 4 - Inference

Inference of a single proportion

As mentioned previously, the data is purported to be full population data, so direct comparison of the proportions indicates that with Black drivers being stopped at a rate of 38.2% of the all stops, and being 27% of the population, they are stopped at 11% higher rate than their proportion within the population or 41% greater than their actual population proportion, this appears to be an obvious bias.

However, since the process of recording both sets of population data has the potential for error, there is value in performing statistical analysis. To do so, this paper starts by using an inference of a single proportion using the US Census data as known population data, and the proportion from the sample data.

To do so we will set the following hypothesis test:

\[ H_0: \hat{p} = p \] \[ H_A: \hat{p} \neq p \] Where: \[ {p} = \text{population proportion of people who are black} \] \[ \hat{p} = \text{proportion of traffic stops where the driver was black} \]

The following code calculates the p_value for a single proportion z-test:

census_proportion_black <- filter(avg_census, race == "black")[["est_census_proportion"]]

z_single_prop <- nash |> 
  drop_na(subject_race) |> 
  mutate(test_race = case_when(subject_race == "black" ~ "black", TRUE ~ "not_black")) |> 
  specify(response = test_race, success = "black") |> 
  hypothesize(null = "point", p = census_proportion_black) |> 
  calculate(stat = "z")

p_value <- 2 * pnorm(z_single_prop$stat, lower.tail = FALSE)

print(paste("The calculated p-value is", p_value))

## [1] "The calculated p-value is 0"

Before continuing, it is necessary to confirm that the necessary conditions are met in order to utilize a normal distribution.

Independence

** Since the data is ostensibly the full population, indpendence is not assured. Another approach would be to create random sample from the population data set that is less than 10% of the total population. Below the single proportion inference will be repeated with a sample of 100,000 traffic stops.

Success-Failure Condition

** There is expected to be at least 10 success and 10 failures in the sample. Since the sample size is so large, and neither proportion is extremely small, this condition is met.

Below, the inference for a single proportion is repeated using a random sample of the traffic stop data. The results are the same, a p-value of 0.

nash_sampled_set <- nash |> slice_sample(n = 100000, replace = FALSE) |> filter(subject_race != "NA") |>  mutate(is_black = ifelse(subject_race == "black", "yes", "no"))

prop_black <- mean(nash_sampled_set$is_black == "yes")

single_null_dist <- nash_sampled_set |> 
  drop_na(is_black) |> 
  specify(response = is_black, success = "yes") |> 
  hypothesize(null = "point", p = census_proportion_black) |> 
  generate(reps = 1000, type = "draw") |> 
  calculate(stat = "prop")

suppressWarnings(single_null_dist |> get_p_value(obs_stat = prop_black, direction = "two-sided"))

visualize(single_null_dist) +
  shade_p_value(obs_stat = prop_black, direction = "two-sided") +
  labs(
    title = "Null Distribution: Proportion of Stops Where Driver was Black",
    subtitle = paste("Observed Proportion:", round(prop_black, 3)),
    x = "Proportion",
    y = "Density"
  ) +
  theme_minimal()

The calculated p-value is so low that it is below the calculated significant figures and therefore below the significance level of 5%.

This indicates that the null hypothesis should be rejected.

There is a 95% confidence level that there is a statistically significant difference between the proportion of traffic stops where the subject was Black and the proportion of Black people within the population of Nashville.

Since the proportion of traffic stops where the subject was Black is higher than the proportion within the population, there appears to be a racial bias in traffic stops.

Confidence Interval

Extending this analysis to the confidence interval for the single proportion.

A column must be created to help define the success parameter, which in this case is being identified as Black for the traffic stop.

nash_extended <- nash |> mutate(
  is_black = ifelse(subject_race == "black", "yes", "no")
)

nash_sampled_set |> 
  filter(is_black != "NA") |> 
  count(is_black) |> 
  mutate(p_hat = round(n / sum(n),3))

The confidence interval analysis for the population proportion:

single_prop_ci <- nash_sampled_set |> 
  drop_na(is_black) |> 
  specify(response = is_black, success = "yes") |> 
  generate(reps = 100, type = "bootstrap") |> 
  calculate(stat = "prop") |> 
  get_ci(level = 0.95) |> 
  print()

## # A tibble: 1 × 2
##   lower_ci upper_ci
##      <dbl>    <dbl>
## 1    0.375    0.380

The 95% confidence interval is approximately 0.3745559 to 0.3801756. There is a 95% likelihood that the true proportion of stops of Black drivers is contained within the confidence interval. When comparing this with the census population data of the Black population at approximately 27% of Nashville’s population, there continues to be an appearance of bias.

Veil of Darkness Test

An alternative way to consider stops and possible race based bias is to use the Veil of Darkness Test. The premise of this test is to examine the rates of stops when it is light out versus when it is dark out. This approach relies on the hypothesis that, if officers are engaged in racial profiling, they are less able to do so at night. Stated plainly: the proportion of traffic stops where the subject is Black will be lower at night.

However, to execute this analysis, comparing day time stops to night time stops may not be representative as it cannot accommodate for a wide range of other variables. Instead, a better approach is to compare stops executed at the same time but at different times of year, based on seasonal light conditions.

To perform this analysis, data regarding sunset and dusk must be added to the data set. The following code is adapted from The Stanford Open Policing Project.[^5]

# Get timezone for Nashville
nash_lat <- 36.1616
nash_long <- -86.7816
tz <- lutz::tz_lookup_coords(nash_lat, nash_long, warn = F)

# Helper function
time_to_minute <- function(time) {
  hour(time) * 60 + minute(time)
}

?getSunlightTimes

# Compute sunset time for each date in our dataset
sunset_times <- nash |> 
  mutate(
    lat = nash_lat,
    lon = nash_long
  ) |> 
  select(date, lat, lon) |> 
  distinct()
  
sunset_times <- getSunlightTimes(
    data = sunset_times,
    keep = c("sunset", "dusk"), 
    tz = tz
  )


sunset_times <- sunset_times |> 
  mutate(across(c(sunset, dusk), as_hms)) |> 
  mutate(
    sunset_minute = time_to_minute(sunset),
    dusk_minute = time_to_minute(dusk)
  ) |> 
  select(date, sunset, dusk, ends_with("minute"))

The following table shows that the earliest dusk falls is at about 5:00pm in early December and the latest dusk falls is about 8:40pm in late June.

sunset_times |> arrange(dusk) |> filter(dusk == min(dusk) | dusk == max(dusk))

To utilize this observation the following code joins the sunset and dusk information to the traffic stops data, filters out the period between sunset and dusk, and filters down to only Black and White drivers, in order to simplify the analysis.

vod <- nash |>  
  drop_na(time) |> 
  left_join(sunset_times, by = "date") |> 
  mutate(
    minute = time_to_minute(time),
    minutes_after_dark = minute - dusk_minute,
    is_dark = minute > dusk_minute,
    min_dusk_minute = min(dusk_minute),
    max_dusk_minute = max(dusk_minute),
    is_black = subject_race == "black"
  ) |> 
  filter(
    minute >= min_dusk_minute,
    minute <= max_dusk_minute,
    !(minute > sunset_minute & minute < dusk_minute),
    subject_race %in% c("black", "white")
  )

print(paste("The data set is now", nrow(vod), "observations."))

## [1] "The data set is now 429594 observations."

The table below shows the proportion of Black drivers pulled over between 6:30PM and 6:45PM when it was dark vs when it was light.

vod <- vod |> 
  mutate(time = as_hms(time))

vod_plot <- vod |> filter(time >= as_hms("18:30:00") & time <= as_hms("18:45:00"), precinct != "NA") |>
  group_by(is_dark, precinct) |> summarize(
  prop_black = mean(is_black), .groups = "drop") |>  arrange(precinct)

ggplot(vod_plot, aes(x = precinct, y = prop_black, fill = is_dark)) +
  geom_bar(position = "dodge", stat = "identity") +
  ggtitle("Traffic stops between 6:30pm and 6:45pm of black drivers by precinct") +
  xlab("Nashville Police Precinct") +
  ylab("Proportion") +
  scale_fill_manual(
    name = "Light out or Dark out", 
    labels = c("Light out", "Dark out"),
    values = c("TRUE" = "cadetblue", "FALSE" = "cornsilk3")
  )

In six of the eight precincts there is a reduction in stops of Black drivers when it is dark. But is this statistically significant?

To determine this the following code block performs a logistical regression on the full data. This code was taken from The Stanford Open Policing Project.

mod1 <- glm(
  is_black ~ is_dark + splines::ns(minute, df = 6),
  family = binomial,
  data = vod
)

mod1

## 
## Call:  glm(formula = is_black ~ is_dark + splines::ns(minute, df = 6), 
##     family = binomial, data = vod)
## 
## Coefficients:
##                  (Intercept)                   is_darkTRUE  
##                    -0.241548                     -0.002387  
## splines::ns(minute, df = 6)1  splines::ns(minute, df = 6)2  
##                     0.071025                     -0.017688  
## splines::ns(minute, df = 6)3  splines::ns(minute, df = 6)4  
##                     0.027178                      0.077966  
## splines::ns(minute, df = 6)5  splines::ns(minute, df = 6)6  
##                     0.102554                      0.109267  
## 
## Degrees of Freedom: 429593 Total (i.e. Null);  429586 Residual
## Null Deviance:       591000 
## Residual Deviance: 590900    AIC: 590900

The results of the logistical regression are inconclusive. While the coefficient indicates that there is a decrease in rates at which Black drivers are stopped after dark, the difference is so small as to be practically insignificant.

Why is this? The table below provides some clues:

vod |>
  group_by(is_dark) |> summarize(
  prop_black = mean(is_black), .groups = "drop")

As the table indicates, it appears that the proportion of stops of Black drivers overall goes up very slightly when it is dark out and when considering all precincts.

The Veil of Darkness does not show bias in this instance, but it does not show its absence. It shows that it is not a useful model when used in this way.

Proportions of Searches

Until now, only stops have been considered. Since the entire data is of stops that actually happened, it does not consider what happened for the stop to occur. The reasons for the stop in the first place could be many, ones that point towards bias but also ones that may not relate to bias. It is possible that Black drivers are stopped at a higher rate because neighborhoods where they drive are more heavily policed. This cannot be determined from this data.

Criticisms to the previous analyses might argue that there are different driving behaviors and customs between different raceial groups. Critics may argue that Black drivers drive more often or for longer distances, or have a different level of compliance to traffic laws. None of this can be determined or dismissed using this data set.

However, the rates at which stops become searches can be considered: Of the people stopped, at what rates are those stops becoming searches? And of those searches at what rate is contraband found? Does this data imply bias?

Search proportions by race are shown in the following table:

search_prop <- nash_sampled_set |> mutate(subject_race = factor(subject_race, levels = c("white", "black", "hispanic", "asian/pacific islander", "unknown", "other"))) |> 
  filter(search_conducted != "NA", subject_race != "NA", subject_race != "other", subject_race != "unknown") |> 
  group_by(subject_race, search_conducted) |> 
  summarize(searches = n(), .groups = "drop") |> 
  group_by(subject_race) |> 
  mutate(search_proportion = round(searches / sum(searches), 2)) |> 
  filter(search_conducted == TRUE) |> 
  print()

## # A tibble: 4 × 4
## # Groups:   subject_race [4]
##   subject_race           search_conducted searches search_proportion
##   <fct>                  <lgl>               <int>             <dbl>
## 1 white                  TRUE                 1582              0.03
## 2 black                  TRUE                 2232              0.06
## 3 hispanic               TRUE                  310              0.06
## 4 asian/pacific islander TRUE                   26              0.02

ggplot(search_prop, aes(x = subject_race, y = search_proportion, fill = subject_race)) + 
  geom_bar(position = "dodge", stat = "identity") +
  scale_fill_manual(values= wes_palette("Zissou1", n = 4))

Chi Square Test

When considering these proportions above it appears that searches are executed on Black and Hispanic drivers at twice the rate of White drivers, and three times the rate of Asian/Pacific Islanders. To check for independence a chi-square test can be used to ensure that there is a 95% likelihood that there is a difference in search rates between different racial groups.

searched <- nash_sampled_set |> 
  mutate(searched = ifelse(search_conducted == TRUE, "yes", "no")) |> 
  filter(subject_race %in% c("white", "black", "hispanic", "asian/pacific islander"))

chi_sq_stat <- searched |> 
  drop_na(search_conducted) |> 
  specify(explanatory = subject_race, response = searched) |> 
  hypothesize(null = "independence") |> 
  calculate(stat = "Chisq")

pchisq(chi_sq_stat$stat, df = 3, lower.tail = FALSE) |> print()

##     X-squared 
## 6.495306e-117

The chi-squared test results in an incredibly small test statistic. Since this value is less than 0.05 then a null hypothesis, that there is no difference between the rates of searches, is rejected. This indicates that there is a likely difference between search rates, statistically.

To further analyze these, a difference in proportions test can be conducted. In this instance the test will be limited to a comparison of White drivers and Black drivers and the rates they are searched.

\[ H_0: \hat{p}_b = \hat{p}_w \] \[ H_A: \hat{p}_b \neq \hat{p}_w \] Where: \[ \hat{p}_b = \text{proportion of traffic stops where the subject is black and is searched} \] \[ \hat{p}_w = \text{proportion of traffic stops where the subject is white and is searched} \]

searched |> filter(subject_race %in% c("white", "black")) |> 
  drop_na(search_conducted) |> 
  specify(response = searched, explanatory = subject_race, success = "yes") |>
  generate(reps = 100, type = "bootstrap") |> 
  calculate(stat = "diff in props", order = c("black", "white")) |> 
  get_ci(level = 0.95)

Since the confidence interval does not contain 0, the null hypothesis is rejected. With a 95% confidence level, there is statistical evidence that traffic stops where the driver is Black are searched at a higher rate than when the driver is White.

The confidence interval indicates what was suggested by the proportions: that a Black driver is searched about 3% more often than a White driver. Three percent may not sound like a lot but, as a proportion, it is twice the rate that White drivers are searched.

In firm numbers this is 20,021 more searches of Black drivers than of White drivers. When considered against the total population of Nashville this means as much as 39.7% of Nashville’s Black population was searched between 2010 and 2018 (67,663 searches divided by the average Black population of 170284), while only as much as 12.8% of Nashville’s White population was searched in the same time period (47,642 searches divided by the average White population of 370,502).

While these searches probably included multiple searches of the same drivers, it indicates that a Black resident of Nashville is more than three times more likely to be have been searched than a White resident in that time period despite being about half as much of the population. Does this sound like bias? Or at least the uneven application of authority? It does to the author of this paper.

To execute this same analysis whilst comparing rates of stops for White and Hispanic drivers:

searched |> filter(subject_race %in% c("white", "hispanic")) |> 
  drop_na(search_conducted) |> 
  specify(response = searched, explanatory = subject_race, success = "yes") |>
  generate(reps = 100, type = "bootstrap") |> 
  calculate(stat = "diff in props", order = c("hispanic", "white")) |> 
  get_ci(level = 0.95)

These results comparing Hispanic and White drivers and their search rates are comparable to the comparison of Black and White drivers and show similar evidence of bias.

nash |> group_by(subject_race) |> filter(search_conducted != "NA", subject_race != "NA") |> summarize(search = sum(search_conducted))

Contraband Hits

This analysis can be extended beyond stops and searches to consider the rate of searches that produce contraband. The analysis above regarding search proportions can be criticized because of the vagueries of why a search was conducted. Critics will assert that the officer on the scene most likely had ample reason to conduct all of those searches.

To buffer against this criticism we can examine the rates at which a search is fruitful: What searches produce contraband? Or as they will be referred to moving forward: what searches have “hits”?

The following code produces a column that identifies if a “hit” was made during a search, i.e. contraband was found.

searched_hit <- nash |> filter(subject_race %in% c("white", "black", "hispanic", "asian/pacific islander")) |> 
  mutate(hit = ifelse(contraband_found == TRUE, "yes", "no"))

searched_plot <- searched_hit |> mutate(subject_race = factor(subject_race, levels = c("white", "black", "hispanic", "asian/pacific islander", "unknown", "other"))) |> 
  filter(search_conducted = TRUE) |> 
  group_by(subject_race) |> 
  summarize(hit_rate = mean(contraband_found, na.rm = T))

ggplot(searched_plot, aes(x = subject_race, y = hit_rate, fill = subject_race)) +
  geom_bar(position = "dodge", stat = "identity") +
  ggtitle("Hit Rate of Searches") +
  xlab("Race") +
  ylab("Proportion of Search Hits") +
  scale_fill_manual(values= wes_palette("Zissou1", n = 4))

The rates above indicate a ~8% higher hit rate for Black drivers over White drivers. A first consideration might be to justify the searches. However, Black drivers are being searched at twice the rate as White drivers. Also, remember Hispanic drivers are searched at the same rate as Black drivers and yet the hit rate is only 12%. The drop in hit rate compared to search rate indicates bias.

It’s tempting to extend these numbers out. That, if White and Asian/Pacific Islander drivers were searched as often as Black and Hispanic drivers, then the would be two to three times as many hits for them in searches. Is it possible that many White and Asian/Pacific Islanders are “riding dirty” through the streets of Nashville?

While it might be entertaining to speculate about, this is not likely the case. What is more likely is that police are searching White drivers and Asian/Pacific Islander drivers with a greater amount of probable cause or a more meaningful justification for conducting the search. While the police’s pretense for searching both Black and Hispanic drivers is likely more weak.

Logistic Regression when considering contraband being found and whether an arrest was made

A final possible analysis would be to consider what happened after contraband was found and whether an arrest was made. In this instance, it has already been confirmed that contraband was found, could race play a factor in whether an arrest is made? Arrests can often be at the discretion of the officer.

nash_contraband <- nash |> 
  filter(contraband_found == TRUE, arrest_made != "NA") |> mutate(subject_race = factor(subject_race, levels = c("white", "black", "hispanic", "asian/pacific islander", "unknown", "other"))) |> filter(subject_race %in% c("white", "black", "hispanic", "asian/pacific islander")) |> mutate(searched = ifelse(search_conducted == TRUE, 1, 0))

contraband_plot <- nash_contraband |> 
  group_by(subject_race) |> 
  summarize(arrest_prop = mean(arrest_made, na.rm = TRUE),
            race_prop = n() / nrow(nash_contraband))

ggplot(contraband_plot, aes(x = reorder(subject_race, arrest_prop), y = arrest_prop, fill = subject_race)) +
  geom_bar(position = "dodge", stat = "identity") +
  ggtitle("Proportion of arrests after contraband is found") +
  xlab("Race") +
  ylab("Proportion of Arrests") +
  scale_fill_manual(values= wes_palette("Zissou1", n = 4))

The plot shows that there is a difference in the proportion at which arrest are made. For each of these cases a stop has been made, a search has been conducted, and contraband has been found, yet there is a difference in the rates at which arrests are made depending on race.

To examine this the code below uses logistic regression to examine the relationship between an arrest being made and the subject race.

contraband_glm <- glm(arrest_made ~ subject_race, data = nash_contraband, family = binomial)

summary(contraband_glm)

## 
## Call:
## glm(formula = arrest_made ~ subject_race, family = binomial, 
##     data = nash_contraband)
## 
## Coefficients:
##                                    Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                        -0.64724    0.02114 -30.621  < 2e-16 ***
## subject_raceblack                   0.12263    0.02718   4.511 6.44e-06 ***
## subject_racehispanic                0.35142    0.06165   5.700 1.20e-08 ***
## subject_raceasian/pacific islander -0.07732    0.17898  -0.432    0.666    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 33999  on 25940  degrees of freedom
## Residual deviance: 33956  on 25937  degrees of freedom
## AIC: 33964
## 
## Number of Fisher Scoring iterations: 4

The results indicate that both Black and Hispanic drivers have higher odds of being arrested than White drivers after being found with contraband. This result indicates bias.

The p-value for Asian/Pacific Islander drivers is greater than 0.05 and is therefore not statistically significantly different than the odds for White drivers to be arrested.

A plot of the results:

plot_contraband <- tidy(contraband_glm, conf.int = TRUE, exponentiate = TRUE)

ggplot(plot_contraband[-1, ], aes(x = term, y = estimate)) +
  geom_point() +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.2) +
  geom_hline(yintercept = 1, linetype = "dashed", color = "gray") +
  labs(
    title = "Odds of Arrest by Race (Relative to White Drivers)",
    x = "Subject Race",
    y = "Odds Ratio"
  ) +
  theme_minimal()

The logistic regression indicates bias.

To explore this further, the logistical regression will be repeated but with additional control variables in an attempt to see if race continues to be statistically significant when other variables are introduced.

contraband_glm_more_variables <- glm(arrest_made ~ subject_race + subject_age + subject_sex + time + precinct, data = nash_contraband, family = binomial)

summary(contraband_glm_more_variables)

## 
## Call:
## glm(formula = arrest_made ~ subject_race + subject_age + subject_sex + 
##     time + precinct, family = binomial, data = nash_contraband)
## 
## Coefficients:
##                                      Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                        -7.409e-01  6.081e-02 -12.184  < 2e-16 ***
## subject_raceblack                   1.341e-01  2.921e-02   4.592 4.39e-06 ***
## subject_racehispanic                4.469e-01  6.764e-02   6.608 3.90e-11 ***
## subject_raceasian/pacific islander -9.105e-02  1.950e-01  -0.467 0.640563    
## subject_age                         1.260e-02  1.233e-03  10.219  < 2e-16 ***
## subject_sexmale                     1.130e-01  3.370e-02   3.353 0.000798 ***
## time                               -6.375e-06  4.768e-07 -13.369  < 2e-16 ***
## precinct                           -1.639e-02  6.235e-03  -2.630 0.008550 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 30360  on 23117  degrees of freedom
## Residual deviance: 30027  on 23110  degrees of freedom
##   (2823 observations deleted due to missingness)
## AIC: 30043
## 
## Number of Fisher Scoring iterations: 4

Part 5 - Conclusion

In this analysis of traffic stop in Nashville, TN there was evidence of bias based on race.

The rate at which Black drivers were stopped relative to their proportion within the overall population, seemed evident from proportions. This was confirmed by the inference of a single proportion.

A Veil of Darkness analysis was not useful in showing bias as the logistic model failed to explain the difference in traffic stop rates, largely because the rate of traffic stops for Black drivers went up after dark.

A chi-square test of independence indicated that the proportion of searches by race were in fact different. A subsequent difference of proportions test showed that Black drivers were searched about 3% more often than White drivers, which was about twice as often.

Finally, a logistic regression showed race as a statistically significant predictor of arrest after contraband was discovered. This finding was supported by a second logistic regression with additional variables added for control.

There is ample evidence of racial bias in traffic stops in Nashville, TN from 2010 to 2018.

References

[^1] Weber, Max (1978). Roth, Guenther; Wittich, Claus (eds.). Economy and Society. Berkeley: U. California P. p. 54. [^2] https://www.acton.org/research/lord-acton-quote-archive [^3] https://openpolicing.stanford.edu/ [^4] U.S. Census Bureau. (2020). RACE. Decennial Census, DEC Redistricting Data (PL 94-171), Table P1. Retrieved March 30, 2025, from https://data.census.gov/table/DECENNIALPL2020.P1?q=nashville,+TN. [^5] https://openpolicing.stanford.edu/tutorials/