This article examines what police misconduct settlement data, or the lack of it, can tell us about how police departments are changing, for better or worse, in the current era of increasing calls for police reform. It stresses that drawing conclusions from settlement data is difficult because reporting is not standardized across departments, transparency is limited, and in some cases the data is missing entirely.
I load two libraries for this project: dplyr, to help manipulate the data frame, and ggplot2, to plot the data.
install.packages("dplyr")
install.packages("ggplot2")
library("dplyr")
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library("ggplot2")
The data set I chose was the Chicago, IL data set. I chose it because it has a large number of rows, which I believe makes for more interesting analysis. The data set contains 24 columns and 1515 records. The data can be found in the same directory as this file as chicago_edited.csv.
chicagoMisconductPayouts <- read.csv(file = 'chicago_edited.csv')
head(chicagoMisconductPayouts)
## calendar_year city state incident_date incident_year filed_date filed_year
## 1 2019 Chicago IL NA NA NA NA
## 2 2018 Chicago IL NA NA NA NA
## 3 2019 Chicago IL NA NA NA NA
## 4 2010 Chicago IL NA NA NA NA
## 5 2010 Chicago IL NA NA NA NA
## 6 2010 Chicago IL NA NA NA NA
## closed_date amount_awarded other_expenses collection total_incurred
## 1 2019-01-29 160000 NA NA NA
## 2 2018-10-02 100000 NA NA NA
## 3 2019-05-30 70000 NA NA NA
## 4 2010-09-24 60000 NA NA NA
## 5 2010-03-18 99000 NA NA NA
## 6 2010-03-31 22500 NA NA NA
## case_outcome docket_number claim_number
## 1 Civil Litigation - General:Dismissed:Settlement\n 2016 C 08033 NA
## 2 Civil Litigation - General:Dismissed:Settlement\n 2016 C 08236 NA
## 3 Civil Litigation - General:Dismissed:Settlement\n 2018 C 04020 NA
## 4 Civil Litigation - General:Dismissed:Settlement\n 00L005230 NA
## 5 Civil Litigation - General:Dismissed:Settlement\n 00L007137 NA
## 6 Civil Litigation - General:Dismissed:Settlement\n 06L003486 NA
## court plaintiff_name plaintiff_attorney
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## 6 NA NA NA
## matter_name
## 1 Itemid Al Matar v PO D.R. Borchardt #16806; PO T.P. Hansen #3833; Sgt. Lucid #2361; PO M. Walter #4118; and the City of Chicago
## 2 Anthony Hawks v Chicago Police Department 18th Precinct, Officer Lawrence Gade, Jr., Officers John Doe 1-10
## 3 Hasin Ramadan and Dwight Gamble v. Sergeant Xavier Elizondo, Star No. 1340; Officer David Salgado, Star No. 16347; and City of Chicago
## 4 Pauline Underdown v. City of Chicago
## 5 PATSY MCCALL V. CITY OF CHICAGO
## 6 CAMILLE GILLIAM V. CITY OF CHICAGO, AND THE CHICAGO POLICE DEPT., AND UNKNOWN CHICAGO POLICE OFFICERS
## location summary_allegations status
## 1 NA Dispute:General:Police Matters:Excessive Force Minor Closed
## 2 NA Dispute:General:Police Matters:Excessive Force Minor Closed
## 3 NA Dispute:General:Police Matters:False Arrest Closed
## 4 NA Dispute:General:Police Matters:Excessive Force Serious Closed
## 5 NA Dispute:General:Police Matters:Excessive Force Serious Closed
## 6 NA Dispute:General:Police Matters:False Arrest Closed
## role flag_dept
## 1 Defendant Chicago Police Board
## 2 Defendant Chicago Police Board
## 3 Defendant Chicago Police Board
## 4 Defendant Chicago Police Dept
## 5 Defendant Chicago Police Dept
## 6 Defendant Chicago Police Dept
nrow(chicagoMisconductPayouts)
## [1] 1515
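The head() output above shows several columns that contain only NA in the first rows. A quick way to see how much of each column is actually populated is to count the missing values per column. The sketch below is a minimal illustration using a small stand-in data frame (toyPayouts, naCounts, and emptyColumns are hypothetical names, and the stand-in holds only a handful of values copied from the rows printed above), not the full CSV.

```r
# Stand-in for a few columns of the payouts data frame.
# Values mirror rows 1, 2, and 4 of the head() output; this is NOT the full file.
toyPayouts <- data.frame(
  calendar_year  = c(2019, 2018, 2010),
  incident_date  = c(NA, NA, NA),
  amount_awarded = c(160000, 100000, 60000),
  court          = c(NA, NA, NA)
)

# Count the missing values in each column
naCounts <- colSums(is.na(toyPayouts))
naCounts

# Names of the columns that contain no data at all
emptyColumns <- names(naCounts)[naCounts == nrow(toyPayouts)]
emptyColumns
```

Running the same two steps on chicagoMisconductPayouts would show which of the 24 columns are usable for analysis.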
I subset the data down to just two columns: the year in which the misconduct settlement was awarded and the dollar amount awarded. Those two values are the most important variables in this data set.
payoutsWithYear <- subset(chicagoMisconductPayouts, select = c("calendar_year", "amount_awarded"))
head(payoutsWithYear)
## calendar_year amount_awarded
## 1 2019 160000
## 2 2018 100000
## 3 2019 70000
## 4 2010 60000
## 5 2010 99000
## 6 2010 22500
I then grouped the rows by calendar year and summarized each year by taking the mean payout, the total payout, and the number of settlements.
summary <- payoutsWithYear %>%
  group_by(calendar_year) %>%
  summarise(Average_Amount_Rewarded = mean(amount_awarded),
            Total_Amount_Rewarded = sum(amount_awarded),
            Number_Of_Settlements = n())
summary
## # A tibble: 10 × 4
## calendar_year Average_Amount_Rewarded Total_Amount_Rewarded Number_Of_Settle…
## <int> <dbl> <dbl> <int>
## 1 2010 305778. 24462226. 80
## 2 2011 203483. 22790143. 112
## 3 2012 390242. 60487581. 155
## 4 2013 506050. 92607082. 183
## 5 2014 186598. 29669161. 159
## 6 2015 150062. 22509331. 150
## 7 2016 160494. 31296263. 195
## 8 2017 534597. 94623599. 177
## 9 2018 314646. 56950918. 181
## 10 2019 261709. 32190160. 123
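Because a mean can be pulled upward by a single very large settlement, it may help to report the median and the largest payout alongside it. The sketch below shows one way to do that with the same group_by/summarise pattern. It runs on invented toy values (toyPayouts and toySummary are hypothetical names, and the amounts are placeholders, not rows from chicago_edited.csv), and it passes na.rm = TRUE on the assumption that some amounts could be missing.

```r
library(dplyr)

# Hypothetical toy payouts, NOT taken from chicago_edited.csv.
# The 2017 group contains one deliberately large value.
toyPayouts <- data.frame(
  calendar_year  = c(2016, 2016, 2016, 2017, 2017, 2017),
  amount_awarded = c(50000, 60000, 70000, 40000, 50000, 3000000)
)

# Same grouping as above, with the median and maximum added
toySummary <- toyPayouts %>%
  group_by(calendar_year) %>%
  summarise(Average_Amount = mean(amount_awarded, na.rm = TRUE),
            Median_Amount = median(amount_awarded, na.rm = TRUE),
            Largest_Amount = max(amount_awarded, na.rm = TRUE),
            Number_Of_Settlements = n())
toySummary
```

In the toy 2017 group the mean is far above the median, which is the pattern one would look for in the real data when a year's total seems unusually high.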
I graphed the total settlement dollars per year because I thought it might show the dollar amount of settlements increasing over time. What I found instead was two peak years, 2013 and 2017, with much lower amounts in the remaining years. Another interesting observation is that 2017 had fewer payouts than both 2016 and 2018, which suggests that there may have been an uncharacteristically large payout in 2017 that pushed its total above the other years.
ggplot(data = summary, aes(x = calendar_year, y = Total_Amount_Rewarded)) +
  geom_bar(stat = "identity") +
  scale_y_continuous(
    labels = scales::comma_format(big.mark = ","),
    breaks = seq(0, floor(max(summary$Total_Amount_Rewarded)), by = 10000000)) +
  scale_x_continuous(breaks = seq(min(summary$calendar_year), max(summary$calendar_year), by = 1)) +
  ylab("Total Amount Awarded ($)") +
  xlab("Year")
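The observations about the peak years can also be checked numerically. The sketch below re-enters the yearly totals and settlement counts exactly as printed in the summary table above (yearly, peakYears, and averagePerSettlement are names introduced here for illustration) and uses base R only, so it runs without the CSV file.

```r
# Yearly totals and settlement counts copied from the summary table above
yearly <- data.frame(
  calendar_year = 2010:2019,
  total = c(24462226, 22790143, 60487581, 92607082, 29669161,
            22509331, 31296263, 94623599, 56950918, 32190160),
  settlements = c(80, 112, 155, 183, 159, 150, 195, 177, 181, 123)
)

# The two years with the largest total payouts
peakYears <- yearly$calendar_year[order(-yearly$total)][1:2]
peakYears

# Average payout per settlement, recomputed from the totals
yearly$averagePerSettlement <- yearly$total / yearly$settlements

# 2017 next to its neighbouring years: fewer settlements, larger total
yearly[yearly$calendar_year %in% 2016:2018, ]
```

To test the guess about a single large payout, one option would be to sort the 2017 rows of chicagoMisconductPayouts by amount_awarded in descending order and inspect the top few.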
Something the article brought up that I would be interested in pursuing further is comparing the settlements for more conservative cities versus more liberal cities. The idea is that conservative areas of the country may be more apt to side with the police and less likely to award large settlements.
If I were to do that, I would plot the data from cities with different political leanings and compare them to see whether the results supported the hypothesis. This might require additional data to determine which areas have which political affiliations.
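As a rough sketch of how that comparison could be set up, the code below joins per-city yearly totals to a lookup table of political leanings and then averages the totals by leaning and year. Everything in it is hypothetical: the city names, the amounts, the leaning labels, and the names cityTotals, cityLeaning, combined, and leaningSummary are placeholders chosen for illustration, and the leaning table stands in for the additional data that would have to be collected.

```r
# Hypothetical per-city yearly totals (placeholder values, not real data)
cityTotals <- data.frame(
  city = c("City A", "City A", "City B", "City B", "City C", "City C"),
  calendar_year = c(2018, 2019, 2018, 2019, 2018, 2019),
  total_awarded = c(100, 120, 40, 60, 80, 90)
)

# Hypothetical lookup table assigning each city a political leaning
cityLeaning <- data.frame(
  city = c("City A", "City B", "City C"),
  leaning = c("liberal", "conservative", "liberal")
)

# Attach the leaning to every city-year row
combined <- merge(cityTotals, cityLeaning, by = "city")

# Average yearly total for each leaning and year
leaningSummary <- aggregate(total_awarded ~ leaning + calendar_year,
                            data = combined, FUN = mean)
leaningSummary
```

A fair comparison would probably also need to adjust for city size, for example by dividing totals by population, since larger cities are likely to have more settlements regardless of political leaning.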