Capital Punishment in America
Crosby Haile
August 10, 2017
In a time of political mistrust, the debate over capital punishment is still brewing. Using historical data, I hope to reveal how prevalent capital punishment has been over time and whether any biases skew the results. The data set is large, covering more than 15,000 executions over almost 400 years, from firing squads in colonial Jamestown to the media controversies of the early 2000s.
The scope of this study covers a lot of history and even more legislation. I want to understand the trends in capital punishment over time and how these changes correlate with pivotal events in American history.
To explore this relationship, I will use data from Executions in the United States, 1608-2002: The ESPY File. This data was collected in partnership with the National Science Foundation and the United States Department of Justice. Because the United States, as we know it, did not exist until 1776, the data includes earlier executions in any territories that would later become states.
Key Variables Include: Year, State, Name, Age, Sex, Race, Occupation, Conviction, and Method.
My proposed approach is to first plot the variables of interest by year. These graphs will help shed light on any biases in the capital punishment system and on how changes in legislation over the years have affected the number or method of executions.
I will then take a closer look at the years showing major shifts. The goal here would be to see if I can isolate an event, political or otherwise, that might help explain the results.
This analysis is intended to help readers form an opinion on capital punishment based on sound data. The issue has been hotly contested for hundreds of years, so there is no shortage of op-ed pieces littering the internet. I hope that my analysis can help readers, including myself, gain a clearer understanding of capital punishment, without biased interpretation.
Ultimately, I would like to understand whether any biases were still present in 2002 and, if so, whether they still exist today.
In 2017, capital punishment is still legal in 31 states.
Should it be?
The following packages are required in order to run code without errors.
library(tidyverse) # loads the core tidyverse packages (readr, dplyr, ggplot2, ...)
library(ggplot2) # plotting & visualizing data (already attached by tidyverse)
library(maps) # geographical data
library(DT) # interactive HTML tables
The data set contains information about executions performed under civil authority in the United States between 1608 and 2002. The data was collected between 1970 and 2002 with the help of records from state Departments of Corrections, newspapers, court proceedings, and historical recordkeepers.
First, we must import the CSV file and specify column names. Several columns have no relevance for our analysis. I have given these columns numeric names to distinguish them from the variables of interest.
raw_data <- read_csv("raw_data.csv",
                     col_names = c("1", "2",
                                   "3", "4",
                                   "Race", "Age",
                                   "Name", "5",
                                   "6", "Conviction",
                                   "Method", "7",
                                   "8", "Year",
                                   "9", "State",
                                   "10", "11",
                                   "Sex", "12",
                                   "Occupation"),
                     skip = 1)
For our purposes, we will narrow down the data to 9 key variables.
scrubbed <- raw_data[, c("Year", "State",
                         "Name", "Age",
                         "Sex", "Race",
                         "Occupation", "Conviction",
                         "Method")]
To help round out the data, we will introduce two new categorical variables: Region and Era. These will help us better visualize geographical and historical trends.
Region groups states by the geographic divisions defined by the US Census Bureau.
scrubbed$Region <- ifelse(scrubbed$State %in% c("Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin"), "East North Central",
ifelse(scrubbed$State %in% c("Alabama", "Kentucky", "Mississippi", "Tennessee"), "East South Central",
ifelse(scrubbed$State %in% c("New Jersey", "New York", "Pennsylvania"), "Middle Atlantic",
ifelse(scrubbed$State %in% c("Arizona", "Colorado", "Idaho", "Montana", "Nevada", "New Mexico", "Utah", "Wyoming"), "Mountain",
ifelse(scrubbed$State %in% c("Connecticut", "Maine", "Massachusetts", "New Hampshire", "Rhode Island", "Vermont"), "New England",
ifelse(scrubbed$State %in% c("Alaska", "California", "Hawaii", "Oregon", "Washington"), "Pacific",
ifelse(scrubbed$State %in% c("Delaware", "Florida", "Georgia", "Maryland", "North Carolina", "South Carolina", "Virginia", "Washington, D.C.", "West Virginia"), "South Atlantic",
ifelse(scrubbed$State %in% c("Iowa", "Kansas", "Minnesota", "Missouri", "Nebraska", "North Dakota", "South Dakota"), "West North Central",
ifelse(scrubbed$State %in% c("Arkansas", "Louisiana", "Oklahoma", "Texas"), "West South Central", NA_character_)))))))))
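The same state-to-region mapping can also be written as a lookup table, which is easier to audit than a long chain of nested ifelse() calls. The following is only a minimal sketch of that idea, not the code used for this report: `divisions`, `division_lookup`, and `demo` are hypothetical names, and only three of the nine divisions are filled in to keep it short.

```r
# Hypothetical lookup list: element names are divisions, values are states.
# Only three divisions are shown here for brevity.
divisions <- list(
  "East North Central" = c("Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin"),
  "Middle Atlantic"    = c("New Jersey", "New York", "Pennsylvania"),
  "West South Central" = c("Arkansas", "Louisiana", "Oklahoma", "Texas")
)

# Flatten into a named character vector: state name -> division name
division_lookup <- setNames(
  rep(names(divisions), lengths(divisions)),
  unlist(divisions, use.names = FALSE)
)

# Small made-up data frame standing in for the real data
demo <- data.frame(State = c("Texas", "Ohio", "New York", "Puerto Rico"),
                   stringsAsFactors = FALSE)

# Indexing by name returns NA for any state missing from the lookup
demo$Region <- unname(division_lookup[demo$State])
print(demo)
```

With this layout, adding or correcting a state means editing one list entry, and any state missing from the list shows up as a true missing value.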
Era is a somewhat subjective grouping based on US history.
scrubbed$Era <- ifelse(scrubbed$Year < 1630, "Early America",
ifelse(scrubbed$Year >= 1630 & scrubbed$Year < 1763, "Colonial Period",
ifelse(scrubbed$Year >= 1763 & scrubbed$Year < 1783, "Revolutionary Period",
ifelse(scrubbed$Year >= 1783 & scrubbed$Year < 1815, "Young Republic",
ifelse(scrubbed$Year >= 1815 & scrubbed$Year < 1860, "Expansionary Period",
ifelse(scrubbed$Year >= 1860 & scrubbed$Year < 1876, "Civil War & Reconstruction",
ifelse(scrubbed$Year >= 1876 & scrubbed$Year < 1914, "Second Industrial Revolution",
ifelse(scrubbed$Year >= 1914 & scrubbed$Year < 1933, "WWI & Depression",
ifelse(scrubbed$Year >= 1933 & scrubbed$Year < 1945, "New Deal & WWII",
ifelse(scrubbed$Year >= 1945 & scrubbed$Year < 1960, "Postwar America",
ifelse(scrubbed$Year >= 1960 & scrubbed$Year < 1980, "Vietnam Era",
ifelse(scrubbed$Year >= 1980 & scrubbed$Year <= 2002, "Rise of Technology", "NA"))))))))))))
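Because the Era boundaries are consecutive year ranges, they could also be expressed with base R's cut(). The sketch below is one possible alternative under stated assumptions, not the method used above: `era_breaks`, `era_labels`, and `assign_era()` are hypothetical names, and years are assumed to be whole numbers, so an upper break of 2003 stands in for the original `<= 2002` condition.

```r
# Left-closed intervals [lower, upper) that mirror the ifelse() boundaries
era_breaks <- c(-Inf, 1630, 1763, 1783, 1815, 1860, 1876,
                1914, 1933, 1945, 1960, 1980, 2003)
era_labels <- c("Early America", "Colonial Period", "Revolutionary Period",
                "Young Republic", "Expansionary Period",
                "Civil War & Reconstruction", "Second Industrial Revolution",
                "WWI & Depression", "New Deal & WWII", "Postwar America",
                "Vietnam Era", "Rise of Technology")

# Hypothetical helper: returns the era label for each year,
# or NA for years outside the covered range
assign_era <- function(year) {
  as.character(cut(year, breaks = era_breaks, labels = era_labels,
                   right = FALSE))
}

example_years <- c(1608, 1630, 1776, 1865, 1944, 1945, 2002)
example_eras <- assign_era(example_years)
print(data.frame(Year = example_years, Era = example_eras))
```

Defining the boundaries once in a helper like this would let the same function label both scrubbed$Year and any summary table, rather than repeating the full chain of conditions.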
To easily run reports based on frequency, we will create subsets.
First, we will create a subset based on the frequency of executions by year.
count_Years <- scrubbed %>% group_by(Year) %>%
tally()
However, we will also want to add the Era variable. This will let us see the frequency by year while also shedding light on the historical context.
count_Years$Era <- ifelse(count_Years$Year < 1630, "Early America",
ifelse(count_Years$Year >= 1630 & count_Years$Year < 1763, "Colonial Period",
ifelse(count_Years$Year >= 1763 & count_Years$Year < 1783, "Revolutionary Period",
ifelse(count_Years$Year >= 1783 & count_Years$Year < 1815, "Young Republic",
ifelse(count_Years$Year >= 1815 & count_Years$Year < 1860, "Expansionary Period",
ifelse(count_Years$Year >= 1860 & count_Years$Year < 1876, "Civil War & Reconstruction",
ifelse(count_Years$Year >= 1876 & count_Years$Year < 1914, "Second Industrial Revolution",
ifelse(count_Years$Year >= 1914 & count_Years$Year < 1933, "WWI & Depression",
ifelse(count_Years$Year >= 1933 & count_Years$Year < 1945, "New Deal & WWII",
ifelse(count_Years$Year >= 1945 & count_Years$Year < 1960, "Postwar America",
ifelse(count_Years$Year >= 1960 & count_Years$Year < 1980, "Vietnam Era",
ifelse(count_Years$Year >= 1980 & count_Years$Year <= 2002, "Rise of Technology", "NA"))))))))))))
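Since every year falls in exactly one era, counting by Year and Era together gives the same per-year totals while keeping the Era label attached. The sketch below illustrates this with dplyr's count() on a small made-up data frame; `demo` and `count_demo` are hypothetical names, and the approach assumes the Era column has already been added to the data being counted.

```r
library(dplyr)

# Made-up rows standing in for the real data (one row per execution)
demo <- data.frame(
  Year = c(1608, 1608, 1776, 1999, 1999, 1999),
  Era  = c("Early America", "Early America", "Revolutionary Period",
           "Rise of Technology", "Rise of Technology", "Rise of Technology"),
  stringsAsFactors = FALSE
)

# One row per Year/Era pair, with the number of records in column n
count_demo <- demo %>% count(Year, Era)
print(count_demo)
```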
Next, we will create a subset based on the frequency of executions by method.
count_Method <- scrubbed %>% group_by(Method) %>%
tally()
Next, we will create a subset based on the frequency of executions by conviction.
count_Conviction <- scrubbed %>% group_by(Conviction) %>%
tally()
Next, we will create a subset based on the frequency of executions by state.
count_State <- scrubbed %>% group_by(State) %>%
tally()
Next, we will create a subset based on the frequency of executions by race, broken down by year and region.
count_Race <- scrubbed %>% group_by(Year, Race, Region) %>%
tally()
This variable is unique in that some observations have values of NA. In order to create tidy graphs, we will need to eliminate these records.
count_Race <- na.omit(count_Race)
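One detail worth keeping in mind is that na.omit() only removes true missing values. A literal string "NA", for example a fallback label produced by an ifelse() chain, is treated as ordinary text and is kept. The sketch below shows the difference on a small made-up data frame; `demo` and `cleaned` are hypothetical names.

```r
# Made-up data: one true NA in Race, two literal "NA" strings in Region
demo <- data.frame(
  Race   = c("White", NA, "Black", "White"),
  Region = c("Pacific", "Mountain", "NA", "NA"),
  stringsAsFactors = FALSE
)

# Only the row with the true NA is dropped here
rows_before <- nrow(na.omit(demo))

# Convert the "NA" strings into real missing values, then drop them
demo$Region[demo$Region %in% "NA"] <- NA
cleaned <- na.omit(demo)
rows_after <- nrow(cleaned)

print(c(before = rows_before, after = rows_after))
```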
And finally, we will create a subset that shows race and region.
Race_Region <- scrubbed %>% group_by(Race, Region)
As we did for count_Race, we will remove NA values.
Race_Region <- na.omit(Race_Region)
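Note that na.omit() on a data frame drops every row that has a missing value in any column, so a table that still carries columns such as Age or Occupation may lose more rows than intended. If only Race and Region matter for a given plot, one option is to check completeness on just those columns. This is a minimal sketch on made-up data, not the approach taken above; `demo`, `all_cols`, and `two_cols` are hypothetical names.

```r
# Made-up data: row 1 is missing Age, row 2 is missing Race
demo <- data.frame(
  Race   = c("White", NA, "Black"),
  Region = c("Pacific", "Mountain", "South Atlantic"),
  Age    = c(NA, 30, 41),
  stringsAsFactors = FALSE
)

# na.omit() checks every column, so rows 1 and 2 are both dropped
all_cols <- na.omit(demo)

# complete.cases() on selected columns drops only the row missing Race
keep <- complete.cases(demo[, c("Race", "Region")])
two_cols <- demo[keep, ]

print(nrow(all_cols))
print(nrow(two_cols))
```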
##        long      lat group order  region subregion
## 1 -87.46201 30.38968     1     1 Alabama      <NA>
## 2 -87.48493 30.37249     1     2 Alabama      <NA>
## 3 -87.52503 30.37249     1     3 Alabama      <NA>
## 4 -87.53076 30.33239     1     4 Alabama      <NA>
## 5 -87.57087 30.32665     1     5 Alabama      <NA>
## 6 -87.58806 30.32665     1     6 Alabama      <NA>