Natural disasters can uproot people’s entire lives in a matter of days, and they can create a multitude of challenges for election administrators in affected areas. Do these events affect voter turnout? Are certain areas less equipped to handle these disasters? What can be done to mitigate the adverse effects? In this paper, we will explore these questions and others as we examine Hurricane Matthew and how it may have affected voter turnout in Georgia in the 2016 general election.
For this analysis, I examine the state of Georgia county by county. All the source data used in my observations are also organized by county, so this is the most compact way to study the location of voters. The goal is to determine which areas were most affected and then analyze turnout, demographics, and other information that could reveal interesting findings. To identify those areas, I used a county-level map of Georgia published by the Federal Emergency Management Agency (FEMA) [1].
[1] Georgia Hurricane Matthew (DR-4284), https://www.fema.gov/disaster/4284
This map indicates which counties required different levels of assistance. The southeastern coastal areas were designated to receive both individual and public assistance. As such, I hypothesized that these areas were more likely to have faced lower turnout than the rather high state average of 76.53% [2].
[2] Georgia Statewide General Election 2016 Results, http://results.enr.clarityelections.com/GA/63991/184321/en/summary.html
Furthermore, I hypothesized that these counties likely had higher rates of early voting than the state did overall. It is well-documented that states experiencing hurricanes often see spikes in early voting since voters might be displaced and early voting does not restrict voters by precinct.
In my observations, I used a recent copy of the Georgia Daily Voter File, as provided by Professor McDonald, along with a publicly available copy of the Georgia Voter History file found online [3].
[3] Georgia Voter History, http://elections.sos.ga.gov/Elections/voterhistory.do
Before we begin, we must install and load in all the following packages.
install.packages(c("tidyverse", "dplyr", "ggplot2", "RColorBrewer", "devtools", "stringr", "ggmap", "maps", "mapdata"))
library(tidyverse)
library(dplyr)
library(ggplot2)
library(RColorBrewer)
library(ggmap)
library(maps)
library(mapdata)
Now we must read in the data that we will be observing.
Georgia.County.Codes <- read_csv("georgia_county_codes.txt", col_names = c("state", "state FP", "COUNTY_CODE", "subregion", "Class FP"))
Georgia.Voter.History <- read.table("Georgia 2016.TXT", colClasses = c(rep("character")), sep = " ", fill = TRUE)
Georgia.Voter.File <- read.table("Georgia_Daily_VoterBase.txt", colClasses = c("character", "character", "character", rep("NULL", 11), "character", "character", "character", "character", rep("NULL", 29), "character", "character", rep("NULL", 3)), sep = "|", header = TRUE, fill = TRUE, quote = '', na.strings = 'NULL')
Georgia.Voter.File.Copy <- Georgia.Voter.File
Georgia.Voter.File.Copy2 <- Georgia.Voter.File
Note that copies were made of the voter file to maximize efficiency in case we need to use the unedited original file again.
The voter file I am using here is from just before the 2016 general election. Just in case there are individuals who registered to vote after the October 11th deadline for that election, we must filter the data frame accordingly.
Georgia.Voter.File.Copy$REGISTRATION_DATE <- as.numeric(Georgia.Voter.File.Copy$REGISTRATION_DATE)
Georgia.Voter.File.Copy <- filter(Georgia.Voter.File.Copy, REGISTRATION_DATE < 20161011)
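As a small illustration of the cutoff logic, the sketch below uses made-up registration dates (not taken from the voter file) to show how the strict comparison treats someone who registered on the deadline day itself. Whether such registrants should be kept is an assumption that depends on how the deadline is defined; switching `<` to `<=` would retain them.

```r
# Hypothetical registration dates in YYYYMMDD form, stored as text like the voter file
reg_dates <- c("20160915", "20161010", "20161011", "20161012")
reg_num <- as.numeric(reg_dates)

# Strict comparison, as used above: the deadline day itself is dropped
kept_strict <- reg_dates[reg_num < 20161011]

# Inclusive comparison: registrations made on the deadline day are kept
kept_inclusive <- reg_dates[reg_num <= 20161011]

print(kept_strict)
print(kept_inclusive)
```

Because the dates are written year first, comparing them as plain numbers orders them chronologically, which is what makes this filter work without converting to a date class.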
We must also remove individuals who are marked as inactive on the voter rolls.
Georgia.Voter.File.Copy <- filter(Georgia.Voter.File.Copy, VOTER_STATUS == "A")
Next, we create a tibble for the voter history file and organize the columns in a more tidy fashion.
History.tibble <- as_tibble(Georgia.Voter.History)
History.tibble$V2 = NULL
History.tibble <- transform(History.tibble, county = substr(V1,1,3),
REGISTRATION_NUMBER = substr(V1,4,11),
elect.date = substr(V1,12,19),
elect.type = substr(V1,20,22),
party = substr(V1,23,24),
absentee = substr(V3,1,1),
provisional = substr(V3,2,2),
supplemental = substr(V3,3,3))
History.tibble$V1 = NULL
History.tibble$V3 = NULL
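To make the fixed-width layout easier to check, here is a minimal sketch that applies the same `substr()` positions to a single hypothetical record. The record contents (including the `"NP"` party placeholder and the `"YNN"` flags) are invented for illustration; only the column positions come from the code above.

```r
# Hypothetical fixed-width record built from named pieces so the widths are explicit
v1 <- paste0("089",        # county code, 3 characters
             "01234567",   # registration number, 8 characters
             "20161108",   # election date, 8 characters
             "003",        # election type, 3 characters
             "NP")         # party, 2 characters (placeholder value)
v3 <- "YNN"                # absentee, provisional, supplemental flags

parsed <- data.frame(
  county              = substr(v1, 1, 3),
  REGISTRATION_NUMBER = substr(v1, 4, 11),
  elect.date          = substr(v1, 12, 19),
  elect.type          = substr(v1, 20, 22),
  party               = substr(v1, 23, 24),
  absentee            = substr(v3, 1, 1),
  provisional         = substr(v3, 2, 2),
  supplemental        = substr(v3, 3, 3),
  stringsAsFactors    = FALSE
)
print(parsed)
```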
Lastly, we filter the voter history tibble to only include voters from the 2016 general election, since the source file includes all elections from 2016.
History.tibble <- filter(History.tibble, History.tibble$elect.date == "20161108" & History.tibble$elect.type == "003")
We can now create our denominator of eligible voters for the 2016 general election.
Voters_By_County_Denominator <- count(Georgia.Voter.File.Copy, COUNTY_CODE)
Before we can proceed, we must merge the voter file with the voter history file (by registration number) to obtain a data frame of all 2016 general election voters in Georgia.
Merge <- merge(Georgia.Voter.File.Copy2, History.tibble, by = "REGISTRATION_NUMBER")
Now we can calculate our numerator of 2016 general election voters and analyze the data in a variety of ways.
Voters_By_County_Numerator <- count(Merge, COUNTY_CODE)
Voters_By_County <- data.frame(Georgia.County.Codes$COUNTY_CODE, Voters_By_County_Numerator[,2], Voters_By_County_Denominator[,2])
colnames(Voters_By_County) <- c("subregion", "Voters_By_County_Numerator", "Voters_By_County_Denominator")
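The data frame above lines up the numerator and denominator by row position, which works only if both tables list the counties in the same order and none is missing. A more defensive variant, sketched here with invented counts and hypothetical column names (`voted`, `registered`), pairs the rows on the county code instead.

```r
# Hypothetical per-county counts; the two tables are deliberately in different orders
numerator <- data.frame(COUNTY_CODE = c("089", "091", "025"),
                        voted = c(800, 300, 5000),
                        stringsAsFactors = FALSE)
denominator <- data.frame(COUNTY_CODE = c("025", "089", "091"),
                          registered = c(6000, 1000, 500),
                          stringsAsFactors = FALSE)

# merge() pairs rows on COUNTY_CODE rather than on row position
by_county <- merge(numerator, denominator, by = "COUNTY_CODE")
by_county$Turnout <- by_county$voted / by_county$registered
print(by_county)
```

A county that appears in only one of the two tables would be dropped by this inner join rather than silently matched to the wrong row.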
To visualize Georgia’s voter turnout in the 2016 general election, we will create a geo-map of the state split by its counties, outlining in bold the group of counties that were designated to receive individual and public assistance.
states <- map_data("state")
GA_df <- subset(states, region == "georgia")
counties <- map_data("county")
GA_county <- subset(counties, region == "georgia")
FEMA <- c("bulloch", "bryan", "chatham", "effingham", "evans", "glynn", "liberty", "long", "mcintosh", "wayne")
FEMA_county <- subset(GA_county, subregion %in% FEMA)
GA_base <- ggplot(data = GA_df, mapping = aes(x = long, y = lat, group = group)) +
coord_fixed(1.3) +
geom_polygon(color = "black", fill = "gray")
Voters_By_County$subregion <- Georgia.County.Codes$subregion[match(Voters_By_County$subregion, Georgia.County.Codes$COUNTY_CODE)]
Voters_By_County_Numerator <- as.numeric(Voters_By_County$Voters_By_County_Numerator)
Voters_By_County_Denominator <- as.numeric(Voters_By_County$Voters_By_County_Denominator)
Voters_By_County$Turnout <- Voters_By_County_Numerator / Voters_By_County_Denominator
County.Data <- inner_join(GA_county, Voters_By_County, by = "subregion")
remove.axes <- theme(
axis.text = element_blank(),
axis.line = element_blank(),
axis.ticks = element_blank(),
panel.border = element_blank(),
panel.grid = element_blank(),
axis.title = element_blank()
)
GA_base +
geom_polygon(data = County.Data, aes(fill = Turnout), color = "white") +
geom_polygon(color = "black", fill = NA) +
scale_fill_distiller(palette = "OrRd") +
ggtitle("2016 Georgia Voter Turnout by County") +
theme_bw() +
geom_polygon(data = FEMA_county, fill = NA, color = "black") +
remove.axes
While the group of counties affected by Hurricane Matthew (outlined in black) does not seem to have experienced high turnout relative to much of the rest of the state, only Liberty County and Long County seem to have experienced markedly lower turnout. As such, we will conduct demographic analysis exclusively on these two counties and compare our findings to the rest of the state.
First, let’s dive into the demographic data for Liberty County as made available by the U.S. Census Bureau [4]. We will compare this data to that of the entire state of Georgia [5].
[4] QuickFacts Liberty County, Georgia, https://www.census.gov/quickfacts/libertycountygeorgia
[5] QuickFacts Georgia, https://www.census.gov/quickfacts/ga
Table 1. U.S. Census Bureau population proportions (%) as of July 1, 2017.
| | Under 18 | 65+ | Female | White | Black | Hispanic |
|---|---|---|---|---|---|---|
| Georgia | 24.1 | 13.5 | 51.3 | 52.8 | 32.2 | 9.6 |
| Liberty | 28.1 | 8.8 | 49.3 | 38.7 | 44.1 | 12.8 |
Obviously, citizens under 18 years of age are not eligible to vote. However, these are the only categories available for us to get an idea of the age distribution for our observations.
With that in mind, it appears that Liberty County has a younger and more racially diverse population than the state as a whole. Its population is also somewhat less female (49.3% versus 51.3%).
Next, let’s take a look at the demographic data for Long County as made available by the U.S. Census Bureau [6]. We will also compare this data to that of the entire state of Georgia.
[6] QuickFacts Long County, Georgia, https://www.census.gov/quickfacts/longcountygeorgia
Table 2. U.S. Census Bureau population proportions (%) as of July 1, 2017.
| | Under 18 | 65+ | Female | White | Black | Hispanic |
|---|---|---|---|---|---|---|
| Georgia | 24.1 | 13.5 | 51.3 | 52.8 | 32.2 | 9.6 |
| Long | 27.7 | 9.0 | 49.6 | 57.5 | 27.4 | 11.0 |
Based on this table, Long County also skews a bit younger and more male than does the rest of the state. Unlike Liberty County, however, it has a higher percentage of white residents than the state average, though Long County still has a greater proportion of Hispanic residents than does the state as a whole.
Now, let’s start comparing turnout between these two counties and the state of Georgia based on different demographics.
Before we can create graphs for Liberty and Long County voters, we must create a couple new data frames.
Liberty <- filter(Merge, Merge$COUNTY_CODE == "089")
Liberty_Total <- filter(Georgia.Voter.File.Copy, Georgia.Voter.File.Copy$COUNTY_CODE == "089")
Long <- filter(Merge, Merge$COUNTY_CODE == "091")
Long_Total <- filter(Georgia.Voter.File.Copy, Georgia.Voter.File.Copy$COUNTY_CODE == "091")
Now we are ready to visualize voter turnout in Liberty County, Long County, and Georgia based on race.
Merge$RACE[which(Merge$RACE == "")] <- "U"
Liberty$RACE[which(Liberty$RACE == "")] <- "U"
Long$RACE[which(Long$RACE == "")] <- "U"
Total_Race <- count(Georgia.Voter.File.Copy, Georgia.Voter.File.Copy$RACE)
Liberty_Total_Race <- count(Liberty_Total, Liberty_Total$RACE)
Long_Total_Race <- count(Long_Total, Long_Total$RACE)
Merge.Copy <- Merge %>%
group_by(RACE) %>%
summarise(count=n()) %>%
mutate(perc=count/Total_Race$n)
Liberty.Copy <- Liberty %>%
group_by(RACE) %>%
summarise(count=n()) %>%
mutate(perc=count/Liberty_Total_Race$n)
Long.Copy <- Long %>%
group_by(RACE) %>%
summarise(count=n()) %>%
mutate(perc=count/Long_Total_Race$n)
# fill is set outside aes() so the bars are drawn in red rather than mapped to a "red" category
ggplot(data = Merge.Copy, mapping = aes(x = factor(RACE), y = perc)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Georgia Turnout by Race") +
labs(x = "Race", y = "Turnout") +
ylim(0, 1) +
scale_x_discrete(breaks = c("WH", "BH", "HP", "AI", "AP", "OT", "U"),
labels = c("White", "Black", "Hispanic", "Native American", "Asian", "Other", "Unknown")) +
theme(legend.position = "none")
ggplot(data = Liberty.Copy, mapping = aes(x = factor(RACE), y = perc)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Liberty Turnout by Race") +
labs(x = "Race", y = "Turnout") +
ylim(0, 1) +
scale_x_discrete(breaks = c("WH", "BH", "HP", "AI", "AP", "OT", "U"),
labels = c("White", "Black", "Hispanic", "Native American", "Asian", "Other", "Unknown")) +
theme(legend.position = "none")
ggplot(data = Long.Copy, mapping = aes(x = factor(RACE), y = perc)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Long Turnout by Race") +
labs(x = "Race", y = "Turnout") +
ylim(0, 1) +
scale_x_discrete(breaks = c("WH", "BH", "HP", "AI", "AP", "OT", "U"),
labels = c("White", "Black", "Hispanic", "Native American", "Asian", "Other", "Unknown")) +
theme(legend.position = "none")
Merge.Copy$difference1 <- Merge.Copy$perc - Liberty.Copy$perc
Merge.Copy$difference2 <- Merge.Copy$perc - Long.Copy$perc
ggplot(data = Merge.Copy, mapping = aes(x = factor(RACE), y = difference1)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Turnout Difference Between Liberty and Georgia by Race") +
labs(x = "Race", y = "Turnout Difference (Georgia minus Liberty)") +
ylim(0, 1) +
scale_x_discrete(breaks = c("WH", "BH", "HP", "AI", "AP", "OT", "U"),
labels = c("White", "Black", "Hispanic", "Native American", "Asian", "Other", "Unknown")) +
theme(legend.position = "none")
ggplot(data = Merge.Copy, mapping = aes(x = factor(RACE), y = difference2)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Turnout Difference Between Long and Georgia by Race") +
labs(x = "Race", y = "Turnout Difference (Georgia minus Long)") +
ylim(0, 1) +
scale_x_discrete(breaks = c("WH", "BH", "HP", "AI", "AP", "OT", "U"),
labels = c("White", "Black", "Hispanic", "Native American", "Asian", "Other", "Unknown")) +
theme(legend.position = "none")
Based on these graphs, we can see that Liberty County’s dropoff in turnout percentages from the state averages most highly affected Hispanic voters. Turnout percentage amongst white and Asian voters in Liberty County decreased by a fairly wide margin as well, but black turnout appears to have decreased the least.
Long County’s dropoffs, on the other hand, were most significant amongst Hispanic voters and white voters. Black and Asian turnout percentages appear to have decreased the least. The turnout rate amongst Native Americans in Long County actually exceeded that of the state, so the difference is negative and falls outside the graph’s 0 to 1 axis range, which is why no bar is displayed for that group on the turnout difference graph.
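The group-level rates above divide voter counts by registrant counts position by position, which assumes both tables contain the same categories in the same order. As a hedged sketch of a more defensive approach, the example below uses invented counts and invented statewide rates (the column names `voted`, `registered`, and `state_perc` are hypothetical) to pair the tables on the race code and to keep the sign of the difference.

```r
# Hypothetical counts by race code; "" marks a blank entry in the registrant table
registrants <- data.frame(RACE = c("", "BH", "WH"), registered = c(50, 400, 500),
                          stringsAsFactors = FALSE)
voters <- data.frame(RACE = c("BH", "U", "WH"), voted = c(300, 20, 410),
                     stringsAsFactors = FALSE)

# Recode blanks in BOTH tables so the group labels agree before pairing them
registrants$RACE[registrants$RACE == ""] <- "U"

# Pair counts by race code instead of by row position
turnout <- merge(voters, registrants, by = "RACE")
turnout$perc <- turnout$voted / turnout$registered

# Hypothetical statewide rates for the same codes, used to form a signed difference
state <- data.frame(RACE = c("BH", "U", "WH"), state_perc = c(0.78, 0.35, 0.84),
                    stringsAsFactors = FALSE)
turnout <- merge(turnout, state, by = "RACE")
turnout$difference <- turnout$state_perc - turnout$perc
print(turnout)
```

In this toy data the "U" group has a negative difference, meaning the county rate is above the statewide rate; a plot would need a y-axis range that extends below zero to show such a bar.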
Here we turn our attention to turnout amongst different genders. Let’s visualize this data in Liberty County, Long County, and the state of Georgia as a whole.
Merge$GENDER[which(Merge$GENDER == "")] <- "U"
Liberty$GENDER[which(Liberty$GENDER == "")] <- "U"
Long$GENDER[which(Long$GENDER == "")] <- "U"
Merge_Subset <- subset(Merge, Merge$GENDER %in% c("F", "M", "O"))
Liberty_Subset <- subset(Liberty, Liberty$GENDER %in% c("F", "M", "O"))
Long_Subset <- subset(Long, Long$GENDER %in% c("F", "M", "O"))
Total_Gender <- count(Georgia.Voter.File.Copy, Georgia.Voter.File.Copy$GENDER)
Liberty_Total_Gender <- count(Liberty_Total, Liberty_Total$GENDER)
Long_Total_Gender <- count(Long_Total, Long_Total$GENDER)
Merge.Copy <- Merge_Subset %>%
group_by(GENDER) %>%
summarise(count=n()) %>%
mutate(perc=count/Total_Gender$n)
Liberty.Copy <- Liberty_Subset %>%
group_by(GENDER) %>%
summarise(count=n()) %>%
mutate(perc=count/Liberty_Total_Gender$n)
Long.Copy <- Long_Subset %>%
group_by(GENDER) %>%
summarise(count=n()) %>%
mutate(perc=count/Long_Total_Gender$n)
# fill is set outside aes() so the bars are drawn in red rather than mapped to a "red" category
ggplot(data = Merge.Copy, mapping = aes(x = factor(GENDER), y = perc)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Georgia Turnout by Gender") +
labs(x = "Gender", y = "Turnout") +
ylim(0, 1) +
scale_x_discrete(breaks = c("F", "M", "O", "U"),
labels = c("Female", "Male", "Other", "Unknown")) +
theme(legend.position = "none")
ggplot(data = Liberty.Copy, mapping = aes(x = factor(GENDER), y = perc)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Liberty Turnout by Gender") +
labs(x = "Gender", y = "Turnout") +
ylim(0, 1) +
scale_x_discrete(breaks = c("F", "M", "O", "U"),
labels = c("Female", "Male", "Other", "Unknown")) +
theme(legend.position = "none")
ggplot(data = Long.Copy, mapping = aes(x = factor(GENDER), y = perc)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Long Turnout by Gender") +
labs(x = "Gender", y = "Turnout") +
ylim(0, 1) +
scale_x_discrete(breaks = c("F", "M", "O", "U"),
labels = c("Female", "Male", "Other", "Unknown")) +
theme(legend.position = "none")
Merge.Copy$difference1 <- Merge.Copy$perc - Liberty.Copy$perc
Merge.Copy$difference2 <- Merge.Copy$perc - Long.Copy$perc
ggplot(data = Merge.Copy, mapping = aes(x = factor(GENDER), y = difference1)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Turnout Difference Between Liberty and Georgia by Gender") +
labs(x = "Gender", y = "Turnout Difference (Georgia minus Liberty)") +
ylim(0, 1) +
scale_x_discrete(breaks = c("F", "M", "O", "U"),
labels = c("Female", "Male", "Other", "Unknown")) +
theme(legend.position = "none")
ggplot(data = Merge.Copy, mapping = aes(x = factor(GENDER), y = difference2)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Turnout Difference Between Long and Georgia by Gender") +
labs(x = "Gender", y = "Turnout Difference (Georgia minus Long)") +
ylim(0, 1) +
scale_x_discrete(breaks = c("F", "M", "O", "U"),
labels = c("Female", "Male", "Other", "Unknown")) +
theme(legend.position = "none")
Both males and females experienced similarly high turnout percentage decreases in Liberty County compared to the remainder of the state.
In Long County, turnout rates dipped slightly more amongst women, but not by a very significant margin.
Next, let’s analyze whether turnout varied significantly between different age groups.
Merge$Age <- (2016 - as.numeric(Merge$BIRTHDATE))
Liberty$Age <- (2016 - as.numeric(Liberty$BIRTHDATE))
Long$Age <- (2016 - as.numeric(Long$BIRTHDATE))
Liberty_Total$Age <- (2016 - as.numeric(Liberty_Total$BIRTHDATE))
Long_Total$Age <- (2016 - as.numeric(Long_Total$BIRTHDATE))
Georgia.Voter.File.Copy$Age <- (2016 - as.numeric(Georgia.Voter.File.Copy$BIRTHDATE))
Merge$age_range <- "<30"
Merge$age_range[which(Merge$Age>=30)] <- "30-45"
Merge$age_range[which(Merge$Age>45)] <- "45+"
Liberty$age_range <- "<30"
Liberty$age_range[which(Liberty$Age>=30)] <- "30-45"
Liberty$age_range[which(Liberty$Age>45)] <- "45+"
Long$age_range <- "<30"
Long$age_range[which(Long$Age>=30)] <- "30-45"
Long$age_range[which(Long$Age>45)] <- "45+"
Georgia.Voter.File.Copy$age_range <- "<30"
Georgia.Voter.File.Copy$age_range[which(Georgia.Voter.File.Copy$Age>=30)] <- "30-45"
Georgia.Voter.File.Copy$age_range[which(Georgia.Voter.File.Copy$Age>45)] <- "45+"
Liberty_Total$age_range <- "<30"
Liberty_Total$age_range[which(Liberty_Total$Age>=30)] <- "30-45"
Liberty_Total$age_range[which(Liberty_Total$Age>45)] <- "45+"
Long_Total$age_range <- "<30"
Long_Total$age_range[which(Long_Total$Age>=30)] <- "30-45"
Long_Total$age_range[which(Long_Total$Age>45)] <- "45+"
Total_Age <- count(Georgia.Voter.File.Copy, Georgia.Voter.File.Copy$age_range)
Liberty_Total_Age <- count(Liberty_Total, Liberty_Total$age_range)
Long_Total_Age <- count(Long_Total, Long_Total$age_range)
Merge.Copy <- Merge %>%
group_by(age_range) %>%
summarise(count=n()) %>%
mutate(perc=count/Total_Age$n)
Liberty.Copy <- Liberty %>%
group_by(age_range) %>%
summarise(count=n()) %>%
mutate(perc=count/Liberty_Total_Age$n)
Long.Copy <- Long %>%
group_by(age_range) %>%
summarise(count=n()) %>%
mutate(perc=count/Long_Total_Age$n)
# fill is set outside aes() so the bars are drawn in red rather than mapped to a "red" category
ggplot(data = Merge.Copy, mapping = aes(x = factor(age_range), y = perc)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Georgia Turnout by Age") +
labs(x = "Age", y = "Turnout") +
ylim(0, 1) +
theme(legend.position = "none")
ggplot(data = Liberty.Copy, mapping = aes(x = factor(age_range), y = perc)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Liberty Turnout by Age") +
labs(x = "Age", y = "Turnout") +
ylim(0, 1) +
theme(legend.position = "none")
ggplot(data = Long.Copy, mapping = aes(x = factor(age_range), y = perc)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Long Turnout by Age") +
labs(x = "Age", y = "Turnout") +
ylim(0, 1) +
theme(legend.position = "none")
Merge.Copy$difference1 <- Merge.Copy$perc - Liberty.Copy$perc
Merge.Copy$difference2 <- Merge.Copy$perc - Long.Copy$perc
ggplot(data = Merge.Copy, mapping = aes(x = factor(age_range), y = difference1)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Turnout Difference Between Liberty and Georgia by Age") +
labs(x = "Age", y = "Turnout Difference (Georgia minus Liberty)") +
ylim(0, 1) +
theme(legend.position = "none")
ggplot(data = Merge.Copy, mapping = aes(x = factor(age_range), y = difference2)) +
geom_bar(stat = "identity", fill = "red") +
ggtitle(" 2016 Turnout Difference Between Long and Georgia by Age") +
labs(x = "Age", y = "Turnout Difference (Georgia minus Long)") +
ylim(0, 1) +
theme(legend.position = "none")
Voters between 18 and 30 years of age turned out at a substantially lower rate in Liberty County than they did in the state overall. This is quite revealing, as we saw in the Census statistics that Liberty County skews younger than the rest of the state. Turnout was also quite low amongst 30 to 45 year olds in Liberty County. For individuals older than 45, turnout barely dropped off at all.
The dropoff in turnout amongst the different age groups in Long County was a bit more evenly spread, but it still clearly affected younger voters much more so than it did older voters.
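The three age brackets used in this section were assigned with repeated `which()` calls for each data frame. As an alternative sketch, base R's `cut()` can produce the same grouping in one step, assuming ages are whole numbers (as they are when computed as 2016 minus birth year). The ages below are made up for illustration.

```r
# Hypothetical ages, assumed to be whole numbers as in 2016 minus birth year
age <- c(18, 29, 30, 45, 46, 90)

# Integer breaks reproduce the three groups used above: under 30, 30 to 45, over 45
age_range <- cut(age,
                 breaks = c(-Inf, 29, 45, Inf),
                 labels = c("<30", "30-45", "45+"))
print(table(age_range))
```

Note that, as in the original code, a 45 year old falls in the "30-45" group, so the "45+" label effectively covers ages 46 and up.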
Finally, let’s visualize how people in Liberty County and Long County voted versus how people voted in the rest of the state.
HurricaneCounties <- subset(Merge, Merge$COUNTY_CODE %in% c("015", "016", "025", "051", "054", "063", "089", "091", "098", "151"))
Merge_absentee_summary <- Merge %>%
group_by(absentee) %>%
summarise(Percent = n()/nrow(.) * 100)
Liberty_absentee_summary <- Liberty %>%
group_by(absentee) %>%
summarise(Percent = n()/nrow(.) * 100)
Long_absentee_summary <- Long %>%
group_by(absentee) %>%
summarise(Percent = n()/nrow(.) * 100)
HurricaneCounties_absentee_summary <- HurricaneCounties %>%
group_by(absentee) %>%
summarise(Percent = n()/nrow(.) * 100)
ggplot(data = Merge_absentee_summary, mapping = aes(x = "", y = Percent, fill = absentee)) +
geom_bar(width = 1, stat = "identity") +
ggtitle(" 2016 Georgia Voters by Method of Voting") +
labs(x = "", y = "") +
scale_y_continuous(breaks = round(cumsum(rev(Merge_absentee_summary$Percent)), 1)) +
scale_fill_discrete(name = "Early/Absentee/Mail",
breaks = c("N", "Y"),
labels = c("No", "Yes")) +
coord_polar("y", start = 0)
ggplot(data = Liberty_absentee_summary, mapping = aes(x = "", y = Percent, fill = absentee)) +
geom_bar(width = 1, stat = "identity") +
ggtitle(" 2016 Liberty Voters by Method of Voting") +
labs(x = "", y = "") +
scale_y_continuous(breaks = round(cumsum(rev(Liberty_absentee_summary$Percent)), 1)) +
scale_fill_discrete(name = "Early/Absentee/Mail",
breaks = c("N", "Y"),
labels = c("No", "Yes")) +
coord_polar("y", start = 0)
ggplot(data = Long_absentee_summary, mapping = aes(x = "", y = Percent, fill = absentee)) +
geom_bar(width = 1, stat = "identity") +
ggtitle(" 2016 Long Voters by Method of Voting") +
labs(x = "", y = "") +
scale_y_continuous(breaks = round(cumsum(rev(Long_absentee_summary$Percent)), 1)) +
scale_fill_discrete(name = "Early/Absentee/Mail",
breaks = c("N", "Y"),
labels = c("No", "Yes")) +
coord_polar("y", start = 0)
ggplot(data = HurricaneCounties_absentee_summary, mapping = aes(x = "", y = Percent, fill = absentee)) +
geom_bar(width = 1, stat = "identity") +
ggtitle(" 2016 FEMA Designated Voters by Method of Voting") +
labs(x = "", y = "") +
scale_y_continuous(breaks = round(cumsum(rev(HurricaneCounties_absentee_summary$Percent)), 1)) +
scale_fill_discrete(name = "Early/Absentee/Mail",
breaks = c("N", "Y"),
labels = c("No", "Yes")) +
coord_polar("y", start = 0)
Unsurprisingly, a slightly larger percentage of voters in Liberty County cast their ballots early, in some fashion, than did voters in the state overall. I would have expected a somewhat larger discrepancy in these numbers, but it seems that voters in this county did not need to vote early in substantially higher numbers than usual after experiencing the effects of Hurricane Matthew.
When considering voters in Long County and in the subsetted group of counties designated for public and individual assistance according to the FEMA map, we observe a much more interesting phenomenon: a significantly lower percentage of voters cast their ballots early than did the state as a whole. Further study into different areas affected by natural disasters would be helpful to determine whether this is a fluke or indeed an indication of a larger trend that contradicts my hypothesis.
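For readers who want to check the shares behind the pie charts directly, the following minimal sketch computes the percentage of voters in each category from a vector of absentee flags. The ten flags are invented; only the "Y"/"N" coding (early, absentee, or mail versus not) follows the legends above.

```r
# Hypothetical absentee flags for ten voters ("Y" = early/absentee/mail, "N" = not)
absentee <- c("Y", "N", "Y", "Y", "N", "Y", "N", "Y", "Y", "N")

# Share of voters in each category, in percent
share <- prop.table(table(absentee)) * 100
print(share)
```

Printing these numbers alongside each chart would make the county-to-state comparison easier to judge than reading slice sizes by eye.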
I faced a variety of challenges while compiling this data into a presentable final product. As an R novice, I found there was a rather steep learning curve to understanding the intricacies of coding in the proper formats, especially with regard to reading in data. I encountered a variety of errors and warning messages that interfered with my progress throughout this semester, and sometimes it took quite a while to figure out what exactly needed to be done to proceed.
An example of a rather tricky problem was figuring out the correct way to merge the Georgia voter file with the Georgia voter history to create my numerator of total voters in the state. After researching different methods, I tried the following code.
Georgia.Voter.File.Copy2$REGISTRATION_NUMBER <- History.tibble$reg.number[match(History.tibble$reg.number, Georgia.Voter.File.Copy2$REGISTRATION_NUMBER)]
Upon running this code, I received the following error message:
Error in `$<-.data.frame`(`*tmp*`, REGISTRATION_NUMBER, value = c(3955143L, :
  replacement has 4165157 rows, data has 6776829
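The idea here was to check all the voter registration numbers in the voter file and keep only those numbers that also appeared in the voter history file. However, match() returns one position for each element of its first argument, so the replacement vector had one entry per voter history record (4,165,157), while the column being overwritten had one cell per voter file row (6,776,829). A data frame column cannot be replaced by a vector whose length does not fit the number of rows, so R raised the error.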
I was quite confused at how to work around this error until I came across the merge function, which creates a new data frame only with the desired matched values. I ran the following line of code, also shown previously, which did the trick.
Merge <- merge(Georgia.Voter.File.Copy2, History.tibble, by = "REGISTRATION_NUMBER")
With this newly constructed merged data set, I was able to proceed making great use of it in the remainder of my work.
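The difference between the two approaches can be reproduced on a toy scale. The sketch below uses an invented five-row voter file and three-row history file (all values hypothetical) to show that `match()` yields one position per history record, while `merge()` returns only the rows whose registration numbers appear in both tables.

```r
# Hypothetical miniature voter file (5 rows) and history file (3 rows)
voter_file <- data.frame(REGISTRATION_NUMBER = c("001", "002", "003", "004", "005"),
                         COUNTY_CODE = c("089", "089", "091", "025", "091"),
                         stringsAsFactors = FALSE)
history <- data.frame(REGISTRATION_NUMBER = c("002", "004", "005"),
                      absentee = c("Y", "N", "Y"),
                      stringsAsFactors = FALSE)

# match() returns one position per element of its FIRST argument (3 here, not 5),
# so the result cannot be assigned to a column of the 5-row voter file
pos <- match(history$REGISTRATION_NUMBER, voter_file$REGISTRATION_NUMBER)
print(length(pos))

# merge() keeps only registration numbers present in both tables
merged <- merge(voter_file, history, by = "REGISTRATION_NUMBER")
print(nrow(merged))
```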
I would also like to note a choice I made that might have affected the results of my research. Before merging the two data sets, I filtered out inactive voters from the Georgia voter file. This lowered my statewide turnout denominator by about 600,000 individuals and brought my aggregate turnout figure to 80.86% (computed below), well above the official rate of 76.53% posted by the Secretary of State [7].
[7] Georgia Election Results, http://results.enr.clarityelections.com/GA/63991/184321/en/summary.html
nrow(Merge) / nrow(Georgia.Voter.File.Copy)
## [1] 0.8086204
I find it quite confusing how there potentially exist a couple hundred thousand individuals in Georgia that were able to cast their ballots in the 2016 general election despite being marked as inactive in the voter file at the time. Perhaps there is something completely different going on here, but it seems most likely that this is indeed the case.
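One possible contributor to this gap, offered as a sketch rather than a diagnosis: in the code shown earlier, the numerator comes from merging the unfiltered voter file with the history file, while the denominator counts only active registrants. The toy example below, with entirely invented registrants, shows how a rate computed from mismatched populations can exceed either consistent version.

```r
# Hypothetical registrants: status "A" = active, "I" = inactive; voted = cast a ballot
registrants <- data.frame(
  id     = 1:10,
  status = c("A", "A", "A", "A", "A", "A", "A", "A", "I", "I"),
  voted  = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Numerator from everyone who voted, denominator from active registrants only
mixed_rate <- sum(registrants$voted) / sum(registrants$status == "A")

# Numerator and denominator both restricted to active registrants
active <- registrants[registrants$status == "A", ]
active_rate <- sum(active$voted) / nrow(active)

# Numerator and denominator both drawn from all registrants
overall_rate <- sum(registrants$voted) / nrow(registrants)

print(c(mixed = mixed_rate, active_only = active_rate, overall = overall_rate))
```

Whether this accounts for the difference from the official figure would need to be checked against the actual files, for example by counting how many merged voters carry a status other than "A".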
Lastly, I would have liked to have conducted more analysis on other demographics such as party breakdown, but since there is no party registration in Georgia, the only available data in the voter file is “party last voted”. Because such a low percentage of the registered voter population participates in party primaries, most of the cells in this column are blank. This makes any meaningful analysis of this demographic category impossible.
After conducting a great deal of analysis on voter turnout in the 2016 general election across the state of Georgia, most of my findings were a bit surprising, to say the least.
Both of my hypotheses were shown to most likely be incorrect. While Liberty County and Long County both faced lower turnout rates than did the rest of the state, it is difficult to confidently attribute this to the aftermath of the storm since most of the other counties in the subset experienced turnout rates closer to the state average. Also, because I relied strictly on observational data, I cannot draw any conclusions about causation. There might be a multitude of confounding variables that contributed to lower turnout in these counties compared to the remainder of the state.
Furthermore, early voting rates in this subsetted region, aside from Liberty County, seem to have been much lower than the state average. This also suggests that outside factors likely contributed to Liberty County and Long County having lower turnout, since early voting typically surges in areas recently affected by natural disasters. Further study into whether this supposed phenomenon actually occurs would be helpful to clear up this conundrum.
That being said, I am not going to just shrug my shoulders and give up just yet. Deeper investigation into some available data could help shed light on some interesting factoids that could guide future research into the matter.
Returning to the Census data used previously, I notice that Liberty County is poorer than the state as a whole. Median household income between 2012 and 2016 in Liberty County was $42,484, roughly 17% below the state’s median household income of $51,037 during that time. Liberty County’s poverty rate is also slightly higher at 15.3%, compared to Georgia’s rate of 14.9%.
Long County is a bit closer to the state average in terms of median household income at $50,848, but its poverty rate is substantially higher than the rest of the state at 18.6%.
If the hurricane did in fact affect voter turnout in these two counties, it is worth noting these numbers since it is reasonable to infer that their respective Supervisors of Elections might not have had sufficient resources to tackle the various obstacles that potentially came with Hurricane Matthew. Thus, perhaps early voting rates were lower in these counties due to a lack of adequate infrastructure to accommodate demand for it.
Lastly, if other areas affected by different natural disasters have experienced upticks in early voting, it might be wise for Supervisors in the counties most affected by Hurricane Matthew to invest more into early voting for future elections as a precautionary measure.
As a side note, I am eager to see the findings of my classmates’ research into how Hurricane Matthew affected voter turnout in Florida and North Carolina, and I hope to continue studying the effects of natural disasters on voter turnout in the future. As Hurricane Michael recently swept through the Florida Panhandle, it would be wise to now examine the potential effects that storm had on turnout in that specific area for the 2018 general election. It is possible that might be a better case study to analyze, but nonetheless such research would certainly be a valuable addition to the current body of literature.