Code
library(readr)
library(dplyr)
library(ggplot2)
library(lubridate)
library(tidyverse)Based on a UNC study in 2023 (Found HERE), the most Republican counties in North Carolina were Mitchell, Avery, and Yadkin counties. These counties were concentrated in western North Carolina. The UNC study (found HERE) also found that the most Democratic counties in North Carolina in 2023 were Hertford, Bertie, and Northampton counties, all of which were concentrated in northeast North Carolina.
With this understanding, I wanted to analyze voter weekly voter registration data (found HERE) in each of these counties to see how voter registration rates, both overall and within particular demographics, changed alongside major events that occurred throughout the 2024 presidential election. The dates I focused on were:
June 27: Trump-Biden Debate
July 13: Trump gets shot
July 15-18: RNC Convention
July 21: Biden drops out
August 6: Kamala Chooses Tim Walz as running mate
August 19-22: DNC Convention
September 10: Harris-Trump Debate
While the voter registration rates will not necessarily show a causation, I am hoping to analyze the correlation between each of these events and the change of voter registration rates within my selected demographics, which will compare gender, race, and political affiliation.
First, I will need to upload the data and necessary packages.
library(readr)
library(dplyr)
library(ggplot2)
library(lubridate)
library(tidyverse)dem_counties <- read_csv(("~/Desktop/VoterRegistrationData/DemCounties.csv"))
rep_counties <- read_csv(("~/Desktop/VoterRegistrationData/RepCounties.csv"))Group by date and calculate total registration for each dataset
dem_total <- dem_counties %>%
group_by(Date) %>%
summarise(Total_Voters = sum(Total))
rep_total <- rep_counties %>%
group_by(Date) %>%
summarise(Total_Voters = sum(Total))Create a combined plot
ggplot() +
geom_line(data = dem_total, aes(x = Date, y = Total_Voters, color = "Democratic Counties"), size = 1) +
geom_point(data = dem_total, aes(x = Date, y = Total_Voters, color = "Democratic Counties"), size = 3) +
geom_line(data = rep_total, aes(x = Date, y = Total_Voters, color = "Republican Counties"), size = 1) +
geom_point(data = rep_total, aes(x = Date, y = Total_Voters, color = "Republican Counties"), size = 3) +
scale_color_manual(values = c("Democratic Counties" = "blue", "Republican Counties" = "red")) +
labs(title = "Total Voter Registration Over Summer Months",
x = "Date",
y = "Total Number of Registered Voters",
color = "County Group") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "bottom")From an aerial view, it seems there wasn’t much change among both the conservative and liberal counties, while there was a slight increase from 8/31-9/14, which was following the DNC Convention and prior to the Harris-Trump Debate,
Let’s save the plot.
ggsave("summer_voter_registration_trend.png", width = 10, height = 6)Now, we can create a line graph for each county’s total population
ggplot(rep_counties, aes(x = Date, y = Total, color = County, group = County)) +
geom_line(size = 1) +
geom_point(size = 3) +
scale_color_manual(values = c("AVERY" = "blue", "MITCHELL" = "green", "YADKIN" = "red")) +
labs(title = "Total Voter Registration for Republican Counties",
x = "Date",
y = "Total Number of Registered Voters",
color = "County") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "bottom")This graph shows that Yadkin County’s total registration count increased the most among the conservative counties.
Save the plot
ggsave("republican_counties_trend.png", width = 10, height = 6)Print out summary statistics for each county
rep_counties %>%
group_by(County) %>%
summarise(
Min_Total = min(Total),
Max_Total = max(Total),
Total_Change = max(Total) - min(Total)
)# A tibble: 3 × 4
County Min_Total Max_Total Total_Change
<chr> <dbl> <dbl> <dbl>
1 AVERY 12988 13126 138
2 MITCHELL 11303 11359 56
3 YADKIN 25119 25334 215
Now, let’s create a line graph with facets for just the conservative counties to see how their total registered voters changed.
ggplot(rep_counties, aes(x = Date, y = Total, group = 1)) +
geom_line(color = "blue", size = 1) +
geom_point(color = "red", size = 3) +
facet_wrap(~ County, scales = "free_y", ncol = 3) +
labs(title = "Voter Registration Trends for Republican Counties",
x = "Date",
y = "Total Number of Registered Voters") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
plot.title = element_text(hjust = 0.5))It seems that the sharpest decrease in voter registration in Avery County was following 7/27, in which Biden dropped out of the debate. The sharpest decreases for Mitchell and Yadkin County were following 9/14, which was following the Harris-Trump debate.
Save the plot
ggsave("republican_counties_line_trend.png", width = 12, height = 4)Now, let’s calculate total change for each conservative county
county_changes <- rep_counties %>%
group_by(County) %>%
summarise(
Start_Total = first(Total),
End_Total = last(Total),
Total_Change = last(Total) - first(Total),
Percent_Change = ((last(Total) - first(Total)) / first(Total)) * 100
)Avery County had the largest percent change with 1.0625192.
Now, we can create a line graph for each liberal county’s total population
ggplot(dem_counties, aes(x = Date, y = Total, color = County, group = County)) +
geom_line(size = 1) +
geom_point(size = 3) +
scale_color_manual(values = c("BERTIE" = "blue", "HERTFORD" = "green", "NORTHAMPTON" = "red")) +
labs(title = "Total Voter Registration for Democratic Counties",
x = "Date",
y = "Total Number of Registered Voters",
color = "County") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "bottom")There is a clear sharp decrease for Northampton County following 7/27, which was after Biden dropped out of the race.
Let’s save this plot
ggsave("democratic_counties_trend.png", width = 10, height = 6)Let’s also print out summary statistics for each county
dem_counties %>%
group_by(County) %>%
summarise(
Min_Total = min(Total),
Max_Total = max(Total),
Total_Change = max(Total) - min(Total)
)# A tibble: 3 × 4
County Min_Total Max_Total Total_Change
<chr> <dbl> <dbl> <dbl>
1 BERTIE 12639 12787 148
2 HERTFORD 13829 13971 142
3 NORTHAMPTON 12756 12913 157
This shows that Northampton’s registration count increased the most.
Create a line graph with facets for each county
ggplot(dem_counties, aes(x = Date, y = Total, group = 1)) +
geom_line(color = "blue", size = 1) +
geom_point(color = "red", size = 3) +
facet_wrap(~ County, scales = "free_y", ncol = 3) +
labs(title = "Voter Registration Trends for Democratic Counties",
x = "Date",
y = "Total Number of Registered Voters") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
plot.title = element_text(hjust = 0.5))This graph highlights the sharp drop in voter registration for Northampton County following 7/27, but while Bertie and Hertford Counties sharply decreased following 9/14, there was a new peak for Northampton County.
Save the plot
ggsave("democratic_counties_line_trend.png", width = 12, height = 4)Now, let’s calculate the total change for each liberal county
county_changes <- dem_counties %>%
group_by(County) %>%
summarise(
Start_Total = first(Total),
End_Total = last(Total),
Total_Change = last(Total) - first(Total),
Percent_Change = ((last(Total) - first(Total)) / first(Total)) * 100
)The largest change was with Northampton county, with a percent change of 1.121041.
The most Republican county is Mitchell, and the most Democratic County is Hertford. So let’s take a deeper dive how certain demographics changed over time in these two counties.
First, white voter count in Hertford County
hertford_white <- read_csv(("~/Desktop/VoterRegistrationData/hertford_white.csv"))hertford_white$Date <- as.Date(paste0("2024-", hertford_white$Date), format = "%Y-%m/%d")
ggplot(hertford_white, aes(x = Date, y = White)) +
geom_line(color = "blue", size = 1) +
geom_point(color = "darkblue", size = 3) +
labs(
title = "White Voter Registration in Hertford County",
x = "Date",
y = "Number of White Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)Most notably, we can see a steady uphill climb once Kamala began her campaign, with a small decrease following the Harris-Trump debate.
Now, let’s find the percent change in this category
hertford_white_changes <- hertford_white |>
summarise(
Start_Total = first(White),
End_Total = last(White),
Total_Change = last(White) - first(White),
Percent_Change = ((last(White) - first(White)) / first(White)) * 100
)The percent change was -0.2408257, showing an overall decrease.
Then, black voter count in Hertford county
hertford_black <- read_csv(("~/Desktop/VoterRegistrationData/hertford_black.csv"))
hertford_black$Date <- as.Date(paste0("2024-", hertford_black$Date), format = "%Y-%m/%d")
ggplot(hertford_black, aes(x = Date, y = Black)) +
geom_line(color = "blue", size = 1) +
geom_point(color = "darkblue", size = 3) +
labs(
title = "Black Voter Registration in Hertford County",
x = "Date",
y = "Number of Black Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)The sharpest climb was after the Democratic National Convention.
Now, let’s find the percent change in this category
hertford_black_changes <- hertford_black |>
summarise(
Start_Total = first(Black),
End_Total = last(Black),
Total_Change = last(Black) - first(Black),
Percent_Change = ((last(Black) - first(Black)) / first(Black)) * 100
)The percent change was 0.3976383, showing a minor increase.
Now, female voter count in Hertford County
hertford_female <- read_csv(("~/Desktop/VoterRegistrationData/hertford_female.csv"))
hertford_female$Date <- as.Date(paste0("2024-", hertford_female$Date), format = "%Y-%m/%d")
ggplot(hertford_female, aes(x = Date, y = Female)) +
geom_line(color = "blue", size = 1) +
geom_point(color = "darkblue", size = 3) +
labs(
title = "Female Voter Registration in Hertford County",
x = "Date",
y = "Number of Female Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)There was a sharp increase following the Democratic National Convention, with sharp decrease following the Republican National Convention and the Harris Trump debate.
Now, let’s find the percent change in this category
hertford_female_changes <- hertford_female |>
summarise(
Start_Total = first(Female),
End_Total = last(Female),
Total_Change = last(Female) - first(Female),
Percent_Change = ((last(Female) - first(Female)) / first(Female)) * 100
)The percent change was 0.2953417, showing a very minor increase.
Then, male voter count in Hertford County
hertford_male <- read_csv(("~/Desktop/VoterRegistrationData/hertford_male.csv"))
hertford_male$Date <- as.Date(paste0("2024-", hertford_male$Date), format = "%Y-%m/%d")
ggplot(hertford_male, aes(x = Date, y = Male)) +
geom_line(color = "blue", size = 1) +
geom_point(color = "darkblue", size = 3) +
labs(
title = "Male Voter Registration in Hertford County",
x = "Date",
y = "Number of Male Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)This graph shows a very steady uphill climb in male voter registration over these summer months, with one noticeable decrease following Tim Walz being chosen as Harris’s running mate.
Now, let’s find the percent change in this category
hertford_male_changes <- hertford_male |>
summarise(
Start_Total = first(Male),
End_Total = last(Male),
Total_Change = last(Male) - first(Male),
Percent_Change = ((last(Male) - first(Male)) / first(Male)) * 100
)This shows quite a large percent change of 1.26183, which is a significant increase.
Now, democratic voters in Hertford County
hertford_democratic <- read_csv(("~/Desktop/VoterRegistrationData/hertford_democratic.csv"))
hertford_democratic$Date <- as.Date(paste0("2024-", hertford_democratic$Date), format = "%Y-%m/%d")
ggplot(hertford_democratic, aes(x = Date, y = Democratic)) +
geom_line(color = "blue", size = 1) +
geom_point(color = "darkblue", size = 3) +
labs(
title = "Democratic Voter Registration in Hertford County",
x = "Date",
y = "Number of Democratic Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)This graph shows a very steady decline in voter registration in Hertford County among democrats. Even though there was a brief increase following Kamala being chosen to step into Biden’s place, there was a continuous decline in voter registration until the Democratic National Convention.
Now, let’s find the percent change in this category
hertford_democratic_changes <- hertford_democratic |>
summarise(
Start_Total = first(Democratic),
End_Total = last(Democratic),
Total_Change = last(Democratic) - first(Democratic),
Percent_Change = ((last(Democratic) - first(Democratic)) / first(Democratic)) * 100
)This shows a minor decrease with -0.2408257.
Now, Republican voters in Hertford County
hertford_republican <- read_csv(("~/Desktop/VoterRegistrationData/hertford_republican.csv"))
hertford_republican$Date <- as.Date(paste0("2024-", hertford_republican$Date), format = "%Y-%m/%d")
ggplot(hertford_republican, aes(x = Date, y = Republican)) +
geom_line(color = "blue", size = 1) +
geom_point(color = "darkblue", size = 3) +
labs(
title = "Republican Voter Registration in Hertford County",
x = "Date",
y = "Number of Republican Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)This shows a consistent uphill climb among Hertford County Republicans, with a slight decline following Kamala choosing Walz as her running mate.
Now, let’s find the percent change in this category
hertford_republican_changes <- hertford_republican |>
summarise(
Start_Total = first(Republican),
End_Total = last(Republican),
Total_Change = last(Republican) - first(Republican),
Percent_Change = ((last(Republican) - first(Republican)) / first(Republican)) * 100
)This shows quite a significant increase with a percent change of 1.43213.
Now, let’s do the same for the most Republican county, Mitchell County
First, white voter count in Mitchell County
mitchell_white <- read_csv(("~/Desktop/VoterRegistrationData/mitchell_white.csv"))
mitchell_white$Date <- as.Date(paste0("2024-", mitchell_white$Date), format = "%Y-%m/%d")
ggplot(mitchell_white, aes(x = Date, y = White)) +
geom_line(color = "red", size = 1) +
geom_point(color = "darkred", size = 3) +
labs(
title = "White Voter Registration in Mitchell County",
x = "Date",
y = "Number of White Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)This shows the strongest increase following Biden dropping out of the race.
Now, let’s find the percent change in this category
mitchell_white_changes <- mitchell_white|>
summarise(
Start_Total = first(White),
End_Total = last(White),
Total_Change = last(White) - first(White),
Percent_Change = ((last(White) - first(White)) / first(White)) * 100
)White residents in Mitchell County had a very minor change with 0.413997.
Then, black voter count in Mitchell County
mitchell_black <- read_csv(("~/Desktop/VoterRegistrationData/mitchell_black.csv"))
mitchell_black$Date <- as.Date(paste0("2024-", mitchell_black$Date), format = "%Y-%m/%d")
ggplot(mitchell_black, aes(x = Date, y = Black)) +
geom_line(color = "red", size = 1) +
geom_point(color = "darkred", size = 3) +
labs(
title = "Black Voter Registration in Mitchell County",
x = "Date",
y = "Number of Black Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)#now, let's find the percent change in this category
mitchell_black_changes <- mitchell_black |>
summarise(
Start_Total = first(Black),
End_Total = last(Black),
Total_Change = last(Black) - first(Black),
Percent_Change = ((last(Black) - first(Black)) / first(Black)) * 100
)This graph is interesting because the total count only changed by two voters, with the strongest increase following the Democratic National Convention.
Now, let’s find the percent change in this category
mitchell_black_changes <- mitchell_black |>
summarise(
Start_Total = first(Black),
End_Total = last(Black),
Total_Change = last(Black) - first(Black),
Percent_Change = ((last(Black) - first(Black)) / first(Black)) * 100
)While actually a small increase, the percent change was 2.325581.
Now, female voter count in Mitchell County
mitchell_female <- read_csv(("~/Desktop/VoterRegistrationData/mitchell_female.csv"))
mitchell_female$Date <- as.Date(paste0("2024-", mitchell_female$Date), format = "%Y-%m/%d")
ggplot(mitchell_female, aes(x = Date, y = Female)) +
geom_line(color = "red", size = 1) +
geom_point(color = "darkred", size = 3) +
labs(
title = "Female Voter Registration in Mitchell County",
x = "Date",
y = "Number of Female Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)This shows a steady incline following Biden dropping out and Kamala joining the race.
Now, let’s find the percent change in this category
mitchell_female_changes <- mitchell_female |>
summarise(
Start_Total = first(Female),
End_Total = last(Female),
Total_Change = last(Female) - first(Female),
Percent_Change = ((last(Female) - first(Female)) / first(Female)) * 100
)There was a very minor increase with a percent change of 0.2953417.
Then, male voter count in Mitchell County
mitchell_male <- read_csv(("~/Desktop/VoterRegistrationData/mitchell_male.csv"))
mitchell_male$Date <- as.Date(paste0("2024-", mitchell_male$Date), format = "%Y-%m/%d")
ggplot(mitchell_male, aes(x = Date, y = Male)) +
geom_line(color = "red", size = 1) +
geom_point(color = "darkred", size = 3) +
labs(
title = "Male Voter Registration in Mitchell County",
x = "Date",
y = "Number of Male Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)The largest decrease was following Biden dropping out and Kamala stepping in his place, and the largest increase was following the Harris-Trump debate.
Now, let’s find the percent change in this category
mitchell_male_changes <- mitchell_male |>
summarise(
Start_Total = first(Male),
End_Total = last(Male),
Total_Change = last(Male) - first(Male),
Percent_Change = ((last(Male) - first(Male)) / first(Male)) * 100
)There was a minor increase with 0.3798481.
Now, democratic voters in Mitchell County
mitchell_democratic <- read_csv(("~/Desktop/VoterRegistrationData/mitchell_democratic.csv"))
mitchell_democratic$Date <- as.Date(paste0("2024-", mitchell_democratic$Date), format = "%Y-%m/%d")
ggplot(mitchell_democratic, aes(x = Date, y = Democratic)) +
geom_line(color = "red", size = 1) +
geom_point(color = "darkred", size = 3) +
labs(
title = "Democratic Voter Registration in Mitchell County",
x = "Date",
y = "Number of Democratic Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)The largest increase was following the Democratic National Convention and the largest decrease was following the Harris-Trump debate.
Now, let’s find the percent change in this category
mitchell_democratic_changes <- mitchell_democratic|>
summarise(
Start_Total = first(Democratic),
End_Total = last(Democratic),
Total_Change = last(Democratic) - first(Democratic),
Percent_Change = ((last(Democratic) - first(Democratic)) / first(Democratic)) * 100
)There was a pretty large increase with 0.8737864.
Now, Republican voters in Mitchell County
mitchell_republican <- read_csv(("~/Desktop/VoterRegistrationData/mitchell_republican.csv"))
mitchell_republican$Date <- as.Date(paste0("2024-", mitchell_republican$Date), format = "%Y-%m/%d")
ggplot(mitchell_republican, aes(x = Date, y = Republican)) +
geom_line(color = "red", size = 1) +
geom_point(color = "darkred", size = 3) +
labs(
title = "Republican Voter Registration in Mitchell County",
x = "Date",
y = "Number of Republican Voters"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5),
plot.caption = element_text(color = "gray50"),
axis.text.x = element_text(angle = 45, hjust = 1)
)There was a quite significant increase following the Harris-Trump debate and a minor decrease following Trump getting shot.
Now, let’s find the percent change in this category
mitchell_republican_changes <- mitchell_republican|>
summarise(
Start_Total = first(Republican),
End_Total = last(Republican),
Total_Change = last(Republican) - first(Republican),
Percent_Change = ((last(Republican) - first(Republican)) / first(Republican)) * 100
)There was a positive increase of 0.5376344.
This study was quite interesting because when comparing each demographic between the two counties, there are many indirect correlations, where one point in Hertford County increases while the same point in Mitchell County decreases.