##(Introduction)

This project examines the impact of same-day registration (SDR) on voter turnout in subsequent elections. Specifically, it focuses on voters in North Carolina who used SDR during the 2020 general election, which allowed them to register and vote after the traditional registration deadline. By comparing this group to individuals who registered during the same period but did not use SDR and were therefore unable to vote in 2020, the analysis seeks to determine whether SDR participants were more likely to vote in the 2024 general election. The results show how SDR policies relate to voting habits over time, helping us understand how making elections more accessible can encourage people to stay engaged in voting.

##(Data Used)

For this analysis, I utilized two data sets from the state of North Carolina: the statewide voter history file and the statewide voter file.

The voter history file contains records of voters’ participation in past elections, including details about the election, voting method, and party affiliation during voting. The voter file includes information about registered voters, such as demographic details, party affiliation, and registration date.

To focus the analysis on the most relevant information, I only selected columns necessary for examining voter turnout and demographic trends. This includes the unique voter identifier (ncid), the voter’s registration date, election history (whether the voter participated in certain elections), and demographic information (such as age, race, and party affiliation).

The columns were selected using the following code. Note that the object names are the reverse of what one might expect: voter_registration holds the voter history file, and voter_history holds the voter file.

Reading and selecting relevant columns from the voter history file:

voter_registration <- read_tsv("/Users/anabella/Documents/project/ncvhis_statewide.txt", col_select = c("ncid", "election_lbl", "election_desc", "voting_method", "voted_party_cd", "vtd_label", "county_id", "county_desc"))

Reading and selecting relevant columns from the voter file:

voter_history <- read_tsv("/Users/anabella/Documents/project/ncvoter_statewide.txt", col_select = c("ncid", "county_id", "county_desc", "birth_year", "age_at_year_end", "race_code", "ethnic_code", "party_cd", "gender_code", "registr_dt"))

The datasets were merged using a left join on the ncid column, which serves as the unique identifier for each voter. This ensured that all records from the voter file were retained, even if there was no corresponding entry in the voter history file. The code for merging is shown below:

###(Merging the datasets)

merged_data <- voter_history %>% left_join(voter_registration, by = "ncid")
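To illustrate what the join produces, here is a minimal sketch on toy data. It uses base R's merge() with all.x = TRUE as a stand-in for left_join(); the two small data frames and their values are made up for illustration and are not drawn from the North Carolina files. The point is that the result has one row per voter per election, and a voter with no history keeps a single row with NA in the history columns.

```r
# Toy stand-ins for the voter file and the voter history file (hypothetical values).
voters <- data.frame(
  ncid = c("A1", "B2", "C3"),
  registr_dt = c("10/20/2020", "01/15/2016", "10/25/2020"),
  stringsAsFactors = FALSE
)
history <- data.frame(
  ncid = c("A1", "A1", "B2"),
  election_lbl = c("11/03/2020", "11/05/2024", "11/03/2020"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of `voters`, like a left join on ncid.
toy_merged <- merge(voters, history, by = "ncid", all.x = TRUE)

# A1 appears twice (two elections), B2 once, and C3 once with NA history.
print(toy_merged)
rows_per_voter <- table(toy_merged$ncid)
```

Because one voter can occupy several rows after the join, any later count of "users" should be clear about whether it counts rows or distinct ncid values.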

###(Filtering the Data for Same Day Registrants in the 2020 General Election)

To identify same-day registrants (SDR), I first converted the registr_dt column from a character format to a Date format for easier comparison. The registr_dt column contains the date a voter registered, and by converting it, I could filter the data based on a specific registration deadline. I then defined the registration deadline for the 2020 election as October 9, 2020. Finally, I filtered the data to select voters who registered after this deadline and who voted in the election (as indicated by the non-NA values in the voted_party_cd column).

#####(Below is the code I used to filter the data for people who registered after the registration deadline and voted:)

sdr_users <- merged_data %>% filter(registr_dt > registration_deadline & !is.na(voted_party_cd))
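The date conversion and deadline definition described above are not shown in the report, so the following is only a sketch of how they might look. It assumes registr_dt is stored as text in MM/DD/YYYY form and uses a made-up toy data frame. The extra election_lbl condition at the end is an assumption about how one could limit the match to 2020 ballots; it is not part of the original filter.

```r
# Toy rows standing in for merged_data (hypothetical values).
toy <- data.frame(
  ncid = c("A1", "B2", "C3", "D4"),
  registr_dt = c("10/20/2020", "09/01/2020", "10/25/2020", "10/15/2020"),
  election_lbl = c("11/03/2020", "11/03/2020", NA, "11/05/2024"),
  voted_party_cd = c("DEM", "REP", NA, "UNA"),
  stringsAsFactors = FALSE
)

# Assumption: dates are text in MM/DD/YYYY form.
toy$registr_dt <- as.Date(toy$registr_dt, format = "%m/%d/%Y")
registration_deadline <- as.Date("2020-10-09")

# Same two conditions as the filter above: registered after the deadline
# and has a non-missing voted_party_cd.
after_deadline_voted <- toy[toy$registr_dt > registration_deadline &
                              !is.na(toy$voted_party_cd), ]

# Stricter variant (assumption): also require a 2020 general election ballot.
sdr_2020_only <- after_deadline_voted[
  after_deadline_voted$election_lbl == "11/03/2020", ]

print(after_deadline_voted$ncid)
print(sdr_2020_only$ncid)
```

In the toy data, voter D4 passes the first filter through a 2024 ballot alone, which is why the stricter variant may be worth considering.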

##(Demographic Analysis of Same Day Registrants)

##(Histogram Showing Age Distribution of Same-day Registrants)

The histogram of the age distribution for same-day registrants (SDRs) was created using the ggplot2 package. First, I loaded the package and used the ggplot() function to plot the age data. The geom_histogram() function was applied to visualize the distribution, with the age at year-end as the variable of interest. To customize the plot, I set the bin width to 1 and defined color and fill properties. Additionally, the x-axis was limited to ages between 18 and 100 to focus on the relevant age range for SDRs.

#####(This code was used to create a histogram of age distribution for same-day registrants, setting the bin width to 1 and restricting the x-axis to ages between 18 and 100.)

ggplot(sdr_users, aes(x = age_at_year_end)) + geom_histogram(binwidth = 1, fill = "#FFC0CB", color = "black", alpha = 0.7) + scale_x_continuous(limits = c(18, 100))

When the two files were read, readr reported 31,388,804 rows in the voter history file and 8,883,637 rows in the voter file. When drawing the histogram, ggplot2 warned that 383 rows fell outside the 18 to 100 axis range and were removed.

The histogram shows a downward trend, indicating that younger individuals were more likely to use same-day registration. While not every age group follows this pattern, the overall trend shows a decrease in SDR usage as age increases.

##(Pie Chart Showing Party Distribution of Same-day Registrants in the 2020 General Election)

The pie chart visualizes the political party breakdown of SDR users.

#####(To prepare the data, the mutate() function was used to clean and standardize the voted_party_cd column, mapping abbreviations like “rep” and “dem” to full party names:)

mutate(voted_party_cd = case_when( voted_party_cd == "rep" ~ "Republican", voted_party_cd == "dem" ~ "Democrat", voted_party_cd == "una" ~ "Independent", TRUE ~ "Other" ))

This ensured consistency in labeling before grouping and summarizing the data.

#####(The chart itself was created using ggplot(), with geom_bar() to generate the base and coord_polar(theta = “y”) to transform it into a pie chart:)

ggplot(party_breakdown, aes(x = "", y = Count, fill = voted_party_cd)) + geom_bar(stat = "identity", width = 1, color = "white") + coord_polar(theta = "y")

This visualization highlights the proportion of SDR users in each party, with text labels added to show the percentage breakdown for clarity.

The pie chart reveals that Republicans and Democrats had nearly equal representation among SDR users, with their proportions being very similar. This suggests that same-day registration was utilized at comparable rates by both major political parties.

##(Pie Chart Showing Race Distribution of Same-day Registrants in the 2020 General Election)

This pie chart was created to display the percentage of SDR users who identify as Black, White, or neither. First, I categorized users by race using the mutate() function with case_when(), assigning "White" for race code "W", "Black" for race code "B", and "Does not Identify with Either" for all other values.

#####(This ensured clear labels for each group:)

mutate(race_category = case_when( race_code == "W" ~ "White", race_code == "B" ~ "Black", TRUE ~ "Does not Identify with Either" ))

Next, I grouped the data by race category and calculated the percentage for each group. Finally, I used ggplot() to create a polar bar chart, transforming it into a pie chart with coord_polar(theta = "y"), and added percentages as text labels for clarity. The colors were customized using scale_fill_manual().
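The grouping and percentage step is described but not shown, so here is a minimal base R sketch of one way to compute it. The race_code values below are toy data and race_share is a hypothetical name; the report's own pipeline may differ.

```r
# Toy race codes (hypothetical values, not from the voter file).
race_code <- c("W", "W", "B", "A", "W", "B", "U", "W")

# Same three-way recoding as the case_when() call above.
race_category <- ifelse(race_code == "W", "White",
                 ifelse(race_code == "B", "Black",
                        "Does not Identify with Either"))

# Count each category and convert the counts to percentages of the total.
counts <- table(race_category)
race_share <- round(100 * prop.table(counts), 1)
print(race_share)
```

The resulting percentages are the values a pie chart would display as text labels.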

The majority of same-day registrants identify as White. The proportions appear broadly similar to the racial demographics of the United States population.

###(Calculating Voter Turnout in the 2024 General Election among SDR Users in the 2020 General Election)

This code was used to calculate the voter turnout in the 2024 General Election for same-day registrants (SDR) from the 2020 General Election. First, it filters the data for the 2024 election and identifies who voted based on their voting method. Then, it keeps the 2020 SDR users whose ncid appears in the 2024 election records and calculates the turnout by comparing the number of SDR users who voted in 2024 to the total number of SDR users.

#####(Step 1: Filter the dataset for the November 2024 General Election)

merged_data_2024 <- merged_data[merged_data$election_lbl == "11/05/2024", ]

#####(Step 2: Identify voters who participated in the 2024 General Election)

merged_data_2024$voted_2024 <- ifelse(merged_data_2024$voting_method != "", 1, 0)

#####(Step 3: Filter for same-day registrants (SDR) in 2020)

sdr_users_2020 <- sdr_users[sdr_users$ncid %in% merged_data_2024$ncid, ]

#####(Step 4: Calculate turnout for 2024 among 2020 SDR users)

turnout_2024_sdr <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% sdr_users_2020$ncid]) / nrow(sdr_users_2020)
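As a sanity check on the logic, the sketch below computes a turnout rate on toy data as the share of distinct voters in a group whose ncid appears in the 2024 records. This is an illustrative variant, not the report's code: it deduplicates ncid before dividing, which matters when a joined table holds one row per voter per election. All values and the names group_ids and voted_2024_ids are hypothetical.

```r
# Toy group of 2020 SDR users; A1 appears twice, as it could after a join.
group_ids <- c("A1", "A1", "B2", "C3", "D4")

# Toy set of voters with a ballot recorded in the 2024 general election.
voted_2024_ids <- c("A1", "C3", "Z9")

# Work with distinct voters so repeated rows do not inflate the denominator.
unique_group <- unique(group_ids)

# Turnout = distinct group members who voted in 2024 / distinct group members.
turnout <- sum(unique_group %in% voted_2024_ids) / length(unique_group)
print(turnout)  # 2 of 4 voters, i.e. 0.5
```

Dividing by the number of rows instead (5 here) would give a different rate, so it is worth stating which denominator a reported turnout figure uses.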

###(Calculating voter turnout in the 2024 General Election for Non-Voters (people who registered after the traditional deadline but were not able to vote))

This code identifies late registrants of the 2020 General Election who were not able to vote and did not use same-day registration (SDR). It then calculates the voter turnout in 2024 for these non-voting late registrants by filtering for their participation in the 2024 General Election and computing their turnout rate.

#####(Step 1: Filter for the 2020 General Election (November 3, 2020))

merged_data_2020 <- merged_data %>% filter(election_lbl == "11/03/2020")

#####(Step 2: Identify people who did not vote in the 2020 General Election)

non_voters_2020hopefully <- late_registrants_2020[!late_registrants_2020$ncid %in% merged_data_2020$ncid, ]

#####(Step 3: Filter for the 2024 General Election)

merged_data_2024 <- merged_data %>% filter(election_lbl == "11/05/2024")

#####(Step 4: Identify voters who participated in the 2024 General Election)

merged_data_2024$voted_2024 <- ifelse(merged_data_2024$voting_method != "", 1, 0)

#####(Step 5: Find those who registered late in 2020 but did not vote in 2020)

non_voters_2020_2024 <- merged_data_2024[merged_data_2024$ncid %in% non_voters_2020hopefully$ncid, ]

#####(Step 6: Calculate turnout for 2024 among these non-voters in 2020)

turnout_2024_non_voters <- sum(non_voters_2020_2024$voted_2024) / nrow(non_voters_2020hopefully)

##(Comparison of 2024 Voter Turnout amongst 2020 Same Day Registrants and 2020 Late Registrants who were not able to vote in 2020)

##(Bar Graph)

This bar graph compares the voter turnout in the 2024 general election between Same-Day Registrants and non-voters from the 2020 general election. It highlights the difference in turnout percentages for each group.

#####(The code begins by creating a data frame turnout_data that holds the groups (“SDR Users” and “Non-Voters”) along with their respective voter turnout percentages for the 2024 general election.)

turnout_data <- data.frame( group = c("SDR Users", "Non-Voters"), turnout = c(turnout_2024_sdr, turnout_2024_non_voters) )

The ggplot() function then generates a bar graph, where aes(x = group, y = turnout * 100, fill = group) maps the groups to the x-axis and their turnout percentages to the y-axis, while geom_bar(stat = "identity") creates the bars. To ensure the y-axis covers the entire percentage range, expand_limits(y = c(0, 100)) is added, which sets the y-axis to span from 0% to 100%.
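Putting those pieces together, the plotting call might look like the sketch below. The turnout values are the rounded figures reported in this document's comparison table (39.9% and 11%), while labs() and the object name p are additions for illustration. The original chunk is not shown, so treat this as an approximation.

```r
library(ggplot2)

# Turnout proportions as reported in this document's comparison table.
turnout_data <- data.frame(
  group = c("SDR Users", "Non-Voters"),
  turnout = c(0.399, 0.11)
)

# geom_bar(stat = "identity") draws bar heights straight from the y aesthetic.
p <- ggplot(turnout_data, aes(x = group, y = turnout * 100, fill = group)) +
  geom_bar(stat = "identity") +
  expand_limits(y = c(0, 100)) +             # make the y-axis span 0 to 100
  labs(x = NULL, y = "Turnout in 2024 (%)")  # labels are an illustrative addition

# Calling print(p) renders the chart.
```

Using expand_limits() rather than hard axis limits widens the axis without dropping any data that might fall outside the range.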

##(Table comparing 2024 Voter Turnout: SDR Voters vs Non-Voters)

The table created in this code compares voter turnout in 2024 between Same-Day Registration (SDR) voters from 2020 and non-voters from 2020. It displays the turnout percentage for each group and the percentage by which SDR voters were more likely to vote in 2024.

#####(The following code chunk calculates how much more likely SDR users were to vote in 2024 than non-voters. The formula finds the difference in turnout rates between SDR voters and non-voters, divides it by the non-voters’ turnout, and multiplies by 100 to express the result as a percentage.)

percentage_more_likely_all <- ((turnout_2024_sdr - turnout_2024_non_voters) / turnout_2024_non_voters) * 100
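A small worked example helps separate "percent more likely" from "times as likely". The helper names below are hypothetical, and the inputs are the rounded turnout figures reported in this document (39.9% and 11%), so the result differs slightly from the 262.2% obtained from the unrounded rates.

```r
# Relative increase of x over a baseline, in percent.
pct_more_likely <- function(x, baseline) {
  (x - baseline) / baseline * 100
}

# Simple ratio of x to the baseline.
times_as_likely <- function(x, baseline) {
  x / baseline
}

sdr <- 0.399        # rounded SDR turnout reported in this document
non_voters <- 0.11  # rounded non-voter turnout reported in this document

pct <- pct_more_likely(sdr, non_voters)
ratio <- times_as_likely(sdr, non_voters)

# The two measures are linked: ratio = 1 + pct / 100.
print(round(pct, 1))    # 262.7
print(round(ratio, 2))  # 3.63
```

In other words, a group that is about 262% more likely to vote is about 3.6 times as likely to vote, not 2.6 times.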

#####(The following snippet creates a table with three columns: the group names (SDR Voters and Non-Voters), the turnout percentages for each group in 2024, and the percentage by which SDR voters are more likely to vote (calculated in the previous step). "NA" is placed for non-voters in the percentage difference column since they are the comparison base.)

turnout_table_all <- data.frame( Group = c("SDR Voters (2020)", "Non-Voters (2020)"), Turnout_2024 = c(paste0(round(turnout_2024_sdr * 100, 1), "%"), paste0(round(turnout_2024_non_voters * 100, 1), "%")), `% More Likely to Vote in 2024` = c(paste0(round(percentage_more_likely_all, 1), "%"), "NA"), check.names = FALSE )

Voter Turnout Comparison for SDR Users vs Non-Voters (2020)

| Group | Turnout_2024 | % More Likely to Vote in 2024 |
|---|---|---|
| SDR Voters (2020) | 39.9% | 262.2% |
| Non-Voters (2020) | 11% | NA |

The bar graph and table demonstrate that individuals who utilized Same-Day Registration (SDR) in 2020 and were able to vote that year were substantially more likely to vote in the 2024 election than those who registered late in 2020 and were unable to vote. 39.9% of SDR users voted in 2024, while only 11% of non-voters did. Specifically, the table highlights that SDR users were 262.2% more likely to vote in 2024 than non-voters, which is roughly 3.6 times as likely.

###(Finding 2024 Voter Turnout for 2020 SDR Users Based on Political Affiliation)

#####(Step 1: Filter SDR users for Republicans, Democrats, and Independents)

sdr_users_rep <- sdr_users_2020[sdr_users_2020$voted_party_cd == "REP", ]

sdr_users_dem <- sdr_users_2020[sdr_users_2020$voted_party_cd == "DEM", ]

sdr_users_una <- sdr_users_2020[sdr_users_2020$voted_party_cd == "UNA", ]

#####(Step 2: Calculate turnout in 2024 for SDR Republicans)

turnout_2024_sdr_rep <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% sdr_users_rep$ncid]) / nrow(sdr_users_rep)

#####(Step 3: Calculate turnout in 2024 for SDR Democrats)

turnout_2024_sdr_dem <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% sdr_users_dem$ncid]) / nrow(sdr_users_dem)

#####(Step 4: Calculate turnout in 2024 for SDR Independents)

turnout_2024_sdr_una <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% sdr_users_una$ncid]) / nrow(sdr_users_una)

###(Finding 2024 Voter Turnout for 2020 Non-Voters Based on Political Affiliation)

#####(Step 1: Filter non-SDR users for Republicans, Democrats, and Independents)

non_sdr_users_rep <- non_voters_2020hopefully[non_voters_2020hopefully$voted_party_cd == "REP", ]

non_sdr_users_dem <- non_voters_2020hopefully[non_voters_2020hopefully$voted_party_cd == "DEM", ]

non_sdr_users_una <- non_voters_2020hopefully[non_voters_2020hopefully$voted_party_cd == "UNA", ]

#####(Step 2: Calculate turnout in 2024 for non-SDR Republicans)

turnout_2024_non_sdr_rep <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% non_sdr_users_rep$ncid]) / nrow(non_sdr_users_rep)

#####(Step 3: Calculate turnout in 2024 for non-SDR Democrats)

turnout_2024_non_sdr_dem <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% non_sdr_users_dem$ncid]) / nrow(non_sdr_users_dem)

#####(Step 4: Calculate turnout in 2024 for non-SDR Independents)

turnout_2024_non_sdr_una <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% non_sdr_users_una$ncid]) / nrow(non_sdr_users_una)

##(Bar Graph Showing Voter Turnout in 2024 based on Party and SDR Status)

To produce this graph, I created a data frame containing turnout percentages for each group (SDR and non-SDR voters) across three political affiliations: Democrats, Republicans, and Independents. I used the geom_bar() function to create grouped bar graphs with custom colors for each party and SDR status. Additionally, the geom_text() function was used to add percentage labels on top of the bars, and the ylim(0, 100) function ensured the y-axis spans from 0 to 100% for better comparison.

#####(This line of code:)

geom_bar(stat = "identity", position = position_dodge(width = 0.7), width = 0.6)

was used to create grouped bar graphs, where each bar represents a turnout percentage. The stat = “identity” argument ensures the heights of the bars correspond directly to the values in the data, while position_dodge() spaces the bars within each group for clear comparison.

#####(This line of code:)

ylim(0, 100)

was used to set the y-axis range from 0% to 100%, ensuring the different sizes of the bars are represented accurately relative to the full turnout percentage scale.

##(Table Showing Difference in 2024 Voter Turnout based on SDR Status and Political Affiliation)

I will now explain how the following table was created, which displays voter turnout by party and SDR status, along with the calculated differences in voting likelihood between SDR and non-SDR groups.

#####(The following lines of code calculate how much more likely each group is to vote based on their SDR status:)

percentage_more_likely_rep <- ((turnout_2024_sdr_rep - turnout_2024_non_sdr_rep) / turnout_2024_non_sdr_rep) * 100

This line calculates the percentage difference in voting likelihood between SDR and non-SDR Republicans. It computes the difference between the turnout of SDR Republicans and non-SDR Republicans, then divides it by the non-SDR Republicans’ turnout and multiplies by 100 to express it as a percentage. This step was repeated for Democrats and Independents.

The data.frame() function then organizes these pre-calculated values into a table (party_data), combining the party names, their respective voter turnout values, and the calculated percentage differences into a single data structure for easy display.

#####(The code below uses the kable() function to take the data frame (party_data) and format it into a table. The caption argument adds a title, col.names sets the column names, and align specifies the alignment of the columns (left for the “Party” column and center for the other two columns).)

kable(party_data, caption = "Voter Turnout by Party and SDR Status", col.names = c("Party", "Voter Turnout (2024)", "(%) More Likely to Vote in 2024"), align = c("l", "c", "c"))

Voter Turnout by Party and SDR Status

| Party | Voter Turnout (2024) | (%) More Likely to Vote in 2024 |
|:---|:---:|:---:|
| Republicans (SDR) | 42.9% | 242.1% |
| Republicans (Non-SDR) | 12.5% | - |
| Democrats (SDR) | 40.9% | 620.4% |
| Democrats (Non-SDR) | 5.7% | - |
| Independents (SDR) | 58.4% | 293.7% |
| Independents (Non-SDR) | 14.8% | - |

The bar graph and table both illustrate the significant impact of Same-Day Registration (SDR) on voter turnout in 2024, comparing the likelihood of voting between SDR and non-SDR groups in each political party. From the bar graph, we observe that individuals who used SDR in 2020 were notably more likely to vote in 2024 compared to those who were non-voters in 2020. Specifically, the voter turnout for Republicans who used SDR was 42.9%, while Republicans who were non-voters in 2020 had a turnout of 12.5%. For Democrats, 40.9% of SDR users voted, while only 5.7% of non-voters in 2020 cast their ballots. Among Independents, 58.4% of SDR users voted, compared to 14.8% of non-voters.

The table reinforces these findings, showing the percentage increase in likelihood to vote for SDR users compared to non-voters. For example, SDR Republicans were 242.1% more likely to vote than non-voter Republicans (about 3.4 times as likely), SDR Democrats were 620.4% more likely to vote than non-voter Democrats (about 7.2 times as likely), and SDR Independents were 293.7% more likely to vote than non-voter Independents (about 3.9 times as likely).

When comparing the likelihood to vote across these three political groups, we see a clear trend:

Democrats benefit the most from SDR, with an increase in likelihood to vote of 620.4%. Independents come next, with an increase of 293.7%. Republicans show the smallest increase, at 242.1%.

In conclusion, while Same-Day Registration positively impacts voter turnout across all three groups—Republicans, Democrats, and Independents—it most strongly influences voter turnout among Democrats. This suggests that SDR has a particularly powerful effect in mobilizing Democratic voters for the 2024 general election.

##(Finding 2024 Voter Turnout for SDR Users based on Race)

#####(Step 1: Filter SDR users for Black and White People)

sdr_users_white <- sdr_users_2020[sdr_users_2020$race_code == "W", ]

sdr_users_black <- sdr_users_2020[sdr_users_2020$race_code == "B", ]

#####(Step 2: Calculate turnout in 2024 for White SDR Users)

turnout_2024_sdr_white <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% sdr_users_white$ncid]) / nrow(sdr_users_white)

#####(Step 3: Calculate turnout in 2024 for Black SDR Users)

turnout_2024_sdr_black <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% sdr_users_black$ncid]) / nrow(sdr_users_black)

###(Finding 2024 Voter Turnout for Non-Voters based on Race)

#####(Step 1: Filter non-SDR users for White and Black People)

non_sdr_users_white <- non_voters_2020hopefully[non_voters_2020hopefully$race_code == "W", ]

non_sdr_users_black <- non_voters_2020hopefully[non_voters_2020hopefully$race_code == "B", ]

#####(Step 2: Calculate turnout in 2024 for white non-voters)

turnout_2024_non_sdr_white <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% non_sdr_users_white$ncid]) / nrow(non_sdr_users_white)

#####(Step 3: Calculate turnout in 2024 for Black non-voters)

turnout_2024_non_sdr_black <- sum(merged_data_2024$voted_2024[merged_data_2024$ncid %in% non_sdr_users_black$ncid]) / nrow(non_sdr_users_black)

##(Bar Graph Showing the Difference in 2024 Voter Turnout based on Race and SDR Status)

The bar graph displays the difference in voter turnout in 2024 based on race and Same-Day Registration (SDR) status. It compares the turnout rates of SDR users and non-SDR voters among different racial groups, showing the impact of SDR on voter participation amongst different races.

The data preparation and plotting involved a few key steps.

#####(First, using the code below, a data frame was created with the turnout data for both SDR and non-SDR groups for two racial categories, “White” and “Black.” This data was structured in a way that grouped the turnout rates by SDR status and race.)

data <- data.frame( Group = c("SDR White", "Non-SDR White", "SDR Black", "Non-SDR Black"), Turnout = c(turnout_2024_sdr_white * 100, turnout_2024_non_sdr_white * 100, turnout_2024_sdr_black * 100, turnout_2024_non_sdr_black * 100), Race = c("White", "White", "Black", "Black"), SDR_Status = c("SDR", "Non-SDR", "SDR", "Non-SDR") )

The Group column defines the combination of race and SDR status, while the Turnout column represents the voter turnout percentage.

Next, ggplot2 was used to create the bar graph. The aes() function defined the variables for the x-axis (Group), y-axis (Turnout), and fill color (interaction between race and SDR status). The geom_bar() function created the bar graph, with stat = "identity" ensuring that the bars represent the actual values in the Turnout column.

#####(Here, position_dodge() was used to ensure the bars for SDR and non-SDR groups were displayed side by side.)

ggplot(data, aes(x = Group, y = Turnout, fill = interaction(Race, SDR_Status))) + geom_bar(stat = "identity", position = position_dodge(width = 0.7), width = 0.6)

The scale_fill_manual() function was used to assign specific colors to each combination of race and SDR status. This allows us to differentiate the groups clearly. The geom_text() was used to display the turnout percentages directly on the bars, and labs() added titles and axis labels for clarity.

##(Table Showing Difference in 2024 Voter Turnout based on SDR Status and Race)

This code calculates and creates a table that shows the difference in voter turnout for different racial groups based on their use of Same-Day Registration (SDR) in 2020.

#####(First, I calculated the difference in voter turnout in 2024 between the two racial groups based on SDR status, determining how much more likely SDR users were to vote in 2024 compared to non-voters within each racial group. This was done using the following formula, which computes the percentage increase in turnout for SDR voters.)

((turnout_SDR - turnout_non_SDR) / turnout_non_SDR) * 100

Next, the code creates a data frame called race_data to store the calculated values and the original voter turnout data for both SDR and non-SDR groups. The columns in this data frame include “Race,” “Voter Turnout (2024),” and “Difference (%),” with turnout values formatted as percentages to one decimal place.

#####(Finally, the kable() function from the knitr package is used to create a simple table that displays the voter turnout and differences for each group. The col.names argument defines the column headers, and align specifies the alignment of the columns for a cleaner presentation.)

kable(race_data, caption = "Voter Turnout by Race and SDR Status", col.names = c("Race", "Voter Turnout (2024)", "(%) More Likely to Vote in 2024"), align = c("l", "c", "c"))

Voter Turnout by Race and SDR Status

| Race | Voter Turnout (2024) | (%) More Likely to Vote in 2024 |
|:---|:---:|:---:|
| White (SDR) | 39.2% | 264.7% |
| White (Non-SDR) | 10.7% | - |
| Black (SDR) | 38.4% | 447.4% |
| Black (Non-SDR) | 7% | - |

The bar graph and table both highlight the significant impact of Same-Day Registration (SDR) on voter turnout in 2024 based on race. For both White and Black voters, SDR users were much more likely to vote compared to non-voters. Specifically, the voter turnout for White SDR voters was 39.2%, compared to 10.7% for White non-voters, and for Black SDR voters, it was 38.4%, compared to 7% for Black non-voters.

The table shows that White SDR voters were 264.7% more likely to vote than non-SDR White voters (about 3.6 times as likely), while Black SDR voters were 447.4% more likely to vote than non-SDR Black voters (about 5.5 times as likely). These findings indicate that while SDR has a significant impact on voter turnout for both racial groups, the effect is much stronger for Black voters than for White voters.

##(Table Showing Difference in 2024 Voter Turnout based on Age)

The following table shows the voter turnout rates in 2024 for each age group, comparing non-SDR users and SDR users. It also displays the percentage by which SDR users were more likely to vote in 2024 compared to non-SDR users within each age group.

This code performs several tasks related to analyzing voter turnout by age group, specifically comparing the turnout between SDR (Same-Day Registration) users and non-SDR users in 2024. It categorizes users into age groups, calculates voter turnout rates for each group, and generates a table with the results. The final section of the code creates a new variable that shows how much more likely SDR users are to vote compared to non-SDR users, expressed as a factor (e.g., 1.5 times more likely).

The first part of the code categorizes SDR users and non-SDR users into different age groups using the categorize_age function. For each user, the age_at_year_end value is passed into the categorize_age function, which assigns them to a specific age group based on their age.

#####(The following code is used to apply categorization to SDR users)

sdr_users_2020$age_group <- sapply(sdr_users_2020$age_at_year_end, categorize_age)

#####(The following code is used to apply categorization to non-SDR users)

non_voters_2020hopefully$age_group <- sapply(non_voters_2020hopefully$age_at_year_end, categorize_age)
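The body of categorize_age is not shown. A minimal sketch consistent with the age groups in the results table (18-24, 25-30, 31-40, 41-60, 61+) could look like this; the handling of missing or under-18 ages is an assumption.

```r
# Hypothetical reconstruction: assigns one age to one of the five groups
# that appear in the results table.
categorize_age <- function(age) {
  if (is.na(age) || age < 18) {
    return(NA_character_)  # assumption: missing or under-18 ages get no group
  } else if (age <= 24) {
    return("18-24")
  } else if (age <= 30) {
    return("25-30")
  } else if (age <= 40) {
    return("31-40")
  } else if (age <= 60) {
    return("41-60")
  } else {
    return("61+")
  }
}

# Applied element by element, as in the sapply() calls above.
ages <- c(19, 27, 35, 52, 70, NA)
age_group <- sapply(ages, categorize_age)
print(age_group)
```

For millions of rows, a vectorized alternative such as cut() would likely be faster than calling the function once per voter, but the element-wise version mirrors the approach described here.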

I defined a function (calculate_turnout_by_age) that calculates the voter turnout rate for each age group by comparing the number of voters in each group to the total number of users in that group. It loops through each unique age group, calculates the turnout, and stores the results. This function was then used to calculate voter turnout for both SDR and non-SDR users in 2024, as shown below:

#####(Calculate turnout rates for SDR users in 2024)

turnout_2024_sdr_age <- calculate_turnout_by_age(sdr_users_2020, merged_data_2024, "age_group", "voted_2024")

#####(Calculate turnout rates for non-SDR users in 2024)

turnout_2024_non_sdr_age <- calculate_turnout_by_age(non_voters_2020hopefully, merged_data_2024, "age_group", "voted_2024")
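The definition of calculate_turnout_by_age is not shown either. Going by the description and the four arguments in the calls above, one possible implementation is sketched below on toy data. The argument names, the deduplication by ncid, and the returned named vector are all assumptions.

```r
# Hypothetical sketch: turnout per age group =
# group members who voted / group members.
calculate_turnout_by_age <- function(users, election_data, group_col, voted_col) {
  # IDs of people recorded as having voted in the election of interest.
  voted_ids <- unique(election_data$ncid[election_data[[voted_col]] == 1])

  # Assumption: count each voter once, even if `users` has repeated rows.
  users <- users[!duplicated(users$ncid), ]

  groups <- sort(unique(users[[group_col]]))
  rates <- sapply(groups, function(g) {
    ids <- users$ncid[users[[group_col]] == g]
    sum(ids %in% voted_ids) / length(ids)
  })
  rates  # named numeric vector, one turnout rate per age group
}

# Toy data (hypothetical values).
toy_users <- data.frame(
  ncid = c("A1", "B2", "C3", "D4"),
  age_group = c("18-24", "18-24", "61+", "61+"),
  stringsAsFactors = FALSE
)
toy_2024 <- data.frame(
  ncid = c("A1", "C3", "D4", "E5"),
  voted_2024 = c(1, 1, 1, 0),
  stringsAsFactors = FALSE
)

toy_rates <- calculate_turnout_by_age(toy_users, toy_2024, "age_group", "voted_2024")
print(toy_rates)
```

In the toy data, one of the two 18-24 users voted (rate 0.5) and both 61+ users voted (rate 1).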

The results from the calculate_turnout_by_age function are combined into a data frame. The age_groups vector contains all unique age groups, and the sapply function is used to apply the calculated turnout rates for SDR and non-SDR users to each age group.

Finally, the table is displayed using kable from the knitr package. This table shows the voter turnout for SDR users and non-SDR users for each age group, as well as the percentage difference in voter turnout between the two groups.

The last part of the code creates a new column, Times_More_Likely, which calculates how many times more likely SDR users were to vote in 2024 compared to non-SDR users. The value is the ratio of SDR turnout to non-SDR turnout for each age group, and it is calculated as a factor instead of a percentage. This part of the code will be used in a line graph.

Voter Turnout by Age Group and SDR Status

| Age Group | SDR Voter Turnout (2024) | Non-SDR Voter Turnout (2024) | % More Likely for 2020 SDR Users to Vote in 2024 |
|:---|:---:|:---:|:---:|
| 18-24 | 69.9% | 13.4% | 421.3% |
| 25-30 | 44.1% | 7.5% | 486.1% |
| 31-40 | 40.4% | 8.2% | 394.4% |
| 41-60 | 35.4% | 9.9% | 256.2% |
| 61+ | 29.4% | 7% | 320.4% |

##(Line Graph Showing Effects of SDR on Voter Turnout Among Different Age Groups)

The line graph shows the relative likelihood of SDR users voting in 2024 compared to non-SDR users across different age groups. It displays how many times more likely SDR users are to vote than non-SDR users, with each point representing a specific age group and the line connecting these points to show the trend.

Using ggplot2, the ggplot() function is called to create the line graph. The aes() function maps the x-axis to Age_Group and the y-axis to Times_More_Likely. A line is drawn using geom_line(), connecting the data points with a light pink color, and individual points are added with geom_point() in dark red. The labs() function adds a title and axis labels, and the theme_minimal() and theme() functions refine the graph’s appearance by using a minimal theme and rotating the x-axis labels for better readability.

The table and line graph show that in every age group, SDR users from 2020 were substantially more likely to vote in 2024 than people who were late registrants in 2020 but were not able to vote. For example, in the 18-24 age group, SDR users had a voter turnout of 69.9%, compared to just 13.4% for late registrants, making them 421.3% more likely to vote. In the 25-30 age group, SDR users had a turnout of 44.1%, compared to 7.5% for late registrants, making them 486.1% more likely to vote, the highest among the groups. The 31-40 age group had 40.4% turnout for SDR users versus 8.2% for late registrants, with SDR users being 394.4% more likely to vote. In the 41-60 age group, SDR users had 35.4% turnout versus 9.9% for late registrants, resulting in a 256.2% higher likelihood. Finally, in the 61+ age group, SDR users had 29.4% turnout compared to 7% for late registrants, making them 320.4% more likely to vote.

While the 25-30 age group showed the greatest difference in voter turnout between SDR users and late registrants, the line graph overall shows a negative trend: younger age groups were more strongly affected by SDR in terms of voter turnout.

##(Conclusion)

Same-day registration (SDR) was associated with substantially higher voter turnout in the following election across all demographic groups I examined. By making voting more accessible, SDR encourages participation, particularly among those who might otherwise miss the traditional registration deadline. Voters who used SDR in the 2020 general election were more likely to return to the polls in the 2024 election compared to those who registered after the deadline but were unable to vote in 2020. The experience of voting once increases the likelihood of future participation, and SDR plays a key role in facilitating this.

Although the effect of SDR on voter turnout was observed across all demographics, it had the most substantial impact on Democratic voters compared to Republicans and Independents. Additionally, SDR boosted Black voter turnout more than White voter turnout, highlighting its potential to reduce barriers for historically underrepresented groups. Lastly, there was a clear trend indicating that SDR had a more pronounced effect on younger voters compared to older ones, emphasizing the importance of making voting easier for younger generations. Overall, the data suggests that SDR can foster a more engaged voting population, ultimately leading to higher participation in future elections.

##(Possible Future Research)

For future research, it would be valuable to explore the long-term effects of same-day registration (SDR) on voter behavior, particularly in terms of how it influences voter turnout over multiple election cycles. Understanding whether the initial boost in turnout provided by SDR leads to sustained political engagement or if its effects diminish over time would help policymakers assess its lasting impact. Additionally, examining how SDR affects turnout in states with varying political environments and demographics would offer insights into whether its effectiveness is influenced by factors such as political polarization, voter access, or regional differences. Furthermore, exploring how SDR interacts with other voting reforms—such as early voting, mail-in ballots, or automatic voter registration—could provide a more comprehensive understanding of how these measures work together to shape overall electoral participation. This research could offer valuable guidance on the combination of reforms that most effectively encourage long-term civic involvement.