Voter Turnout of Naturalized (Foreign Born) Citizens and U.S.-Born Citizens
Research Question: How does voter turnout differ between Naturalized (Foreign Born) citizens and U.S.-Born Citizens?
Introduction.
Naturalized citizens must complete a lengthy application process and pass a civics test, often after taking a citizenship class, on their path toward United States citizenship. In contrast, U.S.-born citizens may have taken civics courses during their education. These different experiences could influence each group's voter turnout.
On the question of who is likely to register and vote, scholars have found that “the odds of registering among naturalized citizens are 36 percent lower and the odds of voting are 26 percent lower than those of native-born citizens. This may be because naturalized citizens in general have not developed strong ties within their communities or do not relate as well as the native born to the issues or candidates” (Bass and Casper 504).
In this project, I will compare voter turnout between naturalized (foreign-born) citizens and U.S.-born citizens across various elections. The focus is: How does voter turnout differ between naturalized (foreign-born) citizens and U.S.-born citizens?
Using the North Carolina Voter Registration Data and Voter History Data, I will examine the 2024 election, the November 2023 election, and the 2018 Midterm election.
This analysis will focus on Wake County, as it is one of the North Carolina counties with the largest foreign-born populations.
I will also focus on voters who registered on selected dates: March 05, 2024; June 28 to July 5, 2024; and September 14 to September 23, 2024. According to U.S. Citizenship and Immigration Services, many naturalization ceremonies took place on or around these dates in 2024.
Null Hypothesis: There is no difference in voter turnout rates between naturalized and U.S.-born citizens.
Alternative Hypothesis: Naturalized citizens have lower voter turnout rates than U.S.-born citizens.
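One way these hypotheses could be tested is a one-sided two-proportion test. The sketch below uses base R's `prop.test()`; the counts are hypothetical placeholders chosen for illustration and are not results from this study.

```r
# Hypothetical counts for illustration only (not results from this study)
voted      <- c(naturalized = 420, us_born = 510)    # voters who cast a ballot
registered <- c(naturalized = 1000, us_born = 1000)  # registered voters in each group

# One-sided test of H0: p_naturalized = p_us_born
# against H1: p_naturalized < p_us_born
test_result <- prop.test(x = voted, n = registered, alternative = "less")

test_result$estimate  # observed turnout proportion in each group
test_result$p.value   # a small p-value would favor the alternative hypothesis
```

Setting `alternative = "less"` matches the directional alternative hypothesis stated above; a two-sided test would be the choice if no direction were assumed.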
Getting Started
Load Necessary Libraries.
library(tidycensus)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.2
✔ ggplot2 3.5.2 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.1.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Attaching package: 'scales'
The following object is masked from 'package:purrr':
discard
The following object is masked from 'package:readr':
col_factor
Rows: 4063003 Columns: 15
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (13): county_desc, voter_reg_num, election_lbl, election_desc, voting_me...
dbl (2): county_id, voted_county_id
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
All voters categorized or labeled as “Removed,” “Inactive,” “Denied,” or “Confidential” were excluded. This ensures the study focuses only on voters labeled as active.
This final filter limits the data-set to voters whose Birth State field is blank or recorded as NA. For now, we will assume these individuals are foreign-born and naturalized.
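The two filters described above could look roughly like the following sketch. The column names `voter_status_desc` and `birth_state`, the status labels, and the toy records are assumptions made for illustration; the actual registration file may use different names or codes.

```r
library(dplyr)

# Toy registration records; column names and values are assumptions
registration <- data.frame(
  voter_reg_num     = c("001", "002", "003", "004", "005"),
  voter_status_desc = c("ACTIVE", "INACTIVE", "ACTIVE", "REMOVED", "ACTIVE"),
  birth_state       = c("NC", NA, NA, "VA", "")
)

# Statuses excluded from the study
excluded_statuses <- c("REMOVED", "INACTIVE", "DENIED", "CONFIDENTIAL")

active_voters <- registration |>
  filter(!voter_status_desc %in% excluded_statuses)

# Blank or NA birth state is treated as a proxy for foreign-born (an assumption)
assumed_foreign_born <- active_voters |>
  filter(is.na(birth_state) | birth_state == "")

# Voters with a recorded birth state are treated as U.S.-born
us_born <- active_voters |>
  filter(!is.na(birth_state), birth_state != "")
```

Checking both `is.na()` and the empty string matters because a blank field can be read in as either one, depending on how the file is imported.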
Merging Data-sets
FBVOTER_REGISTR_DATA <- FB_VOTER_Registr |>
  left_join(WAKE_VOTER_HIST, by = "voter_reg_num")
This step merges the Voter History File with the filtered Voter Registration Data. The two data-sets are joined on voter_reg_num, which serves as a unique identifier for each voter in Wake County, North Carolina.
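To illustrate what the merge produces, here is a small sketch with toy tables. It assumes the history file holds one row per voter per election, so the table names and values below are hypothetical stand-ins rather than the real files.

```r
library(dplyr)

# Toy stand-ins for the registration and history tables
registration <- data.frame(
  voter_reg_num = c("001", "002", "003"),
  registr_dt    = c("03/05/2024", "07/01/2024", "09/17/2024")
)
history <- data.frame(
  voter_reg_num = c("001", "001", "003"),
  election_lbl  = c("11/06/2018", "11/05/2024", "11/05/2024")
)

# left_join keeps every registered voter: a voter with several elections
# appears once per election, and a voter with no history gets NA
merged <- registration |>
  left_join(history, by = "voter_reg_num")

nrow(merged)                     # 4 rows: two for "001", one each for "002" and "003"
sum(is.na(merged$election_lbl))  # 1 voter ("002") has no recorded vote
```

Because a left join keeps voters who never voted, they remain in the denominator when turnout is computed later.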
FBVOTER_REGISTR_DATA <- FBVOTER_REGISTR_DATA |>
  mutate(election_lbl = as.Date(election_lbl, format = "%m/%d/%Y")) |>
  mutate(registr_dt = as.Date(registr_dt, format = "%m/%d/%Y"))
USBORN_VOTER_Registr <- USBORN_VOTER_Registr |>
  mutate(election_lbl = as.Date(election_lbl, format = "%m/%d/%Y")) |>
  mutate(registr_dt = as.Date(registr_dt, format = "%m/%d/%Y"))
Vital Step: Determine which days in 2024 had the most new voter registrations, and whether registration dates can help identify which individuals with NA in the Birth State variable are naturalized citizens.
This was done to determine whether naturalized individuals can be distinguished among those with Birth State recorded as NA. It revealed that identifying naturalized citizens based solely on registration date is rather difficult. There are noticeable peaks in March, October, and November. The October and November peaks may be due to voter registration deadlines. The March peak, however, may be related to naturalization ceremonies, as USCIS records indicate that March had the most naturalizations of any month, with 77,200.
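The daily registration counts behind this comparison could be computed along the following lines. This is a minimal sketch on toy data; `registr_dt` follows the column used earlier, while the `registrations` table and the `n_registered` column name are hypothetical.

```r
library(dplyr)

# Toy registration dates (hypothetical values)
registrations <- data.frame(
  voter_reg_num = c("001", "002", "003", "004", "005"),
  registr_dt    = as.Date(c("2024-03-05", "2024-03-05", "2024-03-05",
                            "2024-10-11", "2024-07-01"))
)

# Count registrations per day in 2024 and sort the busiest days first
daily_counts <- registrations |>
  filter(registr_dt >= as.Date("2024-01-01"),
         registr_dt <= as.Date("2024-12-31")) |>
  count(registr_dt, name = "n_registered") |>
  arrange(desc(n_registered))

head(daily_counts, 3)
```

Sorting by count makes it easy to check whether the busiest days line up with known ceremony dates or with registration deadlines.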
Selecting Dates To Determine Naturalized Citizens
According to the USCIS, many naturalization ceremonies take place in late June and early July, around Independence Day, and from September 14 to September 23, around Citizenship Week and September 17, which is both Citizenship Day and Constitution Day. The graph also showed a high number of registrations in March, especially on March 05, 2024. I will select these dates and compare voter turnout between U.S.-born and foreign-born/naturalized citizens who registered on them.
# Define the date ranges
ceremony_dates <- as.Date(c("2024-03-05")) # single date
independence_dates <- seq(as.Date("2024-06-28"), as.Date("2024-07-05"), by = "day")
constitution_dates <- seq(as.Date("2024-09-14"), as.Date("2024-09-23"), by = "day")

# Combine all dates into a single vector
selected_dates <- c(ceremony_dates, independence_dates, constitution_dates)

# Filter datasets
FBVOTER_filtered <- FBVOTER_REGISTR_DATA |>
  filter(registr_dt %in% selected_dates)
USBORN_filtered <- USBORN_VOTER_Registr |>
  filter(registr_dt %in% selected_dates)

# Create new variable
FBVOTER_filtered <- FBVOTER_filtered %>%
  mutate(group = "Foreign")
USBORN_filtered <- USBORN_filtered %>%
  mutate(group = "USborn")

# Combine into one dataset
all_filtered <- bind_rows(FBVOTER_filtered, USBORN_filtered)
Seeing if both Data-sets are Balanced
This shows that the two groups are nearly balanced with respect to age.
# A tibble: 3 × 3
Statistic Foreign USborn
<chr> <chr> <chr>
1 F 38.9% 57.3%
2 M 28.8% 42.6%
3 U 32.3% 0.1%
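This shows that gender is imbalanced between the two data-sets, most notably in the share of voters with unknown gender (32.3% of the foreign group versus 0.1% of the U.S.-born group), which makes a direct comparison of voter turnout less reliable.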
# A tibble: 8 × 3
Race Foreign USborn
<chr> <chr> <chr>
1 Asian 3.3% 2.7%
2 Black 15.4% 22.9%
3 Middle Eastern 0.8% 0.9%
4 Native American / Indigenous 0.3% 0.1%
5 Other 5% 7.7%
6 Pacific Islander 0% <NA>
7 Unknown 24.4% 0.8%
8 White 50.8% 64.8%
Race is also unbalanced between the two groups. To address this, I will try to balance the groups through matching.
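Balance tables like the ones above can be built by computing, within each group, the percentage of voters in each category. The sketch below uses base R on toy data; the `voters` table and its values are hypothetical.

```r
# Toy data: group membership and gender for eight voters (hypothetical values)
voters <- data.frame(
  group  = c("Foreign", "Foreign", "Foreign", "Foreign",
             "USborn", "USborn", "USborn", "USborn"),
  gender = c("F", "M", "U", "U", "F", "F", "M", "M")
)

# Cross-tabulate gender by group, then convert counts to column percentages
# so that each group's column sums to 100
counts  <- table(voters$gender, voters$group)
balance <- round(100 * prop.table(counts, margin = 2), 1)

balance
```

Using `margin = 2` normalizes within each group rather than across the whole sample, which is what makes the two columns directly comparable.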
Matching
Warning: Fewer control units than treated units in some `exact` strata; not all
treated units will get a match.
After matching, the results indicate that the two groups are balanced. To achieve this balance, I needed to create an age-group variable, which reduced errors and imbalance in the data.
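An age-group variable of this kind can be created with base R's `cut()`. In the sketch below, only the 18-23 band comes from this report; the remaining break points and labels are assumptions chosen for illustration.

```r
# Toy ages (hypothetical values)
ages <- c(18, 23, 24, 35, 50, 67, 90)

# Bin ages into groups; only the 18-23 band is taken from the report,
# the other break points are illustrative assumptions
age_group <- cut(
  ages,
  breaks = c(18, 24, 35, 50, 65, Inf),
  labels = c("18-23", "24-34", "35-49", "50-64", "65+"),
  right  = FALSE  # intervals are [lower, upper), so age 24 falls in "24-34"
)

table(age_group)
```

Matching exactly on a handful of age bands instead of exact age leaves more voters in each stratum, which should make it easier to find a U.S.-born match for each foreign-born voter.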
Results Using Only Individuals Registered on the Selected Dates
These results indicate that foreign-born/naturalized citizens had a lower voter turnout percentage than U.S.-born citizens. Note, however, that this is based on a limited sample.
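A turnout table with one percentage per group and election could be computed roughly as follows. The names `turnout_summary`, `election_lbl`, `group`, and `turnout_pct`, and the coding of `group` as 1 for foreign-born, follow the plotting code in this report; the `matched_votes` table, the `voted` indicator, and all values are hypothetical.

```r
library(dplyr)

# Toy matched sample: one row per voter per election (hypothetical values)
# group: 1 = foreign-born/naturalized, 0 = U.S.-born
# voted: 1 if a ballot was cast in that election, 0 otherwise
matched_votes <- data.frame(
  group        = c(1, 1, 1, 1, 0, 0, 0, 0),
  election_lbl = as.Date(rep(c("2024-11-05", "2023-11-07"), times = 4)),
  voted        = c(1, 0, 0, 1, 1, 0, 1, 0)
)

# Turnout percentage = share of voters in each group who cast a ballot
turnout_summary <- matched_votes |>
  group_by(election_lbl, group) |>
  summarise(
    n_voters    = n(),
    turnout_pct = 100 * mean(voted),
    .groups     = "drop"
  )

turnout_summary
```

Keeping `n_voters` next to each percentage makes it clear how small the sample behind each bar is.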
# 2023 Municipal
municipal_2023 <- turnout_summary %>%
  filter(election_lbl == as.Date("2023-11-07"))
municipal_2023$group_label <- ifelse(municipal_2023$group == 1, "Foreign", "US-born")

ggplot(municipal_2023, aes(x = group_label, y = turnout_pct, fill = group_label)) +
  geom_col(width = 0.6, show.legend = FALSE) + # nicer bar width, remove legend
  geom_text(aes(label = paste0(round(turnout_pct, 1), "%")), # add % labels on top
            vjust = -0.5, size = 5) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) + # give space for labels
  scale_fill_manual(values = c("US-born" = "#1f78b4", "Foreign" = "#33a02c")) + # distinct colors
  labs(title = "Voter Turnout: 2023 Municipal Election",
       x = "",
       y = "Turnout (%)") +
  theme_minimal(base_size = 14) +
  theme(axis.text.x = element_text(face = "bold"),
        axis.title.y = element_text(face = "bold"),
        plot.title = element_text(face = "bold", hjust = 0.5))
However, in the 2018 Midterm Election and the 2023 Municipal (local) Election, U.S.-born citizens had lower voter turnout than naturalized citizens. This may suggest that U.S.-born citizens are less aware of these elections or participate in them less. Again, though, this comes from a limited sample, and the difference between the groups is small.
This breaks down the results from the previous graphs by gender. U.S.-born women had a higher voter turnout percentage than U.S.-born men in the 2018 and 2023 elections, but not in the 2024 election. Among foreign-born (naturalized) citizens, women likewise had higher voter turnout than men in 2018 and 2023, but not in 2024.
This shows that in the youngest age group, 18-23, foreign-born/naturalized citizens had higher voter turnout than U.S.-born citizens. In most other age groups, however, turnout was higher among U.S.-born citizens than among foreign-born/naturalized citizens, which may suggest greater civic engagement in that group.
Review
After conducting this study, I learned that in some cases, such as the 18-23 age group, foreign-born/naturalized citizens had higher voter turnout in the 2024 general election, even though their overall turnout in that election was lower than that of U.S.-born citizens. I would like to examine previous years to determine whether this is a consistent trend. In the 2018 midterm and 2023 municipal elections, by contrast, naturalized citizens in this sample turned out at slightly higher rates than U.S.-born citizens. The gender analysis also shows which gender in each group had higher voter turnout in each election.
Things I would do differently:
I would examine additional previous years to determine whether these trends persist over time. Additionally, I would seek more accurate information to verify whether voters with a birth state recorded as NA are truly foreign-born or naturalized citizens, rather than relying on selected registration dates. Finally, I would conduct the study on a much larger sample, as this analysis focused only on Wake County and required reducing the dataset to ensure balanced data between the two groups. I would also like to analyze and compare voter turnout across states, considering whether a state leans Democratic (blue) or Republican (red), to assess whether this results in different turnout patterns between naturalized and U.S.-born citizens.
References: Bass, Loretta E., and Lynne M. Casper. “Differences in Registering and Voting between Native-Born and Naturalized Americans.” Population Research and Policy Review, vol. 20, no. 6, 2001, pp. 483–511. JSTOR, http://www.jstor.org/stable/40230326. Accessed 13 Dec. 2025.