Party Switching and Primary Election Participation

A look at the relationship between party-switching and primary election participation in North Carolina from 2012-2018.

Project Description: This project uses North Carolina’s voter registration and voter history files, as well as voter snapshot files from the May 8, 2012 and May 6, 2014 primary elections. Combined, these files allowed me to look at party-switching behavior among voters between the May 2012 and May 2014 primary elections, and then pair that data with voters’ participation in subsequent primary elections. This topic interested me because of its potential to expose trends or correlations between party-switching and participation in primary elections. Furthermore, the project aims to explore which age groups or party affiliations, if any, are more or less inclined to engage in party-switching.

Some Notes/Limitations: During my analysis, I ran into several memory and processing issues. I used the resources I had available to limit the effect of these issues on my project, but some decisions were made with them in mind. For example, importing older snapshot files (2012 and 2014) was more efficient than importing more recent ones (perhaps due to file size).

Literature

Past literature seems to focus more on the behavior of party-switching as an indication of partisan politics. There is not much to strongly indicate causes for party-switching. Still, topics discussed in past literature are relevant to an analysis on party-switching and participation.

Smidt (2017): This article by Corwin D. Smidt of Michigan State University looks at how partisans vote (it focuses less on official party-switching and more on whether the vote cast aligns with party affiliation). Smidt concludes that Americans currently exhibit the highest observed rates of party allegiance when voting across successive presidential elections and attributes this to clear and easy recognition of party differences. With clearly polarized choices, Americans face little conflict or concern in picking a side and sticking to it. This could point away from the possibility that those who do switch parties usually do so because of genuine feelings of alignment with the opposing party’s values. https://www.jstor.org/stable/26384737

Bock & Schnabel (2021): This short analysis by Harvard and Cornell University students identifies shifts in party affiliation from 2016-2020. It discusses trends in self-reported partisan identification, and concludes that despite a downward trend in voters engaging in party-switching overall, there is still a significant minority who do realign. https://journals.sagepub.com/doi/epub/10.1177/23780231211057322

Hypothesis

H1: I anticipate finding a positive correlation between party-switching and participation in subsequent primary elections among North Carolina voters.

H2: I expect to find higher rates of party-switching among the younger population.

H3: I expect to find that voters whose original party affiliation is independent/unaffiliated engage in higher rates of party-switching than those who originally identify as Republican or Democrat.

Project Outline

Primary Election Participation - Hypothesis 1

1. Creating Data Set

Summary: I created my final data set by joining (using an inner join) the May 2012 and May 2014 voter snapshot files, with the voter registration number as the common identifier. This gave me a data set with voters’ party affiliations in 2012 and 2014. I then used the mutate() function to add a column indicating whether there was a party switch within that two-year period. At this point, I joined my data with the North Carolina voter history file, which let me see participation in primary elections over the past ten years once I applied the filter() function to keep primary elections only. I chose to further narrow the data to the two primary elections that followed (March 2016 and May 2018) for relevancy. I then created a new column for each of the March 2016 and May 2018 elections with a true/false value to indicate whether the voter participated.

Note: Pre-processing each data set included grouping the data by voter registration number and using the distinct() function to remove duplicate entries created by joining the May files, and then again by joining the voter history file.

mayjoin <- inner_join(may2012, may2014, by = "voter_reg_num")

mayjoin <- mayjoin %>% mutate(party_changes = ifelse(party_cd.x == party_cd.y, 1, 2))
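A minimal sketch of the join-and-flag step on toy data may make the logic easier to follow. The two small data frames and all of their values are invented for illustration; only the column names voter_reg_num and party_cd follow the snapshot files used above.

```r
library(dplyr)

# Hypothetical miniature snapshots (values invented for illustration)
snap2012 <- data.frame(voter_reg_num = c("001", "002", "003"),
                       party_cd = c("DEM", "UNA", "REP"))
snap2014 <- data.frame(voter_reg_num = c("001", "002", "004"),
                       party_cd = c("DEM", "REP", "UNA"))

# inner_join() keeps only voters present in both snapshots; the shared
# party_cd column is suffixed .x (2012) and .y (2014)
joined <- inner_join(snap2012, snap2014, by = "voter_reg_num") %>%
  # 1 = same party in both years, 2 = party switch
  mutate(party_changes = ifelse(party_cd.x == party_cd.y, 1, 2))

joined
```

In this toy case voters 003 and 004 drop out because each appears in only one snapshot, which mirrors how an inner join excludes voters who registered after 2012 or left the rolls before 2014.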

Here I loaded in voter history files

finaldata <- inner_join(mayjoin, swhist, by = "voter_reg_num")

finaldata <- filter(finaldata, grepl("primary", election_desc, ignore.case = TRUE))

mar2016 <- "03/15/2016 PRIMARY"
# Flag each voter as TRUE if any of their history rows is the March 2016 primary
finaldata <- finaldata %>% group_by(voter_reg_num) %>% mutate(mar2016 = any(election_desc == mar2016))

may2018 <- "05/08/2018 PRIMARY"
# Flag each voter as TRUE if any of their history rows is the May 2018 primary
finaldata <- finaldata %>% group_by(voter_reg_num) %>% mutate(may2018 = any(election_desc == may2018))
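The per-voter flagging can be sketched on toy history rows. The data frame hist and its values are hypothetical, and summarise() is swapped in for mutate() here so that the result has exactly one row per voter.

```r
library(dplyr)

# Hypothetical voter history: one row per voter per primary voted in
hist <- data.frame(
  voter_reg_num = c("001", "001", "002", "003"),
  election_desc = c("03/15/2016 PRIMARY", "05/08/2018 PRIMARY",
                    "05/08/2018 PRIMARY", "05/06/2014 PRIMARY")
)

flags <- hist %>%
  group_by(voter_reg_num) %>%
  # any() is TRUE if at least one of the voter's rows matches the election
  summarise(voted_2016 = any(election_desc == "03/15/2016 PRIMARY"),
            voted_2018 = any(election_desc == "05/08/2018 PRIMARY"))

flags
```

One consequence of building flags from history rows is worth keeping in mind: a voter with no primary rows at all never appears in the joined data, so participation rates computed this way describe voters who took part in at least one primary.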

I created subsets for those who switched parties and those who did not. This made it easier when I started to create visuals for the data.

noswitchdata <- subset(finaldata, party_changes == 1) %>% group_by(voter_reg_num) 

switchdata <- subset(finaldata, party_changes == 2) %>% group_by(voter_reg_num)

I used the following code to count voters who participated in each election based on whether they had switched party affiliation or not. I did this to identify potential trends before graphing data. Note: I repeated this for both subsets of data and for both elections (4 total).
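A minimal sketch of such a count, assuming a one-row-per-voter data frame with the party_changes and mar2016 columns described above. The toy values and the names voters and participation are hypothetical.

```r
library(dplyr)

# Hypothetical one-row-per-voter data (values invented for illustration)
voters <- data.frame(
  party_changes = c(1, 1, 1, 2, 2),
  mar2016 = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

participation <- voters %>%
  count(party_changes, mar2016) %>%   # tally each switch/participation pair
  group_by(party_changes) %>%
  mutate(share = n / sum(n)) %>%      # share within each switch group
  ungroup()

participation
```

Computing the share inside each switch group, rather than over the whole data set, is what makes the switcher and non-switcher rates directly comparable.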

Findings:

March 2016 participation among those who did NOT switch parties between 2012 and 2014: 93.5%

## Did not participate: 34116 
##  Did participate: 491390

March 2016 participation among those who switched parties between 2012 and 2014: 92.1%

## Did not participate: 4460 
##  Did participate: 52149

May 2018 participation among those who did NOT switch parties: 56.5%

## Did not participate: 228773 
##  Did participate: 296733

May 2018 participation among those who switched parties: 48.9%

## Did not participate: 28928 
##  Did participate: 27681

Summary: In the 2016 presidential primary, participation rates were nearly identical for voters who switched parties between 2012 and 2014 and those who did not (about 92% versus 93.5%). In the 2018 primary the gap was wider: about 56% of voters who did NOT switch parties participated, compared to about 49% of those who did switch. In both elections, then, those who did not engage in party-switching participated at somewhat higher rates than those who did. Neither election shows the large difference I expected, and the direction of the 2018 gap runs against my original hypothesis. If anything, it suggests that strong partisans participate in primary elections at higher rates.
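One way to check these differences more formally is a two-sample test of proportions on the reported counts. The sketch below uses base R's prop.test() and treats each count as a set of independent voters, which is an assumption about how the tallies were produced.

```r
# Counts reported above, ordered as non-switchers, switchers
voted_2016 <- c(491390, 52149)
voted_2018 <- c(296733, 27681)
group_size <- c(491390 + 34116, 52149 + 4460)

# Participation rates in each group
rates_2016 <- voted_2016 / group_size
rates_2018 <- voted_2018 / group_size

# Two-sample tests of equal proportions (stats package, loaded by default)
test_2016 <- prop.test(voted_2016, group_size)
test_2018 <- prop.test(voted_2018, group_size)

round(100 * rates_2016, 1)
round(100 * rates_2018, 1)
test_2016$p.value
test_2018$p.value
```

With groups of this size, even a gap of a point or two can produce a very small p-value, so the size of the difference is worth reading alongside the test result.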

2. Visualizing Data

Note: With only small differences in the data, it was hard to decide on appropriate visualization methods. Below, I used simple bar graphs to show the percentages of voters who did and did not participate in each election, based on party-switching behavior.

This code creates a bar graph with the subsetted data for those who did not switch party-affiliation to display their participation in the 2016 primary. I used geom_text() to add percentage values to each x-axis variable, labs() for the title and axis labels, and scale_y_continuous() to change the y-axis from scientific notation to exact numbers for clear visualization.

library(ggplot2)
# Each `+` must end a line; a line that starts with `+` is parsed as a new statement
ggplot(noswitchdata, aes(x = mar2016, fill = mar2016)) +
  geom_bar(color = "black") +
  geom_text(stat = "count", aes(label = paste0(round(after_stat(count) / sum(after_stat(count)) * 100, 1), "%")), position = position_stack(vjust = 0.5), size = 3) +
  labs(title = "March 2016 Presidential Primary Participation Among non-Party-Switchers", subtitle = "Voters who did not switch party-affiliation in-between the previous two primary elections", x = "Participation", y = "Count", fill = "Key") +
  theme_minimal() +
  scale_y_continuous(labels = scales::number_format())

Note: Here I will provide code for each subset, but not for each election (to avoid including almost the same code 4x)

ggplot(switchdata, aes(x = mar2016, fill = mar2016)) +
  geom_bar(color = "black") +
  geom_text(stat = "count", aes(label = paste0(round(after_stat(count) / sum(after_stat(count)) * 100, 1), "%")), position = position_stack(vjust = 0.5), size = 3) +
  labs(title = "March 2016 Presidential Primary Participation Among Party-Switchers", subtitle = "Voters who switched party-affiliation in-between the previous two primary elections", x = "Participation", y = "Count", fill = "Key") +
  theme_minimal() +
  scale_y_continuous(labels = scales::number_format())
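As an alternative to four near-identical plots, the same information could be drawn as one faceted figure. This is only a sketch: it rebuilds a small summary table from the counts reported in the findings instead of using the voter-level data, and the names plot_data and p are hypothetical.

```r
library(ggplot2)

# Counts reported in the findings above
plot_data <- data.frame(
  election = rep(c("March 2016", "May 2018"), each = 4),
  group = rep(rep(c("No switch", "Switch"), each = 2), times = 2),
  participated = rep(c(FALSE, TRUE), times = 4),
  n = c(34116, 491390, 4460, 52149, 228773, 296733, 28928, 27681)
)

# Share of each bar within its election-by-group panel
plot_data$share <- plot_data$n /
  ave(plot_data$n, plot_data$election, plot_data$group, FUN = sum)

p <- ggplot(plot_data, aes(x = participated, y = n, fill = participated)) +
  geom_col(color = "black") +
  geom_text(aes(label = scales::percent(share, accuracy = 0.1)),
            position = position_stack(vjust = 0.5), size = 3) +
  facet_grid(election ~ group) +   # one panel per election and switch group
  labs(x = "Participation", y = "Count", fill = "Key") +
  theme_minimal()

p
```

Placing the panels side by side puts switchers and non-switchers on a shared y-axis within each election, which makes the difference in group sizes visible as well as the difference in rates.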

Resulting Graphs

In the graphs below, “true” represents voters who did participate, and “false” represents those who did not.

1. March 2016, No Party-Switch

2. March 2016, Party-Switch

3. May 2018, No Party-Switch

4. May 2018, Party-Switch

3. H1 Conclusion

The data and graphs above display little to no positive correlation between party-switching and primary election participation among North Carolina voters between 2012 and 2018. My original hypothesis, that there would be a positive correlation between party-switching and participation in subsequent primary elections, is not supported thus far. In 2016, there was little difference in participation rates between those who swapped parties and those who did not. In 2018, there was a gap of roughly 7 percentage points (56.5% versus about 49%), but the higher rate belonged to the voters who did not engage in party-switching, suggesting higher rates of political engagement overall among stronger partisans. It is also worth noting the large difference in overall participation between the 2016 presidential primary and the 2018 primaries. This could potentially be attributed to voters’ higher rates of political engagement during a presidential election year.

Continuation: Up until this point I have only tested my original hypothesis. The remainder of my project looks into relationships between party-switching and demographics, including age, race, and gender, to identify whether any particular demographic is more or less likely to engage in party-switching than its counterparts. Originally, I expected to find a positive correlation between party-switching and primary election participation, and then use trends in voter demographics to gain further insight into potential explanations for why people switch parties. Instead, I will shift my focus away from election participation in general, and zero in on identifying which demographics, if any, engage in party-switching at higher rates.

Demographic Trends - Hypotheses 2 and 3

To test my second and third hypotheses by evaluating trends in party-switching among certain demographics, I used left_join() to combine the data set from my first hypothesis with the North Carolina voter registration file. This added demographic information, including age, race, and gender, to my existing data set.

demodataset <- left_join(finaldata, votereg, by = "voter_reg_num")

Note: I used the count() function to count the values in the “party_changes” column in both my original data set and the data set that resulted from joining the voter registration data and subsequent preprocessing (analyzing/eliminating duplicates and missing values) to ensure that they matched and I was continuing to analyze the same set of voters.
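A minimal sketch of this kind of before/after check, on toy data. The data frames before, registration, and after, and all of their values, are hypothetical stand-ins for the real files.

```r
library(dplyr)

# Hypothetical data sets (values invented for illustration)
before <- data.frame(voter_reg_num = c("001", "002", "003"),
                     party_changes = c(1, 2, 1))
registration <- data.frame(voter_reg_num = c("001", "002", "002", "003"),
                           race_code = c("W", "B", "B", "A"))

# A duplicated key in the right-hand table duplicates rows after the join,
# so duplicates are dropped before the tallies are compared
after <- left_join(before, registration, by = "voter_reg_num") %>%
  distinct(voter_reg_num, .keep_all = TRUE)

# The switch tallies should be identical before and after the join
same_tallies <- identical(count(before, party_changes)$n,
                          count(after, party_changes)$n)
same_tallies
```

If the tallies differ, the usual culprits are duplicated keys in the joined table or rows lost when missing values are removed.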

Age

To calculate which voters switch party affiliation at the highest rate based on age, I began with my “demodataset”, which consisted of my original data set and the added demographic information. I used mutate() to create an additional column with age ranges, which I established using the cut() function. I then used group_by() and summarise() to group the data by the previously established age ranges and the party_changes variable (which uses 1 to indicate no party switch and 2 to indicate a party switch) and count each combination. Finally, I added a column using mutate() that calculates the switch rate for each age range with the following formula: number of individuals who switched parties / total number of individuals in the age range. The exact code follows, along with the resulting data set, which I use to graph the data later on.

# right = FALSE makes the bins [18,30), [30,45), ... so that they match the labels
agedata <- demodataset %>% mutate(age_range = cut(age_at_year_end, breaks = c(18, 30, 45, 60, 75, Inf), labels = c("18-29", "30-44", "45-59", "60-74", "75+"), right = FALSE))

agedata <- agedata %>% group_by(age_range, party_changes) %>% summarise(count = n()) %>% pivot_wider(names_from = party_changes, values_from = count, values_fill = 0) %>% mutate(switch_rate = `2`/(`1` + `2`))

Age Range	No Switch	Switch	Switch Rate
18-29	59869	9821	0.1409
30-44	95924	13936	0.1269
45-59	128572	12918	0.0913
60-74	146414	12459	0.0784
75+	91758	7140	0.0722
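Because cut() closes intervals on the right unless right = FALSE is set, boundary ages deserve a quick check. The sketch below uses the same breaks and labels as the age ranges above, applied to a few hypothetical ages.

```r
ages <- c(18, 29, 30, 44, 45, 75, 90)   # hypothetical boundary ages
breaks <- c(18, 30, 45, 60, 75, Inf)
labels <- c("18-29", "30-44", "45-59", "60-74", "75+")

# Default (right = TRUE): intervals are (18,30], (30,45], ...
default_bins <- cut(ages, breaks = breaks, labels = labels)

# right = FALSE: intervals are [18,30), [30,45), ..., matching the labels
left_closed_bins <- cut(ages, breaks = breaks, labels = labels, right = FALSE)

data.frame(age = ages, default = default_bins, left_closed = left_closed_bins)
```

Under the default, an 18-year-old falls outside every bin and becomes NA, while a 30-year-old lands in the "18-29" bin; the left-closed version assigns each age to the range its label describes.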

The table (and the graph below) shows that the 18-29 age range has the highest rate of party-switching at 14%. This means that of the voters between the ages of 18 and 29 in the data set, 14% engaged in party-switching. As age increases, the percentage of individuals who switched parties decreases, all the way through to ages 75+, which has the lowest rate of switching at 7.2%. The data support my second hypothesis: the younger population engages in higher rates of party-switching than the older population.

Below is the code used to create the graph for this data, and the graph itself. The code uses geom_bar() to create a bar graph with age range values on the x-axis and switch rates on the y-axis. Geom_text() labels each bar with the exact percentages, aes(fill = age_range) specifies the fill color for each bar (unique to each x-axis value), and labs() creates labels for each component on the graph for easy-visualization.

ggplot(agedata, aes(x = age_range, y = switch_rate, fill = age_range)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = scales::percent(switch_rate)), position = position_dodge(width = 0.9), vjust = -0.5) +
  labs(title = "Switch Rates by Age Range", subtitle = "The percentage of each age-range who switched party-affiliations in between 2012 and 2014 in North Carolina", x = "Age Range", y = "Switch Rate") +
  theme_minimal()

Looking at this graph, it is easy to see that the rate at which voters switch party affiliation decreases as age increases, which supports hypothesis 2.

Race

To calculate switch rates based on race, I used a similar strategy to the one described under “Age”, minus the step of identifying age ranges. Below is the code used to create a data set with switch rates for each unique value under the “race_code” variable, as well as the resulting data frame.

racedata <- demodataset %>% group_by(race_code, party_changes) %>% summarise(count = n()) %>% pivot_wider(names_from = party_changes, values_from = count, values_fill = 0) %>% mutate(switch_rate = `2`/(`1`+`2`))

Race	No Switch	Switch	Switch Rate
A	3939	779	0.1651
B	94594	9096	0.0877
I	3416	323	0.0864
M	2070	304	0.1281
O	11130	1757	0.1363
P	18	3	0.1429
U	26936	4372	0.1396
W	383366	39975	0.0944
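The switch rates can be recomputed directly from the counts in the table, which also makes the size of each group easy to see. This is a base R sketch, and the data frame name race is hypothetical.

```r
# Counts copied from the table above
race <- data.frame(
  race_code = c("A", "B", "I", "M", "O", "P", "U", "W"),
  no_switch = c(3939, 94594, 3416, 2070, 11130, 18, 26936, 383366),
  switch    = c(779, 9096, 323, 304, 1757, 3, 4372, 39975)
)

# Switch rate = switchers / all voters in the group
race$group_size <- race$no_switch + race$switch
race$switch_rate <- race$switch / race$group_size

# Sort from highest to lowest switch rate
race[order(-race$switch_rate), ]
```

Group size matters when reading these rates: group P contains only 21 voters, so its rate rests on far less data than the rate for group W.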

I used the data frame above and the code below to graph the data similar to how the age data was graphed.

ggplot(racedata, aes(x = race_code, y = switch_rate, fill = race_code)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = scales::percent(switch_rate)), position = position_dodge(width = 0.9), vjust = -0.5) +
  labs(title = "Switch Rates by Race", subtitle = "The percentage of each racial group who switched party-affiliations in between 2012 and 2014 in North Carolina", x = "Race", y = "Switch Rate", fill = "Race") +
  theme_minimal()

No particular hypothesis was tested here, but the graph above shows that voters recorded as Asian (A) changed party affiliation at the highest rate, while American Indian (I) and Black (B) voters switched at the lowest rates, with White (W) voters also switching at a low rate.

Party switching based on original affiliation

Aside from demographic trends, I was interested in original party affiliation (in this case, party affiliation in 2012) and the rate at which each group swapped parties between 2012 and 2014. It should be noted that this only shows trends within this particular two-year window. If voters switched their party affiliation before 2012, the earliest data used in this project, then the 2012 affiliation is not truly their original one, and the results are not representative of overall trends in switching away from an original affiliation. The results might be indicative of an overall trend, but confirming that would require analysis across a longer time frame (this is a point where further research could be suggested).

Below is the code used to create the data set for analyzing switch rates across party affiliations at point A (2012). The results will display the rates at which each original party affiliation switched in between 2012 and 2014. Here I can gain evidence for/against my third hypothesis: those who originally identified as “unaffiliated” will switch party-affiliation at higher rates than those who identify with a party. This would support the theory that voters use “unaffiliated” as a placeholder rather than a permanent identification.

partyaffdata <- demodataset %>% group_by(party_cd.x) %>% summarise(total_count = n(), switch_rate = sum(party_changes == 2)/total_count) %>% arrange(desc(switch_rate))
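The same per-party summary can be sketched on toy data. The data frame toy and its values are hypothetical, and mean() is used here as an equivalent way to write the share of switchers.

```r
library(dplyr)

# Hypothetical data (values invented for illustration)
toy <- data.frame(
  party_cd.x = c("DEM", "DEM", "REP", "UNA", "UNA", "UNA"),
  party_changes = c(1, 2, 1, 2, 2, 1)
)

by_origin <- toy %>%
  group_by(party_cd.x) %>%
  summarise(total_count = n(),
            # mean() of a logical vector is the share of TRUE values
            switch_rate = mean(party_changes == 2)) %>%
  arrange(desc(switch_rate))

by_origin
```

Keeping total_count next to switch_rate is useful because a party with few registered voters can show a high rate from only a handful of switches.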

The graph below shows that between 2012 and 2014, registered Libertarians (party code LIB) in North Carolina switched party affiliation at a rate nearly double that of Democrats, Republicans, or unaffiliated voters. This does not provide support for hypothesis 3, but it could be used to make inferences about the political environment at the time that might explain why Libertarians switched at such a high rate. Alternatively, further analysis might show that this is a typical trend across time.

ggplot(partyaffdata, aes(x = party_cd.x, y = switch_rate, fill = party_cd.x)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = scales::percent(switch_rate)), position = position_dodge(width = 0.9), vjust = -0.5) +
  labs(title = "Switch Rates by 'Original' Party Affiliation", subtitle = "The percentage of each original party identification (id in 2012) who switched party-affiliations in between 2012 and 2014 in North Carolina", x = "Original ID", y = "Switch Rate", fill = "Original ID") +
  theme_minimal()

Final Conclusion/Suggestions for Further Research

Conclusion

Primary Election Participation (Hypothesis 1): Contrary to the initial hypothesis, there was little to no positive correlation between party-switching and subsequent primary election participation. Participation rates in the 2016 primary were nearly identical for switchers and non-switchers, and in the 2018 primary it was the non-switchers who participated at a somewhat higher rate, which runs against the hypothesized direction.

Age/Demographics (Hypothesis 2): The analysis revealed a clear trend in party-switching rates based on age. Younger individuals, aged 18-29, exhibited the highest rate of party-switching at 14%, while the rate decreased with increasing age groups, reaching 7.2% for individuals aged 75 and above. This finding supported the second hypothesis that younger populations are more likely to engage in party-switching.

Original Party Affiliation (Hypothesis 3): The analysis of party-switching rates based on original party affiliation (in 2012) revealed that Libertarians had the highest switching rate, nearly double that of Democrats, Republicans, and the unaffiliated. This result did not support the hypothesis that voters with an initial identification as “unaffiliated” would switch parties at higher rates.

Discussion/Suggestions

While this data analysis only scratched the surface of overall trends in party-switching by analyzing a subset of voters in North Carolina, further research endeavors could contribute to a more comprehensive understanding of party-switching dynamics and incentives for switching. Extending the analysis to cover a longer time frame to identify any temporal trends in party-switching behavior could provide a more complete understanding of how political dynamics and voter behaviors evolve over time. Specifically, this project could be continued on from 2014 to 2022 for a greater understanding of party-switching behavior in the most recent decade. Additionally, it might be interesting to investigate external factors, such as major political events that might influence party-switching behavior. Understanding contextual elements could offer a more nuanced interpretation of observed trends.

Ultimately, party-switching is a nuanced topic to study, with various external sociopolitical factors that may influence why an individual chooses to change parties at a given point in time. Predicting electoral participation from party-switching, or identifying demographic trends in the behavior, cannot by itself provide a solid answer to why people switch parties. Still, identifying trends in voter behavior could be applied to predicting electoral outcomes: understanding how and why voters change affiliations may help in predicting potential shifts in voting patterns, which can be valuable for political strategists and pollsters. Identifying these trends may also provide insight into how specific subgroups engage with the political environment, which is essential for tailoring messages and campaigns to diverse audiences.

Final Project Rpubs

Kaylee Dion

2023-12-03