Thomson-DeVeaux A, Mithani J, Bronner L. Why Millions Of Americans Don’t Vote. FiveThirtyEight. Published online October 26, 2020. Accessed September 1, 2023. https://projects.fivethirtyeight.com/non-voters-poll-2020-election/

Introduction

Understanding Non-Voting Behavior in the U.S.: The Role of Socio-Economic Factors

Millions of Americans do not vote, yet non-voters often go unnoticed. We ask: What patterns or traits define non-voters? Can these traits help predict non-voting behavior?

This research examines whether people with less education and lower income are more likely not to vote. We aim to highlight these overlooked groups, examine their voting behavior, and inform voter outreach and policy decisions.

Step 1: Loading the Data

We use a dataset from FiveThirtyEight, available directly from their GitHub repository: https://raw.githubusercontent.com/fivethirtyeight/data/master/non-voters/nonvoters_data.csv

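# Load the packages used below (pipes/select, plots, tables), then the data
library(dplyr)
library(ggplot2)
library(knitr)
url <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/non-voters/nonvoters_data.csv"
raw_data <- read.csv(url, stringsAsFactors = FALSE)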

Step 2: Selecting Relevant Columns

We’re processing the dataset to select only the relevant columns for our research.

# Subset the data for the desired columns
voting_data <- raw_data %>%
  select(RespId, educ, race, gender, ppage, income_cat, voter_category)

# Inspecting the first few rows of the data
head(voting_data)
##   RespId                educ  race gender ppage    income_cat voter_category
## 1 470001             College White Female    73      $75-125k         always
## 2 470002             College White Female    90 $125k or more         always
## 3 470003             College White   Male    53 $125k or more       sporadic
## 4 470007        Some college Black Female    58       $40-75k       sporadic
## 5 480008 High school or less White   Male    81       $40-75k         always
## 6 480009 High school or less White Female    61       $40-75k   rarely/never
# Displaying the top 10 rows as a table using kable
voting_data %>%
  head(10) %>%
  kable()
| RespId | educ                | race        | gender | ppage | income_cat    | voter_category |
|--------|---------------------|-------------|--------|-------|---------------|----------------|
| 470001 | College             | White       | Female | 73    | $75-125k      | always         |
| 470002 | College             | White       | Female | 90    | $125k or more | always         |
| 470003 | College             | White       | Male   | 53    | $125k or more | sporadic       |
| 470007 | Some college        | Black       | Female | 58    | $40-75k       | sporadic       |
| 480008 | High school or less | White       | Male   | 81    | $40-75k       | always         |
| 480009 | High school or less | White       | Female | 61    | $40-75k       | rarely/never   |
| 480010 | High school or less | White       | Female | 80    | $125k or more | always         |
| 470008 | Some college        | Other/Mixed | Female | 68    | $75-125k      | always         |
| 470010 | College             | White       | Male   | 70    | $125k or more | always         |
| 470011 | Some college        | White       | Male   | 83    | $125k or more | always         |

Step 3: Descriptive Statistics

We’ll begin by summarizing some basic statistics about our data.

# Generate summary
summary(voting_data)
##      RespId           educ               race              gender         
##  Min.   :470001   Length:5836        Length:5836        Length:5836       
##  1st Qu.:472070   Class :character   Class :character   Class :character  
##  Median :474152   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :474654                                                           
##  3rd Qu.:476218                                                           
##  Max.   :488325                                                           
##      ppage        income_cat        voter_category    
##  Min.   :22.00   Length:5836        Length:5836       
##  1st Qu.:36.00   Class :character   Class :character  
##  Median :54.00   Mode  :character   Mode  :character  
##  Mean   :51.69                                        
##  3rd Qu.:65.00                                        
##  Max.   :94.00

Survey’s gender distribution:

Of the survey respondents, 2,896 are female and 2,940 are male, showing a nearly even gender distribution.

# Majority male or female?
table(voting_data$gender)
## 
## Female   Male 
##   2896   2940

Survey’s average education level:

To find the average education level, we’ll convert the string values in the “educ” column to numbers first.

voting_data$educ_numeric <- as.numeric(factor(voting_data$educ, levels = c("High school or less", "Some college", "College"), ordered = TRUE))

Now that “educ_numeric” is numeric, we can compute the dataset’s average education level.

average_education <- mean(voting_data$educ_numeric, na.rm = TRUE)

The average education score is 2.09, based on our scale:

  • 1 for “High school or less”
  • 2 for “Some college”
  • 3 for “College”

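An average near 2 places the typical respondent around “Some college”, leaning slightly toward “College”. Because the scale is ordinal, the mean alone does not show how respondents are spread across the three levels; it is consistent with many different mixes of “High school or less”, “Some college”, and “College”.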
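Since the 1-3 coding is ordinal, a frequency table and the median level may describe education more directly than a mean. The following is a minimal sketch on a small made-up vector: `toy_educ`, `educ_levels`, and the values in it are hypothetical and are not survey data. In the analysis above, `voting_data$educ` would take the place of `toy_educ`.

```r
# Hypothetical toy values standing in for voting_data$educ (not survey data)
toy_educ <- c("College", "Some college", "High school or less",
              "College", "Some college", "College", "Some college")

# Ordered factor with the same level order used for educ_numeric
educ_levels <- c("High school or less", "Some college", "College")
toy_factor <- factor(toy_educ, levels = educ_levels, ordered = TRUE)

# Counts and shares per education level
educ_counts <- table(toy_factor)
educ_shares <- prop.table(educ_counts)
print(educ_counts)
print(round(educ_shares, 2))

# Median level on the 1-3 scale, reported as a label
# (an odd number of values is used here so the median is a whole code)
median_code <- median(as.integer(toy_factor))
median_label <- educ_levels[median_code]
print(median_label)
```

Unlike the mean, the table shows directly which level is most common, and the median is always one of the actual categories when the number of observations is odd.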

Step 4: Visualizing and interpreting data across different categories

1. Voting Behavior by Education Level

ggplot(voting_data, aes(x=educ, fill=voter_category)) +
  geom_bar(position="fill") +
  theme_minimal() +
  labs(y="Proportion", x="Education Level", title="Voting Behavior by Education Level", fill="Voted?")

From the graph, respondents with some college education or a college degree appear to vote more often than those with a high school education or less.

Respondents with a high school education or less also fall into the “rarely/never” category more often than those with “Some college” or “College”.
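The proportions read off the stacked bars can also be computed as numbers. The sketch below uses a tiny hypothetical data frame (`toy`, not the survey data) to show how `prop.table()` with `margin = 1` yields the within-education shares that `geom_bar(position = "fill")` draws; with the real data, `voting_data$educ` and `voting_data$voter_category` would be passed to `table()` instead.

```r
# Hypothetical toy data standing in for voting_data (not survey results)
toy <- data.frame(
  educ = c("College", "College", "Some college", "Some college",
           "High school or less", "High school or less"),
  voter_category = c("always", "sporadic", "always", "rarely/never",
                     "rarely/never", "sporadic")
)

# Cross-tabulate education (rows) against voter category (columns)
counts <- table(toy$educ, toy$voter_category)

# margin = 1 divides each row by its total, so every row sums to 1
row_shares <- prop.table(counts, margin = 1)
print(round(row_shares, 2))
```

Each row corresponds to one bar in the plot, and each cell to the height of one colored segment.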

2. Voting Behavior by Race

ggplot(voting_data, aes(x=race, fill=voter_category)) +
  geom_bar(position="fill") +
  theme_minimal() +
  labs(y="Proportion", x="Race", title="Voting Behavior by Race", fill="Voted?")

The graph shows that White and Black respondents vote more often than Hispanic respondents and those in the Other/Mixed category. However, the graph alone does not explain why some groups vote less, so race by itself is of limited value for our study.

3. Voting Behavior by Gender

ggplot(voting_data, aes(x=gender, fill=voter_category)) +
  geom_bar(position="fill") +
  theme_minimal() +
  labs(y="Proportion", x="Gender", title="Voting Behavior by Gender", fill="Voted?")

The graph indicates that female and male respondents show similar proportions of always voting, voting sporadically, and rarely/never voting.

4. Voting Behavior by Income Category

ggplot(voting_data, aes(x=income_cat, fill=voter_category)) +
  geom_bar(position="fill") +
  theme_minimal() +
  labs(y="Proportion", x="Income Category", title="Voting Behavior by Income Category", fill="Voted?")

The graph shows that individuals earning less than $40k vote less frequently than those in the other income categories. Their likelihood of rarely or never voting is higher than that of respondents with higher incomes. Thus, income may be a useful predictor of non-voting behavior.
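One way to check whether income helps predict non-voting is a logistic regression on a rarely/never indicator. This is not something the analysis above fits; it is a sketch of a possible next step, run on simulated data. The data frame `toy`, the probabilities in `p_nonvote`, and the label "Less than $40k" are assumptions made for illustration only.

```r
# Simulated toy data (hypothetical; not the survey) with a built-in income effect
set.seed(1)
n <- 400
income_levels <- c("Less than $40k", "$40-75k", "$75-125k", "$125k or more")
toy <- data.frame(
  income_cat = factor(sample(income_levels, n, replace = TRUE),
                      levels = income_levels)
)

# Assumed probabilities of rarely/never voting, chosen only for illustration
p_nonvote <- c(0.45, 0.30, 0.20, 0.15)[as.integer(toy$income_cat)]
toy$rarely_never <- rbinom(n, size = 1, prob = p_nonvote)

# Logistic regression; the first level ("Less than $40k") is the reference
fit <- glm(rarely_never ~ income_cat, data = toy, family = binomial)

# Predicted probability of rarely/never voting for each income category
new_rows <- data.frame(income_cat = factor(income_levels, levels = income_levels))
pred <- predict(fit, newdata = new_rows, type = "response")
names(pred) <- income_levels
print(round(pred, 2))
```

With a single categorical predictor, the predicted probabilities equal the observed share of rarely/never voters in each category, so the model adds little here; it becomes more useful once education and other variables are included alongside income.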

5. Interaction Between Education and Income

Next, we look at how educ and income_cat jointly relate to voter_category.

ggplot(voting_data, aes(x=educ, fill=voter_category)) +
  geom_bar(position="fill") +
  facet_wrap(~ income_cat, scales = "free", ncol = 3) +
  theme_minimal() +
  labs(y="Proportion", x="Education Level", title="Interaction of Education and Income on Voting Behavior", fill="Voted?") +
  theme(legend.position="bottom", axis.text.x = element_text(size=6, angle=30, hjust=1)) # Legend at bottom; small, rotated x-axis labels

The panels suggest that income and education jointly relate to non-voting.

In the first panel, which shows the highest income level, respondents with a “High school or less” education vote less often than those with more education. However, the difference is small.

At lower income levels, the share of non-voters rises as education level falls.

The last panel indicates that individuals earning less than $40k with only a “High school or less” education are more likely to abstain from voting.
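The faceted plot can be backed by a table of the rarely/never share in each income and education cell. The sketch below uses base R `aggregate()` on a few hypothetical rows (`toy` is made up and not survey data; the label "Less than $40k" is assumed); with the real data, `voting_data` would replace `toy`.

```r
# Hypothetical toy rows standing in for voting_data (not survey results)
toy <- data.frame(
  income_cat = c("Less than $40k", "Less than $40k", "Less than $40k",
                 "$125k or more", "$125k or more", "$125k or more"),
  educ = c("High school or less", "High school or less", "College",
           "High school or less", "College", "College"),
  voter_category = c("rarely/never", "rarely/never", "always",
                     "sporadic", "always", "rarely/never")
)

# Flag respondents in the rarely/never category
toy$rarely_never <- toy$voter_category == "rarely/never"

# The mean of a logical is a proportion: share of rarely/never voters per cell
cell_shares <- aggregate(rarely_never ~ income_cat + educ, data = toy, FUN = mean)
print(cell_shares)
```

Each row of `cell_shares` corresponds to one bar in one panel, which makes it easier to compare cells than judging segment heights by eye.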

Discussion

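Our research highlights notable patterns in voting behavior across different demographic groups:

  • Education: respondents with some college education or a college degree vote more often than those with a high school education or less.
  • Race: White and Black respondents vote more often than Hispanic and Other/Mixed respondents, but the plot does not explain why.
  • Gender: female and male respondents show similar voting patterns.
  • Income: respondents earning less than $40k vote less often than those in the other income categories.
  • Education and income together: respondents earning less than $40k with a high school education or less are more likely to abstain from voting.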

Conclusion

In conclusion, while several factors contribute to voting behavior, education and income stand out in these descriptive plots as the traits most clearly associated with non-voting. Understanding these patterns can guide outreach efforts and policies aiming to bolster voter participation.