Voter turnout in the United States is highly unequal and polarized. The wealthy are more likely to vote than the poor. If the poor are less likely to vote, then they have a lower probability of having their interests, preferences and concerns met in public policy by the president-elect.
While the association between higher family incomes and greater voter turnout is well documented, the reason for this relationship is not fully understood. During the 2016 presidential election, there was a clear positive association between family income and participation in voting. By international standards, the percentage of people who vote in the United States is low. Furthermore, voting rates are unevenly distributed across socioeconomic groups, with much lower voting rates among the poor.
In this project, I explore the connection between voter turnout in 2018 and median household income.
I gathered and cleaned data from three sources. The first is 2018 general election data from ElectProject. The second is the World Population Review, which presented 2020 voter data per state that was interesting to compare and contrast with the 2018 data from the first source. The third is the United States Census Bureau, from which I downloaded a dataset on the sex and poverty characteristics of the 2018 voting population.
I first loaded the necessary packages and loaded my csv and xlsx files. I assigned a variable name to each dataset, and then analyzed different aspects of each dataset to compile interesting visualizations and conclusions. I also used Tableau to create two unique geo-spatial visualizations pertaining to voter turnout in both 2018 and 2020.
I made a map of the 2018 voter turnout data in Tableau before beginning to code specific visualizations in R. Here is a link to the first geo-spatial map I made, titled “Voter Turnout by State 2018.” The map displays both voter turnout and median household income for 2018. According to the map, Virginia, Washington, California, New York, Massachusetts, Alaska, Hawaii, and the other states colored teal to dark blue have higher median household incomes, between $60,100 and $82,600.
Of the states in the teal-dark blue color, five of them are swing states (Colorado, Minnesota, Nevada, Virginia and Washington). The phrase swing states refers to any state that could reasonably be won by either the Democratic or Republican presidential candidate by a swing in votes. In the past couple of elections, these are the noteworthy swing states:
While it is interesting that five of the swing states fall in the upper range of median household income, it is also noteworthy that those states do not have larger red circles, which indicate a higher total number of ballots counted. So the hypothesis is not reflected on this smaller scale, but it is reflected on a larger one: looking at the map as a whole, the dark green to dark blue states generally have larger red circles, indicating higher ballot counts.
I next turned to R to analyze the data in more detail. I first looked at votes for highest office by state, taking the first 20 states in the dataset and sorting them by vote count.
library(tidytext)
library(tidyverse)
# install.packages("readxl")  # run once if readxl is not already installed
library(readxl)
statevoter <- read_xlsx("~/Downloads/statevoter2018.xlsx")
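# Keep the first 20 rows of the dataset, sort them by votes for highest office,
# and draw a horizontal bar chart ordered from most to fewest votes
chart1 <- statevoter %>%
  head(20) %>%
  arrange(desc(`2018 Vote for Highest Office`)) %>%
  ggplot(aes(reorder(State, `2018 Vote for Highest Office`), `2018 Vote for Highest Office`)) +
  geom_col() +
  coord_flip()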
# Change appearance of chart 1
chart1 + theme_light() +
xlab("States with the Most Votes") +
ylab("Number of Votes for Highest Office") +
ggtitle("States with the Most Votes for Highest Office 2018")
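In a pipeline like the one above, the order of `head()` and `arrange()` decides which states reach the chart: `head(20)` placed first keeps the first 20 rows as they appear in the file, while sorting first and then taking `head(20)` keeps the 20 largest values. The following is a minimal sketch on made-up data (the `votes` table, its state labels, and its counts are hypothetical and only illustrate the difference, here with three rows instead of 20):

```r
library(dplyr)

# Hypothetical toy table, stored in alphabetical order like a typical state file.
# The counts are invented for illustration only.
votes <- data.frame(
  State = c("A", "B", "C", "D", "E", "F"),
  Votes = c(10, 50, 20, 60, 30, 40),
  stringsAsFactors = FALSE
)

# head() first: keeps the first 3 rows of the table, then sorts only those 3
first_three <- votes %>%
  head(3) %>%
  arrange(desc(Votes))

# arrange() first: sorts every row, then keeps the 3 largest
top_three <- votes %>%
  arrange(desc(Votes)) %>%
  head(3)

print(first_three$State)  # "B" "C" "A"
print(top_three$State)    # "D" "B" "F"
```

Either order can be appropriate; which one to use depends on whether the chart is meant to show a fixed subset of states or the highest-ranking states overall.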
Among the 20 states in this chart, California, Florida, Illinois, Georgia and Colorado have the most votes for highest office in 2018. Two of these five are swing states (Colorado and Florida).
In R, I also analyzed the states with the highest overall ballot count in a similar visualization.
# Same approach as chart 1, this time using the total number of ballots counted
chart2 <- statevoter %>%
  head(20) %>%
  arrange(desc(`Estimated or Actual 2018 Total Ballots Counted`)) %>%
  ggplot(aes(reorder(State, `Estimated or Actual 2018 Total Ballots Counted`), `Estimated or Actual 2018 Total Ballots Counted`)) +
  geom_col() +
  coord_flip()
# Change appearance of chart 2
chart2 + theme_light() +
xlab("States with Most Ballots Counted") +
ylab("2018 Total Ballots Counted") +
ggtitle("States with the Highest Ballot Count 2018")
Of the states with the highest overall ballot count in 2018, two are again swing states (Colorado and Florida). On the map visualization, Colorado is in the upper-middle median household income range of $60,100-$70,100, while Florida is in the lower-middle range of $51,400-$56,200.
Additionally, I loaded another dataset into R that looks in more detail at voter characteristics, including sex and poverty level, to add context to the 2018 voter turnout.
I first wanted to sort and view the data by states with the highest male vote estimate in 2018.
sexpoverty <- read_xlsx("~/Downloads/sexpoverty.xlsx")
# Plot every state's estimate of male voters, ordered from highest to lowest
chart3 <- sexpoverty %>%
  arrange(desc(`Men Estimate`)) %>%
  ggplot(aes(reorder(`State Name`, `Men Estimate`), `Men Estimate`)) +
  geom_col() +
  coord_flip()
# Change appearance of chart 3
chart3 + theme_light() +
xlab("States") +
ylab("Number of Male Votes") +
ggtitle("States with Highest Number of Male Votes 2018")
From this chart, we can see that North Carolina, Florida, Ohio, Pennsylvania, Michigan, Texas, Illinois, New York and California had the highest numbers of male voters in 2018. Of these states, Florida, Michigan, North Carolina, Ohio and Pennsylvania are swing states.
I next did the same process but for female voters.
# Plot every state's estimate of female voters, ordered from highest to lowest
chart4 <- sexpoverty %>%
  arrange(desc(`Women Estimate`)) %>%
  ggplot(aes(reorder(`State Name`, `Women Estimate`), `Women Estimate`)) +
  geom_col() +
  coord_flip()
# Change appearance of chart 4
chart4 + theme_light() +
xlab("State") +
ylab("Number of Female Votes") +
ggtitle("States with Highest Number of Female Votes 2018")
From this chart, we can see that Idaho, Florida, Virginia, Ohio, Pennsylvania, Georgia, New York, Texas, Illinois and California had the highest numbers of female voters in 2018. Florida, Ohio and Pennsylvania are swing states that had high numbers of both men and women voting in the 2018 election.
Bringing the analysis back to income and financial stability, I explored the states with the highest number of voters living below the poverty level.
# Plot every state's estimate of voters below the poverty level, highest to lowest
chart5 <- sexpoverty %>%
  arrange(desc(`Below Poverty Level Estimate`)) %>%
  ggplot(aes(reorder(`State Name`, `Below Poverty Level Estimate`), `Below Poverty Level Estimate`)) +
  geom_col() +
  coord_flip()
# Change appearance of chart 5
chart5 + theme_light() +
xlab("State") +
ylab("Number of Votes Below Poverty Level") +
ggtitle("States with Highest Concentration of Voters Below Poverty Level 2018")
From this visualization, I can conclude that North Carolina, Michigan, Ohio, Florida, Georgia, Texas, New York, Pennsylvania, Illinois and California are the states with the largest numbers of voters below the poverty level. Five of these states are swing states (Florida, Michigan, North Carolina, Ohio and Pennsylvania).
So, what does all of this say about swing states?
Presidential candidates tend to get to know the swing states well and visit them frequently. States that have voted for the same party for decades are places where candidates can raise a lot of money, but candidates know that spending their resources elsewhere will probably not cost them those states' votes.
Since swing states are not generally party loyal, they can give a candidate the extra boost needed to win the election. Close elections usually come down to the swing states, which determine who the next President of the United States is.
Swing states may have higher numbers of people voting in general, as well as people below the poverty level voting, in response to the presidential candidates actively visiting, interacting with, and sharing insight with these states.
I chose to incorporate data from 2020 due to research and trends I have noticed within the news and politics today. According to the New York Times, very large voter turnout is expected in the year 2020.
In last year’s midterm elections, key blocs of voters turned out in unheard-of numbers, with Democrats, especially women, scoring large victories across the country. That flipped control of the U.S. House of Representatives back to Democrats for the first time since 2011.
According to an article from USA Today, the 2020 election is likely to come down to not just the swing states themselves but the voting blocs that define them, and whether or not they show up at the polls.
For 2020, I also wanted to create a map visualization on Tableau. Here is a link to the second map I have made and published to Tableau Public for 2020 voter data. Similar to the 2018 map, this one analyzes voter turnout with median household income. The 2020 map sorts the states by the total number of votes for highest office. In the details of each state, the eligible voting population is also listed.
It is interesting to see that the median household income range shown by each state's color has stayed fairly consistent. I was not expecting median household income to change much over the two-year gap.
From this map, I was able to observe which states have more votes for highest office according to the size of the red points on each state. The states with the most votes for highest office are, in large part, swing states. Swing states including Colorado, Florida, Michigan, Minnesota, North Carolina, Pennsylvania, Virginia and Washington have clearly been targeted more by presidential candidates and politicians during the election, not only to win more voters but also to get more individuals to vote for higher office roles.
I next turned to R to analyze the data in more detail. After loading the csv dataset, I took the first 20 states in the dataset and sorted them by the number of votes cast for highest office.
voter2020 <- read.csv("~/Downloads/voter2020.csv")
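# Keep the first 20 rows of the 2020 dataset, sort them by votes for highest office,
# and draw a horizontal bar chart ordered from most to fewest votes
chart6 <- voter2020 %>%
  head(20) %>%
  arrange(desc(totalHighestOffice)) %>%
  ggplot(aes(reorder(State, totalHighestOffice), totalHighestOffice)) +
  geom_col() +
  coord_flip()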
# Change appearance of chart 6
chart6 + theme_light() +
xlab("State") +
ylab("Number of Votes for Highest Office") +
ggtitle("States with Highest Number of Votes for Highest Office 2020")
According to the chart, Florida, Pennsylvania, Ohio, Michigan, North Carolina and Virginia are the six states shown with the highest number of votes for highest office. All of these states are swing states. What’s more, every swing state apart from Nevada appears in the 20-state visualization. It is clear that the swing states are the intended audience and target, and may even be the deciding factor, in the 2020 elections.
In R, I also looked at the states with the largest eligible voter population.
# Same approach as chart 6, this time using the voting-eligible population
chart7 <- voter2020 %>%
  head(20) %>%
  arrange(desc(votingEligiblePop)) %>%
  ggplot(aes(reorder(State, votingEligiblePop), votingEligiblePop)) +
  geom_col() +
  coord_flip()
# Change appearance of chart 7
chart7 + theme_light() +
xlab("State") +
ylab("Total Voting Eligible Population") +
ggtitle("States with the Highest Eligible Voting Population Total 2020")
The results in this visualization are consistent with those from the previous chart. Florida, Pennsylvania, Ohio, Michigan, North Carolina and Virginia are the six states shown with the highest eligible voter population. It only makes sense that presidential candidates target these states. With more eligible voters to contribute to the polls and swing the results, getting these specific states to vote for or against candidates has never been more crucial. Again, all of the swing states, apart from Nevada, are pictured in the visualization. Nevada tends to have a lower number of eligible voters as well as a lower median household income.
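Vote totals and eligible-population totals both grow with the size of a state, so dividing one by the other gives a turnout rate that is easier to compare across states. The following is a minimal sketch under stated assumptions: the rows are invented, only the column names `totalHighestOffice` and `votingEligiblePop` mirror the 2020 dataset, and `turnoutRate` is a hypothetical column name introduced for illustration.

```r
library(dplyr)

# Hypothetical rows; only the column names mirror the 2020 dataset.
voter_demo <- data.frame(
  State = c("Large", "Medium", "Small"),
  totalHighestOffice = c(9000, 3000, 700),
  votingEligiblePop = c(18000, 5000, 1000),
  stringsAsFactors = FALSE
)

# Turnout rate = votes for highest office / voting-eligible population
turnout_demo <- voter_demo %>%
  mutate(turnoutRate = totalHighestOffice / votingEligiblePop) %>%
  arrange(desc(turnoutRate))

print(turnout_demo)
```

In this toy table the largest state has the most votes but the lowest rate, which shows how a ranking by counts and a ranking by rates can differ.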
It is important to note that the association between higher income and greater voter turnout does not necessarily mean that having a higher income causes people to be more likely to vote. In theory, however, there are reasons why higher incomes could make people more likely or inclined to vote.
Voting can be a costly activity. Doing so requires time, skills, information, a certain level of health, access to transportation and more. It is possible that having higher incomes provides people with resources that make the activity of voting easier.
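The strength of an association like the one discussed above can be summarized with a correlation coefficient, which measures how closely two variables move together without saying anything about cause. The following is a minimal sketch in base R, assuming a table with one row per state; the data frame `income_demo`, its column names, and all of its values are hypothetical and are not taken from the datasets used in this project.

```r
# Hypothetical values, invented for illustration only.
income_demo <- data.frame(
  State = c("A", "B", "C", "D", "E"),
  medianIncome = c(45000, 52000, 58000, 66000, 75000),
  turnoutRate = c(0.42, 0.47, 0.45, 0.55, 0.58),
  stringsAsFactors = FALSE
)

# Pearson correlation: strength of the linear association
pearson_r <- cor(income_demo$medianIncome, income_demo$turnoutRate)

# Spearman rank correlation: based on ranks, so less sensitive to outliers
spearman_rho <- cor(income_demo$medianIncome, income_demo$turnoutRate,
                    method = "spearman")

print(c(pearson = pearson_r, spearman = spearman_rho))
```

A coefficient close to 1 would indicate that higher-income states tend to have higher turnout, but it would not show that income is the reason.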
From the insights I have gathered, states with a higher voter turnout tend to have higher median household incomes. However, as elections and the political climate continue to change, so does the type of voter turnout that we tend to see. The states that we now see being targeted are the swing states, which have a higher eligible voting population and a lower or middle median household income. While the middle and upper-middle income states are still crucial to elections, the lower and middle income states either want to voice their concerns more, or they are being marketed to and targeted more effectively than before.