Niko Hellman & Serena Richard
Basic demographics: age, sex, race (include Hispanic variable?)
Extended demographics: labforce (labor force), educ99 (educational attainment), statefip (state), metro (metropolitan area)
Voting variables: votehow, votewhen, voted
Year: 2016 only
After the historic November election just a few weeks ago, it is more apparent than ever how valuable voter data can be for understanding the beliefs and actions of our country’s people. Maps of the U.S. depicting voter turnout by key demographic identities such as race, gender, and age were shared widely on social media. As we know, these three identities, their intersections with one another, and other confounding factors bring to light key systemic issues that in turn lead to voter suppression. Unique to an election held during a pandemic, we saw voters vote by mail in record numbers, turning election day into election week and pulling in more voters than ever before in our country’s history.

This dataset will provide important insights into the relationships between communities of people and their voter status. Because we do not have data from the most recent election (November 2020), we will focus on the previous presidential election (November 2016) by filtering our data on the YEAR variable. Our overarching research question is: what are the relationships between demographic variables (such as age, race, gender, education, employment status, and geographic location) and voter turnout in the November 2016 election? We set out to answer three questions. How do basic demographics interact with the metro, state, labor force, and education variables? How do basic demographics affect how, when, and whether people vote? How do metro area, labor force status, and education affect how, when, and whether people vote?

Understanding these relationships will provide valuable evidence for identifying where changes must be made to increase voter turnout and decrease voter suppression in future elections.
We would like to meet with you to further specify our questions and find interesting relationships, as well as to learn how to make map visuals! This is what we have looked at so far, but we plan to continue with more EDA beyond what we have here.
# The data set including only our variables of interest:
head(newsub)
## # A tibble: 6 x 11
## YEAR STATEFIP METRO AGE SEX RACEHISP LABFORCE EDUSIMPLE VOTEHOW VOTEWHEN
## <dbl> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr>
## 1 2016 1 2 70 1 1 2 4 1 1
## 2 2016 1 2 63 2 1 1 3 99 99
## 3 2016 1 2 59 1 2 2 3 99 99
## 4 2016 1 2 79 2 2 1 1 99 99
## 5 2016 1 2 57 2 2 1 1 99 99
## 6 2016 1 2 80 1 1 1 5 1 1
## # … with 1 more variable: VOTED <chr>
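As a rough sketch of how a subset like the one printed above could be produced, the example below filters a small made-up table to 2016 and keeps only a few variables of interest. The object name `raw_data`, its values, and the `OTHER` column are hypothetical placeholders; the actual extract and the steps used to build `newsub` may differ.

```r
library(dplyr)

# Hypothetical multi-year extract; the name raw_data and all values are made up
raw_data <- tibble(
  YEAR  = c(2012, 2016, 2016, 2014),
  AGE   = c(45, 70, 63, 30),
  SEX   = c(1, 1, 2, 2),
  VOTED = c(2, 2, 1, 1),
  OTHER = c("a", "b", "c", "d")  # stands in for variables we do not need
)

# Keep only 2016 rows, then keep only the variables of interest
sub_2016 <- raw_data %>%
  filter(YEAR == 2016) %>%
  select(YEAR, AGE, SEX, VOTED)

head(sub_2016)
```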
Perhaps the most important thing this visual highlights is the high proportion of white voters in this data set. This raises the question: does this reflect the actual American population in 2016? Because white respondents make up so much of the data, the other racial categories have much smaller counts. From what we can see, more individuals voted than did not in each category. We would love to learn how to add percentages to these bars, which would clarify whether the ratio of voters to non-voters is greater for certain groups.
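One possible way to put percentages on the bars is to compute each group's share before plotting and then add the shares as text labels. The sketch below assumes dplyr and ggplot2 are available and uses a tiny made-up table (`toy`) in place of `newsub`, so the counts and percentages it produces are purely illustrative and say nothing about the real data.

```r
library(dplyr)
library(ggplot2)

# Toy data standing in for newsub; values are made up for illustration only
toy <- tibble(
  RACEHISP = c(1, 1, 1, 1, 2, 2, 2, 6, 6, 6),
  VOTED    = c("2", "2", "2", "1", "2", "2", "1", "2", "1", "1")
)

# Count respondents per race/vote combination, then compute the
# percentage of each racial category that falls in each VOTED code
vote_pct <- toy %>%
  count(RACEHISP, VOTED) %>%
  group_by(RACEHISP) %>%
  mutate(pct = 100 * n / sum(n)) %>%
  ungroup()

# Side-by-side bars of counts, labeled with the within-group percentage
p <- ggplot(vote_pct, aes(x = factor(RACEHISP), y = n, fill = VOTED)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = paste0(round(pct), "%")),
            position = position_dodge(width = 0.9), vjust = -0.3) +
  labs(x = "RACEHISP code", y = "Respondents", fill = "VOTED code")

print(p)
```

Because the percentages are computed within each racial category, groups with small counts can still be compared directly with larger ones.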
Note that for education level, 0 = no school; 1 = some school but no high school diploma; 2 = high school graduate or GED; 3 = some college but no degree; 4 = Associate degree; 5 = Bachelor's degree; 6 = Master's degree; and 7 = Professional or Doctorate degree. We observe that there are more voters in the labor force than out of it. Among these categories, the group with the most voters is those in the labor force with a bachelor's degree. Outside of the labor force, however, those with at most a high school education voted the most.
These graphs provide insight because, although they do not look at voting-specific variables, they highlight that white men dominate the labor force. Combined with our earlier graph, which showed that those in the labor force have higher turnout than those outside it, one might expect white men to have the highest turnout. However, our earlier graph also shows that white women have higher voter turnout than white men. This highlights the complexity of this data and the highly intersectional nature of voting demographics.
AGE (gives each person’s age at last birthday): 0 = under 1 year; 1 = 1; … 90 = 90 (90+, 1988-2002); 91 = 91; … 99 = 99+
SEX: 1 = Male; 2 = Female
RACEHISP (modified to include Hispanic): 1 = White; 2 = Black; 3 = American Indian/Aleut/Eskimo; 4 = Asian or Pacific Islander; 5 = More than one race; 6 = Hispanic; 9 = unknown/blank; *** disproportionately over-represents white people
STATEFIP (identifies the household’s state of residence, using the FIPS coding scheme, which orders states alphabetically - Need to ask Heather about how to clean this up)
LABFORCE: 1 = not in labor force; 2 = in labor force
EDUSIMPLE: 0 = no school; 1 = some school but no high school diploma; 2 = high school graduate or GED; 3 = some college but no degree; 4 = Associate degree; 5 = Bachelor's degree; 6 = Master's degree; 7 = Professional or Doctorate degree
METRO (metropolitan area): 1 = Not in metro area; 2 = central city; 3 = outside central city
VOTEHOW: 1 = In person; 2 = By mail
VOTEWHEN: 1 = on election day; 2 = before election day
VOTED: 1 = did not vote; 2 = voted
YEAR: reports the year in which the survey was conducted; we filter the data to use only 2016 election data
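The codebook above could be applied to the data by turning the numeric codes into labeled factors, and the STATEFIP codes could be cleaned up with a lookup table. The sketch below uses base R and a small made-up data frame (`toy`). The lookup covers only five states as an illustration, and treating codes that are missing from the codebook (such as the 99 values visible in the `head(newsub)` output) as NA is an assumption that should be checked against the source documentation.

```r
# Toy rows mimicking the character codes shown in head(newsub); values are made up
toy <- data.frame(
  STATEFIP = c("1", "2", "6"),
  SEX      = c("1", "2", "2"),
  VOTEHOW  = c("1", "2", "99"),
  stringsAsFactors = FALSE
)

# Attach codebook labels; any code not listed in levels (here 99) becomes NA
toy$SEX_LABEL <- factor(toy$SEX, levels = c("1", "2"),
                        labels = c("Male", "Female"))
toy$VOTEHOW_LABEL <- factor(toy$VOTEHOW, levels = c("1", "2"),
                            labels = c("In person", "By mail"))

# Partial FIPS lookup (five states only; a full table would cover every code)
fips_lookup <- c("1" = "Alabama", "2" = "Alaska", "4" = "Arizona",
                 "5" = "Arkansas", "6" = "California")

# Look up each state name by its FIPS code
toy$STATE_NAME <- unname(fips_lookup[toy$STATEFIP])

toy
```

Note that FIPS codes are alphabetical but not consecutive (for example, there is no code 3), so a lookup keyed by code is safer than relying on position in an alphabetical list of states.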