Let’s take a look at the voting history for every state since 1976. Our data comes from the MIT Election Lab. In addition, we make use of the tidyverse package.
[Figure: 1976 electoral map]
Before you knit this document, you will need to install the package tidyverse with the code install.packages("tidyverse"). You may run this code in the console window. A package only needs to be installed once, but must be loaded from your library each time you wish to use it. Below is the code to load the package from your library after it has been installed.
library(tidyverse)
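If you are not sure whether a package is already installed, a small check can save a repeated installation. The following is a minimal sketch, not part of the original lab: `ensure_package` is a hypothetical helper name, and it relies only on the base R functions `requireNamespace()` and `install.packages()`.

```r
# Hypothetical helper (not part of the original lab): install a package only
# when it is not already available, then report whether it can be loaded.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}
```

For this lab, you could run `ensure_package("tidyverse")` once in the console window and then load the package with `library(tidyverse)` as shown above.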
We will work with a data set named vote, which we read directly from the Harvard Dataverse.
url <- "https://dataverse.harvard.edu/api/access/datafile/:persistentId?persistentId=doi:10.7910/DVN/42MVDX/ZBRZDY"
vote <- read.csv(file = url, header = TRUE, sep = "\t")
vote <- as_tibble(vote)
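To see what the `header` and `sep = "\t"` arguments do without downloading anything, here is a small sketch that reads tab-separated text from a string. The two rows are copied from the Alabama 1976 values in the lab data; the names `sample_text` and `sample_vote` are made up for this illustration, and only five of the columns are included.

```r
# Two rows (Alabama, 1976) written as tab-separated text so the example
# runs without a download. "\t" is a tab and "\n" is a line break.
sample_text <- paste(
  "year\tstate\tparty\tcandidatevotes\ttotalvotes",
  "1976\tAlabama\tdemocrat\t659170\t1182850",
  "1976\tAlabama\trepublican\t504070\t1182850",
  sep = "\n"
)

# header = TRUE treats the first line as column names;
# sep = "\t" splits each line on tabs instead of commas.
sample_vote <- read.csv(text = sample_text, header = TRUE, sep = "\t")
str(sample_vote)
```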
Let’s take a glimpse at what vote contains.
glimpse(vote)
## Observations: 3,739
## Variables: 14
## $ year <int> 1976, 1976, 1976, 1976, 1976, 1976, 1976, 1976,...
## $ state <fct> Alabama, Alabama, Alabama, Alabama, Alabama, Al...
## $ state_po <fct> AL, AL, AL, AL, AL, AL, AL, AK, AK, AK, AK, AZ,...
## $ state_fips <int> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 4, 4, 4, 4, 4,...
## $ state_cen <int> 63, 63, 63, 63, 63, 63, 63, 94, 94, 94, 94, 86,...
## $ state_ic <int> 41, 41, 41, 41, 41, 41, 41, 81, 81, 81, 81, 61,...
## $ office <fct> US President, US President, US President, US Pr...
## $ candidate <fct> Carter, Jimmy, Ford, Gerald, Maddox, Lester, Bu...
## $ party <fct> democrat, republican, american independent part...
## $ writein <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE,...
## $ candidatevotes <int> 659170, 504070, 9198, 6669, 1954, 1481, 308, 71...
## $ totalvotes <int> 1182850, 1182850, 1182850, 1182850, 1182850, 11...
## $ version <int> 20171015, 20171015, 20171015, 20171015, 2017101...
## $ notes <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
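Our primary variables of interest will be year, state, party, candidatevotes, and totalvotes, which are the variables used in the rest of this document.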
Let’s add a new variable propvotes that is the proportion of votes for each candidate in each state for each election.
vote <- vote %>% mutate(propvotes = candidatevotes / totalvotes)
glimpse(vote)
## Observations: 3,739
## Variables: 15
## $ year <int> 1976, 1976, 1976, 1976, 1976, 1976, 1976, 1976,...
## $ state <fct> Alabama, Alabama, Alabama, Alabama, Alabama, Al...
## $ state_po <fct> AL, AL, AL, AL, AL, AL, AL, AK, AK, AK, AK, AZ,...
## $ state_fips <int> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 4, 4, 4, 4, 4,...
## $ state_cen <int> 63, 63, 63, 63, 63, 63, 63, 94, 94, 94, 94, 86,...
## $ state_ic <int> 41, 41, 41, 41, 41, 41, 41, 81, 81, 81, 81, 61,...
## $ office <fct> US President, US President, US President, US Pr...
## $ candidate <fct> Carter, Jimmy, Ford, Gerald, Maddox, Lester, Bu...
## $ party <fct> democrat, republican, american independent part...
## $ writein <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE,...
## $ candidatevotes <int> 659170, 504070, 9198, 6669, 1954, 1481, 308, 71...
## $ totalvotes <int> 1182850, 1182850, 1182850, 1182850, 1182850, 11...
## $ version <int> 20171015, 20171015, 20171015, 20171015, 2017101...
## $ notes <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ propvotes <dbl> 5.572727e-01, 4.261487e-01, 7.776134e-03, 5.638...
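As a quick check on the new variable, the division can be repeated by hand for the first three Alabama rows from 1976. This sketch uses plain vectors rather than the full data set; the vote counts are copied from the glimpse output above.

```r
# Candidate votes and total votes for the first three rows (Alabama, 1976)
candidatevotes <- c(659170, 504070, 9198)
totalvotes <- c(1182850, 1182850, 1182850)

# Same calculation as in mutate(): an element-by-element division
propvotes <- candidatevotes / totalvotes
signif(propvotes, 7)
```

The results should agree with the first values of propvotes in the glimpse output: roughly 0.557, 0.426, and 0.0078.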
Let’s take a look at the state of Florida and how the proportion of votes for the Democratic and Republican candidates changed from 1976 to 2016.
state.of.interest <- "Florida"
party.of.interest <- c("democrat", "republican")
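vote %>%
  filter(party %in% party.of.interest & state == state.of.interest) %>%
  ggplot(mapping = aes(x = year, y = propvotes)) +
  # In ggplot2 3.4.0 and later, use linewidth = 1.5 in place of size = 1.5
  geom_line(mapping = aes(color = party), size = 1.5) +
  # Name the colors so each party keeps its color regardless of level order
  scale_color_manual(values = c(democrat = "blue", republican = "red")) +
  # One axis break per election year
  scale_x_continuous(breaks = unique(vote$year)) +
  labs(x = "Year", y = "Proportion of Votes", color = "Party")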
In the above chunk, change state.of.interest to a different state of your choice. Be sure to capitalize the first letter, put the state in quotes, and spell the state’s name correctly.
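A misspelled or lowercase state name does not raise an error; the filter simply matches no rows. The sketch below shows one way to check a name before plotting. `is_known_state` is a hypothetical helper, and R's built-in vector `state.name` (the 50 state names) stands in for the states in the lab data, which is an assumption made so the example runs on its own.

```r
# Hypothetical helper: TRUE only for an exact, case-sensitive match.
# `state.name` is built into R and stands in for unique(vote$state) here.
is_known_state <- function(state, known = state.name) {
  state %in% known
}

is_known_state("Florida")                  # correctly capitalized and spelled
is_known_state("florida")                  # lowercase first letter
is_known_state(c("Ohio", "Pensylvania"))   # second name is misspelled
```

With the lab data loaded, passing `known = unique(vote$state)` would check against the states that actually appear in vote.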
Next, let’s look at four states and see how the proportion of votes for the Democratic candidate changed over time.
states.of.interest <- c("Pennsylvania", "Ohio", "Michigan", "Florida")
party.of.interest <- "democrat"
vote %>%
  filter(party %in% party.of.interest & state %in% states.of.interest) %>%
  ggplot(mapping = aes(x = year, y = propvotes)) +
  geom_line(mapping = aes(color = state), size = 1.5) +
  scale_x_continuous(breaks = unique(vote$year)) +
  labs(x = "Year", y = "Proportion of Votes", color = "State")
In the above chunk, change states.of.interest to a new set of states. You do not need to choose four. Be sure to separate states with a comma and place each state in quotes.
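The `breaks` argument of `scale_x_continuous()` sets where tick marks appear on the x-axis, and it needs each election year only once. Assuming the data covers every presidential election from 1976 to 2016, as described above, the same set of years can be sketched with `seq()`. The name `election_years` is made up for this illustration.

```r
# Presidential elections are held every four years
election_years <- seq(from = 1976, to = 2016, by = 4)
election_years

# Number of elections covered
length(election_years)
```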
Let’s now expand on the above plot. For each state of interest, we plot the proportion of Democratic and Republican votes in each election.
states.of.interest <- c("Pennsylvania", "Ohio", "Michigan", "Florida")
party.of.interest <- c("democrat", "republican")
vote %>%
  filter(party %in% party.of.interest & state %in% states.of.interest) %>%
  ggplot(mapping = aes(x = year, y = propvotes)) +
  geom_line(mapping = aes(color = party), size = 1.5) +
  scale_color_manual(values = c(democrat = "blue", republican = "red")) +
  # One panel per state
  facet_wrap(~state) +
  scale_x_continuous(breaks = unique(vote$year)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Color is mapped to party, so the legend is titled "Party"
  labs(x = "Year", y = "Proportion of Votes", color = "Party")
Change states.of.interest to a set of states of your interest.
Let’s look at the total votes for each party from the 1992 election. George H.W. Bush lost this election, in part, due to the large number of votes for the Independent candidate, Ross Perot. We only display parties that received more than 100,000 votes.
year.of.interest <- 1992
vote %>%
  filter(year == year.of.interest) %>%
  group_by(party) %>%
  summarize(total.votes = sum(candidatevotes)) %>%
  filter(total.votes > 100000)
## # A tibble: 7 x 2
## party total.votes
## <fct> <int>
## 1 "" 181453
## 2 conservative 177000
## 3 democrat 44856747
## 4 independent 19829462
## 5 libertarian 280848
## 6 republican 38798913
## 7 right-to-life 127959
Change year.of.interest to a new year and see the votes by party with a minimum of 100,000 votes. What do you think “” means in row 1?
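Raw totals can be hard to compare, so it may help to turn the 1992 table into shares. The sketch below copies the seven totals from the output above into a named vector; the names `party_totals` and `party_shares` and the label "(blank)" for the row shown as "" are made up for this illustration. Keep in mind that these are shares among the parties displayed (those with more than 100,000 votes), not among all votes cast, and that the independent row is not necessarily limited to a single candidate.

```r
# Totals copied from the 1992 summary table above
party_totals <- c(
  "(blank)"       = 181453,
  "conservative"  = 177000,
  "democrat"      = 44856747,
  "independent"   = 19829462,
  "libertarian"   = 280848,
  "republican"    = 38798913,
  "right-to-life" = 127959
)

# Share of each party among the parties displayed
party_shares <- party_totals / sum(party_totals)
round(sort(party_shares, decreasing = TRUE), 3)
```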
Given the data at hand, think of a plot you would like to create to better visualize the information in vote. Think bigger than just bar plots, scatter plots, and line graphs. Later you will learn how to make all of the above plots and more. Type your answer below.