library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)
library(stringr)
library(readr)
url <- "http://dl.tufts.edu/file_assets/generic/tufts:MS115.003.001.00001/0"
if (!file.exists("all-votes.tsv")) {
  download.file(url, "nnv-all-votes.zip")
  unzip("nnv-all-votes.zip", files = "all-votes.tsv")
}
nnv <- read_tsv("all-votes.tsv")

names(nnv) <- names(nnv) %>% str_to_lower() %>% str_replace_all(" ", "_")

nnv <- nnv %>% 
  mutate(year = str_extract(date, "\\d{4}") %>% as.integer())

Exploring Multiple Elections in a Single Election Year

I started exploring the New Nation Votes dataset by filtering it down to elections in the state of Massachusetts. My initial intent was to explore the offices of Governor and Lieutenant Governor. I noticed early on that individuals were receiving votes for multiple offices during the same election year. For example, the returns for Samuel Adams in 1787 and 1788 show that, in addition to Lt. Governor, he also received votes for Governor, U.S. House of Representatives, and Electoral College.

nnv_ma <- nnv %>%
  filter(state == "Massachusetts") %>%
  filter(is.na(district) & is.na(town) & is.na(city) & is.na(county)) %>%
  distinct() %>%
  arrange(year)
nnv_ma %>% 
  filter(year==1787, name_id=="AS0022") %>% 
  ggplot(aes(x=office))+
  geom_bar(stat = "identity", aes(y=vote), fill="red") +
  geom_text(aes(label=vote, y=vote), position=position_dodge(width=0.9), vjust=-0.25)+
  labs(title="Samuel Adams' 1787 votes per Office", y="Total Votes", x='')+
  theme(axis.text.x =
               element_text(size  = 10,
                            angle = 45,
                            hjust = 1,
                            vjust = 1))

nnv_ma %>% 
  filter(year==1788, name_id=="AS0022") %>% 
  ggplot(aes(x=office))+
  geom_bar(stat = "identity", aes(y=vote), fill="#0072B2") +
  geom_text(aes(label=vote, y=vote), position=position_dodge(width=0.9), vjust=-0.25)+
  labs(title="Samuel Adams' 1788 votes per Office", y="Total Votes", x='')+
  theme(axis.text.x =
               element_text(size  = 10,
                            angle = 45,
                            hjust = 1,
                            vjust = 1))

Probing further into the political career of Samuel Adams, I wanted to know whether this was an isolated occurrence or whether he often received votes for multiple offices. I plotted his political career as a line graph (geom_line), with a separate color for each office. The resulting graph highlights the sharp increase in votes Samuel Adams received for both Lt. Governor and Governor in the 1790s, followed by a steep drop-off.
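The code for that career plot is not shown in this post. Here is a minimal sketch of how it could be drawn, assuming the same `year`, `office`, and `vote` columns used elsewhere in the analysis. The `career` data frame holds invented numbers purely so the example runs on its own; with the real data it would be replaced by something like `nnv_ma %>% filter(name_id == "AS0022")`.

```r
library(ggplot2)

# Invented values for illustration only; not taken from New Nation Votes.
career <- data.frame(
  year   = c(1790L, 1791L, 1792L, 1790L, 1791L, 1792L),
  office = rep(c("Governor", "Lieutenant Governor"), each = 3),
  vote   = c(10L, 40L, 25L, 30L, 60L, 20L),
  stringsAsFactors = FALSE
)

# One line per office, colored by office, with votes on the y axis.
career_plot <- ggplot(career, aes(x = year, y = vote, color = office)) +
  geom_line() +
  labs(title = "Votes per Office over Time", y = "Total Votes",
       x = "Year", colour = "Office")

print(career_plot)
```

Mapping `office` to `color` inside `aes()` is what splits the data into one line per office; no explicit grouping is needed when the grouping variable is discrete.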

Samuel Adams provides an interesting case, but I need to see the bigger picture for the state of Massachusetts. Again, was Samuel Adams’ political career an anomaly, or was it common for candidates to receive votes for multiple offices? How frequently did this occur, and was it increasing or decreasing over time? To answer these questions I needed to count the observations while filtering out null values and second, third, fourth, and later ballot counts. The results document which Massachusetts candidates received votes for multiple offices in a single election year, and for how many offices in that year.

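nnv_ma_count <- nnv_ma %>%
  # keep first-ballot returns only
  filter(iteration == "First Ballot") %>%
  # collapse to one row per candidate per election id
  count(name, name_id, id, year) %>%
  # drop rows whose candidate id is the literal string "null"
  filter(name_id != "null") %>%
  # tally per candidate and year, keeping those with two or more
  count(name, name_id, year) %>%
  filter(n >= 2)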
  
nnv_ma_count %>%
  ggplot(aes(x = n)) +
  geom_bar(stat = "count") +
  labs(title = "Total Number of Occurrences by Office", y = "Total Count", x = "Number of Offices")
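The counting step above can be sketched on a few invented rows. This is only an illustration under assumptions: the `toy` data frame and its values are made up, and the sketch counts distinct election ids and distinct offices explicitly with `n_distinct()` instead of calling `count()` twice. As far as I know, older dplyr releases treated an existing `n` column as a weight when `count()` was called a second time, so the explicit version may be easier to reason about across versions. It also shows how a count of elections can exceed the number of distinct offices when a candidate appears in more than one election for the same office.

```r
library(dplyr)

# Invented rows for illustration only: one row per candidate per election id.
toy <- data.frame(
  name_id = c("SA0001", "SA0001", "SA0001", "JB0001"),
  id      = c("e1", "e2", "e3", "e1"),
  year    = c(1790L, 1790L, 1790L, 1790L),
  office  = c("Governor", "U.S. House of Representatives",
              "U.S. House of Representatives", "Governor"),
  stringsAsFactors = FALSE
)

# Count distinct elections and distinct offices per candidate and year.
per_candidate <- toy %>%
  group_by(name_id, year) %>%
  summarise(
    n_elections = n_distinct(id),
    n_offices   = n_distinct(office)
  ) %>%
  ungroup()

# Keep candidate-years with two or more elections, as in the post.
multi <- per_candidate %>% filter(n_elections >= 2)
print(multi)
```

In this toy case the hypothetical candidate `SA0001` appears in three elections but only two distinct offices, which is one way a surprisingly high count could arise.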

The previous bar chart shows that candidates received votes for as many as seven offices in a single election year. That seems exceptionally high, as I am not even sure how many offices were up for election. I needed the specifics of each election in which these candidates received votes for multiple offices. By left joining my multi-election dataset back to the original, I am left with all the observations for the politicians I am most interested in. The New Nation Votes dataset contains different types of elections (General, Special, Legislative), which can skew my results. I plotted them on a bar chart to compare them against one another, then added a line chart to see their frequency over time.

multi_elections <- nnv_ma_count %>%
  left_join(nnv_ma, by = c("name","name_id", "year"))
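The join back to the full returns can be sketched on invented rows. This is an illustration under assumptions, with made-up `counts` and `votes` data frames standing in for `nnv_ma_count` and `nnv_ma`. Because the left table holds only the candidate-years of interest, the result keeps every matching row from the full table and carries the count column `n` along with it.

```r
library(dplyr)

# Invented stand-in for the per-candidate, per-year counts.
counts <- data.frame(
  name_id = "SA0001",
  year    = 1790L,
  n       = 2L,
  stringsAsFactors = FALSE
)

# Invented stand-in for the full table of returns.
votes <- data.frame(
  name_id = c("SA0001", "SA0001", "SA0001", "JB0001"),
  year    = c(1790L, 1790L, 1791L, 1790L),
  office  = c("Governor", "Lieutenant Governor", "Governor", "Governor"),
  stringsAsFactors = FALSE
)

# Keep only the returns for candidate-years present in `counts`.
joined <- counts %>%
  left_join(votes, by = c("name_id", "year"))

print(joined)
```

Only the two 1790 rows for the hypothetical `SA0001` survive. Note that any filter applied before counting (such as first ballot only) is not inherited by the joined rows and has to be applied again if it is still wanted.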

multi_elections %>% 
  ggplot(aes(x = type))+
    geom_bar(stat = "count")

multi_elections %>% 
  count(year, type) %>% 
  ggplot(aes(x = year))+
  geom_line(aes(y=n, color = type) )+
  labs(title="Frequency of Election Types over Time", y = "Total Number of Offices Voted For", x = 'Year', colour = "Election Type")

These multi-office elections were overwhelmingly the result of general elections. Furthermore, they appear to decline as the Republic establishes itself further into the nineteenth century. The spikes in the general election line indicate some sort of anomaly or an issue with my computation. That can be explored later. What I am interested in now is whether a candidate received a large number of votes for multiple offices or whether the votes were concentrated in one office with a smattering in others. In addition, if candidates received large vote totals for multiple offices, did they ever win more than one office in the same election year?

winner_multi_elections <- multi_elections %>% 
  filter(iteration == "First Ballot" & type == "General") %>% 
  ungroup() %>% 
  select(id) %>% 
  distinct() %>% 
  left_join(nnv_ma, by = c("id")) %>% 
  group_by(id) %>% 
  mutate(won = vote == max(vote, na.rm = TRUE)) %>% 
  filter(won)
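The winner rule used above (the highest vote total within each election id) can be sketched on invented rows. This is a simplified illustration with made-up names and vote totals, and it surfaces three properties of the rule worth keeping in mind: a tie flags every tied candidate as a winner, a missing vote total yields `NA` and is dropped by `filter()`, and an election that fills more than one seat would still flag only the top vote-getter.

```r
library(dplyr)

# Invented returns for illustration only.
toy_votes <- data.frame(
  id   = c("e1", "e1", "e1", "e2", "e2"),
  name = c("A. Smith", "B. Jones", "C. Brown", "A. Smith", "D. White"),
  vote = c(120L, 80L, NA, 50L, 50L),
  stringsAsFactors = FALSE
)

# Flag the top vote-getter(s) within each election id.
winners <- toy_votes %>%
  group_by(id) %>%
  mutate(won = vote == max(vote, na.rm = TRUE)) %>%
  filter(won) %>%
  ungroup()

print(winners)
```

Election `e2` returns two winners because of the tie, so any downstream count of wins per candidate should be read with that in mind.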

multi_winner <- winner_multi_elections %>% 
  count(name, name_id, year) %>% 
  ungroup() %>% 
  filter(n>=2) %>% 
  arrange(desc(n))

multi_winner %>% 
  print()
## Source: local data frame [7 x 4]
## 
##              name name_id  year     n
##             (chr)   (chr) (int) (int)
## 1      Abiel Wood  WA0060  1812     2
## 2 Abraham Lincoln  LA0041  1813     2
## 3 Abraham Lincoln  LA0041  1814     2
## 4      Elisha May  ME0023  1796     2
## 5     Jabez Upham  UJ0005  1806     2
## 6    Samuel Adams  AS0022  1794     2
## 7    Samuel Lyman  LS0018  1792     2

The table shows seven cases, involving six distinct candidates, in which a candidate won multiple offices in a single election year. Three fall in the 1790s, one in 1806, and three in the 1810s. What offices did they win?

multi_office_winners <- multi_winner %>% 
  left_join(winner_multi_elections, by = c("name", "name_id", "year")) %>% 
  select(name, year, office, id) %>% 
  ungroup() %>% 
  arrange(year) %>% 
  print()
## Source: local data frame [14 x 4]
## 
##               name  year                        office
##              (chr) (int)                         (chr)
## 1     Samuel Lyman  1792 U.S. House of Representatives
## 2     Samuel Lyman  1792 U.S. House of Representatives
## 3     Samuel Adams  1794           Lieutenant Governor
## 4     Samuel Adams  1794                      Governor
## 5       Elisha May  1796 U.S. House of Representatives
## 6       Elisha May  1796             Electoral College
## 7      Jabez Upham  1806 U.S. House of Representatives
## 8      Jabez Upham  1806      House of Representatives
## 9       Abiel Wood  1812 U.S. House of Representatives
## 10      Abiel Wood  1812             Electoral College
## 11 Abraham Lincoln  1813                     Moderator
## 12 Abraham Lincoln  1813      House of Representatives
## 13 Abraham Lincoln  1814      House of Representatives
## 14 Abraham Lincoln  1814                     Moderator
## Variables not shown: id (chr)

Some of the results make me question the state of the data. In 1792, Samuel Lyman won two different elections for the U.S. House of Representatives, which seems suspect. Abraham Lincoln (no, not that one) was elected to the House of Representatives and also as Moderator in both 1813 and 1814. Perhaps the most interesting co-election in this group is the 1794 election of Samuel Adams, who is shown as winning both the Governorship and the Lt. Governorship in that year. How was something like this resolved so early in the Republic? Did Adams choose one of the offices while the “runner-up” for the other office filled that role? Did Adams take on both positions (which seems unlikely)? The next step would be to do some close reading of sources to find the answers and then to expand this model to include all of the states. If something like this occurred in a different state, such as Pennsylvania or Virginia, a comparison of how the two states resolved the situation would be interesting and illuminating.