library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)
library(stringr)
library(readr)
url <- "http://dl.tufts.edu/file_assets/generic/tufts:MS115.003.001.00001/0"
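# Download and unzip the data only if the TSV is not already present.
if (!file.exists("all-votes.tsv")) {
  # mode = "wb" keeps the zip intact on Windows, since this URL has no .zip extension
  download.file(url, "nnv-all-votes.zip", mode = "wb")
  unzip("nnv-all-votes.zip", files = "all-votes.tsv")
}
nnv <- read_tsv("all-votes.tsv")
# Normalize column names: lower case, spaces replaced by underscores.
names(nnv) <- names(nnv) %>% str_to_lower() %>% str_replace_all("\\ ", "_")
# Pull the four-digit year out of the date string.
nnv <- nnv %>%
  mutate(year = str_extract(date, "\\d{4}") %>% as.integer())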
I started exploring the New Nation Votes dataset by filtering it down to elections in the state of Massachusetts. My initial intent was to explore the offices of Governor and Lieutenant Governor. I noticed early on that individuals were receiving votes for multiple offices in the same election year. For example, the 1787 and 1788 returns for Samuel Adams show that, alongside his votes for Lt. Governor, he also received votes for Governor, U.S. House of Representatives, and Electoral College.
# Keep Massachusetts rows that have no district, town, city, or county breakdown.
nnv_ma <- nnv %>%
  filter(state == "Massachusetts") %>%
  filter(is.na(district) & is.na(town) & is.na(city) & is.na(county)) %>%
  distinct() %>%
  arrange(year)
nnv_ma %>%
  filter(year == 1787, name_id == "AS0022") %>%
  ggplot(aes(x = office)) +
  geom_bar(stat = "identity", aes(y = vote), fill = "red") +
  geom_text(aes(label = vote, y = vote), position = position_dodge(width = 0.9), vjust = -0.25) +
  labs(title = "Samuel Adams' 1787 votes per Office", y = "Total Votes", x = '') +
  theme(axis.text.x = element_text(size = 10, angle = 45, hjust = 1, vjust = 1))
nnv_ma %>%
  filter(year == 1788, name_id == "AS0022") %>%
  ggplot(aes(x = office)) +
  geom_bar(stat = "identity", aes(y = vote), fill = "#0072B2") +
  geom_text(aes(label = vote, y = vote), position = position_dodge(width = 0.9), vjust = -0.25) +
  labs(title = "Samuel Adams' 1788 votes per Office", y = "Total Votes", x = '') +
  theme(axis.text.x = element_text(size = 10, angle = 45, hjust = 1, vjust = 1))
Probing further into the political career of Samuel Adams, I wanted to know whether this was an isolated occurrence or whether he often received votes for multiple offices. I decided to plot his political career as a line graph (geom_line), with the different offices separated by color. The resulting graph highlights the drastic increase in votes Samuel Adams received for both Lt. Governor and Governor in the 1790s, followed by an equally steep drop-off.
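The line graph described above could be built along the following lines. This is only a sketch: `plot_career()` is a hypothetical helper, the numbers in `career_toy` are invented for illustration and are not values from the dataset, and the `.groups` argument assumes dplyr 1.0 or later.

```r
library(dplyr)
library(ggplot2)

# Invented numbers for illustration only; these are NOT values from the dataset.
career_toy <- tibble(
  year   = c(1790L, 1791L, 1792L, 1790L, 1791L, 1792L),
  office = c("Governor", "Governor", "Governor",
             "Lieutenant Governor", "Lieutenant Governor", "Lieutenant Governor"),
  vote   = c(10L, 40L, 25L, 30L, 60L, 20L)
)

# Hypothetical helper: total the votes per year and office,
# then draw one colored line per office.
plot_career <- function(df) {
  df %>%
    group_by(year, office) %>%
    summarise(vote = sum(vote, na.rm = TRUE), .groups = "drop") %>%
    ggplot(aes(x = year, y = vote, color = office)) +
    geom_line() +
    geom_point() +
    labs(title = "Votes per office over time",
         x = "Year", y = "Total Votes", color = "Office")
}

career_plot <- plot_career(career_toy)
```

With the real data, something like `nnv_ma %>% filter(name_id == "AS0022") %>% plot_career()` should work, since `nnv_ma` carries the `year`, `office`, and `vote` columns used here.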
Samuel Adams provides an interesting case, but I need to see the bigger picture for the state of Massachusetts. Again, was Samuel Adams’ political career the anomaly, or was it common for candidates to receive votes for multiple offices? With what frequency did this occur, and was it increasing or decreasing over time? To answer these questions I needed to count up the observations while filtering out null values and any ballot after the first (second, third, fourth, and so on). The results document which Massachusetts candidates received votes for multiple offices in a single election year, and for how many offices in that year.
nnv_ma_count <- nnv_ma %>%
  filter(iteration == "First Ballot") %>%
  # First count: one row per candidate, election id, and year.
  count(name, name_id, id, year) %>%
  filter(name_id != "null") %>%
  # Second count: number of elections per candidate and year.
  count(name, name_id, year) %>%
  filter(n >= 2)
nnv_ma_count %>%
  ggplot(aes(x = n)) +
  geom_bar(stat = "count") +
  labs(title = "Total Number of Occurrences by Office", y = "Total Count", x = 'Number of Offices')
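One thing worth checking in the counting step is that counting election `id`s per candidate and year is not necessarily the same as counting distinct offices. The sketch below separates the two on a few invented rows (not taken from the dataset); it assumes that `id` identifies one election record, which is how the code above appears to use it, and that dplyr 1.0 or later is installed.

```r
library(dplyr)

# Invented rows for illustration only; not taken from the dataset.
toy <- tibble(
  name_id = c("A1", "A1", "A1", "B2", "B2", "null"),
  year    = c(1800L, 1800L, 1800L, 1800L, 1800L, 1800L),
  id      = c("e1", "e2", "e3", "e1", "e1", "e2"),
  office  = c("Governor", "Governor", "Senate", "Governor", "Governor", "Governor")
)

per_candidate <- toy %>%
  filter(name_id != "null") %>%          # drop unidentified candidates
  group_by(name_id, year) %>%
  summarise(
    n_elections = n_distinct(id),        # what the double count() measures
    n_offices   = n_distinct(office),    # what the prose describes
    .groups = "drop"
  )
```

Here candidate A1 appears in three elections but only two distinct offices, so a chart built on `n_elections` would overstate the number of offices for that candidate.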
The previous bar chart highlights that candidates received votes for as many as seven offices in a given election year. That seems exceptionally high, as I am not even sure how many offices were available for election. I needed the specifics of each of the elections in which these candidates received votes for multiple offices. By left joining my multi-election dataset back to the original, I am left with all the observations for the candidates I am most interested in. Now, the New Nation Votes data contains different types of elections (General, Special, Legislative), which can skew my results. I plotted the types on a bar chart to compare them against one another, then added a line chart to see their frequency over time.
multi_elections <- nnv_ma_count %>%
  left_join(nnv_ma, by = c("name", "name_id", "year"))
multi_elections %>%
  ggplot(aes(x = type)) +
  geom_bar(stat = "count")
multi_elections %>%
  count(year, type) %>%
  ggplot(aes(x = year)) +
  geom_line(aes(y = n, color = type)) +
  labs(title = "Frequency of Election Types over Time", y = "Total Number of Offices Voted For", x = 'Year', colour = "Election Type")
These multi-office elections were overwhelmingly a result of general elections. Furthermore, they appear to decline as the Republic establishes itself further into the nineteenth century. The spikes in the general election line indicate some sort of anomaly or an issue with my computation; that can be explored later. What I am interested in now is whether a candidate received a large number of votes for multiple offices, or whether the votes were concentrated in one office with a smattering in the others. In addition, if candidates received large vote totals for multiple offices, did they ever win more than one office in the same election year?
winner_multi_elections <- multi_elections %>%
  filter(iteration == "First Ballot" & type == "General") %>%
  ungroup() %>%
  select(id) %>%
  distinct() %>%
  left_join(nnv_ma, by = c("id")) %>%
  group_by(id) %>%
  mutate(won = vote == max(vote, na.rm = TRUE)) %>%
  filter(won)
multi_winner <- winner_multi_elections %>%
  # Drop the per-election grouping first so that current dplyr counts per candidate and year.
  ungroup() %>%
  count(name, name_id, year) %>%
  filter(n >= 2) %>%
  arrange(desc(n))
multi_winner %>%
print()
## Source: local data frame [7 x 4]
##
## name name_id year n
## (chr) (chr) (int) (int)
## 1 Abiel Wood WA0060 1812 2
## 2 Abraham Lincoln LA0041 1813 2
## 3 Abraham Lincoln LA0041 1814 2
## 4 Elisha May ME0023 1796 2
## 5 Jabez Upham UJ0005 1806 2
## 6 Samuel Adams AS0022 1794 2
## 7 Samuel Lyman LS0018 1792 2
The table shows seven cases, involving six distinct candidates, in which a candidate won multiple elections in a single election year (Abraham Lincoln appears in both 1813 and 1814). Three cases fall in the 1790s, one in 1806, and three in the 1810s. What offices did they win?
multi_office_winners <- multi_winner %>%
  left_join(winner_multi_elections, by = c("name", "name_id", "year")) %>%
  select(name, year, office, id) %>%
  ungroup() %>%
  arrange(year) %>%
  print()
## Source: local data frame [14 x 4]
##
## name year office
## (chr) (int) (chr)
## 1 Samuel Lyman 1792 U.S. House of Representatives
## 2 Samuel Lyman 1792 U.S. House of Representatives
## 3 Samuel Adams 1794 Lieutenant Governor
## 4 Samuel Adams 1794 Governor
## 5 Elisha May 1796 U.S. House of Representatives
## 6 Elisha May 1796 Electoral College
## 7 Jabez Upham 1806 U.S. House of Representatives
## 8 Jabez Upham 1806 House of Representatives
## 9 Abiel Wood 1812 U.S. House of Representatives
## 10 Abiel Wood 1812 Electoral College
## 11 Abraham Lincoln 1813 Moderator
## 12 Abraham Lincoln 1813 House of Representatives
## 13 Abraham Lincoln 1814 House of Representatives
## 14 Abraham Lincoln 1814 Moderator
## Variables not shown: id (chr)
Some of the results make me question the state of the data. In 1792, Samuel Lyman won two different elections for the U.S. House of Representatives, which seems suspect. Abraham Lincoln (no, not that one) was elected to the House of Representatives but was also elected Moderator in both 1813 and 1814.
Perhaps the most interesting co-election in this group is the 1794 election of Samuel Adams. He is shown as winning both the Governorship and the Lt. Governorship in that year. How was something like this resolved so early on in the Republic? Did Adams choose one of the offices, with the “runner-up” for the other office filling that role? Did Adams take on both positions (which seems unlikely)? The next step would be to do some close reading of sources to find the answers, and then to expand this model to include all of the states. If something like this occurred in a different state such as Pennsylvania or Virginia, a comparison of how the two states resolved the situation would be interesting and illuminating.
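The Samuel Lyman case can be separated from the others mechanically by checking whether a candidate's wins in a year are for distinct offices. The sketch below is one way to do that, using only the name, year, and office values transcribed from the printed table above (the `id` column is not shown there, so it is left out); `office_check` and `same_office_twice` are names introduced here for illustration, and dplyr 1.0 or later is assumed.

```r
library(dplyr)

# Rows transcribed from the printed table above (name, year, office only).
wins <- tibble(
  name = c("Samuel Lyman", "Samuel Lyman", "Samuel Adams", "Samuel Adams",
           "Elisha May", "Elisha May", "Jabez Upham", "Jabez Upham",
           "Abiel Wood", "Abiel Wood", "Abraham Lincoln", "Abraham Lincoln",
           "Abraham Lincoln", "Abraham Lincoln"),
  year = c(1792L, 1792L, 1794L, 1794L, 1796L, 1796L, 1806L, 1806L,
           1812L, 1812L, 1813L, 1813L, 1814L, 1814L),
  office = c("U.S. House of Representatives", "U.S. House of Representatives",
             "Lieutenant Governor", "Governor",
             "U.S. House of Representatives", "Electoral College",
             "U.S. House of Representatives", "House of Representatives",
             "U.S. House of Representatives", "Electoral College",
             "Moderator", "House of Representatives",
             "House of Representatives", "Moderator")
)

# Flag candidate-years where the number of wins exceeds the number of distinct offices.
office_check <- wins %>%
  group_by(name, year) %>%
  summarise(wins = n(), distinct_offices = n_distinct(office), .groups = "drop") %>%
  mutate(same_office_twice = distinct_offices < wins)
```

Only the Samuel Lyman row for 1792 is flagged, which suggests looking at the underlying election records for that year before treating it as a true multi-office win.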