
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)
library(stringr)
library(readr)
url <- "http://dl.tufts.edu/file_assets/generic/tufts:MS115.003.001.00001/0"
if (!file.exists("all-votes.tsv")) {
  download.file(url, "nnv-all-votes.zip")
  unzip("nnv-all-votes.zip", files = "all-votes.tsv")
}
nnv <- read_tsv("all-votes.tsv")

names(nnv) <- names(nnv) %>% str_to_lower() %>% str_replace_all("\\ ", "_")
  1. What kinds of elections were there?
nnv$type %>% 
unique()
## [1] "General"             "Legislative"         "Special"            
## [4] "Legislastive"        "Special Legislative" "Special Election"

The dataset records six distinct values for election type, but only five of them are genuine types: there is a transcription error that lists “Legislative” as “Legislastive.”
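One way to fold the misspelled value back into its intended category is sketched below. It runs on a small made-up data frame (`toy`, a hypothetical stand-in for `nnv`) and assumes that “Legislastive” is the only misspelling that needs fixing.

```r
library(dplyr)

# Hypothetical stand-in for nnv; only the `type` column matters here.
toy <- data.frame(
  type = c("General", "Legislative", "Legislastive", "Special", "Legislastive"),
  stringsAsFactors = FALSE
)

# Replace the misspelled value and leave every other type untouched.
toy_fixed <- toy %>%
  mutate(type = if_else(type == "Legislastive", "Legislative", type))

# Count the types again to confirm the misspelling is gone.
toy_fixed %>% count(type)
```

Applying the same `mutate()` to `nnv` before counting would merge the 11 misspelled rows into the “Legislative” total.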

  1. How many of each kind of election?
nnv %>% count(type)
## Source: local data frame [6 x 2]
## 
##                  type      n
##                 (chr)  (int)
## 1             General 620179
## 2        Legislastive     11
## 3         Legislative  24576
## 4             Special  14770
## 5    Special Election     14
## 6 Special Legislative     75
ggplot(nnv, aes(x = type)) +
  geom_bar(stat = "count")

  1. How many candidates and how often do they appear?
nnv_clean_nameID <- nnv%>%
  filter(name_id!="null")

nnv_clean_nameID$name_id %>% 
  unique() %>% 
  length()
## [1] 33761
nnv %>% count(name, name_id) %>% 
  ungroup() %>% 
  arrange(desc(n))
## Source: local data frame [41,738 x 3]
## 
##                 name name_id     n
##                (chr)   (chr) (int)
## 1       Caleb Strong  SC0023  6282
## 2   William Phillips  PW0081  6136
## 3     Elbridge Gerry  GE0049  5420
## 4             others    null  3993
## 5     James Sullivan  SJ0366  3930
## 6      William Heath  HW0171  3910
## 7  Edward H. Robbins  RE0018  3693
## 8         scattering    null  3650
## 9       Samuel Adams  AS0022  3390
## 10        Moses Gill  GM0022  3298
## ..               ...     ...   ...

After removing the “null” values from the name_id column, there are 33,761 distinct candidate IDs over the timespan of this dataset. The most common individual in the dataset is “Caleb Strong,” with 6,282 observations.

  1. Which parties?
nnv_aff<- nnv %>% 
  filter(affiliation != "null")

nnv_aff$affiliation %>% 
  unique()
##   [1] "Federalist"                                                  
##   [2] "Democrat"                                                    
##   [3] "Republican"                                                  
##   [4] "Democratic"                                                  
##   [5] "Pro-Slavery"                                                 
##   [6] "Restrictionist"                                              
##   [7] "Anti-Restrictionist"                                         
##   [8] "Coalition"                                                   
##   [9] "Federal"                                                     
##  [10] "Republicans"                                                 
##  [11] "Federalists"                                                 
##  [12] "Clintonian"                                                  
##  [13] "Bucktail"                                                    
##  [14] "Quid"                                                        
##  [15] "Lewis Ticket"                                                
##  [16] "Clinton Ticket"                                              
##  [17] "Lewisite"                                                    
##  [18] "Anti-Division"                                               
##  [19] "Division"                                                    
##  [20] "Administration"                                              
##  [21] "Anti-Administration"                                         
##  [22] "Administration Ticket"                                       
##  [23] "Tammany Ticket"                                              
##  [24] "Clintonian Ticket"                                           
##  [25] "Anti-Federal"                                                
##  [26] "Anti-Federalist"                                             
##  [27] "Clintonian / Federal"                                        
##  [28] "Federalist/Clintonian"                                       
##  [29] "Independent/Federalist"                                      
##  [30] "Indepedent/Federalist"                                       
##  [31] "1st Ticket Democratic Republican"                            
##  [32] "2nd Ticket Democratic Republican"                            
##  [33] "Clinton"                                                     
##  [34] "Republican/Clintonian"                                       
##  [35] "Clintonian/Federal"                                          
##  [36] "Clintonian/Federalist"                                       
##  [37] "High Federalist"                                             
##  [38] "Low Federalist"                                              
##  [39] "High Republican"                                             
##  [40] "Low Republican"                                              
##  [41] "Others"                                                      
##  [42] "Union"                                                       
##  [43] "Independent"                                                 
##  [44] "The People's Ticket"                                         
##  [45] "King Caucus Ticket"                                          
##  [46] "People's Candidate"                                          
##  [47] "People's Ticket"                                             
##  [48] "King Caucus"                                                 
##  [49] "Republican/Federalist"                                       
##  [50] "Anti-Federalist (Republican)"                                
##  [51] "Union of Parties"                                            
##  [52] "Federal / Clintonian"                                        
##  [53] "Federal/Clintonian"                                          
##  [54] "Anti-Clintonian"                                             
##  [55] "Anti-Clinton"                                                
##  [56] "nul"                                                         
##  [57] "Anti Federal"                                                
##  [58] "Republican / Anti Federalist"                                
##  [59] "Republican / Anti Fedealist"                                 
##  [60] "Republican/ Anti Federalist"                                 
##  [61] "Republican/Anti-Federalist"                                  
##  [62] "Anti-Caucus"                                                 
##  [63] "Caucus"                                                      
##  [64] "Opposition Republican"                                       
##  [65] "No Jew-Bill Ticket"                                          
##  [66] "Jew-Bill Ticket"                                             
##  [67] "Federalist / I"                                              
##  [68] "Opposition"                                                  
##  [69] "Federalist/Opposition"                                       
##  [70] "Caucus Ticket"                                               
##  [71] "Anti-Caucus Ticket"                                          
##  [72] "Moderates"                                                   
##  [73] "Violents"                                                    
##  [74] "Jacksonian"                                                  
##  [75] "Adamite"                                                     
##  [76] "Crawfordite"                                                 
##  [77] "Pig Point or Lower Candidates"                               
##  [78] "Choptank Bridge or Upper Candidates"                         
##  [79] "Chesapeake"                                                  
##  [80] "Potomac"                                                     
##  [81] "Democrat/Republican"                                         
##  [82] "The Federal Republican Ticket"                               
##  [83] "Crawford"                                                    
##  [84] "Jackson"                                                     
##  [85] "Jeffersonian"                                                
##  [86] "Bank Tax Ticket"                                             
##  [87] "Anti-Bank Tax Ticket"                                        
##  [88] "Independent Republican"                                      
##  [89] "War Ticket"                                                  
##  [90] "Peace Ticket"                                                
##  [91] "Federal Republican"                                          
##  [92] "American"                                                    
##  [93] "French"                                                      
##  [94] "Peacemaker"                                                  
##  [95] "Warhawk"                                                     
##  [96] "Pro-Administration"                                          
##  [97] "Federalist/Union"                                            
##  [98] "Peace"                                                       
##  [99] "Anti-Relief"                                                 
## [100] "Relief"                                                      
## [101] "For New Election"                                            
## [102] "Against New Election"                                        
## [103] "Constitutionalist"                                           
## [104] "Court"                                                       
## [105] "Country"                                                     
## [106] "New-Electionist"                                             
## [107] "Williamsite"                                                 
## [108] "Anti-Slavery"                                                
## [109] "opp. Republican"                                             
## [110] "Federalist and Republican"                                   
## [111] "Federal/Quid"                                                
## [112] "Tammany"                                                     
## [113] "Anti-Tammany"                                                
## [114] "Democratic Republican"                                       
## [115] "Opposition/Federal"                                          
## [116] "Republican/Quid"                                             
## [117] "Minority"                                                    
## [118] "Federalist/Quid"                                             
## [119] "Republican / Federalist"                                     
## [120] "Anti-Tolerationist"                                          
## [121] "Tolerationist"                                               
## [122] "Anit-Republican"                                             
## [123] "Jacobin"                                                     
## [124] "Democrats"                                                   
## [125] "Federal/Independent Republican"                              
## [126] "Leibite"                                                     
## [127] "Whig"                                                        
## [128] "Leib"                                                        
## [129] "Constitutionalist/Federalist"                                
## [130] "Federalist/Constitutionalist"                                
## [131] "Constitutional/Federalist"                                   
## [132] "Constutionalist"                                             
## [133] "Constitutionalsit"                                           
## [134] "Repblican"                                                   
## [135] "Hiester ticket"                                              
## [136] "Findlay ticket"                                              
## [137] "Hiesterites"                                                 
## [138] "Anti-Republican"                                             
## [139] "Tory"                                                        
## [140] "Federalist Republican"                                       
## [141] "Federal Republicans"                                         
## [142] "Quid/Federalist"                                             
## [143] "Constitutionalist/Republican"                                
## [144] "Friends of Peace"                                            
## [145] "British"                                                     
## [146] "Republican Danville"                                         
## [147] "Republican Bloomsburg"                                       
## [148] "Old School Republican"                                       
## [149] "Republican - New School"                                     
## [150] "Republican - Old School"                                     
## [151] "New School Democrat"                                         
## [152] "Old School Democrats"                                        
## [153] "Federlaist"                                                  
## [154] "Conventionalist"                                             
## [155] "Constitutional"                                              
## [156] "Federalist/Republican"                                       
## [157] "First Democratic Ticket"                                     
## [158] "Second Democratic Ticket"                                    
## [159] "Friends of the Constitution"                                 
## [160] "Snyderites or Jacobins"                                      
## [161] "American Republican"                                         
## [162] "Oppositional Republican"                                     
## [163] "Federal/Constitutional"                                      
## [164] "New School"                                                  
## [165] "Old School"                                                  
## [166] "Constitutional/Federal"                                      
## [167] "Ticket in favor of the division of old Northumberland County"
## [168] "Second Republican Ticket"                                    
## [169] "Non-Descripts"                                               
## [170] "Snyderites"                                                  
## [171] "Old School Democrat"                                         
## [172] "Democrat/Old School"                                         
## [173] "Republican-Democratic Delegation"                            
## [174] "Federal-Independent Democratic Delegation"                   
## [175] "Findlayite"                                                  
## [176] "Binnite"                                                     
## [177] "Federalist/Constitutional"                                   
## [178] "Federal/Constitutionalist"
  1. What does each row in the dataset represent? Each row represents the total votes for a candidate in a specific election, in a specific year, for a specific region. However, the compilers of this dataset also chose to calculate total votes for elections at various geographic levels, so the same votes are repeated as new observations.
  1. Which years are in the dataset, and how many elections are there in each year?
nnv <- nnv %>% 
  mutate(year = str_extract(date, "\\d{4}") %>% as.integer())

nnv %>% 
  group_by(year) %>% 
  summarise(unique_elements = n_distinct(id))
## Source: local data frame [40 x 2]
## 
##     year unique_elements
##    (int)           (int)
## 1   1787              69
## 2   1788             159
## 3   1789             186
## 4   1790             168
## 5   1791             156
## 6   1792             194
## 7   1793             167
## 8   1794             243
## 9   1795             202
## 10  1796             357
## ..   ...             ...
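
Because the same votes are recorded again at each geographic level, as noted in the answer on what each row represents, summing the `vote` column across all rows would count them more than once. Below is a minimal sketch of one way to keep a single level. The data frame `toy` is made up, and treating rows with no county or town as the statewide totals is an assumption, not something confirmed by the dataset's documentation.

```r
library(dplyr)

# Hypothetical rows: one statewide total plus the town-level rows behind it.
toy <- data.frame(
  name   = c("A. Candidate", "A. Candidate", "A. Candidate"),
  county = c(NA, "Essex", "Essex"),
  town   = c(NA, "Salem", "Lynn"),
  vote   = c(100, 60, 40),
  stringsAsFactors = FALSE
)

# Summing every row counts each vote twice.
naive_total <- sum(toy$vote)

# Keep only rows with no county or town (assumed to be statewide totals).
statewide <- toy %>%
  filter(is.na(county) & is.na(town))

statewide_total <- sum(statewide$vote)
```

The Massachusetts subset in Part Two applies the same idea across `district`, `town`, `city`, and `county`.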
  1. Which states are represented?
nnv_clean_state<-nnv %>% 
  filter(state != "null")

nnv_clean_state$state %>% 
  unique()
##  [1] "Vermont"        "Missouri"       "New Hampshire"  "Tennessee"     
##  [5] "New York"       "NY"             "Alabama"        "Maryland"      
##  [9] "Indiana"        "Georgia"        "New Jersey"     "Massachusetts" 
## [13] "Rhode Island"   "Maine"          "North Carolina" "Kentucky"      
## [17] "Mississippi"    "Illinois"       "Louisiana"      "Ohio"          
## [21] "South Carolina" "Virginia"       "Delaware"       "Connecticut"   
## [25] "Pennsylvania"

Part Two: Exploring the Dataset

I started the exploratory portion of the assignment by filtering the dataset down to just Massachusetts. As you had shown us some visualizations for Governor, I decided to look at the Lieutenant Governor position. I noticed that Samuel Adams received votes for that office in 1787 and 1788, but that he also received votes for other offices (Governor, U.S. House of Representatives, Electoral College).

nnv_ma <-nnv %>%
    filter(state == "Massachusetts") %>% 
    filter(is.na(district) & is.na(town)& is.na(city)& is.na(county)) %>% 
    arrange(year)

nnv_ma %>% 
  filter(year==1787, name_id=="AS0022") %>% 
  ggplot(aes(x=office))+
  geom_bar(stat = "identity", aes(y=vote), fill="red") +
  geom_text(aes(label=vote, y=vote), position=position_dodge(width=0.9), vjust=-0.25)+
  labs(title="Samuel Adams' 1787 votes per Office", y="Total Votes", x='')+
  theme(axis.text.x =
               element_text(size  = 10,
                            angle = 45,
                            hjust = 1,
                            vjust = 1))

I decided to use geom_line to map out Samuel Adams’ political career. The graph shows a drastic increase in votes for both “Lieutenant Governor” and “Governor.” Wherever a point appears, he received at least one vote for that office.
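
A line chart of this kind could be built along the following lines. This is a minimal sketch, not the code that produced the original graph: the vote counts are invented and live in a hypothetical `toy_career` data frame rather than the real `nnv_ma` rows.

```r
library(dplyr)
library(ggplot2)

# Invented numbers for illustration only; not taken from the dataset.
toy_career <- data.frame(
  year   = c(1787, 1788, 1789, 1787, 1788, 1789),
  office = rep(c("Governor", "Lieutenant Governor"), each = 3),
  vote   = c(10, 25, 60, 40, 80, 150),
  stringsAsFactors = FALSE
)

# One line per office, with a point wherever votes were recorded.
career_plot <- toy_career %>%
  ggplot(aes(x = year, y = vote, colour = office)) +
  geom_line() +
  geom_point() +
  labs(title = "Votes per office by year (toy data)",
       x = "Year", y = "Total Votes", colour = "Office")

career_plot
```

With the real data, the input would be the rows of `nnv_ma` where `name_id == "AS0022"`, with `year`, `vote`, and `office` mapped the same way.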

Following this line of inquiry further, I wanted to know how many individuals, within the state of Massachusetts and the dataset's date range, received votes for multiple political offices in the same election year.

nnv_ma_count <- nnv_ma %>%  count(name, name_id, office, year) %>%
  filter(name_id != "null") %>% 
  count(name, name_id, year) %>% 
  filter(n>=2)
  
nnv_ma_count %>%  
  ggplot(aes(x = n))+
  geom_bar(stat = "count")+  
  labs(title="Total Number of Occurrences by Office", y="Total Count", x='Number of Offices')

This shows the number of candidates who received votes for 2, 3, 4, or 5 offices in a given election year.
