library(png)
library(grid)
library(dplyr)
library(ggplot2)
library(RColorBrewer)
library(ggthemes)
library(maps)
library(mapdata)

states <- map_data("state")
dim(states)

## [1] 15537     6

url <- "https://docs.google.com/spreadsheets/d/1RJqUjwhiKblNJutnxUH96JQjttC-QL2hV9SH6f9FpRU/pub?output=csv"
election.data <- read.csv(url)

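# Add a lowercase `region` key so state names match those in map_data("state")
election.data.ave <- election.data %>%
  mutate(region = tolower(State))

# Join explicitly on `region`; map regions absent from the spreadsheet get NA
states1 <- left_join(states, election.data.ave, by = "region")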

us.base <- ggplot(data = states1, 
                  mapping = aes(x = long, y = lat, 
                  group = group, fill = PerTrump)) + 
  coord_fixed(1.3) + 
  geom_polygon(color = "black")

us.base + theme_map() + 
  scale_fill_continuous(name="Percent", 
            low = "white", high = "red",  
            breaks=c(44, 46, 48, 50, 52)) + 
ggtitle("Percent of Trump Support in Swing States")

The main takeaway of this graph is how well Donald Trump did in swing states. Now that all the verified data is in, Trump won these 15 states by an average of 5.4 points, which identifies where the election was won. The reader should focus on what is called the Rust Belt, or in simpler terms, the Midwest. Conventional political wisdom held that the Midwest was fairly safe for a Democrat, with most of its states breaking for Democrats. This graph shows the opposite, which marks a sharp departure from that expectation. The point of this set of graphs is to identify factors that help explain the election results, and Trump's level of support in swing states is a good factor for discerning where the election was won or lost.

This graphic was created with the US map template from map_data("state"). We added a lowercase region column to the election data so that state names would match the map data, and then joined the map and election data to complete the graph. The fill runs on a gradient from white to red, with legend breaks from 44 to 52, so that the legend makes clear which states were most Trump-leaning and which were most Clinton-leaning.
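The join described above depends on the lowercase state names matching the map's region names exactly. The following is a minimal sketch of that step, assuming toy data: `toy.map` and `toy.election` are hypothetical stand-ins for `map_data("state")` and the election spreadsheet, and all values are made up.

```r
library(dplyr)

# Hypothetical stand-in for map_data("state"): one row per polygon vertex.
toy.map <- data.frame(
  region = c("ohio", "ohio", "iowa", "kansas"),
  long   = c(-83.0, -82.5, -93.5, -98.0),
  lat    = c(40.0, 40.5, 42.0, 38.5),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the election spreadsheet (made-up percentages).
toy.election <- data.frame(
  State    = c("Ohio", "Iowa"),
  PerTrump = c(51, 52),
  stringsAsFactors = FALSE
)

# Build the lowercase key, then join on it explicitly.
toy.keyed  <- toy.election %>% mutate(region = tolower(State))
toy.joined <- left_join(toy.map, toy.keyed, by = "region")

# Map regions with no election row keep their polygons but get NA fill values.
unmatched <- setdiff(toy.map$region, toy.keyed$region)
unmatched
```

Because the map is on the left side of the join, every polygon vertex is kept, and states outside the spreadsheet are simply drawn with the default NA fill.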

image <- png::readPNG("~/Downloads/Iowa.png")
g <- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Maine.png")
h<- rasterGrob(image, interpolate=TRUE)
  
image <- png::readPNG("~/Downloads/Ohio.png")
i<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/michigan.png")
j<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Wisconsin.png")
k<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Minnesota.png")
l<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Pennsylvania.png")
m<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/NewHampshire.png")
n<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Nevada.png")
o<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Florida.png")
p<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/NorthCarolina.png")
q<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Colorado.png")
r<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Georgia.png")
s<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Arizona.png")
t<- rasterGrob(image, interpolate=TRUE)

image <- png::readPNG("~/Downloads/Texas.png")
u<- rasterGrob(image, interpolate=TRUE)

election.data <- transform(election.data, State = factor(State,
                                        levels = c("Iowa", "Maine", "Ohio", "Michigan","Wisconsin", "Minnesota", "Pennsylvania", "New Hampshire", "Nevada", "Florida", "North Carolina", "Colorado", "Georgia", "Arizona", "Texas")))

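ggplot(election.data, aes(State, Margin.Shift)) +
  geom_point() +
  # Set the colour as a constant outside aes() so the line is actually drawn in red
  geom_hline(yintercept = 0, colour = "red") +
  # ymin must be the lower bound and ymax the upper bound
  annotation_custom(g, xmin = 0, xmax = 2, ymin = -15.5, ymax = -13.5) +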
annotation_custom(h, xmin=1.5, xmax=2.5, ymin=-14, ymax=-12) +
annotation_custom(i, xmin=2.5, xmax=3.5, ymin= -12, ymax= -10) +
annotation_custom(j, xmin=3.5, xmax=4.5, ymin=-11, ymax=-8.5)  +
annotation_custom(k, xmin= 4.5, xmax=5.5, ymin=-8.5, ymax=-6.5) +
annotation_custom(l, xmin=5.5, xmax=6.5, ymin=-7, ymax= -5) +
annotation_custom(m, xmin=6.5, xmax=7.5, ymin=-6.9, ymax= -4.9) +
annotation_custom(n, xmin=7.5, xmax= 8.5, ymin=-6, ymax= -4 ) +
annotation_custom(o,xmin=8.5, xmax=9.5, ymin= -5.2, ymax= -3.2) +
annotation_custom(p, xmin=9.5, xmax=10.5, ymin=-3.8, ymax= -1.8) +
annotation_custom(q, xmin= 10.5, xmax=11.5, ymin=-3, ymax= -1 ) +
annotation_custom(r, xmin= 11.5, xmax=12.5, ymin=-1, ymax=1)  +
annotation_custom(s, xmin=12.5, xmax=13.5, ymin=1.5, ymax= 3.5) +
annotation_custom(t, xmin=13.5, xmax=14.5, ymin= 4.2, ymax=6.2) +
annotation_custom(u, xmin=14.5, xmax=15.5, ymin=5.5, ymax=7.5) +
  ggtitle("State Margin Shift 2012 vs. 2016") +
  ylab("Margin Shift Toward Clinton") +
  theme_few() +
  ylim(-16,16) +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  theme(legend.position = "none")

This graph shows the change in the presidential vote margin in each state between 2012 and 2016. The more negative the number, the more Republican the state has become; the more positive the number, the more Democratic. The states were arranged in order from the largest margin shift toward Republicans to the largest margin shift toward Democrats, and images of the states were added over their respective points. The graph illustrates how Donald Trump was able to achieve significantly higher vote proportions in almost all swing states, especially those in the Midwest, and that Hillary Clinton was only able to improve on Obama’s margins in states that Trump was still able to win easily.

This graph was created by plotting states on the X-axis and margin shift on the Y-axis. For the states to appear in the correct order, State was converted to a factor whose levels were listed in the order they should appear on the graph. Once the states appeared in order from the most Republican margin change to the most Democratic, images were added to the graph using the “png” and “grid” packages. An image for each state was downloaded, read into R with readPNG from the png package, converted to a graphical object with rasterGrob from the grid package, and placed on the graph with annotation_custom. Titles and labels were added to the graph, and the text on the X-axis was tilted 45 degrees.
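The factor levels above were typed out by hand. As an alternative, the same ordering could be derived from the data itself. This is only a sketch under assumptions: `toy.shift` is a hypothetical three-state data frame with made-up margin shifts, standing in for election.data.

```r
# Hypothetical stand-in for election.data (made-up margin shifts).
toy.shift <- data.frame(
  State        = c("Texas", "Iowa", "Colorado"),
  Margin.Shift = c(6.5, -14.5, 0),
  stringsAsFactors = FALSE
)

# Order the factor levels from the most negative shift to the most positive,
# so that ggplot places the states along the X-axis in that order.
toy.shift$State <- factor(
  toy.shift$State,
  levels = toy.shift$State[order(toy.shift$Margin.Shift)]
)

levels(toy.shift$State)
```

Deriving the levels this way keeps the axis order correct if the underlying margins are updated, at the cost of no longer being able to fix the order by hand.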

url <- "https://docs.google.com/spreadsheets/d/1RJqUjwhiKblNJutnxUH96JQjttC-QL2hV9SH6f9FpRU/pub?output=csv"
election.data <- read.csv(url)
election.data <- election.data %>%
   mutate(Total.12.Votes = as.numeric(gsub(",", "", Total.12.Votes)),
          Total.16.Votes = as.numeric(gsub(",", "", Total.16.Votes)))

df.12 <- election.data %>% 
  select(State, Total.12.Votes, StateType) %>% 
  mutate(Year = 2012) %>%
  rename(Total.Votes = Total.12.Votes)

df.16 <- election.data %>% 
  select(State, Total.16.Votes, StateType) %>% 
  mutate(Year = 2016) %>%
  rename(Total.Votes = Total.16.Votes)

df.both <- rbind(df.12, df.16)

library(ggrepel)
# Divide by 1e6 so the plotted values match the "Millions" axis label
ggplot(df.both, aes(Year, Total.Votes / 1e6, color = State)) + 
  geom_point() + 
  geom_line() + 
  # Label each line to the left of its 2012 point
  geom_text(data = filter(df.both, Year == 2012), aes(x = 2011, y = Total.Votes / 1e6, label = State)) + 
  xlim(2010, 2016) + 
  ylab("Total Votes (Millions)") +
  ggtitle("Voter Turnout 2012 vs. 2016 by State") +
  theme(legend.position = "none")

This graph shows the change in the number of raw votes cast in each state from 2012 to 2016. If the line for a state goes up, turnout has gone up; if the line goes down, turnout has gone down. The graph shows that turnout stayed the same or increased in most states. Because this higher turnout did not translate into a victory for Clinton, it is evidence against the idea that higher turnout is beneficial to Democrats.

To create this graph, both the 2012 and 2016 vote total variables were stripped of commas with gsub and converted to numeric with as.numeric. Two data frames were created, one holding the State and 2012 vote total variables and one holding the State and 2016 vote total variables. In both, the vote total was renamed “Total.Votes” and a Year column was added, so that when the data frames are stacked with rbind both years can appear on the graph. Once the data frames are combined, the graph is created using ggplot in a standard format, with each state labeled and given its own color, while the legend is removed.
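The comma stripping and stacking steps can be checked on a small example. The sketch below assumes a hypothetical data frame `toy.votes` with made-up totals and a helper `to.number` that is not part of the original code; it produces the same long layout (State, Year, Total.Votes) that the graph uses.

```r
library(dplyr)

# Hypothetical stand-in for election.data (made-up vote totals stored as text).
toy.votes <- data.frame(
  State          = c("Iowa", "Ohio"),
  Total.12.Votes = c("1,500,000", "5,000,000"),
  Total.16.Votes = c("1,600,000", "5,100,000"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop thousands separators, then convert to numeric.
to.number <- function(x) as.numeric(gsub(",", "", x))

# One block of rows per year, stacked into a single long data frame.
toy.long <- bind_rows(
  toy.votes %>% transmute(State, Year = 2012, Total.Votes = to.number(Total.12.Votes)),
  toy.votes %>% transmute(State, Year = 2016, Total.Votes = to.number(Total.16.Votes))
)

toy.long
```

Each state ends up with one row per year, which is what lets geom_line connect its 2012 and 2016 points.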

# Keep the seven states used in the early-voting graphs
election.data1 <- election.data %>%
  filter(State %in% c("Nevada", "North Carolina", "Maine", "Iowa",
                      "Florida", "Colorado", "Arizona"))

dem.early <- election.data1 %>% 
  select(State, PerDemEVote) %>%
  mutate(Type = "DemEarly") %>%
  rename(Percent = PerDemEVote)

rep.early <- election.data1 %>% 
  select(State, PerRepEVote) %>%
  mutate(Type = "RepEarly") %>%
  rename(Percent = PerRepEVote)

ind.early <- election.data1 %>% 
  select(State, PerIndEVote) %>%
  mutate(Type = "IndEarly") %>%
  rename(Percent = PerIndEVote)

other.early <- election.data1 %>% 
  select(State, PerOtherEVote) %>%
  mutate(Type = "OtherEarly") %>%
  rename(Percent = PerOtherEVote)

early.both <- rbind(dem.early, rep.early, ind.early, other.early)

ggplot(early.both, aes(State, Percent, fill = Type)) +
  geom_bar(stat = "identity") +
  ylab("Early Vote Percent Share") +
  ggtitle("Early Voting by Party Identification") +
  scale_fill_manual("Early Voting Types", values=c("mediumblue", "purple3", "slategray", "red2"))

The key takeaway of this graphic is that the party breakdown of early voters is fairly consistent across these swing states, yet the states did not turn out nearly the same way. Moreover, the Democratic share in this graph would suggest that most of these states should have gone Democratic if early voting were representative, and that was not always the case. This demonstrates that early-vote party identification is not a consistent metric for describing what happened in this election. The reader should understand that Democrat, Republican, Independent, or Other is only what each voter identifies as. It gives no data on whom they actually voted for, so this metric is hard to use to give definitive answers about what happened in this election.

To create this graph, we first filtered election.data for the seven states we wanted, saving the result as election.data1. We then built one data frame for each of the four early voting types so that the result would be a stacked bar chart rather than a plain bar graph. For each type, we selected State and the matching variable from our data, added a Type label, and renamed the variable to Percent. We then bound the four data frames together with rbind as early.both and used it as the data for our graph. Finally, we added the appropriate title, labels, and colors.
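The four select, mutate, and rename steps follow the same pattern, so they could also be written as a loop over the column names. The sketch below assumes a hypothetical data frame `toy.early` with made-up percentages, and it adds a check that the shares in each state sum to 100, which a stacked bar chart of percentages implicitly relies on.

```r
library(dplyr)

# Hypothetical stand-in for election.data1 (made-up early-vote shares).
toy.early <- data.frame(
  State         = c("Nevada", "Iowa"),
  PerDemEVote   = c(42, 41),
  PerRepEVote   = c(36, 35),
  PerIndEVote   = c(20, 22),
  PerOtherEVote = c(2, 2),
  stringsAsFactors = FALSE
)

# Map each Type label to the column that holds its share.
share.cols <- c(DemEarly = "PerDemEVote", RepEarly = "PerRepEVote",
                IndEarly = "PerIndEVote", OtherEarly = "PerOtherEVote")

# Build one long data frame per type, then stack them.
toy.stacked <- bind_rows(lapply(names(share.cols), function(type) {
  data.frame(State   = toy.early$State,
             Percent = toy.early[[share.cols[[type]]]],
             Type    = type,
             stringsAsFactors = FALSE)
}))

# Sanity check: the stacked shares should total 100 in every state.
totals <- toy.stacked %>%
  group_by(State) %>%
  summarise(Total = sum(Percent))

totals
```

If a state's total falls well short of 100, its bar will be visibly shorter than the others, which is worth knowing before reading the chart.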

# Keep the same seven states as in the previous graph
election.data1 <- election.data %>%
  filter(State %in% c("Nevada", "North Carolina", "Maine", "Iowa",
                      "Florida", "Colorado", "Arizona"))

dem.overall <- election.data1 %>% 
  select(State, PerClinton) %>%
  mutate(Type = "Overall", Lean = "Democratic") %>%
  rename(Percent = PerClinton)

dem.early <- election.data1 %>% 
  select(State, PerDemEVote) %>%
  mutate(Type = "Early", Lean = "Democratic") %>%
  rename(Percent = PerDemEVote)

rep.overall <- election.data1 %>% 
  select(State, PerTrump) %>%
  mutate(Type = "Overall", Lean = "Republican") %>%
  rename(Percent = PerTrump)

rep.early <- election.data1 %>% 
  select(State, PerRepEVote) %>%
  mutate(Type = "Early", Lean = "Republican") %>%
  rename(Percent = PerRepEVote)

dem.both <- rbind(dem.overall, dem.early, rep.overall, rep.early)

ggplot(dem.both, aes(State, Percent, colour = Type)) + 
  geom_point(stat = "identity")  + 
  coord_flip() +
  facet_wrap(~Lean) +
  ggtitle("Early Vote vs. Overall Vote by Swing State") +
  scale_colour_manual("Voting Time", values=c("lightsalmon4", "orange"))

The main takeaway from this graph is that early voting is rarely representative of overall voting trends. This data comes only from swing states, but it still demonstrates that the early-vote numbers for Democrats are much more predictive of the overall result than those for Republicans. Because early voting is not equally reliable for both parties, it is most likely not a reliable predictor overall either. The reader would benefit from knowing that the early vote tends to skew somewhat more Democratic, which helps explain why this graph shows the results it does.

To create this graphic, we followed almost the same process as for the previous graph. We filtered for the same states and then created four data frames, covering early and overall voting for Democrats and Republicans respectively. We then bound those four together, created a point graph, and faceted it by party to show the difference between the two parties clearly.
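The distance between the early and overall points in each facet can also be computed directly. This is a rough sketch, assuming a hypothetical data frame `toy.compare` with made-up percentages; the column names DemGap and RepGap are introduced here for illustration and do not appear in the original analysis.

```r
library(dplyr)

# Hypothetical stand-in for election.data1 (all percentages are made up).
toy.compare <- data.frame(
  State       = c("Nevada", "Iowa"),
  PerClinton  = c(48, 42),
  PerDemEVote = c(42, 41),
  PerTrump    = c(45, 51),
  PerRepEVote = c(36, 35),
  stringsAsFactors = FALSE
)

# Gap between each party's overall share and its share of early ballots.
# A small gap means the early vote tracked the final result closely.
toy.gaps <- toy.compare %>%
  mutate(DemGap = PerClinton - PerDemEVote,
         RepGap = PerTrump - PerRepEVote) %>%
  select(State, DemGap, RepGap)

toy.gaps
```

Comparing the two gap columns state by state gives a numeric version of what the faceted point graph shows visually.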

Conclusion

Our graphs support two major conclusions. First, Donald Trump won the presidency by drastically shifting the margins in many states. Graph 1 shows that Trump received strong support in many swing states, and Graph 2 shows that Trump was able to massively shift the margins from 2012 in states that mattered a lot, especially states in the Midwest. Second, many metrics used by media pundits and analysts, specifically overall turnout and early vote returns, are not always predictive of the election outcome. Graph 3 shows that even though turnout went up in many states, it did not signal a victory for Secretary Clinton. It is often said in politics that higher turnout favors Democrats because they mobilize low-turnout populations, but our graph shows that this was not true in the 2016 election. Graphs 4 and 5 show that it is not effective to use early voting to predict the presidential election outcome. Graph 4 shows that early voting looks like it benefits Democrats, with ballots from Democratic Party affiliates outnumbering those from Republicans in most states. But since early voting results show only the party affiliation of the voter, not the way they actually voted, it is hard to know whom early voting favors. Graph 5 shows that early voting returns of Republican and Democratic ballots did not match the vote shares received by the Democratic and Republican candidates. Together these two graphs show that early voting is not an effective indicator for predicting presidential elections, further supporting our second point.

If we could explore this topic further, we would look at how other factors, like the demographic makeup of the electorate, predict the outcome of the election. It would also be useful to expand the research on turnout and early voting to see whether either had been more predictive in previous presidential elections. By expanding our area of study we could further test our findings and show how many factors are not predictive in presidential elections. We hope that our analysis of the 2016 election shows that certain factors often used to predict the election are not accurate, and that it is important not to cherry-pick information that confirms your biases. Our graphs show that early voting and changes in overall turnout are inaccurate indicators of the overall election result, and they should help the audience understand that many election indicators can be ineffective.

2016 Election

Isaac Selchaif & Jacques Phelps

December 7, 2016
