This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more of my R tutorials visit http://mikewk.com/statistics.

Packages and Data

This tutorial covers downloading and plotting polling averages from Huffington Post Pollster. It uses ‘pollstR’ to download the polling data and ‘ggplot2’ to plot it.

library(pollstR) # for retrieving HuffPost Pollster data
library(ggplot2) # for plots

Download GOP polling data from Huffington Post Pollster (estimates by date).

gop16 <- pollstr_chart("2016-national-gop-primary", convert = TRUE)$estimates_by_date

Prepping the Data

Identify the candidates with the highest average support since August 18, 2015.

cutoff <- gop16[gop16[, "date"] >= "2015-08-18", ]
cutoff <- with(cutoff, aggregate(value, by=list(choice), FUN='mean'))
cutoff[order(cutoff$x, decreasing=TRUE), ]
##      Group.1    x
## 16     Trump 26.6
## 1       Bush 10.7
## 2     Carson  9.3
## 17 Undecided  7.3
## 14     Rubio  6.8
## 8   Huckabee  5.5
## 4       Cruz  5.3
## 5    Fiorina  5.2
## 18    Walker  5.1
## 13 Rand Paul  4.6
## 3   Christie  3.3
## 10    Kasich  2.9
## 12     Perry  1.9
## 9     Jindal  1.2
## 7     Graham  0.9
## 15  Santorum  0.8
## 11    Pataki  0.1
## 6    Gilmore  0.0
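
To make the aggregation step concrete, here is a minimal sketch on a tiny made-up data frame (the choices and values are invented for illustration, not Pollster data). It also shows where the column names `Group.1` and `x` in the output above come from: `aggregate()` assigns them when its inputs are unnamed.

```r
# Toy data with the same three columns used above; the numbers are invented
toy <- data.frame(
  choice = c("A", "A", "B", "B"),
  date   = as.Date(c("2015-08-18", "2015-08-19", "2015-08-18", "2015-08-19")),
  value  = c(20, 30, 10, 12)
)

# Mean of value within each choice; unnamed inputs yield columns Group.1 and x
avg <- with(toy, aggregate(value, by = list(choice), FUN = mean))

# Sort from highest to lowest average
avg <- avg[order(avg$x, decreasing = TRUE), ]
print(avg)
```

Passing a named list, for example `by = list(choice = choice)`, would give the grouping column a more readable name.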

Subset the data to a start date of May 1, 2015 and to the ten candidates with the highest averages above (excluding "Undecided").

# gop16 still holds the full download from above, so no second download is needed
gop16 <- gop16[gop16[, "date"] >= "2015-05-01", ]
gop16 <- subset(gop16, 
                choice %in% c("Trump", 
                              "Bush", 
                              "Carson", 
                              "Rubio", 
                              "Huckabee",
                              "Cruz",
                              "Fiorina",
                              "Walker",                              
                              "Rand Paul",
                              "Christie"))

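The two-step subset above (first by date, then by candidate) can be sketched on a small invented data frame. The choices and values here are assumptions for illustration only. Wrapping the cutoff in `as.Date()` is optional when the column is already a `Date`, but it makes the comparison explicit.

```r
# Toy data; choices and values are invented for illustration
toy <- data.frame(
  choice = c("A", "B", "C", "A", "B", "C"),
  date   = as.Date(c("2015-04-30", "2015-04-30", "2015-04-30",
                     "2015-05-01", "2015-05-01", "2015-05-01")),
  value  = c(5, 6, 7, 8, 9, 10)
)

# Step 1: keep rows on or after the start date
recent <- toy[toy$date >= as.Date("2015-05-01"), ]

# Step 2: keep only the rows whose choice is in the set of interest
kept <- subset(recent, choice %in% c("A", "C"))
print(kept)
```

Only the rows for "A" and "C" dated May 1 remain, which mirrors what happens to `gop16` above.
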
Plotting the Data

All that’s left is to plot it. Here’s the code I used:

ggplot(gop16, aes(x = date, y = value, col = choice)) +
  geom_line(size = 1.5, alpha = 1) +
# baseline at zero
  geom_hline(yintercept = 0, size = 1.2, colour = "#535353") +
# format chart theme
  theme_bw() +
  theme(legend.position = "none",
        panel.background = element_rect(fill = "#F0F0F0"),
        plot.background = element_rect(fill = "#F0F0F0"),
        panel.border = element_blank(),
        panel.grid.major = element_line(colour = "#D6D6D6", size = .75),
        panel.grid.minor = element_line(colour = "#E3E3E3", size = .75),
        axis.ticks = element_blank(),
        axis.text = element_text(size = 10, colour = "#535353", face = "bold"),
        axis.title = element_blank(),
        plot.title = element_text(face = "bold", hjust = -.05, vjust = 1.5,
                                  colour = "#3C3C3C", size = 20)) +
# title
  ggtitle("GOP Polling Averages") +
# line annotations
  geom_text(aes(x = as.Date('2015-08-07'), y = 13.5, label = "Bush", colour = "Bush")) +
  geom_text(aes(x = as.Date('2015-08-05'), y = 23, label = "Trump", colour = "Trump")) +
  geom_text(aes(x = as.Date('2015-07-25'), y = 9.5, label = "Walker", colour = "Walker"))
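
One thing to be aware of: each `geom_text()` call above inherits the full data set, so ggplot2 draws the same label once per row, which can make the text look heavy. A possible alternative, sketched here on invented data, is to pass a small data frame of labels through the `data` argument so that each label is drawn once. The `labels` data frame, the choices "A" and "B", and all values are assumptions for illustration.

```r
library(ggplot2)

# Toy polling lines; choices and values are invented
toy <- data.frame(
  date   = rep(as.Date(c("2015-07-01", "2015-08-01", "2015-09-01")), times = 2),
  value  = c(5, 15, 25, 14, 12, 10),
  choice = rep(c("A", "B"), each = 3)
)

# One row per label: where it goes, and which line it belongs to
labels <- data.frame(
  date   = as.Date(c("2015-08-05", "2015-08-07")),
  value  = c(19, 13.5),
  choice = c("A", "B")
)

p <- ggplot(toy, aes(x = date, y = value, colour = choice)) +
  geom_line() +
  # data = labels replaces the inherited data for this layer only,
  # so each label is drawn once and still picks up its line's colour
  geom_text(data = labels, aes(label = choice)) +
  theme(legend.position = "none")
```

Printing `p` renders the chart. Because `choice` is mapped to colour in both layers, the labels match their lines without any manual colour settings.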

And that’s it!