Brexit

Author

Asher Scott

In lines 9-14 I loaded tidyverse and dslabs; dslabs is the package that provides the brexit_polls dataset. I then used the summary function to see which pollsters had conducted the most polls.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)
summary(brexit_polls)
   startdate             enddate              pollster      poll_type 
 Min.   :2016-01-08   Min.   :2016-01-10   ICM    :28   Online   :85  
 1st Qu.:2016-03-04   1st Qu.:2016-03-08   YouGov :26   Telephone:42  
 Median :2016-04-22   Median :2016-04-26   ORB    :14                 
 Mean   :2016-04-16   Mean   :2016-04-18   ComRes :10                 
 3rd Qu.:2016-05-31   3rd Qu.:2016-06-01   Opinium: 9                 
 Max.   :2016-06-23   Max.   :2016-06-23   TNS    : 9                 
                                           (Other):31                 
   samplesize       remain           leave          undecided     
 Min.   : 497   Min.   :0.3500   Min.   :0.3200   Min.   :0.0000  
 1st Qu.:1010   1st Qu.:0.4100   1st Qu.:0.3900   1st Qu.:0.0900  
 Median :1693   Median :0.4400   Median :0.4200   Median :0.1300  
 Mean   :1694   Mean   :0.4424   Mean   :0.4223   Mean   :0.1265  
 3rd Qu.:2010   3rd Qu.:0.4800   3rd Qu.:0.4500   3rd Qu.:0.1700  
 Max.   :4772   Max.   :0.5500   Max.   :0.5500   Max.   :0.3000  
                                                                  
     spread        
 Min.   :-0.10000  
 1st Qu.:-0.02000  
 Median : 0.01000  
 Mean   : 0.02008  
 3rd Qu.: 0.05000  
 Max.   : 0.19000  
                   
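summary() only lists the six most frequent factor levels and lumps the rest into (Other), so a full sorted tally can be easier to read. The following is a minimal sketch of that idea using dplyr's count(). The data frame toy_polls is a made-up stand-in, not the real data; with dslabs loaded, brexit_polls would take its place.

```r
library(dplyr)

# Hypothetical stand-in for brexit_polls: only the pollster column matters here.
toy_polls <- data.frame(
  pollster = factor(c("ICM", "ICM", "ICM", "YouGov", "YouGov", "ORB"))
)

# Count rows per pollster and sort from most to fewest polls.
pollster_counts <- toy_polls %>%
  count(pollster, sort = TRUE)

print(pollster_counts)
```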

In lines 17-20 I filtered brexit_polls down to the four pollsters with the most polls (ICM, YouGov, ORB, and ComRes), since some pollsters had only 2-3 entries. One issue was that some pollster names overlap (for example, ORB and ORB/Telegraph), so the pattern match also picked up those joint polls and I had to remove them as well.

BP <- brexit_polls %>% 
  filter(str_detect(pollster, "ICM|YouGov|ORB|ComRes")) %>%
  filter(pollster != "ORB/Telegraph") %>%
  filter(pollster != "YouGov/The Times")
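An alternative that avoids the overlapping-name problem is to match whole values with %in% instead of a regular expression. This is only a sketch under assumptions: toy_polls and its leave values are invented for illustration, and top_pollsters is a helper name that does not appear in the original code.

```r
library(dplyr)

# Hypothetical stand-in for brexit_polls with a few overlapping names.
toy_polls <- data.frame(
  pollster = c("ICM", "ORB", "ORB/Telegraph", "YouGov",
               "YouGov/The Times", "ComRes", "TNS"),
  leave = c(0.41, 0.46, 0.49, 0.40, 0.44, 0.38, 0.42)
)

# Assumed helper vector holding the exact names to keep.
top_pollsters <- c("ICM", "YouGov", "ORB", "ComRes")

# %in% compares whole values, so "ORB/Telegraph" is not matched by "ORB".
toy_BP <- toy_polls %>%
  filter(pollster %in% top_pollsters)

print(toy_BP)
```

With this approach the two follow-up filter() calls that exclude the joint polls would no longer be needed.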

In lines 24-38 I created a density plot showing the distribution of the leave share reported in each pollster's polls; leave was the eventual outcome of the 2016 Brexit referendum. As the plot shows, ORB's polls tended to report the highest leave shares. I also made undecided and remain density plots for fun.

ggplot(data = BP, mapping = aes(x = leave, fill = pollster)) +
  geom_density(alpha = 0.5) + 
  labs(
    title = "4 Largest Brexit Polls",
    x = "Leave Percentage",
    y = "Density"
  ) +
  # The mapped aesthetic is fill, so the manual scale must be a fill scale.
  scale_fill_manual(values = c("YouGov" = "purple",
                               "ICM" = "navy",
                               "ORB" = "red",
                               "ComRes" = "orange"
                             )) +
  theme_minimal()
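A visual reading of a density plot can be backed up with a per-pollster summary. The sketch below shows one way to do that with group_by() and summarise(). The numbers in toy_BP are invented for illustration and say nothing about the real polls; in the actual analysis, BP would take its place.

```r
library(dplyr)

# Hypothetical data: two polls per pollster with made-up leave shares.
toy_BP <- data.frame(
  pollster = c("ICM", "ICM", "ORB", "ORB", "YouGov", "YouGov"),
  leave = c(0.40, 0.44, 0.46, 0.50, 0.39, 0.41)
)

# Median leave share and number of polls per pollster, highest median first.
leave_by_pollster <- toy_BP %>%
  group_by(pollster) %>%
  summarise(median_leave = median(leave), n_polls = n()) %>%
  arrange(desc(median_leave))

print(leave_by_pollster)
```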

UNDECIDED Graph

ggplot(BP, mapping = aes(x = undecided, fill = pollster)) +
  geom_density(alpha = 0.5) + 
  labs(
    title = "4 Largest Brexit Polls",
    x = "Undecided Percentage",
    y = "Density"
  ) +
  # The mapped aesthetic is fill, so the manual scale must be a fill scale.
  scale_fill_manual(values = c("YouGov" = "purple",
                               "ICM" = "navy",
                               "ORB" = "red",
                               "ComRes" = "orange"
                             )) +
  theme_minimal()

REMAIN Graph

ggplot(BP, mapping = aes(x = remain, fill = pollster)) +
  geom_density(alpha = 0.5) + 
  labs(
    title = "4 Largest Brexit Polls",
    x = "Remain Percentage",
    y = "Density"
  ) +
  # The mapped aesthetic is fill, so the manual scale must be a fill scale.
  scale_fill_manual(values = c("YouGov" = "purple",
                               "ICM" = "navy",
                               "ORB" = "red",
                               "ComRes" = "orange"
                             )) +
  theme_minimal()

I used the brexit_polls dataset, which covers a variety of polling organizations and their recorded polling results. It also lists each poll's sample size and how the poll was conducted (online or by telephone).