DATA 607 Week 1 Assignment - Loading Data into a Data Frame

Deepa Sharma

February 9, 2022

Introduction:

The data for this task come from the FiveThirtyEight data repository at https://data.fivethirtyeight.com/

The goal of this project is to analyze how approval of President Biden’s handling of the coronavirus changes over time among Republicans and Democrats.

Only three columns of the dataset are used in this analysis: the end date of the poll (end_date), the party affiliation of the respondents (party), and the percentage of respondents who approve (approve).

A summary of the full dataset and its first six rows are shown below.

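# Read the polls CSV; encoding = "UTF-8" marks the text as UTF-8 so non-ASCII characters in the question text display as intended
CovidPolls <- read.csv("https://raw.githubusercontent.com/fivethirtyeight/covid-19-polls/master/covid_approval_polls.csv", header = TRUE, sep = ",", encoding = "UTF-8")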
summary(CovidPolls)
##   start_date          end_date           pollster           sponsor         
##  Length:3254        Length:3254        Length:3254        Length:3254       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   sample_size        population           party             subject         
##  Min.   :    55.0   Length:3254        Length:3254        Length:3254       
##  1st Qu.:   386.0   Class :character   Class :character   Class :character  
##  Median :   612.4   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :  2160.7                                                           
##  3rd Qu.:  1163.8                                                           
##  Max.   :325970.0                                                           
##  NA's   :22                                                                 
##   tracking           text              approve        disapprove   
##  Mode :logical   Length:3254        Min.   : 1.00   Min.   : 1.00  
##  FALSE:2996      Class :character   1st Qu.:30.00   1st Qu.:29.00  
##  TRUE :242       Mode  :character   Median :42.00   Median :52.00  
##  NA's :16                           Mean   :46.53   Mean   :48.16  
##                                     3rd Qu.:65.00   3rd Qu.:62.00  
##                                     Max.   :98.00   Max.   :98.00  
##                                     NA's   :3       NA's   :15     
##      url           
##  Length:3254       
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 
head(CovidPolls)
##   start_date   end_date        pollster   sponsor sample_size population party
## 1 2020-02-02 2020-02-04          YouGov Economist        1500          a   all
## 2 2020-02-02 2020-02-04          YouGov Economist         376          a     R
## 3 2020-02-02 2020-02-04          YouGov Economist         523          a     D
## 4 2020-02-02 2020-02-04          YouGov Economist         599          a     I
## 5 2020-02-07 2020-02-09 Morning Consult                  2200          a   all
## 6 2020-02-07 2020-02-09 Morning Consult                   684          a     R
##   subject tracking
## 1   Trump    FALSE
## 2   Trump    FALSE
## 3   Trump    FALSE
## 4   Trump    FALSE
## 5   Trump    FALSE
## 6   Trump    FALSE
##                                                                                                                                                        text
## 1                                                                           Do you approve or disapprove of Donald Trump’s handling of the coronavirus outbreak?
## 2                                                                           Do you approve or disapprove of Donald Trump’s handling of the coronavirus outbreak?
## 3                                                                           Do you approve or disapprove of Donald Trump’s handling of the coronavirus outbreak?
## 4                                                                           Do you approve or disapprove of Donald Trump’s handling of the coronavirus outbreak?
## 5 Do you approve or disapprove of the job each of the following is doing in handling the spread of coronavirus in the United States? President Donald Trump
## 6 Do you approve or disapprove of the job each of the following is doing in handling the spread of coronavirus in the United States? President Donald Trump
##   approve disapprove
## 1      42         29
## 2      75          6
## 3      21         51
## 4      39         25
## 5      57         22
## 6      88          4
##                                                                                                   url
## 1         https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/73jqd6u5mv/econTabReport.pdf
## 2         https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/73jqd6u5mv/econTabReport.pdf
## 3         https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/73jqd6u5mv/econTabReport.pdf
## 4         https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/73jqd6u5mv/econTabReport.pdf
## 5 https://morningconsult.com/wp-content/uploads/2020/02/200214_crosstabs_CORONAVIRUS_Adults_v4_JB.pdf
## 6 https://morningconsult.com/wp-content/uploads/2020/02/200214_crosstabs_CORONAVIRUS_Adults_v4_JB.pdf

The code below filters the dataset to the rows used in the analysis. The focus is on responses to the question “Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus”, so a data frame is created from the subset of rows whose text column matches that question exactly.

mydata <- subset(CovidPolls, text == "Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus")
mydata <- data.frame(mydata)
head(mydata)
##      start_date   end_date pollster    sponsor sample_size population party
## 2227 2021-10-19 2021-01-21   YouGov Yahoo News        1704          a   all
## 2228 2021-10-19 2021-01-21   YouGov Yahoo News         458          a     R
## 2229 2021-10-19 2021-01-21   YouGov Yahoo News         550          a     D
## 2230 2021-10-19 2021-01-21   YouGov Yahoo News         503          a     I
## 2585 2021-05-24 2021-05-26   YouGov Yahoo News        1588          a   all
## 2586 2021-05-24 2021-05-26   YouGov Yahoo News         376          a     R
##      subject tracking
## 2227   Biden    FALSE
## 2228   Biden    FALSE
## 2229   Biden    FALSE
## 2230   Biden    FALSE
## 2585   Biden    FALSE
## 2586   Biden    FALSE
##                                                                                                               text
## 2227 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2228 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2229 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2230 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2585 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2586 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
##      approve disapprove
## 2227      45         45
## 2228      12         84
## 2229      83         10
## 2230      39         52
## 2585      54         34
## 2586      26         67
##                                                                         url
## 2227 https://docs.cdn.yougov.com/4i6a6olite/20211021_yahoo_vaccine_tabs.pdf
## 2228 https://docs.cdn.yougov.com/4i6a6olite/20211021_yahoo_vaccine_tabs.pdf
## 2229 https://docs.cdn.yougov.com/4i6a6olite/20211021_yahoo_vaccine_tabs.pdf
## 2230 https://docs.cdn.yougov.com/4i6a6olite/20211021_yahoo_vaccine_tabs.pdf
## 2585 https://docs.cdn.yougov.com/zjdg6ujrzh/20210526_yahoo_vaccine_tabs.pdf
## 2586 https://docs.cdn.yougov.com/zjdg6ujrzh/20210526_yahoo_vaccine_tabs.pdf
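
The first four rows printed above list an end_date of 2021-01-21 that falls before the start_date of 2021-10-19, while the file name in the url column contains 20211021. This looks like a data-entry error in those records, and it would misplace the poll on a time axis. The following is a minimal sketch of how such rows might be flagged before plotting. The `polls` data frame and the `bad_dates` column are hypothetical names introduced only for this illustration, and the two rows reuse the dates shown above.

```r
# Hypothetical stand-in rows; the dates are copied from rows 2227 and 2585 above.
polls <- data.frame(
  start_date = c("2021-10-19", "2021-05-24"),
  end_date   = c("2021-01-21", "2021-05-26"),
  stringsAsFactors = FALSE
)

# Parse the ISO-formatted strings into Date objects so they compare chronologically.
polls$start_date <- as.Date(polls$start_date)
polls$end_date   <- as.Date(polls$end_date)

# Flag polls whose end date precedes their start date.
polls$bad_dates <- polls$end_date < polls$start_date

# Show only the flagged rows for manual review.
print(polls[polls$bad_dates, ])
```

Whether a flagged row should be corrected or dropped is a judgment call that depends on checking the original poll document.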

The columns of interest are renamed with the code below. Renaming is not required and is done only for readability. The first rows of the updated data frame are printed at the end.

names(mydata)[names(mydata)=="end_date"] <- "End_Date"
names(mydata)[names(mydata)=="party"] <- "Party"
names(mydata)[names(mydata)=="approve"] <- "Approve"
names(mydata)[names(mydata)=="disapprove"] <- "Disapprove"
head(mydata)
##      start_date   End_Date pollster    sponsor sample_size population Party
## 2227 2021-10-19 2021-01-21   YouGov Yahoo News        1704          a   all
## 2228 2021-10-19 2021-01-21   YouGov Yahoo News         458          a     R
## 2229 2021-10-19 2021-01-21   YouGov Yahoo News         550          a     D
## 2230 2021-10-19 2021-01-21   YouGov Yahoo News         503          a     I
## 2585 2021-05-24 2021-05-26   YouGov Yahoo News        1588          a   all
## 2586 2021-05-24 2021-05-26   YouGov Yahoo News         376          a     R
##      subject tracking
## 2227   Biden    FALSE
## 2228   Biden    FALSE
## 2229   Biden    FALSE
## 2230   Biden    FALSE
## 2585   Biden    FALSE
## 2586   Biden    FALSE
##                                                                                                               text
## 2227 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2228 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2229 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2230 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2585 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
## 2586 Do you approve or disapprove of the way President Biden is handling each of the following issues? Coronavirus
##      Approve Disapprove
## 2227      45         45
## 2228      12         84
## 2229      83         10
## 2230      39         52
## 2585      54         34
## 2586      26         67
##                                                                         url
## 2227 https://docs.cdn.yougov.com/4i6a6olite/20211021_yahoo_vaccine_tabs.pdf
## 2228 https://docs.cdn.yougov.com/4i6a6olite/20211021_yahoo_vaccine_tabs.pdf
## 2229 https://docs.cdn.yougov.com/4i6a6olite/20211021_yahoo_vaccine_tabs.pdf
## 2230 https://docs.cdn.yougov.com/4i6a6olite/20211021_yahoo_vaccine_tabs.pdf
## 2585 https://docs.cdn.yougov.com/zjdg6ujrzh/20210526_yahoo_vaccine_tabs.pdf
## 2586 https://docs.cdn.yougov.com/zjdg6ujrzh/20210526_yahoo_vaccine_tabs.pdf
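
The introduction states that only three columns are needed, but the data frame printed above still carries every column. One possible way to narrow it is sketched below in base R. The `polls`, `analysis_cols`, and `polls_small` names are hypothetical, and the two rows are a small stand-in whose values are copied from rows 2585 and 2586 above.

```r
# Hypothetical stand-in for the renamed data frame (values from rows 2585 and 2586).
polls <- data.frame(
  start_date = c("2021-05-24", "2021-05-24"),
  End_Date   = c("2021-05-26", "2021-05-26"),
  Party      = c("all", "R"),
  Approve    = c(54, 26),
  Disapprove = c(34, 67),
  stringsAsFactors = FALSE
)

# Keep only the three columns used in the analysis.
analysis_cols <- c("End_Date", "Party", "Approve")
polls_small <- polls[, analysis_cols]

print(polls_small)
```

Working with the narrower data frame keeps printed output short and makes it clearer which fields the plot depends on.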

A single plot is created that shows Republican and Democratic respondents in different colours. The ggplot2 package is used in this case.

library(magrittr)
library(ggplot2)
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v tibble  3.1.6     v dplyr   1.0.7
## v tidyr   1.2.0     v stringr 1.4.0
## v readr   2.1.2     v forcats 0.5.1
## v purrr   0.3.4
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x tidyr::extract()   masks magrittr::extract()
## x dplyr::filter()    masks stats::filter()
## x dplyr::lag()       masks stats::lag()
## x purrr::set_names() masks magrittr::set_names()
mydata %>% 
  # Keep only Republican and Democratic respondents
  filter(Party != "all" & Party != "I") %>%
  # Convert End_Date from character to Date so geom_line() can connect points over time
  mutate(End_Date = as.Date(End_Date)) %>%
  ggplot(aes(End_Date, Approve, colour = Party)) +
  geom_point(size = 3, alpha = 1) +
  geom_line(size = 1) +
  theme_minimal() +
  labs(title = "Biden's Approval among Republicans and Democrats")

Conclusion:

In the plot above, President Biden’s coronavirus approval rating is much lower among Republicans than among Democrats. His approval rating is also falling slowly in both parties, with the steeper decline among Republicans. Another interesting observation is that the ups and downs in his approval rating follow a very similar pattern in both parties. For example, his approval was highest in both parties around the end of June 2021 and dipped to its lowest point in January 2022.
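
The statements about the highest and lowest points are read off the plot by eye. A minimal sketch of how the dates of each party's highest and lowest approval could be computed instead is given below. The `toy` data frame holds made-up values chosen only to show the mechanics. They are not taken from the polls and do not reproduce the results described above. The `extremes`, `highest_date`, and `lowest_date` names are likewise hypothetical.

```r
# Hypothetical toy data: these values are illustrative, not taken from the polls.
toy <- data.frame(
  End_Date = as.Date(c("2021-05-01", "2021-06-01", "2021-07-01",
                       "2021-05-01", "2021-06-01", "2021-07-01")),
  Party    = c("D", "D", "D", "R", "R", "R"),
  Approve  = c(80, 85, 78, 25, 30, 20)
)

# Split the rows by party, then find the dates of the highest and lowest approval.
by_party <- split(toy, toy$Party)
extremes <- lapply(by_party, function(d) {
  data.frame(
    Party        = d$Party[1],
    highest_date = d$End_Date[which.max(d$Approve)],
    lowest_date  = d$End_Date[which.min(d$Approve)]
  )
})

# Combine the per-party results into one data frame.
extremes <- do.call(rbind, extremes)
print(extremes)
```

Note that `which.max()` and `which.min()` return only the first match, so ties between polls would need separate handling.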