Introduction

An calculation of the president’s approval rating before convid19 and after convid19: The link of article is provide as below https://projects.fivethirtyeight.com/trump-approval-ratings/

Step 1:

pulling data from online excel sheet

polls <- read.csv('https://raw.githubusercontent.com/szx868/data607/master/approval_topline.csv')
head(polls)
##      president  subgroup modeldate approve_estimate approve_hi approve_lo
## 1 Donald Trump    Voters 8/29/2020         42.93170   47.37356   38.48984
## 2 Donald Trump    Adults 8/29/2020         40.54545   44.78658   36.30432
## 3 Donald Trump All polls 8/29/2020         42.03605   46.89826   37.17385
## 4 Donald Trump All polls 8/28/2020         42.06078   46.91824   37.20332
## 5 Donald Trump    Adults 8/28/2020         40.59133   44.82678   36.35588
## 6 Donald Trump    Voters 8/28/2020         42.93757   47.38771   38.48743
##   disapprove_estimate disapprove_hi disapprove_lo            timestamp
## 1            53.35110      58.36619      48.33603 13:42:24 29 Aug 2020
## 2            55.66737      60.75956      50.57518 13:40:06 29 Aug 2020
## 3            54.19849      59.58161      48.81537 13:38:37 29 Aug 2020
## 4            54.14054      59.51872      48.76236 17:57:36 28 Aug 2020
## 5            55.61034      60.70996      50.51071 17:59:04 28 Aug 2020
## 6            53.35496      58.36143      48.34849 18:01:25 28 Aug 2020

Step 2:

pick the columns you need, this case we only looking for date and approve estimate and disapprove estimate.

polls <- polls[c('modeldate','approve_estimate','disapprove_estimate')]
head(polls)
##   modeldate approve_estimate disapprove_estimate
## 1 8/29/2020         42.93170            53.35110
## 2 8/29/2020         40.54545            55.66737
## 3 8/29/2020         42.03605            54.19849
## 4 8/28/2020         42.06078            54.14054
## 5 8/28/2020         40.59133            55.61034
## 6 8/28/2020         42.93757            53.35496

Step 3:

Convert modeldate column from chr to date, so we filter out data

polls$modeldate <- as.Date(polls$modeldate,
                                     format = "%m/%d/%y")

sorted.polls <- order(polls['modeldate'])
head(polls[sorted.polls,])
##       modeldate approve_estimate disapprove_estimate
## 724  2020-01-01         43.43970            52.74527
## 725  2020-01-01         41.50187            53.28732
## 726  2020-01-01         42.61256            52.87366
## 1819 2020-01-01         41.14145            53.42892
## 1820 2020-01-01         42.50409            53.18977
## 1821 2020-01-01         41.44894            53.39114

Step 4:

Extract subset of data prior national emergency date(2020-03-13) Extract subset of data after national emergency date(2020-03-13)

polls.beforeConvid <- subset(polls,subset = modeldate<as.Date("2020-03-13"))
polls.afterConvid <- subset(polls,subset = modeldate>=as.Date("2020-03-13"))

Step 5:

plot graph to visualize result

library(ggplot2)

ggplot(data=polls.afterConvid, aes(x = modeldate, y = approve_estimate)) +
        geom_line() +
      labs(title = "Approval Rate Trend for After Convid")

ggplot(data=polls.beforeConvid, aes(x = modeldate, y = approve_estimate)) +
        geom_line() +
      labs(title = "Approval Rate Trend for Before Convid")

Step 6:

Renaming the columns to ‘approve’ and disapprove’ compare summary of two data frame

polls.afterConvid <- polls.afterConvid[c('approve_estimate','disapprove_estimate')]
polls.beforeConvid <- polls.beforeConvid[c('approve_estimate','disapprove_estimate')]
colnames(polls.beforeConvid) <- c('approve', 'disappove')
colnames(polls.afterConvid) <- c('approve', 'disappove')
summary(polls.beforeConvid)
##     approve        disappove    
##  Min.   :37.34   Min.   :37.00  
##  1st Qu.:40.47   1st Qu.:52.14  
##  Median :41.91   Median :53.39  
##  Mean   :41.88   Mean   :52.71  
##  3rd Qu.:43.30   3rd Qu.:54.39  
##  Max.   :48.62   Max.   :57.11
summary(polls.afterConvid)
##     approve        disappove    
##  Min.   :34.93   Min.   :48.65  
##  1st Qu.:40.15   1st Qu.:52.61  
##  Median :41.46   Median :53.47  
##  Mean   :41.19   Mean   :53.70  
##  3rd Qu.:42.55   3rd Qu.:54.70  
##  Max.   :46.33   Max.   :58.29

Conclusion

Based on those data it look like the popularity of Trump is not effected a lot by Convid-19 when you compare mean of Approve and mean of Disapprove. To have a more accurate result, I would recommend to take another static for different subgroup(Voters,All Polls, Adults)