An calculation of the president’s approval rating before convid19 and after convid19: The link of article is provide as below https://projects.fivethirtyeight.com/trump-approval-ratings/
pulling data from online excel sheet
polls <- read.csv('https://raw.githubusercontent.com/szx868/data607/master/approval_topline.csv')
head(polls)
## president subgroup modeldate approve_estimate approve_hi approve_lo
## 1 Donald Trump Voters 8/29/2020 42.93170 47.37356 38.48984
## 2 Donald Trump Adults 8/29/2020 40.54545 44.78658 36.30432
## 3 Donald Trump All polls 8/29/2020 42.03605 46.89826 37.17385
## 4 Donald Trump All polls 8/28/2020 42.06078 46.91824 37.20332
## 5 Donald Trump Adults 8/28/2020 40.59133 44.82678 36.35588
## 6 Donald Trump Voters 8/28/2020 42.93757 47.38771 38.48743
## disapprove_estimate disapprove_hi disapprove_lo timestamp
## 1 53.35110 58.36619 48.33603 13:42:24 29 Aug 2020
## 2 55.66737 60.75956 50.57518 13:40:06 29 Aug 2020
## 3 54.19849 59.58161 48.81537 13:38:37 29 Aug 2020
## 4 54.14054 59.51872 48.76236 17:57:36 28 Aug 2020
## 5 55.61034 60.70996 50.51071 17:59:04 28 Aug 2020
## 6 53.35496 58.36143 48.34849 18:01:25 28 Aug 2020
pick the columns you need, this case we only looking for date and approve estimate and disapprove estimate.
polls <- polls[c('modeldate','approve_estimate','disapprove_estimate')]
head(polls)
## modeldate approve_estimate disapprove_estimate
## 1 8/29/2020 42.93170 53.35110
## 2 8/29/2020 40.54545 55.66737
## 3 8/29/2020 42.03605 54.19849
## 4 8/28/2020 42.06078 54.14054
## 5 8/28/2020 40.59133 55.61034
## 6 8/28/2020 42.93757 53.35496
Convert modeldate column from chr to date, so we filter out data
polls$modeldate <- as.Date(polls$modeldate,
format = "%m/%d/%y")
sorted.polls <- order(polls['modeldate'])
head(polls[sorted.polls,])
## modeldate approve_estimate disapprove_estimate
## 724 2020-01-01 43.43970 52.74527
## 725 2020-01-01 41.50187 53.28732
## 726 2020-01-01 42.61256 52.87366
## 1819 2020-01-01 41.14145 53.42892
## 1820 2020-01-01 42.50409 53.18977
## 1821 2020-01-01 41.44894 53.39114
Extract subset of data prior national emergency date(2020-03-13) Extract subset of data after national emergency date(2020-03-13)
polls.beforeConvid <- subset(polls,subset = modeldate<as.Date("2020-03-13"))
polls.afterConvid <- subset(polls,subset = modeldate>=as.Date("2020-03-13"))
plot graph to visualize result
library(ggplot2)
ggplot(data=polls.afterConvid, aes(x = modeldate, y = approve_estimate)) +
geom_line() +
labs(title = "Approval Rate Trend for After Convid")
ggplot(data=polls.beforeConvid, aes(x = modeldate, y = approve_estimate)) +
geom_line() +
labs(title = "Approval Rate Trend for Before Convid")
Renaming the columns to ‘approve’ and disapprove’ compare summary of two data frame
polls.afterConvid <- polls.afterConvid[c('approve_estimate','disapprove_estimate')]
polls.beforeConvid <- polls.beforeConvid[c('approve_estimate','disapprove_estimate')]
colnames(polls.beforeConvid) <- c('approve', 'disappove')
colnames(polls.afterConvid) <- c('approve', 'disappove')
summary(polls.beforeConvid)
## approve disappove
## Min. :37.34 Min. :37.00
## 1st Qu.:40.47 1st Qu.:52.14
## Median :41.91 Median :53.39
## Mean :41.88 Mean :52.71
## 3rd Qu.:43.30 3rd Qu.:54.39
## Max. :48.62 Max. :57.11
summary(polls.afterConvid)
## approve disappove
## Min. :34.93 Min. :48.65
## 1st Qu.:40.15 1st Qu.:52.61
## Median :41.46 Median :53.47
## Mean :41.19 Mean :53.70
## 3rd Qu.:42.55 3rd Qu.:54.70
## Max. :46.33 Max. :58.29
Based on those data it look like the popularity of Trump is not effected a lot by Convid-19 when you compare mean of Approve and mean of Disapprove. To have a more accurate result, I would recommend to take another static for different subgroup(Voters,All Polls, Adults)