Noah Collin
This is Noah’s HW1 for 607. I’m using FiveThirtyEight’s Covid-19 polls data set, available here: https://github.com/fivethirtyeight/covid-19-polls
The specific CSV I’m using, covid_approval_polls.csv, records the percentage of respondents who approve or disapprove of the president’s handling of Covid-19.
Below is a summary of the Polls CSV:
#setwd("")
polls <- read.csv("covid_approval_polls.csv")
summary(polls)
##   start_date          end_date           pollster           sponsor
##  Length:2809        Length:2809        Length:2809        Length:2809
##  Class :character   Class :character   Class :character   Class :character
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character
##
##
##
##
##   sample_size      population           party             subject
##  Min.   :    55   Length:2809        Length:2809        Length:2809
##  1st Qu.:   389   Class :character   Class :character   Class :character
##  Median :   640   Mode  :character   Mode  :character   Mode  :character
##  Mean   :  2379
##  3rd Qu.:  1226
##  Max.   :325970
##  NA's   :22
##   tracking           text              approve        disapprove
##  Mode :logical   Length:2809        Min.   : 1.00   Min.   : 1.00
##  FALSE:2559      Class :character   1st Qu.:30.00   1st Qu.:28.00
##  TRUE :242       Mode  :character   Median :42.00   Median :53.00
##  NA's :8                            Mean   :46.42   Mean   :48.48
##                                     3rd Qu.:66.00   3rd Qu.:63.00
##                                     Max.   :98.00   Max.   :98.00
##                                     NA's   :3       NA's   :15
##      url
##  Length:2809
##  Class :character
##  Mode  :character
##
##
##
##
Here are the first 6 rows of the uncleaned CSV (head() returns 6 rows by default):
head(polls)
##Subset of Data
library(dplyr)
Here are just four columns of the data:
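The code that builds the four-column subset (the `temp` object used in the plot) does not appear in this report. A minimal sketch of one way to do it with dplyr follows. The choice of columns (`end_date`, `subject`, `approve`, `disapprove`) is an assumption based on what the plot uses, and `polls_demo` is a made-up stand-in so the snippet runs on its own; in the report, `polls` from `read.csv()` would take its place.

```r
library(dplyr)

# Made-up placeholder rows with the same column names as the poll data.
# The values are illustrative only, not real poll results.
polls_demo <- data.frame(
  end_date   = c("2020-03-01", "2020-03-08", "2021-02-01"),
  pollster   = c("A", "B", "C"),
  subject    = c("Trump", "Trump", "Biden"),
  approve    = c(40, 45, NA),
  disapprove = c(55, 50, 30),
  stringsAsFactors = FALSE
)

# Keep only four columns (an assumed selection, not necessarily the original one).
temp <- polls_demo %>%
  select(end_date, subject, approve, disapprove)

head(temp)
```

`select()` keeps the columns in the order listed, so `temp` here has exactly four columns and the same number of rows as the input.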
##Graph (I tried a few things that I couldn’t get to work. I’d hoped to turn in something better but ran out of time.)
library(ggplot2) # ggplot() and geom_point() come from ggplot2, which is not loaded above
ggplot(data = temp) +
  geom_point(mapping = aes(x = end_date, y = approve))
## Warning: Removed 3 rows containing missing values (geom_point).
#+
# facet_grid(subject == "Trump" ~ subject == "Biden")
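The commented-out `facet_grid()` call above looks like an attempt to split the plot by president. One possible way to get a panel per value of `subject` is `facet_wrap(~ subject)`, sketched below. This is not the approach from the report, and `plot_data` is a made-up stand-in with illustrative values only; the sketch also converts `end_date` to `Date`, assuming ISO `YYYY-MM-DD` strings, so the x-axis is treated as time rather than text.

```r
library(ggplot2)

# Made-up placeholder rows; the values are illustrative, not real poll results.
plot_data <- data.frame(
  end_date = as.Date(c("2020-03-01", "2020-06-01", "2021-02-01", "2021-05-01")),
  subject  = c("Trump", "Trump", "Biden", "Biden"),
  approve  = c(40, 42, 60, 58)
)

# One scatter panel per distinct value of subject.
p <- ggplot(data = plot_data) +
  geom_point(mapping = aes(x = end_date, y = approve)) +
  facet_wrap(~ subject)

print(p)
```

With the real data, the same call would be applied to `temp` after converting `end_date` with `as.Date()`.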