R Markdown

Noah Collin

This is Noah’s HW1 for 607. I’m using 538’s Covid Polls data set, available here: https://github.com/fivethirtyeight/covid-19-polls

The specific CSV I’m using records, for each poll, the percentage of respondents who approve or disapprove of a President’s response to Covid-19.

Below is a summary of the Polls CSV:

#setwd("")
polls <- read.csv("covid_approval_polls.csv")
summary(polls)
##   start_date          end_date           pollster           sponsor         
##  Length:2809        Length:2809        Length:2809        Length:2809       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   sample_size      population           party             subject         
##  Min.   :    55   Length:2809        Length:2809        Length:2809       
##  1st Qu.:   389   Class :character   Class :character   Class :character  
##  Median :   640   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :  2379                                                           
##  3rd Qu.:  1226                                                           
##  Max.   :325970                                                           
##  NA's   :22                                                               
##   tracking           text              approve        disapprove   
##  Mode :logical   Length:2809        Min.   : 1.00   Min.   : 1.00  
##  FALSE:2559      Class :character   1st Qu.:30.00   1st Qu.:28.00  
##  TRUE :242       Mode  :character   Median :42.00   Median :53.00  
##  NA's :8                            Mean   :46.42   Mean   :48.48  
##                                     3rd Qu.:66.00   3rd Qu.:63.00  
##                                     Max.   :98.00   Max.   :98.00  
##                                     NA's   :3       NA's   :15     
##      url           
##  Length:2809       
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 
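
The summary shows that `start_date` and `end_date` were read in as character columns. Here is a minimal sketch of converting them to `Date` so they can be sorted and plotted on a time axis. The data frame `polls_demo` and its values are made up for illustration, and the YYYY-MM-DD layout is an assumption. If the CSV stores dates differently, the `format` string would need to change.

```r
# Hypothetical stand-in for `polls`; the values are made up for illustration.
polls_demo <- data.frame(
  start_date = c("2020-02-02", "2020-02-07"),
  end_date   = c("2020-02-04", "2020-02-09"),
  approve    = c(42, 51),
  stringsAsFactors = FALSE
)

# Assumption: dates are written as YYYY-MM-DD.
polls_demo$start_date <- as.Date(polls_demo$start_date, format = "%Y-%m-%d")
polls_demo$end_date   <- as.Date(polls_demo$end_date,   format = "%Y-%m-%d")

# Once the columns are Dates, they can be subtracted to get days in the field.
polls_demo$days_in_field <- as.numeric(polls_demo$end_date - polls_demo$start_date)

str(polls_demo)
```

The same two `as.Date()` lines could be applied to `polls` itself after checking a few raw values with `head(polls$end_date)`.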

Here are the first 6 rows of the uncleaned CSV (`head()` returns six rows by default):

head(polls)

## Subset of Data

library(dplyr)

Here are just four columns of the data:
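
The code that built the subset is not shown here, so the following is only a sketch of how four columns might be picked with `dplyr::select()`. The graph later uses `end_date` and `approve`, and the faceting attempt uses `subject`, so those three are likely members. Including `disapprove` as the fourth is a guess. The `polls_demo` and `temp_demo` data frames are hypothetical stand-ins with made-up values.

```r
library(dplyr)

# Hypothetical stand-in for `polls`; the values are made up for illustration.
polls_demo <- data.frame(
  end_date   = c("2020-02-04", "2020-02-09", "2021-02-01"),
  pollster   = c("Pollster A", "Pollster B", "Pollster C"),
  subject    = c("Trump", "Trump", "Biden"),
  approve    = c(42, NA, 60),
  disapprove = c(50, 45, 30),
  stringsAsFactors = FALSE
)

# Keep four columns; which four the original used is an assumption.
temp_demo <- polls_demo %>%
  select(end_date, subject, approve, disapprove)

head(temp_demo)
```

Rows with a missing `approve` value are kept here, which is consistent with the missing-value warning that the plot produces.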

## Graph

I tried a few things that I couldn’t get to work. I’d hoped to turn in something better but ran out of time.

library(ggplot2)

# Scatter plot of approval percentage against poll end date
ggplot(data = temp) + 
  geom_point(mapping = aes(x = end_date, y = approve)) 
## Warning: Removed 3 rows containing missing values (geom_point).

# Attempted faceting by subject (did not work as hoped):
# + facet_grid(subject == "Trump" ~ subject == "Biden")
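
One possible way to get a separate panel per president is `facet_wrap(~ subject)`, which creates one panel for each distinct value of `subject`. The sketch below runs on a made-up stand-in data frame (`temp_demo` is hypothetical) and assumes `end_date` has already been converted to `Date`. It has not been run against the real polls data.

```r
library(ggplot2)

# Hypothetical stand-in for `temp`; the values are made up for illustration.
temp_demo <- data.frame(
  end_date = as.Date(c("2020-03-01", "2020-06-01", "2021-02-01", "2021-05-01")),
  subject  = c("Trump", "Trump", "Biden", "Biden"),
  approve  = c(45, 40, 60, NA)
)

# One panel per value of `subject`.
# na.rm = TRUE drops rows with a missing approve value without a warning.
p <- ggplot(data = temp_demo) +
  geom_point(mapping = aes(x = end_date, y = approve), na.rm = TRUE) +
  facet_wrap(~ subject)

print(p)
```

`facet_wrap()` takes the column name directly, so there is no need to write a comparison such as `subject == "Trump"` inside the formula.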