The main purpose of this assignment is to load data into a data frame. The Presidential Primary Polls data set from FiveThirtyEight is used for this purpose. The Presidential Primary Polls file contains data since the most recent election. The data is available at https://projects.fivethirtyeight.com/polls-page/data/president_primary_polls.csv. After the data is analyzed, the presidential candidates who received more than 55% support in a poll are identified, along with the surveyed population, party, and state.
Required libraries
library(dplyr)
library(magrittr)
library(RCurl)
The data has a header row, so read.csv must be told to treat the first line as column names (header=TRUE)
x <- getURL("https://projects.fivethirtyeight.com/polls-page/data/president_primary_polls.csv")
df <- read.csv(text = x, header = TRUE)
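As a small offline illustration of the loading step, the sketch below parses a few made-up CSV lines with read.csv(text = ...), the same call that is applied above to the string returned by getURL. The sample rows and the names sample_csv and sample_df are assumptions for illustration only, not real poll data.

```r
# Hypothetical three-line CSV standing in for the downloaded text
sample_csv <- "poll_id,state,candidate_name,pct
1,Texas,Candidate A,58
2,Florida,Candidate B,41"

# header = TRUE treats the first line as column names
sample_df <- read.csv(text = sample_csv, header = TRUE)

# Inspect the column names and types that read.csv inferred
str(sample_df)
```

Because the first line is consumed as the header, the resulting data frame has two data rows and four named columns.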
Select the columns of interest and check the column headers
df1 <- df %>% select(poll_id, pollster_id, pollster, pollster_rating_id, fte_grade,
                     methodology, state, population, subpopulation, sample_size,
                     question_id, cycle, office_type, stage, party, answer,
                     candidate_id, candidate_name, pct)
head(df1)
## poll_id pollster_id pollster pollster_rating_id fte_grade methodology
## 1 79907 568 YouGov 391 B+ Online
## 2 79907 568 YouGov 391 B+ Online
## 3 79907 568 YouGov 391 B+ Online
## 4 79907 568 YouGov 391 B+ Online
## 5 79907 568 YouGov 391 B+ Online
## 6 79907 568 YouGov 391 B+ Online
## state population subpopulation sample_size question_id cycle
## 1 Massachusetts lv d 500 160203 2024
## 2 Massachusetts lv d 500 160203 2024
## 3 Massachusetts lv d 500 160203 2024
## 4 Massachusetts lv d 500 160203 2024
## 5 Massachusetts lv d 500 160203 2024
## 6 Massachusetts lv d 500 160203 2024
## office_type stage party answer candidate_id
## 1 U.S. President primary DEM Biden 19368
## 2 U.S. President primary DEM Harris 16661
## 3 U.S. President primary DEM Sanders 19238
## 4 U.S. President primary DEM Buttigieg 16662
## 5 U.S. President primary DEM Warren 19237
## 6 U.S. President primary DEM Ocasio-Cortez 16664
## candidate_name pct
## 1 Joe Biden 22
## 2 Kamala Harris 9
## 3 Bernard Sanders 12
## 4 Pete Buttigieg 17
## 5 Elizabeth Warren 15
## 6 Alexandria Ocasio-Cortez 6
Rename columns and check the renamed column headers
df1 %<>% rename(fivethirtyeight_grade=fte_grade,percentage=pct)
head(df1)
## poll_id pollster_id pollster pollster_rating_id fivethirtyeight_grade
## 1 79907 568 YouGov 391 B+
## 2 79907 568 YouGov 391 B+
## 3 79907 568 YouGov 391 B+
## 4 79907 568 YouGov 391 B+
## 5 79907 568 YouGov 391 B+
## 6 79907 568 YouGov 391 B+
## methodology state population subpopulation sample_size question_id
## 1 Online Massachusetts lv d 500 160203
## 2 Online Massachusetts lv d 500 160203
## 3 Online Massachusetts lv d 500 160203
## 4 Online Massachusetts lv d 500 160203
## 5 Online Massachusetts lv d 500 160203
## 6 Online Massachusetts lv d 500 160203
## cycle office_type stage party answer candidate_id
## 1 2024 U.S. President primary DEM Biden 19368
## 2 2024 U.S. President primary DEM Harris 16661
## 3 2024 U.S. President primary DEM Sanders 19238
## 4 2024 U.S. President primary DEM Buttigieg 16662
## 5 2024 U.S. President primary DEM Warren 19237
## 6 2024 U.S. President primary DEM Ocasio-Cortez 16664
## candidate_name percentage
## 1 Joe Biden 22
## 2 Kamala Harris 9
## 3 Bernard Sanders 12
## 4 Pete Buttigieg 17
## 5 Elizabeth Warren 15
## 6 Alexandria Ocasio-Cortez 6
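The rename step relies on the compound assignment pipe %<>% from magrittr, which pipes the left-hand object into the expression and then overwrites that object with the result. The following minimal sketch, on a made-up two-row data frame (toy and toy_a are hypothetical names), compares it with the more explicit pipe-then-assign form.

```r
library(dplyr)
library(magrittr)

# Hypothetical toy data with the two columns that get renamed above
toy <- data.frame(fte_grade = c("B+", "A"), pct = c(22, 9))

# Two-step form: pipe into rename(), then assign the result to a new object
toy_a <- toy %>% rename(fivethirtyeight_grade = fte_grade, percentage = pct)

# Compound form: %<>% pipes `toy` into rename() and overwrites `toy`
toy %<>% rename(fivethirtyeight_grade = fte_grade, percentage = pct)

# Both forms should produce the same data frame
identical(toy, toy_a)
```

The compound form is shorter, but it modifies the object in place, so rerunning the line after the columns have already been renamed will fail because fte_grade and pct no longer exist.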
Recode the values in the population and subpopulation columns and check the transformed values
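# Replace the abbreviated codes with readable labels.
# Codes that match none of the conditions become NA.
df1 %<>% mutate(
  population = case_when(population == "lv" ~ "likely voters",
                         population == "rv" ~ "registered voters",
                         population == "a"  ~ "all adults",
                         population == "v"  ~ "voters"),
  subpopulation = case_when(subpopulation == "r" ~ "republican",
                            subpopulation == "d" ~ "democratic"))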
head(df1)
## poll_id pollster_id pollster pollster_rating_id fivethirtyeight_grade
## 1 79907 568 YouGov 391 B+
## 2 79907 568 YouGov 391 B+
## 3 79907 568 YouGov 391 B+
## 4 79907 568 YouGov 391 B+
## 5 79907 568 YouGov 391 B+
## 6 79907 568 YouGov 391 B+
## methodology state population subpopulation sample_size question_id
## 1 Online Massachusetts likely voters democratic 500 160203
## 2 Online Massachusetts likely voters democratic 500 160203
## 3 Online Massachusetts likely voters democratic 500 160203
## 4 Online Massachusetts likely voters democratic 500 160203
## 5 Online Massachusetts likely voters democratic 500 160203
## 6 Online Massachusetts likely voters democratic 500 160203
## cycle office_type stage party answer candidate_id
## 1 2024 U.S. President primary DEM Biden 19368
## 2 2024 U.S. President primary DEM Harris 16661
## 3 2024 U.S. President primary DEM Sanders 19238
## 4 2024 U.S. President primary DEM Buttigieg 16662
## 5 2024 U.S. President primary DEM Warren 19237
## 6 2024 U.S. President primary DEM Ocasio-Cortez 16664
## candidate_name percentage
## 1 Joe Biden 22
## 2 Kamala Harris 9
## 3 Bernard Sanders 12
## 4 Pete Buttigieg 17
## 5 Elizabeth Warren 15
## 6 Alexandria Ocasio-Cortez 6
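One detail of the recoding step is worth illustrating: case_when returns NA for any value that matches none of its conditions. The sketch below uses a made-up vector of codes (the code "xx" and the column names strict and lenient are hypothetical) to compare the form used above with a variant that keeps unmatched codes unchanged through a final TRUE ~ population fallback.

```r
library(dplyr)

# Hypothetical codes; "xx" stands in for any code not covered by the recoding
codes <- data.frame(population = c("lv", "rv", "a", "v", "xx"),
                    stringsAsFactors = FALSE)

recoded <- codes %>%
  mutate(
    # Same conditions as in the analysis: unmatched codes become NA
    strict = case_when(population == "lv" ~ "likely voters",
                       population == "rv" ~ "registered voters",
                       population == "a"  ~ "all adults",
                       population == "v"  ~ "voters"),
    # Variant with a fallback: unmatched codes are kept as they are
    lenient = case_when(population == "lv" ~ "likely voters",
                        population == "rv" ~ "registered voters",
                        population == "a"  ~ "all adults",
                        population == "v"  ~ "voters",
                        TRUE ~ population)
  )

recoded
```

Whether NA or the original code is preferable depends on the analysis; the fallback simply makes unexpected codes visible instead of silently turning them into missing values.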
Prepare a table showing, by population and party, the candidates who received more than 55% support in presidential primary polls
table1<-subset(df1,percentage>55,select=c(state,population,party,candidate_name,percentage))
table1
##               state        population party candidate_name percentage
## 198                 registered voters   REP   Donald Trump       57.0
## 214        Virginia        all adults   REP   Donald Trump       62.0
## 298           Texas registered voters   REP   Ron DeSantis       58.0
## 312                 registered voters   REP   Donald Trump       56.0
## 552                     likely voters   DEM      Joe Biden       56.0
## 554                 registered voters   DEM      Joe Biden       56.0
## 556                     likely voters   DEM  Kamala Harris       56.0
## 560                     likely voters   REP   Donald Trump       56.0
## 562                 registered voters   REP   Donald Trump       59.0
## 580                        all adults   REP   Donald Trump       64.0
## 582                        all adults   REP   Donald Trump       79.0
## 595         Florida registered voters   REP   Ron DeSantis       61.0
## 664                 registered voters   REP   Donald Trump       56.0
## 680                     likely voters   REP   Donald Trump       55.4
## 688                        all adults   REP   Donald Trump       61.0
## 690                 registered voters   REP   Donald Trump       65.0
## 715                     likely voters   REP   Donald Trump       59.0
## 931           Texas registered voters   REP   Ron DeSantis       57.0
## 1127          Texas     likely voters   REP   Ron DeSantis       56.0
## 1157                    likely voters   REP   Donald Trump       57.0
## 1228                registered voters   REP   Donald Trump       58.0
## 1306                registered voters   REP   Donald Trump       60.0
## 1308                    likely voters   REP   Donald Trump       64.0
## 1310                registered voters   REP   Donald Trump       68.0
## 1437                registered voters   REP   Donald Trump       59.0
## 1657                registered voters   REP   Donald Trump       63.0
## 1897                registered voters   REP   Donald Trump       57.0
## 1916                registered voters   DEM  Kamala Harris       56.0
## 1922                registered voters   REP   Donald Trump       57.0
## 2149                registered voters   REP   Donald Trump       60.0
## 2165                registered voters   REP   Donald Trump       67.0
## 2413                registered voters   REP   Donald Trump       62.0
## 2503                    likely voters   REP   Donald Trump       59.1
## 2547        Florida     likely voters   REP   Donald Trump       58.0
## 2550        Florida     likely voters   DEM      Joe Biden       60.0
## 2551                registered voters   REP   Donald Trump       58.0
## 2566                    likely voters   REP   Donald Trump       59.0
## 2612                registered voters   REP   Donald Trump       66.8
## 2719                       all adults   REP   Donald Trump       58.0
## 2799                registered voters   REP   Donald Trump       58.0
## 2867                    likely voters   REP   Donald Trump       57.0
## 3075  New Hampshire     likely voters   REP   Donald Trump       62.0
## 3104           Iowa registered voters   REP   Donald Trump       61.0
## 3231        Florida     likely voters   REP   Ron DeSantis       64.0
## 3309        Georgia     likely voters   REP   Donald Trump       73.0
## 3321                    likely voters   REP   Donald Trump       56.0
## 3350  New Hampshire registered voters   REP   Donald Trump       56.5
## 3369 North Carolina     likely voters   REP   Donald Trump       75.6
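The filtering step can also be written with dplyr verbs, which makes it easy to follow up with a summary. The sketch below is one possible equivalent on a made-up data frame (toy_polls, over_55 and the candidate labels are hypothetical, not values from the poll file); it applies the same strict "greater than 55" rule as subset and then counts qualifying poll rows per candidate.

```r
library(dplyr)

# Hypothetical rows mimicking the columns of df1 used in the table above
toy_polls <- data.frame(
  state          = c("", "Texas", "Florida", ""),
  population     = c("registered voters", "likely voters",
                     "likely voters", "all adults"),
  party          = c("REP", "REP", "DEM", "REP"),
  candidate_name = c("Candidate A", "Candidate B", "Candidate C", "Candidate A"),
  percentage     = c(57, 56, 40, 55),
  stringsAsFactors = FALSE
)

# Same rule as subset(): strictly greater than 55, so a value of exactly 55 is dropped
over_55 <- toy_polls %>%
  filter(percentage > 55) %>%
  select(state, population, party, candidate_name, percentage)

# Number of qualifying poll rows per candidate, largest first
over_55 %>% count(candidate_name, sort = TRUE)
```

Note that each row is one poll answer, so a candidate who appears many times in the table was above the threshold in many separate polls, not in many separate states.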
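The Presidential Primary Polls data was used here to identify, by surveyed population and party, the candidates who received more than 55% support in presidential primary polls in different states. Rows in which the state column is blank come from polls that have no state recorded in the source data.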