The article ‘Why Many Americans Don’t Vote’ describes a survey and subsequent analysis that sought to provide insight into voting history. The survey captured the responses of 8,327 unique individuals to questions ranging from demographic information to elections and political perceptions. The survey results were then matched to a second dataset containing each respondent's voting history. The final dataset provided by fivethirtyeight.com contains survey and voting history information for 5,836 individuals. The article can be found here: https://projects.fivethirtyeight.com/non-voters-poll-2020-election/
For this assignment I created a subset of three variables in hopes of providing insight into the relationship between the belief that systemic racism is a problem in the United States and voting history.
‘Q3_1’ (Agreement with the statement that systemic racism is a problem in the United States)
Available responses: 1 = Strongly agree, 2 = Somewhat agree, 3 = Somewhat disagree, 4 = Strongly disagree
‘voter_category’ (A history of voting in recent elections)
Available responses: Always, Sporadic, Rarely/Never
‘RespId’ (A unique identification number representing one survey respondent)
‘voter_category’ Response Definitions
Always = voted in all, or all but one, of the national elections (presidential and midterm) they were eligible to vote in since 2000
Sporadic = voted in at least two elections, but fewer than all, or all but one, of the elections they were eligible to vote in
Rarely/Never = voted in no elections, or only one.
# Constructing a new data frame and recoding variables
# Assumes nonvoters_data has already been read in from the FiveThirtyEight CSV
library(dplyr)

# Create a new variable from 'Q3_1', recoding the numeric responses with case_when()
# to make the data more readable; any other code (e.g. -1) becomes NA
nonvoters_data <- nonvoters_data %>%
  mutate(Systemic_racism_is_problem_in_the_United_States = case_when(
    Q3_1 == 1 ~ '1.Strongly Agree',
    Q3_1 == 2 ~ '2.Somewhat Agree',
    Q3_1 == 3 ~ '3.Somewhat Disagree',
    Q3_1 == 4 ~ '4.Strongly Disagree'
  ))

# Rename the respondent ID column
nonvoters_data <- nonvoters_data %>%
  rename(Respondant_Identification_Number = RespId)

# Create a new data frame with the three variables of interest
d_frame <- nonvoters_data %>%
  select(Respondant_Identification_Number,
         Systemic_racism_is_problem_in_the_United_States,
         voter_category)

# Exclude rows with missing values from the data frame.
# Unanswered responses were recoded to NA above, so filter on NA rather than "-1".
# This is the final data frame for the assignment.
d_frame <- d_frame %>%
  filter(!is.na(Systemic_racism_is_problem_in_the_United_States))
I placed the data into a contingency table to allow for a high-level overview.
Observed_Values<-table(d_frame$Systemic_racism_is_problem_in_the_United_States, d_frame$voter_category)
Observed_Values
##
##                       always rarely/never sporadic
##   1.Strongly Agree       844          626     1212
##   2.Somewhat Agree       436          478      680
##   3.Somewhat Disagree    279          187      337
##   4.Strongly Disagree    243          141      334
Further analysis is needed to determine whether there is a relationship between responses to “Systemic_racism_is_problem_in_the_United_States” and voting history. It is interesting to note that respondents who strongly agree with the statement most often have a sporadic voting history (1,212 of 2,682), although sporadic is also the most common category at every other response level, so the raw counts alone do not show an association. I would consider using a chi-squared test of independence to assess whether these two variables are related.
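As a rough sketch of that next step, the counts from the table can be converted to row proportions and passed to base R's chisq.test(). The matrix below is retyped by hand from the printed output so the example stands alone; the names observed, row_shares, and chi_result are my own, and the test treats the counts as unweighted, independent observations, which is an assumption rather than something the survey design guarantees.

```r
# Counts copied by hand from the Observed_Values table (rows: response, columns: voter category)
observed <- matrix(
  c(844, 626, 1212,
    436, 478,  680,
    279, 187,  337,
    243, 141,  334),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    response = c("1.Strongly Agree", "2.Somewhat Agree",
                 "3.Somewhat Disagree", "4.Strongly Disagree"),
    voter_category = c("always", "rarely/never", "sporadic")
  )
)

# Share of each voter category within each response level (each row sums to 1)
row_shares <- prop.table(observed, margin = 1)
round(row_shares, 3)

# Pearson chi-squared test of independence
# H0: response to the statement and voter category are independent
chi_result <- chisq.test(observed)
chi_result

# Expected counts under independence, for comparison with the observed counts
round(chi_result$expected, 1)
```

In the actual analysis the same call could be made directly on Observed_Values instead of the retyped matrix. A small p-value would suggest the two variables are associated, but it would not indicate the direction or strength of that association, so the row proportions remain useful for interpretation.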