Overview

The article I chose is "Why Many Americans Don't Vote" by Amelia Thomson-DeVeaux, Jasmine Mithani, and Laura Bronner. The article uses poll data to study why many Americans do not vote. This is an important topic because, as the article states, between 35 and 60 percent of eligible voters do not cast a ballot in any given election. The article is available at: https://projects.fivethirtyeight.com/non-voters-poll-2020-election/.

Code

Read .csv file

poll.data <- read.csv("https://raw.githubusercontent.com/SaneSky109/DATA607/main/poll_data.csv")

# Check the first few rows of the dataset using head() function
head(poll.data)
##   RespId weight Q1 Q2_1 Q2_2 Q2_3 Q2_4 Q2_5 Q2_6 Q2_7 Q2_8 Q2_9 Q2_10 Q3_1 Q3_2
## 1 470001 0.7516  1    1    1    2    4    1    4    2    2    4     2    1    1
## 2 470002 1.0267  1    1    2    2    3    1    1    2    1    1     3    3    3
## 3 470003 1.0844  1    1    1    2    2    1    1    2    1    4     3    2    2
## 4 470007 0.6817  1    1    1    1    3    1    1    1    1    1     2    1    1
## 5 480008 0.9910  1    1    1   -1    1    1    1    1    1    1     1    4   -1
## 6 480009 1.0591  1    3    2    3    4    1    3    3    1    1     4    1    2
##   Q3_3 Q3_4 Q3_5 Q3_6 Q4_1 Q4_2 Q4_3 Q4_4 Q4_5 Q4_6 Q5 Q6 Q7 Q8_1 Q8_2 Q8_3
## 1    4    4    3    2    2    1    2    2    2    2  1  2  1    3    4    2
## 2    4    3    3    2    2    2    2    3    3    1  1  2  2    2    3    2
## 3    3    3    2    2    2    2    3    3    2    3  1  1  1    3    2    1
## 4    4    4    2    1    1    2    2    2    2    2  1  3  1    3    2    2
## 5    1    1    2    4    1    1    1    1    1    1  1  2  2    1    3    2
## 6   -1    2    2    2    4    3    3    3    4    2  2  4  1    3    3    3
##   Q8_4 Q8_5 Q8_6 Q8_7 Q8_8 Q8_9 Q9_1 Q9_2 Q9_3 Q9_4 Q10_1 Q10_2 Q10_3 Q10_4
## 1    1    1    1    1    2    4    2    2    4    4     2     2     2     2
## 2    2    2    2    3    2    2    1    1    3    4     2     2     2     2
## 3    1    2    2    2    2    1    1    2    4    4     2     2     1     2
## 4    2    2    2    2    2    2    1    2    4    4     2     2     2     2
## 5    3    3    3    4    2    2    1    4    3    4     2     2     2     2
## 6    2    3    3    2    2    2   -1   -1   -1    4     2     2     2     2
##   Q11_1 Q11_2 Q11_3 Q11_4 Q11_5 Q11_6 Q14 Q15 Q16 Q17_1 Q17_2 Q17_3 Q17_4 Q18_1
## 1     2     2     2     2     2     2   5   1   1     1     1     1     3     2
## 2     2     2     1     2     2     2   1   1   2     2     2     2     3     2
## 3     2     2     1     2     1     2   5   2   1     1     3     1     1     2
## 4     1     2     2     2     1     2   5   1   4     1     1     1     1     2
## 5     2     2     1     2     2     2   1   5   1     2     2     4     4     2
## 6     2     2     2     1     2     2  -1  -1  -1    -1    -1    -1    -1     2
##   Q18_2 Q18_3 Q18_4 Q18_5 Q18_6 Q18_7 Q18_8 Q18_9 Q18_10 Q19_1 Q19_2 Q19_3
## 1     2     2     2     2     2     2     2     2      2    -1    -1     1
## 2     2     2     2     2     2     2     2     2      2    -1     1    -1
## 3     2     2     2     2     2     1     2     2      2    -1     1    -1
## 4     2     2     2     2     2     2     2     2      2    -1    -1     1
## 5     2     2     2     2     2     2     2     2      2    -1    -1    -1
## 6     2     2     2     2     2     2     2     2      2    -1    -1    -1
##   Q19_4 Q19_5 Q19_6 Q19_7 Q19_8 Q19_9 Q19_10 Q20 Q21 Q22 Q23 Q24 Q25 Q26 Q27_1
## 1     1     1     1     1    -1    -1     -1   1   1  NA   2   1   1   1     1
## 2    -1    -1    -1    -1    -1    -1     -1   1   1  NA   1   3   3   1     1
## 3     1    -1    -1    -1     1     1     -1   1   1  NA   2   1   2   1     1
## 4    -1    -1    -1    -1     1    -1      1   1   1  NA   2   1   2   1     1
## 5    -1    -1    -1    -1    -1    -1     -1   1   1  NA   1   3   1   1     1
## 6    -1    -1    -1    -1    -1    -1     -1   2   2   7  -1   4   3   4     2
##   Q27_2 Q27_3 Q27_4 Q27_5 Q27_6 Q28_1 Q28_2 Q28_3 Q28_4 Q28_5 Q28_6 Q28_7 Q28_8
## 1     1     1     1     1     1     1     1     1     1    -1    -1     1    -1
## 2     1     1     1     1     1     1    -1    -1    -1    -1     1    -1    -1
## 3     1     1     1     1     1     1    -1    -1    -1    -1    -1     1    -1
## 4     1     1     1     1     1     1     1    -1     1    -1    -1    -1    -1
## 5     1     1     1     1     1     1     1     1    -1     1    -1     1    -1
## 6     2     2     2     2     2    NA    NA    NA    NA    NA    NA    NA    NA
##   Q29_1 Q29_2 Q29_3 Q29_4 Q29_5 Q29_6 Q29_7 Q29_8 Q29_9 Q29_10 Q30 Q31 Q32 Q33
## 1    NA    NA    NA    NA    NA    NA    NA    NA    NA     NA   2  NA   1  NA
## 2    NA    NA    NA    NA    NA    NA    NA    NA    NA     NA   3  NA  NA   1
## 3    NA    NA    NA    NA    NA    NA    NA    NA    NA     NA   2  NA   2  NA
## 4    NA    NA    NA    NA    NA    NA    NA    NA    NA     NA   2  NA   1  NA
## 5    NA    NA    NA    NA    NA    NA    NA    NA    NA     NA   1  -1  NA  NA
## 6    -1    -1    -1    -1    -1    -1    -1    -1     1     -1   5  NA  NA  -1
##   ppage                educ  race gender    income_cat voter_category
## 1    73             College White Female      $75-125k         always
## 2    90             College White Female $125k or more         always
## 3    53             College White   Male $125k or more       sporadic
## 4    58        Some college Black Female       $40-75k       sporadic
## 5    81 High school or less White   Male       $40-75k         always
## 6    61 High school or less White Female       $40-75k   rarely/never

The output shows three problems: the column names are uninformative, the coded entries need clearer labels, and the number of columns can be reduced.
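Before cleaning, it can help to count how often each column holds NA or the code -1, since both appear in the output above. The sketch below uses a small made-up data frame (`toy`, with hypothetical values) rather than the real poll data, and the names `na.counts` and `neg.counts` are illustrative only.

```r
# Toy stand-in for a few poll columns (hypothetical values)
toy <- data.frame(
  Q19_1 = c(-1, -1, 1),
  Q22   = c(NA, NA, 7),
  Q23   = c(2, 1, -1)
)

# Number of rows and columns
dim(toy)

# Missing values per column
na.counts <- colSums(is.na(toy))

# Entries coded as -1 per column (NA cells are skipped)
neg.counts <- colSums(toy == -1, na.rm = TRUE)

na.counts
neg.counts
```

The same two `colSums()` calls should work on `poll.data` to show which questions carry many NA or -1 entries.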

Subset the data

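# Keep Q2_1-Q2_10, Q6, Q7, Q8_1-Q8_9, Q10_1-Q10_4, Q16, Q23, Q25, Q30, and the six demographic columns
poll.data.subset <- poll.data[, c(4:13, 27:37, 42:45, 54, 82, 84, 110, 114:119)]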

head(poll.data.subset)
##   Q2_1 Q2_2 Q2_3 Q2_4 Q2_5 Q2_6 Q2_7 Q2_8 Q2_9 Q2_10 Q6 Q7 Q8_1 Q8_2 Q8_3 Q8_4
## 1    1    1    2    4    1    4    2    2    4     2  2  1    3    4    2    1
## 2    1    2    2    3    1    1    2    1    1     3  2  2    2    3    2    2
## 3    1    1    2    2    1    1    2    1    4     3  1  1    3    2    1    1
## 4    1    1    1    3    1    1    1    1    1     2  3  1    3    2    2    2
## 5    1    1   -1    1    1    1    1    1    1     1  2  2    1    3    2    3
## 6    3    2    3    4    1    3    3    1    1     4  4  1    3    3    3    2
##   Q8_5 Q8_6 Q8_7 Q8_8 Q8_9 Q10_1 Q10_2 Q10_3 Q10_4 Q16 Q23 Q25 Q30 ppage
## 1    1    1    1    2    4     2     2     2     2   1   2   1   2    73
## 2    2    2    3    2    2     2     2     2     2   2   1   3   3    90
## 3    2    2    2    2    1     2     2     1     2   1   2   2   2    53
## 4    2    2    2    2    2     2     2     2     2   4   2   2   2    58
## 5    3    3    4    2    2     2     2     2     2   1   1   1   1    81
## 6    3    3    2    2    2     2     2     2     2  -1  -1   3   5    61
##                  educ  race gender    income_cat voter_category
## 1             College White Female      $75-125k         always
## 2             College White Female $125k or more         always
## 3             College White   Male $125k or more       sporadic
## 4        Some college Black Female       $40-75k       sporadic
## 5 High school or less White   Male       $40-75k         always
## 6 High school or less White Female       $40-75k   rarely/never
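Selecting columns by position works, but it silently picks the wrong columns if the file layout ever changes. As an alternative, the same kind of subset can be built from column names. This is a minimal sketch on a made-up data frame (`toy`, with hypothetical values that follow the poll's naming pattern), not the selection used in this report.

```r
# Toy stand-in for poll.data (hypothetical values, same naming pattern)
toy <- data.frame(
  RespId = 1:3,
  weight = c(0.75, 1.03, 1.08),
  Q2_1   = c(1, 1, 3),
  Q2_2   = c(1, 2, 2),
  Q3_1   = c(1, 3, 2),
  Q6     = c(2, 2, 4),
  ppage  = c(73, 90, 61),
  voter_category = c("always", "always", "rarely/never")
)

# Every Q2_* item, matched by pattern, plus a few columns named explicitly
keep <- c(grep("^Q2_", names(toy), value = TRUE), "Q6", "ppage", "voter_category")
toy.subset <- toy[, keep]

names(toy.subset)
```

With names in place of indices, the selection documents itself and fails loudly if an expected column is missing.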

After reviewing the dataset and the article, I decided that these variables are the most useful for predicting the target variable, voter_category. The next step is to rename the unclear columns and relabel their coded entries.

Renaming the Columns and Entries

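# Rename variables. For each survey item: rename the column, recode -1 to 1,
# then convert the column to a factor with descriptive labels.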

names(poll.data.subset)[names(poll.data.subset) == 'Q2_1'] <- 'Importance_of_Voting'
poll.data.subset$Importance_of_Voting[poll.data.subset$Importance_of_Voting == -1]<-1
poll.data.subset$Importance_of_Voting<-factor(poll.data.subset$Importance_of_Voting,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))
names(poll.data.subset)[names(poll.data.subset) == 'Q2_2'] <- 'Importance_of_Jury_Duty'
poll.data.subset$Importance_of_Jury_Duty[poll.data.subset$Importance_of_Jury_Duty == -1]<-1
poll.data.subset$Importance_of_Jury_Duty<-factor(poll.data.subset$Importance_of_Jury_Duty,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))
names(poll.data.subset)[names(poll.data.subset) == 'Q2_3'] <- 'Importance_of_Following_Politics'
poll.data.subset$Importance_of_Following_Politics[poll.data.subset$Importance_of_Following_Politics == -1]<-1
poll.data.subset$Importance_of_Following_Politics<-factor(poll.data.subset$Importance_of_Following_Politics,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))
names(poll.data.subset)[names(poll.data.subset) == 'Q2_4'] <- 'Importance_of_the_Flag'
poll.data.subset$Importance_of_the_Flag[poll.data.subset$Importance_of_the_Flag == -1]<-1
poll.data.subset$Importance_of_the_Flag<-factor(poll.data.subset$Importance_of_the_Flag,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))
names(poll.data.subset)[names(poll.data.subset) == 'Q2_5'] <- 'Importance_of_US_Census'
poll.data.subset$Importance_of_US_Census[poll.data.subset$Importance_of_US_Census == -1]<-1
poll.data.subset$Importance_of_US_Census<-factor(poll.data.subset$Importance_of_US_Census,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))
names(poll.data.subset)[names(poll.data.subset) == 'Q2_6'] <- 'Importance_of_Saying_the_Pledge'
poll.data.subset$Importance_of_Saying_the_Pledge[poll.data.subset$Importance_of_Saying_the_Pledge == -1]<-1
poll.data.subset$Importance_of_Saying_the_Pledge<-factor(poll.data.subset$Importance_of_Saying_the_Pledge,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))
names(poll.data.subset)[names(poll.data.subset) == 'Q2_7'] <- 'Importance_of_Military_Support'
poll.data.subset$Importance_of_Military_Support[poll.data.subset$Importance_of_Military_Support == -1]<-1
poll.data.subset$Importance_of_Military_Support<-factor(poll.data.subset$Importance_of_Military_Support,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))
names(poll.data.subset)[names(poll.data.subset) == 'Q2_8'] <- 'Importance_of_Respecting_Opinions'
poll.data.subset$Importance_of_Respecting_Opinions[poll.data.subset$Importance_of_Respecting_Opinions == -1]<-1
poll.data.subset$Importance_of_Respecting_Opinions<-factor(poll.data.subset$Importance_of_Respecting_Opinions,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))
names(poll.data.subset)[names(poll.data.subset) == 'Q2_9'] <- 'Importance_of_Religion'
poll.data.subset$Importance_of_Religion[poll.data.subset$Importance_of_Religion == -1]<-1
poll.data.subset$Importance_of_Religion<-factor(poll.data.subset$Importance_of_Religion,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))
names(poll.data.subset)[names(poll.data.subset) == 'Q2_10'] <- 'Importance_of_Right_to_Protest'
poll.data.subset$Importance_of_Right_to_Protest[poll.data.subset$Importance_of_Right_to_Protest == -1]<-1
poll.data.subset$Importance_of_Right_to_Protest<-factor(poll.data.subset$Importance_of_Right_to_Protest,labels = c("very_important", "somewhat_important", "not_so_important", "not_at_all_important"))

names(poll.data.subset)[names(poll.data.subset) == 'Q6'] <- 'How_many_people_in_office_are_like_you'
poll.data.subset$How_many_people_in_office_are_like_you[poll.data.subset$How_many_people_in_office_are_like_you == -1]<-1
poll.data.subset$How_many_people_in_office_are_like_you<-factor(poll.data.subset$How_many_people_in_office_are_like_you, labels = c("a lot", "some", "a few", "none"))


names(poll.data.subset)[names(poll.data.subset) == 'Q7'] <- 'Opinion_on_Structure_of_US_Government'
poll.data.subset$Opinion_on_Structure_of_US_Government[poll.data.subset$Opinion_on_Structure_of_US_Government == -1]<-1
poll.data.subset$Opinion_on_Structure_of_US_Government<-factor(poll.data.subset$Opinion_on_Structure_of_US_Government, labels = c("a lot needs to change", "change is not really needed"))


names(poll.data.subset)[names(poll.data.subset) == 'Q8_1'] <- 'Trust_President'
poll.data.subset$Trust_President[poll.data.subset$Trust_President == -1]<-1
poll.data.subset$Trust_President<-factor(poll.data.subset$Trust_President, labels = c("a lot", "some", "not much", "not at all"))
names(poll.data.subset)[names(poll.data.subset) == 'Q8_2'] <- 'Trust_Congress'
poll.data.subset$Trust_Congress[poll.data.subset$Trust_Congress == -1]<-1
poll.data.subset$Trust_Congress<-factor(poll.data.subset$Trust_Congress, labels = c("a lot", "some", "not much", "not at all"))
names(poll.data.subset)[names(poll.data.subset) == 'Q8_3'] <- 'Trust_Supreme_Court'
poll.data.subset$Trust_Supreme_Court[poll.data.subset$Trust_Supreme_Court == -1]<-1
poll.data.subset$Trust_Supreme_Court<-factor(poll.data.subset$Trust_Supreme_Court, labels = c("a lot", "some", "not much", "not at all"))
names(poll.data.subset)[names(poll.data.subset) == 'Q8_4'] <- 'Trust_CDC'
poll.data.subset$Trust_CDC[poll.data.subset$Trust_CDC == -1]<-1
poll.data.subset$Trust_CDC<-factor(poll.data.subset$Trust_CDC, labels = c("a lot", "some", "not much", "not at all"))
names(poll.data.subset)[names(poll.data.subset) == 'Q8_5'] <- 'Trust_Elected_Officials'
poll.data.subset$Trust_Elected_Officials[poll.data.subset$Trust_Elected_Officials == -1]<-1
poll.data.subset$Trust_Elected_Officials<-factor(poll.data.subset$Trust_Elected_Officials, labels = c("a lot", "some", "not much", "not at all"))
names(poll.data.subset)[names(poll.data.subset) == 'Q8_6'] <- 'Trust_CIA_or_FBI'
poll.data.subset$Trust_CIA_or_FBI[poll.data.subset$Trust_CIA_or_FBI == -1]<-1
poll.data.subset$Trust_CIA_or_FBI<-factor(poll.data.subset$Trust_CIA_or_FBI, labels = c("a lot", "some", "not much", "not at all"))
names(poll.data.subset)[names(poll.data.subset) == 'Q8_7'] <- 'Trust_News_Media_Outlets'
poll.data.subset$Trust_News_Media_Outlets[poll.data.subset$Trust_News_Media_Outlets == -1]<-1
poll.data.subset$Trust_News_Media_Outlets<-factor(poll.data.subset$Trust_News_Media_Outlets, labels = c("a lot", "some", "not much", "not at all"))
names(poll.data.subset)[names(poll.data.subset) == 'Q8_8'] <- 'Trust_Police'
poll.data.subset$Trust_Police[poll.data.subset$Trust_Police == -1]<-1
poll.data.subset$Trust_Police<-factor(poll.data.subset$Trust_Police, labels = c("a lot", "some", "not much", "not at all"))
names(poll.data.subset)[names(poll.data.subset) == 'Q8_9'] <- 'Trust_US_Postal_Service'
poll.data.subset$Trust_US_Postal_Service[poll.data.subset$Trust_US_Postal_Service == -1]<-1
poll.data.subset$Trust_US_Postal_Service<-factor(poll.data.subset$Trust_US_Postal_Service, labels = c("a lot", "some", "not much", "not at all"))

names(poll.data.subset)[names(poll.data.subset) == 'Q10_1'] <- 'Recieve_Longterm_Disability'
poll.data.subset$Recieve_Longterm_Disability[poll.data.subset$Recieve_Longterm_Disability == -1]<-1
poll.data.subset$Recieve_Longterm_Disability<-factor(poll.data.subset$Recieve_Longterm_Disability, labels = c("Yes", "No"))
names(poll.data.subset)[names(poll.data.subset) == 'Q10_2'] <- 'Have_Chronic_Illness'
poll.data.subset$Have_Chronic_Illness[poll.data.subset$Have_Chronic_Illness == -1]<-1
poll.data.subset$Have_Chronic_Illness<-factor(poll.data.subset$Have_Chronic_Illness, labels = c("Yes", "No"))
names(poll.data.subset)[names(poll.data.subset) == 'Q10_3'] <- 'Unemployed_Longer_than_1Year'
poll.data.subset$Unemployed_Longer_than_1Year[poll.data.subset$Unemployed_Longer_than_1Year == -1]<-1
poll.data.subset$Unemployed_Longer_than_1Year<-factor(poll.data.subset$Unemployed_Longer_than_1Year, labels = c("Yes", "No"))
names(poll.data.subset)[names(poll.data.subset) == 'Q10_4'] <- 'Evicted_within_past_Year'
poll.data.subset$Evicted_within_past_Year[poll.data.subset$Evicted_within_past_Year == -1]<-1
poll.data.subset$Evicted_within_past_Year<-factor(poll.data.subset$Evicted_within_past_Year, labels = c("Yes", "No"))

names(poll.data.subset)[names(poll.data.subset) == 'Q16'] <- 'How_Easy_is_it_to_Vote_in_National_Elections'
poll.data.subset$How_Easy_is_it_to_Vote_in_National_Elections[poll.data.subset$How_Easy_is_it_to_Vote_in_National_Elections == -1]<-1
poll.data.subset$How_Easy_is_it_to_Vote_in_National_Elections<-factor(poll.data.subset$How_Easy_is_it_to_Vote_in_National_Elections, labels = c("Very easy", "Somewhat easy", "Somewhat difficult","Very difficult"))


names(poll.data.subset)[names(poll.data.subset) == 'Q23'] <- 'Presidential_Candidate_Vote_for_2020'
poll.data.subset$Presidential_Candidate_Vote_for_2020[poll.data.subset$Presidential_Candidate_Vote_for_2020 == -1]<-1
poll.data.subset$Presidential_Candidate_Vote_for_2020<-factor(poll.data.subset$Presidential_Candidate_Vote_for_2020, labels = c("Donald Trump", "Joe Biden", "Unsure"))


names(poll.data.subset)[names(poll.data.subset) == 'Q25'] <- 'Following_Presidential_Race_2020'
poll.data.subset$Following_Presidential_Race_2020[poll.data.subset$Following_Presidential_Race_2020 == -1]<-1
poll.data.subset$Following_Presidential_Race_2020<-factor(poll.data.subset$Following_Presidential_Race_2020, labels = c("Very closely", "Somewhat closely", "Not very closely","Not closely at all"))

names(poll.data.subset)[names(poll.data.subset) == 'Q30'] <- 'Political_Affiliation'
poll.data.subset$Political_Affiliation[poll.data.subset$Political_Affiliation == -1]<-1
poll.data.subset$Political_Affiliation<-factor(poll.data.subset$Political_Affiliation, labels = c("Republican", "Democrat", "Independent","Other","No preference"))

names(poll.data.subset)[names(poll.data.subset) == 'ppage'] <- 'Age'
names(poll.data.subset)[names(poll.data.subset) == 'educ'] <- 'Education'
names(poll.data.subset)[names(poll.data.subset) == 'income_cat'] <- 'Income'

# Show the improved colnames and labels
head(poll.data.subset)
##   Importance_of_Voting Importance_of_Jury_Duty Importance_of_Following_Politics
## 1       very_important          very_important               somewhat_important
## 2       very_important      somewhat_important               somewhat_important
## 3       very_important          very_important               somewhat_important
## 4       very_important          very_important                   very_important
## 5       very_important          very_important                   very_important
## 6     not_so_important      somewhat_important                 not_so_important
##   Importance_of_the_Flag Importance_of_US_Census
## 1   not_at_all_important          very_important
## 2       not_so_important          very_important
## 3     somewhat_important          very_important
## 4       not_so_important          very_important
## 5         very_important          very_important
## 6   not_at_all_important          very_important
##   Importance_of_Saying_the_Pledge Importance_of_Military_Support
## 1            not_at_all_important             somewhat_important
## 2                  very_important             somewhat_important
## 3                  very_important             somewhat_important
## 4                  very_important                 very_important
## 5                  very_important                 very_important
## 6                not_so_important               not_so_important
##   Importance_of_Respecting_Opinions Importance_of_Religion
## 1                somewhat_important   not_at_all_important
## 2                    very_important         very_important
## 3                    very_important   not_at_all_important
## 4                    very_important         very_important
## 5                    very_important         very_important
## 6                    very_important         very_important
##   Importance_of_Right_to_Protest How_many_people_in_office_are_like_you
## 1             somewhat_important                                   some
## 2               not_so_important                                   some
## 3               not_so_important                                  a lot
## 4             somewhat_important                                  a few
## 5                 very_important                                   some
## 6           not_at_all_important                                   none
##   Opinion_on_Structure_of_US_Government Trust_President Trust_Congress
## 1                 a lot needs to change        not much     not at all
## 2           change is not really needed            some       not much
## 3                 a lot needs to change        not much           some
## 4                 a lot needs to change        not much           some
## 5           change is not really needed           a lot       not much
## 6                 a lot needs to change        not much       not much
##   Trust_Supreme_Court Trust_CDC Trust_Elected_Officials Trust_CIA_or_FBI
## 1                some     a lot                   a lot            a lot
## 2                some      some                    some             some
## 3               a lot     a lot                    some             some
## 4                some      some                    some             some
## 5                some  not much                not much         not much
## 6            not much      some                not much         not much
##   Trust_News_Media_Outlets Trust_Police Trust_US_Postal_Service
## 1                    a lot         some              not at all
## 2                 not much         some                    some
## 3                     some         some                   a lot
## 4                     some         some                    some
## 5               not at all         some                    some
## 6                     some         some                    some
##   Recieve_Longterm_Disability Have_Chronic_Illness Unemployed_Longer_than_1Year
## 1                          No                   No                           No
## 2                          No                   No                           No
## 3                          No                   No                          Yes
## 4                          No                   No                           No
## 5                          No                   No                           No
## 6                          No                   No                           No
##   Evicted_within_past_Year How_Easy_is_it_to_Vote_in_National_Elections
## 1                       No                                    Very easy
## 2                       No                                Somewhat easy
## 3                       No                                    Very easy
## 4                       No                               Very difficult
## 5                       No                                    Very easy
## 6                       No                                    Very easy
##   Presidential_Candidate_Vote_for_2020 Following_Presidential_Race_2020
## 1                            Joe Biden                     Very closely
## 2                         Donald Trump                 Not very closely
## 3                            Joe Biden                 Somewhat closely
## 4                            Joe Biden                 Somewhat closely
## 5                         Donald Trump                     Very closely
## 6                         Donald Trump                 Not very closely
##   Political_Affiliation Age           Education  race gender        Income
## 1              Democrat  73             College White Female      $75-125k
## 2           Independent  90             College White Female $125k or more
## 3              Democrat  53             College White   Male $125k or more
## 4              Democrat  58        Some college Black Female       $40-75k
## 5            Republican  81 High school or less White   Male       $40-75k
## 6         No preference  61 High school or less White Female       $40-75k
##   voter_category
## 1         always
## 2         always
## 3       sporadic
## 4       sporadic
## 5         always
## 6   rarely/never

The adjusted names and entries make the dataset much easier to read.
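The rename, recode, and factor steps repeat the same three lines for every question, so they can be wrapped in one function and driven by a lookup list. The sketch below is an alternative, not the code used above. `recode_item`, `items`, and `toy` are hypothetical names. It assumes -1 marks a skipped or refused answer and turns it into NA, whereas the code above recodes -1 to 1. It also passes explicit `levels`, so the labels stay aligned even when some code never occurs in a column.

```r
# Hypothetical helper: rename one coded column and convert it to a labelled factor.
# Assumption: -1 marks a skipped or refused answer, so it becomes NA here
# (the report's code recodes -1 to 1 instead).
recode_item <- function(df, old, new, labels) {
  x <- df[[old]]
  x[x %in% -1] <- NA
  # Explicit levels 1..k keep labels aligned even if a code is absent
  df[[old]] <- factor(x, levels = seq_along(labels), labels = labels)
  names(df)[names(df) == old] <- new
  df
}

# Toy data (hypothetical values)
toy <- data.frame(Q2_1 = c(1, 3, -1, 4), Q8_1 = c(3, 2, 1, -1))

importance <- c("very_important", "somewhat_important",
                "not_so_important", "not_at_all_important")
trust <- c("a lot", "some", "not much", "not at all")

# Lookup list: old name, new name, and the label set to apply
items <- list(
  list(old = "Q2_1", new = "Importance_of_Voting", labels = importance),
  list(old = "Q8_1", new = "Trust_President",      labels = trust)
)

for (item in items) {
  toy <- recode_item(toy, item$old, item$new, item$labels)
}

str(toy)
```

Keeping -1 as NA means those answers are not counted under the first label, which may or may not be the right choice depending on how the poll's codebook defines -1.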

Conclusions

The original polling data needed to be adjusted to make the variables and results easier to understand. I accomplished this by selecting a subset of variables, renaming columns, and converting the coded categorical entries to labelled factors. The target variable is voter_category, and the cleaned dataset can now be analyzed to find which variables are associated with it.
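As one possible first step for that analysis, a cross-tabulation shows how the answers to one question are distributed within each voter category. This is a minimal sketch on made-up rows (`toy`, with hypothetical values). It ignores the survey's weight column, so the shares it produces are unweighted.

```r
# Toy rows (hypothetical values) using two of the cleaned column names
toy <- data.frame(
  voter_category = c("always", "always", "sporadic",
                     "sporadic", "rarely/never", "rarely/never"),
  Trust_Congress = c("some", "not much", "some",
                     "not much", "not much", "not at all")
)

# Counts of each trust level within each voter category
counts <- table(toy$voter_category, toy$Trust_Congress)

# Row-wise shares: each row sums to 1
shares <- prop.table(counts, margin = 1)

round(shares, 2)
```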