In this exercise, we will predict the presidential vote in the 2016 election using a liberal-conservative scale. We will also analyze some patterns in the 2016 American National Election Studies (ANES) survey. ANES is among the most comprehensive electoral surveys conducted in the US. It is conducted both offline and online, and both before and after the election.
The original dataset has 1842 variables. I selected a few here for us to study:
| Variable | Meaning |
|---|---|
| int_vote_trump | Intend to vote for Trump in the 2016 election |
| voted_trump | Voted for Trump in the 2016 election |
| int_vote_clinton | Intend to vote for Clinton in the 2016 election |
| voted_clinton | Voted for Clinton in the 2016 election |
| lib_conserv_scale | Liberal-Conservative scale |
| white_voter | Respondent declared herself as White |
| latinx_voter | Respondent declared herself as Latinx |
| swing_voter | Intended one vote but voted different |
| swing_trump | Did not intend to vote for Trump but did vote for him. |
| swing_hillary | Did not intend to vote for Clinton but did vote for her. |
| region | Country region |
| age | Age in years |
| religion_important_life | Binary indicator for the belief that religion is important for life. |
As always, we start by looking at the data:
head(anes)
## int_vote_trump voted_trump int_vote_clinton voted_clinton lib_conserv_scale
## 1 1 1 0 0 0.5215019
## 2 0 0 0 0 -0.1032504
## 3 0 1 0 0 1.1462541
## 4 1 1 0 0 0.5215019
## 5 0 0 0 0 -0.1032504
## 6 0 0 0 0 -0.7280026
## white_voter latinx_voter swing_voter swing_trump swing_hillary region age
## 1 1 0 0 0 0 South 26
## 2 1 0 0 0 0 Midwest 38
## 3 0 0 1 1 0 Northeast 60
## 4 1 0 0 0 0 Northeast 56
## 5 1 0 0 0 0 South 45
## 6 1 0 0 0 0 South 30
## religion_important_life
## 1 0
## 2 1
## 3 1
## 4 1
## 5 0
## 6 1
Answer: In this data set, each observation represents one survey respondent. The variables record the respondent's demographics, their position on the liberal-conservative scale, and whether they intended to vote for, and did vote for, each presidential candidate. Each vote indicator is coded 1 for yes and 0 for no.
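As a small illustration of the 0/1 coding, the sketch below rebuilds a few columns of the six rows printed by head(anes) and checks two things: the mean of an indicator is the share of rows coded 1, and swing_trump can be recomputed from the intention and vote columns. The object names anes_head, share_voted_trump, and recomputed_swing are hypothetical and used only for this example; six rows say nothing about the full survey.

```r
# First six rows as printed by head(anes) above (subset of columns only)
anes_head <- data.frame(
  int_vote_trump = c(1, 0, 0, 1, 0, 0),
  voted_trump    = c(1, 0, 1, 1, 0, 0),
  swing_trump    = c(0, 0, 1, 0, 0, 0),
  region         = c("South", "Midwest", "Northeast", "Northeast", "South", "South"),
  age            = c(26, 38, 60, 56, 45, 30)
)

# The mean of a 0/1 indicator is the share of rows coded 1
share_voted_trump <- mean(anes_head$voted_trump)

# swing_trump flags respondents who did not intend to vote for Trump but did
recomputed_swing <- as.numeric(
  anes_head$int_vote_trump == 0 & anes_head$voted_trump == 1
)

print(share_voted_trump)
print(identical(recomputed_swing, anes_head$swing_trump))
```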
Answer: The predictor (X variable) is the liberal-conservative scale, lib_conserv_scale.
Answer: The outcome (Y variable) is the vote for Trump, voted_trump.
R code:
library(ggplot2)
# Jitter the points so that overlapping 0/1 outcomes remain visible
ggplot(data = anes, aes(x = lib_conserv_scale, y = voted_trump)) +
  geom_jitter(alpha = 0.5, height = 0.4, width = 0.2)
lm() to fit a linear model to the data. (1 point)
R code:
# Jittered scatterplot with the fitted least-squares line on top
ggplot(data = anes, aes(x = lib_conserv_scale, y = voted_trump)) +
  geom_jitter(alpha = 0.5, height = 0.4, width = 0.2) +
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE, color = "blue", lwd = 1) +
  labs(x = "Liberal-conservative scale", y = "Voted for Trump") +
  theme_minimal()
lm(voted_trump ~ lib_conserv_scale, data = anes)
## 
## Call:
## lm(formula = voted_trump ~ lib_conserv_scale, data = anes)
## 
## Coefficients:
##       (Intercept)  lib_conserv_scale  
##            0.3031             0.2297  
Answer: Ŷ is the predicted probability of voting for Donald Trump. The estimated intercept, 0.3031, is the predicted probability when the liberal-conservative scale equals 0. The estimated slope, 0.2297, is the change in the predicted probability for a one-unit increase in the liberal-conservative scale.
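To make the fitted line concrete, here is a minimal sketch that plugs a few scale values into it, using only the two coefficients printed by lm() above. The helper name predict_trump and the scale values -1, 0, and 1 are assumptions chosen for illustration; they are not part of the original exercise.

```r
# Coefficients copied from the lm() output above
b0 <- 0.3031  # intercept: prediction when the scale equals 0
b1 <- 0.2297  # slope: change in the prediction per one-unit increase

# Hypothetical helper: predicted value of voted_trump at a given scale value
predict_trump <- function(scale_value) {
  b0 + b1 * scale_value
}

# Illustrative scale values (chosen for the example, not taken from the data)
scale_values <- c(-1, 0, 1)
predictions <- predict_trump(scale_values)

print(data.frame(lib_conserv_scale = scale_values, predicted = predictions))
```

Moving one unit along the scale always shifts the prediction by the slope, which is what the interpretation of the slope coefficient expresses.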
Calculations: Calculations here.
Answer: Answers here.
Calculations: Calculations here.
Answer: Answers here.
R code:
cor(anes$lib_conserv_scale, anes$voted_trump)^2
## [1] 0.249724
Answer: The R^2 is about 0.25, which means that the liberal-conservative scale explains roughly 25% of the variation in the vote for Trump.
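In a regression with a single predictor and an intercept, R^2 equals the squared Pearson correlation between the predictor and the outcome, which is why cor(...)^2 can stand in for it. The sketch below checks this on simulated stand-in data; the variables x and y and the simulation settings are assumptions for illustration and do not reproduce the anes data.

```r
# Simulated stand-in data (assumption: the real `anes` data frame is not used)
set.seed(123)
n <- 500
x <- rnorm(n)                               # stand-in for lib_conserv_scale
y <- rbinom(n, size = 1, prob = plogis(x))  # stand-in for a 0/1 vote indicator

fit <- lm(y ~ x)

r2_from_cor <- cor(x, y)^2
r2_from_lm  <- summary(fit)$r.squared

# The two quantities agree up to floating-point error
print(c(from_cor = r2_from_cor, from_lm = r2_from_lm))
print(isTRUE(all.equal(r2_from_cor, r2_from_lm)))
```

With more than one predictor this shortcut no longer applies, and summary(fit)$r.squared is the value to report.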
R code:
lm(voted_trump ~ lib_conserv_scale + age, data = anes)
## 
## Call:
## lm(formula = voted_trump ~ lib_conserv_scale + age, data = anes)
## 
## Coefficients:
##       (Intercept)  lib_conserv_scale                age  
##          0.198557           0.223668           0.002079  
Answer: Holding the liberal-conservative scale constant, each additional year of age is associated with an increase of about 0.2 percentage points in the predicted probability of voting for Trump. Holding age constant, a one-unit increase in the liberal-conservative scale is associated with an increase of about 22 percentage points.
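The "holding constant" reading of the age coefficient can be sketched by comparing two predictions that differ only in age, using the coefficients printed above. The helper predict_trump_age and the two example respondents (scale value 0, ages 30 and 40) are hypothetical.

```r
# Coefficients copied from the lm() output above
coefs <- c(intercept = 0.198557, lib_conserv_scale = 0.223668, age = 0.002079)

# Hypothetical helper: prediction for a given scale value and age
predict_trump_age <- function(scale_value, age) {
  unname(coefs["intercept"] +
           coefs["lib_conserv_scale"] * scale_value +
           coefs["age"] * age)
}

# Two illustrative respondents with the same scale value, 10 years apart
p_30 <- predict_trump_age(scale_value = 0, age = 30)
p_40 <- predict_trump_age(scale_value = 0, age = 40)

print(c(age_30 = p_30, age_40 = p_40, difference = p_40 - p_30))
```

The difference is ten times the age coefficient, whatever scale value is held fixed.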
R code:
lm(voted_trump ~ lib_conserv_scale + age + latinx_voter, data = anes)
## 
## Call:
## lm(formula = voted_trump ~ lib_conserv_scale + age + latinx_voter, data = anes)
## 
## Coefficients:
##       (Intercept)  lib_conserv_scale                age       latinx_voter  
##          0.222532           0.222963           0.001836          -0.127574  
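Answer: The model does not support the notion of liberal, young Latinx Trump voters. Holding the liberal-conservative scale and age constant, the predicted probability of voting for Trump is about 13 percentage points lower for Latinx respondents than for other respondents (coefficient -0.1276). Because the coefficients on the scale and on age are both positive, lower scale values and younger ages lower the prediction further.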
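As a rough check, the coefficients printed above can be combined into predictions for two otherwise identical respondents, one Latinx and one not. The helper predict_trump_full and the example profile (scale value -1, age 25) are assumptions chosen for illustration; the output does not say which scale values count as liberal, although the positive slope suggests that lower values are less favorable to Trump.

```r
# Coefficients copied from the lm() output above
coefs <- c(intercept = 0.222532, lib_conserv_scale = 0.222963,
           age = 0.001836, latinx_voter = -0.127574)

# Hypothetical helper: prediction for a given scale value, age, and Latinx indicator
predict_trump_full <- function(scale_value, age, latinx) {
  unname(coefs["intercept"] +
           coefs["lib_conserv_scale"] * scale_value +
           coefs["age"] * age +
           coefs["latinx_voter"] * latinx)
}

# Illustrative profile: scale value -1, age 25, with and without the Latinx indicator
p_non_latinx <- predict_trump_full(scale_value = -1, age = 25, latinx = 0)
p_latinx     <- predict_trump_full(scale_value = -1, age = 25, latinx = 1)

print(c(non_latinx = p_non_latinx, latinx = p_latinx,
        gap = p_latinx - p_non_latinx))
```

For this profile the Latinx prediction falls below 0. That is a known limitation of fitting a straight line to a 0/1 outcome: the fitted values are not constrained to lie between 0 and 1.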