In this exercise, we will predict the presidential vote in the 2016 election using a liberal-conservative scale. We will also analyze some patterns in the 2016 American National Election Studies (ANES) survey. ANES is among the most comprehensive electoral surveys conducted in the US. It is conducted both offline and online, and both before and after the election.
The original dataset has 1842 variables. I selected a few here for us to study:
| Variable | Meaning |
|---|---|
| int_vote_trump | Intend to vote for Trump in the 2016 election |
| voted_trump | Voted for Trump in the 2016 election |
| int_vote_clinton | Intend to vote for Clinton in the 2016 election |
| voted_clinton | Voted for Clinton in the 2016 election |
| lib_conserv_scale | Liberal-Conservative scale |
| white_voter | Respondent declared herself as White |
| latinx_voter | Respondent declared herself as Latinx |
| swing_voter | Intended one vote but voted different |
| swing_trump | Did not intend to vote for Trump but did vote for him. |
| swing_hillary | Did not intend to vote for Clinton but did vote for her. |
| region | Country region |
| age | Age in years |
| religion_important_life | Binary indicator for the belief that religion is important for life. |
As always, we start by looking at the data:
head(anes)
## int_vote_trump voted_trump int_vote_clinton voted_clinton lib_conserv_scale
## 1 1 1 0 0 0.5215019
## 2 0 0 0 0 -0.1032504
## 3 0 1 0 0 1.1462541
## 4 1 1 0 0 0.5215019
## 5 0 0 0 0 -0.1032504
## 6 0 0 0 0 -0.7280026
## white_voter latinx_voter swing_voter swing_trump swing_hillary region age
## 1 1 0 0 0 0 South 26
## 2 1 0 0 0 0 Midwest 38
## 3 0 0 1 1 0 Northeast 60
## 4 1 0 0 0 0 Northeast 56
## 5 1 0 0 0 0 South 45
## 6 1 0 0 0 0 South 30
## religion_important_life
## 1 0
## 2 1
## 3 1
## 4 1
## 5 0
## 6 1
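Answer: Each row is one respondent. The vote variables are binary: 1 means yes and 0 means no, both for the intention to vote for each candidate and for the actual vote. The data also show where each respondent falls on the liberal-conservative scale, along with race/ethnicity indicators, swing-voter indicators, region, age, and whether religion is important in the respondent's life.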
I want to predict the vote for Trump using the liberal-conservative scale. The X-Variable should be liberal-conservative.
I want to predict the vote for Trump using the liberal-conservative scale. The Y-Variable should be the vote for Trump.
(Hint: The default scatter plot will not work for this data because
both variables have discrete variation: Vote for Trump is binary, and
the lib-con scale has seven categories. To plot this, you should
use some jitter. The parameters for jitter are
height for y and width for x.)
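To illustrate why the hint recommends jitter, here is a minimal base R sketch on simulated data. The vectors `x` and `y` are made-up stand-ins, not the ANES columns, and base R's `jitter()` is used here only as an approximation of what `geom_jitter()` does with `width` and `height`.

```r
set.seed(1)
n <- 200

# Hypothetical stand-ins: a seven-category scale and a binary vote
x <- sample(-3:3, n, replace = TRUE)
y <- rbinom(n, 1, 0.5)

# Without jitter, at most 7 * 2 = 14 distinct positions can be drawn,
# so most of the 200 points sit exactly on top of each other
n_distinct_raw <- nrow(unique(data.frame(x, y)))

# jitter(v, amount = a) adds uniform noise in [-a, a] to each value
xj <- jitter(x, amount = 0.2)   # plays the role of width  = 0.2
yj <- jitter(y, amount = 0.4)   # plays the role of height = 0.4
n_distinct_jit <- nrow(unique(data.frame(xj, yj)))

c(raw = n_distinct_raw, jittered = n_distinct_jit)
```

After jittering, every point has its own position, so the density of respondents at each combination of scale value and vote becomes visible.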
R code:
ggplot(data=anes, aes(x=lib_conserv_scale, y=voted_trump)) +
geom_jitter(alpha=0.5, height=0.4, width=0.2)
lm() to fit a
linear model to the data. (1 point)
R code:
ggplot(data = anes, aes(x = lib_conserv_scale, y = voted_trump)) +
  geom_jitter(alpha = 0.5, height = 0.4, width = 0.2) +
  geom_point(fill = 'lightblue', alpha = 0.6) +
  labs(title = '', y = 'predict the vote for Trump', x = 'liberal-conservative scale') +
  geom_smooth(formula = 'y ~ x', method = 'lm',
              se = FALSE, color = 'blue', lwd = 1) +
  theme_minimal()
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
lm(data=anes, voted_trump~lib_conserv_scale)
##
## Call:
## lm(formula = voted_trump ~ lib_conserv_scale, data = anes)
##
## Coefficients:
## (Intercept) lib_conserv_scale
## 0.3031 0.2297
(I.e., substitute \(Y\) for the name of the outcome variable, substitute \(\widehat{\beta}_0\) for the estimated value of the intercept coefficient, substitute \(\widehat{\beta}_1\) for the estimated value of the slope coefficient, and substitute \(X\) for the name of the predictor.)
ANSWER: \(\widehat{\text{voted_trump}} = 0.3031 + 0.2297 \times \text{lib_conserv_scale}\), which has the form \(\widehat{Y} = \widehat{\beta}_0 + \widehat{\beta}_1 X\). \(\widehat{Y}\) is the predicted vote for Donald Trump. \(\widehat{\beta}_0 = 0.3031\) is the predicted vote for Donald Trump when the liberal-conservative scale is 0. \(\widehat{\beta}_1 = 0.2297\) is the predicted change in the vote for Donald Trump for every one-unit increase in the liberal-conservative scale. \(X\) is the value of the liberal-conservative scale.
Calculations: \(\widehat{\text{voted_trump}} = 0.3031 + 0.2297 \times (-1) = 0.0734\)
Answer: The prediction is low but not zero. Plugging a scale value of \(-1\) into the fitted equation gives 0.3031 + 0.2297(-1) = 0.0734, so the predicted chance that this person will vote for Trump is about 7% (7.34 percentage points).
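Calculations: \(\triangle \widehat{Y} = \widehat{\beta}_1 \times \triangle X = 0.2297 \times 2 = 0.4594\). Starting from the prediction of 0.0734 at a scale value of \(-1\), the new prediction is 0.0734 + 0.4594 = 0.5328, which matches the fitted equation at a scale value of 1: 0.3031 + 0.2297 = 0.5328. The predicted chance that this person will now vote for Trump is 53.28%.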
Answer: 0.5328
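The two hand calculations can be checked with a few lines of R. This is only a sketch: the coefficients are typed in from the `lm()` output rather than read from a fitted model object, and the names `predict_trump`, `x_old`, and `x_new` are illustrative. The move from \(-1\) to 1 is the two-unit change used in the calculation.

```r
# Coefficients copied from the lm() output (rounded as printed)
b0 <- 0.3031   # intercept
b1 <- 0.2297   # slope on lib_conserv_scale

# Predicted probability of voting for Trump at a given scale value
predict_trump <- function(x) b0 + b1 * x

x_old <- -1
x_new <- 1

y_old   <- predict_trump(x_old)      # prediction at -1
delta_y <- b1 * (x_new - x_old)      # predicted change for a 2-unit move
y_new   <- predict_trump(x_new)      # prediction at +1

round(c(y_old = y_old, delta_y = delta_y, y_new = y_new), 4)
```

The change computed from the slope alone (`delta_y`) and the difference between the two predictions agree, which is what the linear form of the model implies.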
(Hint: the function cor() might be helpful here.)
R code:
cor(anes$lib_conserv_scale,anes$voted_trump)^2
## [1] 0.249724
Answer: The fitted line shows a positive relationship between the liberal-conservative scale and voting for Trump. The \(R^2\) is 0.249724, approximately 25%. \(R^2\) measures how well the regression model explains the observed data: here, about 25% of the variation in the vote for Trump is explained by the liberal-conservative scale. For comparison, an \(R^2\) of 60% would mean that 60% of the variability observed in the outcome variable is explained by the regression model.
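The hint works because, in a regression with a single predictor, \(R^2\) equals the squared correlation between the predictor and the outcome. A small sketch on simulated data (not the ANES data; `x` and `y` are made-up variables) shows the two quantities agreeing:

```r
set.seed(42)
n <- 500

# Simulated stand-ins: a continuous predictor and a binary outcome
x <- rnorm(n)
y <- rbinom(n, 1, plogis(x))

fit <- lm(y ~ x)

r2_from_lm  <- summary(fit)$r.squared   # R^2 reported by the model summary
r2_from_cor <- cor(x, y)^2              # squared correlation

all.equal(r2_from_lm, r2_from_cor)
```

This equality holds only for one predictor. Once more variables are added to the model, \(R^2\) should be read from `summary()` instead.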
R code:
lm(voted_trump~ lib_conserv_scale + age , data = anes)
##
## Call:
## lm(formula = voted_trump ~ lib_conserv_scale + age, data = anes)
##
## Coefficients:
## (Intercept) lib_conserv_scale age
## 0.198557 0.223668 0.002079
Answer: Holding the liberal-conservative scale constant, each additional year of age is associated with an increase of about 0.2 percentage points (0.002079) in the predicted chance of voting for Trump. Holding age constant, a one-unit increase in the liberal-conservative scale is associated with an increase of about 22 percentage points (0.223668). Controlling for age barely changes the scale coefficient (0.2297 before, 0.2237 now), so the liberal-conservative scale remains the much stronger predictor.
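To put the two coefficients on a comparable footing, the sketch below converts them to percentage points. The coefficients are copied from the output above, and the 10-year age gap is an arbitrary choice made for illustration, not something taken from the exercise.

```r
# Coefficients copied from the lm() output with age added
b_scale <- 0.223668
b_age   <- 0.002079

# Change in predicted probability for a one-unit move on the scale, age held fixed
effect_scale_1unit <- b_scale * 1

# Change for a 10-year age difference, scale held fixed (10 is an arbitrary choice)
effect_age_10yrs <- b_age * 10

# Express both in percentage points
round(100 * c(scale_1unit = effect_scale_1unit, age_10yrs = effect_age_10yrs), 2)
```

Even a decade of age moves the prediction by only about 2 percentage points, compared with about 22 for a single unit on the scale.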
R code:
lm(voted_trump~ lib_conserv_scale + age + latinx_voter, data=anes)
##
## Call:
## lm(formula = voted_trump ~ lib_conserv_scale + age + latinx_voter,
## data = anes)
##
## Coefficients:
## (Intercept) lib_conserv_scale age latinx_voter
## 0.222532 0.222963 0.001836 -0.127574
Answer: Holding the liberal-conservative scale and age constant, Latinx respondents have a predicted chance of voting for Trump that is about 12.8 percentage points lower (-0.127574) than that of non-Latinx respondents. The model therefore predicts that a young, liberal Latinx respondent is very unlikely to vote for Trump. Holding the other variables constant, each additional year of age raises the predicted chance by about 0.18 percentage points (0.001836), and the liberal-conservative coefficient stays at about 22 percentage points per unit (0.222963).
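As a rough check of this interpretation, the sketch below plugs one hypothetical profile into the fitted equation: a 25-year-old with a scale value of \(-1\). Both values are assumptions chosen for illustration, and `predict_vote` is an illustrative helper, not part of the exercise.

```r
# Coefficients copied from the lm() output with age and latinx_voter added
b <- c(intercept = 0.222532, lib_con = 0.222963, age = 0.001836, latinx = -0.127574)

# Predicted probability of voting for Trump for a given profile
predict_vote <- function(lib_con, age, latinx) {
  unname(b["intercept"] + b["lib_con"] * lib_con + b["age"] * age + b["latinx"] * latinx)
}

# Hypothetical profile: liberal (scale = -1), 25 years old
p_latinx <- predict_vote(lib_con = -1, age = 25, latinx = 1)
p_other  <- predict_vote(lib_con = -1, age = 25, latinx = 0)

p_latinx - p_other   # the gap equals the latinx_voter coefficient
p_latinx             # below zero for this profile
```

For this profile the prediction falls slightly below zero. A linear model fitted to a binary outcome can produce predictions outside the 0 to 1 range, so such values are best read as "very unlikely" rather than as literal probabilities.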