The Relationship Between Vote Outcome and Education

Introduction

In this analysis I investigate the relationship between vote outcome and education using data from the 2000 Current Population Survey conducted by the Census Bureau. Two states are represented: South Carolina and Arkansas. The data are a sample containing 7 variables: state, year, vote, income, education, age, and female (coded 1 = female, 0 = male). It is interesting to look at which variables may affect voter turnout.

Looking at data like this can help increase awareness of the importance of voting. Whether the vote is for a city council member or for the next President, many people have mixed thoughts on whether they should vote. Some people also do not vote because they are not aware of the issues at stake or the solutions being proposed.

Analyzing this type of data can help us understand which variables are related to turnout and which additional variables future studies should consider incorporating.

Data Used and Variable Explanation:

The data set is a data frame containing 7 variables (“state”, “year”, “vote”, “income”, “education”, “age”, “female”) and 1500 observations.

Analysis Used

The analysis uses the Zelig package to fit a logit model and simulate quantities of interest from it. This helps us visualize how the predicted probability of voting changes with educational attainment. I am using a logit model because the dependent variable, vote, is binary (it takes only the values 0 and 1).
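The output in the following sections is consistent with a workflow like the one sketched below. This is a reconstruction rather than the original script: it assumes Zelig 5 (since archived from CRAN) is installed and that the voteincome data set shipped with it is the source of the data. The formula is copied from the model call shown in the output, while the education values passed to setx and setx1 are inferred from the first-difference heading and are therefore an assumption.

```r
# Sketch only: the body runs when the (archived) Zelig package is available.
if (requireNamespace("Zelig", quietly = TRUE)) {
  library(Zelig)
  data(voteincome)

  # Fit the logit model with the formula reported in the model summary.
  z5 <- zlogit$new()
  z5$zelig(vote ~ education + income + age + female + state,
           data = voteincome)

  # Compare the lowest and highest education levels (assumed values);
  # all other covariates are left at Zelig's defaults.
  z5$setx(education = 1)
  z5$setx1(education = 4)

  # Simulate expected values (ev), predicted values (pv),
  # and first differences (fd), then print them.
  z5$sim()
  z5$summarize()

  # Extract the simulated first differences for a numeric summary.
  fd <- z5$get_qi(qi = "fd", xvalue = "x1")
  print(summary(fd))
}
```

Swapping the values given to setx and setx1 reverses the comparison, which flips the sign of the first difference.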

Specifically looking at the Votes

## vote
##    0    1 
##  217 1283
## education
##   1   2   3   4 
## 216 486 403 395
## Model: 
## 
## Call:
## z5$zelig(formula = vote ~ education + income + age + female + 
##     state, data = voteincome)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.4569   0.3855   0.4806   0.5941   1.0363  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.009550   0.382580  -2.639 0.008320
## education    0.225355   0.090210   2.498 0.012485
## income       0.089629   0.021860   4.100 4.13e-05
## age          0.016321   0.004331   3.768 0.000164
## female       0.304063   0.151263   2.010 0.044415
## stateSC      0.311943   0.154000   2.026 0.042805
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1240.0  on 1499  degrees of freedom
## Residual deviance: 1181.6  on 1494  degrees of freedom
## AIC: 1193.6
## 
## Number of Fisher Scoring iterations: 5
## 
## Next step: Use 'setx' method
## 
##  sim x :
##  -----
## ev
##           mean         sd       50%      2.5%     97.5%
## [1,] 0.8337687 0.02288345 0.8351569 0.7831587 0.8717908
## pv
##          0     1
## [1,] 0.166 0.834
## 
##  sim x1 :
##  -----
## ev
##           mean         sd       50%      2.5%     97.5%
## [1,] 0.9080929 0.01371846 0.9091453 0.8799439 0.9323122
## pv
##          0     1
## [1,] 0.086 0.914
## fd
##            mean         sd        50%       2.5%     97.5%
## [1,] 0.07432418 0.03013401 0.07438663 0.01825795 0.1373818
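
As a rough cross-check on the simulated quantities above, the education coefficient can be converted by hand. The sketch below uses only base R and numbers printed in the output; the variable names are illustrative. It is a point calculation that ignores simulation uncertainty, so it should match the simulated means only approximately.

```r
# Education coefficient and simulated expected values, copied from the output.
b_education <- 0.225355
ev_low  <- 0.8337687   # simulated mean Pr(vote) under "sim x"
ev_high <- 0.9080929   # simulated mean Pr(vote) under "sim x1"

# Odds ratio for a one-level increase in education.
odds_ratio <- exp(b_education)

# Moving up three education levels (1 to 4) adds 3 * b_education
# on the log-odds scale; plogis() converts back to a probability.
p_high_by_hand <- plogis(qlogis(ev_low) + 3 * b_education)
fd_by_hand <- p_high_by_hand - ev_low

print(round(c(odds_ratio = odds_ratio,
              p_high_by_hand = p_high_by_hand,
              fd_by_hand = fd_by_hand), 3))
```

The hand calculation gives an odds ratio of about 1.25 per education level and a first difference of roughly 0.074, in line with the simulated fd mean.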

First Differences for the Variable “Education” (Where 1 = Less than High School Education and 4 = More than College Education)

##        V1           
##  Min.   :-0.007198  
##  1st Qu.: 0.053523  
##  Median : 0.074387  
##  Mean   : 0.074324  
##  3rd Qu.: 0.092579  
##  Max.   : 0.172506
## 
##  sim x :
##  -----
## ev
##           mean        sd       50%      2.5%     97.5%
## [1,] 0.9077762 0.0133055 0.9091724 0.8780127 0.9316126
## pv
##          0     1
## [1,] 0.076 0.924
## 
##  sim x1 :
##  -----
## ev
##           mean         sd       50%      2.5%     97.5%
## [1,] 0.8330955 0.02360585 0.8342377 0.7842494 0.8759815
## pv
##         0    1
## [1,] 0.16 0.84
## fd
##             mean         sd         50%       2.5%      97.5%
## [1,] -0.07468075 0.03018175 -0.07397202 -0.1354132 -0.0175725

## 
##  sim x :
##  -----
## ev
##           mean         sd       50%      2.5%     97.5%
## [1,] 0.8332623 0.02310582 0.8352496 0.7846537 0.8735021
## pv
##          0     1
## [1,] 0.164 0.836
## 
##  sim x1 :
##  -----
## ev
##           mean         sd       50%      2.5%     97.5%
## [1,] 0.9073734 0.01353348 0.9085606 0.8778982 0.9323346
## pv
##          0     1
## [1,] 0.081 0.919
## fd
##            mean         sd        50%       2.5%     97.5%
## [1,] 0.07411117 0.03017012 0.07212039 0.01528098 0.1372173

Testing Variations in Education

Histogram for Predicted Probabilities of Vote Outcome with Education.

Conclusion

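According to the data, most respondents voted: 1283 of the 1500 observations (about 86%) are coded as voters. Education ranges from 1 to 4, and most respondents fall in the two middle categories (2 and 3), between less than a high school education and more than a college education. In the model, education has a positive and statistically significant coefficient (0.225, p = 0.012), and moving from the lowest to the highest education level raises the simulated probability of voting from about 0.83 to about 0.91, a first difference of roughly 0.07 (95% interval of about 0.02 to 0.14). The effect, however, is not as strong as I expected. It is important to note that this dataset covers a single year (2000), only 2 states, and 1500 observations, so it cannot give detailed results. We would probably want at least a 5-year sample to gain more insight into vote outcome and education.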