I am working with a shortened version of the 2012 General Social Survey (GSS) data. This extract contains a reduced set of variables, some of which I look at here. I am interested in which factors predict gun ownership.

library("Zelig")
## Loading required package: MASS
## Loading required package: boot
## ## 
## ##  Zelig (Version 3.5.3, built: 2011-11-29)
## ##  Please refer to http://gking.harvard.edu/zelig for full
## ##  documentation or help.zelig() for help with commands and
## ##  models supported by Zelig.
## ##
## 
## ##  Zelig project citations:
## ##    Kosuke Imai, Gary King, and Olivia Lau. (2009).
## ##    ``Zelig: Everyone's Statistical Software,''
## ##    http://gking.harvard.edu/zelig.
## ##  and
## ##    Kosuke Imai, Gary King, and Olivia Lau. (2008).
## ##    ``Toward A Common Framework for Statistical Analysis
## ##    and Development,'' Journal of Computational and
## ##    Graphical Statistics, Vol. 17, No. 4 (December)
## ##    pp. 892-913. 
## 
## ##  To cite individual Zelig models, please use the citation format printed with
## ##  each model run and in the documentation.
## ##
library("DescTools")
library("dplyr")
## Warning: package 'dplyr' was built under R version 3.1.3
## 
## Attaching package: 'dplyr'
## 
## The following object is masked from 'package:MASS':
## 
##     select
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library("stargazer")
## 
## Please cite as: 
## 
##  Hlavac, Marek (2014). stargazer: LaTeX code and ASCII text for well-formatted regression and summary statistics tables.
##  R package version 5.1. http://CRAN.R-project.org/package=stargazer
library("readstata13")
## Warning: package 'readstata13' was built under R version 3.1.3
library("foreign")
library("car")
## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:DescTools':
## 
##     Recode
## 
## The following object is masked from 'package:boot':
## 
##     logit
d <- read.dta("C:/Users/Abigail Walsh/Documents/Grad School/Queen's College/Basic Analytics/Basic Analytics/GSS.dta")
names(d)
##  [1] "CASEID"   "WORKBLKS" "RACDIF1"  "RACMAR"   "RACDIF2"  "RACDIF3" 
##  [7] "HELPBLK"  "HELPPOOR" "YEAR"     "SEX"      "AGE"      "RACE"    
## [13] "REALINC"  "REALRINC" "EDUC"     "DEGREE"   "PRESTG80" "PAPRES80"
## [19] "MARITAL"  "DIVORCE"  "CHILDS"   "RELIG"    "WRKSLF"   "UNEMP"   
## [25] "REGION"   "SIZE"     "RACLIVE"  "FEAR"     "GUN"      "POLVIEWS"
## [31] "FECHLD"   "FEFAM"
Final <- select(d, WORKBLKS, RACDIF1, RACDIF2, RACDIF3, RACMAR, HELPBLK, HELPPOOR, YEAR, SEX, AGE, RACE, REALINC, EDUC, DEGREE, RELIG, UNEMP, REGION, RACLIVE, FEAR, GUN, POLVIEWS)
names(Final)
##  [1] "WORKBLKS" "RACDIF1"  "RACDIF2"  "RACDIF3"  "RACMAR"   "HELPBLK" 
##  [7] "HELPPOOR" "YEAR"     "SEX"      "AGE"      "RACE"     "REALINC" 
## [13] "EDUC"     "DEGREE"   "RELIG"    "UNEMP"    "REGION"   "RACLIVE" 
## [19] "FEAR"     "GUN"      "POLVIEWS"
df <- data.frame(GUN = c("NO" , "YES", NA),stringsAsFactors=FALSE)
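# If these columns are factors (read.dta's default for labelled values),
# as.numeric() returns the integer level codes, not the labels.
Final$REGION <- as.numeric(Final$REGION)
Final$POLVIEWS <- as.numeric(Final$POLVIEWS)
Final$RACE <- as.numeric(Final$RACE)
Final$RACDIF2 <- as.numeric(Final$RACDIF2)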
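
The numeric codes produced by these conversions depend on the order of the factor levels. A minimal sketch with a made-up factor shows how as.numeric() maps labels to level positions. The level names, including the leading "IAP" level, are assumptions for illustration and are not taken from the GSS file.

```r
# Hypothetical factor standing in for a labelled survey column.
race <- factor(c("WHITE", "BLACK", "OTHER", "WHITE"),
               levels = c("IAP", "WHITE", "BLACK", "OTHER"))

codes  <- as.numeric(race)    # positions in levels(race): 2 3 4 2
labels <- as.character(race)  # the original text labels

# Cross-tabulate codes against labels to document the mapping.
table(code = codes, label = labels)
```

Running a cross-tabulation like this on the real column before converting it is a cheap way to confirm which number stands for which category.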

I created three models to help look at the data. The first model focuses on participants' education and age. The second model adds participant race and political leanings, in this case how conservative participants are in their political views. The third model adds a measure of racial attitudes: participants' response to whether racial differences in education between Whites and Blacks are due to an inborn lack of ability to learn among Blacks.

model1 <- zelig(GUN ~ EDUC + AGE,
                data = Final, model="logit")
## How to cite this model in Zelig:
## Kosuke Imai, Gary King, and Oliva Lau. 2008. "logit: Logistic Regression for Dichotomous Dependent Variables" in Kosuke Imai, Gary King, and Olivia Lau, "Zelig: Everyone's Statistical Software," http://gking.harvard.edu/zelig
model2 <- zelig(GUN ~ EDUC + AGE + RACE + POLVIEWS, data = Final, model="logit")
model3 <- zelig(GUN ~ EDUC + AGE + RACE + POLVIEWS + RACDIF2, data=Final, model="logit")

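To determine the best model, I looked at the Akaike information criterion (AIC) at the bottom of the table, where lower values indicate a better fit. Based on this information, model 3 was the best fit. One caution: the three models were fit to different numbers of observations (N drops from 19,214 to 4,313), and AIC values are only strictly comparable across models fit to the same cases.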

stargazer(model1, model2, model3, type = "html", style = "demography", title = "Table 1: Logit Models",          
          covariate.labels = c("Education", "Age", "Race",
                               "Conservativism", "Racial Attitudes on Inborn Ability"),
          dep.var.labels   = "Gun Ownership")
Table 1: Logit Models
                                      Gun Ownership
                                      Model 1      Model 2      Model 3
Education                             0.026***     0.027***     0.043**
                                      (0.006)      (0.007)      (0.014)
Age                                   0.014***     0.013***     0.016***
                                      (0.001)      (0.001)      (0.002)
Race                                               -0.175***    -0.113
                                                   (0.043)      (0.079)
Conservativism                                     0.030*       0.015
                                                   (0.015)      (0.029)
Racial Attitudes on Inborn Ability                              0.001
                                                                (0.105)
Constant                              0.481***     0.710***     0.321
                                      (0.102)      (0.175)      (0.449)
N                                     19,214       16,097       4,313
Log Likelihood                        -9,414.761   -8,037.523   -2,127.900
AIC                                   18,835.520   16,085.050   4,267.800
* p < .05; ** p < .01; *** p < .001
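
As a check on the AIC row of Table 1, the values can be recomputed from the reported log-likelihoods using AIC = 2k - 2 logLik, where k is the number of estimated coefficients including the constant. The sketch below only redoes the arithmetic on the published figures; it does not refit the models, and the object names are made up for illustration.

```r
# Log-likelihoods as reported in Table 1.
log_lik <- c(model1 = -9414.761, model2 = -8037.523, model3 = -2127.900)

# Number of estimated coefficients per model, counting the constant.
k <- c(model1 = 3, model2 = 5, model3 = 6)

# AIC = 2k - 2 * logLik
aic <- 2 * k - 2 * log_lik
round(aic, 2)
```

The results agree with the table to within rounding.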

The models listed above are a replication from previous homework. To address the current homework assignment, I have created two new models examining the influence of race and conservativism on gun ownership. The first model includes race and conservativism as separate predictors. The second model includes the same two predictors plus a term for any potential interaction between race and conservativism.

m4 <-zelig(GUN ~ RACE + POLVIEWS, data = Final, model = "logit")
m5 <-zelig(GUN ~ RACE + POLVIEWS + RACE:POLVIEWS, data = Final, model = "logit")
stargazer(m4, m5, type = "html", style = "demography", title = "Table 2: Logit Models Interaction",          
          covariate.labels = c("Race",
                               "Conservativism", "Race:Conservativism"),
          dep.var.labels   = "Gun Ownership")
Table 2: Logit Models Interaction
                       Gun Ownership
                       Model 1      Model 2
Race                   -0.223***    0.077
                       (0.042)      (0.154)
Conservativism         0.047**      0.184**
                       (0.015)      (0.069)
Race:Conservativism                 -0.061*
                                    (0.030)
Constant               1.618***     0.944**
                       (0.126)      (0.356)
N                      16,174       16,174
Log Likelihood         -8,128.982   -8,126.918
AIC                    16,263.970   16,261.840
* p < .05; ** p < .01; *** p < .001

Model 1 shows us that race has a statistically significant (p < .001) negative relationship with gun ownership, while conservativism has a statistically significant (p < .01) positive relationship with gun ownership, meaning the more conservative the participant, the more likely he or she is to own a gun. Model 2 shows us that once the interaction between race and conservativism is introduced, the coefficient on race is no longer significant. Note that in a model with an interaction term, this coefficient describes the relationship for race only where the conservativism score is zero. The significant (p < .01) positive relationship between conservativism and gun ownership remains. The interaction between race and conservativism itself has a statistically significant (p < .05) negative relationship with gun ownership.
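
One way to read the interaction model is to compute the implied slope of conservativism separately for each race code. The sketch below plugs in the rounded coefficients from the second column of the interaction table (0.184 for conservativism, -0.061 for the interaction) and assumes the 2=White, 3=Black, 4=Other coding used in this assignment. It is arithmetic on the published estimates, not a new model, and the object names are made up.

```r
# Rounded coefficients from the interaction model.
b_pol <- 0.184   # conservativism
b_int <- -0.061  # race:conservativism

# Race codes as used in this assignment.
race_codes <- c(White = 2, Black = 3, Other = 4)

# Slope of conservativism (on the log-odds scale) at each race code:
# d(logit)/d(POLVIEWS) = b_pol + b_int * RACE
pol_slope <- b_pol + b_int * race_codes
round(pol_slope, 3)
```

On these rounded numbers the conservativism slope is positive for code 2, close to zero for code 3, and negative for code 4. Standard errors for these combined slopes would need the coefficient covariance matrix, which the table does not report.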

Simulation

I ran three simulations, one for each race category, to see the predicted likelihood of gun ownership for each group. The model includes race, conservativism, education, and age, and the simulations hold conservativism, education, and age at their means. Here 2=White, 3=Black, 4=Other.

m1 <-zelig(GUN ~ RACE + POLVIEWS + EDUC + AGE, data = Final, model = "logit")
m2 <-setx(m1, RACE="2")
m3 <-sim(m1, x=m2)
summary(m3)
## 
##   Model: logit 
##   Number of simulations: 1000 
## 
## Values of X 
##      (Intercept) RACE POLVIEWS     EDUC      AGE
## 4602           1    2 5.100205 12.48643 44.63403
## 
## Expected Values: E(Y|X)
##        mean          sd      2.5%     97.5%
## 1 0.8048595 0.003426393 0.7984899 0.8117579
## 
## Predicted Values: Y|X
##       0     1
## 1 0.216 0.784
m1 <-zelig(GUN ~ RACE + POLVIEWS + EDUC + AGE, data = Final, model = "logit")
m2 <-setx(m1, RACE="3")
m3 <-sim(m1, x=m2)
summary(m3)
## 
##   Model: logit 
##   Number of simulations: 1000 
## 
## Values of X 
##      (Intercept) RACE POLVIEWS     EDUC      AGE
## 4602           1    3 5.100205 12.48643 44.63403
## 
## Expected Values: E(Y|X)
##        mean          sd      2.5%     97.5%
## 1 0.7756793 0.007279437 0.7612139 0.7893568
## 
## Predicted Values: Y|X
##       0     1
## 1 0.226 0.774
m1 <-zelig(GUN ~ RACE + POLVIEWS + EDUC + AGE, data = Final, model = "logit")
m2 <-setx(m1, RACE="4")
m3 <-sim(m1, x=m2)
summary(m3)
## 
##   Model: logit 
##   Number of simulations: 1000 
## 
## Values of X 
##      (Intercept) RACE POLVIEWS     EDUC      AGE
## 4602           1    4 5.100205 12.48643 44.63403
## 
## Expected Values: E(Y|X)
##        mean         sd      2.5%     97.5%
## 1 0.7443478 0.01492127 0.7140131 0.7742202
## 
## Predicted Values: Y|X
##      0    1
## 1 0.25 0.75

Based on the simulation output shown above, Whites are predicted to be 80.5% likely to own a gun, Blacks are predicted to be 77.6% likely to own a gun, and those who are neither Black nor White are predicted to be 74.4% likely to own a gun.
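
The expected values can be roughly reproduced by hand with the inverse logit. The sketch below uses the rounded coefficients of Model 2 in the first table, which has the same predictors as the simulation model, together with the covariate means printed by setx(). Because the coefficients are rounded, the results only approximate the simulated means, and the object names are made up for illustration.

```r
# Rounded coefficients from Model 2 (constant, education, age, race, conservativism).
b <- c(const = 0.710, educ = 0.027, age = 0.013, race = -0.175, pol = 0.030)

# Covariate means as printed in the "Values of X" output.
educ_mean <- 12.48643
age_mean  <- 44.63403
pol_mean  <- 5.100205

race_codes <- c(White = 2, Black = 3, Other = 4)

# Linear predictor on the log-odds scale for each race code.
eta <- b["const"] + b["educ"] * educ_mean + b["age"] * age_mean +
  b["race"] * race_codes + b["pol"] * pol_mean

# plogis() is the inverse logit: 1 / (1 + exp(-eta)).
prob <- plogis(eta)
round(prob, 3)  # roughly 0.81, 0.78, 0.75
```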

Unfortunately, every time I rerun the simulation, the predicted percentages change slightly. This appears to be because sim() draws a new random set of 1,000 parameter values on each run, so the estimates vary a little from run to run. Since I could not capture a single stable percentage, I am going to commit to those listed above, which were the result of the most recent simulation run before publishing this assignment.
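
Run-to-run variation of this kind is usually pinned down by calling set.seed() before the simulation step. The sketch below illustrates the idea with synthetic data, base R's glm(), and MASS::mvrnorm() rather than the GSS file or Zelig itself. The helper simulate_ev() and all data values are hypothetical, and it is an assumption that sim() behaves the same way under a fixed seed.

```r
library(MASS)  # for mvrnorm()

# Synthetic data standing in for the survey (made-up values).
set.seed(1)
n <- 500
toy <- data.frame(age = rnorm(n, 45, 15), educ = rnorm(n, 12.5, 3))
toy$y <- rbinom(n, 1, plogis(0.5 + 0.014 * toy$age + 0.03 * toy$educ))

fit <- glm(y ~ age + educ, data = toy, family = binomial)

# Draw coefficient vectors from their estimated sampling distribution and
# average the implied probability at the covariate means.
simulate_ev <- function(fit, n_sims = 1000) {
  draws <- mvrnorm(n_sims, mu = coef(fit), Sigma = vcov(fit))
  x_bar <- colMeans(model.matrix(fit))  # intercept column averages to 1
  mean(plogis(draws %*% x_bar))
}

set.seed(2024); run_a <- simulate_ev(fit)
set.seed(2024); run_b <- simulate_ev(fit)
run_c <- simulate_ev(fit)  # no reseed: continues the random stream

identical(run_a, run_b)  # same seed, same result
run_a - run_c            # small Monte Carlo difference
```

With a seed fixed at the top of the script, the percentages quoted in the text would match the printed output on every run.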

Issues with the Difference

I tried to complete the difference-in-differences calculations. I used the following code, but was given an error message. After much time searching Google, I still could not find a solution.

xh1 <- setx(m5, POLVIEWS = mean(Final$POLVIEWS)+sd(Final$POLVIEWS), RACE=2)
xl1 <- setx(m5, POLVIEWS = mean(Final$POLVIEWS), RACE=2)
xh0 <- setx(m5, POLVIEWS = mean(Final$POLVIEWS)+sd(Final$POLVIEWS), RACE=3)
xl0 <- setx(m5, POLVIEWS = mean(Final$POLVIEWS), RACE=3)

zh1 <- sim(m5, x=xh1)
zl1 <- sim(m5, x=xl1)
zh0 <- sim(m5, x=xh0)
zl0 <- sim(m5, x=xl0)

eff <- (zh1$qi$ev - zl1$qi$ev) -(zh0$qi$ev - zl0$qi$ev)
summary(zh1$qi$ev)
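
One plausible cause of the error, offered as an assumption since the message itself is not shown, is missing values in POLVIEWS: mean() and sd() return NA unless na.rm = TRUE is set, and that NA would then be passed to setx(). The sketch below demonstrates the NA behaviour and the same difference-in-differences calculation on synthetic data, using glm() and MASS::mvrnorm() in place of Zelig. The data, the helper ev(), and all coefficient values are made up for illustration.

```r
library(MASS)  # for mvrnorm()

# Synthetic stand-in for the RACE, POLVIEWS and GUN columns.
set.seed(42)
n <- 2000
toy <- data.frame(RACE = sample(2:4, n, replace = TRUE),
                  POLVIEWS = sample(1:7, n, replace = TRUE))
eta <- 0.9 + 0.08 * toy$RACE + 0.18 * toy$POLVIEWS -
  0.06 * toy$RACE * toy$POLVIEWS
toy$GUN <- rbinom(n, 1, plogis(eta))
toy$POLVIEWS[sample(n, 200)] <- NA  # inject missing values

naive_mean <- mean(toy$POLVIEWS)               # NA: missing values propagate
pol_mean   <- mean(toy$POLVIEWS, na.rm = TRUE)
pol_sd     <- sd(toy$POLVIEWS, na.rm = TRUE)

# glm() drops incomplete rows by default.
fit <- glm(GUN ~ RACE * POLVIEWS, data = toy, family = binomial)
draws <- mvrnorm(1000, mu = coef(fit), Sigma = vcov(fit))

# Expected probability under each coefficient draw for one profile.
# Order matches coef(fit): (Intercept), RACE, POLVIEWS, RACE:POLVIEWS.
ev <- function(race, pol) {
  x <- c(1, race, pol, race * pol)
  as.vector(plogis(draws %*% x))
}

# Effect of a one-SD rise in conservativism for code 2 minus the same for code 3.
eff <- (ev(2, pol_mean + pol_sd) - ev(2, pol_mean)) -
  (ev(3, pol_mean + pol_sd) - ev(3, pol_mean))
quantile(eff, c(0.025, 0.5, 0.975))
```

If missing values are indeed the cause, adding na.rm = TRUE to the mean() and sd() calls in the original setx() lines would be the corresponding change.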