I am working with a shortened version of the 2012 General Social Survey data. The extract contains a reduced set of variables, some of which I examine here to see which factors predict gun ownership.
library("Zelig")
## Loading required package: boot
## Loading required package: MASS
## Loading required package: sandwich
## ZELIG (Versions 4.2-1, built: 2013-09-12)
##
## +----------------------------------------------------------------+
## | Please refer to http://gking.harvard.edu/zelig for full |
## | documentation or help.zelig() for help with commands and |
## | models support by Zelig. |
## | |
## | Zelig project citations: |
## | Kosuke Imai, Gary King, and Olivia Lau. (2009). |
## | ``Zelig: Everyone's Statistical Software,'' |
## | http://gking.harvard.edu/zelig |
## | and |
## | Kosuke Imai, Gary King, and Olivia Lau. (2008). |
## | ``Toward A Common Framework for Statistical Analysis |
## | and Development,'' Journal of Computational and |
## | Graphical Statistics, Vol. 17, No. 4 (December) |
## | pp. 892-913. |
## | |
## | To cite individual Zelig models, please use the citation |
## | format printed with each model run and in the documentation. |
## +----------------------------------------------------------------+
##
##
##
## Attaching package: 'Zelig'
##
## The following object is masked from 'package:utils':
##
## cite
library("DescTools")
##
## Attaching package: 'DescTools'
##
## The following object is masked from 'package:Zelig':
##
## Mode
library("dplyr")
## Warning: package 'dplyr' was built under R version 3.1.3
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:Zelig':
##
## combine, summarize
##
## The following object is masked from 'package:MASS':
##
## select
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library("stargazer")
##
## Please cite as:
##
## Hlavac, Marek (2014). stargazer: LaTeX code and ASCII text for well-formatted regression and summary statistics tables.
## R package version 5.1. http://CRAN.R-project.org/package=stargazer
library("readstata13")
## Warning: package 'readstata13' was built under R version 3.1.3
library("foreign")
d <- read.dta("C:/Users/Abigail Walsh/Documents/Grad School/Queen's College/Basic Analytics/Basic Analytics/GSS.dta")
names(d)
## [1] "CASEID" "WORKBLKS" "RACDIF1" "RACMAR" "RACDIF2" "RACDIF3"
## [7] "HELPBLK" "HELPPOOR" "YEAR" "SEX" "AGE" "RACE"
## [13] "REALINC" "REALRINC" "EDUC" "DEGREE" "PRESTG80" "PAPRES80"
## [19] "MARITAL" "DIVORCE" "CHILDS" "RELIG" "WRKSLF" "UNEMP"
## [25] "REGION" "SIZE" "RACLIVE" "FEAR" "GUN" "POLVIEWS"
## [31] "FECHLD" "FEFAM"
Final <- select(d, WORKBLKS, RACDIF1, RACDIF2, RACDIF3, RACMAR, HELPBLK, HELPPOOR, YEAR, SEX, AGE, RACE, REALINC, EDUC, DEGREE, RELIG, UNEMP, REGION, RACLIVE, FEAR, GUN, POLVIEWS)
names(Final)
## [1] "WORKBLKS" "RACDIF1" "RACDIF2" "RACDIF3" "RACMAR" "HELPBLK"
## [7] "HELPPOOR" "YEAR" "SEX" "AGE" "RACE" "REALINC"
## [13] "EDUC" "DEGREE" "RELIG" "UNEMP" "REGION" "RACLIVE"
## [19] "FEAR" "GUN" "POLVIEWS"
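# Note: df is not used anywhere below.
df <- data.frame(GUN = c("NO", "YES", NA), stringsAsFactors = FALSE)
# If these columns are factors (read.dta's default for labelled values),
# as.numeric() returns their integer level codes (1, 2, ...), not 0/1 indicators.
Final$GUN <- as.numeric(Final$GUN)
Final$REGION <- as.numeric(Final$REGION)
Final$POLVIEWS <- as.numeric(Final$POLVIEWS)
Final$RACE <- as.numeric(Final$RACE)
Final$RACDIF2 <- as.numeric(Final$RACDIF2)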
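Because `as.numeric()` applied to a factor returns level codes, a yes/no outcome may be easier to interpret after an explicit recode. The following is a minimal sketch, assuming `GUN` is a factor with labels such as "YES" and "NO". The toy vector `gun_raw` and its level labels are assumptions for illustration, not values read from GSS.dta.

```r
# Toy stand-in for Final$GUN; the level labels are assumed, not taken from the data.
gun_raw <- factor(c("YES", "NO", "REFUSED", "NO", NA, "YES"),
                  levels = c("YES", "NO", "REFUSED"))

# as.numeric() on a factor gives the level codes (1, 2, 3), not a yes/no indicator.
gun_codes <- as.numeric(gun_raw)

# Explicit recode: 1 = owns a gun, 0 = does not, NA for any other answer.
gun_binary <- ifelse(gun_raw == "YES", 1,
                     ifelse(gun_raw == "NO", 0, NA))

print(table(gun_binary, useNA = "ifany"))
```

With the real data, the labels to match against would first need to be checked with `levels(Final$GUN)` before the numeric conversion.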
I fit three models to the data. The first uses two demographic predictors: education and age. The second adds participants' race and political leanings, measured here by how conservative their political views are. The third adds a measure of racial attitudes: whether participants attribute racial differences in education between Whites and Blacks to an inborn lack of ability to learn among Blacks.
# No family argument is given, so glm() fits its default Gaussian (linear) model.
model1 <- glm(GUN ~ EDUC + AGE, data = Final)
model2 <- glm(GUN ~ EDUC + AGE + RACE + POLVIEWS, data = Final)
model3 <- glm(GUN ~ EDUC + AGE + RACE + POLVIEWS + RACDIF2, data = Final)
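When `glm()` is called without a `family` argument, as in the three models above, it fits a Gaussian (linear) model rather than a logit. The following is a minimal sketch of a logistic fit, assuming the outcome has been recoded to 0/1. The data are simulated, and the names `sim`, `gun`, `educ`, and `age` are hypothetical stand-ins rather than the GSS variables.

```r
set.seed(1)
n <- 500

# Simulated predictors and a simulated 0/1 outcome (not GSS data).
sim <- data.frame(educ = sample(8:20, n, replace = TRUE),
                  age  = sample(18:89, n, replace = TRUE))
sim$gun <- rbinom(n, 1, plogis(-2 + 0.02 * sim$age + 0.03 * sim$educ))

# Logistic regression: the family must be stated explicitly.
logit_fit <- glm(gun ~ educ + age, data = sim,
                 family = binomial(link = "logit"))

# The same call without a family falls back to the Gaussian default.
linear_fit <- glm(gun ~ educ + age, data = sim)

print(family(logit_fit)$family)
print(family(linear_fit)$family)
print(coef(summary(logit_fit)))
```

Coefficients from `logit_fit` are on the log-odds scale, which is what makes exponentiating them into odds ratios meaningful.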
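To choose among the models, I compared the Akaike information criterion (AIC) reported at the bottom of the table. Model 3 has the lowest AIC, so by this measure it is the best fit. Note, however, that the three models were fit to different numbers of observations (19,214, 16,097, and 4,313), so their AIC values are not strictly comparable.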
stargazer(model1, model2, model3, type = "text", style = "aer",
          title = "Table 1: Logit Models",
          covariate.labels = c("Education", "Age", "Race",
                               "Conservativism", "Racial Attitudes on Inborn Ability"),
          dep.var.labels = "Gun Ownership")
##
## Table 1: Logit Models
## ============================================================================
##                                                  Gun Ownership
##                                            (1)          (2)          (3)
## ----------------------------------------------------------------------------
## Education                               0.004***     0.004***     0.007***
##                                         (0.001)      (0.001)      (0.002)
##
## Age                                     0.002***     0.002***     0.002***
##                                         (0.000)      (0.000)      (0.000)
##
## Race                                                -0.031***      -0.020
##                                                      (0.007)      (0.013)
##
## Conservativism                                       0.005**       0.002
##                                                      (0.002)      (0.005)
##
## Racial Attitudes on Inborn Ability                                 0.000
##                                                                   (0.016)
##
## Constant                                2.660***     2.698***     2.638***
##                                         (0.016)      (0.029)      (0.071)
##
## Observations                             19,214       16,097       4,313
## Log Likelihood                         -9,417.557   -8,095.599   -2,137.797
## Akaike Inf. Crit.                      18,841.120   16,201.200   4,287.593
## ----------------------------------------------------------------------------
## Notes:                                 ***Significant at the 1 percent level.
##                                         **Significant at the 5 percent level.
##                                          *Significant at the 10 percent level.
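The observation counts in Table 1 fall from 19,214 to 4,313 as predictors with missing values are added. One way to put an AIC comparison on an equal footing is to refit every model on the rows that are complete for all variables. The following is a minimal sketch on simulated data. The data frame `toy` and its columns `y`, `x1`, and `x2` are hypothetical.

```r
set.seed(2)
n <- 400

# Simulated data in which one predictor is missing for many rows.
toy <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n))
toy$x2[sample(n, 150)] <- NA

# Fit on whatever rows each model can use: the samples differ.
m_small <- glm(y ~ x1, data = toy)
m_big   <- glm(y ~ x1 + x2, data = toy)
print(c(small = nobs(m_small), big = nobs(m_big)))

# Refit both models on the rows that are complete for every variable.
common    <- toy[complete.cases(toy[, c("y", "x1", "x2")]), ]
m_small_c <- glm(y ~ x1, data = common)
m_big_c   <- glm(y ~ x1 + x2, data = common)

# Both fits now use the same observations, so the AIC values are comparable.
print(AIC(m_small_c, m_big_c))
```

Applied to the models here, this would mean restricting `Final` to rows with no missing values in `GUN`, `EDUC`, `AGE`, `RACE`, `POLVIEWS`, and `RACDIF2` before fitting all three.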
# exp(coef) is an odds ratio only when the model is a logit (family = binomial).
Model3OddsRatioTable <- round(exp(cbind(Estimate = coef(model3))), 2)
rownames(Model3OddsRatioTable) <- c("Intercept", "Education", "Age",
                                    "Race", "Conservativism",
                                    "Racial Attitudes on Inborn Ability")
stargazer(Model3OddsRatioTable, type = "text", digits = 2, style = "aer",
          title = "Table 3: Model Odds Ratios",
          covariate.labels = c("Covariates", "Odds Ratio"))
##
## Table 3: Model Odds Ratios
## =============================================
## Covariates                         Odds Ratio
## ---------------------------------------------
## Intercept                             13.99
## Education                              1.01
## Age                                    1
## Race                                   0.98
## Conservativism                         1
## Racial Attitudes on Inborn Ability     1
## ---------------------------------------------
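Point estimates alone do not show how precisely each odds ratio is estimated. The following is a minimal sketch of an odds ratio table with Wald 95% confidence intervals, assuming a logit model fit to a 0/1 outcome. The data are simulated, and `sim`, `gun`, `educ`, `age`, and `fit` are hypothetical names rather than objects from the analysis above.

```r
set.seed(3)
n <- 500

# Simulated predictors and 0/1 outcome (not GSS data).
sim <- data.frame(educ = sample(8:20, n, replace = TRUE),
                  age  = sample(18:89, n, replace = TRUE))
sim$gun <- rbinom(n, 1, plogis(-2 + 0.02 * sim$age + 0.03 * sim$educ))

fit <- glm(gun ~ educ + age, data = sim, family = binomial)

# Exponentiate the log-odds coefficients and their Wald interval bounds.
# confint.default() uses the normal approximation (estimate +/- 1.96 * SE).
or_table <- round(exp(cbind(OddsRatio = coef(fit), confint.default(fit))), 2)
print(or_table)
```

An interval that contains 1 indicates that the data are consistent with no association between that predictor and the outcome.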
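Although model 3 has the lowest AIC, it may not give us the predictive information we are interested in: the odds ratios for all five predictors are within 0.02 of 1, and none of the predictors added after model 1 (race, conservativism, and racial attitudes) is statistically significant in model 3.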