Introduction

For this assignment, I will be using data from the 2014 General Social Survey. Hoping to find variables that are actually related this time, I am using different variables than in my previous assignments, which dealt with abortion. This time, I will look at how gender relates to various work characteristics. Hopefully I will find some interesting relationships specific to the 2014 data! Read on…

library(Zelig)
library(foreign)
library(DescTools)
d <- read.dta("/Users/laurenberkowitz/Downloads/GSS2014.DTA", convert.factors = FALSE)
names(d)
library(dplyr)
library(tidyr)
library(pander)
library(car)
ExamWork <- select(d, age, sex, marital, educ, yearsjob, wrkhome, famwkoff, famvswk, hrsrelax, satjob)
names(ExamWork)

Variables include:

AGE Respondent’s Age

SEX Respondent’s Sex

RACE Respondent’s Race (listed here, but not among the columns selected into ExamWork above)

MARITAL Marital Status

EDUC Highest year of school completed

YEARSJOB Years at present job

WRKHOME Frequency of working from home

FAMWKOFF Difficulty in taking off work for family

FAMVSWK Reverse Frequency of family interfering with work

HRSRELAX Hours per day to relax

SATJOB Reverse Job Satisfaction

Setup Logistic Regression

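I want to answer the question of whether sex influences the relationship between working from home and the number of hours available to relax. In setting up the regression, I want to see the relationship between sex and frequency of working from home. First we convert marital to a binary variable, coded 1 for never married (GSS code 5) and 0 for every other status (codes 1 through 4). Note that the models below use this marital indicator as the dependent variable, with sex and wrkhome among the predictors.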

ExamWork$BinaryMarital<- recode(ExamWork$marital, "c(1,2,3,4)='0'; 5='1'")
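It is worth checking a recode like this against the original codes before modeling. The sketch below does the same recode in base R on a small made-up vector (the values are hypothetical, not GSS responses) and cross-tabulates the result, so you can confirm that only code 5 maps to 1 and that missing values stay missing.

```r
# Hypothetical marital codes following the GSS scheme (1-4 = ever married, 5 = never married)
marital <- c(1, 5, 3, NA, 2, 5, 4)

# 1 for never married, 0 otherwise; NA stays NA
never_married <- ifelse(marital == 5, 1, 0)

# Cross-tabulate old codes against the new indicator, keeping NAs visible
check <- table(marital, never_married, useNA = "ifany")
print(check)
```

Running the same kind of table() on ExamWork$marital and ExamWork$BinaryMarital would show whether the real recode behaved as intended.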

Then we run regressions

wkeffect1 <- glm(BinaryMarital ~ age + sex + yearsjob + wrkhome, family = binomial, data=ExamWork)
wkeffect2 <- glm(BinaryMarital ~ age + sex + yearsjob + wrkhome + hrsrelax, family = binomial, data=ExamWork)
wkeffect3 <- glm(BinaryMarital ~ age + sex + yearsjob + wrkhome + hrsrelax + famwkoff, family = binomial, data=ExamWork)
library(stargazer)
stargazer(wkeffect1, wkeffect2, wkeffect3, type="html")
                   Dependent variable: BinaryMarital
                   (1)          (2)          (3)
age                -0.084***    -0.085***    -0.085***
                   (0.007)      (0.007)      (0.007)
sex                0.051        0.118        0.116
                   (0.143)      (0.145)      (0.145)
yearsjob           -0.014       -0.013       -0.013
                   (0.012)      (0.012)      (0.012)
wrkhome            -0.126***    -0.129***    -0.130***
                   (0.044)      (0.045)      (0.045)
hrsrelax                        0.064**      0.067**
                                (0.028)      (0.028)
famwkoff                                     0.061
                                             (0.071)
Constant           2.806***     2.524***     2.379***
                   (0.348)      (0.365)      (0.412)
Observations       1,233        1,221        1,220
Log Likelihood     -600.015     -591.397     -590.396
Akaike Inf. Crit.  1,210.031    1,194.793    1,194.793
Note: *p<0.1; **p<0.05; ***p<0.01

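Not all of these relationships are statistically significant. Age and wrkhome have negative coefficients that are significant at the 0.01 level in all three models, and hrsrelax has a positive coefficient that is significant at the 0.05 level. Sex, yearsjob, and famwkoff are not significant even at the 0.1 level. With that in mind, we can continue on to my question.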
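To read significance and effect size straight off the table, here is a minimal sketch that recomputes Wald z-statistics, two-sided p-values, and odds ratios from the model (1) estimates and standard errors reported above. Because the inputs are the rounded table values, the results are approximate and may differ slightly from what summary(wkeffect1) would report.

```r
# Model (1) estimates and standard errors, copied from the regression table
est <- c(age = -0.084, sex = 0.051, yearsjob = -0.014, wrkhome = -0.126)
se  <- c(age =  0.007, sex = 0.143, yearsjob =  0.012, wrkhome =  0.044)

z <- est / se               # Wald z-statistic for each coefficient
p <- 2 * pnorm(-abs(z))     # two-sided p-value under a standard normal
odds_ratio <- exp(est)      # multiplicative change in odds per one-unit increase

print(round(cbind(est, se, z, p, odds_ratio), 3))
```

An odds ratio below 1 (as for age and wrkhome) means the odds of BinaryMarital being 1 fall as the predictor rises. An odds ratio near 1 with a large p-value (as for sex) gives little evidence of any association.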

library(visreg)
library(erer)
library(tidyr)
library(memisc)

Simulation

We are going to simulate each of the quantities, calculate the differences, and assess significance:

D1 <- zelig(BinaryMarital ~ educ + sex + educ:sex, data= ExamWork, model = "logit")

Below, I have taken the code out of chunks, because I cannot figure out how to resolve the errors. A few other classmates mentioned that they could not figure it out either. As far as I know, I copied what I should have done from the class slides, but I may have missed something. I also wonder if the values I set for “sex” were incorrect. I’d love some guidance!

See below for the remainder of my code:

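# educ may contain missing values, so mean() and sd() need na.rm = TRUE;
# sex was read as numeric (convert.factors = FALSE), so pass 1 and 2, not "1" and "2"
educ_mean <- mean(ExamWork$educ, na.rm = TRUE)
educ_sd   <- sd(ExamWork$educ, na.rm = TRUE)
xh1 <- setx(D1, educ = educ_mean + educ_sd, sex = 1)
xl1 <- setx(D1, educ = educ_mean, sex = 1)
xh0 <- setx(D1, educ = educ_mean + educ_sd, sex = 2)
xl0 <- setx(D1, educ = educ_mean, sex = 2)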
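One possible source of trouble at this step, offered as a guess rather than a diagnosis: mean() and sd() return NA whenever the input has a missing value unless na.rm = TRUE is set, and a numeric column is not the same thing as a quoted string. The sketch below uses small made-up vectors (hypothetical values, not GSS data) to show both behaviors.

```r
# Hypothetical education values with one missing response
educ <- c(12, 16, NA, 14)

m_raw   <- mean(educ)                # NA, because of the missing value
m_clean <- mean(educ, na.rm = TRUE)  # mean of the observed values only
s_clean <- sd(educ, na.rm = TRUE)

# Hypothetical sex codes stored as numbers, as with convert.factors = FALSE
sex <- c(1, 2, 2, 1)
sex_is_numeric <- is.numeric(sex)    # TRUE
same_as_string <- identical(1, "1")  # FALSE: a number and a string differ

print(c(m_raw = m_raw, m_clean = m_clean, s_clean = s_clean))
```

If the value handed to setx() is NA, or has a different type than the column used to fit the model, later steps can fail or produce empty results.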

For some reason I cannot go further than here. I am taking the code out of the chunk so you can see what I tried to do.

zh1 <- sim(D1, x = xh1)
zl1 <- sim(D1, x = xl1)
zh0 <- sim(D1, x = xh0)
zl0 <- sim(D1, x = xl0)

eff <- (zh1$qi$ev - zl1$qi$ev) - (zh0$qi$ev - zl0$qi$ev)

quantile(eff, c(.025,.975))
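For anyone stuck at the same point, the logic of this simulation can be sketched in base R without Zelig. This is not Zelig's implementation, only a conceptually similar approach. It fits the logit with glm(), draws coefficient vectors from a multivariate normal using MASS::mvrnorm(), computes expected probabilities for the four education-by-sex profiles, and takes the difference in differences. The data frame `toy`, the helper `ev()`, and the simulated outcome are all hypothetical stand-ins, not GSS values.

```r
library(MASS)  # for mvrnorm(); a recommended package shipped with R

set.seed(42)

# Synthetic stand-in data (hypothetical; sex coded 1/2 as in the GSS)
n   <- 1000
toy <- data.frame(
  educ = round(rnorm(n, mean = 13.5, sd = 3)),
  sex  = sample(1:2, n, replace = TRUE)
)
toy$y <- rbinom(n, 1, plogis(1 - 0.1 * toy$educ + 0.2 * (toy$sex == 2)))

fit <- glm(y ~ educ + sex + educ:sex, family = binomial, data = toy)

# Draw 1,000 plausible coefficient vectors from their estimated sampling distribution
draws <- mvrnorm(1000, mu = coef(fit), Sigma = vcov(fit))

# Expected probability for one covariate profile, one value per draw
ev <- function(educ, sex) {
  x <- c(1, educ, sex, educ * sex)  # same order as coef(fit)
  as.vector(plogis(draws %*% x))
}

educ_hi <- mean(toy$educ) + sd(toy$educ)
educ_lo <- mean(toy$educ)

# Effect of more education for sex = 1, minus the same effect for sex = 2
eff <- (ev(educ_hi, 1) - ev(educ_lo, 1)) - (ev(educ_hi, 2) - ev(educ_lo, 2))
ci  <- quantile(eff, c(.025, .975))
print(ci)
```

If the 95% interval in `ci` excludes zero, the education effect differs between the two sex groups. If it includes zero, the data do not clearly support an interaction.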

Count Variables

library(nnet)
D2 <- zelig(hrsrelax~ age + yearsjob, data = ExamWork, model="poisson")
stargazer(D2)
## 
## % Table created by stargazer v.5.1 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu
## % Date and time: Sun, Apr 19, 2015 - 23:07:42
##                    Dependent variable: hrsrelax
## age                0.004***
##                    (0.001)
## yearsjob           0.001
##                    (0.002)
## Constant           1.082***
##                    (0.054)
## Observations       1,223
## Log Likelihood     -2,807.815
## Akaike Inf. Crit.  5,621.630
## Note: *p<0.1; **p<0.05; ***p<0.01
D3<- zelig(famvswk ~ hrsrelax + age + yearsjob, data=ExamWork, model = "poisson")
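Poisson coefficients are on the log scale, so they are easier to interpret after exponentiating. The sketch below uses the rounded D2 estimates reported above to compute rate ratios and an expected value of hrsrelax for one illustrative respondent. The chosen profile (age 40, 5 years at the present job) is a hypothetical example, and the results are approximate because the inputs are rounded.

```r
# Coefficients copied from the D2 (hrsrelax ~ age + yearsjob) output
b <- c(intercept = 1.082, age = 0.004, yearsjob = 0.001)

# exp(coefficient) = multiplicative change in the expected count per one-unit increase
rate_ratio <- exp(b[c("age", "yearsjob")])

# Expected hrsrelax for a hypothetical respondent aged 40 with 5 years on the job
expected <- unname(exp(b["intercept"] + b["age"] * 40 + b["yearsjob"] * 5))

print(round(rate_ratio, 4))
print(round(expected, 2))
```

A rate ratio of about 1.004 for age means each additional year of age is associated with roughly a 0.4 percent higher expected value of hrsrelax, holding yearsjob constant.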

This is as far as I got. I certainly need some more practice, but I’m happy that for the most part I was able to run formulas more successfully this week!