For this assignment, I will be using data from the 2014 General Social Survey. To see if I can find variables that are actually related this time, I am moving away from the abortion data I used in previous assignments. This time, I will look at how gender and other respondent characteristics relate to various work characteristics. Hopefully I will find some interesting relationships specific to the 2014 data! Read on…
library(Zelig)
library(foreign)
library(DescTools)
d <- read.dta("/Users/laurenberkowitz/Downloads/GSS2014.DTA", convert.factors = FALSE)
names(d)
library(dplyr)
library(tidyr)
library(pander)
library(car)
ExamWork <- select(d, age, sex, marital, educ, race, yearsjob, wrkhome, famwkoff, famvswk, hrsrelax, satjob)
names(ExamWork)
Variables include:
AGE Respondent’s Age
SEX Respondent’s Sex
RACE Respondent’s Race
MARITAL Marital Status
EDUC Highest year of school completed
YEARSJOB Years at present job
WRKHOME Frequency of working from home
FAMWKOFF Difficulty in taking off work for family
FAMVSWK Reverse Frequency of family interfering with work
HRSRELAX Hours per day to relax after an average work day
SATJOB Reverse Job Satisfaction
ExamWork <- na.omit(ExamWork)
ExamWork$educ <- as.numeric(ExamWork$educ)
My overall question is how sex, working from home, and time to relax relate to one another. To set up the regressions, I start with marital status as the outcome and examine how age, sex, years at the present job, frequency of working from home, time to relax, and difficulty taking time off work for family matters relate to it.
First we convert marital to a binary variable. In the GSS codebook, marital is coded 1 = married, 2 = widowed, 3 = divorced, 4 = separated, and 5 = never married, so the recode below sets BinaryMarital to 1 for respondents who have never married and to 0 for everyone else. A second recode creates WhiteRace, which is 1 for respondents coded as white and 0 otherwise.
ExamWork$BinaryMarital<- recode(ExamWork$marital, "c(1,2,3,4)='0'; 5='1'")
ExamWork$WhiteRace<-recode(ExamWork$race, "1='1' ; c(2,3)='0'")
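A quick cross-tabulation is a cheap way to confirm that a recode did what was intended. The sketch below is only an illustration: it uses a made-up vector of marital codes (`marital_toy`, not the GSS data) and base R's `ifelse()` as a stand-in for `car::recode()`.

```r
# Toy marital codes, hypothetical values for illustration only
marital_toy <- c(1, 5, 3, 2, 5, 4, 1, NA)

# 1 when the code is 5, 0 for codes 1 to 4; missing values stay missing
binary_toy <- ifelse(marital_toy == 5, 1, 0)

# Cross-tabulate old against new codes to see which categories land where
check <- table(marital_toy, binary_toy, useNA = "ifany")
print(check)
```

Running the same kind of `table()` call on `ExamWork$marital` and `ExamWork$BinaryMarital` would show directly which marital categories ended up coded as 1.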
Then we fit three nested logistic regressions:
wkeffect1 <- glm(BinaryMarital ~ age + sex + yearsjob + wrkhome, family = binomial, data=ExamWork)
wkeffect2 <- glm(BinaryMarital ~ age + sex + yearsjob + wrkhome + hrsrelax, family = binomial, data=ExamWork)
wkeffect3 <- glm(BinaryMarital ~ age + sex + yearsjob + wrkhome + hrsrelax + famwkoff, family = binomial, data=ExamWork)
library(stargazer)
##
## Please cite as:
##
## Hlavac, Marek (2014). stargazer: LaTeX code and ASCII text for well-formatted regression and summary statistics tables.
## R package version 5.1. http://CRAN.R-project.org/package=stargazer
stargazer(wkeffect1, wkeffect2, wkeffect3, type="html")
Dependent variable: BinaryMarital

| | (1) | (2) | (3) |
|---|---|---|---|
| age | -0.084*** | -0.085*** | -0.085*** |
| | (0.007) | (0.007) | (0.007) |
| sex | 0.071 | 0.112 | 0.116 |
| | (0.144) | (0.145) | (0.145) |
| yearsjob | -0.013 | -0.013 | -0.013 |
| | (0.012) | (0.012) | (0.012) |
| wrkhome | -0.138*** | -0.130*** | -0.130*** |
| | (0.044) | (0.045) | (0.045) |
| hrsrelax | | 0.063** | 0.067** |
| | | (0.028) | (0.028) |
| famwkoff | | | 0.060 |
| | | | (0.071) |
| Constant | 2.786*** | 2.536*** | 2.375*** |
| | (0.351) | (0.366) | (0.413) |
| Observations | 1,218 | 1,218 | 1,218 |
| Log Likelihood | -592.448 | -590.026 | -589.675 |
| Akaike Inf. Crit. | 1,194.895 | 1,192.053 | 1,193.351 |

Note: *p<0.1; **p<0.05; ***p<0.01
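The Akaike values in the table above can be reproduced by hand, which makes it clearer what the criterion rewards. The sketch below assumes the usual definition, AIC = 2k - 2 logLik, where k counts the estimated coefficients including the intercept; the log-likelihoods are copied from the table, and the names `m1` to `m3` are just labels for this example.

```r
# Log-likelihoods copied from the regression table
loglik <- c(m1 = -592.448, m2 = -590.026, m3 = -589.675)

# Number of estimated coefficients per model, counting the intercept
k <- c(m1 = 5, m2 = 6, m3 = 7)

# AIC penalises each extra coefficient by 2 points
aic <- 2 * k - 2 * loglik
print(round(aic, 2))

# The model with the smallest AIC is preferred
best <- names(which.min(aic))
print(best)
```

The results match the table up to rounding of the log-likelihoods. Adding famwkoff in the third model improves the log-likelihood by less than the 2-point penalty, which is why its AIC goes up rather than down.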
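Some of these predictors are statistically significant and others (sex, yearsjob, famwkoff) are not. With that in mind, and comparing model fit with the Akaike information criterion (lower is better), I am going to rerun the regressions with a trimmed set of predictors and add WhiteRace in the last model: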
rewkeffect1 <- glm(BinaryMarital ~ age + wrkhome, family = binomial, data=ExamWork)
rewkeffect2 <- glm(BinaryMarital ~ age + wrkhome + hrsrelax, family = binomial, data=ExamWork)
rewkeffect3 <- glm(BinaryMarital ~ age + wrkhome + hrsrelax + WhiteRace, family = binomial, data=ExamWork)
stargazer(rewkeffect1, rewkeffect2, rewkeffect3, type="html")
Dependent variable: BinaryMarital

| | (1) | (2) | (3) |
|---|---|---|---|
| age | -0.088*** | -0.089*** | -0.088*** |
| | (0.006) | (0.007) | (0.007) |
| wrkhome | -0.141*** | -0.135*** | -0.128*** |
| | (0.044) | (0.045) | (0.045) |
| hrsrelax | | 0.059** | 0.057** |
| | | (0.028) | (0.028) |
| WhiteRace | | | -0.615*** |
| | | | (0.156) |
| Constant | 2.972*** | 2.796*** | 3.178*** |
| | (0.268) | (0.279) | (0.301) |
| Observations | 1,218 | 1,218 | 1,218 |
| Log Likelihood | -593.181 | -590.983 | -583.252 |
| Akaike Inf. Crit. | 1,192.361 | 1,189.966 | 1,176.505 |

Note: *p<0.1; **p<0.05; ***p<0.01
In these models age, wrkhome, hrsrelax, and WhiteRace are all statistically significant predictors of BinaryMarital, and the third model (rewkeffect3) fits best, with the lowest Akaike information criterion (1,176.505). The coefficients are changes in log-odds, not in probability. Holding the other variables constant, each additional year of age lowers the log-odds of BinaryMarital = 1 by 0.088, each step up on the wrkhome scale lowers them by 0.128, each additional hour of relaxation raises them by 0.057, and being white lowers them by 0.615. It's interesting to note that every variable other than relaxation time has a negative coefficient. One caution on interpretation: the recode above assigns 1 to marital code 5, which the GSS codebook labels "never married", so these are effects on the odds of never having married rather than on the odds of being married.
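Log-odds are hard to read directly, so it can help to exponentiate them into odds ratios. The short sketch below does that for the coefficients of the third re-run model, copied by hand from the table; the vector name `b` is only a label for this example.

```r
# Coefficients of the third re-run model, copied from the table
b <- c(age = -0.088, wrkhome = -0.128, hrsrelax = 0.057, WhiteRace = -0.615)

# exp() of a logit coefficient is the multiplicative change in the odds
odds_ratio <- exp(b)

# Same information as a percent change in the odds per one-unit increase
pct_change <- 100 * (odds_ratio - 1)

print(round(cbind(odds_ratio, pct_change), 3))
```

An odds ratio below 1 means the odds of BinaryMarital = 1 shrink as the predictor increases, and a ratio above 1 means they grow. With fitted model objects in hand, `exp(coef(rewkeffect3))` would give the same numbers without retyping them.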
Continuing on with what I learned from the logistic regressions, I want to see whether the relationship between education and BinaryMarital differs by sex, that is, whether there is an education-by-sex interaction.
D1 <- zelig(BinaryMarital ~ educ + sex + educ:sex, data= ExamWork, model = "logit")
## How to cite this model in Zelig:
## Kosuke Imai, Gary King, and Oliva Lau. 2008. "logit: Logistic Regression for Dichotomous Dependent Variables" in Kosuke Imai, Gary King, and Olivia Lau, "Zelig: Everyone's Statistical Software," http://gking.harvard.edu/zelig
xh1 <- setx(D1, educ= mean(ExamWork$educ)+sd(ExamWork$educ), sex = 1)
xl1 <- setx(D1, educ = mean(ExamWork$educ), sex = 1)
xh0 <- setx(D1, educ = mean(ExamWork$educ)+sd(ExamWork$educ), sex = 2)
xl0 <- setx(D1, educ = mean(ExamWork$educ), sex = 2)
zh1 <- sim(D1, x= xh1)
zl1 <- sim(D1, x=xl1)
zh0 <- sim(D1, x=xh0)
zl0 <- sim(D1, x=xl0)
eff <- (zh1$qi$ev - zl1$qi$ev) - (zh0$qi$ev - zl0$qi$ev)
quantile(eff, c(.025,.975))
## 2.5% 97.5%
## -0.10051950 0.07096407
hist(eff)
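Based on these results there is no evidence that the relationship between education and marital status differs by sex: the 95% simulation interval for the difference in first differences runs from about -0.10 to 0.07 and includes zero. The roughly bell-shaped histogram only describes the spread of the simulated differences; it is the interval's position relative to zero that matters.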
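For readers curious about what the simulation step is doing, here is a rough base-R analogue. It is a sketch under stated assumptions, not Zelig's actual implementation: the data frame `toy` is simulated rather than taken from the GSS, coefficient draws come from a multivariate normal via `MASS::mvrnorm()`, and the helper `ev()` is a hypothetical name introduced for this example.

```r
library(MASS)  # provides mvrnorm(); ships with standard R installations

set.seed(1)
n <- 500
# Simulated stand-in data, not the GSS
toy <- data.frame(educ = rnorm(n, 13, 3),
                  sex  = sample(1:2, n, replace = TRUE))
toy$y <- rbinom(n, 1, plogis(-0.5 + 0.05 * toy$educ))

fit <- glm(y ~ educ * sex, family = binomial, data = toy)

# Draw plausible coefficient vectors from their approximate sampling distribution
draws <- mvrnorm(1000, coef(fit), vcov(fit))

# Expected probability for one scenario, evaluated under every draw
ev <- function(educ, sex) {
  x <- c(1, educ, sex, educ * sex)  # same order as coef(fit)
  as.vector(plogis(draws %*% x))
}

hi <- mean(toy$educ) + sd(toy$educ)
lo <- mean(toy$educ)

# Difference in first differences: education effect for sex = 1 minus sex = 2
dd <- (ev(hi, 1) - ev(lo, 1)) - (ev(hi, 2) - ev(lo, 2))
ci <- quantile(dd, c(.025, .975))
print(ci)
```

One design difference worth noting: this sketch reuses the same coefficient draws for all four scenarios, whereas four separate `sim()` calls each draw their own, which adds some extra simulation noise to the difference.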
Turning to a count model, we want to estimate how the expected number of hours available to relax varies with race and marital status, controlling for age and working from home.
ExamWork$wrkhome <- as.numeric(ExamWork$wrkhome)
D2 <- zelig(hrsrelax~ WhiteRace + BinaryMarital + age + wrkhome, data = ExamWork, model="poisson")
stargazer(D2, type="html")
Dependent variable: hrsrelax

| | (1) |
|---|---|
| WhiteRace | -0.016 |
| | (0.035) |
| BinaryMarital | 0.110*** |
| | (0.037) |
| age | 0.007*** |
| | (0.001) |
| wrkhome | -0.035*** |
| | (0.009) |
| Constant | 1.033*** |
| | (0.069) |
| Observations | 1,218 |
| Log Likelihood | -2,782.452 |
| Akaike Inf. Crit. | 5,574.904 |

Note: *p<0.1; **p<0.05; ***p<0.01
Based on these results, race has no significant relationship with hours of relaxation, while BinaryMarital and age have positive coefficients and working from home has a negative one. Poisson coefficients are on the log scale of the expected count, not log-odds: 0.110 for BinaryMarital, 0.007 per year of age, and -0.035 per step on the wrkhome scale. Exponentiating gives rate ratios of about 1.12, 1.01, and 0.97, so expected hours of relaxation are about 12% higher when BinaryMarital = 1, rise slightly with each year of age, and fall by about 3% with each step up in working from home.
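To make the log-scale coefficients more concrete, the sketch below turns them into rate ratios and into expected hours for one scenario. The coefficients are copied from the Poisson table; the scenario itself (white, age 40, wrkhome = 1) is a hypothetical respondent chosen only for illustration.

```r
# Poisson coefficients copied from the table
b <- c(const = 1.033, WhiteRace = -0.016, BinaryMarital = 0.110,
       age = 0.007, wrkhome = -0.035)

# exp() of a Poisson coefficient is a multiplicative change in the expected count
rate_ratio <- exp(b[-1])
print(round(rate_ratio, 3))

# Expected hours for a hypothetical respondent with BinaryMarital = 0
mu0 <- unname(exp(b["const"] + b["WhiteRace"] * 1 + b["age"] * 40 + b["wrkhome"] * 1))

# Same respondent with BinaryMarital = 1: multiply by the rate ratio
mu1 <- mu0 * unname(rate_ratio["BinaryMarital"])

print(c(BinaryMarital_0 = mu0, BinaryMarital_1 = mu1))
```

Because the model is multiplicative, the ratio `mu1 / mu0` is the same for every choice of age, race, and wrkhome, while the difference `mu1 - mu0` in hours is not.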
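I’d like to examine the simulated distribution of how the effect of age on hours of relaxation differs by marital status. D2 has no age-by-marital interaction term, so any difference between the two groups here comes from the model's log link rather than from an estimated interaction.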
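# Vary age (educ is not a term in D2); other covariates stay at setx() defaults.
# Rerun sim() and quantile() after this change to refresh the interval.
xh2 <- setx(D2, age = mean(ExamWork$age) + sd(ExamWork$age), BinaryMarital = 1)
xl2 <- setx(D2, age = mean(ExamWork$age), BinaryMarital = 1)
xh3 <- setx(D2, age = mean(ExamWork$age) + sd(ExamWork$age), BinaryMarital = 0)
xl3 <- setx(D2, age = mean(ExamWork$age), BinaryMarital = 0)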
zh2 <- sim(D2, x= xh2)
zl2 <- sim(D2, x=xl2)
zh3 <- sim(D2, x=xh3)
zl3 <- sim(D2, x=xl3)
gee <- (zh2$qi$ev - zl2$qi$ev) - (zh3$qi$ev - zl3$qi$ev)
quantile(gee, c(.025,.975))
## 2.5% 97.5%
## -0.0496718 0.8363300
hist(gee)
Based on these results there is no clear evidence that the effect of age on hours of relaxation differs by marital status: the 95% interval printed above (about -0.05 to 0.84) includes zero, although most of it lies above zero.
Interaction 2
I’d also like to examine how the effect of working from home on hours of relaxation differs by marital status.
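# Vary wrkhome by one standard deviation; the column is wrkhome, not workhome.
# Rerun sim() and quantile() after this change to refresh the interval.
xh4 <- setx(D2, wrkhome = mean(ExamWork$wrkhome) + sd(ExamWork$wrkhome), BinaryMarital = 1)
xl4 <- setx(D2, wrkhome = mean(ExamWork$wrkhome), BinaryMarital = 1)
xh5 <- setx(D2, wrkhome = mean(ExamWork$wrkhome) + sd(ExamWork$wrkhome), BinaryMarital = 0)
xl5 <- setx(D2, wrkhome = mean(ExamWork$wrkhome), BinaryMarital = 0)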
zh4 <- sim(D2, x= xh4)
zl4 <- sim(D2, x=xl4)
zh5 <- sim(D2, x=xh5)
zl5 <- sim(D2, x=xl5)
jay <- (zh4$qi$ev - zl4$qi$ev) - (zh5$qi$ev - zl5$qi$ev)
quantile(jay, c(.025,.975))
## 2.5% 97.5%
## -1.0599951 -0.3133549
hist(jay)
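Based on these results, the 95% interval printed above (about -1.06 to -0.31) excludes zero, which would suggest that the effect of working from home on hours of relaxation differs by marital status. That reading depends on the four setx() scenarios varying wrkhome and nothing else, so it is worth confirming the scenarios before relying on it.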
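It may help to see why a Poisson model with no interaction term can still produce a nonzero difference in first differences on the hours scale. The derivation below is a sketch using notation introduced only for this example: $\mu(a, m)$ is the expected count at age $a$ and marital indicator $m$, $c$ collects the intercept and the other covariates held fixed, and $\Delta$ is the one-standard-deviation increase in age.

```latex
\begin{aligned}
\mu(a, m) &= \exp(c + \beta_a a + \beta_m m) \\
\mathrm{DD} &= \bigl[\mu(a + \Delta, 1) - \mu(a, 1)\bigr] - \bigl[\mu(a + \Delta, 0) - \mu(a, 0)\bigr] \\
            &= e^{c + \beta_a a}\,\bigl(e^{\beta_a \Delta} - 1\bigr)\,\bigl(e^{\beta_m} - 1\bigr)
\end{aligned}
```

The product is zero only when $\beta_a = 0$ or $\beta_m = 0$. With both coefficients positive, as in the table for D2, the difference in first differences is positive even though the model contains no age-by-marital term, so a nonzero interval here reflects the shape of the log link and is not by itself evidence of an interaction. The same argument applies with wrkhome in place of age.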