I would like to study if marriage and children make people more engaged in religious service. Alternatively speaking, I would like to know if people go to religious service more frequently after they get married and have children.
Marriage and having children make people spending more time on family life, such as staying together with family on weekends and holidays. A religious service is a good option for the family to spend time together and meet new people living nearby. Therefore, I believe that marriage and number of children are positively related to one’s attendance at religious service.
I used three variables from General Social Survey (GSS) in my model.
The dependent variable is attend, question “how often do you attend religious services”. Answers range from “never” to “more than once a week”.
I have two independent variables. The first one is marital status. I re-coded this variable to make people married as 1 and others as 0. About half of interviewees are married.
ds2$married <- ifelse(ds2$marital == 1,1,0)
table(ds2$married)
##
## 0 1
## 2455 2353
The second one is how many children do you have.
summaryBy(married~attend, data=ds2, FUN=c(mean, sd), na.rm=T)
## attend married.mean married.sd
## 1 0 0.3769231 0.4848485
## 2 1 0.4006623 0.4908460
## 3 2 0.4746835 0.4997542
## 4 3 0.4959016 0.5004963
## 5 4 0.5129870 0.5006447
## 6 5 0.4939467 0.5005697
## 7 6 0.5884615 0.4930614
## 8 7 0.5962539 0.4909032
## 9 8 0.5242967 0.5000492
## 10 NA 0.3846154 0.5063697
summaryBy(childs~attend, data=ds2, FUN=c(mean, sd), na.rm=T)
## attend childs.mean childs.sd
## 1 0 1.633462 1.643457
## 2 1 1.723333 1.492497
## 3 2 1.713608 1.612728
## 4 3 1.897541 1.683887
## 5 4 1.821429 1.557977
## 6 5 1.973366 1.616591
## 7 6 2.100000 1.661975
## 8 7 2.238293 1.742374
## 9 8 2.427110 1.818475
## 10 NA 2.153846 1.908147
From the table above, we can see the frequency of one attends religious service increases when marriage or number of children increases. Therefore, we can tell that marriage and number of children can be good predictors to one’s attendance at religious service, and expect positive relationship between both independent variables to dependent variable.
We start from a naïve multiple-regression model.
ols = plm(attend~ married + childs + as.factor(panelwave) , index = c("idnum", "panelwave"), ds2, model="pooling")
summary(ols)
## Oneway (individual) effect Pooling Model
##
## Call:
## plm(formula = attend ~ married + childs + as.factor(panelwave),
## data = ds2, model = "pooling", index = c("idnum", "panelwave"))
##
## Unbalanced Panel: n=1992, T=1-3, N=4785
##
## Residuals :
## Min. 1st Qu. Median 3rd Qu. Max.
## -5.220 -2.650 -0.206 2.670 5.130
##
## Coefficients :
## Estimate Std. Error t-value Pr(>|t|)
## (Intercept) 2.873756 0.082008 35.0426 <2e-16 ***
## married 0.680546 0.081645 8.3354 <2e-16 ***
## childs 0.217307 0.024262 8.9566 <2e-16 ***
## as.factor(panelwave)2 0.128317 0.094341 1.3601 0.1738
## as.factor(panelwave)3 0.143649 0.099745 1.4402 0.1499
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 38190
## Residual Sum of Squares: 36735
## R-Squared: 0.038089
## Adj. R-Squared: 0.038049
## F-statistic: 47.3186 on 4 and 4780 DF, p-value: < 2.22e-16
Married people attend religious service 0.68 units higher than others on average, net of the time trend. This coefficient is statistically significant.
One more child lead to 0.217 units more religious service on average, net of the time trend. This coefficient is statistically significant.
The outputs of the naïve multiple-regression model are in line with my expectations. Both marriage and number of children are positively related to attendance at religious services.
However, the naïve multiple-regression model is unable to deal with endogeneity issue, such as possible omitted variable bias. For example, maybe people frequently attending religious service are fundamentally different from people who do not attend religious service, such as race, socioeconomic status or education level. Perhaps people from high socioeconomic status family are more likely to attend religious service than people from low socioeconomic status family.
Thus, omitted variables are correlated to variables in this model and may cause biased estimation. Therefore, the relationship between the independent variables and dependent variable might be questionable in the naïve regression model because of omitted variable bias.
Fixed effects model can overcome this issue because it assumes that omitted variables are correlated with the observed variables in the model. Any changes in the dependent variable are only caused by variables other than time-invariant variables such as gender and race.
The followings are the outputs of fixed effects model.
fe = plm(attend~ married + childs + as.factor(panelwave), index = c("idnum", "panelwave"), ds2, model="within")
summary(fe)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = attend ~ married + childs + as.factor(panelwave),
## data = ds2, model = "within", index = c("idnum", "panelwave"))
##
## Unbalanced Panel: n=1992, T=1-3, N=4785
##
## Residuals :
## Min. 1st Qu. Median 3rd Qu. Max.
## -5.320 -0.363 0.000 0.343 4.950
##
## Coefficients :
## Estimate Std. Error t-value Pr(>|t|)
## married 6.9457e-02 1.1865e-01 0.5854 0.55835
## childs -3.5282e-05 5.5212e-02 -0.0006 0.99949
## as.factor(panelwave)2 9.7184e-02 5.0573e-02 1.9217 0.05475 .
## as.factor(panelwave)3 3.4378e-02 5.4651e-02 0.6290 0.52937
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 5382.3
## Residual Sum of Squares: 5374.2
## R-Squared: 0.0015068
## Adj. R-Squared: 0.00087823
## F-statistic: 1.05217 on 4 and 2789 DF, p-value: 0.37871
Married people attend religious service 0.069457 units higher than others on average, net of the time trend. This coefficient is not statistically significant.
One more child lead to 0.000035282 units fewer religious service on average, net of the time trend. This coefficient is not statistically significant.
The outputs of fixed effects model are obviously different from the outputs of the OLS model I have run at the beginning. The impact of number of children on religious service becomes negative, even though the coefficient is very small. The coefficient of marriage decreases largely from 0.68 to 0.069. Moreover, both coefficients become statistically insignificant. The impacts of marriage and number of children on religious service are both diminished after assuming omitted variables are correlated to observed variables.
Also, I run a random effects model.
re = plm(attend~ married + childs + as.factor(panelwave), index = c("idnum", "panelwave"), ds2, model="random")
summary(re)
## Oneway (individual) effect Random Effect Model
## (Swamy-Arora's transformation)
##
## Call:
## plm(formula = attend ~ married + childs + as.factor(panelwave),
## data = ds2, model = "random", index = c("idnum", "panelwave"))
##
## Unbalanced Panel: n=1992, T=1-3, N=4785
##
## Effects:
## var std.dev share
## idiosyncratic 1.927 1.388 0.252
## individual 5.723 2.392 0.748
## theta :
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4981 0.6823 0.6823 0.6576 0.6823 0.6823
##
## Residuals :
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -4.7900 -1.0800 -0.0072 0.0092 1.0700 4.8000
##
## Coefficients :
## Estimate Std. Error t-value Pr(>|t|)
## (Intercept) 3.102698 0.089468 34.6794 < 2.2e-16 ***
## married 0.405339 0.086086 4.7085 2.566e-06 ***
## childs 0.163130 0.029887 5.4582 5.052e-08 ***
## as.factor(panelwave)2 0.094747 0.049745 1.9047 0.05688 .
## as.factor(panelwave)3 0.038963 0.053478 0.7286 0.46630
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 9448.9
## Residual Sum of Squares: 9263.5
## R-Squared: 0.020743
## Adj. R-Squared: 0.020721
## F-statistic: 23.9183 on 4 and 4780 DF, p-value: < 2.22e-16
Married people attend religious service 0.40 units higher than others on average, net of the time trend. This coefficient is statistically significant.
One more child lead to 0.163 units more religious service on average, net of the time trend. This coefficient is statistically significant.
The outputs of the random effects model are more consistent with the outputs of the OLS model. This is because the assumption of random effects model, omitted variables are uncorrelated with observed variables, is very similar to the assumption of OLS model.
Here is the Hausman test to compare the fixed effects model and the random effects model above.
phtest(fe, re)
##
## Hausman Test
##
## data: attend ~ married + childs + as.factor(panelwave)
## chisq = 89.011, df = 4, p-value < 2.2e-16
## alternative hypothesis: one model is inconsistent
By the result of Hausman test that P-value is small enough, we are confident to reject the null hypothesis that coefficients between these two models are same. Therefore, we should use fixed effects model in this case since the coefficients between two models are different and fixed effects model is more consistent. This result indicates that it is likely omitted variables, such as race, gender, SES, are correlated to attendance at religious service and we should consider the impacts of these omitted variables as well.