library(Zelig)
library(foreign)
library(knitr)
library(stargazer)
library(dplyr)
library(car)
library(DescTools)
library(lme4)
library(texreg)
The dataset I am working with is of a 1988 Bangladesh Fertility study. The dataset has 1934 women in 60 districts. Here I will be using the data to show how contraceptive use, location, and age may affect the amount of children a woman has.
N <- read.table("/Users/laurenberkowitz/Downloads/datasets/bang.txt")
fertility <- select(N, "womanid"=V1, "district"=V2, "contrause"=V3, "livchild"=V4, "age"=V5, "urbanres"=V6, "constant"=V7)
names(fertility)
## [1] "womanid" "district" "contrause" "livchild" "age" "urbanres"
## [7] "constant"
fertility <- na.omit(fertility)
Here we look at the amount of children based on contraceptive use by location.
m2_lme <- lmer(livchild ~ urbanres + (urbanres|contrause), data = fertility)
summary(m2_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: livchild ~ urbanres + (urbanres | contrause)
## Data: fertility
##
## REML criterion at convergence: 6297.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.58190 -1.13387 0.04529 1.07471 1.30691
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## contrause (Intercept) 0.072201 0.26870
## urbanres 0.004123 0.06421 -1.00
## Residual 1.510723 1.22911
## Number of obs: 1934, groups: contrause, 2
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.7568 0.1931 14.275
## urbanres -0.2205 0.0772 -2.856
##
## Correlation of Fixed Effects:
## (Intr)
## urbanres -0.659
We see that less children are born in urban areas and more are born in rural areas. The contraceptive use is .07, so not very high in areas of urban residence. There is a very large residual with 1.51.
To see about additional factors, we also consider age on contraceptive use
m3_lme <- lmer(livchild ~ urbanres + (urbanres|contrause) + age, data = fertility)
m4_lme <- lmer(livchild ~ urbanres*age + (urbanres|contrause), data = fertility)
screenreg(list(m3_lme, m4_lme))
##
## ===========================================================
## Model 1 Model 2
## -----------------------------------------------------------
## (Intercept) 2.74 *** 2.74 ***
## (0.15) (0.15)
## urbanres -0.18 *** -0.18 ***
## (0.05) (0.05)
## age 0.09 *** 0.09 ***
## (0.00) (0.00)
## urbanres:age 0.00
## (0.01)
## -----------------------------------------------------------
## AIC 5083.14 5093.18
## BIC 5122.11 5137.72
## Log Likelihood -2534.57 -2538.59
## Num. obs. 1934 1934
## Num. groups: contrause 2 2
## Variance: contrause.(Intercept) 0.05 0.05
## Variance: contrause.urbanres 0.00 0.00
## Variance: Residual 0.80 0.80
## ===========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
We see that there does not appear to be a significant relationship between area of residence and age. We see according to Model 1 that there is a negative affect on the number of kids and living in an urban setting. We see that age doesn’t have any different affect either under Model 2.
The next step is to center the data:
library(magrittr)
fertility %<>% mutate(meanage = age - mean(age))
m5_lme <- lmer(livchild ~ urbanres*meanage + (urbanres|contrause), data = fertility)
screenreg(list(m4_lme, m5_lme))
##
## ===========================================================
## Model 1 Model 2
## -----------------------------------------------------------
## (Intercept) 2.74 *** 2.74 ***
## (0.15) (0.15)
## urbanres -0.18 *** -0.18 ***
## (0.05) (0.05)
## age 0.09 ***
## (0.00)
## urbanres:age 0.00
## (0.01)
## meanage 0.09 ***
## (0.00)
## urbanres:meanage 0.00
## (0.01)
## -----------------------------------------------------------
## AIC 5093.18 5093.18
## BIC 5137.72 5137.72
## Log Likelihood -2538.59 -2538.59
## Num. obs. 1934 1934
## Num. groups: contrause 2 2
## Variance: contrause.(Intercept) 0.05 0.05
## Variance: contrause.urbanres 0.00 0.00
## Variance: Residual 0.80 0.80
## ===========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
We can see there is still a negative effect on the number of children in an urban setting when the age is averaged. I think it could also be helpful to see if the number of children is actually related to contraceptive use.
m6_lme <- lmer(livchild ~ 1 + (1|contrause), data=fertility)
summary(m6_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: livchild ~ 1 + (1 | contrause)
## Data: fertility
##
## REML criterion at convergence: 6306
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.4943 -1.2403 0.1279 0.9390 1.1931
##
## Random effects:
## Groups Name Variance Std.Dev.
## contrause (Intercept) 0.05064 0.225
## Residual 1.51997 1.233
## Number of obs: 1934, groups: contrause, 2
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.6857 0.1617 16.61
We calculate the Interclass Correlation:
ICC <- ((0.05064)/(0.05064+1.51997))
ICC
## [1] 0.03224225
The small number shows that only 3% of the determining for number of children has to do with contraceptive use. On the other hand, we find over half of the factors determining the amount of children a woman has has to do with her age:
m7_lme <- lmer(livchild ~ 1 + (1|age), data=fertility)
summary(m7_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: livchild ~ 1 + (1 | age)
## Data: fertility
##
## REML criterion at convergence: 4990
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.3516 -0.4782 0.1798 0.6657 3.0969
##
## Random effects:
## Groups Name Variance Std.Dev.
## age (Intercept) 0.9816 0.9907
## Residual 0.7041 0.8391
## Number of obs: 1934, groups: age, 50
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.6748 0.1447 18.49
ICC2 <- ((0.9816)/(0.9816+0.7041))
ICC2
## [1] 0.58231