library(Zelig)
library(foreign)
library(knitr)
library(stargazer)
library(dplyr)
library(car)
library(DescTools)
library(lme4)
library(texreg)

The dataset I am working with is of a 1988 Bangladesh Fertility study. The dataset has 1934 women in 60 districts. Here I will be using the data to show how contraceptive use, location, and age may affect the amount of children a woman has.

N <- read.table("/Users/laurenberkowitz/Downloads/datasets/bang.txt")
fertility <- select(N, "womanid"=V1, "district"=V2, "contrause"=V3, "livchild"=V4, "age"=V5, "urbanres"=V6, "constant"=V7)
names(fertility)
## [1] "womanid"   "district"  "contrause" "livchild"  "age"       "urbanres" 
## [7] "constant"

womanid=the id number of each woman studied

district = the district grouping of each woman studied

contrause = if the woman used contraceptives or not

livchild = the number of living children each woman has

age = the age of the woman

urbanres = if the urban lived in an urban location (versus a rural location)

contant = a constant of 1

fertility <- na.omit(fertility)

Here we look at the amount of children based on contraceptive use by location.

m2_lme <- lmer(livchild ~ urbanres + (urbanres|contrause), data = fertility)
summary(m2_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: livchild ~ urbanres + (urbanres | contrause)
##    Data: fertility
## 
## REML criterion at convergence: 6297.1
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.58190 -1.13387  0.04529  1.07471  1.30691 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev. Corr 
##  contrause (Intercept) 0.072201 0.26870       
##            urbanres    0.004123 0.06421  -1.00
##  Residual              1.510723 1.22911       
## Number of obs: 1934, groups:  contrause, 2
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   2.7568     0.1931  14.275
## urbanres     -0.2205     0.0772  -2.856
## 
## Correlation of Fixed Effects:
##          (Intr)
## urbanres -0.659

We see that less children are born in urban areas and more are born in rural areas. The contraceptive use is .07, so not very high in areas of urban residence. There is a very large residual with 1.51.

To see about additional factors, we also consider age on contraceptive use

m3_lme <- lmer(livchild ~ urbanres + (urbanres|contrause) + age, data = fertility)
m4_lme <- lmer(livchild ~ urbanres*age + (urbanres|contrause), data = fertility)
screenreg(list(m3_lme, m4_lme))
## 
## ===========================================================
##                                  Model 1       Model 2     
## -----------------------------------------------------------
## (Intercept)                          2.74 ***      2.74 ***
##                                     (0.15)        (0.15)   
## urbanres                            -0.18 ***     -0.18 ***
##                                     (0.05)        (0.05)   
## age                                  0.09 ***      0.09 ***
##                                     (0.00)        (0.00)   
## urbanres:age                                       0.00    
##                                                   (0.01)   
## -----------------------------------------------------------
## AIC                               5083.14       5093.18    
## BIC                               5122.11       5137.72    
## Log Likelihood                   -2534.57      -2538.59    
## Num. obs.                         1934          1934       
## Num. groups: contrause               2             2       
## Variance: contrause.(Intercept)      0.05          0.05    
## Variance: contrause.urbanres         0.00          0.00    
## Variance: Residual                   0.80          0.80    
## ===========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

We see that there does not appear to be a significant relationship between area of residence and age. We see according to Model 1 that there is a negative affect on the number of kids and living in an urban setting. We see that age doesn’t have any different affect either under Model 2.
The next step is to center the data:

library(magrittr)
fertility %<>% mutate(meanage = age - mean(age))
m5_lme <- lmer(livchild ~ urbanres*meanage + (urbanres|contrause), data = fertility)
screenreg(list(m4_lme, m5_lme))
## 
## ===========================================================
##                                  Model 1       Model 2     
## -----------------------------------------------------------
## (Intercept)                          2.74 ***      2.74 ***
##                                     (0.15)        (0.15)   
## urbanres                            -0.18 ***     -0.18 ***
##                                     (0.05)        (0.05)   
## age                                  0.09 ***              
##                                     (0.00)                 
## urbanres:age                         0.00                  
##                                     (0.01)                 
## meanage                                            0.09 ***
##                                                   (0.00)   
## urbanres:meanage                                   0.00    
##                                                   (0.01)   
## -----------------------------------------------------------
## AIC                               5093.18       5093.18    
## BIC                               5137.72       5137.72    
## Log Likelihood                   -2538.59      -2538.59    
## Num. obs.                         1934          1934       
## Num. groups: contrause               2             2       
## Variance: contrause.(Intercept)      0.05          0.05    
## Variance: contrause.urbanres         0.00          0.00    
## Variance: Residual                   0.80          0.80    
## ===========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

We can see there is still a negative effect on the number of children in an urban setting when the age is averaged. I think it could also be helpful to see if the number of children is actually related to contraceptive use.

m6_lme <- lmer(livchild ~ 1 + (1|contrause), data=fertility)
summary(m6_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: livchild ~ 1 + (1 | contrause)
##    Data: fertility
## 
## REML criterion at convergence: 6306
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4943 -1.2403  0.1279  0.9390  1.1931 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  contrause (Intercept) 0.05064  0.225   
##  Residual              1.51997  1.233   
## Number of obs: 1934, groups:  contrause, 2
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   2.6857     0.1617   16.61

We calculate the Interclass Correlation:

ICC <- ((0.05064)/(0.05064+1.51997))
ICC
## [1] 0.03224225

The small number shows that only 3% of the determining for number of children has to do with contraceptive use. On the other hand, we find over half of the factors determining the amount of children a woman has has to do with her age:

m7_lme <- lmer(livchild ~ 1 + (1|age), data=fertility)
summary(m7_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: livchild ~ 1 + (1 | age)
##    Data: fertility
## 
## REML criterion at convergence: 4990
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3516 -0.4782  0.1798  0.6657  3.0969 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  age      (Intercept) 0.9816   0.9907  
##  Residual             0.7041   0.8391  
## Number of obs: 1934, groups:  age, 50
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   2.6748     0.1447   18.49
ICC2 <- ((0.9816)/(0.9816+0.7041))
ICC2
## [1] 0.58231