Introduction

The data I will be using for this analysis is from the 1988 Bangladesh Fertility Survey. I am exclusively using a sub-sample of 1,934 women grouped in 60 various districts. I am interested to see if contraceptive use of women varies by districts, age, and number of children. Unique features of this data are the district and the individual woman. Women are nested within districts. Districts provide a context for women.

Complete-pooling Model

How to cite this model in Zelig: Kosuke Imai, Gary King, and Oliva Lau. 2007. “normal: Normal Regression for Continuous Dependent Variables” in Kosuke Imai, Gary King, and Olivia Lau, “Zelig: Everyone’s Statistical Software,” http://gking.harvard.edu/zelig

Dependent variable:
use
age 0.002 (0.001)
Constant 0.392*** (0.011)
Observations 1,934
Log Likelihood -1,358.061
Akaike Inf. Crit. 2,720.122
Note: p<0.1; p<0.05; p<0.01


The table above tells us that contraceptive use increases with age and this is significant. However, this model does not take into consideration the nesting structure of this data-set such as district and number of living children. Hence, further analysis is necessary.

No-pooling Model

As seen below the histogram of contraceptive use in relation to living children produces normally distributed results:

dcoef <- bang %>% 
  group_by(district) %>% 
  do(mod = lm(use ~ lc, data = .))
coef <- dcoef %>% do(data.frame(lc = coef(.$mod)[2]))
hist(coef$lc)

coef <- data.frame(coef)
Desc(coef)
## 
## -------------------------------------------------------------------------
## 'data.frame':    60 obs. of  1 variable:
##  1 $ lc: num  1.99e-02 1.03e-01 -1.05e-16 6.76e-02 2.23e-02 ...
## 
## ------------------------------------------------------------------------- 
## 1 - lc (numeric)
## 
##   length      n    NAs unique     0s   mean meanSE
##       60     60      0     59      2  0.053  0.013
## 
##      .05    .10    .25 median    .75    .90    .95
##   -0.079 -0.063  0.001  0.049  0.106  0.165  0.190
## 
##      rng     sd  vcoef    mad    IQR   skew   kurt
##    0.558  0.098  1.839  0.074  0.105  0.001  1.660
##  
## lowest : -0.2272727, -0.217759, -0.07988529, -0.07918149, -0.07239459
## highest: 0.1710526, 0.1894957, 0.1935841, 0.3235294, 0.3303571
## 
## Shapiro-Wilks normality test  p.value : 0.022506

Partial-Pooling:

Multilevel model #1

m1_lme <- lmer(use ~ age + (1|district), data = bang)
summary(m1_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: use ~ age + (1 | district)
##    Data: bang
## 
## REML criterion at convergence: 2673.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3330 -0.8297 -0.5748  1.1374  1.6543 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  district (Intercept) 0.01276  0.113   
##  Residual             0.22466  0.474   
## Number of obs: 1934, groups:  district, 60
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 0.375453   0.019041  19.719
## age         0.001918   0.001208   1.588
## 
## Correlation of Fixed Effects:
##     (Intr)
## age -0.008

Multilevel Model #2

m2_lme <- lmer(use ~ age + (age|district), data = bang)
summary(m2_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: use ~ age + (age | district)
##    Data: bang
## 
## REML criterion at convergence: 2671.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4695 -0.8214 -0.5763  1.1285  1.6280 
## 
## Random effects:
##  Groups   Name        Variance  Std.Dev. Corr
##  district (Intercept) 1.268e-02 0.112610     
##           age         8.134e-06 0.002852 0.68
##  Residual             2.240e-01 0.473264     
## Number of obs: 1934, groups:  district, 60
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 0.376035   0.018979  19.813
## age         0.001741   0.001276   1.364
## 
## Correlation of Fixed Effects:
##     (Intr)
## age 0.165


library(lme4)
stargazer(cpooling, m1_lme, m2_lme, type = "html")
Dependent variable:
use
normal linear
mixed-effects
(1) (2) (3)
age 0.002 0.002 0.002
(0.001) (0.001) (0.001)
Constant 0.392*** 0.375*** 0.376***
(0.011) (0.019) (0.019)
Observations 1,934 1,934 1,934
Log Likelihood -1,358.061 -1,336.895 -1,335.765
Akaike Inf. Crit. 2,720.122 2,681.790 2,683.530
Bayesian Inf. Crit. 2,704.060 2,716.934
Note: p<0.1; p<0.05; p<0.01


library(texreg)
screenreg(list(m1_lme, m2_lme))
## 
## ==========================================================
##                                 Model 1       Model 2     
## ----------------------------------------------------------
## (Intercept)                         0.38 ***      0.38 ***
##                                    (0.02)        (0.02)   
## age                                 0.00          0.00    
##                                    (0.00)        (0.00)   
## ----------------------------------------------------------
## AIC                              2681.79       2683.53    
## BIC                              2704.06       2716.93    
## Log Likelihood                  -1336.90      -1335.76    
## Num. obs.                        1934          1934       
## Num. groups: district              60            60       
## Variance: district.(Intercept)      0.01          0.01    
## Variance: Residual                  0.22          0.22    
## Variance: district.age                            0.00    
## ==========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Multilevel Models 3&4

m3_lme <- lmer(use ~ age + (age|district) + lc, data = bang)
m4_lme <- lmer(use ~ age*lc + (age|district), data = bang)
screenreg(list(m3_lme, m4_lme))
## 
## ==========================================================
##                                 Model 1       Model 2     
## ----------------------------------------------------------
## (Intercept)                         0.18 ***      0.25 ***
##                                    (0.03)        (0.04)   
## age                                -0.01 ***      0.01 ** 
##                                    (0.00)        (0.00)   
## lc                                  0.08 ***      0.07 ***
##                                    (0.01)        (0.01)   
## age:lc                                           -0.01 ***
##                                                  (0.00)   
## ----------------------------------------------------------
## AIC                              2711.23       2700.56    
## BIC                              2750.20       2745.10    
## Log Likelihood                  -1348.62      -1342.28    
## Num. obs.                        1934          1934       
## Num. groups: district              60            60       
## Variance: district.(Intercept)      0.00          0.00    
## Variance: district.age              0.00          0.00    
## Variance: Residual                  0.23          0.23    
## ==========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Centering the Independent Variables

m3_lme <- lmer(use ~ age + (age|district) + lc, data = bang)
m4a_lme <- lmer(use ~ age*clc + (age|district), data = bang)
screenreg(list(m4_lme, m4a_lme))
## 
## ==========================================================
##                                 Model 1       Model 2     
## ----------------------------------------------------------
## (Intercept)                         0.25 ***      0.44 ***
##                                    (0.04)        (0.01)   
## age                                 0.01 **      -0.00 ** 
##                                    (0.00)        (0.00)   
## lc                                  0.07 ***              
##                                    (0.01)                 
## age:lc                             -0.01 ***              
##                                    (0.00)                 
## clc                                               0.07 ***
##                                                  (0.01)   
## age:clc                                          -0.01 ***
##                                                  (0.00)   
## ----------------------------------------------------------
## AIC                              2700.56       2700.56    
## BIC                              2745.10       2745.10    
## Log Likelihood                  -1342.28      -1342.28    
## Num. obs.                        1934          1934       
## Num. groups: district              60            60       
## Variance: district.(Intercept)      0.00          0.00    
## Variance: district.age              0.00          0.00    
## Variance: Residual                  0.23          0.23    
## ==========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Intra-class Correlation:

Is contraception use mainly an indivdual-level or district-level phenonmenon?

m0_lme <- lmer(use ~ 1 + (1|district), data = bang)
summary(m0_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: use ~ 1 + (1 | district)
##    Data: bang
## 
## REML criterion at convergence: 2664.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2531 -0.8063 -0.5738  1.1268  1.6078 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  district (Intercept) 0.01267  0.1126  
##  Residual             0.22487  0.4742  
## Number of obs: 1934, groups:  district, 60
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   0.3757     0.0190   19.77

Intra-class correlation: 0.01267/0.01267 + 0.22487 = 0.05334

Conclusion

In conclusion, the district variable has some impact on weather a woman uses contraceptives, with an ICC of .053. This makes sense as certain areas have better access and availability for women to obtain contraceptives. Additionally, as the number of children a woman had went up, so did the likelihood of contraceptive usage significantly. Age was less significant than number of children in general. However, further analysis proved that age was significantly negatively correlated to number of living children in relation to contraceptive usage. Meaning younger women, with less children are the least likely to use contraceptives. Further analysis should look at socio-economic status and education in relation to contraceptive usage as many woman around the world still aren’t provided with the basic knowledge to make such decisions.