This is a second attempt, as I’m not sure that the data I selected the first time was relevant to the calculations made. Hopefully this attempt was better.
library(Zelig)
library(foreign)
library(knitr)
library(stargazer)
library(dplyr)
library(car)
library(DescTools)
library(lme4)
library(texreg)
The dataset I am working with is of a 1988 Bangladesh Fertility study. The dataset has 1934 women in 60 districts. Here I will be using the data to show how contraceptive use, location, and age may affect the amount of children a woman has.
N <- read.table("/Users/laurenberkowitz/Downloads/datasets/bang.txt")
fertility <- select(N, "womanid"=V1, "district"=V2, "contrause"=V3, "livchild"=V4, "age"=V5, "urbanres"=V6, "constant"=V7)
names(fertility)
## [1] "womanid" "district" "contrause" "livchild" "age" "urbanres"
## [7] "constant"
fertility <- na.omit(fertility)
We decided that our dependent variable will be contraceptive use, athe amount of children based on contraceptive use by location. Locations are grouped by district, and identified as urban or rural.
m2_lme <- lmer(contrause ~ livchild + (livchild|urbanres), data = fertility)
summary(m2_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: contrause ~ livchild + (livchild | urbanres)
## Data: fertility
##
## REML criterion at convergence: 2646.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.2202 -0.8715 -0.5172 1.1705 1.5768
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## urbanres (Intercept) 2.100e-02 0.14490
## livchild 4.264e-05 0.00653 -1.00
## Residual 2.281e-01 0.47755
## Number of obs: 1934, groups: urbanres, 2
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.292191 0.105693 2.765
## livchild 0.051818 0.009889 5.240
##
## Correlation of Fixed Effects:
## (Intr)
## livchild -0.646
We see that with every child there is 2.100e-02 to describe variance related to contraceptive use, with 2.100e-02 designated to the intercept of urban residence affecting contraceptive use, with a residual of 2.281e-01. To see about additional factors, we also consider age on contraceptive use
m3_lme <- lmer(contrause ~ livchild + (livchild|urbanres) + age, data = fertility)
m4_lme <- lmer(contrause ~ livchild*age + (livchild|urbanres), data = fertility)
screenreg(list(m3_lme, m4_lme))
##
## ==========================================================
## Model 1 Model 2
## ----------------------------------------------------------
## (Intercept) 0.21 0.27 **
## (0.11) (0.11)
## livchild 0.08 *** 0.07 ***
## (0.01) (0.01)
## age -0.01 *** 0.01 *
## (0.00) (0.00)
## livchild:age -0.01 ***
## (0.00)
## ----------------------------------------------------------
## AIC 2656.76 2648.26
## BIC 2695.73 2692.80
## Log Likelihood -1321.38 -1316.13
## Num. obs. 1934 1934
## Num. groups: urbanres 2 2
## Variance: urbanres.(Intercept) 0.02 0.02
## Variance: urbanres.livchild 0.00 0.00
## Variance: Residual 0.23 0.22
## ==========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
We see that age does appear to have a significant effect on the use of contraception for this model. The next step is to center the data:
library(magrittr)
fertility %<>% mutate(meanage = age - mean(age))
m5_lme <- lmer(contrause ~ livchild*meanage + (livchild|urbanres), data = fertility)
screenreg(list(m4_lme, m5_lme))
##
## ==========================================================
## Model 1 Model 2
## ----------------------------------------------------------
## (Intercept) 0.27 ** 0.27 **
## (0.11) (0.11)
## livchild 0.07 *** 0.07 ***
## (0.01) (0.01)
## age 0.01 *
## (0.00)
## livchild:age -0.01 ***
## (0.00)
## meanage 0.01 *
## (0.00)
## livchild:meanage -0.01 ***
## (0.00)
## ----------------------------------------------------------
## AIC 2648.26 2648.26
## BIC 2692.80 2692.80
## Log Likelihood -1316.13 -1316.13
## Num. obs. 1934 1934
## Num. groups: urbanres 2 2
## Variance: urbanres.(Intercept) 0.02 0.02
## Variance: urbanres.livchild 0.00 0.00
## Variance: Residual 0.22 0.22
## ==========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
We see that .07 means that when an average age of a mother is calculated, she will use conraception by .07. When not accounting for the age of the mother, the likelihood of using contraception is also .07 and these are statistically signficant calculations. I think it could also be helpful to see if the number of children is actually related to contraceptive use.
m6_lme <- lmer(contrause ~ 1 + (1|urbanres), data=fertility)
summary(m6_lme)
## Linear mixed model fit by REML ['lmerMod']
## Formula: contrause ~ 1 + (1 | urbanres)
## Data: fertility
##
## REML criterion at convergence: 2676.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.0654 -0.7111 -0.7111 1.0088 1.3632
##
## Random effects:
## Groups Name Variance Std.Dev.
## urbanres (Intercept) 0.01488 0.1220
## Residual 0.23242 0.4821
## Number of obs: 1934, groups: urbanres, 2
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.42822 0.08709 4.917
We calculate the Interclass Correlation:
ICC <- ((0.01488)/(0.01488+0.23242))
ICC
## [1] 0.06016983
The small number shows that only 6% of the determining for contraceptive use has to do with whether or not women live in an urban location. This also explains the very low numbers we had for all of the related variables.