This week, we will continue to work with the Scotland dataset (scotland.dta) from Leckie (2013). This is a sample of more than 2,000 Scottish primary school students nested within 17 schools and also nested within 500+ neighborhoods.

load in packages

library(tidyverse)

## ── Attaching packages ────────────────────────────────────────────────────── tidyverse 1.3.0 ──

## ✓ ggplot2 3.3.2     ✓ purrr   0.3.4
## ✓ tibble  3.0.3     ✓ dplyr   1.0.2
## ✓ tidyr   1.1.1     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0

## ── Conflicts ───────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(Hmisc)

## Loading required package: lattice

## Loading required package: survival

## Loading required package: Formula

## 
## Attaching package: 'Hmisc'

## The following objects are masked from 'package:dplyr':
## 
##     src, summarize

## The following objects are masked from 'package:base':
## 
##     format.pval, units

library(lme4)

## Loading required package: Matrix

## 
## Attaching package: 'Matrix'

## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack

Part 1: Loading in the Data, and then Recoding Variables

Load in the Data

scotland <- haven::read_dta("scotland.dta")

glimpse(scotland)

## Rows: 2,310
## Columns: 11
## $ neighid  <dbl> 26, 26, 26, 26, 26, 27, 29, 29, 29, 29, 29, 29, 29, 29, 29, …
## $ schid    <dbl> 20, 20, 20, 20, 20, 20, 18, 20, 20, 20, 20, 20, 20, 20, 20, …
## $ attain   <dbl> 1.5177, -1.3276, 0.5610, 1.5177, -1.3276, -0.1325, 0.0293, 0…
## $ p7vrq    <dbl> 17.972, -10.028, 2.972, 1.972, -1.028, 3.972, 8.972, -0.028,…
## $ p7read   <dbl> 17.134, -27.866, 6.134, 11.134, -0.866, -0.866, 6.134, -5.86…
## $ dadocc   <dbl> 16.196, -3.454, 2.316, -9.094, -3.454, -3.454, 16.196, -3.45…
## $ dadunemp <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ daded    <dbl> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, …
## $ momed    <dbl> 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, …
## $ male     <dbl> 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, …
## $ deprive  <dbl> -0.551, -0.551, -0.551, -0.551, -0.551, 0.147, -0.083, -0.08…

The `describe` function from `Hmisc` is a great codebook…

describe(scotland)

## scotland 
## 
##  11  Variables      2310  Observations
## --------------------------------------------------------------------------------
## neighid  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0      524        1    495.3    305.5     61.0    143.9 
##      .25      .50      .75      .90      .95 
##    240.2    530.0    707.0    808.0    861.0 
## 
## lowest :   26   27   29   30   31, highest: 1092 1095 1096 1097 1098
## --------------------------------------------------------------------------------
## schid  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0       17    0.995    10.01    7.174        0        2 
##      .25      .50      .75      .90      .95 
##        5        9       16       19       20 
## 
## lowest :  0  1  2  3  5, highest: 16 17 18 19 20
##                                                                             
## Value          0     1     2     3     5     6     7     8     9    10    13
## Frequency    146    22   146   159   155   101   286   112   136   133    92
## Proportion 0.063 0.010 0.063 0.069 0.067 0.044 0.124 0.048 0.059 0.058 0.040
##                                               
## Value         15    16    17    18    19    20
## Frequency    190   111   154    91   102   174
## Proportion 0.082 0.048 0.067 0.039 0.044 0.075
## --------------------------------------------------------------------------------
## attain  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0       14    0.987   0.0934    1.124  -1.3276  -1.3276 
##      .25      .50      .75      .90      .95 
##  -0.5812   0.1582   0.7350   1.5177   1.5177 
## 
## lowest : -1.3276 -0.5812 -0.3600 -0.1325  0.0293
## highest:  0.7350  0.9127  1.1405  1.5177  2.4151
## --------------------------------------------------------------------------------
## p7vrq  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0       68    0.999   0.5058    11.95  -17.028  -13.028 
##      .25      .50      .75      .90      .95 
##   -7.028   -0.028    7.972   13.972   16.972 
## 
## lowest : -27.028 -26.028 -25.028 -24.028 -23.028
## highest:  35.972  36.972  40.972  41.972  42.972
## --------------------------------------------------------------------------------
## p7read  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0       61        1 -0.04435     15.8  -23.866  -17.866 
##      .25      .50      .75      .90      .95 
##   -9.866   -0.866    9.134   19.134   24.134 
## 
## lowest : -31.866 -30.866 -29.866 -28.866 -27.866
## highest:  24.134  25.134  26.134  27.134  28.134
## --------------------------------------------------------------------------------
## dadocc  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd 
##     2310        0        7    0.933  -0.4642    12.33 
## 
## lowest : -23.454 -11.494  -9.094  -3.454   2.316
## highest:  -9.094  -3.454   2.316  16.196  29.226
##                                                                   
## Value      -23.454 -11.494  -9.094  -3.454   2.316  16.196  29.226
## Frequency       91     285     303     884     242     397     108
## Proportion   0.039   0.123   0.131   0.383   0.105   0.172   0.047
## --------------------------------------------------------------------------------
## dadunemp  Format:%12.0g 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##     2310        0        2    0.292      252   0.1091   0.1945 
## 
## --------------------------------------------------------------------------------
## daded  Format:%12.0g 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##     2310        0        2    0.507      497   0.2152   0.3379 
## 
## --------------------------------------------------------------------------------
## momed  Format:%12.0g 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##     2310        0        2     0.56      574   0.2485   0.3736 
## 
## --------------------------------------------------------------------------------
## male  Format:%12.0g 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##     2310        0        2    0.749     1109   0.4801   0.4994 
## 
## --------------------------------------------------------------------------------
## deprive  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0      458        1  0.02167   0.6664  -0.8250  -0.6930 
##      .25      .50      .75      .90      .95 
##  -0.3970  -0.0620   0.2957   0.8410   1.1400 
## 
## lowest : -1.082 -1.048 -1.030 -0.983 -0.975, highest:  2.330  2.419  2.438  2.498  2.959
## --------------------------------------------------------------------------------

Part One: Data Exploration and Coding/ Labelling

Start with variable labelling using the `label` function

label(scotland$neighid)="Neighborhood ID"
label(scotland$schid)="School ID"
label(scotland$attain)="Student Measure of Educational Attainment"
label(scotland$p7vrq)="Primary 7 Verbal Reasoning Quotient"
label(scotland$p7read)="Primary 7 Reading Test Scores"
label(scotland$dadocc)="School Mean of Dad's Occupation Score on Hope-Goldthorpe Scale"
label(scotland$dadunemp)="Dad Currently Unemployed"
label(scotland$daded)="Dad School After Age 15"
label(scotland$momed)="Mom School After Age 15"
label(scotland$male)="Male"
label(scotland$deprive)="Neighborhood Deprivation Score (Poverty, Health, and Housing)"

#Clean up data, make new data set and turn category variables into factor w/the following: ##At the student level: momed and daded; At the school level: dadocc (continuous); At neighborhood level: deprive (continuous).

scot.clean <- scotland %>%
  mutate(., 
         daded.fac = as_factor(daded),
         momed.fac = as_factor(momed))

levels(scot.clean$daded.fac) = c("No","Yes")
levels(scot.clean$momed.fac) = c("No","Yes")

glimpse(scot.clean)

## Rows: 2,310
## Columns: 13
## $ neighid   <labelled> 26, 26, 26, 26, 26, 27, 29, 29, 29, 29, 29, 29, 29, 29…
## $ schid     <labelled> 20, 20, 20, 20, 20, 20, 18, 20, 20, 20, 20, 20, 20, 20…
## $ attain    <labelled> 1.5177, -1.3276, 0.5610, 1.5177, -1.3276, -0.1325, 0.0…
## $ p7vrq     <labelled> 17.972, -10.028, 2.972, 1.972, -1.028, 3.972, 8.972, -…
## $ p7read    <labelled> 17.134, -27.866, 6.134, 11.134, -0.866, -0.866, 6.134,…
## $ dadocc    <labelled> 16.196, -3.454, 2.316, -9.094, -3.454, -3.454, 16.196,…
## $ dadunemp  <labelled> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ daded     <labelled> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, …
## $ momed     <labelled> 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, …
## $ male      <labelled> 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, …
## $ deprive   <labelled> -0.551, -0.551, -0.551, -0.551, -0.551, 0.147, -0.083,…
## $ daded.fac <fct> No, No, No, No, No, No, Yes, No, No, No, No, No, No, No, No…
## $ momed.fac <fct> No, No, No, No, No, Yes, Yes, No, No, No, No, No, No, No, N…

describe(scot.clean)

## scot.clean 
## 
##  13  Variables      2310  Observations
## --------------------------------------------------------------------------------
## neighid : Neighborhood ID  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0      524        1    495.3    305.5     61.0    143.9 
##      .25      .50      .75      .90      .95 
##    240.2    530.0    707.0    808.0    861.0 
## 
## lowest :   26   27   29   30   31, highest: 1092 1095 1096 1097 1098
## --------------------------------------------------------------------------------
## schid : School ID  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0       17    0.995    10.01    7.174        0        2 
##      .25      .50      .75      .90      .95 
##        5        9       16       19       20 
## 
## lowest :  0  1  2  3  5, highest: 16 17 18 19 20
##                                                                             
## Value          0     1     2     3     5     6     7     8     9    10    13
## Frequency    146    22   146   159   155   101   286   112   136   133    92
## Proportion 0.063 0.010 0.063 0.069 0.067 0.044 0.124 0.048 0.059 0.058 0.040
##                                               
## Value         15    16    17    18    19    20
## Frequency    190   111   154    91   102   174
## Proportion 0.082 0.048 0.067 0.039 0.044 0.075
## --------------------------------------------------------------------------------
## attain : Student Measure of Educational Attainment  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0       14    0.987   0.0934    1.124  -1.3276  -1.3276 
##      .25      .50      .75      .90      .95 
##  -0.5812   0.1582   0.7350   1.5177   1.5177 
## 
## lowest : -1.3276 -0.5812 -0.3600 -0.1325  0.0293
## highest:  0.7350  0.9127  1.1405  1.5177  2.4151
## --------------------------------------------------------------------------------
## p7vrq : Primary 7 Verbal Reasoning Quotient  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0       68    0.999   0.5058    11.95  -17.028  -13.028 
##      .25      .50      .75      .90      .95 
##   -7.028   -0.028    7.972   13.972   16.972 
## 
## lowest : -27.028 -26.028 -25.028 -24.028 -23.028
## highest:  35.972  36.972  40.972  41.972  42.972
## --------------------------------------------------------------------------------
## p7read : Primary 7 Reading Test Scores  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0       61        1 -0.04435     15.8  -23.866  -17.866 
##      .25      .50      .75      .90      .95 
##   -9.866   -0.866    9.134   19.134   24.134 
## 
## lowest : -31.866 -30.866 -29.866 -28.866 -27.866
## highest:  24.134  25.134  26.134  27.134  28.134
## --------------------------------------------------------------------------------
## dadocc : School Mean of Dad's Occupation Score on Hope-Goldthorpe Scale  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd 
##     2310        0        7    0.933  -0.4642    12.33 
## 
## lowest : -23.454 -11.494  -9.094  -3.454   2.316
## highest:  -9.094  -3.454   2.316  16.196  29.226
##                                                                   
## Value      -23.454 -11.494  -9.094  -3.454   2.316  16.196  29.226
## Frequency       91     285     303     884     242     397     108
## Proportion   0.039   0.123   0.131   0.383   0.105   0.172   0.047
## --------------------------------------------------------------------------------
## dadunemp : Dad Currently Unemployed  Format:%12.0g 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##     2310        0        2    0.292      252   0.1091   0.1945 
## 
## --------------------------------------------------------------------------------
## daded : Dad School After Age 15  Format:%12.0g 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##     2310        0        2    0.507      497   0.2152   0.3379 
## 
## --------------------------------------------------------------------------------
## momed : Mom School After Age 15  Format:%12.0g 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##     2310        0        2     0.56      574   0.2485   0.3736 
## 
## --------------------------------------------------------------------------------
## male : Male  Format:%12.0g 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##     2310        0        2    0.749     1109   0.4801   0.4994 
## 
## --------------------------------------------------------------------------------
## deprive : Neighborhood Deprivation Score (Poverty, Health, and Housing)  Format:%12.0g 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2310        0      458        1  0.02167   0.6664  -0.8250  -0.6930 
##      .25      .50      .75      .90      .95 
##  -0.3970  -0.0620   0.2957   0.8410   1.1400 
## 
## lowest : -1.082 -1.048 -1.030 -0.983 -0.975, highest:  2.330  2.419  2.438  2.498  2.959
## --------------------------------------------------------------------------------
## daded.fac : Dad School After Age 15 
##        n  missing distinct 
##     2310        0        2 
##                       
## Value         No   Yes
## Frequency   1813   497
## Proportion 0.785 0.215
## --------------------------------------------------------------------------------
## momed.fac : Mom School After Age 15 
##        n  missing distinct 
##     2310        0        2 
##                       
## Value         No   Yes
## Frequency   1736   574
## Proportion 0.752 0.248
## --------------------------------------------------------------------------------

Explore our Outcome, `p7read`

ggplot(data = scot.clean, mapping = aes(x = p7read)) +
  geom_histogram(bins = 20) + 
  labs(title = "Distribution of Primary 7 Reading Test Scores",
       x = "Standardized Primary 7 Reading Test Scores") +
  theme_minimal()

###Using reading score (p7read) as the outcome, run null models for: school (schid), neighborhood (neighid), and additive null model combining school and neighborhood.

First, null school model

model.null.school <- lmer(p7read ~ (1|schid), REML = FALSE, data = scot.clean)
summary(model.null.school)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: p7read ~ (1 | schid)
##    Data: scot.clean
## 
##      AIC      BIC   logLik deviance df.resid 
##  18558.8  18576.0  -9276.4  18552.8     2307 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.7611 -0.6895 -0.0027  0.6854  2.5664 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  schid    (Intercept)  18.75    4.33   
##  Residual             176.64   13.29   
## Number of obs: 2310, groups:  schid, 17
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  -0.2912     1.0951  -0.266

Calculate School ICC

icc.school <- 18.75/(18.75 + 176.64)
icc.school

## [1] 0.09596192

#Second, null neighborhood (neighid) model

model.null.neigh <- lmer(p7read ~ (1|neighid), REML = FALSE, data = scot.clean)
summary(model.null.neigh)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: p7read ~ (1 | neighid)
##    Data: scot.clean
## 
##      AIC      BIC   logLik deviance df.resid 
##  18606.3  18623.5  -9300.1  18600.3     2307 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.65885 -0.67581 -0.01035  0.67042  2.51256 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  neighid  (Intercept)  32.89    5.735  
##  Residual             160.37   12.664  
## Number of obs: 2310, groups:  neighid, 524
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  -0.2407     0.3816  -0.631

Calculate Neighborhood ICC

icc.neigh <- 32.89/(32.89 + 160.37)
icc.neigh

## [1] 0.1701852

Null additive cross-classified model

model.null.crossed <- lmer(p7read ~ (1|schid) + (1|neighid), REML = FALSE, data = scot.clean)
summary(model.null.crossed)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: p7read ~ (1 | schid) + (1 | neighid)
##    Data: scot.clean
## 
##      AIC      BIC   logLik deviance df.resid 
##  18524.9  18547.8  -9258.4  18516.9     2306 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.61239 -0.66099  0.01158  0.66532  2.55698 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  neighid  (Intercept)  17.21    4.148  
##  schid    (Intercept)  16.59    4.074  
##  Residual             160.36   12.663  
## Number of obs: 2310, groups:  neighid, 524; schid, 17
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  -0.3701     1.0522  -0.352

Calculate Neighborhood ICC in Crossed Model

icc.neigh.crossed <- 17.21/(17.21 + 16.59 + 160.36)
icc.neigh.crossed

## [1] 0.08863824

Calculate School ICC in Crossed Model

icc.school.crossed <- 16.59/(16.59 + 17.21 + 160.36)
icc.school.crossed

## [1] 0.08544499

##Add predictors to the model: at the student level, add momed and daded; at the school level, add dadocc; and at the neighborhood level, add deprive.

Add an L1 Predictors, mom’s education after 15 yo & dad’s educ. after 15 yo

model.1 <- lmer(p7read ~ momed.fac + daded.fac + (1|schid) + (1|neighid), REML = FALSE, data = scot.clean)
summary(model.1)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: p7read ~ momed.fac + daded.fac + (1 | schid) + (1 | neighid)
##    Data: scot.clean
## 
##      AIC      BIC   logLik deviance df.resid 
##  18424.4  18458.9  -9206.2  18412.4     2304 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.99961 -0.66409  0.01792  0.66336  2.63998 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  neighid  (Intercept)  13.24    3.639  
##  schid    (Intercept)  13.06    3.614  
##  Residual             155.66   12.476  
## Number of obs: 2310, groups:  neighid, 524; schid, 17
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept)   -2.1132     0.9580  -2.206
## momed.facYes   2.4043     0.7069   3.401
## daded.facYes   5.3704     0.7504   7.157
## 
## Correlation of Fixed Effects:
##             (Intr) mmd.fY
## momed.facYs -0.110       
## daded.facYs -0.083 -0.460

Add school-level predictor: dadocc

model.2 <- lmer(p7read ~ momed.fac + daded.fac + dadocc + (1|schid) + (1|neighid), REML = FALSE, data = scot.clean)
summary(model.2)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: p7read ~ momed.fac + daded.fac + dadocc + (1 | schid) + (1 |  
##     neighid)
##    Data: scot.clean
## 
##      AIC      BIC   logLik deviance df.resid 
##  18310.2  18350.5  -9148.1  18296.2     2303 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.2651 -0.6765  0.0008  0.6506  2.6851 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  neighid  (Intercept)   7.921   2.814  
##  schid    (Intercept)   9.900   3.146  
##  Residual             151.718  12.317  
## Number of obs: 2310, groups:  neighid, 524; schid, 17
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept)  -1.38112    0.84752  -1.630
## momed.facYes  1.83651    0.69334   2.649
## daded.facYes  3.47317    0.75559   4.597
## dadocc        0.26808    0.02425  11.057
## 
## Correlation of Fixed Effects:
##             (Intr) mmd.fY ddd.fY
## momed.facYs -0.127              
## daded.facYs -0.107 -0.428       
## dadocc       0.078 -0.079 -0.242

Add neighborhood-level predictor: deprive

model.3 <- lmer(p7read ~ momed.fac + daded.fac + dadocc + deprive + (1|schid) + (1|neighid), REML = FALSE, data = scot.clean)
summary(model.3)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: p7read ~ momed.fac + daded.fac + dadocc + deprive + (1 | schid) +  
##     (1 | neighid)
##    Data: scot.clean
## 
##      AIC      BIC   logLik deviance df.resid 
##  18253.9  18299.9  -9119.0  18237.9     2302 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3799 -0.6834  0.0091  0.6557  3.1614 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  neighid  (Intercept)   3.716   1.928  
##  schid    (Intercept)   6.381   2.526  
##  Residual             151.646  12.314  
## Number of obs: 2310, groups:  neighid, 524; schid, 17
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept)  -1.15892    0.70581  -1.642
## momed.facYes  1.66801    0.68663   2.429
## daded.facYes  3.31275    0.74800   4.429
## dadocc        0.23752    0.02435   9.752
## deprive      -3.84492    0.48529  -7.923
## 
## Correlation of Fixed Effects:
##             (Intr) mmd.fY ddd.fY dadocc
## momed.facYs -0.151                     
## daded.facYs -0.128 -0.427              
## dadocc       0.085 -0.075 -0.232       
## deprive     -0.033  0.032  0.039  0.197

#Test an interaction between momed and daded.

model.4 <- lmer(p7read ~ momed.fac + daded.fac + dadocc + deprive + momed.fac:daded.fac + (1|schid) + (1|neighid), REML = FALSE, data = scot.clean)
summary(model.4)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: 
## p7read ~ momed.fac + daded.fac + dadocc + deprive + momed.fac:daded.fac +  
##     (1 | schid) + (1 | neighid)
##    Data: scot.clean
## 
##      AIC      BIC   logLik deviance df.resid 
##  18247.3  18299.0  -9114.6  18229.3     2301 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3143 -0.6715  0.0026  0.6539  3.1707 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  neighid  (Intercept)   3.513   1.874  
##  schid    (Intercept)   6.506   2.551  
##  Residual             151.222  12.297  
## Number of obs: 2310, groups:  neighid, 524; schid, 17
## 
## Fixed effects:
##                           Estimate Std. Error t value
## (Intercept)               -1.37684    0.71429  -1.928
## momed.facYes               3.12965    0.84626   3.698
## daded.facYes               5.34180    1.01514   5.262
## dadocc                     0.23530    0.02432   9.676
## deprive                   -3.75830    0.48407  -7.764
## momed.facYes:daded.facYes -4.24203    1.43861  -2.949
## 
## Correlation of Fixed Effects:
##             (Intr) mmd.fY ddd.fY dadocc depriv
## momed.facYs -0.181                            
## daded.facYs -0.163  0.143                     
## dadocc       0.087 -0.079 -0.192              
## deprive     -0.039  0.059  0.067  0.195       
## mmd.fcYs:.Y  0.103 -0.587 -0.678  0.032 -0.057

lmerTest::rand(model.4)

## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## p7read ~ momed.fac + daded.fac + dadocc + deprive + (1 | schid) + 
##     (1 | neighid) + momed.fac:daded.fac
##               npar  logLik   AIC    LRT Df Pr(>Chisq)    
## <none>           9 -9114.6 18247                         
## (1 | schid)      8 -9135.3 18287 41.426  1  1.224e-10 ***
## (1 | neighid)    8 -9115.8 18248  2.362  1     0.1243    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Use `interplot` to visualize the interaction between `deprive` and `dadocc`…did not work

Step 6: Compare Original “Final” Model, Robust Model, and Trimmed Model for Differences (Sensitivity Analysis)

library(modelsummary)

## 
## Attaching package: 'modelsummary'

## The following object is masked from 'package:Hmisc':
## 
##     Mean

library(broom.mixed)

## Registered S3 method overwritten by 'broom.mixed':
##   method      from 
##   tidy.gamlss broom

models <- list(model.null.neigh, model.null.school, model.null.crossed, model.1, model.2, model.3, model.4)
modelsummary(models, output = "markdown")

	Model 1	Model 2	Model 3	Model 4	Model 5	Model 6	Model 7
(Intercept)	-0.241	-0.291	-0.370	-2.113	-1.381	-1.159	-1.377
	(0.382)	(1.095)	(1.052)	(0.958)	(0.848)	(0.706)	(0.714)
sd__(Intercept)	5.735	4.330
sd__Observation	12.664	13.291
neighid sd__(Intercept)			4.148	3.639	2.814	1.928	1.874
schid sd__(Intercept)			4.074	3.614	3.146	2.526	2.551
Residual sd__Observation			12.663	12.476	12.317	12.314	12.297
momed.facYes				2.404	1.837	1.668	3.130
				(0.707)	(0.693)	(0.687)	(0.846)
daded.facYes				5.370	3.473	3.313	5.342
				(0.750)	(0.756)	(0.748)	(1.015)
dadocc					0.268	0.238	0.235
					(0.024)	(0.024)	(0.024)
deprive						-3.845	-3.758
						(0.485)	(0.484)
momed.facYes × daded.facYes							-4.242
							(1.439)
AIC	18606.3	18558.8	18524.9	18424.4	18310.2	18253.9	18247.3
BIC	18623.5	18576.0	18547.8	18458.9	18350.5	18299.9	18299.0
Log.Lik.	-9300.147	-9276.398	-9258.426	-9206.212	-9148.124	-9118.970	-9114.635

Mayton.Module 11 Content Review

Caitlin Mayton

2/6/2021

load in packages

Part 1: Loading in the Data, and then Recoding Variables

Load in the Data

The `describe` function from `Hmisc` is a great codebook…

Part One: Data Exploration and Coding/ Labelling

Start with variable labelling using the `label` function

Explore our Outcome, `p7read`

First, null school model

Calculate School ICC

Calculate Neighborhood ICC

Null additive cross-classified model

Calculate Neighborhood ICC in Crossed Model

Calculate School ICC in Crossed Model

Add an L1 Predictors, mom’s education after 15 yo & dad’s educ. after 15 yo

Add school-level predictor: dadocc

Add neighborhood-level predictor: deprive

Use `interplot` to visualize the interaction between `deprive` and `dadocc`…did not work

Step 6: Compare Original “Final” Model, Robust Model, and Trimmed Model for Differences (Sensitivity Analysis)

Mayton.Module 11 Content Review

Caitlin Mayton

2/6/2021

load in packages

Part 1: Loading in the Data, and then Recoding Variables

Load in the Data

The describe function from Hmisc is a great codebook…

Part One: Data Exploration and Coding/ Labelling

Start with variable labelling using the label function

Explore our Outcome, p7read

First, null school model

Calculate School ICC

Calculate Neighborhood ICC

Null additive cross-classified model

Calculate Neighborhood ICC in Crossed Model

Calculate School ICC in Crossed Model

Add an L1 Predictors, mom’s education after 15 yo & dad’s educ. after 15 yo

Add school-level predictor: dadocc

Add neighborhood-level predictor: deprive

Use interplot to visualize the interaction between deprive and dadocc…did not work

Step 6: Compare Original “Final” Model, Robust Model, and Trimmed Model for Differences (Sensitivity Analysis)

The `describe` function from `Hmisc` is a great codebook…

Start with variable labelling using the `label` function

Explore our Outcome, `p7read`

Use `interplot` to visualize the interaction between `deprive` and `dadocc`…did not work