Kori Nicolai

library (tidyverse)
## -- Attaching packages ------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.3     v dplyr   1.0.2
## v tidyr   1.1.1     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## -- Conflicts ---------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(psych)
## 
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha
library(lme4)
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack

```r
schoolsmoke <- read_csv("EDUS 651/Module 4/schoolsmoke.csv")
## Parsed with column specification:
## cols(
##   school = col_double(),
##   thk = col_double(),
##   prethk = col_double(),
##   cc = col_double(),
##   gprethk = col_double(),
##   d = col_double(),
##   d2 = col_double(),
##   d3 = col_double(),
##   d4 = col_double(),
##   d5 = col_double()
## )

1.Calculate a composite scale score using the 5 peer pressure items (peer1-peer5). Estimate Cronbach’s alpha for this scale and report and interpret the results. Should we incorporate this measure into our analysis going forward?

my.keys.peerpressure <- list(peerpressure = c("d", "d2", "d3", "d4", "d5"))
my.scale.peerpressure <- scoreItems(my.keys.peerpressure, schoolsmoke, impute = "none")
## Warning in sqrt(corrected.var * item.var): NaNs produced
print(my.scale.peerpressure, short = FALSE)
## Call: scoreItems(keys = my.keys.peerpressure, items = schoolsmoke, 
##     impute = "none")
## 
## (Standardized) Alpha:
##       peerpressure
## alpha        -0.07
## 
## Standard errors of unstandardized Alpha:
##       peerpressure
## ASE          0.041
## 
## Standardized Alpha of observed scales:
##      peerpressure
## [1,]        -0.07
## 
## Average item correlation:
##           peerpressure
## average.r       -0.013
## 
## Median item correlation:
## peerpressure 
##       -0.011 
## 
##  Guttman 6* reliability: 
##          peerpressure
## Lambda.6       -0.052
## 
## Signal/Noise based upon av.r : 
##              peerpressure
## Signal/Noise       -0.065
## 
## Scale intercorrelations corrected for attenuation 
##  raw correlations below the diagonal, alpha on the diagonal 
##  corrected correlations above the diagonal:
## 
## Note that these are the correlations of the complete scales based on the correlation matrix,
##  not the observed scales based on the raw items.
##              peerpressure
## peerpressure        -0.07
## 
## Item by scale correlations:
##  corrected for item overlap and scale reliability
##    peerpressure
## d           NaN
## d2          NaN
## d3          NaN
## d4          NaN
## d5          NaN
## 
## Non missing response frequency for each item
##       0    1    2    3    4 miss
## d  0.10 0.22 0.38 0.20 0.10    0
## d2 0.11 0.18 0.43 0.18 0.09    0
## d3 0.10 0.23 0.38 0.18 0.10    0
## d4 0.10 0.20 0.40 0.20 0.10    0
## d5 0.10 0.20 0.40 0.20 0.10    0

The scale has a Cronbach alpha of -0.07. I am slightly confused by this because I have never seen a negative Cronbach alph. The reliability of this scale is very low and I would not use it going forward.

  1. Treating “thk” as the DV, and “school” as the level-2 clustering variable (we will ignore classroom clustering for now- stay tuned!), estimate a null model. Compute, report, and interpret the ICC, as well as the likelihood ratio test at the bottom of the output. Is it worth conducting MLM on these data?
model.0 <- lmer(thk ~ (1|school), REML = FALSE, data = schoolsmoke)
summary(model.0)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: thk ~ (1 | school)
##    Data: schoolsmoke
## 
##      AIC      BIC   logLik deviance df.resid 
##   4833.0   4849.1  -2413.5   4827.0     1597 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.9039 -0.7734  0.0693  0.9240  1.7430 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  school   (Intercept) 0.09438  0.3072  
##  Residual             1.16219  1.0780  
## Number of obs: 1600, groups:  school, 28
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  2.60750    0.06538   39.88
ICC <- 0.09438/(0.09438 + 1.16219)

The ICC is 0.075, we should run a multilevel model because 7% of the variance in what students know about tobacco and health is at the school level.

  1. Interpret the estimated fixed effect of the intercept in the null model. What is this number measuring- what does it mean?

The fixed effect of the intercept is 2.61. This means that when we consider the clustering at the school level the mean of what students know about tabacco and health is 2.61 on the measure used, which is a little higher than the halfway point on the measure where a higher number equals more knowledge.

4.Now, add “prethk” as a level one covariate. Interpret the coefficient estimate. Report what happens to the level-1 and level-2 variances of our random terms. Do they go up or down, and by how much? What does this mean?

model.1 <- lmer(thk ~ prethk + (1|school), REML = FALSE, data = schoolsmoke)
summary(model.1)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: thk ~ prethk + (1 | school)
##    Data: schoolsmoke
## 
##      AIC      BIC   logLik deviance df.resid 
##   4733.5   4755.0  -2362.7   4725.5     1596 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.12273 -0.82727  0.03082  0.83756  2.15165 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  school   (Intercept) 0.07519  0.2742  
##  Residual             1.09321  1.0456  
## Number of obs: 1600, groups:  school, 28
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  2.15164    0.07414   29.02
## prethk       0.21747    0.02122   10.25
## 
## Correlation of Fixed Effects:
##        (Intr)
## prethk -0.598

The coefficient estimate for prethk is 0.22 and is significant. This means that for each additional 1 unit of prethk, thk is 0.22 higher. Both variances at the school and individual level go down, the school level variance goes down from 0.09 to 0.07 and at the individual level the variance goes down form 1.16 to 1.09. This means with the addition of the prethk variable to the model less variance is unexplained at the school and individual level.

  1. Finally, add “cc” and “gprethk” as level-2 predictors to your model. Interpret the results. Explain how the estimates for prethk and gprethk are related to one another, and also explain what this suggests about the impact of school context on an individual student’s knowledge. Report what happens to the level-1 and level-2 variances of our random terms. Do they go up or down, and by how much?
model.2 <- lmer(thk ~ prethk + cc + gprethk + (1|school), REML = FALSE, data = schoolsmoke)
summary(model.2)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: thk ~ prethk + cc + gprethk + (1 | school)
##    Data: schoolsmoke
## 
##      AIC      BIC   logLik deviance df.resid 
##   4712.6   4744.9  -2350.3   4700.6     1594 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.23332 -0.79567  0.02374  0.79254  2.16704 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  school   (Intercept) 0.01734  0.1317  
##  Residual             1.09334  1.0456  
## Number of obs: 1600, groups:  school, 28
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  1.07609    0.25478   4.224
## prethk       0.21260    0.02138   9.944
## cc           0.43075    0.07554   5.702
## gprethk      0.41835    0.11930   3.507
## 
## Correlation of Fixed Effects:
##         (Intr) prethk cc    
## prethk   0.000              
## cc      -0.288  0.000       
## gprethk -0.963 -0.179  0.148

Gprethk is the group average of prethk. Additionally, the gprethk has a greater impact than prethk. While adjusting for cc and prethk, for each additional 1 unit of gprethk, thk is 0.42 higher, compared to the 0.21 increase in thk resulting from an additional unit of prethk.This suggest the school context of grethk has a higher impact on individual student knowledge. The variance at the school level goes down by about .058 and about 0.017 of the variance unexplained by the model. The variance at the individual level actually goes up, although only slightly, by 0.00013