Use the models specified in the IQ and language example to replicate the results of analysis.
pacman::p_load(mlmRev, tidyverse, lme4)
dta <- read.table('data/Homework.txt', header = T)
head(dta)
School Pupil IQ Language Group_size IQ_c School_mean Group_mean SES_c
1 1 17001 15.0 46 29 3.16594 -1.5141 5.9 -4.812
2 1 17002 14.5 45 29 2.66594 -1.5141 5.9 -17.812
3 1 17003 9.5 33 29 -2.33406 -1.5141 5.9 -12.812
4 1 17004 11.0 46 29 -0.83406 -1.5141 5.9 -4.812
5 1 17005 8.0 20 29 -3.83406 -1.5141 5.9 -17.812
6 1 17006 9.5 30 29 -2.33406 -1.5141 5.9 -17.812
summary(m1 <- lmer(Language ~ IQ_c + (1 | School), data = dta))
Linear mixed model fit by REML ['lmerMod']
Formula: Language ~ IQ_c + (1 | School)
Data: dta
REML criterion at convergence: 15256
Scaled residuals:
Min 1Q Median 3Q Max
-4.094 -0.637 0.058 0.706 3.145
Random effects:
Groups Name Variance Std.Dev.
School (Intercept) 9.6 3.1
Residual 42.2 6.5
Number of obs: 2287, groups: School, 131
Fixed effects:
Estimate Std. Error t value
(Intercept) 40.6082 0.3082 131.8
IQ_c 2.4876 0.0701 35.5
Correlation of Fixed Effects:
(Intr)
IQ_c 0.018
The variance of school is 9.6 with 3.1 SD and the estimate of centered IQ is 2.49 and SE is 0.07.
Replicate the results of analysis reported the math scores over years data example.
# load them
pacman::p_load(foreign, tidyverse, lme4, merTools)
# file location
fLoc <- "data/eg_hlm.sav"
# read spss format and covert it to data frame
dta <- read.spss(fLoc) %>% as.data.frame
# check data structure
str(dta)
'data.frame': 7230 obs. of 12 variables:
$ size : num 380 380 380 380 380 380 380 380 380 380 ...
$ lowinc : num 40.3 40.3 40.3 40.3 40.3 40.3 40.3 40.3 40.3 40.3 ...
$ mobility: num 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 ...
$ female : num 0 0 0 0 0 0 0 0 0 0 ...
$ black : num 0 0 0 0 0 0 0 0 0 0 ...
$ hispanic: num 1 1 1 0 0 0 0 0 1 1 ...
$ year : num 0.5 1.5 2.5 -1.5 -0.5 0.5 1.5 2.5 -1.5 -0.5 ...
$ grade : num 2 3 4 0 1 2 3 4 0 1 ...
$ math : num 1.146 1.134 2.3 -1.303 0.439 ...
$ retained: num 0 0 0 0 0 0 0 0 0 0 ...
$ school : Factor w/ 60 levels " 2020",..: 1 1 1 1 1 1 1 1 1 1 ...
$ cid : Factor w/ 1721 levels " 101480302",..: 244 244 244 248 248 248 248 248 253 253 ...
# set black-and-white theme
theme_set(theme_bw())
#
ggplot(dta, aes(year, math, group = cid)) +
geom_point(alpha = .1) +
geom_line(alpha = .1) +
stat_smooth(aes(group = school), method = "lm", se = F, lwd = .6) +
stat_smooth(aes(group = 1), method = "lm", se = F, col = "magenta") +
labs(x = "Year of age (centered)", y = "Math score")
# nested specification with random intercepts only
# fixed-effect for slope
summary(m1 <- lmer(math ~ year + (1 | school/cid), data = dta))
Linear mixed model fit by REML ['lmerMod']
Formula: math ~ year + (1 | school/cid)
Data: dta
REML criterion at convergence: 16759
Scaled residuals:
Min 1Q Median 3Q Max
-3.809 -0.583 -0.027 0.557 6.087
Random effects:
Groups Name Variance Std.Dev.
cid:school (Intercept) 0.670 0.818
school (Intercept) 0.187 0.432
Residual 0.347 0.589
Number of obs: 7230, groups: cid:school, 1721; school, 60
Fixed effects:
Estimate Std. Error t value
(Intercept) -0.7805 0.0611 -12.8
year 0.7461 0.0054 138.3
Correlation of Fixed Effects:
(Intr)
year -0.031
# same as
#lmer(math ~ year + (1 | school) +( 1 | school:cid) , data = dta))
#
The variance of school is 0.19 with 0.43 SD, the variance of child nested by school is 0,67 and SD is 0.82, and the effect of year is significant on math with 0.75 beta and 0.005 SE.
# random slopes
summary(m2 <- lmer(math ~ ( year | school/cid), data = dta))
Linear mixed model fit by REML ['lmerMod']
Formula: math ~ (year | school/cid)
Data: dta
REML criterion at convergence: 16554
Scaled residuals:
Min 1Q Median 3Q Max
-3.223 -0.561 -0.031 0.531 5.158
Random effects:
Groups Name Variance Std.Dev. Corr
cid:school (Intercept) 0.6413 0.801
year 0.0113 0.106 0.55
school (Intercept) 0.9244 0.961
year 0.6079 0.780 0.92
Residual 0.3011 0.549
Number of obs: 7230, groups: cid:school, 1721; school, 60
Fixed effects:
Estimate Std. Error t value
(Intercept) -1.6442 0.0546 -30.1
convergence code: 0
Model failed to converge with max|grad| = 0.00321233 (tol = 0.002, component 1)
When we considered the effect of year, the variance of school is 0.92 with 0.96 SD, the variance of child nested by school is 0.64 and SD is 0.80.
anova(m1, m2)
Data: dta
Models:
m1: math ~ year + (1 | school/cid)
m2: math ~ (year | school/cid)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
m1 5 16757 16791 -8374 16747
m2 8 16566 16621 -8275 16550 197 3 <2e-16
These two models are significant different
# merTools
# estimate random-effects parameters by simulation
m2_re <- REsim(m2)
# plot for random effects
plotREsim(m2_re)
# residual plots
plot(m1, ylim = c(-4, 4),
xlab = "Fitted values", ylab = "Pearson residuals", cex =.5)
#
plot(m2, ylim = c(-4, 4),
xlab = "Fitted values", ylab = "Pearson residuals", cex =.5)
How does the estimate of grand mean depend on the intra-class correlation in a simple random-effects model?
dta1 <- read.table('data/demo1.txt', header = T)
head(dta1)
Score School Pupil
1 11 S1 P1
2 13 S1 P2
3 51 S2 P3
4 53 S2 P4
5 55 S2 P5
6 91 S3 P6
dta2 <- read.table('data/demo2.txt', header = T)
head(dta2)
Score School Pupil
1 11 S1 P1
2 51 S1 P2
3 13 S2 P3
4 55 S2 P4
5 91 S2 P5
6 51 S3 P6
summary(m1 <- lmer(Score ~ 1 + (1 | School), data=dta1))
Linear mixed model fit by REML ['lmerMod']
Formula: Score ~ 1 + (1 | School)
Data: dta1
REML criterion at convergence: 51.5
Scaled residuals:
Min 1Q Median 3Q Max
-1.328 -0.474 0.000 0.461 1.355
Random effects:
Groups Name Variance Std.Dev.
School (Intercept) 1679 40.98
Residual 5 2.24
Number of obs: 9, groups: School, 3
Fixed effects:
Estimate Std. Error t value
(Intercept) 53.0 23.7 2.24
For the first data set, the results showed the variance of school is 1679 with 40.98 SD.
summary(m2 <- lmer(Score ~ 1 + (1 | School), data=dta2))
Linear mixed model fit by REML ['lmerMod']
Formula: Score ~ 1 + (1 | School)
Data: dta2
REML criterion at convergence: 80.4
Scaled residuals:
Min 1Q Median 3Q Max
-1.3673 -0.3429 -0.0278 1.0715 1.1400
Random effects:
Groups Name Variance Std.Dev.
School (Intercept) 104 10.2
Residual 968 31.1
Number of obs: 9, groups: School, 3
Fixed effects:
Estimate Std. Error t value
(Intercept) 56.4 12.0 4.69
For the second data set, the results showed the variance of school is 104 with 10.2 SD.