The tolerance for numeric answers is 0.01.

  • Save the file lmm1.csv on your computer and

read it into R as a data frame. Name the object d.lmm.

If necessary (e.g., after rio::import()), convert subject and time to factors: d.lmm$...<-as.factor(d.lmm$...) (a minimal import sketch is shown below, after the spaghetti plot)

These are data from a within-subject design with one 4-level categorical within-subject variable, time. \(n=20\) subjects were measured 4 times, leading to 80 observations organized in long format. The dependent variable is response. Look at the data with

str(d.lmm)
## 'data.frame':    80 obs. of  4 variables:
##  $ subject : Factor w/ 20 levels "s1","s2","s3",..: 1 1 1 1 2 2 2 2 3 3 ...
##  $ time    : Factor w/ 4 levels "t1","t2","t3",..: 1 2 3 4 1 2 3 4 1 2 ...
##  $ response: num  32.1 38.9 35.2 39.4 22.9 ...
##  $ timeNum : num  1 2 3 4 1 2 3 4 1 2 ...
psych::headTail(d.lmm)
##     subject time response timeNum
## 1        s1   t1    32.05       1
## 2        s1   t2     38.9       2
## 3        s1   t3    35.17       3
## 4        s1   t4    39.41       4
## ...    <NA> <NA>      ...     ...
## 77      s20   t1    20.56       1
## 78      s20   t2    27.47       2
## 79      s20   t3     27.9       3
## 80      s20   t4    35.87       4
## spaghetti plot
interaction.plot(d.lmm$time,d.lmm$subject,d.lmm$response,xlab="time",ylab="response",trace.label = "subject") 
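For reference, a minimal import sketch (assuming lmm1.csv has been saved in the current working directory; adjust the path if needed):

d.lmm <- rio::import("lmm1.csv")            ## or: read.csv("lmm1.csv")
d.lmm$subject <- as.factor(d.lmm$subject)   ## rio::import() typically leaves these as character
d.lmm$time    <- as.factor(d.lmm$time)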

  • We assume equal distances between the 4 time-points, that is, we use timeNum as a continuous predictor.
  • Load packages lme4 and lmerTest. Then, fit three models to the response with continuous time as predictor.

    • A linear model with lm(), ignoring the within-subject correlation. Name the model m.lm
    • A random intercept model with lmer() (assuming uniform within-subject correlation): m.lmerI, by adding (1|subject) to the right-hand side of the model formula
    • A random slope model with lmer(): m.lmerS, by replacing (1|subject) by (timeNum|subject)

In the following, round all numerical quantities to 3 decimals!

Are the point estimates of the fixed effects for m.lm, m.lmerS and m.lmerI different? TRUE or FALSE?

What is the \(p\)-value of the test of no effect of timeNum in the linear model? Use round(summary(m.lm)$coef,5)

What is the corresponding \(p\)-value in the random intercept model?

Test the random intercept model against the random slope model. Random effects can be tested with lmerTest::ranova, that is lmerTest::ranova(m.lmerS). Can you reject the random intercept model? (TRUE or FALSE). (You can visualize the model with lattice::xyplot(fitted(m.lmerI)~timeNum,groups=subject,data=d.lmm,type="b"))

Report the estimated between-subject variance parameter of the random intercept model. Use print(VarCorr(m.lmerI),comp="Variance",digits=6).

Report the estimated within-subject variance parameter of the random intercept model

For lmer-fits, confint() constructs likelihood confidence intervals. What is the lower limit of a 99% CI for the expected effect of timeNum for the random intercept model? (see ?confint.merMod for setting the confidence level)

Random intercept model: What is the point estimate of the expected change in response for a change of four units on the predictor timeNum?

Random intercept model: What is the point prediction for a mean observation at timeNum=4? Use predict(m.lmerI,re.form=~0,newdata=data.frame(timeNum=4))

Random intercept model: What is the point prediction for subject s1 at timeNum=4? Use predict(m.lmerI,newdata=data.frame(timeNum=4,subject="s1"))

Fits for continuous time effect on response:

m.lm<-lm(response~timeNum,data=d.lmm)
library(lme4)
library(lmerTest)
m.lmerI<-lmer(response~timeNum+(1|subject),data=d.lmm)
m.lmerS<-lmer(response~timeNum+(timeNum|subject),data=d.lmm)

Linear model fit, fixed effects

round(summary(m.lm)$coef,5)
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 22.97959    2.80582 8.18997  0.00000
## timeNum      1.93121    1.02454 1.88495  0.06316

Random intercept model fit, fixed effects

round(summary(m.lmerI,cor=FALSE)$coef,5)
##             Estimate Std. Error       df t value Pr(>|t|)
## (Intercept) 22.97959    2.42644 30.76841 9.47050    0e+00
## timeNum      1.93121    0.45996 59.00000 4.19869    9e-05

Random slope model fit, fixed effects

round(summary(m.lmerS,cor=FALSE)$coef,5)
##             Estimate Std. Error       df t value Pr(>|t|)
## (Intercept) 22.97959    2.39972 18.99979 9.57595  0.00000
## timeNum      1.93121    0.52625 18.99973 3.66973  0.00163

We see that the point estimates of the fixed effects are the same for the three models. The uncertainties differ, however: the \(SE\)s and therefore also the \(p\)-values. For within-subject effects, not accounting for the repeated-measures design decreases precision, so the \(SE\)s and hence the \(p\)-values are larger. (For between-subject effects, the opposite would be true: if clustered data are analysed with lm(), between-subject effects would have \(SE\)s and \(p\)-values that are too small.)
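As a quick check (a small sketch, not part of the exercise questions), the timeNum estimates and standard errors can be put side by side; the values match the model comparison table below:

rbind(lm    = summary(m.lm)$coef["timeNum", 1:2],
      lmerI = summary(m.lmerI)$coef["timeNum", 1:2],
      lmerS = summary(m.lmerS)$coef["timeNum", 1:2])
## same estimate (1.931) in each row, but different standard errors (1.025, 0.460, 0.526)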

With the following function, we can contrast the performance of the models. Without going into details: the standard error is smallest for m.lmerI, and m.lmerI is also the best model with respect to the AIC criterion, which measures a compromise between a good fit and parsimony (fewer parameters). Lower AIC is better:

modelsummary::modelsummary(models=list(linear=m.lm,randomIntercept=m.lmerI,randomSlope=m.lmerS),estimator="ML",output="kableExtra")
                                  linear  randomIntercept  randomSlope
(Intercept)                       22.980           22.980       22.980
                                 (2.806)          (2.426)      (2.400)
timeNum                            1.931            1.931        1.931
                                 (1.025)          (0.460)      (0.526)
SD (Intercept subject)                              9.275        9.386
SD (timeNum subject)                                             1.389
Cor (Intercept~timeNum subject)                                 −0.205
SD (Observations)                                   4.600        4.249
Num.Obs.                              80               80           80
R2                                 0.044
R2 Adj.                            0.031
R2 Marg.                                            0.042        0.042
R2 Cond.                                            0.811        0.839
AIC                                603.3            534.1        537.0
BIC                                610.4            543.7        551.3
ICC                                                   0.8          0.8
Log.Lik.                        −298.649
RMSE                               10.12             3.99         3.47

Test of model with (1|subject) versus (timeNum|subject)

The random intercept model has two parameters fewer than the random slope model (the slope variance and the correlation between intercept and slope). The likelihood-ratio test is not significant (\(p=0.49\)), so we cannot reject the random intercept model in favor of the more complex random slope model and can stay with the simpler random intercept model.

ranova(m.lmerS) 
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## response ~ timeNum + (timeNum | subject)
##                                npar  logLik    AIC    LRT Df Pr(>Chisq)
## <none>                            6 -260.55 533.11                     
## timeNum in (timeNum | subject)    4 -261.26 530.51 1.4082  2     0.4946
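As an alternative (not required for the exercise), the two REML fits can also be compared with anova() when refitting with ML is switched off; this should essentially reproduce the ranova() result above:

anova(m.lmerI, m.lmerS, refit=FALSE)   ## LRT of random intercept vs. random slope model, REML fits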

Intermezzo. Visualize random intercept or random slope fit

lattice::xyplot(fitted(m.lmerI)~timeNum,groups=subject,data=d.lmm,type="b")

lattice::xyplot(fitted(m.lmerS)~timeNum,groups=subject,data=d.lmm,type="b")

Extract the variance components

print(VarCorr(m.lmerI),comp="Variance",digits=6)
##  Groups   Name        Variance
##  subject  (Intercept) 86.0182 
##  Residual             21.1559
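From these two variance components, the intraclass correlation reported as ICC of about 0.8 in the model comparison table above can be computed by hand; a small sketch:

vc <- as.data.frame(VarCorr(m.lmerI))
vc$vcov[1]/sum(vc$vcov)   ## 86.0182/(86.0182 + 21.1559), about 0.80: the between-subject share of the total variance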

Confidence interval with different confidence level

confint(m.lmerI,parm="timeNum",level=0.99)
## Computing profile confidence intervals ...
##             0.5 %   99.5 %
## timeNum 0.7231249 3.139297
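For comparison only (not asked for here), confint() also offers quicker Wald-type intervals via method="Wald"; these are available for the fixed effects only, while the variance parameters are typically returned as NA:

confint(m.lmerI, level=0.99, method="Wald")   ## Wald intervals for the fixed effects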

Change in response

Since the model is linear, the expected change in response for a change of four units on timeNum is 4 times the corresponding slope parameter:

4*summary(m.lmerI)$coef[2,1] 
## [1] 7.724845
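Equivalently (a small alternative sketch giving the same value), the slope can be taken from fixef():

4*fixef(m.lmerI)["timeNum"]   ## 4 * 1.93121, about 7.725, as above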

Prediction of mean observation at timeNum=4:

predict(m.lmerI,re.form=~0,newdata=data.frame(timeNum=4))
##        1 
## 30.70444
## re.form=~0 gives the marginal (population-level) prediction, not the conditional one

Prediction at timeNum=4 for subject s1:

predict(m.lmerI,newdata=data.frame(timeNum=4,subject="s1"))
##        1 
## 38.78483
## conditional (on subject) prediction: the random effect for s1 is added
ranef(m.lmerI)$subject$"(Intercept)"[1]
## [1] 8.080389
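As a consistency check (a small sketch using the quantities shown above), the conditional prediction is the marginal prediction plus the random intercept of s1:

predict(m.lmerI, re.form=~0, newdata=data.frame(timeNum=4)) +
  ranef(m.lmerI)$subject["s1", "(Intercept)"]
## 30.70444 + 8.080389 = 38.78483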
