Feel free to leave a comment in case of mistakes…
The data come from the SII (Study of Instructional Improvement), which was carried out to assess the math achievement scores of first- and third-grade pupils in randomly selected classrooms from a national US sample of elementary schools. The dataset includes results for 1190 first-grade pupils sampled from 312 classrooms in 107 schools. From the sample of elementary schools, all first-grade classrooms were considered (1 to 9). Within a classroom, 1 to 10 children were sampled to take a test on math. The following variables were measured:
Hip fractures are a very common and important health problem among elderly people. They are often the result of an unfortunate, seemingly innocent fall in the patient’s home environment. Less fluent movements combined with bad vision and impaired reaction skills makes older people more at risk for such falls. The weaker bones and especially the presence of osteoporosis also causes the hip to break more easily. The hip surgery needed is a serious operation and can involve a lot of blood loss. A relatively long recovery period (8 to 12 days) in the hospital is therefore a real necessity. The combination of these overwhelming events can also cause the mental status of the patients to worsen. When this deterioration is very severe, typically causing a loss of orientation in time and space, this phenomenon is called a delirium. In daily practice, the ‘Mini-Mental State Examination score’ (MMSE) is sometimes being used to measure the cognitive status of patients. It is the sum of 30 binary outcomes which result from correct (1) or incorrect (0) answer to specific questions (e.g. ‘Where are you now?’). Obviously, higher MMSE values indicate better cognitive functioning.
It is of interest to study how this MMSE evolution depends on certain characteristics of the patients. The data set used to this purpose contains the MMSE score for 60 patients, at 1, 3, 5, 8 and 12 days after surgery. For each patient the following variables are measured:
library(rio)
library(nlme)
library(ggplot2)
library(lmtest)
library(kableExtra)
library(dplyr)
Vidoes on the theory:
Describe in detail why in both datasets, one cannot make use of basic statistical models, like linear regression or ANOVA. What assumptions are violated? What special kind of data is the hip fracture dataset?
Classic ANOVA and regression assume all observations are independent. Here, we have related data observations that are not independent:
Hip fracture: multiple measurements (of MMSE) over time in the same individual = longitudinal study.
SII: here there is non-independence at two levels:
The purpose of multilevel (or mixed) models is to account for the relatedness between these observations.
Read in the data in R, check the variable types via str() or the “environment” tab in RStudio, and coerce all variables into the correct type. Assign the appropriate factor levels.
Hip data
<- import("hipfracture2.txt")
hip $NEURO <- factor(hip$NEURO, levels = c(0, 1), labels = c("not neuro-psychiatric", "neuro-psychiatric"))
hip$IDNR <- factor(hip$IDNR)
hip
%>%
hip slice(1:6) %>%
kbl() %>%
kable_paper()
IDNR | NEURO | AGE | MMSE | TIME |
---|---|---|---|---|
1 | not neuro-psychiatric | 74 | 28 | 1 |
1 | not neuro-psychiatric | 74 | 28 | 3 |
1 | not neuro-psychiatric | 74 | 28 | 5 |
1 | not neuro-psychiatric | 74 | 26 | 8 |
1 | not neuro-psychiatric | 74 | 25 | 12 |
2 | not neuro-psychiatric | 67 | 25 | 1 |
SII data
<- import("SII.txt")
sii $sex <- factor(sii$sex, levels = c(0,1), labels = c("males", "females"))
sii$schoolid <- factor(sii$schoolid)
sii$classid <- factor(sii$classid)
sii$childid <- factor(sii$childid)
sii%>%
sii slice(1:6) %>%
kbl() %>%
kable_paper()
schoolid | classid | childid | sex | mathkind | mathgain | ses | housepov | yearstea |
---|---|---|---|---|---|---|---|---|
1 | 160 | 1 | females | 448 | 32 | 0.46 | 0.082 | 1 |
1 | 160 | 2 | males | 460 | 109 | -0.27 | 0.082 | 1 |
1 | 160 | 3 | females | 511 | 56 | -0.03 | 0.082 | 1 |
1 | 217 | 4 | males | 449 | 83 | -0.38 | 0.082 | 2 |
1 | 217 | 5 | males | 425 | 53 | -0.03 | 0.082 | 2 |
1 | 217 | 6 | females | 450 | 65 | 0.76 | 0.082 | 2 |
SII data: Explore the mathgain values in boys and girls in the different schools using an index plot.
Mean in both sexes for each school:
%>%
sii group_by(schoolid, sex) %>%
summarise(mean = mean(mathgain)) %>%
kbl() %>%
kable_paper() %>%
scroll_box(width = "300px", height = "200px")
schoolid | sex | mean |
---|---|---|
1 | males | 59.285714 |
1 | females | 60.250000 |
2 | males | 68.500000 |
2 | females | 59.750000 |
3 | males | 94.636364 |
3 | females | 67.666667 |
4 | males | 6.333333 |
4 | females | 64.000000 |
5 | males | 79.000000 |
5 | females | 50.750000 |
6 | males | 83.142857 |
6 | females | 72.400000 |
7 | males | 56.250000 |
7 | females | 57.800000 |
8 | males | 41.750000 |
8 | females | 52.750000 |
9 | males | 37.666667 |
9 | females | 32.000000 |
10 | males | 82.300000 |
10 | females | 65.750000 |
11 | males | 54.181818 |
11 | females | 48.400000 |
12 | males | 56.785714 |
12 | females | 47.692308 |
13 | males | 54.500000 |
13 | females | 42.200000 |
14 | males | 44.400000 |
14 | females | 44.200000 |
15 | males | 82.571429 |
15 | females | 66.166667 |
16 | males | 96.000000 |
16 | females | 89.750000 |
17 | males | 67.285714 |
17 | females | 104.000000 |
18 | males | 117.500000 |
18 | females | 62.000000 |
19 | males | 63.800000 |
19 | females | 48.666667 |
20 | males | 57.200000 |
20 | females | 65.000000 |
21 | males | 55.307692 |
21 | females | 61.250000 |
22 | males | 46.000000 |
22 | females | 37.333333 |
23 | males | 66.333333 |
23 | females | 75.500000 |
24 | males | -16.000000 |
24 | females | 45.285714 |
25 | males | 23.000000 |
25 | females | 30.400000 |
26 | males | 72.000000 |
26 | females | 36.833333 |
27 | males | 46.500000 |
27 | females | 74.888889 |
28 | males | 25.800000 |
28 | females | 67.200000 |
29 | males | 71.500000 |
29 | females | 35.666667 |
30 | males | 27.000000 |
30 | females | 29.000000 |
31 | males | 57.375000 |
31 | females | 46.785714 |
32 | males | 68.666667 |
32 | females | 44.666667 |
33 | males | 64.000000 |
33 | females | 61.416667 |
34 | males | 35.250000 |
34 | females | 71.333333 |
35 | males | 20.166667 |
35 | females | 42.200000 |
36 | males | 74.000000 |
36 | females | 71.333333 |
37 | males | 69.000000 |
37 | females | 40.714286 |
38 | males | 51.857143 |
38 | females | 33.500000 |
39 | males | 27.857143 |
39 | females | 26.400000 |
40 | males | 28.000000 |
40 | females | 46.500000 |
41 | males | 67.857143 |
41 | females | 72.000000 |
42 | males | 61.000000 |
42 | females | 58.083333 |
43 | males | 70.000000 |
43 | females | 50.666667 |
44 | males | 74.875000 |
44 | females | 63.777778 |
45 | males | 82.666667 |
45 | females | 42.500000 |
46 | males | 58.285714 |
46 | females | 49.000000 |
47 | males | 88.571429 |
47 | females | 39.333333 |
48 | males | 52.000000 |
48 | females | 47.500000 |
49 | males | 81.714286 |
49 | females | 80.000000 |
50 | males | 41.714286 |
50 | females | 47.000000 |
51 | males | 40.000000 |
51 | females | 73.000000 |
52 | males | 48.000000 |
52 | females | 39.000000 |
53 | males | 83.000000 |
53 | females | 65.800000 |
54 | males | 40.625000 |
54 | females | 18.500000 |
55 | males | 51.714286 |
55 | females | 55.666667 |
56 | males | 55.333333 |
56 | females | 47.250000 |
57 | males | 72.666667 |
57 | females | 69.200000 |
58 | males | 43.500000 |
58 | females | 54.333333 |
59 | males | 46.333333 |
59 | females | 41.000000 |
60 | males | 73.875000 |
60 | females | 73.800000 |
61 | males | 53.600000 |
61 | females | 34.833333 |
62 | males | 62.166667 |
62 | females | 47.000000 |
63 | males | 74.000000 |
63 | females | 17.000000 |
64 | males | 56.333333 |
64 | females | 84.000000 |
65 | males | 60.750000 |
65 | females | 52.800000 |
66 | males | 61.666667 |
66 | females | 54.500000 |
67 | males | 87.833333 |
67 | females | 72.333333 |
68 | males | 60.000000 |
68 | females | 84.000000 |
69 | males | 78.333333 |
69 | females | 44.000000 |
70 | males | 96.750000 |
70 | females | 69.000000 |
71 | males | 51.923077 |
71 | females | 53.000000 |
72 | males | 54.400000 |
72 | females | 47.666667 |
73 | males | 59.500000 |
73 | females | 39.000000 |
74 | males | 58.333333 |
75 | males | 99.000000 |
75 | females | 70.333333 |
76 | males | 51.266667 |
76 | females | 34.916667 |
77 | males | 58.666667 |
77 | females | 52.250000 |
78 | males | 39.250000 |
78 | females | 31.000000 |
79 | males | 85.000000 |
79 | females | 56.555556 |
80 | males | 66.333333 |
80 | females | 80.750000 |
81 | males | 70.000000 |
81 | females | 64.428571 |
82 | males | 57.384615 |
82 | females | 55.000000 |
83 | males | 46.250000 |
83 | females | 61.000000 |
84 | males | 72.400000 |
84 | females | 86.666667 |
85 | males | 39.600000 |
85 | females | 66.000000 |
86 | males | 31.600000 |
86 | females | 41.500000 |
87 | males | 43.500000 |
87 | females | 49.250000 |
88 | males | 52.000000 |
88 | females | 36.000000 |
89 | males | 40.000000 |
89 | females | 59.500000 |
90 | males | 58.500000 |
90 | females | 34.000000 |
91 | males | 36.625000 |
91 | females | 60.250000 |
92 | males | 72.666667 |
92 | females | 59.833333 |
93 | males | 31.900000 |
93 | females | 50.909091 |
94 | males | 80.833333 |
94 | females | 83.666667 |
95 | males | 35.000000 |
95 | females | 75.500000 |
96 | males | 61.500000 |
96 | females | 41.000000 |
97 | males | 53.000000 |
97 | females | 91.500000 |
98 | males | 61.000000 |
98 | females | 37.000000 |
99 | males | 54.000000 |
99 | females | 49.666667 |
100 | males | 44.000000 |
100 | females | 69.555556 |
101 | males | 61.909091 |
101 | females | 68.800000 |
102 | males | 47.200000 |
102 | females | 58.666667 |
103 | males | 104.500000 |
103 | females | 86.000000 |
104 | males | 91.333333 |
104 | females | 84.000000 |
105 | males | 59.000000 |
105 | females | 63.800000 |
106 | males | 9.500000 |
107 | males | 40.500000 |
107 | females | 50.125000 |
::ggplot(sii, aes(x = schoolid, y = mathgain, color = sex)) +
ggplot2stat_summary(fun = mean, geom = "point") +
theme(axis.text.x = element_text(size = 3))
Hip data: Plot the MMSE evolutions over time. You can make one plot with all the patients, using a different color for neuro and non-Neuro patients, or generate two separate plots for the neuro and non-neuro persons, respectively. Discuss what you see here.
ggplot(hip, aes(x = TIME, y = MMSE, color = NEURO)) +
stat_summary(fun = mean, geom = "point", size = 3) +
geom_jitter(size = 0.3, width = 0.2) +
geom_line(aes(group = IDNR), size = 0.2)
Vidoes on the theory:
SII data Is there a difference of math achievement score between males and females? A descriptive plot was already made:
<- import("SII.txt")
sii $sex <- factor(sii$sex, levels = c(0,1), labels = c("males", "females"))
sii$schoolid <- factor(sii$schoolid)
sii
::ggplot(sii, aes(x = schoolid, y = mathgain, color = sex)) +
ggplot2stat_summary(fun = mean, geom = "point") +
theme(axis.text.x = element_text(size = 3))
We will now look at a mixed model. We still only consider the clustering on school-level and ignore the clustering on the class-level (this will be included in a later stage). Write down the full model with assumptions.
\(Y_{ij} = (beta_{0} + beta_{i}) + beta_{2}*sex + \epsilon_{ij}\)
with \(b_{i}\) = random intercept
SII data Fit now the model in R, discuss the within-cluster variability and between-cluster variability.
Basic model output
<-lme(mathgain~sex,random=~1|schoolid,data=sii)
modelsummary(model)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11780.3 11800.62 -5886.151
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept) Residual
## StdDev: 10.45341 33.07291
##
## Fixed effects: mathgain ~ sex
## Value Std.Error DF t-value p-value
## (Intercept) 58.65798 1.748933 1082 33.53931 0.0000
## sexfemales -2.13315 1.965660 1082 -1.08521 0.2781
## Correlation:
## (Intr)
## sexfemales -0.566
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.8454731 -0.6000218 -0.0198554 0.5417158 5.6376937
##
## Number of Observations: 1190
## Number of Groups: 107
anova.lme(model,type="marginal")
## numDF denDF F-value p-value
## (Intercept) 1 1082 1124.8850 <.0001
## sex 1 1082 1.1777 0.2781
Test of fixed effects (Type 3): The Score in Girls is on average about 2 point lower, however the \(se\) and thus, the p-Value is large (28%).
<- anova.lme(model, type = "marginal")
fixed_effects fixed_effects
## numDF denDF F-value p-value
## (Intercept) 1 1082 1124.8850 <.0001
## sex 1 1082 1.1777 0.2781
Estimating the covariance of parameters: The first row (intercepts) represents the between school variability, the second row (residual) the within school variability.
<- VarCorr(model)
covPar covPar
## schoolid = pdLogChol(1)
## Variance StdDev
## (Intercept) 109.2739 10.45341
## Residual 1093.8173 33.07291
Total variability is decomposed as within-cluster variability (residuals) and between-cluster variability (schools).
<- as.numeric(covPar[1,1])
betwVar <- as.numeric(covPar[2,1])
withVar <- betwVar+withVar
totVar <- betwVar/totVar
Expl Expl
## [1] 0.09082761
Thus, 9% of thr variance due to variance between schools.
Hip data
Fit a mixed model, taking into account both neuro-groups (exclude observations with a missing value). Discuss the variability between and within patients. Write down the model with assumptions.
Mixed model for MMSE over time in both neuro groups: MMSE of \(measurement_{j}\) in \(ID_{i}\)
\(Y_{ij} = (beta_{0}+b_{i}) + beta_{1}*NEURO + beta_{2}*TIME + beta_{3}*NEURO*TIME + \epsilon_{ij}\)
Fit the model
<- lme(MMSE~NEURO+TIME+NEURO:TIME,random=~1|IDNR,data=hip,na.action=na.exclude)
modelsummary(model)
## Linear mixed-effects model fit by REML
## Data: hip
## AIC BIC logLik
## 1464.789 1486.222 -726.3943
##
## Random effects:
## Formula: ~1 | IDNR
## (Intercept) Residual
## StdDev: 6.478461 2.513081
##
## Fixed effects: MMSE ~ NEURO + TIME + NEURO:TIME
## Value Std.Error DF t-value p-value
## (Intercept) 22.011719 1.0770182 205 20.437648 0.0000
## NEUROneuro-psychiatric -8.038534 1.8690377 58 -4.300894 0.0001
## TIME 0.197531 0.0523993 205 3.769725 0.0002
## NEUROneuro-psychiatric:TIME -0.059808 0.0964234 205 -0.620264 0.5358
## Correlation:
## (Intr) NEUROn- TIME
## NEUROneuro-psychiatric -0.576
## TIME -0.254 0.146
## NEUROneuro-psychiatric:TIME 0.138 -0.258 -0.543
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.14473201 -0.44157131 0.01532865 0.48824740 2.81016393
##
## Number of Observations: 267
## Number of Groups: 60
Get the variances
<- VarCorr(model)
covPar covPar
## IDNR = pdLogChol(1)
## Variance StdDev
## (Intercept) 41.970459 6.478461
## Residual 6.315578 2.513081
<- as.numeric(covPar[1,1])
betwVar <- as.numeric(covPar[2,1])
withVar <- betwVar+withVar
totVar <- betwVar/totVar
Expl Expl
## [1] 0.8692049
There is no different evolution over time as the p-value for the interaction term is 53%
summary(model)
## Linear mixed-effects model fit by REML
## Data: hip
## AIC BIC logLik
## 1464.789 1486.222 -726.3943
##
## Random effects:
## Formula: ~1 | IDNR
## (Intercept) Residual
## StdDev: 6.478461 2.513081
##
## Fixed effects: MMSE ~ NEURO + TIME + NEURO:TIME
## Value Std.Error DF t-value p-value
## (Intercept) 22.011719 1.0770182 205 20.437648 0.0000
## NEUROneuro-psychiatric -8.038534 1.8690377 58 -4.300894 0.0001
## TIME 0.197531 0.0523993 205 3.769725 0.0002
## NEUROneuro-psychiatric:TIME -0.059808 0.0964234 205 -0.620264 0.5358
## Correlation:
## (Intr) NEUROn- TIME
## NEUROneuro-psychiatric -0.576
## TIME -0.254 0.146
## NEUROneuro-psychiatric:TIME 0.138 -0.258 -0.543
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.14473201 -0.44157131 0.01532865 0.48824740 2.81016393
##
## Number of Observations: 267
## Number of Groups: 60
When omitting the interaction term, TIME is a significant fixed effect.
<- lme(MMSE~NEURO+TIME,random=~1|IDNR,data=hip, na.action= na.exclude)
modelanova.lme(model,type="marginal")
## numDF denDF F-value p-value
## (Intercept) 1 206 428.4846 <.0001
## NEURO 1 58 21.2744 <.0001
## TIME 1 206 16.7848 1e-04
Vidoes on the theory
SII data
Give and fit the model, testing if there is a difference in mathgain between boys and girls, taking into account clustering on both school and class levels. Write down this model (no other covariates have to be considered yet).
\(Y_{ijk} = beta_{0} + a_{i} + b_{i(j)} + beta_{1}*sex + \epsilon_{ijk}\)
with \(a_{i}\) = school effect
and \(b_{i(j)}\) = class effect (within school)
SII data
Is there a difference in mathgain between boys and girls?
The main effect of sex is not significant:
<-lme(mathgain~sex,random=~1|schoolid/classid,data=sii, na.action=na.exclude)
modelanova.lme(model)
## numDF denDF F-value p-value
## (Intercept) 1 877 1583.0446 <.0001
## sex 1 877 1.4586 0.2275
summary(model)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11774.14 11799.54 -5882.068
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 8.772372
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 10.05193 32.04467
##
## Fixed effects: mathgain ~ sex
## Value Std.Error DF t-value p-value
## (Intercept) 58.61460 1.747250 877 33.54678 0.0000
## sexfemales -2.35259 1.947955 877 -1.20772 0.2275
## Correlation:
## (Intr)
## sexfemales -0.564
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.67888536 -0.58891986 -0.03238045 0.53965216 5.61255392
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
SII data
Discuss the different variance components from the model above. What fraction of the total variance is due to the variability between classes and between schools?
VarCorr(model)
## Variance StdDev
## schoolid = pdLogChol(1)
## (Intercept) 76.9545 8.772372
## classid = pdLogChol(1)
## (Intercept) 101.0413 10.051932
## Residual 1026.8606 32.044666
<- as.numeric(VarCorr(model)[2,1])
varSchool <- as.numeric(VarCorr(model)[4,1])
varClass <- as.numeric(VarCorr(model)[5,1])
varRes <- varSchool + varClass + varRes
varTot
<- varSchool/varTot
fracSchool <- varClass/varTot fracClass
The fraction of the total variation explained by schools is 0.0638703 and 0.0838617 by classes.
SII data
/varTot varSchool
## [1] 0.06387027
+varClass)/varTot (varSchool
## [1] 0.147732
One must add the variance due to school in numerator as students from same class are also within the same school!
SII data
Fit 1 model including all covariates as main effects and as an interaction with sex, and taking into account all levels of clustering. Discuss significance of the covariates. Is the effect of (some of) the covariates sex-dependent? (Note: stepwise backward model building is the subject of exercise Day 4)
<-lme(mathgain~sex*(ses+housepov+yearstea+mathkind),random=~1|schoolid/classid,data=sii, na.action=na.exclude) #short
model<- lme(mathgain~sex+ses+housepov+yearstea+mathkind+sex:ses+sex:housepov+sex:yearstea+sex:mathkind,random=~1|schoolid/classid,data=sii, na.action=na.exclude) #long
model1
summary(model)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11408.22 11474.17 -5691.11
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 9.195376
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 8.51822 27.23456
##
## Fixed effects: mathgain ~ sex * (ses + housepov + yearstea + mathkind)
## Value Std.Error DF t-value p-value
## (Intercept) 278.63790 14.572595 871 19.120679 0.0000
## sexfemales -11.20361 20.032320 871 -0.559277 0.5761
## ses 5.70738 1.703249 871 3.350879 0.0008
## housepov -5.32490 11.796167 105 -0.451410 0.6526
## yearstea 0.22939 0.141292 204 1.623505 0.1060
## mathkind -0.47700 0.029986 871 -15.907333 0.0000
## sexfemales:ses -0.05452 2.351317 871 -0.023187 0.9815
## sexfemales:housepov -24.27888 12.855978 871 -1.888529 0.0593
## sexfemales:yearstea -0.34430 0.175657 871 -1.960051 0.0503
## sexfemales:mathkind 0.03950 0.041685 871 0.947695 0.3435
## Correlation:
## (Intr) sxfmls ses houspv yearst mthknd sxfmls:s sxfmls:h
## sexfemales -0.672
## ses 0.214 -0.170
## housepov -0.241 0.118 0.096
## yearstea -0.165 0.093 -0.053 0.097
## mathkind -0.974 0.666 -0.231 0.085 0.032
## sexfemales:ses -0.155 0.216 -0.687 -0.073 0.041 0.169
## sexfemales:housepov 0.163 -0.221 -0.084 -0.539 -0.075 -0.080 0.171
## sexfemales:yearstea 0.106 -0.123 0.039 -0.061 -0.615 -0.024 -0.020 0.151
## sexfemales:mathkind 0.657 -0.982 0.181 -0.052 -0.018 -0.675 -0.239 0.093
## sxfmls:y
## sexfemales
## ses
## housepov
## yearstea
## mathkind
## sexfemales:ses
## sexfemales:housepov
## sexfemales:yearstea
## sexfemales:mathkind -0.002
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.89285360 -0.62785115 -0.02861974 0.55658062 4.29948404
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
Hip Data:
Looking at the profiles from the hipfracture dataset, it might be of interest to include a quadratic time effect. Define the quadratic time and add it to the model.
Write down and fit the model. Is the quadratic time effect needed here?
$TIME2<-hip$TIME^2
hip%>%
hip slice(1:6) %>%
kbl() %>%
kable_paper()
IDNR | NEURO | AGE | MMSE | TIME | TIME2 |
---|---|---|---|---|---|
1 | not neuro-psychiatric | 74 | 28 | 1 | 1 |
1 | not neuro-psychiatric | 74 | 28 | 3 | 9 |
1 | not neuro-psychiatric | 74 | 28 | 5 | 25 |
1 | not neuro-psychiatric | 74 | 26 | 8 | 64 |
1 | not neuro-psychiatric | 74 | 25 | 12 | 144 |
2 | not neuro-psychiatric | 67 | 25 | 1 | 1 |
\(_{Yij} = (beta_{0}+b_{i}) + beta_{1}*TIME + beta_{2}*TIME2 + \epsilon_{eij}\)
Fitting the model:
<- lme(MMSE~TIME+TIME2,random=~1|IDNR,data=hip, na.action=na.exclude)
modelsummary(model)
## Linear mixed-effects model fit by REML
## Data: hip
## AIC BIC logLik
## 1486.63 1504.51 -738.3152
##
## Random effects:
## Formula: ~1 | IDNR
## (Intercept) Residual
## StdDev: 7.558976 2.501436
##
## Fixed effects: MMSE ~ TIME + TIME2
## Value Std.Error DF t-value p-value
## (Intercept) 18.861017 1.0627175 205 17.747912 0.0000
## TIME 0.406287 0.1635060 205 2.484842 0.0138
## TIME2 -0.017868 0.0125315 205 -1.425834 0.1554
## Correlation:
## (Intr) TIME
## TIME -0.347
## TIME2 0.301 -0.963
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.06437286 -0.43322995 0.04405184 0.47938518 2.82910055
##
## Number of Observations: 267
## Number of Groups: 60
Hip data
Include the age of the patient as an extra covariate in the model. Is there a different effect of age in both neuro groups?
No, because the interaction term is not significant:
<- lme(MMSE~TIME+NEURO+AGE+AGE:NEURO,random=~1|IDNR,data=hip, na.action=na.exclude)
modelsummary(model)
## Linear mixed-effects model fit by REML
## Data: hip
## AIC BIC logLik
## 1447.366 1472.345 -716.6832
##
## Random effects:
## Formula: ~1 | IDNR
## (Intercept) Residual
## StdDev: 5.460354 2.508963
##
## Fixed effects: MMSE ~ TIME + NEURO + AGE + AGE:NEURO
## Value Std.Error DF t-value p-value
## (Intercept) 56.50126 7.962052 206 7.096320 0.0000
## TIME 0.17633 0.043895 206 4.017058 0.0001
## NEUROneuro-psychiatric -4.88675 17.718753 56 -0.275796 0.7837
## AGE -0.43988 0.101196 56 -4.346782 0.0001
## NEUROneuro-psychiatric:AGE -0.03616 0.222840 56 -0.162276 0.8717
## Correlation:
## (Intr) TIME NEUROn- AGE
## TIME -0.035
## NEUROneuro-psychiatric -0.449 -0.006
## AGE -0.993 0.007 0.446
## NEUROneuro-psychiatric:AGE 0.451 0.007 -0.996 -0.454
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.21901376 -0.47124584 0.01957349 0.46873450 2.80219031
##
## Number of Observations: 267
## Number of Groups: 60
However, age is a significant fixed effect:
<- lme(MMSE~TIME+NEURO+AGE,random=~1|IDNR,data=hip, na.action=na.exclude)
modelsummary(model)
## Linear mixed-effects model fit by REML
## Data: hip
## AIC BIC logLik
## 1444.22 1465.652 -716.1098
##
## Random effects:
## Formula: ~1 | IDNR
## (Intercept) Residual
## StdDev: 5.411095 2.508976
##
## Fixed effects: MMSE ~ TIME + NEURO + AGE
## Value Std.Error DF t-value p-value
## (Intercept) 57.08459 7.046333 206 8.101319 0e+00
## TIME 0.17627 0.043893 206 4.015965 1e-04
## NEUROneuro-psychiatric -7.75201 1.523968 57 -5.086727 0e+00
## AGE -0.44734 0.089389 57 -5.004366 0e+00
## Correlation:
## (Intr) TIME NEURO-
## TIME -0.044
## NEUROneuro-psychiatric 0.006 0.008
## AGE -0.992 0.011 -0.079
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.22127751 -0.47530674 0.01848392 0.47026467 2.80123379
##
## Number of Observations: 267
## Number of Groups: 60
anova.lme(model)
## numDF denDF F-value p-value
## (Intercept) 1 206 800.2194 <.0001
## TIME 1 206 16.9625 1e-04
## NEURO 1 57 30.2343 <.0001
## AGE 1 57 25.0437 <.0001
Videos on the theory:
SII data
Start from the last model from day 3. Perform a backward model selection. What covariates are important in explaining the mathgain? Discuss the effects in the final model.
Stepwise backward elimination in the model with interactions between sex and other covaraites. The fullmodel:
<-lme(mathgain~sex*(ses+housepov+yearstea+mathkind),random=~1|schoolid/classid,data=sii, na.action = na.omit)
modelsummary(model)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11408.22 11474.17 -5691.11
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 9.195376
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 8.51822 27.23456
##
## Fixed effects: mathgain ~ sex * (ses + housepov + yearstea + mathkind)
## Value Std.Error DF t-value p-value
## (Intercept) 278.63790 14.572595 871 19.120679 0.0000
## sexfemales -11.20361 20.032320 871 -0.559277 0.5761
## ses 5.70738 1.703249 871 3.350879 0.0008
## housepov -5.32490 11.796167 105 -0.451410 0.6526
## yearstea 0.22939 0.141292 204 1.623505 0.1060
## mathkind -0.47700 0.029986 871 -15.907333 0.0000
## sexfemales:ses -0.05452 2.351317 871 -0.023187 0.9815
## sexfemales:housepov -24.27888 12.855978 871 -1.888529 0.0593
## sexfemales:yearstea -0.34430 0.175657 871 -1.960051 0.0503
## sexfemales:mathkind 0.03950 0.041685 871 0.947695 0.3435
## Correlation:
## (Intr) sxfmls ses houspv yearst mthknd sxfmls:s sxfmls:h
## sexfemales -0.672
## ses 0.214 -0.170
## housepov -0.241 0.118 0.096
## yearstea -0.165 0.093 -0.053 0.097
## mathkind -0.974 0.666 -0.231 0.085 0.032
## sexfemales:ses -0.155 0.216 -0.687 -0.073 0.041 0.169
## sexfemales:housepov 0.163 -0.221 -0.084 -0.539 -0.075 -0.080 0.171
## sexfemales:yearstea 0.106 -0.123 0.039 -0.061 -0.615 -0.024 -0.020 0.151
## sexfemales:mathkind 0.657 -0.982 0.181 -0.052 -0.018 -0.675 -0.239 0.093
## sxfmls:y
## sexfemales
## ses
## housepov
## yearstea
## mathkind
## sexfemales:ses
## sexfemales:housepov
## sexfemales:yearstea
## sexfemales:mathkind -0.002
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.89285360 -0.62785115 -0.02861974 0.55658062 4.29948404
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
<-update(model,.~.-sex:ses )
model2summary(model2)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11409.77 11470.66 -5692.884
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 9.196347
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 8.517937 27.22176
##
## Fixed effects: mathgain ~ sex + ses + housepov + yearstea + mathkind + sex:housepov + sex:yearstea + sex:mathkind
## Value Std.Error DF t-value p-value
## (Intercept) 278.59085 14.390994 872 19.358694 0.0000
## sexfemales -11.10523 19.550559 872 -0.568026 0.5702
## ses 5.68033 1.237910 872 4.588646 0.0000
## housepov -5.34830 11.761602 105 -0.454726 0.6502
## yearstea 0.22949 0.141126 204 1.626147 0.1055
## mathkind -0.47689 0.029541 872 -16.143229 0.0000
## sexfemales:housepov -24.22817 12.661608 872 -1.913514 0.0560
## sexfemales:yearstea -0.34437 0.175544 872 -1.961714 0.0501
## sexfemales:mathkind 0.03928 0.040454 872 0.970900 0.3319
## Correlation:
## (Intr) sxfmls ses houspv yearst mthknd sxfmls:h sxfmls:y
## sexfemales -0.662
## ses 0.150 -0.031
## housepov -0.256 0.137 0.063
## yearstea -0.161 0.086 -0.035 0.100
## mathkind -0.973 0.655 -0.160 0.099 0.025
## sexfemales:housepov 0.195 -0.268 0.047 -0.536 -0.083 -0.113
## sexfemales:yearstea 0.104 -0.122 0.034 -0.063 -0.614 -0.021 0.157
## sexfemales:mathkind 0.647 -0.982 0.024 -0.072 -0.008 -0.664 0.140 -0.007
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.89519431 -0.62812346 -0.02846769 0.55761536 4.29985590
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
<-update(model2,.~.-sex:mathkind)
model3summary(model3)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11404.13 11459.96 -5691.066
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 9.205218
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 8.461796 27.23086
##
## Fixed effects: mathgain ~ sex + ses + housepov + yearstea + mathkind + sex:housepov + sex:yearstea
## Value Std.Error DF t-value p-value
## (Intercept) 269.55968 10.978084 873 24.554346 0.0000
## sexfemales 7.53737 3.724963 873 2.023474 0.0433
## ses 5.65186 1.237601 873 4.566791 0.0000
## housepov -4.52070 11.727911 105 -0.385465 0.7007
## yearstea 0.23077 0.141006 204 1.636594 0.1033
## mathkind -0.45787 0.022102 873 -20.716396 0.0000
## sexfemales:housepov -25.97646 12.537727 873 -2.071864 0.0386
## sexfemales:yearstea -0.34347 0.175560 873 -1.956448 0.0507
## Correlation:
## (Intr) sxfmls ses houspv yearst mthknd sxfmls:h
## sexfemales -0.187
## ses 0.176 -0.037
## housepov -0.276 0.351 0.065
## yearstea -0.204 0.412 -0.035 0.100
## mathkind -0.954 0.022 -0.193 0.068 0.026
## sexfemales:housepov 0.137 -0.688 0.044 -0.533 -0.083 -0.026
## sexfemales:yearstea 0.142 -0.674 0.035 -0.063 -0.615 -0.034 0.159
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.88869525 -0.63454292 -0.02443226 0.56174303 4.31673108
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
<-update(model3,.~.-sex:yearstea)
model4summary(model4)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11404.31 11455.07 -5692.154
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 9.152796
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 8.587203 27.24878
##
## Fixed effects: mathgain ~ sex + ses + housepov + yearstea + mathkind + sex:housepov
## Value Std.Error DF t-value p-value
## (Intercept) 272.57162 10.877696 874 25.057844 0.0000
## sexfemales 2.61081 2.754110 874 0.947969 0.3434
## ses 5.73408 1.238130 874 4.631239 0.0000
## housepov -5.96054 11.707259 105 -0.509132 0.6117
## yearstea 0.06115 0.111576 204 0.548012 0.5843
## mathkind -0.45924 0.022113 874 -20.768311 0.0000
## sexfemales:housepov -22.01667 12.390680 874 -1.776874 0.0759
## Correlation:
## (Intr) sxfmls ses houspv yearst mthknd
## sexfemales -0.124
## ses 0.173 -0.018
## housepov -0.270 0.418 0.068
## yearstea -0.150 -0.004 -0.017 0.077
## mathkind -0.959 -0.001 -0.192 0.066 0.007
## sexfemales:housepov 0.118 -0.797 0.039 -0.531 0.019 -0.021
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.86847848 -0.62029105 -0.03509731 0.56863270 4.30389815
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
Now, main effect of yearstea can be removed, as yearstea is no longer part of an interaction term:
<-update(model4,.~.-yearstea)
model5summary(model5)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11400.06 11445.75 -5691.03
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 9.184765
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 8.552127 27.24273
##
## Fixed effects: mathgain ~ sex + ses + housepov + mathkind + sex:housepov
## Value Std.Error DF t-value p-value
## (Intercept) 273.49208 10.752582 874 25.435015 0.0000
## sexfemales 2.62108 2.753277 874 0.951986 0.3414
## ses 5.74628 1.237654 874 4.642884 0.0000
## housepov -6.46999 11.678182 105 -0.554024 0.5807
## mathkind -0.45938 0.022107 874 -20.779831 0.0000
## sexfemales:housepov -22.16862 12.384934 874 -1.789966 0.0738
## Correlation:
## (Intr) sxfmls ses houspv mthknd
## sexfemales -0.126
## ses 0.172 -0.018
## housepov -0.262 0.419 0.069
## mathkind -0.969 -0.001 -0.192 0.066
## sexfemales:housepov 0.122 -0.797 0.039 -0.534 -0.021
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.8763441 -0.6230294 -0.0302178 0.5707610 4.3001268
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
….
<-update(model5,.~.-sex:housepov)
model6summary(model6)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11408.12 11448.74 -5696.06
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 9.061953
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 8.719333 27.25296
##
## Fixed effects: mathgain ~ sex + ses + housepov + mathkind
## Value Std.Error DF t-value p-value
## (Intercept) 275.73178 10.677641 875 25.823287 0.0000
## sexfemales -1.30972 1.664641 875 -0.786789 0.4316
## ses 5.83023 1.237521 875 4.711221 0.0000
## housepov -17.53600 9.848187 105 -1.780632 0.0779
## mathkind -0.46003 0.022116 875 -20.800621 0.0000
## Correlation:
## (Intr) sxfmls ses houspv
## sexfemales -0.048
## ses 0.169 0.021
## housepov -0.234 -0.012 0.107
## mathkind -0.974 -0.029 -0.191 0.065
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.78800233 -0.62708538 -0.01617522 0.54849256 4.29842019
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
….
<-update(model6,.~.-sex)
model7summary(model7)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11409.6 11445.14 -5697.798
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 9.061356
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 8.685155 27.25499
##
## Fixed effects: mathgain ~ ses + housepov + mathkind
## Value Std.Error DF t-value p-value
## (Intercept) 275.32608 10.663464 876 25.819572 0.0000
## ses 5.85139 1.237085 876 4.729980 0.0000
## housepov -17.62905 9.841465 105 -1.791303 0.0761
## mathkind -0.46054 0.022104 876 -20.835602 0.0000
## Correlation:
## (Intr) ses houspv
## ses 0.171
## housepov -0.235 0.107
## mathkind -0.977 -0.191 0.065
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.76624565 -0.63606866 -0.02284166 0.55176293 4.27668545
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
…. final model
<-update(model7,.~.-housepov)
model8summary(model8)
## Linear mixed-effects model fit by REML
## Data: sii
## AIC BIC logLik
## 11417.21 11447.68 -5702.604
##
## Random effects:
## Formula: ~1 | schoolid
## (Intercept)
## StdDev: 8.994781
##
## Formula: ~1 | classid %in% schoolid
## (Intercept) Residual
## StdDev: 8.805618 27.26732
##
## Fixed effects: mathgain ~ ses + mathkind
## Value Std.Error DF t-value p-value
## (Intercept) 270.78470 10.371906 876 26.107515 0
## ses 6.08792 1.230806 876 4.946288 0
## mathkind -0.45788 0.022074 876 -20.742919 0
## Correlation:
## (Intr) ses
## ses 0.203
## mathkind -0.992 -0.199
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.81406721 -0.63030057 -0.02714651 0.56205803 4.27064533
##
## Number of Observations: 1190
## Number of Groups:
## schoolid classid %in% schoolid
## 107 312
Hip data
Fit the model with only main effects, no interaction effects. Obtain the Empirical-Bayes estimates for the patients. Present them in a histogram.
Fit the model
<-lme(MMSE ~ NEURO + TIME +AGE , random=~1|IDNR,data=hip, na.action = na.omit) model
Get the Bayes estimates and plot histogram
<- data.frame(ranef(model)[1])
BE names(BE) <- "Intercept"
$y <- dnorm(BE$Intercept)
BE
ggplot(BE, aes(x = Intercept)) + geom_histogram()
Hip data
Make a dataset with all predictions for the 60 patients.
Get predicted mean (given covariates NEURO, AGE and TIME):
<-predict(model,level=0) predMean
Get prediction for the individual (given covariates NEURO, AGE and TIME, plus individual random intercept):
<-predict(model,level=1)
predIndiv
<- data.frame(cbind(cbind(predMean,predIndiv)))
df %>%
df slice(1:10) %>%
kbl() %>%
kable_paper()
predMean | predIndiv | |
---|---|---|
X1 | 24.15806 | 26.07161 |
X1.1 | 24.51061 | 26.42416 |
X1.2 | 24.86315 | 26.77670 |
X1.3 | 25.39197 | 27.30552 |
X1.4 | 26.09706 | 28.01061 |
X2 | 27.28941 | 24.57311 |
X2.1 | 27.64196 | 24.92566 |
X2.2 | 27.99450 | 25.27820 |
X2.3 | 28.52332 | 25.80702 |
X3 | 27.28941 | 27.18284 |
The toenail dataset is the result of a randomized, double-blind, parallel group, multicenter study for the comparison of two oral treatments (0 and 1) for toenail dermatophyte onychomycosis (TDO). TDO is a common toenail infection, difficult to treat, and affecting more than 2 out of 100 persons. In total, 36 hospitals were randomized. All patients with an infection on the big toenail were randomized in 2 treatment groups and were followed during 12 weeks (3 months) of treatment and followed further up to a total of 12 months. Measurements were taken at baseline, every month during treatment, and every 3 months afterwards, resulting in a maximum of 7 measurements per subject. The measure of interest is the unaffected nail length in mm (‘nailbig’). Also, the age of the patient at the beginning of the study is denoted.
First, make a descriptive plot to see how the mean unaffected nail length evolves on average over time between the two treatment groups. Comment on the graph.
Import and clean data:
<- import("toenail.txt")
nail $treat <- factor(nail$treat, levels = c(0,1), labels = c("placebo", "active treatment"))
nail$id <- factor(nail$id)
nail$center <- factor(nail$center)
nail$timeNum <- nail$time
nail$time <- factor(nail$time, levels = c(0,1,2,3,6,9,12), labels = c("0","1","2","3","6","9","12")) # because not linearly measured nail
Generate the graph:
ggplot(nail, aes(x = time, y = nailbig, color = treat)) +
stat_summary(fun = mean, geom = "point", size = 3) +
geom_jitter(width = 0.3, size = 0.1) + labs(x = "time in months")
In general, the unaffeted nail length is increasing over time in both groups. The increase seems not to be linear with time. The increase seems to be higher in the active group. The variation between subjects is quite large.
For the model we will first choose the random structure, based on a fixed structure with time and the interaction with time and treat, ignoring the covariate age (time as continuous variable).
<- lme(nailbig ~ time + time:treat, random = ~1 | center/id, data = nail, na.action = na.omit)
model summary(model)
## Linear mixed-effects model fit by REML
## Data: nail
## AIC BIC logLik
## 9274.068 9367.866 -4620.034
##
## Random effects:
## Formula: ~1 | center
## (Intercept)
## StdDev: 0.0008601381
##
## Formula: ~1 | id %in% center
## (Intercept) Residual
## StdDev: 2.560856 2.492617
##
## Fixed effects: nailbig ~ time + time:treat
## Value Std.Error DF t-value p-value
## (Intercept) 1.693921 0.2934113 1542 5.773197 0.0000
## time1 0.896659 0.2943568 1542 3.046163 0.0024
## time2 1.987248 0.2972278 1542 6.685941 0.0000
## time3 3.504213 0.3022572 1542 11.593481 0.0000
## time6 5.621216 0.3075350 1542 18.278298 0.0000
## time9 6.224012 0.3207449 1542 19.404868 0.0000
## time12 6.324421 0.3216902 1542 19.659971 0.0000
## time0:treatactive treatment 0.393414 0.4138003 1542 0.950733 0.3419
## time1:treatactive treatment 0.289620 0.4175257 1542 0.693659 0.4880
## time2:treatactive treatment 0.309150 0.4209154 1542 0.734470 0.4628
## time3:treatactive treatment 0.202418 0.4271594 1542 0.473870 0.6357
## time6:treatactive treatment 0.230427 0.4351366 1542 0.529551 0.5965
## time9:treatactive treatment 0.836164 0.4499432 1542 1.858376 0.0633
## time12:treatactive treatment 0.891608 0.4517271 1542 1.973777 0.0486
## Correlation:
## (Intr) time1 time2 time3 time6 time9 time12
## time1 -0.486
## time2 -0.481 0.490
## time3 -0.473 0.482 0.482
## time6 -0.465 0.474 0.474 0.473
## time9 -0.448 0.454 0.454 0.453 0.451
## time12 -0.446 0.453 0.453 0.452 0.449 0.445
## time0:treatactive treatment -0.709 0.345 0.341 0.336 0.330 0.317 0.316
## time1:treatactive treatment -0.360 -0.363 -0.007 -0.007 -0.007 -0.006 -0.006
## time2:treatactive treatment -0.357 -0.007 -0.371 -0.010 -0.010 -0.009 -0.009
## time3:treatactive treatment -0.352 -0.007 -0.011 -0.382 -0.015 -0.013 -0.013
## time6:treatactive treatment -0.345 -0.007 -0.010 -0.015 -0.393 -0.017 -0.017
## time9:treatactive treatment -0.333 -0.007 -0.010 -0.014 -0.018 -0.421 -0.026
## time12:treatactive treatment -0.332 -0.007 -0.010 -0.014 -0.018 -0.026 -0.422
## tm0:tt tm1:tt tm2:tt tm3:tt tm6:tt tm9:tt
## time1
## time2
## time3
## time6
## time9
## time12
## time0:treatactive treatment
## time1:treatactive treatment 0.508
## time2:treatactive treatment 0.504 0.506
## time3:treatactive treatment 0.497 0.498 0.498
## time6:treatactive treatment 0.488 0.489 0.489 0.488
## time9:treatactive treatment 0.471 0.472 0.472 0.471 0.468
## time12:treatactive treatment 0.469 0.470 0.470 0.469 0.466 0.461
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.7888417338 -0.5300417022 0.0001921644 0.5666066228 5.1732155056
##
## Number of Observations: 1854
## Number of Groups:
## center id %in% center
## 26 299
anova.lme(model)
## numDF denDF F-value p-value
## (Intercept) 1 1542 980.5896 <.0001
## time 6 1542 291.9657 <.0001
## time:treat 7 1542 0.9105 0.4972
VarCorr(model)
## Variance StdDev
## center = pdLogChol(1)
## (Intercept) 7.398375e-07 0.0008601381
## id = pdLogChol(1)
## (Intercept) 6.557983e+00 2.5608559349
## Residual 6.213138e+00 2.4926167123
Yes, the variance between subjects is much larger than the variance between hospitals. This makes sense.
<- lme(nailbig ~ time + time:treat, random = ~1 | id, data = nail, na.action = na.omit)
model1 anova(model, model1)
## Model df AIC BIC logLik Test L.Ratio p-value
## model 1 17 9274.068 9367.866 -4620.034
## model1 2 16 9271.112 9359.393 -4619.556 1 vs 2 0.9555801 0.3283
<- nail %>%
nail_var group_by(timeNum, treat) %>%
summarise(Variance = var(nailbig))
%>%
nail_var kbl() %>%
kable_paper()
timeNum | treat | Variance |
---|---|---|
0 | placebo | 5.640835 |
0 | active treatment | 8.572255 |
1 | placebo | 6.440780 |
1 | active treatment | 8.449233 |
2 | placebo | 7.922346 |
2 | active treatment | 9.314727 |
3 | placebo | 10.898467 |
3 | active treatment | 10.144576 |
6 | placebo | 15.517773 |
6 | active treatment | 14.188550 |
9 | placebo | 20.460068 |
9 | active treatment | 20.571358 |
12 | placebo | 27.233707 |
12 | active treatment | 25.052586 |
ggplot(nail_var, aes(x = timeNum, y = Variance, fill = treat)) + geom_bar(stat = "identity") + labs(y = "total variance", x = "time")
The variance increases with time, but about at the same level in both groups.
<- lme(nailbig ~ timeNum + timeNum:treat, random = ~1 + timeNum | id, data = nail, na.action = na.exclude)
model2 summary(model2)
## Linear mixed-effects model fit by REML
## Data: nail
## AIC BIC logLik
## 8723.239 8761.904 -4354.62
##
## Random effects:
## Formula: ~1 + timeNum | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 2.7133258 (Intr)
## timeNum 0.4752377 -0.389
## Residual 1.7750636
##
## Fixed effects: nailbig ~ timeNum + timeNum:treat
## Value Std.Error DF t-value p-value
## (Intercept) 2.5970699 0.16917771 1554 15.351135 0.0000
## timeNum 0.5820213 0.04303633 1554 13.523953 0.0000
## timeNum:treatactive treatment 0.0571236 0.05687196 1554 1.004425 0.3153
## Correlation:
## (Intr) timeNm
## timeNum -0.312
## timeNum:treatactive treatment 0.002 -0.684
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.46797352 -0.51623252 -0.06386359 0.45542574 5.65730820
##
## Number of Observations: 1854
## Number of Groups: 298
anova.lme(model2)
## numDF denDF F-value p-value
## (Intercept) 1 1554 683.0674 <.0001
## timeNum 1 1554 378.9960 <.0001
## timeNum:treat 1 1554 1.0089 0.3153
Remark: R crashes (in my case) when not using time as numeric varible…
anova(model1, model2)
## Model df AIC BIC logLik Test L.Ratio p-value
## model1 1 16 9271.112 9359.393 -4619.556
## model2 2 7 8723.239 8761.904 -4354.620 1 vs 2 529.8732 <.0001
The random slope improves the model…
We will now look at the fixed structure, using the random structure that was chosen.
anova.lme(model2)
## numDF denDF F-value p-value
## (Intercept) 1 1554 683.0674 <.0001
## timeNum 1 1554 378.9960 <.0001
## timeNum:treat 1 1554 1.0089 0.3153
There is a time effect, but the interaction is not significant, thus, no different evolution between groups.
<- lme(nailbig ~ timeNum + age, ~1 + timeNum | id, data = nail, na.action = na.exclude)
model3 summary(model3)
## Linear mixed-effects model fit by REML
## Data: nail
## AIC BIC logLik
## 8725.119 8763.784 -4355.56
##
## Random effects:
## Formula: ~1 + timeNum | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 2.7016795 (Intr)
## timeNum 0.4750058 -0.378
## Residual 1.7748809
##
## Fixed effects: nailbig ~ timeNum + age
## Value Std.Error DF t-value p-value
## (Intercept) 1.1529033 1.3497224 1554 0.854178 0.3931
## timeNum 0.6109973 0.0314125 1554 19.450765 0.0000
## age 0.0225356 0.0208990 1554 1.078312 0.2811
## Correlation:
## (Intr) timeNm
## timeNum -0.041
## age -0.992 -0.011
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.4914541 -0.5200767 -0.0587634 0.4530281 5.6613576
##
## Number of Observations: 1854
## Number of Groups: 298
anova.lme(model3)
## numDF denDF F-value p-value
## (Intercept) 1 1554 670.7432 <.0001
## timeNum 1 1554 378.8407 <.0001
## age 1 1554 1.1628 0.2811
No main effect of age.