Linear mixed models (LMMs) must negotiate the complexity of the model with amount of support by the available data. Often this requires that we determine a model of intermediate complexity, that is a parsimonious mixed model (Bates et al., 2015a, Matuschek et al., 2017). The default strategy for the determination of such a model, is to compare models with likelihood ratio tests (LRTs). A valid LRT requires that the LMMs entering the comparison are identified (not degenerate or overparameterized) and can be ordered in a hierarchically nested sequence.
Here the focus is on the mathematical foundation of an LMM comprising a fixed factor f
with levels A
, B
, and C
and a random factor group g
with levels drawn from a normal distribution. I use the default treatment contrast to estimate the mean of the base (A
) and the two contrasts c1 = B - A
and c2 = C - A
as fixed effects and the corresponding variance parameters (VPs) and correlation parameters (CPs) in the random-effects structure. The demonstration reveals how the lmer syntax translates into the variance-covariance matrix of the random-effects structure. By forcing subsets of variance and correlation parameters to zero, I arrive at a sequence of nested LMMs.
There are two sections. I compare LMMs with the (1 ... g)
specification of the random-effects structure (RES) in the first and LMMs with the (0 ... g)
RES specification in the second section.
(1 + c1 + c2 | g)
specificationLet us consider the following five LMMs, ordered in decreasing complexity, written in the lmer
formula (Bates et al., 2015b).
m1: y ~ 1 + f + (1 + c1 + c2 | g)
m2: y ~ 1 + f + (1 | subj) + (0 + c1 + c2 | g)
m3: y ~ 1 + f + (1 + c1 + c2 || g) # or equivalently
y ~ 1 + f + (1 | g) + (0 + c1 | g) + (0 + c2 | g)
m4: y ~ 1 + f + (1 + c2 || g) # or equivalently
y ~ 1 + f + (1 | g) + (0 + c2 | g)
m5: y ~ 1 + f + (1 | g)
LMM m1
represents the maximal LMM. The LMM estimates 3 VPs corresponding to the fixed effects and 3 CPs.
LMM m3
represents a model of intermediated complexity. It also estimates 3 VPs, but forces the 3 CPs to zero, that is it is a zero-correlation parameter LMM.
LMM m5
is a minimal model comprising only 1 VP for the intercept. Given that we are using a treatment contrast, this VP estimates the variance among elements of the random factor g
for the first level of fixed factor f
.
In terms of model complexity LMM m2
and LMM m4
sort themselves between LMM m1
and LMM m3
and between LMM m3
and LMM m5
, respectively. In LMM m2
, I force one of the CPs to zero and LMM m4
(which like LMM m3
forces all CPs to zero). In addition, I assume that one of the VPs is not supported by the data and can be eliminated from the model.
The script demonstrates that the elements of \(\mathbf\Sigma\) correspond in a transparent way with the model specification in the lmer syntax. Morevoer, LMM m1
to LMM m5
represent a nested hierarchical sequence from maximal to minimal model complexity. Thus, LRTs can be used for comparing models in this sequence.
(0 + c1 + c2 | g)
specificationThe LMMs in this section represent two clusters. The first cluster consists of LMMs m6
, m7
, and m8
. The LMMs m1a
to m4a
in the second cluster corrspond to LMMs m1
to m5
in their random-effects structure (m1a
is a max_LMM
and m3a
is a zcp_LMM
), but they estimate a very different set of VPs and CPs. Each of the two clusters represents a nested model sequence of decreasing complexity.
Sequence 1
m6: y ~ 1 + f + (1 | subj) + (0 + A + B + C || subj)
m7: y ~ 1 + f + (1 | subj) + (1 | factor:subj)
m8: y ~ 1 + f + (1 | factor:subj)
Sequence 2
m1a: y ~ 1 + f + (0 + A + B + C | subj)
m2a: y ~ 1 + f + (0 + A | subj) + (0 + B + C | subj)
m3a: y ~ 1 + f + (0 + A + B + C || subj)
m4a: y ~ 1 + f + (0 + B + C || subj)
Note that LMMs m7
and m8
(as m5
in the first section) do not depend on the specification of the random-effect structure. They are included here because they can be compared with LMM m6
in the first sequence.
m1
Let us begin with the maximal LMM (max_LMM
) with default treatment contrasts both in the fixed-effect part and in the random-effect term. The model estimates 3 VPs and 3 CPs, aside from the residual variance and the fixed effects. The following models are nested under this LMM.
For this model, \(\Sigma\) is of the following form:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 + \sigma_{12} & \sigma_1^2 + \sigma_{13}\\ \sigma_1^2 + \sigma_{12} & \sigma_1^2 + \sigma_2^2 + 2 \sigma_{12} & \sigma_1^2 + \sigma_{12} + \sigma_{13} + \sigma_{23}\\ \sigma_1^2 + \sigma_{13} & \sigma_1^2 + \sigma_{12} + \sigma_{13} + \sigma_{23} & \sigma_1^2 + \sigma_3^2 + 2\sigma_{13}\\ \end{bmatrix}.\]
m2
Let us onsider two versions of LMM m2
. The first version estimates only the correlation between the two contrasts. The second version estimates only the correlation between factor level A
(i.e., intercept) and first contrast.
c1
and contrast c2
Removing correlation parameters involving the intercept (i.e., \(\sigma_{12}\) and \(\sigma_{13}\) ) from \(\Sigma\), we obtain the following matrix:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 + \sigma_2^2 & \sigma_1^2 + \sigma_{23} \\ \sigma_1^2 & \sigma_1^2 + \sigma_{23} & \sigma_1^2 + \sigma_3^2 \\ \end{bmatrix}.\]
A
and contrast c1
In this version of LMM m2
, we estimate the CP for level A
and contrast c1
. After removing \(\sigma_{13}\) and \(\sigma_{23}\) from the \(\Sigma\), we obtain the following specification:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 + \sigma_{12} & \sigma_1^2 \\ \sigma_1^2 + \sigma_{12} & \sigma_1^2 + \sigma_2^2 + 2\sigma_{12} & \sigma_1^2 + \sigma_{12} \\ \sigma_1^2 & \sigma_1^2 + \sigma_{12} & \sigma_1^2 + \sigma_3^2 \\ \end{bmatrix}.\]
m3
LMM m3
(zcp_LMM) estimates only VPs, correlation parameters are forced to zero.
The \(\Sigma\) on the following form:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 + \sigma_2^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 + \sigma_3^2 \\ \end{bmatrix}.\]
The alternative specification for this LMM m3
corresponds to the following decomposition of the random-effect variance-covariance matrix:
\[\mathbf \Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0\\ 0 & \sigma_2^2 & 0\\ 0& 0& 0\\ \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0& 0& \sigma_3^2\\ \end{bmatrix}\]
m4
Without CPs and with the additional removal of the VC for the first contrast c1
\(\Sigma\) has the following form:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 + \sigma_3^2 \\ \end{bmatrix}.\]
The alternative specification for this LMM m4
corresponds to the following decomposition of the random-effect variance-covariance matrix:
\[\mathbf \Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0& 0& \sigma_3^2\\ \end{bmatrix}\]
m5
Finally, and somewhat trivially by now, for the minimal LMM m5
, \(\Sigma\) looks like this.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \end{bmatrix}.\]
m6
\[\mathbf\Sigma = \begin{bmatrix} \sigma_A^2 & \sigma_{AB} & \sigma_{AC} \\ \sigma_{AB} & \sigma_B^2 & \sigma_{BC} \\ \sigma_{AC} & \sigma_{BC} & \sigma_C^2 \\ \end{bmatrix}.\]
m7
This is the interaction LMM (int_LMM
) that does not fit in the hierarchical sequence. Here is the reason why. Its \(\Sigma\) looks like this.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 + \sigma_2^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 + \sigma_2^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 + \sigma_2^2 \\ \end{bmatrix}.\]
Interestingly, the model implements the assumption of equal covariances, but would require also some special assumptions about the diagonal elements to fit in the hierarchy. These assumptions would depend on the contrast and, at first glance, don’t offer much of an option, if a treatment contrast is chosen.
However, it fits very well into the hieararchy with LMM m1a
, as shown in the next section.
m8
LMM m8
is also a minimal LMM – like LMM m5
. They differ, however, in that here the levels of the random factor represent unique combinations of Machine:Worker
. This model is nested under the interaction LMM m7
.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \end{bmatrix}.\]
m1a
Here is the general from for LMM m1a
, that is the maximal LMM for the factor levels.
In the standard notation the model can be expressed as: \[y_{ijk} = \mu + \alpha_i + d_{ij} + e_{ijk}, \quad \mathbf d_j \sim N(\mathbf 0, \mathbf\Sigma), \quad e_{ijk} \sim N(0, \sigma^2)\]
where \(i=1, 2, 3\), \(j=1, \ldots, 6\), \(k=1, 2, 3\), \(\alpha_i\) is the effect of the \(i\)th machine and \(d_{ij}\) is the random effect of the \(j\)th worker on the \(i\)th machine, \(\mathbf d_j = (d_{1j}, d_{2j}, d_{3_j})^\top\) and \[\mathbf\Sigma = \begin{bmatrix} \sigma_A^2 & \sigma_{AB} & \sigma_{AC} \\ \sigma_{AB} & \sigma_B^2 & \sigma_{BC} \\ \sigma_{AC} & \sigma_{BC} & \sigma_C^2 \\ \end{bmatrix}.\]
where \(\sigma_A^2 \geq 0\), \(\sigma_C^2 \geq 0\), \(\sigma_C^2 \geq 0\).
There is a one-to-one corresponds between VPs and CPs in the lmer
output and the elements of \(\mathbf\Sigma\). This is true for all the following models.
I derive variance and covariance elements of \(\Sigma\) for treatment contrasts from this model. This should give us the entries for the general form of LMM m1
.
Variance
The variance for the first contrast B - A
is:
\[\begin{align} \mathsf{Var}(Y_{2jk} - Y_{1jk}) =~& \mathsf{Var}(Y_{2jk}) + \mathsf{Var}(Y_{1jk}) - 2 \mathsf{Cov}(Y_{2jk}, Y_{1jk}) \\ =~& (\sigma_B^2 + \sigma^2) + (\sigma_A^2 + \sigma^2) - 2\sigma_{AB} \\ =~& \sigma_A^2 + \sigma_B^2 - 2\sigma_{AB} + 2\sigma^2 \end{align}\]
Covariance
The covariance between the two contrasts B - A
and C - A
is \[\begin{align} \mathsf{Cov}(Y_{2jk} - Y_{1jk}, Y_{3jk} - Y_{1jk}) =&~ \mathsf{Cov}(Y_{2jk}, Y_{3jk}) - \mathsf{Cov}(Y_{2jk}, Y_{1jk}) - \mathsf{Cov}(Y_{1jk}, Y_{3jk}) + \mathsf{Cov}(Y_{1jk}, Y_{1jk}) \\
=~& \sigma_{BC} - \sigma_{AB} - \sigma_{AC} + (\sigma_A^2 + \sigma^2) \\
=~& \sigma_A^2 - \sigma_{AB} - \sigma_{AC} + \sigma_{BC} + \sigma^2
\end{align}\]
In principle, the variance and covariance for the contrasts based on the factor levels should correspond to the elements of LMM m1
, but note that \(\sigma_{AB}\) and \(\sigma_{AC}\) enter with a negativ and \(\sigma_{12}\) and \(\sigma_{13}\) with a positive sign. Fortunately, this can be reconciled if we specify c1 = A - B
and c2 = A - C
(see Illustration in 3.1).
m2a
LMM m2a
forces a zero CPs between level A
and the other two levels. Thus, only the CP between B
and C
is estimated.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_A^2 & 0 & 0 \\ 0 & \sigma_B^2 & \sigma_{BC} \\ 0 & \sigma_{BC} & \sigma_C^2 \\ \end{bmatrix}.\]
Model parameters in \(\mathbf\Sigma\) correspond directly to those reported in the lmer
output.
m3a
LMM m3a
is the zcp_LMM
for this specification; VPs estimate the variance of g
performance associated with the three levels of the factor.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_A^2 & 0 & 0 \\ 0 & \sigma_B^2 & 0 \\ 0 & 0 & \sigma_C^2 \\ \end{bmatrix}.\]
Model parameters in \(\mathbf\Sigma\) correspond directly to those reported in the lmer
output.
m4a
LMM m4
represent the assumption that there the data do not support a reliable VC for level A
.
\[\mathbf\Sigma = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \sigma_B^2 & 0 \\ 0 & 0 & \sigma_C^2 \\ \end{bmatrix}.\]
Model parameters in \(\mathbf\Sigma\) correspond directly to those reported in the lmer
output.
Here I illustrate that the entries of \(\mathbf\Sigma\) for the various LMMs can be computed from the estimates of VP and CP. I use the Machines
data which are part of the MEMSS
package [efficiency scores of 6 workers (random factor) tested on each of 3 machines (fixed factor) for 3 times ].
For LMM m1
to LMM m4
I extract the intercept and two indicator variables c1
and c2
coding the treatment-contrast specification from the model matrix.
For LMM m6
and LMMs m8
to m12
I extract the three levelsA
, B
, and C
of the factor from the mode matrix.
Results for LMMs m5
, m7
, and m8
do not depend on the contrast-based or level-based specification. These models estimate only one or (in the case of m7
two) variance parameters.
m1
contrasts(Machines$Machine) <- contr.treatment(3, base=1)
mm <- model.matrix(~ 1 + Machine, Machines)
c1 <- mm[, 2]
c2 <- mm[, 3]
m1 <- lmer(score ~ 1 + Machine + (1 + c1 + c2 | Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m1)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.8157529 3.7169548
## 2 Worker c1 <NA> 28.6861701 5.3559472
## 3 Worker c2 <NA> 11.2431151 3.3530755
## 4 Worker (Intercept) c1 9.7215585 0.4883288
## 5 Worker (Intercept) c2 -4.5269756 -0.3632265
## 6 Worker c1 c2 5.3097436 0.2956609
## 7 Residual <NA> <NA> 0.9246296 0.9615766
# Compute Sigma values according to general form for m1
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[c(1:2), "vcov"]) + 2*vc[4, "vcov"],
sum(vc[c(1,3), "vcov"]) + 2*vc[5, "vcov"],
vc[1, "vcov"] + vc[4, "vcov"],
vc[1, "vcov"] + vc[5, "vcov"],
vc[1, "vcov"] + sum(vc[4:6, "vcov"])),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.815753 61.945040 16.004917 23.537311 9.288777 24.320079
V_m1 <- getME_V(m1)
round(V_m1[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 23.54 9.29
## 19 23.54 61.95 24.32
## 39 9.29 24.32 16.00
The diagonal and off-diagonal values in \(\Sigma\) computed from the lmer
model parameter estimates according to the general form for LMM m1
match the values of the 3 x 3 block pattern in V_m1
.
The \(\Sigma\) of LMM m1
differs from the \(\Sigma\) of LMM m1a
in that \(\sigma_{12}\) and \(\sigma_{13}\) enter the computation with a negative sign. The is simple due to reversing the direction of subtraction for contrasts c1
and c2
. With this specification, I also need to subtract \(\sigma_{12}\) and \(\sigma_{13}\) to achieve the match.
c1_ <- -c1
c2_ <- -c2
m1_ <- lmer(score ~ 1 + Machine + (1 + c1_ + c2_ | Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m1_)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.8157433 3.7169535
## 2 Worker c1_ <NA> 28.6861865 5.3559487
## 3 Worker c2_ <NA> 11.2431178 3.3530759
## 4 Worker (Intercept) c1_ -9.7215504 -0.4883284
## 5 Worker (Intercept) c2_ 4.5269789 0.3632269
## 6 Worker c1_ c2_ 5.3097474 0.2956610
## 7 Residual <NA> <NA> 0.9246296 0.9615766
# For inverted contrasts need to subtract sigma_12 and sigma_13 (see general form for m1a)
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[c(1:2), "vcov"]) - 2*vc[4, "vcov"],
sum(vc[c(1,3), "vcov"]) - 2*vc[5, "vcov"],
vc[1, "vcov"] - vc[4, "vcov"],
vc[1, "vcov"] - vc[5, "vcov"],
vc[1, "vcov"] - sum(vc[4:5, "vcov"]) + vc[6, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.815743 61.945031 16.004903 23.537294 9.288764 24.320062
V_m1_ <- getME_V(m1_)
round(V_m1_[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 23.54 9.29
## 19 23.54 61.95 24.32
## 39 9.29 24.32 16.00
m2
c1
and contrast c2
m2_c1c2 <- lmer(score ~ 1 + Machine + (1 | Worker) + (0 + c1 + c2 | Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m2_c1c2)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.7151414 3.7033959
## 2 Worker.1 c1 <NA> 29.1133286 5.3956768
## 3 Worker.1 c2 <NA> 11.0432239 3.3231347
## 4 Worker.1 c1 c2 5.4237504 0.3024864
## 5 Residual <NA> <NA> 0.9257324 0.9621499
# Compute Sigma values according to general form for m2_c1c2
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[c(1:2), "vcov"]),
sum(vc[c(1,3), "vcov"]),
vc[1, "vcov"],
vc[1, "vcov"],
vc[1, "vcov"] + vc[4, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.71514 42.82847 24.75837 13.71514 13.71514 19.13889
V_c1c2 <- getME_V(m2_c1c2)
round(V_c1c2[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.72 13.72 13.72
## 19 13.72 42.83 19.14
## 39 13.72 19.14 24.76
All computed values match with the contents of V_c1_c2
.
A
and contrast c1
m2_Ac1 <- lmer(score ~ 1 + Machine + (1 + c1 | Worker) + (0 + c2 | Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m2_Ac1)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.5634264 3.6828557
## 2 Worker c1 <NA> 28.3878238 5.3280225
## 3 Worker (Intercept) c1 9.9954824 0.5093930
## 4 Worker.1 c2 <NA> 10.6339867 3.2609794
## 5 Residual <NA> <NA> 0.9329676 0.9659025
# Compute Sigma values according to general form for m2_Ac1
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[c(1:2), "vcov"]) + 2*vc[3, "vcov"],
sum(vc[c(1,4), "vcov"]),
vc[1, "vcov"] + vc[3, "vcov"],
vc[1, "vcov"],
vc[1, "vcov"] + vc[3, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.56343 61.94221 24.19741 23.55891 13.56343 23.55891
V_m2_Ac1 <- getME_V(m2_Ac1)
round(V_m2_Ac1[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.56 23.56 13.56
## 19 23.56 61.94 23.56
## 39 13.56 23.56 24.20
Other off-diagonal values agree with the VP estimate for (\(\sigma_1^2\)). Thus, all computed values match with the contents of V_m2_Ac1
.
m3
The worker-level covariance matrix (ignoring the residual variance) has the following structure:
m3 <- lmer(score ~ 1 + Machine + (1 + c1 + c2 || Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m3)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.7771273 3.7117553
## 2 Worker.1 c1 <NA> 28.8274937 5.3691241
## 3 Worker.2 c2 <NA> 10.9321153 3.3063749
## 4 Residual <NA> <NA> 0.9265345 0.9625666
# Compute Sigma values according to general form for m3
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[1:2, "vcov"]),
sum(vc[c(1,3), "vcov"]),
vc[1, "vcov"], vc[1, "vcov"], vc[1, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.77713 42.60462 24.70924 13.77713 13.77713 13.77713
V_m3 <- getME_V(m3)
round(V_m3[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.78 13.78 13.78
## 19 13.78 42.60 13.78
## 39 13.78 13.78 24.71
I compute the entries for V1_m3
from the estimates of model parameters as shown in the general form of \(\Sigma\) for this model. The values are in agreement.
m4
Here I drop the VP for c1
, that is I assume there on reliable worker-related variance for the difference between Machines A
and B
. I also assume that none of the CPs can be estimated reliably. So there are only two VP parameters, one for level A
and the other for the second contrast c2
.
m4 <- lmer(score ~ 1 + Machine + (1 + c2 || Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m4)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 27.221350 5.217408
## 2 Worker.1 c2 <NA> 8.911068 2.985141
## 3 Residual <NA> <NA> 7.159796 2.675780
# Compute Sigma values according to general form for m3
(sigmas <- setNames(c(vc[1, "vcov"], vc[1, "vcov"],
sum(vc[c(1,2), "vcov"]),
vc[1, "vcov"], vc[1, "vcov"], vc[1, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 27.22135 27.22135 36.13242 27.22135 27.22135 27.22135
V_m4 <- getME_V(m4)
round(V_m4[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 27.22 27.22 27.22
## 19 27.22 27.22 27.22
## 39 27.22 27.22 36.13
m5
The worker-level covariance matrix (ignoring the residual variance) has the following structure:
m5 <- lmer(score ~ 1 + Machine + (1 | Worker), data=Machines, REML=FALSE)
(vc <- data.frame(VarCorr(m5)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 21.933665 4.683339
## 2 Residual <NA> <NA> 9.579514 3.095079
(sigmas <- setNames(vc[1, "vcov"], c("sigma_1^2")))
## sigma_1^2
## 21.93367
V_m5 <- getME_V(m5)
round(V_m5[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 21.93 21.93 21.93
## 19 21.93 21.93 21.93
## 39 21.93 21.93 21.93
The relation between the parameters in \(\mathbf\Sigma\) and what lmer
reports is transparent.
m6
mm <- model.matrix(~ 0 + Machine, Machines)
A <- mm[, 1]
B <- mm[, 2]
C <- mm[, 3]
m6 <- lmer(score ~ 1 + Machine + (1 | Worker) + (0 + A + B + C || Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m6)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 11.9444977 3.4560813
## 2 Worker.1 A <NA> 3.0860039 1.7567026
## 3 Worker.2 B <NA> 28.7020858 5.3574328
## 4 Worker.3 C <NA> 6.6958839 2.5876406
## 5 Residual <NA> <NA> 0.9246296 0.9615766
(sigmas <- setNames(c(sum(vc[1:2, "vcov"]) , sum(vc[c(1,3), "vcov"]), sum(vc[c(1,4), "vcov"]),
vc[1, "vcov"], vc[1, "vcov"], vc[1, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 15.03050 40.64658 18.64038 11.94450 11.94450 11.94450
V_m6 <- getME_V(m6)
round(V_m6[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 15.03 11.94 11.94
## 19 11.94 40.65 11.94
## 39 11.94 11.94 18.64
m7
Here I analyze the interaction LMM (int_LMM
).
m7 <- lmer(score ~ 1 + Machine + (1| Worker) + (1 | Machine:Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m7)))
## grp var1 var2 vcov sdcor
## 1 Machine:Worker (Intercept) <NA> 11.5398455 3.3970348
## 2 Worker (Intercept) <NA> 19.0487024 4.3644819
## 3 Residual <NA> <NA> 0.9246296 0.9615766
(sigmas <- setNames(c(sum(vc[1:2, "vcov"]) , sum(vc[1:2, "vcov"]), sum(vc[1:2, "vcov"]),
vc[2, "vcov"], vc[2, "vcov"], vc[2, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 30.58855 30.58855 30.58855 19.04870 19.04870 19.04870
V_m7 <- getME_V(m7)
round(V_m7[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 30.59 19.05 19.05
## 19 19.05 30.59 19.05
## 39 19.05 19.05 30.59
The int_LMM
is simpler than the LMM m6
in that in addition to its correlation parameters (CPs), the variance parameters (VP) are constrained to be equal. Therefore, this LMM is nested under m6
.
Note (again?) that irrespective of number of levels, the model always estimates only two parameters. From the decomposition it is transparent that the VP associated with the interaction term is added to the \(\Sigma\) diagonal.
m8
m8 <- lmer(score ~ 1 + Machine + (1 | Machine:Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m8)))
## grp var1 var2 vcov sdcor
## 1 Machine:Worker (Intercept) <NA> 30.5885492 5.5306916
## 2 Residual <NA> <NA> 0.9246296 0.9615766
V_m8 <- getME_V(m8)
round(V_m8[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 30.59 . .
## 19 . 30.59 .
## 39 . . 30.59
The relation between the parameters in \(\mathbf\Sigma\) and what lmer
reports is transparent.
m1a
m1a <- lmer(score ~ 1 + Machine + (0 + A + B + C | Worker), data=Machines, REML=FALSE)
(vc1a <- as.data.frame(VarCorr(m1a)))
## grp var1 var2 vcov sdcor
## 1 Worker A <NA> 13.8157362 3.7169525
## 2 Worker B <NA> 61.9449805 7.8705134
## 3 Worker C <NA> 16.0049005 4.0006125
## 4 Worker A B 23.5372734 0.8045742
## 5 Worker A C 9.2887580 0.6246606
## 6 Worker B C 24.3200489 0.7723869
## 7 Residual <NA> <NA> 0.9246297 0.9615767
V_m1a <- getME_V(m1a)
round(V_m1a[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 23.54 9.29
## 19 23.54 61.94 24.32
## 39 9.29 24.32 16.00
m2a
m2a <- lmer(score ~ 1 + Machine + (0 + A | Worker) + (0 + B + C | Worker), data=Machines, REML=FALSE)
(vc2a <- as.data.frame(VarCorr(m2a)))
## grp var1 var2 vcov sdcor
## 1 Worker A <NA> 13.8157401 3.7169531
## 2 Worker.1 B <NA> 61.9449925 7.8705141
## 3 Worker.1 C <NA> 16.0049076 4.0006134
## 4 Worker.1 B C 24.3200601 0.7723870
## 5 Residual <NA> <NA> 0.9246296 0.9615766
V_m2a <- getME_V(m2a)
round(V_m2a[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 . .
## 19 . 61.94 24.32
## 39 . 24.32 16.00
m3a
m3a <- lmer(score ~ 1 + Machine + (0 + A + B + C || Worker), data=Machines, REML=FALSE)
(vc3a <- as.data.frame(VarCorr(m3a)))
## grp var1 var2 vcov sdcor
## 1 Worker A <NA> 13.8157407 3.7169532
## 2 Worker.1 B <NA> 61.9450011 7.8705147
## 3 Worker.2 C <NA> 16.0049075 4.0006134
## 4 Residual <NA> <NA> 0.9246296 0.9615766
V_m3a <- getME_V(m3a)
round(V_m3a[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 . .
## 19 . 61.95 .
## 39 . . 16
m4a
m4a <- lmer(score ~ 1 + Machine + (0 + B + C || Worker), data=Machines, REML=FALSE)
(vc4a <- as.data.frame(VarCorr(m4a)))
## grp var1 var2 vcov sdcor
## 1 Worker B <NA> 59.971321 7.744115
## 2 Worker.1 C <NA> 14.031230 3.745828
## 3 Residual <NA> <NA> 6.845661 2.616421
V_m4a <- getME_V(m4a)
round(V_m4a[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 . . .
## 19 . 59.97 .
## 39 . . 14.03
The hiearchical structure of LMMs m1
to m4
can also be seen in comparison of visualization based on estimates for the variance-covariance matrix V
based on 3 machines x 6 workers x 3 measures (i.e., 54 x 54).
Vs <- list(V_m1, V_m2_Ac1, V_m3, V_m4)
for(i in 1:4) print(image(Vs[[i]], asp=1))
This exercise demonstrates that LMM m1
to LMM m5
represent a nested hierarchical sequence from maximal to minimal model complexity. Thus, LRTs can be used for model comparison. The elements of \(\mathbf\Sigma\) correspond in a transparent way with the model specification in the lmer syntax. The decomposition of the three-level-factor LMMs described is quite transparent and suggestive of the analagous decomposition of factors with two or four or more levels.
Related RPubs:
This script was motivated and inspired by exchanges with Rune H. B. Christensen. I gratefully acknowledge his insights and comments.
sessionInfo()
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS 10.14
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] lme4_1.1-18-1 Matrix_1.2-14
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.18 lattice_0.20-35 digest_0.6.17
## [4] MASS_7.3-50 grid_3.5.1 nlme_3.1-137
## [7] magrittr_1.5 evaluate_0.11 stringi_1.2.4
## [10] minqa_1.2.4 nloptr_1.0.4 rmarkdown_1.10.11
## [13] splines_3.5.1 htmldeps_0.1.1 tools_3.5.1
## [16] stringr_1.3.1 yaml_2.2.0 compiler_3.5.1
## [19] htmltools_0.3.6 knitr_1.20