Linear mixed models (LMMs) must negotiate the complexity of the model with amount of support by the available data. Often this requires that we determine a model of intermediate complexity, that is a parsimonious mixed model (Bates et al., 2015a, Matuschek et al., 2017). The default strategy for the determination of such a model, is to compare models with likelihood ratio tests (LRTs). A valid LRT requires that the LMMs entering the comparison are identified (not degenerate or overparameterized) and can be ordered in a hierarchically nested sequence.
Here the focus is on the mathematical foundation of an LMM comprising a fixed factor f with levels A, B, and C and a random factor group g with levels drawn from a normal distribution. I use the default treatment contrast to estimate the mean of the base (A) and the two contrasts c1 = B - A and c2 = C - A as fixed effects and the corresponding variance parameters (VPs) and correlation parameters (CPs) in the random-effects structure. The demonstration reveals how the lmer syntax translates into the variance-covariance matrix of the random-effects structure. By forcing subsets of variance and correlation parameters to zero, I arrive at a sequence of nested LMMs.
There are two sections. I compare LMMs with the (1 ... g) specification of the random-effects structure (RES) in the first and LMMs with the (0 ... g) RES specification in the second section.
(1 + c1 + c2 | g) specificationLet us consider the following five LMMs, ordered in decreasing complexity, written in the lmerformula (Bates et al., 2015b).
m1: y ~ 1 + f + (1 + c1 + c2 | g)
m2: y ~ 1 + f + (1 | subj) + (0 + c1 + c2 | g)
m3: y ~ 1 + f + (1 + c1 + c2 || g) # or equivalently
y ~ 1 + f + (1 | g) + (0 + c1 | g) + (0 + c2 | g)
m4: y ~ 1 + f + (1 + c2 || g) # or equivalently
y ~ 1 + f + (1 | g) + (0 + c2 | g)
m5: y ~ 1 + f + (1 | g)
LMM m1 represents the maximal LMM. The LMM estimates 3 VPs corresponding to the fixed effects and 3 CPs.
LMM m3 represents a model of intermediated complexity. It also estimates 3 VPs, but forces the 3 CPs to zero, that is it is a zero-correlation parameter LMM.
LMM m5 is a minimal model comprising only 1 VP for the intercept. Given that we are using a treatment contrast, this VP estimates the variance among elements of the random factor g for the first level of fixed factor f.
In terms of model complexity LMM m2 and LMM m4 sort themselves between LMM m1 and LMM m3 and between LMM m3 and LMM m5, respectively. In LMM m2, I force one of the CPs to zero and LMM m4 (which like LMM m3 forces all CPs to zero). In addition, I assume that one of the VPs is not supported by the data and can be eliminated from the model.
The script demonstrates that the elements of \(\mathbf\Sigma\) correspond in a transparent way with the model specification in the lmer syntax. Morevoer, LMM m1 to LMM m5 represent a nested hierarchical sequence from maximal to minimal model complexity. Thus, LRTs can be used for comparing models in this sequence.
(0 + c1 + c2 | g) specificationThe LMMs in this section represent two clusters. The first cluster consists of LMMs m6, m7, and m8. The LMMs m1a to m4a in the second cluster corrspond to LMMs m1to m5 in their random-effects structure (m1ais a max_LMM and m3a is a zcp_LMM), but they estimate a very different set of VPs and CPs. Each of the two clusters represents a nested model sequence of decreasing complexity.
Sequence 1
m6: y ~ 1 + f + (1 | subj) + (0 + A + B + C || subj)
m7: y ~ 1 + f + (1 | subj) + (1 | factor:subj)
m8: y ~ 1 + f + (1 | factor:subj)
Sequence 2
m1a: y ~ 1 + f + (0 + A + B + C | subj)
m2a: y ~ 1 + f + (0 + A | subj) + (0 + B + C | subj)
m3a: y ~ 1 + f + (0 + A + B + C || subj)
m4a: y ~ 1 + f + (0 + B + C || subj)
Note that LMMs m7 and m8 (as m5 in the first section) do not depend on the specification of the random-effect structure. They are included here because they can be compared with LMM m6 in the first sequence.
m1Let us begin with the maximal LMM (max_LMM) with default treatment contrasts both in the fixed-effect part and in the random-effect term. The model estimates 3 VPs and 3 CPs, aside from the residual variance and the fixed effects. The following models are nested under this LMM.
For this model, \(\Sigma\) is of the following form:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 + \sigma_{12} & \sigma_1^2 + \sigma_{13}\\ \sigma_1^2 + \sigma_{12} & \sigma_1^2 + \sigma_2^2 + 2 \sigma_{12} & \sigma_1^2 + \sigma_{12} + \sigma_{13} + \sigma_{23}\\ \sigma_1^2 + \sigma_{13} & \sigma_1^2 + \sigma_{12} + \sigma_{13} + \sigma_{23} & \sigma_1^2 + \sigma_3^2 + 2\sigma_{13}\\ \end{bmatrix}.\]
m2Let us onsider two versions of LMM m2. The first version estimates only the correlation between the two contrasts. The second version estimates only the correlation between factor level A (i.e., intercept) and first contrast.
c1 and contrast c2Removing correlation parameters involving the intercept (i.e., \(\sigma_{12}\) and \(\sigma_{13}\) ) from \(\Sigma\), we obtain the following matrix:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 + \sigma_2^2 & \sigma_1^2 + \sigma_{23} \\ \sigma_1^2 & \sigma_1^2 + \sigma_{23} & \sigma_1^2 + \sigma_3^2 \\ \end{bmatrix}.\]
A and contrast c1In this version of LMM m2, we estimate the CP for level Aand contrast c1. After removing \(\sigma_{13}\) and \(\sigma_{23}\) from the \(\Sigma\), we obtain the following specification:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 + \sigma_{12} & \sigma_1^2 \\ \sigma_1^2 + \sigma_{12} & \sigma_1^2 + \sigma_2^2 + 2\sigma_{12} & \sigma_1^2 + \sigma_{12} \\ \sigma_1^2 & \sigma_1^2 + \sigma_{12} & \sigma_1^2 + \sigma_3^2 \\ \end{bmatrix}.\]
m3LMM m3 (zcp_LMM) estimates only VPs, correlation parameters are forced to zero.
The \(\Sigma\) on the following form:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 + \sigma_2^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 + \sigma_3^2 \\ \end{bmatrix}.\]
The alternative specification for this LMM m3 corresponds to the following decomposition of the random-effect variance-covariance matrix:
\[\mathbf \Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0\\ 0 & \sigma_2^2 & 0\\ 0& 0& 0\\ \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0& 0& \sigma_3^2\\ \end{bmatrix}\]
m4Without CPs and with the additional removal of the VC for the first contrast c1 \(\Sigma\) has the following form:
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 + \sigma_3^2 \\ \end{bmatrix}.\]
The alternative specification for this LMM m4 corresponds to the following decomposition of the random-effect variance-covariance matrix:
\[\mathbf \Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0& 0& \sigma_3^2\\ \end{bmatrix}\]
m5Finally, and somewhat trivially by now, for the minimal LMM m5, \(\Sigma\) looks like this.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \end{bmatrix}.\]
m6\[\mathbf\Sigma = \begin{bmatrix} \sigma_A^2 & \sigma_{AB} & \sigma_{AC} \\ \sigma_{AB} & \sigma_B^2 & \sigma_{BC} \\ \sigma_{AC} & \sigma_{BC} & \sigma_C^2 \\ \end{bmatrix}.\]
m7This is the interaction LMM (int_LMM) that does not fit in the hierarchical sequence. Here is the reason why. Its \(\Sigma\) looks like this.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 + \sigma_2^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 + \sigma_2^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 + \sigma_2^2 \\ \end{bmatrix}.\]
Interestingly, the model implements the assumption of equal covariances, but would require also some special assumptions about the diagonal elements to fit in the hierarchy. These assumptions would depend on the contrast and, at first glance, don’t offer much of an option, if a treatment contrast is chosen.
However, it fits very well into the hieararchy with LMM m1a, as shown in the next section.
m8LMM m8 is also a minimal LMM – like LMM m5. They differ, however, in that here the levels of the random factor represent unique combinations of Machine:Worker. This model is nested under the interaction LMM m7.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \end{bmatrix}.\]
m1aHere is the general from for LMM m1a, that is the maximal LMM for the factor levels.
In the standard notation the model can be expressed as: \[y_{ijk} = \mu + \alpha_i + d_{ij} + e_{ijk}, \quad \mathbf d_j \sim N(\mathbf 0, \mathbf\Sigma), \quad e_{ijk} \sim N(0, \sigma^2)\]
where \(i=1, 2, 3\), \(j=1, \ldots, 6\), \(k=1, 2, 3\), \(\alpha_i\) is the effect of the \(i\)th machine and \(d_{ij}\) is the random effect of the \(j\)th worker on the \(i\)th machine, \(\mathbf d_j = (d_{1j}, d_{2j}, d_{3_j})^\top\) and \[\mathbf\Sigma = \begin{bmatrix} \sigma_A^2 & \sigma_{AB} & \sigma_{AC} \\ \sigma_{AB} & \sigma_B^2 & \sigma_{BC} \\ \sigma_{AC} & \sigma_{BC} & \sigma_C^2 \\ \end{bmatrix}.\]
where \(\sigma_A^2 \geq 0\), \(\sigma_C^2 \geq 0\), \(\sigma_C^2 \geq 0\).
There is a one-to-one corresponds between VPs and CPs in the lmer output and the elements of \(\mathbf\Sigma\). This is true for all the following models.
I derive variance and covariance elements of \(\Sigma\) for treatment contrasts from this model. This should give us the entries for the general form of LMM m1.
Variance
The variance for the first contrast B - A is:
\[\begin{align} \mathsf{Var}(Y_{2jk} - Y_{1jk}) =~& \mathsf{Var}(Y_{2jk}) + \mathsf{Var}(Y_{1jk}) - 2 \mathsf{Cov}(Y_{2jk}, Y_{1jk}) \\ =~& (\sigma_B^2 + \sigma^2) + (\sigma_A^2 + \sigma^2) - 2\sigma_{AB} \\ =~& \sigma_A^2 + \sigma_B^2 - 2\sigma_{AB} + 2\sigma^2 \end{align}\]
Covariance
The covariance between the two contrasts B - A and C - A is \[\begin{align} \mathsf{Cov}(Y_{2jk} - Y_{1jk}, Y_{3jk} - Y_{1jk}) =&~ \mathsf{Cov}(Y_{2jk}, Y_{3jk}) - \mathsf{Cov}(Y_{2jk}, Y_{1jk}) - \mathsf{Cov}(Y_{1jk}, Y_{3jk}) + \mathsf{Cov}(Y_{1jk}, Y_{1jk}) \\
=~& \sigma_{BC} - \sigma_{AB} - \sigma_{AC} + (\sigma_A^2 + \sigma^2) \\
=~& \sigma_A^2 - \sigma_{AB} - \sigma_{AC} + \sigma_{BC} + \sigma^2
\end{align}\]
In principle, the variance and covariance for the contrasts based on the factor levels should correspond to the elements of LMM m1, but note that \(\sigma_{AB}\) and \(\sigma_{AC}\) enter with a negativ and \(\sigma_{12}\) and \(\sigma_{13}\) with a positive sign. Fortunately, this can be reconciled if we specify c1 = A - B and c2 = A - C (see Illustration in 3.1).
m2aLMM m2a forces a zero CPs between level A and the other two levels. Thus, only the CP between B and C is estimated.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_A^2 & 0 & 0 \\ 0 & \sigma_B^2 & \sigma_{BC} \\ 0 & \sigma_{BC} & \sigma_C^2 \\ \end{bmatrix}.\]
Model parameters in \(\mathbf\Sigma\) correspond directly to those reported in the lmer output.
m3aLMM m3a is the zcp_LMM for this specification; VPs estimate the variance of g performance associated with the three levels of the factor.
\[\mathbf\Sigma = \begin{bmatrix} \sigma_A^2 & 0 & 0 \\ 0 & \sigma_B^2 & 0 \\ 0 & 0 & \sigma_C^2 \\ \end{bmatrix}.\]
Model parameters in \(\mathbf\Sigma\) correspond directly to those reported in the lmer output.
m4aLMM m4 represent the assumption that there the data do not support a reliable VC for level A.
\[\mathbf\Sigma = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \sigma_B^2 & 0 \\ 0 & 0 & \sigma_C^2 \\ \end{bmatrix}.\]
Model parameters in \(\mathbf\Sigma\) correspond directly to those reported in the lmer output.
Here I illustrate that the entries of \(\mathbf\Sigma\) for the various LMMs can be computed from the estimates of VP and CP. I use the Machines data which are part of the MEMSS package [efficiency scores of 6 workers (random factor) tested on each of 3 machines (fixed factor) for 3 times ].
For LMM m1 to LMM m4 I extract the intercept and two indicator variables c1 and c2 coding the treatment-contrast specification from the model matrix.
For LMM m6 and LMMs m8 to m12 I extract the three levelsA, B, and C of the factor from the mode matrix.
Results for LMMs m5, m7, and m8 do not depend on the contrast-based or level-based specification. These models estimate only one or (in the case of m7 two) variance parameters.
m1contrasts(Machines$Machine) <- contr.treatment(3, base=1)
mm <- model.matrix(~ 1 + Machine, Machines)
c1 <- mm[, 2]
c2 <- mm[, 3]
m1 <- lmer(score ~ 1 + Machine + (1 + c1 + c2 | Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m1)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.8157529 3.7169548
## 2 Worker c1 <NA> 28.6861701 5.3559472
## 3 Worker c2 <NA> 11.2431151 3.3530755
## 4 Worker (Intercept) c1 9.7215585 0.4883288
## 5 Worker (Intercept) c2 -4.5269756 -0.3632265
## 6 Worker c1 c2 5.3097436 0.2956609
## 7 Residual <NA> <NA> 0.9246296 0.9615766
# Compute Sigma values according to general form for m1
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[c(1:2), "vcov"]) + 2*vc[4, "vcov"],
sum(vc[c(1,3), "vcov"]) + 2*vc[5, "vcov"],
vc[1, "vcov"] + vc[4, "vcov"],
vc[1, "vcov"] + vc[5, "vcov"],
vc[1, "vcov"] + sum(vc[4:6, "vcov"])),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.815753 61.945040 16.004917 23.537311 9.288777 24.320079
V_m1 <- getME_V(m1)
round(V_m1[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 23.54 9.29
## 19 23.54 61.95 24.32
## 39 9.29 24.32 16.00
The diagonal and off-diagonal values in \(\Sigma\) computed from the lmer model parameter estimates according to the general form for LMM m1 match the values of the 3 x 3 block pattern in V_m1.
The \(\Sigma\) of LMM m1 differs from the \(\Sigma\) of LMM m1a in that \(\sigma_{12}\) and \(\sigma_{13}\) enter the computation with a negative sign. The is simple due to reversing the direction of subtraction for contrasts c1 and c2. With this specification, I also need to subtract \(\sigma_{12}\) and \(\sigma_{13}\) to achieve the match.
c1_ <- -c1
c2_ <- -c2
m1_ <- lmer(score ~ 1 + Machine + (1 + c1_ + c2_ | Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m1_)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.8157433 3.7169535
## 2 Worker c1_ <NA> 28.6861865 5.3559487
## 3 Worker c2_ <NA> 11.2431178 3.3530759
## 4 Worker (Intercept) c1_ -9.7215504 -0.4883284
## 5 Worker (Intercept) c2_ 4.5269789 0.3632269
## 6 Worker c1_ c2_ 5.3097474 0.2956610
## 7 Residual <NA> <NA> 0.9246296 0.9615766
# For inverted contrasts need to subtract sigma_12 and sigma_13 (see general form for m1a)
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[c(1:2), "vcov"]) - 2*vc[4, "vcov"],
sum(vc[c(1,3), "vcov"]) - 2*vc[5, "vcov"],
vc[1, "vcov"] - vc[4, "vcov"],
vc[1, "vcov"] - vc[5, "vcov"],
vc[1, "vcov"] - sum(vc[4:5, "vcov"]) + vc[6, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.815743 61.945031 16.004903 23.537294 9.288764 24.320062
V_m1_ <- getME_V(m1_)
round(V_m1_[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 23.54 9.29
## 19 23.54 61.95 24.32
## 39 9.29 24.32 16.00
m2c1 and contrast c2m2_c1c2 <- lmer(score ~ 1 + Machine + (1 | Worker) + (0 + c1 + c2 | Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m2_c1c2)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.7151414 3.7033959
## 2 Worker.1 c1 <NA> 29.1133286 5.3956768
## 3 Worker.1 c2 <NA> 11.0432239 3.3231347
## 4 Worker.1 c1 c2 5.4237504 0.3024864
## 5 Residual <NA> <NA> 0.9257324 0.9621499
# Compute Sigma values according to general form for m2_c1c2
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[c(1:2), "vcov"]),
sum(vc[c(1,3), "vcov"]),
vc[1, "vcov"],
vc[1, "vcov"],
vc[1, "vcov"] + vc[4, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.71514 42.82847 24.75837 13.71514 13.71514 19.13889
V_c1c2 <- getME_V(m2_c1c2)
round(V_c1c2[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.72 13.72 13.72
## 19 13.72 42.83 19.14
## 39 13.72 19.14 24.76
All computed values match with the contents of V_c1_c2.
A and contrast c1m2_Ac1 <- lmer(score ~ 1 + Machine + (1 + c1 | Worker) + (0 + c2 | Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m2_Ac1)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.5634264 3.6828557
## 2 Worker c1 <NA> 28.3878238 5.3280225
## 3 Worker (Intercept) c1 9.9954824 0.5093930
## 4 Worker.1 c2 <NA> 10.6339867 3.2609794
## 5 Residual <NA> <NA> 0.9329676 0.9659025
# Compute Sigma values according to general form for m2_Ac1
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[c(1:2), "vcov"]) + 2*vc[3, "vcov"],
sum(vc[c(1,4), "vcov"]),
vc[1, "vcov"] + vc[3, "vcov"],
vc[1, "vcov"],
vc[1, "vcov"] + vc[3, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.56343 61.94221 24.19741 23.55891 13.56343 23.55891
V_m2_Ac1 <- getME_V(m2_Ac1)
round(V_m2_Ac1[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.56 23.56 13.56
## 19 23.56 61.94 23.56
## 39 13.56 23.56 24.20
Other off-diagonal values agree with the VP estimate for (\(\sigma_1^2\)). Thus, all computed values match with the contents of V_m2_Ac1.
m3The worker-level covariance matrix (ignoring the residual variance) has the following structure:
m3 <- lmer(score ~ 1 + Machine + (1 + c1 + c2 || Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m3)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 13.7771273 3.7117553
## 2 Worker.1 c1 <NA> 28.8274937 5.3691241
## 3 Worker.2 c2 <NA> 10.9321153 3.3063749
## 4 Residual <NA> <NA> 0.9265345 0.9625666
# Compute Sigma values according to general form for m3
(sigmas <- setNames(c(vc[1, "vcov"],
sum(vc[1:2, "vcov"]),
sum(vc[c(1,3), "vcov"]),
vc[1, "vcov"], vc[1, "vcov"], vc[1, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 13.77713 42.60462 24.70924 13.77713 13.77713 13.77713
V_m3 <- getME_V(m3)
round(V_m3[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.78 13.78 13.78
## 19 13.78 42.60 13.78
## 39 13.78 13.78 24.71
I compute the entries for V1_m3 from the estimates of model parameters as shown in the general form of \(\Sigma\) for this model. The values are in agreement.
m4Here I drop the VP for c1, that is I assume there on reliable worker-related variance for the difference between Machines A and B. I also assume that none of the CPs can be estimated reliably. So there are only two VP parameters, one for level A and the other for the second contrast c2.
m4 <- lmer(score ~ 1 + Machine + (1 + c2 || Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m4)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 27.221350 5.217408
## 2 Worker.1 c2 <NA> 8.911068 2.985141
## 3 Residual <NA> <NA> 7.159796 2.675780
# Compute Sigma values according to general form for m3
(sigmas <- setNames(c(vc[1, "vcov"], vc[1, "vcov"],
sum(vc[c(1,2), "vcov"]),
vc[1, "vcov"], vc[1, "vcov"], vc[1, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 27.22135 27.22135 36.13242 27.22135 27.22135 27.22135
V_m4 <- getME_V(m4)
round(V_m4[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 27.22 27.22 27.22
## 19 27.22 27.22 27.22
## 39 27.22 27.22 36.13
m5The worker-level covariance matrix (ignoring the residual variance) has the following structure:
m5 <- lmer(score ~ 1 + Machine + (1 | Worker), data=Machines, REML=FALSE)
(vc <- data.frame(VarCorr(m5)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 21.933665 4.683339
## 2 Residual <NA> <NA> 9.579514 3.095079
(sigmas <- setNames(vc[1, "vcov"], c("sigma_1^2")))
## sigma_1^2
## 21.93367
V_m5 <- getME_V(m5)
round(V_m5[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 21.93 21.93 21.93
## 19 21.93 21.93 21.93
## 39 21.93 21.93 21.93
The relation between the parameters in \(\mathbf\Sigma\) and what lmer reports is transparent.
m6mm <- model.matrix(~ 0 + Machine, Machines)
A <- mm[, 1]
B <- mm[, 2]
C <- mm[, 3]
m6 <- lmer(score ~ 1 + Machine + (1 | Worker) + (0 + A + B + C || Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m6)))
## grp var1 var2 vcov sdcor
## 1 Worker (Intercept) <NA> 11.9444977 3.4560813
## 2 Worker.1 A <NA> 3.0860039 1.7567026
## 3 Worker.2 B <NA> 28.7020858 5.3574328
## 4 Worker.3 C <NA> 6.6958839 2.5876406
## 5 Residual <NA> <NA> 0.9246296 0.9615766
(sigmas <- setNames(c(sum(vc[1:2, "vcov"]) , sum(vc[c(1,3), "vcov"]), sum(vc[c(1,4), "vcov"]),
vc[1, "vcov"], vc[1, "vcov"], vc[1, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 15.03050 40.64658 18.64038 11.94450 11.94450 11.94450
V_m6 <- getME_V(m6)
round(V_m6[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 15.03 11.94 11.94
## 19 11.94 40.65 11.94
## 39 11.94 11.94 18.64
m7Here I analyze the interaction LMM (int_LMM).
m7 <- lmer(score ~ 1 + Machine + (1| Worker) + (1 | Machine:Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m7)))
## grp var1 var2 vcov sdcor
## 1 Machine:Worker (Intercept) <NA> 11.5398455 3.3970348
## 2 Worker (Intercept) <NA> 19.0487024 4.3644819
## 3 Residual <NA> <NA> 0.9246296 0.9615766
(sigmas <- setNames(c(sum(vc[1:2, "vcov"]) , sum(vc[1:2, "vcov"]), sum(vc[1:2, "vcov"]),
vc[2, "vcov"], vc[2, "vcov"], vc[2, "vcov"]),
c("sigma_1^2", "sigma_2^2", "sigma_3^2",
"sigma_{12}", "sigma_{13}", "sigma_{23}")))
## sigma_1^2 sigma_2^2 sigma_3^2 sigma_{12} sigma_{13} sigma_{23}
## 30.58855 30.58855 30.58855 19.04870 19.04870 19.04870
V_m7 <- getME_V(m7)
round(V_m7[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 30.59 19.05 19.05
## 19 19.05 30.59 19.05
## 39 19.05 19.05 30.59
The int_LMM is simpler than the LMM m6 in that in addition to its correlation parameters (CPs), the variance parameters (VP) are constrained to be equal. Therefore, this LMM is nested under m6.
Note (again?) that irrespective of number of levels, the model always estimates only two parameters. From the decomposition it is transparent that the VP associated with the interaction term is added to the \(\Sigma\) diagonal.
m8m8 <- lmer(score ~ 1 + Machine + (1 | Machine:Worker), data=Machines, REML=FALSE)
(vc <- as.data.frame(VarCorr(m8)))
## grp var1 var2 vcov sdcor
## 1 Machine:Worker (Intercept) <NA> 30.5885492 5.5306916
## 2 Residual <NA> <NA> 0.9246296 0.9615766
V_m8 <- getME_V(m8)
round(V_m8[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 30.59 . .
## 19 . 30.59 .
## 39 . . 30.59
The relation between the parameters in \(\mathbf\Sigma\) and what lmer reports is transparent.
m1am1a <- lmer(score ~ 1 + Machine + (0 + A + B + C | Worker), data=Machines, REML=FALSE)
(vc1a <- as.data.frame(VarCorr(m1a)))
## grp var1 var2 vcov sdcor
## 1 Worker A <NA> 13.8157362 3.7169525
## 2 Worker B <NA> 61.9449805 7.8705134
## 3 Worker C <NA> 16.0049005 4.0006125
## 4 Worker A B 23.5372734 0.8045742
## 5 Worker A C 9.2887580 0.6246606
## 6 Worker B C 24.3200489 0.7723869
## 7 Residual <NA> <NA> 0.9246297 0.9615767
V_m1a <- getME_V(m1a)
round(V_m1a[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 23.54 9.29
## 19 23.54 61.94 24.32
## 39 9.29 24.32 16.00
m2am2a <- lmer(score ~ 1 + Machine + (0 + A | Worker) + (0 + B + C | Worker), data=Machines, REML=FALSE)
(vc2a <- as.data.frame(VarCorr(m2a)))
## grp var1 var2 vcov sdcor
## 1 Worker A <NA> 13.8157401 3.7169531
## 2 Worker.1 B <NA> 61.9449925 7.8705141
## 3 Worker.1 C <NA> 16.0049076 4.0006134
## 4 Worker.1 B C 24.3200601 0.7723870
## 5 Residual <NA> <NA> 0.9246296 0.9615766
V_m2a <- getME_V(m2a)
round(V_m2a[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 . .
## 19 . 61.94 24.32
## 39 . 24.32 16.00
m3am3a <- lmer(score ~ 1 + Machine + (0 + A + B + C || Worker), data=Machines, REML=FALSE)
(vc3a <- as.data.frame(VarCorr(m3a)))
## grp var1 var2 vcov sdcor
## 1 Worker A <NA> 13.8157407 3.7169532
## 2 Worker.1 B <NA> 61.9450011 7.8705147
## 3 Worker.2 C <NA> 16.0049075 4.0006134
## 4 Residual <NA> <NA> 0.9246296 0.9615766
V_m3a <- getME_V(m3a)
round(V_m3a[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 13.82 . .
## 19 . 61.95 .
## 39 . . 16
m4am4a <- lmer(score ~ 1 + Machine + (0 + B + C || Worker), data=Machines, REML=FALSE)
(vc4a <- as.data.frame(VarCorr(m4a)))
## grp var1 var2 vcov sdcor
## 1 Worker B <NA> 59.971321 7.744115
## 2 Worker.1 C <NA> 14.031230 3.745828
## 3 Residual <NA> <NA> 6.845661 2.616421
V_m4a <- getME_V(m4a)
round(V_m4a[seq(1,9,3), seq(1,9,3)], 2)
## 3 x 3 sparse Matrix of class "dgCMatrix"
## 1 19 39
## 1 . . .
## 19 . 59.97 .
## 39 . . 14.03
The hiearchical structure of LMMs m1 to m4 can also be seen in comparison of visualization based on estimates for the variance-covariance matrix V based on 3 machines x 6 workers x 3 measures (i.e., 54 x 54).
Vs <- list(V_m1, V_m2_Ac1, V_m3, V_m4)
for(i in 1:4) print(image(Vs[[i]], asp=1))
This exercise demonstrates that LMM m1 to LMM m5 represent a nested hierarchical sequence from maximal to minimal model complexity. Thus, LRTs can be used for model comparison. The elements of \(\mathbf\Sigma\) correspond in a transparent way with the model specification in the lmer syntax. The decomposition of the three-level-factor LMMs described is quite transparent and suggestive of the analagous decomposition of factors with two or four or more levels.
Related RPubs:
This script was motivated and inspired by exchanges with Rune H. B. Christensen. I gratefully acknowledge his insights and comments.
sessionInfo()
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS 10.14
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] lme4_1.1-18-1 Matrix_1.2-14
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.18 lattice_0.20-35 digest_0.6.17
## [4] MASS_7.3-50 grid_3.5.1 nlme_3.1-137
## [7] magrittr_1.5 evaluate_0.11 stringi_1.2.4
## [10] minqa_1.2.4 nloptr_1.0.4 rmarkdown_1.10.11
## [13] splines_3.5.1 htmldeps_0.1.1 tools_3.5.1
## [16] stringr_1.3.1 yaml_2.2.0 compiler_3.5.1
## [19] htmltools_0.3.6 knitr_1.20