This is a follow-up to an RPub on factor contrasts and factor levels in linear mixed models. It eliminates all LMMs with 0 + factor
in the fixed-effect part (FE0
), because all these LMMs are mathematically equivalent to (a reparameterization of) the corresponding 1 + factor
version (FE1
). I will use the FE0
version whenever this specification is used in the random-effect structure to maintain consistency of estimated model parameters across fixed and random part. The older script also describes various alternative specifications. I highly recommend Bates’s (2009) explanation of how factors (i.e., categorical covariates) are incorporated into model matrices.
I only use LMMs with numeric covariates extracted from the model matrix. factor
is replaced with the corresponding columns of the model matrix, that is with numeric covariates. Specifically, I use a factor with three levels A
, B
, and C
, with c1
and c2
representing the two contrasts defined for the three levels. Since I will use only numeric covariates, I can also use the double-bar syntax without risk of confusion.
Three of the five LMMs introduced here serve as anchors of a spectrum of model complexity ranging from a maximal LMM (max_LMM) estimating all variance parameters (VPs) and covariance/correlation parameters (CPs) to an LMM of intermediate complexity estimating all VPs, but no CPs [zero_correlation parameter LMM (zcp_LMM)] to a minimal LMM with only one VP for the intercept (min_LMM), ignoring the residual. Between max_LMM and zcp_LMM, there is room for parsimonious LMMs of the first kind (prsm1_LMM), that is models pruning some of the CPs. Between zcp_LMM and min_LMM, there is room for parsimonious LMMs of the second kind (prsm2_LMM), that is models pruning some of the VPs in the absence of CPs. Thus, the order reflects a strictly decreasing model complexity. All of these LMMs use the numeric-covariate equivalent of the 1 + factor
RE-structure (RE1
). Therefore, this notation is silently implied in the specification of model acronyms.
1. max_LMM: y ~ 1 + c1 + c2 + (1 + c1 + c2 | subj)
This model is the maximal LMM for the a fixed factor with 3 levels and a single random factor subj. In general, if k is the number of factor levels, we estimate k VPs and k(k-1)/2 CPs and up to k random effects for each level of subj.
2. prsm1_LMM: y ~ 1 + c1 + c2 + (1 | subj) + (0 + c1 + c2 | subj)
This is but only one version of a prsm1_LMM. The specification suppresses the CP between the intercept and the two contrasts. Alternative versions suppress the CP between the two contrasts or between the intercept and one of the contrasts. These distinctions do not matter here, but, of course, they are relevant for the analysis of a specific research question with a specific set of data. prsm1_LMM is nested under maxLMM because the models differ only in whether a CP is forced to zero or not.
3. zcp_LMM: y ~ 1 + c1 + c2 + (1 + c1 + c2 || subj)
This model estimates VPs for the intercept and contrasts, but it forces all CPs to zero. Therefore, the zcp_LMM is nested under prsm1_LMM because the latter will have at least one CP.
4. prsm2_LMM: y ~ 1 + c1 + c2 + (1 + c2 || subj)
Whereas in prsm1_LMM one or several CPs are forced to zero, in the prsm2_LMM this applies to one or several VPs. In the example specification in (4), the prsm2_LMM is appropriate if the data do not support an LMM with reliable between-subject variance in c1
. Prsm2_LMM has at least one VP less than zcp_LMM. Therefore, it is nested under zcp_LMM.
The minimal LMM is an LMM varying only in the intercept. If prsm2_LMM contains at least one VP beyond the VP for the intercept, then min1_LMM is nested under prsm2_LMM.
5. min1_LMM: y ~ 1 + c1 + c2 + (1 | subj)
Model complexity increases strongly in LMMs with the number of factor levels as defined in (1) to (4) above. Even for a moderate number of levels k, the sum of estimated VP and CP parameters can be large, that is k(k+1)/2 for max_LMM. The interaction LMM (int_LMM) described here tests the interaction between the fixed and the random factor. It also has the advantage that model complexity does not increase when the number of factor levels increases. It estimates only two VPs and no CPs.
6. int_LMM: y ~ 1 + c1 + c2 + (1 | subj) + (1 | factor:subj)
The int_LMM allows for an additive shift for each level of of subj
with (1 | subj)
and an additive shift for each (observed) combination of levels of factor
and subj
(Bates, 2009). In other words, in addition, to the VP for the intercept (i.e., the differences between levels of the random factor subj
, the model introduces a second VP that captures the variance associated with deviation from parallelism. This second VP will be zero for a three-level factor in the unlikely case that the three subject profiles associated with the fixed-factor levels are parallel or, in other words, that there are no reliably interindividual differences for the contrasts defined for the factor.
Int_LMM differs from min_LMM only in the orthogonal interaction term. Thus, min_LMM is also nested under int_LMM and can be used to test the significanc of the interaction term in int_LMM in a LRT.
Int_LMM has fewer model parameters than zcp_LMM, but it has a different mathematical structure and is not nested und zcp_LMM. The int_LMM, however, is nested under max_LMM; specifically, it corresponds to max_LMM_RE0 (see below; which is a reparameterization of max_LMM) subject to the constraint that the variance-covariance matrix for the random effects have a compound symmetry pattern (Bates, 2009).
The random-factor term (1 | factor:subj)
can also be viewed as re-definition of the levels of the random factor. In other words, it might be assumed that it is the combination of subjects and factor levels represents a population from which a random sample was drawn. If this is a valid assumption, then the following model is an alternative minimal LMM for the data set under analysis.
7. min2_LMM: y ~ 1 + c1 + c2 + (1 | factor:subj)
It is obvious from a comparison of int_LMM and min2_LMM that the latter can serve as a baseline for assessing the significance of the subject-related random-factor term (1 | subj)
, that is whether there are reliable individual differences between subjects beyond the variance that is captured by the interaction of the fixed and random factor. In other words, min2_LMM is nested under int_LMM.
Finally, within the taxonomy introduced here, I mention three models that correspond to three models mentioned above, but use the RE0
structure.
8. max_LMM_RE0: y ~ 1 + c1 + c2 + (0 + A + B + C | subj)
9. prsm1_LMM_RE0: y ~ 1 + c1 + c2 + (0 + A | subj) + (0 + B + C | subj)
10. zcp_LMM_RE0: y ~ 1 + c1 + c2 + (0 + A + B + C || subj)
Note: maxLMM_RE0 is a reparameterization of max_LMM, but prsm1LMM_RE0 and zcp_LMM_RE0 are not necessarily reparameterizations of prsm1_LMM and zcp_LMM; whether or not mathematical equivalence holds depends on the type of contrast specified for the factor (see the previous RPub for some examples).
In general, LMMs with RE0 structure are not used much. One advantage is that their CPs are more easily interpreted because they correspond to the familiar table of correlations between measures (rather than to a table of correlations between effects).
The LMMs 1 to 10 defined above are hierarchically ordered according to following sequences from the complex to the simple model. Neighboring models within each sequence can be compared with a LRT statistic.
(1) max_LMM -> prsm1_LMM -> zcp_LMM -> prsm2_LMM -> min1_LMM
(2) max_LMM -> int_LMM -> min1_LMM
(3) max_LMM -> int_LMM -> min2_LMM
(4) max_LMM_RE0 -> prsm1_LMM_RE0 -> zcp_LMM_RE0
So far, in my research, I have worked almost exclusively with Sequence (1) and do not recall ever experiencing technical problems or inconsistencies for non-degenerate LMMs. Thus, this sequence will certainly allow one to determine a parsimonious LMM (Bates et al., 2015, Matuschek et al. 2017).
For complex fixed-effect structures (i.e., model with many factors or with factors with many levels), the number of CPs grows very rapidly. If this goes together with a modest number of observations or levels of the random factor, Sequences (2) and (3) might be a good place to start to avoid convergence problems. Of course, if your hypotheses are about CPs, these LMMs will not get you very far.
Finally, Sequence (4) could be a default strategy if one is interested in the fixed effects and if one does not want to spend much time in wondering about the meaning of CPs. However, as correlations between levels of fixed factors tend be large (at least larger than the correlations between effects), LMMs in this sequence may be prone to convergence problems.
Fits of the 10 LMMs defined here have been reported in the previous RPub. Here we illustrate the LRTs for each of the four sequences.
The illustrations use the Machines
data which are part of the MEMSS
package [efficiency scores of 6 workers (random factor) tested 3 times on each of 3 machines (fixed factor) ]. For later reference, I compute means and correlations between the 3 machine scores by averaging the 3 measures available for the 6 workers (fixed-effect correlations).
knitr::opts_chunk$set(echo = TRUE)
suppressMessages(library("lme4"))
library("RePsychLing")
suppressMessages(library("tidyverse"))
data("Machines", package = "MEMSS")
Machines$Measure <- factor(rep(1:3))
d <- as.tibble(Machines)
contrasts(d$Machine) <- contr.treatment(3)
# Extract numeric covariates
## ... for FE1 and RE1
mm1 <- model.matrix(~ 1 + Machine, d)
dBA <- mm1[, 2]
dCA <- mm1[, 3]
# ... for RE0
mm0 <- model.matrix(~ 0 + Machine, d)
A <- mm0[, 1]
B <- mm0[, 2]
C <- mm0[, 3]
# max_LMM
m1 <- lmer(score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker), d, REML=FALSE)
# prsm1_LMM
m2 <- lmer(score ~ 1 + dBA + dCA + (1 | Worker) + (0 + dBA + dCA | Worker), d, REML=FALSE)
# zcp_LMM
m3 <- lmer(score ~ 1 + dBA + dCA + (1 + dBA + dCA || Worker), d, REML=FALSE)
# prsm2_LMM
m4 <- lmer(score ~ 1 + dBA + dCA + (1 | Worker) + (0 + dCA || Worker), d, REML=FALSE)
# min1_LMM
m5 <- lmer(score ~ 1 + dBA + dCA + (1 | Worker), d, REML=FALSE)
anova(m5, m4, m3, m2, m1)
## Data: d
## Models:
## m5: score ~ 1 + dBA + dCA + (1 | Worker)
## m4: score ~ 1 + dBA + dCA + (1 | Worker) + ((0 + dCA | Worker))
## m3: score ~ 1 + dBA + dCA + ((1 | Worker) + (0 + dBA | Worker) +
## m3: (0 + dCA | Worker))
## m2: score ~ 1 + dBA + dCA + (1 | Worker) + (0 + dBA + dCA | Worker)
## m1: score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## m5 5 303.70 313.65 -146.85 293.70
## m4 6 300.47 312.40 -144.24 288.47 5.2321 1 0.02217 *
## m3 7 235.11 249.04 -110.56 221.11 67.3580 1 2.264e-16 ***
## m2 8 236.58 252.49 -110.29 220.58 0.5376 1 0.46342
## m1 10 236.42 256.31 -108.21 216.42 4.1578 2 0.12507
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# max_LMM
m1 <- lmer(score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker), d, REML=FALSE)
# int_LMM
m6 <- lmer(score ~ 1 + Machine + (1 | Worker) + (1 | Machine:Worker), d, REML=FALSE)
# min1_LMM
m5 <- lmer(score ~ 1 + dBA + dCA + (1 | Worker), d, REML=FALSE)
anova(m5, m6, m1)
## Data: d
## Models:
## m5: score ~ 1 + dBA + dCA + (1 | Worker)
## m6: score ~ 1 + Machine + (1 | Worker) + (1 | Machine:Worker)
## m1: score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## m5 5 303.70 313.65 -146.85 293.70
## m6 6 237.27 249.20 -112.64 225.27 68.4338 1 < 2e-16 ***
## m1 10 236.42 256.31 -108.21 216.42 8.8516 4 0.06492 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# max_LMM
m1 <- lmer(score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker), d, REML=FALSE)
# int_LMM
m6 <- lmer(score ~ 1 + Machine + (1 | Worker) + (1 | Machine:Worker), d, REML=FALSE)
# min2_LMM
m7 <- lmer(score ~ 1 + dBA + dCA + (1 | Machine:Worker), d, REML=FALSE)
anova(m7, m6, m1)
## Data: d
## Models:
## m7: score ~ 1 + dBA + dCA + (1 | Machine:Worker)
## m6: score ~ 1 + Machine + (1 | Worker) + (1 | Machine:Worker)
## m1: score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## m7 5 241.95 251.90 -115.98 231.95
## m6 6 237.27 249.20 -112.64 225.27 6.6816 1 0.009741 **
## m1 10 236.42 256.31 -108.21 216.42 8.8516 4 0.064917 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# maxL_MM_RE0
m8 <- lmer(score ~ 1 + dBA + dCA + (0 + A + B + C | Worker), d, REML=FALSE)
# prsm1_LMM_RE0
m9 <- lmer(score ~ 1 + dBA + dCA + (0 + A | Worker) + (0 + B + C | Worker), d, REML=FALSE)
# zcp_LMM_RE0
m10 <- lmer(score ~ 1 + dBA + dCA + (0 + A + B + C || Worker), d, REML=FALSE)
anova(m10, m9, m8)
## Data: d
## Models:
## m10: score ~ 1 + dBA + dCA + ((0 + A | Worker) + (0 + B | Worker) +
## m10: (0 + C | Worker))
## m9: score ~ 1 + dBA + dCA + (0 + A | Worker) + (0 + B + C | Worker)
## m8: score ~ 1 + dBA + dCA + (0 + A + B + C | Worker)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## m10 7 241.63 255.55 -113.81 227.63
## m9 8 238.39 254.30 -111.19 222.39 5.2396 1 0.02208 *
## m8 10 236.42 256.31 -108.21 216.42 5.9683 2 0.05058 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
sessionInfo()
## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.4
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] forcats_0.3.0 stringr_1.3.1 dplyr_0.7.4
## [4] purrr_0.2.4 readr_1.1.1 tidyr_0.8.0
## [7] tibble_1.4.2 ggplot2_2.2.1 tidyverse_1.2.1
## [10] RePsychLing_0.0.4 lme4_1.1-18 Matrix_1.2-14
##
## loaded via a namespace (and not attached):
## [1] reshape2_1.4.3 splines_3.5.0 haven_1.1.1 lattice_0.20-35
## [5] colorspace_1.3-2 htmltools_0.3.6 yaml_2.1.19 rlang_0.2.0
## [9] nloptr_1.0.4 pillar_1.2.2 foreign_0.8-70 glue_1.2.0
## [13] modelr_0.1.1 readxl_1.1.0 bindrcpp_0.2.2 bindr_0.1.1
## [17] plyr_1.8.4 munsell_0.4.3 gtable_0.2.0 cellranger_1.1.0
## [21] rvest_0.3.2 psych_1.8.4 evaluate_0.10.1 knitr_1.20
## [25] parallel_3.5.0 broom_0.4.4 Rcpp_0.12.17 backports_1.1.2
## [29] scales_0.5.0 jsonlite_1.5 mnormt_1.5-5 hms_0.4.2
## [33] digest_0.6.15 stringi_1.2.2 grid_3.5.0 rprojroot_1.3-2
## [37] cli_1.0.0 tools_3.5.0 magrittr_1.5 lazyeval_0.2.1
## [41] crayon_1.3.4 pkgconfig_2.0.1 MASS_7.3-50 xml2_1.2.0
## [45] lubridate_1.7.4 rstudioapi_0.7 assertthat_0.2.0 minqa_1.2.4
## [49] rmarkdown_1.9 httr_1.3.1 R6_2.2.2 nlme_3.1-137
## [53] compiler_3.5.0