1 Overview / Summary

This is a follow-up to an RPub on factor contrasts and factor levels in linear mixed models. It eliminates all LMMs with 0 + factor in the fixed-effect part (FE0), because all these LMMs are mathematically equivalent to (a reparameterization of) the corresponding 1 + factor version (FE1). I will use the FE0 version whenever this specification is used in the random-effect structure to maintain consistency of estimated model parameters across fixed and random part. The older script also describes various alternative specifications. I highly recommend Bates’s (2009) explanation of how factors (i.e., categorical covariates) are incorporated into model matrices.

I only use LMMs with numeric covariates extracted from the model matrix. factor is replaced with the corresponding columns of the model matrix, that is with numeric covariates. Specifically, I use a factor with three levels A, B, and C, with c1 and c2 representing the two contrasts defined for the three levels. Since I will use only numeric covariates, I can also use the double-bar syntax without risk of confusion.

2 Five prototype LMMs

Three of the five LMMs introduced here serve as anchors of a spectrum of model complexity ranging from a maximal LMM (max_LMM) estimating all variance parameters (VPs) and covariance/correlation parameters (CPs) to an LMM of intermediate complexity estimating all VPs, but no CPs [zero_correlation parameter LMM (zcp_LMM)] to a minimal LMM with only one VP for the intercept (min_LMM), ignoring the residual. Between max_LMM and zcp_LMM, there is room for parsimonious LMMs of the first kind (prsm1_LMM), that is models pruning some of the CPs. Between zcp_LMM and min_LMM, there is room for parsimonious LMMs of the second kind (prsm2_LMM), that is models pruning some of the VPs in the absence of CPs. Thus, the order reflects a strictly decreasing model complexity. All of these LMMs use the numeric-covariate equivalent of the 1 + factor RE-structure (RE1). Therefore, this notation is silently implied in the specification of model acronyms.

2.1 Maximal LMMs (max_LMMs)

1. max_LMM: y ~ 1 + c1 + c2 + (1 + c1 + c2 | subj)

This model is the maximal LMM for the a fixed factor with 3 levels and a single random factor subj. In general, if k is the number of factor levels, we estimate k VPs and k(k-1)/2 CPs and up to k random effects for each level of subj.

2.2 Parsimonious LMMs of the first type (prsm1_LMMs)

2. prsm1_LMM: y ~ 1 + c1 + c2 + (1 | subj) + (0 + c1 + c2 | subj)

This is but only one version of a prsm1_LMM. The specification suppresses the CP between the intercept and the two contrasts. Alternative versions suppress the CP between the two contrasts or between the intercept and one of the contrasts. These distinctions do not matter here, but, of course, they are relevant for the analysis of a specific research question with a specific set of data. prsm1_LMM is nested under maxLMM because the models differ only in whether a CP is forced to zero or not.

2.3 Zero-correlation parameters LMMs (zcp_LMMs)

3. zcp_LMM: y ~ 1 + c1 + c2 + (1 + c1 + c2 || subj) 

This model estimates VPs for the intercept and contrasts, but it forces all CPs to zero. Therefore, the zcp_LMM is nested under prsm1_LMM because the latter will have at least one CP.

2.4 Parsimonious LMMs of the second type (prsm2_LMMs)

4. prsm2_LMM: y ~ 1 + c1 + c2 + (1 + c2 || subj)

Whereas in prsm1_LMM one or several CPs are forced to zero, in the prsm2_LMM this applies to one or several VPs. In the example specification in (4), the prsm2_LMM is appropriate if the data do not support an LMM with reliable between-subject variance in c1. Prsm2_LMM has at least one VP less than zcp_LMM. Therefore, it is nested under zcp_LMM.

2.5 Minimal LMM (min1_LMM)

The minimal LMM is an LMM varying only in the intercept. If prsm2_LMM contains at least one VP beyond the VP for the intercept, then min1_LMM is nested under prsm2_LMM.

5. min1_LMM: y ~ 1 + c1 + c2 + (1 | subj)

3 LMMs with interaction of fixed and random factor (int_LMMs)

3.1 LMM with interaction term (int_LMM)

Model complexity increases strongly in LMMs with the number of factor levels as defined in (1) to (4) above. Even for a moderate number of levels k, the sum of estimated VP and CP parameters can be large, that is k(k+1)/2 for max_LMM. The interaction LMM (int_LMM) described here tests the interaction between the fixed and the random factor. It also has the advantage that model complexity does not increase when the number of factor levels increases. It estimates only two VPs and no CPs.

6. int_LMM: y ~ 1 + c1 + c2 + (1 | subj) + (1 | factor:subj) 

The int_LMM allows for an additive shift for each level of of subj with (1 | subj) and an additive shift for each (observed) combination of levels of factor and subj (Bates, 2009). In other words, in addition, to the VP for the intercept (i.e., the differences between levels of the random factor subj, the model introduces a second VP that captures the variance associated with deviation from parallelism. This second VP will be zero for a three-level factor in the unlikely case that the three subject profiles associated with the fixed-factor levels are parallel or, in other words, that there are no reliably interindividual differences for the contrasts defined for the factor.

Int_LMM differs from min_LMM only in the orthogonal interaction term. Thus, min_LMM is also nested under int_LMM and can be used to test the significanc of the interaction term in int_LMM in a LRT.

Int_LMM has fewer model parameters than zcp_LMM, but it has a different mathematical structure and is not nested und zcp_LMM. The int_LMM, however, is nested under max_LMM; specifically, it corresponds to max_LMM_RE0 (see below; which is a reparameterization of max_LMM) subject to the constraint that the variance-covariance matrix for the random effects have a compound symmetry pattern (Bates, 2009).

3.2 Minimal LMM based on interaction term (min2_LMM)

The random-factor term (1 | factor:subj) can also be viewed as re-definition of the levels of the random factor. In other words, it might be assumed that it is the combination of subjects and factor levels represents a population from which a random sample was drawn. If this is a valid assumption, then the following model is an alternative minimal LMM for the data set under analysis.

7. min2_LMM: y ~ 1 + c1 + c2 + (1 | factor:subj)

It is obvious from a comparison of int_LMM and min2_LMM that the latter can serve as a baseline for assessing the significance of the subject-related random-factor term (1 | subj), that is whether there are reliable individual differences between subjects beyond the variance that is captured by the interaction of the fixed and random factor. In other words, min2_LMM is nested under int_LMM.

4 LMMs with RE0 structure

Finally, within the taxonomy introduced here, I mention three models that correspond to three models mentioned above, but use the RE0 structure.

 8. max_LMM_RE0:   y ~ 1 + c1 + c2 + (0 + A + B + C | subj)
 
 9. prsm1_LMM_RE0: y ~ 1 + c1 + c2 + (0 + A | subj) + (0 + B + C | subj)
 
10. zcp_LMM_RE0:   y ~ 1 + c1 + c2 + (0 + A + B + C || subj)

Note: maxLMM_RE0 is a reparameterization of max_LMM, but prsm1LMM_RE0 and zcp_LMM_RE0 are not necessarily reparameterizations of prsm1_LMM and zcp_LMM; whether or not mathematical equivalence holds depends on the type of contrast specified for the factor (see the previous RPub for some examples).

In general, LMMs with RE0 structure are not used much. One advantage is that their CPs are more easily interpreted because they correspond to the familiar table of correlations between measures (rather than to a table of correlations between effects).

5 Definition of hierarchical model sequences

The LMMs 1 to 10 defined above are hierarchically ordered according to following sequences from the complex to the simple model. Neighboring models within each sequence can be compared with a LRT statistic.

(1) max_LMM -> prsm1_LMM -> zcp_LMM -> prsm2_LMM -> min1_LMM  

(2) max_LMM -> int_LMM -> min1_LMM 

(3) max_LMM -> int_LMM -> min2_LMM 

(4) max_LMM_RE0 -> prsm1_LMM_RE0 -> zcp_LMM_RE0 

So far, in my research, I have worked almost exclusively with Sequence (1) and do not recall ever experiencing technical problems or inconsistencies for non-degenerate LMMs. Thus, this sequence will certainly allow one to determine a parsimonious LMM (Bates et al., 2015, Matuschek et al. 2017).

For complex fixed-effect structures (i.e., model with many factors or with factors with many levels), the number of CPs grows very rapidly. If this goes together with a modest number of observations or levels of the random factor, Sequences (2) and (3) might be a good place to start to avoid convergence problems. Of course, if your hypotheses are about CPs, these LMMs will not get you very far.

Finally, Sequence (4) could be a default strategy if one is interested in the fixed effects and if one does not want to spend much time in wondering about the meaning of CPs. However, as correlations between levels of fixed factors tend be large (at least larger than the correlations between effects), LMMs in this sequence may be prone to convergence problems.

6 Demonstration of hierarchical model sequences

Fits of the 10 LMMs defined here have been reported in the previous RPub. Here we illustrate the LRTs for each of the four sequences.

The illustrations use the Machines data which are part of the MEMSS package [efficiency scores of 6 workers (random factor) tested 3 times on each of 3 machines (fixed factor) ]. For later reference, I compute means and correlations between the 3 machine scores by averaging the 3 measures available for the 6 workers (fixed-effect correlations).

knitr::opts_chunk$set(echo = TRUE)
suppressMessages(library("lme4"))
library("RePsychLing")
suppressMessages(library("tidyverse"))

data("Machines", package = "MEMSS")
Machines$Measure <- factor(rep(1:3))
d <- as.tibble(Machines) 

contrasts(d$Machine) <- contr.treatment(3)

# Extract numeric covariates
## ... for FE1 and RE1
mm1 <- model.matrix(~ 1 + Machine, d)
dBA <- mm1[, 2]
dCA <- mm1[, 3]

# ... for RE0
mm0 <- model.matrix(~ 0 + Machine, d)
A <- mm0[, 1]
B <- mm0[, 2]
C <- mm0[, 3]

6.1 Sequence 1

# max_LMM   
m1 <- lmer(score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker), d, REML=FALSE)
# prsm1_LMM
m2 <- lmer(score ~ 1 + dBA + dCA + (1 | Worker) + (0 + dBA + dCA | Worker), d, REML=FALSE)
# zcp_LMM
m3 <- lmer(score ~ 1 + dBA + dCA + (1 + dBA + dCA || Worker), d, REML=FALSE)
# prsm2_LMM
m4 <- lmer(score ~ 1 + dBA + dCA + (1 | Worker) + (0 + dCA || Worker), d, REML=FALSE)
# min1_LMM
m5 <- lmer(score ~ 1 + dBA + dCA + (1 | Worker), d, REML=FALSE)

anova(m5, m4, m3, m2, m1)
## Data: d
## Models:
## m5: score ~ 1 + dBA + dCA + (1 | Worker)
## m4: score ~ 1 + dBA + dCA + (1 | Worker) + ((0 + dCA | Worker))
## m3: score ~ 1 + dBA + dCA + ((1 | Worker) + (0 + dBA | Worker) + 
## m3:     (0 + dCA | Worker))
## m2: score ~ 1 + dBA + dCA + (1 | Worker) + (0 + dBA + dCA | Worker)
## m1: score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker)
##    Df    AIC    BIC  logLik deviance   Chisq Chi Df Pr(>Chisq)    
## m5  5 303.70 313.65 -146.85   293.70                              
## m4  6 300.47 312.40 -144.24   288.47  5.2321      1    0.02217 *  
## m3  7 235.11 249.04 -110.56   221.11 67.3580      1  2.264e-16 ***
## m2  8 236.58 252.49 -110.29   220.58  0.5376      1    0.46342    
## m1 10 236.42 256.31 -108.21   216.42  4.1578      2    0.12507    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

6.2 Sequence 2

# max_LMM   
m1 <- lmer(score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker), d, REML=FALSE)
# int_LMM
m6 <- lmer(score ~ 1 + Machine + (1 | Worker) + (1 | Machine:Worker), d, REML=FALSE)
# min1_LMM
m5 <- lmer(score ~ 1 + dBA + dCA + (1 | Worker), d, REML=FALSE)

anova(m5, m6, m1)
## Data: d
## Models:
## m5: score ~ 1 + dBA + dCA + (1 | Worker)
## m6: score ~ 1 + Machine + (1 | Worker) + (1 | Machine:Worker)
## m1: score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker)
##    Df    AIC    BIC  logLik deviance   Chisq Chi Df Pr(>Chisq)    
## m5  5 303.70 313.65 -146.85   293.70                              
## m6  6 237.27 249.20 -112.64   225.27 68.4338      1    < 2e-16 ***
## m1 10 236.42 256.31 -108.21   216.42  8.8516      4    0.06492 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

6.3 Sequence 3

# max_LMM   
m1 <- lmer(score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker), d, REML=FALSE)
# int_LMM
m6 <- lmer(score ~ 1 + Machine + (1 | Worker) + (1 | Machine:Worker), d, REML=FALSE)
# min2_LMM
m7 <- lmer(score ~ 1 + dBA + dCA +  (1 | Machine:Worker), d, REML=FALSE)

anova(m7, m6, m1)
## Data: d
## Models:
## m7: score ~ 1 + dBA + dCA + (1 | Machine:Worker)
## m6: score ~ 1 + Machine + (1 | Worker) + (1 | Machine:Worker)
## m1: score ~ 1 + dBA + dCA + (1 + dBA + dCA | Worker)
##    Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)   
## m7  5 241.95 251.90 -115.98   231.95                            
## m6  6 237.27 249.20 -112.64   225.27 6.6816      1   0.009741 **
## m1 10 236.42 256.31 -108.21   216.42 8.8516      4   0.064917 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

6.4 Sequence 4

# maxL_MM_RE0
m8  <- lmer(score ~  1 + dBA + dCA + (0 + A + B + C | Worker), d, REML=FALSE)       
# prsm1_LMM_RE0
m9  <- lmer(score ~  1 + dBA + dCA + (0 + A | Worker) + (0 + B + C | Worker), d, REML=FALSE)
# zcp_LMM_RE0
m10 <- lmer(score ~  1 + dBA + dCA + (0 + A + B + C || Worker), d, REML=FALSE)                

anova(m10, m9, m8)
## Data: d
## Models:
## m10: score ~ 1 + dBA + dCA + ((0 + A | Worker) + (0 + B | Worker) + 
## m10:     (0 + C | Worker))
## m9: score ~ 1 + dBA + dCA + (0 + A | Worker) + (0 + B + C | Worker)
## m8: score ~ 1 + dBA + dCA + (0 + A + B + C | Worker)
##     Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)  
## m10  7 241.63 255.55 -113.81   227.63                           
## m9   8 238.39 254.30 -111.19   222.39 5.2396      1    0.02208 *
## m8  10 236.42 256.31 -108.21   216.42 5.9683      2    0.05058 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

7 Appendix

sessionInfo()
## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.4
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] forcats_0.3.0     stringr_1.3.1     dplyr_0.7.4      
##  [4] purrr_0.2.4       readr_1.1.1       tidyr_0.8.0      
##  [7] tibble_1.4.2      ggplot2_2.2.1     tidyverse_1.2.1  
## [10] RePsychLing_0.0.4 lme4_1.1-18       Matrix_1.2-14    
## 
## loaded via a namespace (and not attached):
##  [1] reshape2_1.4.3   splines_3.5.0    haven_1.1.1      lattice_0.20-35 
##  [5] colorspace_1.3-2 htmltools_0.3.6  yaml_2.1.19      rlang_0.2.0     
##  [9] nloptr_1.0.4     pillar_1.2.2     foreign_0.8-70   glue_1.2.0      
## [13] modelr_0.1.1     readxl_1.1.0     bindrcpp_0.2.2   bindr_0.1.1     
## [17] plyr_1.8.4       munsell_0.4.3    gtable_0.2.0     cellranger_1.1.0
## [21] rvest_0.3.2      psych_1.8.4      evaluate_0.10.1  knitr_1.20      
## [25] parallel_3.5.0   broom_0.4.4      Rcpp_0.12.17     backports_1.1.2 
## [29] scales_0.5.0     jsonlite_1.5     mnormt_1.5-5     hms_0.4.2       
## [33] digest_0.6.15    stringi_1.2.2    grid_3.5.0       rprojroot_1.3-2 
## [37] cli_1.0.0        tools_3.5.0      magrittr_1.5     lazyeval_0.2.1  
## [41] crayon_1.3.4     pkgconfig_2.0.1  MASS_7.3-50      xml2_1.2.0      
## [45] lubridate_1.7.4  rstudioapi_0.7   assertthat_0.2.0 minqa_1.2.4     
## [49] rmarkdown_1.9    httr_1.3.1       R6_2.2.2         nlme_3.1-137    
## [53] compiler_3.5.0