LMMs afford the simultaneous estimation of fixed effects and of the (co-)variance parameters associated with them relative to one or several random factors (e.g., subjects, items). The (co-)variance parameters can be used to generate “predictions” for the levels of the random factors (i.e., the random effects). Unfortunately, the amount of information in the data (e.g., number of subjects or items, number of observations per subject or item, standard deviation within subjects or within items) is often not adequate for the reliable estimation of all (co-)variance parameters available in principle. One reason is that the number of model parameters grows quadratically with the number of variance components [i.e., for n variance components we obtain maximally n(n+1)/2 model parameters, ignoring fixed effects]. So the question is: How can we arrive at a “defensible” model given the available amount of information in the data?
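
To get a feel for this growth, here is a quick back-of-the-envelope illustration with a hypothetical number of variance components (unrelated to the data used below):

n <- 8              # hypothetical: an intercept plus seven within-unit effects
n * (n + 1) / 2     # maximal number of (co-)variance parameters
## [1] 36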

One simple way to reduce model complexity is to force correlation parameters to zero. The new || syntax of lmer() is very convenient for specifying such a zero-correlation-parameter LMM (zcpLMM). However, there are a few constraints one needs to know about. I illustrate them with the Machines data [i.e., the efficiency scores of 6 workers (random factor) tested 3 times on each of 3 machines (fixed factor)]. Here are some basic statistics.

library(lme4)
## Loading required package: Matrix
## Loading required package: Rcpp
library(plyr)
library(ggplot2)

data("Machines", package = "MEMSS")
str(Machines)
## 'data.frame':    54 obs. of  3 variables:
##  $ Worker : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 2 2 2 3 3 3 4 ...
##  $ Machine: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
##  $ score  : num  52 52.8 53.1 51.8 52.8 53.1 60 60.2 58.4 51.1 ...
summary(Machines)
##  Worker Machine     score     
##  1:9    A:18    Min.   :43.0  
##  2:9    B:18    1st Qu.:52.4  
##  3:9    C:18    Median :61.6  
##  4:9            Mean   :59.6  
##  5:9            3rd Qu.:65.3  
##  6:9            Max.   :72.1
ddply(Machines, .(Machine), summarise, N=length(score), M=round(mean(score), 2), SD=round(sd(score), 2))
##   Machine  N     M   SD
## 1       A 18 52.36 3.99
## 2       B 18 60.32 8.16
## 3       C 18 66.27 4.19

Let’s start with default treatment contrasts for the Machine factor. The first model specifies the maximal LMM; the second model uses the ||-syntax to force (some) correlations to zero.

contrasts(Machines$Machine) <- contr.treatment(3)
print(summary(m1a <- lmer(score ~ Machine + (Machine  | Worker), REML=FALSE, data=Machines)), corr=FALSE)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ Machine + (Machine | Worker)
##    Data: Machines
## 
##      AIC      BIC   logLik deviance df.resid 
##    236.4    256.3   -108.2    216.4       44 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.4077 -0.5189  0.0323  0.4560  2.5409 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  Worker   (Intercept) 13.816   3.717               
##           Machine2    28.686   5.356     0.49      
##           Machine3    11.243   3.353    -0.36  0.30
##  Residual              0.925   0.962               
## Number of obs: 54, groups:  Worker, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    52.36       1.53    34.1
## Machine2        7.97       2.21     3.6
## Machine3       13.92       1.41     9.9
print(summary(m1b <- lmer(score ~ Machine + (Machine || Worker), REML=FALSE, data=Machines)), corr=FALSE)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ Machine + ((1 | Worker) + (0 + Machine | Worker))
##    Data: Machines
## 
##      AIC      BIC   logLik deviance df.resid 
##    238.4    260.3   -108.2    216.4       43 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.4077 -0.5189  0.0323  0.4560  2.5409 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr     
##  Worker   (Intercept)  0.247   0.497             
##  Worker.1 MachineA    13.569   3.684             
##           MachineB    61.698   7.855    0.80     
##           MachineC    15.758   3.970    0.62 0.77
##  Residual              0.925   0.962             
## Number of obs: 54, groups:  Worker, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    52.36       1.53    34.1
## Machine2        7.97       2.21     3.6
## Machine3       13.92       1.41     9.9

In both models the fixed effects are as expected, that is, the mean for machine A (intercept) and the differences of machines B and C from machine A. Also, the random effects for model m1a are as expected: the worker-related variance components for machine A and for the difference effects of machines B and C, as well as the associated correlation parameters. However, the results reported under “Random effects” for model m1b may not be as expected, at least not if one expected the double-bar notation to force all correlation parameters to zero. The specification forces a zero correlation between intercept and effects, but still reports three correlation parameters for the three machines. The reason is that the double-bar notation evaluates to two random-effects terms:

((1 | Worker) + (0 + Machine | Worker))
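
As an aside, this expansion can be inspected directly. Assuming the installed lme4 version exports the formula utility expandDoubleVerts() (as the versions supporting the || syntax should), the following call should return the expanded formula that also appears in the Formula line of the m1b summary above:

expandDoubleVerts(score ~ Machine + (Machine || Worker))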

The first term gives us the variance component for the overall differences between workers. The second term, following general R formula syntax, suppresses the intercept; that is, we get estimates of worker-related variance components for each of the three machines plus the three associated correlation parameters. Thus, there is a “mismatch” between fixed-effect and random-effect information: the random-effects part of this model contains no information about worker-related variance components associated with the fixed effects. By implication, any contrast specified for the factor will be ignored in the second term, as we can see when we use a sliding-difference contrast (contr.sdif) for the Machine factor.

contrasts(Machines$Machine) <- MASS::contr.sdif(3)
print(summary(m2a <- lmer(score ~ Machine + (Machine  | Worker), REML=FALSE, data=Machines)), corr=FALSE)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ Machine + (Machine | Worker)
##    Data: Machines
## 
##      AIC      BIC   logLik deviance df.resid 
##    236.4    256.3   -108.2    216.4       44 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.4077 -0.5189  0.0323  0.4560  2.5409 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  Worker   (Intercept) 22.895   4.785               
##           Machine2-1  28.686   5.356     0.82      
##           Machine3-2  29.310   5.414    -0.77 -0.81
##  Residual              0.925   0.962               
## Number of obs: 54, groups:  Worker, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    59.65       1.96   30.47
## Machine2-1      7.97       2.21    3.60
## Machine3-2      5.95       2.23    2.66
print(summary(m2b <- lmer(score ~ Machine + (Machine || Worker), REML=FALSE, data=Machines)), corr=FALSE)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ Machine + ((1 | Worker) + (0 + Machine | Worker))
##    Data: Machines
## 
##      AIC      BIC   logLik deviance df.resid 
##    238.4    260.3   -108.2    216.4       43 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.4077 -0.5189  0.0323  0.4560  2.5409 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr     
##  Worker   (Intercept)  0.691   0.831             
##  Worker.1 MachineA    13.125   3.623             
##           MachineB    61.254   7.826    0.81     
##           MachineC    15.314   3.913    0.61 0.77
##  Residual              0.925   0.962             
## Number of obs: 54, groups:  Worker, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    59.65       1.96   30.47
## Machine2-1      7.97       2.21    3.60
## Machine3-2      5.95       2.23    2.66

As expected, the fixed effects now estimate the Grand Mean of the machine means (intercept) and the differences between machines B and A and between machines C and B. The variance components and correlation parameters of model m2a correspond to the new set of fixed effects; those of the second random-effects term of model m2b are not affected by the switch to the different contrast for the Machine factor and are numerically very close to those reported for model m1b.
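
One way to check this claim (not evaluated here) is to print the worker-related (co-)variance estimates of the two models side by side with lme4’s VarCorr() accessor:

VarCorr(m1b)   # zcpLMM with treatment contrasts for the fixed effects
VarCorr(m2b)   # zcpLMM with sliding-difference contrasts for the fixed effects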

So how do we get variance components for the fixed effects while at the same time forcing all of their correlation parameters to zero (not only those between intercept and effects)? One option is to work with numeric covariates. The simplest way to obtain them is to extract them from the model matrix. Let’s go back to model m1a with treatment contrasts.

cB_A <- model.matrix(m1a)[,2]
cC_A <- model.matrix(m1a)[,3]  
print(summary(m1c <- lmer(score ~ 1 + Machine + (cB_A + cC_A  | Worker), REML=FALSE, data=Machines)), corr=FALSE)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ 1 + Machine + (cB_A + cC_A | Worker)
##    Data: Machines
## 
##      AIC      BIC   logLik deviance df.resid 
##    236.4    256.3   -108.2    216.4       44 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.4077 -0.5189  0.0323  0.4560  2.5409 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  Worker   (Intercept) 13.816   3.717               
##           cB_A        28.686   5.356     0.49      
##           cC_A        11.243   3.353    -0.36  0.30
##  Residual              0.925   0.962               
## Number of obs: 54, groups:  Worker, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    59.65       1.96   30.47
## Machine2-1      7.97       2.21    3.60
## Machine3-2      5.95       2.23    2.66
print(summary(m1d <- lmer(score ~ 1 + Machine + (cB_A + cC_A || Worker), REML=FALSE, data=Machines)), corr=FALSE)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ 1 + Machine + ((1 | Worker) + (0 + cB_A | Worker) + (0 +  
##     cC_A | Worker))
##    Data: Machines
## 
##      AIC      BIC   logLik deviance df.resid 
##    235.1    249.0   -110.6    221.1       47 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3119 -0.5070  0.0044  0.4530  2.4712 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Worker   (Intercept) 13.777   3.712   
##  Worker.1 cB_A        28.827   5.369   
##  Worker.2 cC_A        10.932   3.306   
##  Residual              0.927   0.963   
## Number of obs: 54, groups:  Worker, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    59.65       1.75    34.2
## Machine2-1      7.97       2.22     3.6
## Machine3-2      5.95       2.59     2.3
anova(m1d, m1c)
## Data: Machines
## Models:
## m1d: score ~ 1 + Machine + ((1 | Worker) + (0 + cB_A | Worker) + (0 + 
## m1d:     cC_A | Worker))
## m1c: score ~ 1 + Machine + (cB_A + cC_A | Worker)
##     Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## m1d  7 235 249   -111      221                        
## m1c 10 236 256   -108      216   4.7      3        0.2

Model m1c is identical to model m1a, but model m1d is different from model m1b. Model m1d estimates 3 worker-related variance components (i.e., for machine A and for the differences of machines B and C from machine A) and 0 correlation parameters. The likelihood ratio test (LRT) tells us that there is no evidence for significant correlations between these effects. And there is an option for a further simplification: perhaps the variances of the worker-related effects can be constrained to equality without a loss of goodness of fit. For this test we need to switch back to the Machine factor.

print(summary(m1e <- lmer(score ~ Machine + (1|Worker) + (1|Machine:Worker), REML=FALSE, data=Machines)), corr=FALSE)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ Machine + (1 | Worker) + (1 | Machine:Worker)
##    Data: Machines
## 
##      AIC      BIC   logLik deviance df.resid 
##    237.3    249.2   -112.6    225.3       48 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2590 -0.5513 -0.0065  0.4467  2.5489 
## 
## Random effects:
##  Groups         Name        Variance Std.Dev.
##  Machine:Worker (Intercept) 11.540   3.397   
##  Worker         (Intercept) 19.049   4.364   
##  Residual                    0.925   0.962   
## Number of obs: 54, groups:  Machine:Worker, 18; Worker, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    59.65       1.96   30.47
## Machine2-1      7.97       1.99    4.01
## Machine3-2      5.95       1.99    2.99
anova(m1e, m1d)
## Data: Machines
## Models:
## m1e: score ~ Machine + (1 | Worker) + (1 | Machine:Worker)
## m1d: score ~ 1 + Machine + ((1 | Worker) + (0 + cB_A | Worker) + (0 + 
## m1d:     cC_A | Worker))
##     Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)  
## m1e  6 237 249   -113      225                          
## m1d  7 235 249   -111      221  4.16      1      0.041 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The LRT suggests that the assumption that the worker-related variances for the two contrasts are equal comes at a significant cost in goodness of fit. In other words, allowing for differences in the variances of worker-related effects increases the goodness of fit.

Similarly, we can estimate worker-related variance components for the second contrast specification, that is, for the differences between successive machines.

cB_A <- model.matrix(m2a)[,2]
cC_B <- model.matrix(m2a)[,3]
summary(m2c <- lmer(score ~ 1 + Machine + (cB_A + cC_B  | Worker), REML=FALSE, data=Machines))
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ 1 + Machine + (cB_A + cC_B | Worker)
##    Data: Machines
## 
##      AIC      BIC   logLik deviance df.resid 
##    236.4    256.3   -108.2    216.4       44 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.4077 -0.5189  0.0323  0.4560  2.5409 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  Worker   (Intercept) 22.895   4.785               
##           cB_A        28.686   5.356     0.82      
##           cC_B        29.310   5.414    -0.77 -0.81
##  Residual              0.925   0.962               
## Number of obs: 54, groups:  Worker, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    59.65       1.96   30.47
## Machine2-1      7.97       2.21    3.60
## Machine3-2      5.95       2.23    2.66
## 
## Correlation of Fixed Effects:
##            (Intr) Mch2-1
## Machine2-1  0.811       
## Machine3-2 -0.765 -0.800
summary(m2d <- lmer(score ~ 1 + Machine + (cB_A + cC_B || Worker), REML=FALSE, data=Machines))
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: score ~ 1 + Machine + ((1 | Worker) + (0 + cB_A | Worker) + (0 +  
##     cC_B | Worker))
##    Data: Machines
## 
##      AIC      BIC   logLik deviance df.resid 
##    243.6    257.5   -114.8    229.6       47 
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -2.249 -0.519 -0.002  0.431  2.573 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Worker   (Intercept) 22.895   4.785   
##  Worker.1 cB_A        28.193   5.310   
##  Worker.2 cC_B        28.807   5.367   
##  Residual              0.927   0.963   
## Number of obs: 54, groups:  Worker, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    59.65       1.96   30.47
## Machine2-1      7.97       2.19    3.64
## Machine3-2      5.95       2.21    2.69
## 
## Correlation of Fixed Effects:
##            (Intr) Mch2-1
## Machine2-1  0.000       
## Machine3-2  0.000 -0.011
anova(m2d, m2c)
## Data: Machines
## Models:
## m2d: score ~ 1 + Machine + ((1 | Worker) + (0 + cB_A | Worker) + (0 + 
## m2d:     cC_B | Worker))
## m2c: score ~ 1 + Machine + (cB_A + cC_B | Worker)
##     Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)   
## m2d  7 244 258   -115      230                           
## m2c 10 236 256   -108      216  13.2      3     0.0043 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

With this contrast specification, fixing the correlation parameters to zero already reduces the goodness of fit significantly. Thus, this example also illustrates that model complexity depends on the planned comparisons implemented via contrasts. Model m1d would be chosen for the treatment contrast (using machine A as reference), whereas the more complex model m2c is justified if the design calls for planned comparisons between machines A and B and between machines B and C.

Finally, what is the advantage of the ||-notation? For the Machines data the advantage is not large because the fixed factor Machine has only three levels. The traditional way of specifying the zcpLMM with numeric covariates requires a separate random-effects term for every variance component, that is, for the present example:

summary(m2e <- lmer(score ~ 1 + Machine + (1 | Worker) + (0 + cB_A | Worker) + (0 + cC_B | Worker),
                    REML=FALSE, data=Machines))

Model m2e (not evaluated here) yields results identical to those of model m2d. With a large number of variance components, arising from factors with more levels or from complex experimental designs with several factors and their interactions, the new ||-syntax may save quite a bit of typing. Moreover, the specification needed for a comparison with the maximal model is minimal (e.g., the specifications of models m2c and m2d differ in only one extra '|' for m2d), which provides a very convenient test of the significance of the ensemble of correlation parameters or of subsets of them.
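
As a rough sketch of how this scales, consider a design with five numeric effect covariates (the names y, c1 to c5, and Subj below are hypothetical and not part of the Machines example). Written as formulas only, the compact and the spelled-out zcpLMM specifications would be:

# Compact double-bar version (hypothetical variables; formula only, not fitted):
f_compact  <- y ~ 1 + c1 + c2 + c3 + c4 + c5 + (c1 + c2 + c3 + c4 + c5 || Subj)
# Equivalent spelled-out version with one random-effects term per variance component:
f_explicit <- y ~ 1 + c1 + c2 + c3 + c4 + c5 +
  (1 | Subj) + (0 + c1 | Subj) + (0 + c2 | Subj) +
  (0 + c3 | Subj) + (0 + c4 | Subj) + (0 + c5 | Subj)

Passed to lmer() with suitable data, the two specifications should yield identical fits, just as m2d and m2e do above.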

Package Versions

sessionInfo()
## R version 3.1.0 (2014-04-10)
## Platform: x86_64-apple-darwin13.1.0 (64-bit)
## 
## locale:
## [1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] plyr_1.8.1    lme4_1.1-7    Rcpp_0.11.2   Matrix_1.1-4  ggplot2_1.0.0
## [6] knitr_1.6    
## 
## loaded via a namespace (and not attached):
##  [1] colorspace_1.2-4    digest_0.6.4        evaluate_0.5.5     
##  [4] formatR_0.10        grid_3.1.0          gtable_0.1.2       
##  [7] htmltools_0.2.4     lattice_0.20-29     MASS_7.3-33        
## [10] minqa_1.2.3         munsell_0.4.2       nlme_3.1-117       
## [13] nloptr_1.0.0        proto_0.3-10        RcppEigen_0.3.2.1.2
## [16] reshape2_1.4        rmarkdown_0.2.46    scales_0.2.4       
## [19] splines_3.1.0       stringr_0.6.2       tools_3.1.0        
## [22] yaml_2.1.13