Mixed models and non binary random design matrix

The few functions described below allow to consider non binary random design matrix using the machinery of the ‘lme4’ R package.

Writing some classical mixed models for assessing the mixing abilitiy for cultivar mixture, we realized that no free R package was available (2017) for dealing with non binary random design matrix. Although ’lme4’is a widespread and effective package, it does only consider random design issued from group structures.

In order to overcome this limitation we wrote a replacement function for dealing the input of the model and we relied on the optimization machinery of the package for estimating the parameters.

Principle of the code

library(lme4)

## Loading required package: Matrix

The core function is a replacement for the function ‘mkReTrms’ of the ‘lme4’ package, which makes the random effect terms (by mainly creating the design matrix \(Z\) and inpution the random covariance matrix of each factor).

myReTrms<-function(ListZ,ListVar=NULL){
  reTrms<-list()
  reTrms$Zt    <- Matrix(t(Reduce('cbind',ListZ)),sparse=TRUE)
  reTrms$theta <- rep(1,length(ListZ))  # Initial Value of the covariance parameters
  reTrms$Lind  <- rep(1:length(ListZ),unlist(lapply(ListZ,ncol))) #an integer vector of indices determining the mapping of the elements of the theta vector to the "x" slot of Lambdat
  reTrms$Gp    <- as.integer(unname(cumsum(c(0,unlist(lapply(ListZ,ncol))))))
  reTrms$lower   <- rep(0,length(ListZ)) # lower bounds on the covariance parameters
  if (is.null(ListVar)) {reTrms$Lambdat <- Matrix(diag(rep(1,sum(unlist(lapply(ListZ,ncol))))),sparse=TRUE)} # transpose of the sparse relative covariance factor
  else {reTrms$Lambdat <- bdiag(lapply(ListVar,chol))}
   reTrms$Ztlist <- lapply(ListZ,function(Z) Matrix(t(Z),sparse=T))
  reTrms$cnms   <-  as.list(names(ListZ)) ; names(reTrms$cnms)<- names(ListZ)
  # Flist is Not very clean (to say the least... )
  reTrms$flist <- lapply(ListZ,function(Z) {flist.factor<- as.factor(colnames(Z)[apply(Z,1,function(x) which(rmultinom(n=1,size=1,prob =x)==1) )]);
                                            levels(flist.factor)<-colnames(Z); return(flist.factor)}) #NULL # list of grouping factors used in the random-effects terms (used for computing initial variance ??)
  return(reTrms)
}

The main function is a replacement of the lmer function. Instead of specifying an R formula for describing the model, the replacement function takes as input - the response vector - the fixed effect design matrix - the list of random effect design matrices - the corresponding list of covariance matrices

mylmer <- function(Response, X, ListZ,ListVar=NULL, REML = TRUE){
  notNA<-!is.na(Response)       # Get rid of the NA
  Response<- Response[notNA]   # Find another solution about the NA ???
  X <- X[notNA,]
  ListZ <- lapply(ListZ,function(Z) Z<-Z[notNA,]) 
  fr<-model.frame(Response~.,data.frame(Response=Response,X) )               
  reTrms <- myReTrms(ListZ,ListVar)
  devfun <- mkLmerDevfun(fr, X, reTrms)
  opt <- optimizeLmer(devfun)
  return(mkMerMod(environment(devfun), opt, reTrms, fr = fr))
}

The extraction of the random effect needs also a modified function.

# My ranef replacement
myranef<-function(model,condVar=TRUE){
  re.cond.mode<-tapply(model@u,mymod@pp$Lind,function(x) x)
  names(re.cond.mode)<- names(model@cnms)

  if (condVar) {
    Zt<-model@pp$Zt
    D  <- sigma(model)* t(model@pp$Lambdat) %*% model@pp$Lambdat
    Sigma<- t(Zt)%*%  D %*%Zt + sigma(model)*diag(rep(1,ncol(Zt)))
    var.cond <- D - Zt %*%solve(Sigma) %*% t(Zt)
    var.cond <- diag(var.cond) 
    var.cond.mode <- tapply(var.cond,mymod@pp$Lind,function(x) x)
    }
  for (i in 1:length(re.cond.mode)) {
    re.cond.mode[[i]]<-data.frame(re.cond.mode[i])
    names(re.cond.mode[[i]])<-names(re.cond.mode[i])
    row.names(re.cond.mode[[i]]) <- levels(model@flist[[i]])
    
    if (condVar) attr(re.cond.mode[[i]],"postVar")=array(var.cond.mode[[i]],c(1,1,nrow(re.cond.mode[[i]])))
  }
  attr(re.cond.mode,"class") <- "ranef.mer" 
  re.cond.mode
}

Example code

As specified above the model specification needs the following objects: - the response vector - the fixed effect design matrix - the list of random effect design matrices - the corresponding list of covariance matrices

If the last argument is empty each covariance matrix associated to the random effect will be spherical.

Estimation of two different model (with and without GMA)

load("cultivar-mix.Rdata")
ListZ <- list(GMA=Zg,SMA=Zc)
mymod<-mylmer(Response =  Phenotype[[1]],X, ListZ,REML=FALSE) 
mymod.gma<-mylmer(Response =  Phenotype[[1]],X, ListZ[1],REML=FALSE)

Once the model is estimated classical outputs are available

summary(mymod)

## Linear mixed model fit by REML ['lmerMod']
## 
## REML criterion at convergence: 1415.3
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.98362 -0.58842 -0.06312  0.57326  3.11492 
## 
## Random effects:
##  Groups   Name Variance Std.Dev.
##  <NA>     GMA  42.97    6.555   
##  <NA>     SMA  41.42    6.436   
##  Residual      44.72    6.687   
## Number of obs: 202, groups:  GMA, 25; SMA, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  72.9659     1.6720   43.64
## Rep2         -7.2429     0.9422   -7.69
## 
## Correlation of Fixed Effects:
##      (Intr)
## Rep2 -0.282

sigma(mymod)

## [1] 6.686995

Models can be compared

anova(mymod.gma,mymod)

## refitting model(s) with ML (instead of REML)

## Data: NULL
## Models:
## mymod.gma: NULL
## mymod: NULL
##           Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)  
## mymod.gma  4 1432.6 1445.8 -712.31   1424.6                           
## mymod      5 1429.8 1446.3 -709.90   1419.8 4.8129      1    0.02825 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Alternative plots are available

library(lattice)
dotplot(myranef(mymod))

## $GMA

## 
## $SMA

pr01 <- profile(mymod)
xyplot(pr01, aspect = 1.3)

confint(pr01,level=0.9)

##                   5 %      95 %
## .sig01       3.035099  9.315263
## .sig02       3.142867  9.168337
## .sigma       5.978508  7.450321
## (Intercept) 70.111129 75.819397
## Rep2        -8.794168 -5.690974

splom(pr01)

Caution

The functions are still experimental and should be used at your own risk.

R Code for allowing non binary random design matrix in the lme4 R package

2017-09-09

Mixed models and non binary random design matrix

Principle of the code

Example code

Estimation of two different model (with and without GMA)

Caution