```

References

Background

The estimation process for GLMM is difficult. In GLMM, the (marginal) likelihood to be maximized does not have a simple closed-form expression. This is because the likelihood expression is obtained by integrating or averaging over the distribution of the random effects. This integrating over random effects is intractable.

Therefore various approximation methods were developed.

The adaptive Gaussian quadrature method is replaces the integration over the entire range of random effects with summations at representative quadrature points. This can be made arbitrarily accurate by increasing quadrature points to be considered, but in the higher dimensional case (many random effects), it becomes computational burdensome.

Another approximation is to approximate the integrand that is integrated over the random effects, so that the marginal likelihood has a closed form. This is called the Laplace approximation and corresponds to an adaptive Gaussian quadrature with only one quadrature point.

The penalized quasilikelihood (PQL) method approximate the model for the outcome (conditional mean given a random effect and an error) by a linear mixed effects model. The outcome variable and the error are transformed to fit this configuration. The transformed outcome variable is called the pseudo-data. Estimation of the psuedo-data and the parameters of interest are iterated until convergence.

Becoming increasing popular is the Markov Chain Monte Carlo method, which enables sampling from a distribution that does not have a closed form expression. Given the convergence of the stationary distribution of a sufficiently long MCMC chain and the law of large numbers, the originally intractable integration is numerically achieved (Monte Carlo integration).

Load packages

library(haven)
library(tidyr)
library(dplyr)

Load epilepsy data and prepare


Data on seizure counts for 59 epileptics.


Source: Table 2 (page 664) in Thall and Vail (1990).
With permission of Blackwell Publishing.

Reference: Thall, P.F. and Vail, S.C. (1990). Some covariance models for
longitudinal count data with overdispersion. Biometrics, 46, 657-671,


Description:

The data are from a placebo-controlled clinical trial of 59 epileptics.
Patients with partial seizures were enrolled in a randomized
clinical trial of the anti-epileptic drug, progabide. Participants
in the study were randomized to either progabide or a placebo, as
an adjuvant to the standard anti-epileptic chemotherapy. Progabide
is an anti-epileptic drug whose primary mechanism of action is to
enhance gamma-aminobutyric acid (GABA) content; GABA is the
primary inhibitory neurotransmitter in the brain.
Prior to receiving treatment, baseline data on the number of
epileptic seizures during the preceding 8-week interval were
recorded. Counts of epileptic seizures during  2-week intervals
before each of four successive post-randomization clinic visits
were recorded.


Variable List:

Patient ID, Treatment (0=Placebo, 1=Progabide), Age,
Baseline 8 week seizure count, First 2 week seizure count,
Second 2 week seizure count, Third 2 week seizure count,
Fourth 2 week seizure count.
epi <- haven::read_sas("http://www.hsph.harvard.edu/fitzmaur/ala2e/epilepsy.sas7bdat")
names(epi) <- tolower(names(epi))
epi
## Source: local data frame [59 x 8]
## 
##       id   trt   age    y0    y1    y2    y3    y4
##    (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)
## 1      1     0    31    11     5     3     3     3
## 2      2     0    30    11     3     5     3     3
## 3      3     0    25     6     2     4     0     5
## 4      4     0    36     8     4     4     1     4
## 5      5     0    22    66     7    18     9    21
## 6      6     0    29    27     5     2     8     7
## 7      7     0    31    12     6     4     0     2
## 8      8     0    36    52    40    20    23    12
## 9      9     0    37    23     5     6     6     5
## 10    10     0    28    10    14    13     6     0
## ..   ...   ...   ...   ...   ...   ...   ...   ...
## Wide to long conversion
epiL <- gather_(epi, key_col = "time", value_col = "y", gather_cols = paste0("y", 0:4)) %>%
    mutate(time = as.numeric(gsub("y", "", .$time))) %>%
    arrange(id, time)
## Warning: attributes are not identical across measure variables; they will be dropped
## Add observation time (offset)
epiL$T[epiL$time == 0] <- 8
epiL$T[epiL$time != 0] <- 2
epiL$logT <- log(epiL$T)
epiL
## Source: local data frame [295 x 7]
## 
##       id   trt   age  time     y     T      logT
##    (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)     (dbl)
## 1      1     0    31     0    11     8 2.0794415
## 2      1     0    31     1     5     2 0.6931472
## 3      1     0    31     2     3     2 0.6931472
## 4      1     0    31     3     3     2 0.6931472
## 5      1     0    31     4     3     2 0.6931472
## 6      2     0    30     0    11     8 2.0794415
## 7      2     0    30     1     3     2 0.6931472
## 8      2     0    30     2     5     2 0.6931472
## 9      2     0    30     3     3     2 0.6931472
## 10     2     0    30     4     3     2 0.6931472
## ..   ...   ...   ...   ...   ...   ...       ...

Modeling

The time term is assumed normal and the quantitiy of interest is the treatment \(\times\) time interaction. The time points are nested within ID.

Generalized estimating equation

This model has the population average interpretaion, thus, the parameters are different from the following GLMM examples.

library(geepack)
geeglm1 <- geeglm(formula   = y ~ time + trt + time:trt,
                  family    = poisson(link = "log"),
                  id        = id,
                  offset    = logT,
                  data      = epiL,
                  corstr    = "ar1",
                  scale.fix = FALSE)
summary(geeglm1)
## 
## Call:
## geeglm(formula = y ~ time + trt + time:trt, family = poisson(link = "log"), 
##     data = epiL, offset = logT, id = id, corstr = "ar1", scale.fix = FALSE)
## 
##  Coefficients:
##              Estimate   Std.err   Wald Pr(>|W|)    
## (Intercept)  1.212426  0.197628 37.637 8.52e-10 ***
## time         0.001602  0.031498  0.003    0.959    
## trt          0.084465  0.239008  0.125    0.724    
## time:trt    -0.044992  0.047928  0.881    0.348    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Estimated Scale Parameters:
##             Estimate Std.err
## (Intercept)    23.58   11.83
## 
## Correlation: Structure = ar1  Link = identity 
## 
## Estimated Correlation Parameters:
##       Estimate Std.err
## alpha   0.8953 0.05795
## Number of clusters:   59   Maximum cluster size: 5

glmmPQL (Penalized Quasi-Likelihood)

This method did not converge.

library(MASS)
glmmPQL1 <- glmmPQL(fixed = y ~ time + trt + time:trt + offset(logT),
                    random = ~ 1 + time | id,
                    family = poisson,
                    data = epiL)
## Error in lme.formula(fixed = zz ~ time + trt + time:trt + offset(logT), : nlminb problem, convergence error code = 1
##   message = iteration limit reached without convergence (10)

lme4 (Laplace Approximation)

The model is fit using the Laplace approximatin. The adaptive gaussian quadrature is only available for one random effect model.

library(lme4)
glmerLaplace <- glmer(formula = y ~ time + trt + time:trt + (1 + time | id) + offset(logT),
                      data = epiL,
                      family = poisson(link = "log"),
                      nAGQ = 1)
summary(glmerLaplace)
## Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
##  Family: poisson  ( log )
## Formula: y ~ time + trt + time:trt + (1 + time | id) + offset(logT)
##    Data: epiL
## 
##      AIC      BIC   logLik deviance df.resid 
##   1924.2   1950.0   -955.1   1910.2      288 
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -3.379 -0.723 -0.117  0.585  6.631 
## 
## Random effects:
##  Groups Name        Variance Std.Dev. Corr
##  id     (Intercept) 0.5269   0.726        
##         time        0.0201   0.142    0.22
## Number of obs: 295, groups:  id, 59
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   1.1040     0.1426    7.74  9.8e-15 ***
## time         -0.0226     0.0336   -0.67    0.501    
## trt           0.0175     0.1963    0.09    0.929    
## time:trt     -0.0934     0.0467   -2.00    0.046 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##          (Intr) time   trt   
## time      0.065              
## trt      -0.724 -0.053       
## time:trt -0.054 -0.694  0.074

glmmML (Adaptive Gaussian Quadrature)

Only one random effect (random intercept) is allowed. So it doesn’t fit the purpose here.

library(glmmML)
glmmML1 <- glmmML(formula = y ~ time + trt + time:trt + offset(logT),
                  family = poisson,
                  data = epiL,
                  cluster = id)
summary(glmmML1)
## 
## Call:  glmmML(formula = y ~ time + trt + time:trt + offset(logT), family = poisson,      data = epiL, cluster = id) 
## 
## 
##                coef se(coef)      z Pr(>|z|)
## (Intercept)  1.0658   0.1517  7.025  2.1e-12
## time         0.0192   0.0157  1.226  2.2e-01
## trt         -0.0275   0.2094 -0.131  9.0e-01
## time:trt    -0.0383   0.0220 -1.742  8.1e-02
## 
## Scale parameter in mixing distribution:  0.779 gaussian 
## Std. Error:                              0.0702 
## 
##         LR p-value for H_0: sigma = 0:  0 
## 
## Residual deviance: 963 on 290 degrees of freedom     AIC: 973

brms (Markov Chain Monte Carlo via Stan)

This is a frontend for Stan. The grammer is essentially that of lme4. The plot method is nice with ggplot2 graphs.

## Set mc.cores, and it will parallelize
options(mc.cores = parallel::detectCores())
library(brms)
brm1 <- brm(formula = y ~ time + trt + time:trt + (1 + time | id) + offset(logT),
            data = epiL,
            family = "poisson")
(summ_brm1 <- summary(brm1))
##  Family: poisson (log) 
## Formula: y ~ time + trt + time:trt + (1 + time | id) + offset(logT) 
##    Data: epiL (Number of observations: 295) 
## Samples: 2 chains, each with n.iter = 2000; n.warmup = 500; n.thin = 1; 
##          total post-warmup samples = 3000
##    WAIC: 1835.84
##  
## Random Effects: 
## ~id (Number of levels: 59) 
##                     Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)           0.77      0.08     0.62     0.95        669    1
## sd(time)                0.15      0.02     0.11     0.20       1171    1
## cor(Intercept,time)     0.20      0.17    -0.14     0.51       1177    1
## 
## Fixed Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept     1.11      0.15     0.81     1.40        360 1.00
## time         -0.02      0.04    -0.09     0.05       1250 1.00
## trt           0.02      0.21    -0.40     0.43        389 1.01
## time:trt     -0.09      0.05    -0.19     0.00       1205 1.00
## 
## Samples were drawn using NUTS(diag_e). For each parameter, Eff.Sample is a 
## crude measure of effective sample size, and Rhat is the potential scale 
## reduction factor on split chains (at convergence, Rhat = 1).
plot(brm1)

rstanarm (Markov Chain Monte Carlo via Stan)

This is an official frontend for Stan. The grammer is essentially that of lme4 except for the offset term.

## This also parallelizes if mc.cores is set.
options(mc.cores = parallel::detectCores())
library(rstanarm)
stan_glmer1 <- stan_glmer(formula = y ~ time + trt + time:trt + (1 + time | id),
                          offset = epiL$logT,
                          data = epiL,
                          family = poisson(link = "log"))
stan_glmer1
## stan_glmer(formula = y ~ time + trt + time:trt + (1 + time | 
##     id), data = epiL, family = poisson(link = "log"), offset = epiL$logT)
## 
## Estimates:
##             Median MAD_SD
## (Intercept)  1.1    0.1  
## time         0.0    0.0  
## trt          0.0    0.2  
## time:trt    -0.1    0.0  
## 
## Error terms:
##  Groups Name        Std.Dev. Corr
##  id     (Intercept) 0.75         
##         time        0.15     0.19
## Num. levels: id 59 
## 
## Sample avg. posterior predictive 
## distribution of y (X = xbar):
##          Median MAD_SD
## mean_PPD 12.9    0.3
plot(stan_glmer1)

glmmstan (Markov Chain Monte Carlo via Stan)

Yet another Stan frontend. It is also only available on Github. The grammer is essentially that of lme4.

## library(devtools)
## install_github("norimune/glmmstan")
library(glmmstan)
glmmstan1 <- glmmstan(y ~ time + trt + time:trt + (1 + time | id),
                      data = epiL,
                      ## Use raw offset variable not log(T)
                      offset = "T",
                      family = "poisson")
## Error in glmmstan(y ~ time + trt + time:trt + (1 + time | id), data = epiL, : (list) object cannot be coerced to type 'double'

It didn’t run for some reason.

MCMCglmm (Markov Chain Monte Carlo via CSparse)

library(MCMCglmm)
MCMCglmm1 <- MCMCglmm(fixed = y ~ time + trt + time:trt + offset(logT),
                      ## us() also estimate the covariance, but it failed to run.
                      ## idh() does not estimate the covariance.
                      random = ~ idh(1 + time):id,
                      family = "poisson",
                      data = epiL)
summary(MCMCglmm1)
## 
##  Iterations = 3001:12991
##  Thinning interval  = 10
##  Sample size  = 1000 
## 
##  DIC: 1605 
## 
##  G-structure:  ~idh(1 + time):id
## 
##                post.mean l-95% CI u-95% CI eff.samp
## (Intercept).id  0.662372 3.95e-01  0.97591   1000.0
## time.id         0.000998 1.75e-24  0.00619     40.4
## 
##  R-structure:  ~units
## 
##       post.mean l-95% CI u-95% CI eff.samp
## units     0.374    0.265    0.471      738
## 
##  Location effects: y ~ time + trt + time:trt + offset(logT) 
## 
##             post.mean  l-95% CI  u-95% CI eff.samp  pMCMC    
## (Intercept)  2.656491  2.304972  3.003983     1000 <0.001 ***
## time        -0.333876 -0.426037 -0.255185     1000 <0.001 ***
## trt          0.000175 -0.507664  0.463569      874   0.95    
## time:trt    -0.093195 -0.222662  0.024543      883   0.14    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The offset() does not seem to be working.

The offset requires some tricks. In the following link, the offset is entered as a regular term, but with a very strong narrow prior.

glmm (Monte Carlo-Maximum Likelihood)

“This function fits generalized linear mixed models (GLMMs) by approximating the likelihood with ordinary Monte Carlo, then maximizing the approximated likelihood.”

This is different from the full MCMC method in that the likelihood is approximated by a Monte Carlo process, and then from there it is a regular maximum likelihood.

library(glmm)
glmm1 <- glmm(fixed = as.integer(y) ~ time + trt + time:trt,
              random = ~ 0 + time,
              data = epiL,
              family.glm = poisson.glmm,
              m = 10^4,
              varcomps.names = c("(Intercept)"))
## Error in eigenstuff$vectors %*% diag(lambda.half): non-conformable arguments

I could not figure out how to run this.

Summary of results from model that run.

parNames <- c("Intercept","Time","Trt","Trt_Time","VarRanInt","VarRanSlTime","CovRan")

## Add the SAS (AGQ) results
coefs <- data.frame(SAS_AGQ = c(Intercept = 1.0707,
                                Time = -0.0004,
                                Trt = 0.0513,
                                Trt_Time = -0.3065,
                                VarRanInt = 0.5010,
                                VarRanSlTime = 0.2334,
                                CovRan = 0.0541))

cbind(coefs,

      lme4_Lap = c(coef(summary(glmerLaplace))[,1],
                   diag(summary(glmerLaplace)$varcor$id),
                   summary(glmerLaplace)$varcor$id[1,2]),

      brms_MCMC = c(summ_brm1$fixed[,1],
                    summ_brm1$random$id[1:2,1]^2,
                    summ_brm1$random$id[3,1]),

      stanarm = c(stan_glmer1$stan_summary[1:4,"mean"],
                  ## Unclear how to extract these
                  c(0.74,0.22)^2, 0.09),

      MCMCglmm = c(summary(MCMCglmm1)$solutions[,1],
                   summary(MCMCglmm1)$Gcovariances[,1],
                   NA),

      geepack = c(coef(geeglm1), rep(NA,3)))
##              SAS_AGQ lme4_Lap brms_MCMC  stanarm   MCMCglmm   geepack
## Intercept     1.0707  1.10396   1.10724  1.10866  2.6564909  1.212426
## Time         -0.0004 -0.02264  -0.02266 -0.02381 -0.3338757  0.001602
## Trt           0.0513  0.01748   0.01553  0.01107  0.0001751  0.084465
## Trt_Time     -0.3065 -0.09340  -0.09474 -0.09480 -0.0931951 -0.044992
## VarRanInt     0.5010  0.52686   0.58854  0.54760  0.6623720        NA
## VarRanSlTime  0.2334  0.02012   0.02300  0.04840  0.0009981        NA
## CovRan        0.0541  0.02237   0.20298  0.09000         NA        NA

GEE estimates parameters with different interpretation. Thus, they do not have to agree.