Weird results for (correlated) random intercepts and slopes in unbalanced data

The “Beat the Blues” dataset looks at longitudinal change in a depression score “bdi” over 8 months under 2 different treatments:

library("lme4")
## Loading required package: Matrix
## Loading required package: Rcpp

data("BtheB", package = "HSAUR")
BtheB_long <- reshape(BtheB, idvar = "subject", varying = c("bdi.2m", "bdi.4m", 
    "bdi.6m", "bdi.8m"), direction = "long")
BtheB_long$time <- factor(BtheB_long$time, levels = c("2m", "4m", "6m", "8m"))
BtheB_long$subject <- factor(BtheB_long$subject)

# add a numeric time variable that gives the number of months after
# treatment:
BtheB_long$months <- as.numeric(BtheB_long$time) * 2

# remove observations with missing values
BtheB_long <- BtheB_long[complete.cases(BtheB_long), ]
str(BtheB_long)
## 'data.frame':    280 obs. of  8 variables:
##  $ drug     : Factor w/ 2 levels "No","Yes": 1 2 2 1 2 2 2 1 2 2 ...
##  $ length   : Factor w/ 2 levels "<6m",">6m": 2 2 1 2 2 1 1 2 1 2 ...
##  $ treatment: Factor w/ 2 levels "TAU","BtheB": 1 2 1 2 2 2 1 1 2 2 ...
##  $ bdi.pre  : num  29 32 25 21 26 7 17 20 18 20 ...
##  $ time     : Factor w/ 4 levels "2m","4m","6m",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ bdi      : num  2 16 20 17 23 0 7 20 13 5 ...
##  $ subject  : Factor w/ 100 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ months   : num  2 2 2 2 2 2 2 2 2 2 ...
##  - attr(*, "reshapeLong")=List of 4
##   ..$ varying:List of 1
##   .. ..$ bdi: chr  "bdi.2m" "bdi.4m" "bdi.6m" "bdi.8m"
##   .. ..- attr(*, "v.names")= chr "bdi"
##   .. ..- attr(*, "times")= chr  "2m" "4m" "6m" "8m"
##   ..$ v.names: chr "bdi"
##   ..$ idvar  : chr "subject"
##   ..$ timevar: chr "time"

Fit random intercept / slope model, with and without correlation between random effects:

lmm_slopes <- lmer(bdi ~ months + treatment + drug + bdi.pre + (months | subject), 
    data = BtheB_long)
lmm_slopes_nocorr <- lmer(bdi ~ months + treatment + drug + bdi.pre + (months || 
    subject), data = BtheB_long)
print(summary(lmm_slopes), correlation = FALSE)
## Linear mixed model fit by REML ['lmerMod']
## Formula: bdi ~ months + treatment + drug + bdi.pre + (months | subject)
##    Data: BtheB_long
## 
## REML criterion at convergence: 1868
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -2.327 -0.466 -0.081  0.372  3.538 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  subject  (Intercept) 49.825   7.059         
##           months       0.231   0.481    -0.09
##  Residual             23.865   4.885         
## Number of obs: 280, groups: subject, 97
## 
## Fixed effects:
##                Estimate Std. Error t value
## (Intercept)      5.9803     2.2183    2.70
## months          -0.6996     0.1560   -4.48
## treatmentBtheB  -2.4898     1.6989   -1.47
## drugYes         -2.9070     1.7334   -1.68
## bdi.pre          0.6455     0.0766    8.42
print(summary(lmm_slopes_nocorr), correlation = FALSE)
## Linear mixed model fit by REML ['lmerMod']
## Formula: bdi ~ months + treatment + drug + bdi.pre + ((1 | subject) +  
##     (0 + months | subject))
##    Data: BtheB_long
## 
## REML criterion at convergence: 1868
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -2.378 -0.469 -0.093  0.366  3.581 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  subject   (Intercept) 48.083   6.934   
##  subject.1 months       0.192   0.438   
##  Residual              24.093   4.908   
## Number of obs: 280, groups: subject, 97
## 
## Fixed effects:
##                Estimate Std. Error t value
## (Intercept)      5.9684     2.2138    2.70
## months          -0.6970     0.1548   -4.50
## treatmentBtheB  -2.5015     1.6985   -1.47
## drugYes         -2.9206     1.7328   -1.69
## bdi.pre          0.6462     0.0766    8.44
anova(lmm_slopes, lmm_slopes_nocorr, refit = FALSE)
## Data: BtheB_long
## Models:
## lmm_slopes_nocorr: bdi ~ months + treatment + drug + bdi.pre + ((1 | subject) + 
## lmm_slopes_nocorr:     (0 + months | subject))
## lmm_slopes: bdi ~ months + treatment + drug + bdi.pre + (months | subject)
##                   Df  AIC  BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## lmm_slopes_nocorr  8 1884 1913   -934     1868                        
## lmm_slopes         9 1886 1919   -934     1868  0.03      1       0.86

So, the estimated correlation between slopes and intercepts is -0.0086 and it is not significantly different from 0.

First major weirdness:

The empirical correlation of the estimated random effects is much much larger than this:

cor(ranef(lmm_slopes)[[1]])
##             (Intercept) months
## (Intercept)      1.0000 0.3839
## months           0.3839 1.0000

I realize there's a difference between the parameters that maximize the deviance that are reported in the summary and the empirical distribution of the posterior estimates, but still – a different sign, and that much larger? Even more strange, the correlation of the estimates becomes larger once we assume uncorrelated effects:

cor(ranef(lmm_slopes_nocorr)[[1]])
##             (Intercept) months
## (Intercept)      1.0000 0.5537
## months           0.5537 1.0000

Second major weirdness:

If we check how dropping the correlation from the prior for the random effects affected the values of the estimated random effects, we see this (red dots are estimates under assumed correlation, end points of arrows are estimates for prior without correlation):

ints_slopes <- data.frame(ranef(lmm_slopes)[[1]], ranef(lmm_slopes_nocorr)[[1]])
# make better column names
colnames(ints_slopes) <- c("int", "slope", "int_nocorr", "slope_nocorr")
require("ggplot2")
## Loading required package: ggplot2
require("grid")  # for arrows
## Loading required package: grid
require("gridExtra")  # for arrange
## Loading required package: gridExtra
grid.arrange(qplot(x = int, y = slope, data = ints_slopes, geom = "point", col = "red") + 
    theme_bw() + ylim(range(ints_slopes$slope_nocorr)) + theme(legend.position = "none"), 
    qplot(x = int_nocorr, y = slope_nocorr, data = ints_slopes, geom = "point") + 
        theme_bw() + ylim(range(ints_slopes$slope_nocorr)) + theme(legend.position = "none"), 
    nrow = 1)
## Warning: Removed 3 rows containing missing values (geom_point).

plot of chunk unnamed-chunk-5

qplot(x = int, y = slope, xend = int_nocorr, yend = slope_nocorr, data = ints_slopes, 
    geom = "point", col = "red") + geom_segment(arrow = arrow(length = unit(0.1, 
    "cm")), col = "black") + theme_bw() + theme(legend.position = "none")

plot of chunk unnamed-chunk-6

So the effects of dropping the correlation are

Third major weirdness:

pr_lmm_slopes <- profile(lmm_slopes, which = ".sig02")
## Warning: non-monotonic profile
qplot(x = .sig02, y = .zeta, data = pr_lmm_slopes) + theme_bw()

plot of chunk unnamed-chunk-7

confint(pr_lmm_slopes)
##        2.5 % 97.5 %
## .sig02    -1      1

Why does profile fail here? Why would the profile LR have to be monotonous? Is it reasonable hat confint returns \( (-1, 1) \) if the deviance wasn't even evaluated for values below -0.5987?


What's happening here? One possible explanation is to note that there are lots of subjects with early drop-out that have \( \leq 2 \) measurements, so there isn't really enough info for estimating slopes for those:

obs_per_subject <- table(BtheB_long$subject)
table(obs_per_subject)
## obs_per_subject
##  0  1  2  3  4 
##  3 24 15  6 52

The 24 subjects with only a single observation are the ones whose random effects lie on this weird line-shaped pattern in the centre of the point cloud:

## add obs_per_subject as 'Freq' variable to random effect estimates:
ints_slopes <- merge(obs_per_subject, ints_slopes, by = "row.names")

qplot(x = int, y = slope, data = ints_slopes, geom = "point", col = as.factor(Freq)) + 
    theme_bw() + ylim(range(ints_slopes$slope_nocorr))
## Warning: Removed 3 rows containing missing values (geom_point).

plot of chunk unnamed-chunk-9

qplot(x = int_nocorr, y = slope_nocorr, data = ints_slopes, geom = "point", 
    col = as.factor(Freq)) + theme_bw() + ylim(range(ints_slopes$slope_nocorr))

plot of chunk unnamed-chunk-9

pr_lmm_slopes <- profile(lmm_slopes, which = ".sig02")
## Warning: non-monotonic profile
confint(pr_lmm_slopes)
##        2.5 % 97.5 %
## .sig02    -1      1