Plotting raw data is not always very informative when we have complex models. A “better” way to visualize and communicate results is model predictions with associated uncertainty. However, this is often not trivial for mixed models, and particularly for GLMMs. One option is to show the effects on the link scale (e.g. coefficient estimates), or to give the relative (%) change (i.e. 100*(exp(fixef(model)) - 1)). Those options are all fine, but sometimes you want a direct connection between data and model and want to make plots on the response scale. Here I try to look at that in detail for models fitted with a Poisson distribution.
First I simulate a data set with overdispersion. The simulation code is taken from Harrison, Xavier (2014): Overdispersion and Observation-Level Random Effect Simulation Data. figshare. https://dx.doi.org/10.6084/m9.figshare.1144471.v3
library(lme4)     # glmer(), the simulate() method, bootMer()
library(MCMCglmm) # MCMCglmm(); loads coda for HPDinterval() and as.mcmc()

# Data Generation -
# An experimental set-up with two treatments, A and B (fixed), performed in 10 populations (random).
# The data are overdispersed.
set.seed(1)
# Random intercept
intercept.od <- -0.5   # Treatment A mean (link scale)
intercept.od.sd <- 0.5 # Among-population variation (std dev)
n.pops.od <- 10        # 10 populations
n.ind.od <- 50         # 50 individuals in each population
popid.od <- gl(n.pops.od, n.ind.od)
# Treatment effect
treat.slope.od <- 0.25 # link scale
treat <- rep(c("a", "b"), 10, each = 25)
# Overdispersion, sqrt(variance)
epsvals <- sqrt(0.55)
# Data
data2 <- data.frame(popid.od, treat, obs=factor(1:(n.pops.od*n.ind.od)))
# Simulate the response (off). Note: theta and beta are unnamed, so they must follow
# lme4's internal order (random-effect terms sorted by decreasing number of levels:
# obs first, then popid.od; see the messages below).
set.seed(1)
data2$off <- simulate(~ treat + (1|popid.od) + (1|obs), newdata = data2,
                      newparams = list(theta = c(epsvals, intercept.od.sd),
                                       beta  = c(intercept.od, treat.slope.od)),
                      family = poisson)[[1]]
## theta parameter vector not named: assuming same order as internal vector
## beta parameter vector not named: assuming same order as internal vector
# Run the model with an observation-level random effect to handle overdispersion
m.glmer <- glmer(off ~ treat + (1|popid.od) + (1|obs), family = poisson, data = data2)
summary(m.glmer)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: poisson ( log )
## Formula: off ~ treat + (1 | popid.od) + (1 | obs)
## Data: data2
##
## AIC BIC logLik deviance df.resid
## 1331.5 1348.3 -661.7 1323.5 496
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.0087 -0.6968 -0.5896 0.4247 1.7795
##
## Random effects:
## Groups Name Variance Std.Dev.
## obs (Intercept) 0.5381 0.7336
## popid.od (Intercept) 0.1513 0.3890
## Number of obs: 500, groups: obs, 500; popid.od, 10
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.5370 0.1590 -3.377 0.000733 ***
## treatb 0.2504 0.1202 2.083 0.037209 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## treatb -0.416
# The same model with MCMCglmm; "units" is MCMCglmm's built-in observation-level (residual) term
m.mc <- MCMCglmm(off ~ treat, random = ~ popid.od, data = data2, family = "poisson",
                 prior = list(R = list(V = 1, nu = 0.002),
                              G = list(G1 = list(V = 1, nu = 0.002))),
                 nitt = 50000, burnin = 10000, thin = 25,
                 pr = TRUE, pl = TRUE, saveX = TRUE, saveZ = TRUE, verbose = FALSE)
summary(m.mc)
##
## Iterations = 10001:49976
## Thinning interval = 25
## Sample size = 1600
##
## DIC: 1261.963
##
## G-structure: ~popid.od
##
## post.mean l-95% CI u-95% CI eff.samp
## popid.od 0.2637 0.04416 0.6206 1238
##
## R-structure: ~units
##
## post.mean l-95% CI u-95% CI eff.samp
## units 0.5677 0.3567 0.788 652
##
## Location effects: off ~ treat
##
## post.mean l-95% CI u-95% CI eff.samp pMCMC
## (Intercept) -0.54844 -0.92702 -0.15950 1332 0.0075 **
## treatb 0.25354 0.01864 0.49062 1387 0.0325 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There is some evidence of a treatment effect, and the glmer and MCMCglmm outputs are almost identical. So how do we get the predicted treatment means? For reference, the raw treatment means are:
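# Raw treatment means from the simulated data (the same numbers reappear
# in the comparison table below)
aggregate(off ~ treat, data2, mean)
##   treat   off
## 1     a 0.848
## 2     b 1.072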
Let's start with treatment-level predictions.
# First create a data set with our two treatment levels
set.seed(1)
newdat <- data.frame(treat = c("a", "b"), obs = c(1000, 2000), popid.od = c(1000, 2000)) # dummy levels for the random terms
# Treatment level prediction
glmer.no.marg <- predict(m.glmer, newdata=newdat, re.form=NA, type="link") # link scale
# Compare with raw means
rawMeans <- aggregate(off ~ treat, data2, mean)
data.frame(type = rep(c("Treat. level prediction", "Raw mean"), each=2),
rbind(data.frame(treat=c("a","b"),off= exp(glmer.no.marg)), rawMeans))
## type treat off
## 1 Treat. level prediction a 0.5845046
## 2 Treat. level prediction b 0.7508442
## 11 Raw mean a 0.8480000
## 21 Raw mean b 1.0720000
Predicted values are low compared to the raw means. This is expected, though. To quote the MCMCglmm course notes (p. 45-46): “the expectation of the response variable y is different from the linear predictor if we wish to average over the residuals”. In the Poisson model the random effects are not additive on the response scale, since \(E[{\bf y}\,|\,{\bf u},{\bf e}] = \textrm{exp}({\bf X}{\boldsymbol\beta}+{\bf Z}{\bf u}+{\bf e})\). This complicates moving between the link and response scales, see e.g. Fig 2.5 in the MCMCglmm course notes. To get predictions closer to the raw means we can marginalise over the random effects, including the observation-level effect that acts as a residual term (i.e. the overdispersion).
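The identity behind all this is the log-normal mean: if \(u \sim N(0, \sigma^{2})\), then \(E[\textrm{exp}(a+u)] = \textrm{exp}(a+0.5\sigma^{2})\), not \(\textrm{exp}(a)\). A quick numerical check (my own illustration, not part of the original workflow):

u <- rnorm(1e6, 0, sqrt(0.55)) # 'random effects' with the simulated overdispersion variance
mean(exp(u))                   # close to exp(0.5*0.55) = 1.317, not exp(0) = 1
exp(0.5*0.55)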
# Condition on the fitted random effects, predict on the response scale,
# and then average within treatments
marg.res <- predict(m.glmer, re.form = ~ (1|popid.od) + (1|obs), type = "response")
aggregate(marg.res ~ treat, data2, mean)
## treat marg.res
## 1 a 0.8257953
## 2 b 1.0547712
Much better! We can also do this manually for each treatment level. According to the MCMCglmm course notes (p. 46) we should add \(0.5\sigma^{2}\) (where \(\sigma^{2}\) is the sum of the variance components) to the fixed-effect estimate on the link scale.
# Treatment A and B predicted means
glmer.all.marg.treatA <- fixef(m.glmer)[1] +
  0.5*(VarCorr(m.glmer)$obs[1] + VarCorr(m.glmer)$popid.od[1])
glmer.all.marg.treatB <- fixef(m.glmer)[1] + fixef(m.glmer)[2] +
  0.5*(VarCorr(m.glmer)$obs[1] + VarCorr(m.glmer)$popid.od[1])
exp(c(treat_A = glmer.all.marg.treatA, treat_B = glmer.all.marg.treatB))
## treat_A.(Intercept) treat_B.(Intercept)
## 0.8250882 1.0598936
# Numbers look good now. What if we only marginalise over the residual term?
glmer.resid.marg.treatA <- fixef(m.glmer)[1] + 0.5*VarCorr(m.glmer)$obs[1]
glmer.resid.marg.treatB <- fixef(m.glmer)[1] + fixef(m.glmer)[2] + 0.5*VarCorr(m.glmer)$obs[1]
exp(c(treat_A = glmer.resid.marg.treatA, treat_B = glmer.resid.marg.treatB))
## treat_A.(Intercept) treat_B.(Intercept)
## 0.7649630 0.9826578
# The difference we see is the effect of the random population term.
# We can plot the different means.
library(ggplot2)
raw <- aggregate(off ~ treat, data2, mean)
pred.dat <- data.frame(treat = raw[, 1],
                       Prediction = c(log(raw[, 2]),
                                      glmer.all.marg.treatA, glmer.all.marg.treatB,
                                      glmer.resid.marg.treatA, glmer.resid.marg.treatB,
                                      glmer.no.marg),
                       type = rep(c("Raw data", "All random eff marginalised",
                                    "Obs-level marg.", "No terms marginalised"), each = 2))
ggplot(pred.dat, aes(y = exp(Prediction), x = treat, fill = type)) +
geom_bar(position=position_dodge(), stat="identity") +
ylim(c(0,1.2)) + ylab("Predicted response")
Let's try this with the MCMCglmm model.
newdat$off <- 0 # predict.MCMCglmm requires the response column to exist
# predict(m.mc, newdata=newdat, marginal=NULL, posterior="all", type="response")
# Ouch, an error message. I wanted this to handle all the random effects, including the
# observation-level effect that MCMCglmm calls "units", but it fails with new data
# (possibly because the new group levels were never seen by the model). Not sure why.
# Let's try marginalising only the population effect.
predict(m.mc, newdata=newdat, marginal=~popid.od, posterior="all", type="response")
## [,1]
## 1 0.8954663
## 2 1.1532057
# This works, but the predictions are higher than the raw means, which is odd. The
# predicted values do not change if I change the population or obs levels in the new
# data set, and they stay the same when I rerun it.
# Let's try with the original data set instead.
marg.res.mc <- predict(m.mc, marginal=NULL, posterior="all", type="response")
(aggregate(marg.res.mc ~ treat, data2, mean))
## treat V1
## 1 a 0.8484267
## 2 b 1.0924385
# OK, this makes sense: very similar to the raw means.
# We can also try it 'by hand'
a.marg <- mean(m.mc$Sol[,1] + 0.5*rowSums(m.mc$VCV))            # treat a, link scale
b.marg <- mean(rowSums(m.mc$Sol[,1:2]) + 0.5*rowSums(m.mc$VCV)) # treat b, link scale
exp(c(treat_A=a.marg, treat_B=b.marg))
## treat_A treat_B
## 0.8756996 1.1284099
# Similar results to the above, but not identical. The 'by hand' numbers should be
# "correct" according to the course notes, and I'm not sure why they can't be
# replicated with the predict() function.
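# One thing to keep in mind when comparing these numbers (my note, speculative):
# Jensen's inequality. Averaging posterior draws on the link scale and then
# exponentiating is not the same as exponentiating each draw and then averaging,
# so small differences between the 'by hand' route and predict() are not surprising:
lin.a <- m.mc$Sol[,1] + 0.5*rowSums(m.mc$VCV) # treat a, link-scale draws
exp(mean(lin.a)) # the 'by hand' value from above
mean(exp(lin.a)) # response-scale average; always at least as large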
# Let's compare the predictions with the treatment raw means (red line)
par(mfrow=c(1,2))
hist(exp(m.mc$Sol[,1] + 0.5*rowSums(m.mc$VCV)), breaks=100, main="Treat A", xlab="Prediction")
abline(v=mean(data2[data2$treat == "a", "off"]), col="red")
hist(exp(m.mc$Sol[,1]+m.mc$Sol[,2] + 0.5*rowSums(m.mc$VCV)), breaks=100, main="Treat B", xlab="Prediction")
abline(v=mean(data2[data2$treat == "b", "off"]), col="red")
# Again, we can marginalise only over the additional residual effect (overdispersion).
a.marg.res <- mean(m.mc$Sol[,1] + 0.5*m.mc$VCV[,2])            # treat a
b.marg.res <- mean(rowSums(m.mc$Sol[,1:2]) + 0.5*m.mc$VCV[,2]) # treat b
exp(c(treat_A=a.marg.res, treat_B=b.marg.res))
## treat_A treat_B
## 0.7675115 0.9890008
# As expected, lower values.
# Let's put it all together and plot.
a.nomarg.res <- mean(m.mc$Sol[,1])            # treat a, fixed effect only
b.nomarg.res <- mean(rowSums(m.mc$Sol[,1:2])) # treat b, fixed effect only
pred.dat[9:14,2] <- c(a.nomarg.res, b.nomarg.res, a.marg, b.marg, a.marg.res, b.marg.res)
pred.dat[9:14,1] <- rep(c("a", "b"),3)
pred.dat[9:14,3] <- rep(c("No terms marginalised", "All random eff marginalised", "Obs-level marg."),each=2)
pred.dat$model <- c(rep("data",2), rep("glmer",6), rep("MCMCglmm",6))
# Let's plot again
ggplot(pred.dat, aes(y = exp(Prediction), x = treat, fill = type)) +
geom_bar(position=position_dodge(), stat="identity") +
ylim(c(0,1.2)) +
ylab("Prediction") +
facet_wrap(~model)
However, what about confidence intervals? For mixed models we have uncertainty at multiple levels, and we have to decide which uncertainty we want to include. Often we only want to incorporate the uncertainty in the parameters, which gives confidence intervals (confidence in the ‘effect’). Adding the other sources of variation gives prediction intervals (possible values if we sample a new population), and sometimes that is of interest as well; a quick sketch of the latter follows.
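As an aside, here is a minimal sketch (my own addition, not part of the original analysis) of what a rough prediction interval for a single new observation in a new population could look like: simulate new population effects, new observation-level effects, and Poisson noise from the fitted variance components (parameter uncertainty is ignored; eta, sd.pop, and sd.obs are my own helper names).

set.seed(2)
eta <- predict(m.glmer, newdata = newdat, re.form = NA) # fixed effects, link scale
sd.pop <- attr(VarCorr(m.glmer)$popid.od, "stddev")     # fitted among-population sd
sd.obs <- attr(VarCorr(m.glmer)$obs, "stddev")          # fitted overdispersion sd
sapply(eta, function(e)
  quantile(rpois(1e5, exp(e + rnorm(1e5, 0, sd.pop) + rnorm(1e5, 0, sd.obs))),
           c(0.025, 0.975)))

These intervals are much wider than the confidence intervals below, because they describe single new counts rather than the mean. Back to confidence intervals: with our glmer model we can get standard errors for the treatment-level predictions very easily (a 95% confidence interval is roughly ±2 SE; a bootstrap approach is better but slower, and here it yields very similar results).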
X <- model.matrix(formula(m.glmer, fixed.only = TRUE)[-2], newdat)
V <- vcov(m.glmer)                      # var-cov matrix of the fixed effects
pred.se <- sqrt(diag(X %*% V %*% t(X))) # SEs of the link-scale predictions
pred.dat$se <- rep(pred.se, 7)
ggplot(subset(pred.dat, model == "glmer"), aes(y = exp(Prediction), x = treat, fill = type)) +
geom_bar(position="dodge", stat="identity") +
geom_errorbar(aes(ymin=exp(Prediction-se), ymax=exp(Prediction+se)),
width=0, position=position_dodge(1)) +
facet_grid(~model) +
ylim(c(0,1.4)) +
ylab("Prediction") + ggtitle("Predictions +- SE")
# We can also get bootstrap estimates of the 95% confidence intervals.
# Here for the fully marginalised means.
FUN <- function(fit) {
  c(exp(fixef(fit)[1] + 0.5*(VarCorr(fit)$obs[1] + VarCorr(fit)$popid.od[1])),
    exp(fixef(fit)[1] + fixef(fit)[2] + 0.5*(VarCorr(fit)$obs[1] + VarCorr(fit)$popid.od[1])))
}
result <- suppressWarnings(bootMer(m.glmer, FUN, nsim = 1000))
ci <- rbind(quantile(result$t[,1], c(0.025,0.975)), quantile(result$t[,2], c(0.025,0.975)) )
ci_easy <- cbind(pred.dat$Prediction - 1.96*pred.dat$se, pred.dat$Prediction + 1.96*pred.dat$se)
data.frame("ci type"= rep(c("easy","boot"),each=2), treat=c("a", "b"), rbind(ci, exp(ci_easy[3:4,])))
## ci.type treat X2.5. X97.5.
## 1 easy a 0.5965702 1.109125
## 2 easy b 0.7574102 1.413629
## 3 boot a 0.6041427 1.126837
## 4 boot b 0.7831465 1.434437
The ‘easy’ intervals simply add ±1.96 standard errors, derived from the parameter uncertainty only, to the marginalised means on the link scale; the bootstrap estimates the uncertainty around the means directly. The two agree closely. Not sure if all of this makes sense, though. It becomes harder with the MCMCglmm model.
# For MCMCglmm
# First I try the predict function
data.frame(treat=c("a","b"), predict(m.mc, newdata=newdat, marginal=~popid.od, posterior="all", type="response", interval = "confidence"))
## treat fit lwr upr
## 1 a 0.8954663 0.5663806 1.283309
## 2 b 1.1532057 0.7442952 1.641415
# However, I think this includes the uncertainty in the variance components
# (population and observation-level overdispersion). That can be a good thing....it
# depends, I guess. These values look too high, though.
# predict(m.mc, newdata=newdat, marginal=NULL, posterior="all", type="response", interval = "confidence")
# Does not work.....again, not sure why.
# Let's do it by hand. This, I think, makes sense and should correspond to the function above.
a.marg <- HPDinterval(m.mc$Sol[,1] + 0.5*rowSums(m.mc$VCV)) # treat a
b.marg <- HPDinterval(as.mcmc(rowSums(m.mc$Sol[,1:2])) + 0.5*rowSums(m.mc$VCV)) # treat b
res <- exp(rbind(treat_A=a.marg, treat_B=b.marg)); rownames(res) <- c("A", "B")
res
## lower upper
## A 0.5981825 1.337931
## B 0.7701878 1.676680
# The intervals are wide though.....too wide, I think.
# Presumably because the variance components enter draw by draw. Below I instead plug
# in the posterior means of the variances, which ignores their uncertainty.
aint <- HPDinterval(m.mc$Sol[,1] + 0.5*sum(apply(m.mc$VCV,2, mean))) # treat a
bint <- HPDinterval(as.mcmc(rowSums(m.mc$Sol[,1:2])) + 0.5*sum(apply(m.mc$VCV,2, mean))) # treat b
res <- exp(rbind(treat_A=aint, treat_B=bint)); rownames(res) <- c("A", "B")
res
## lower upper
## A 0.5997070 1.292014
## B 0.7819007 1.676013
Either way the numbers are very similar, and in general I get intervals much, much wider than for the glmer model. Actually, it is mainly the upper limit that is much higher; the lower limits from glmer and MCMCglmm match quite well. Something is not right there….
# Let's arrange the CIs and plot them
pred.dat$ci_lo <- pred.dat$Prediction-pred.se*2
pred.dat$ci_up <- pred.dat$Prediction+pred.se*2
pred.dat[1:2, 6:7] <- 0
pred.dat[11:12, 6:7] <- rbind(a.marg, b.marg)
pred.dat[9:10, 6:7] <- rbind(HPDinterval(m.mc$Sol[,1]),
HPDinterval(as.mcmc(rowSums(m.mc$Sol[,1:2]))))
pred.dat[13:14, 6:7] <- rbind(HPDinterval(m.mc$Sol[,1] + 0.5*m.mc$VCV[,2]), # treat a
HPDinterval(as.mcmc(rowSums(m.mc$Sol[,1:2])) + 0.5*m.mc$VCV[,2])) # treat b
ci_easy[-c(3:4),] <- 0 # keep only the fully marginalised glmer rows
pred.dat <- cbind(pred.dat, ci_easy)
colnames(pred.dat)[8:9] <- c("easy_lo", "easy_up")
ggplot(pred.dat, aes(y = exp(Prediction), x = treat, fill = type)) +
geom_bar(position="dodge", stat="identity") +
geom_errorbar(aes(ymin=exp(ci_lo), ymax=exp(ci_up)),
width=0, position=position_dodge(0.9)) +
facet_grid(~model) +
ylim(c(0,1.75)) +
ylab("Prediction") +
geom_errorbar(aes(ymin=exp(easy_lo), ymax=exp(easy_up)),
width=0, position=position_dodge(0.1), colour="red")
# Red error bars: the +-1.96 SE ('easy') CI for the fully marginalised glmer means.
# (The bootstrap CI computed above is nearly identical.)
The MCMCglmm intervals are much wider than the glmer ones, and I'm not sure how to interpret that. I expect them to be wider, but maybe not this much: the 95% bands cover the other treatment's mean even though the effect was ‘significant’ (MCMCglmm P = 0.0325).
After all, maybe showing percent change isn’t a bad option :).
est <- as.data.frame(summary(m.mc)$solutions)
est$vars <- rownames(est)
colnames(est)[2:3] <-c("lo_95", "hi_95")
est[,1:3] <- (exp(est[,1:3])-1) * 100
ggplot(est[2,], aes(x=vars, y=post.mean)) +
geom_hline(yintercept=0, lty=2, lwd=1, colour="grey50") +
geom_errorbar(aes(ymin=lo_95, ymax=hi_95),
lwd=1, colour="red", width=0) +
geom_point(size=4, pch=21, fill="yellow") +
ylim(c(-10, 100)) +
xlab("") +
ylab("Effect (%)") +
theme_bw() +
coord_flip()