(This isn’t particularly fleshed out, but I’ll post it until (if and when) I get around to updating it)
Let’s set up a really simple example: two groups. We’ll fit with a no-intercept model so that we can just think about the issues of back-transforming coefficients, not go through the additional step of computing predictions. (However, the basic issues are exactly the same.)
set.seed(101)
d <- data.frame(f=paste0("f",rep(1:2,each=20)),
y=rpois(40,lambda=rep(c(3,6),each=20)))
Compute means:
## base R:
get_mean <- function(d) with(d,tapply(y,list(f),mean))
get_mean(d)
## f1 f2
## 2.95 6.15
## alternatively, with the 'new hotness':
library("dplyr")
d %>% group_by(f) %>% summarise(m=mean(y))
## Source: local data frame [2 x 2]
##
## f m
## 1 f1 2.95
## 2 f2 6.15
Alternatively:
get_mean_model <- function(d,invlink,...) {
g1 <- glm(y~f-1,data=d,...)
invlink(coef(g1))
}
get_mean_model(d,exp,family="poisson")
## ff1 ff2
## 2.95 6.15
This works perfectly for (simple) GLMs. It would get more complicated in the case of GLMMs, where we would have to worry about
Now let’s suppose we have logit-Normally distributed data:
link <- qlogis ## logit
invlink <- plogis ## logistic
d2 <- data.frame(f=paste0("f",rep(1:2,each=20)),
y=invlink(rnorm(40,mean=rep(c(3,6),each=20),sd=1)))
m1 <- get_mean(d2)
m2 <- get_mean_model(transform(d2,y=link(y)),
family="gaussian",
invlink=invlink)
g1 <- lm(link(y)~f-1,data=d2)
backtransf <- function(lmfit) {
## second derivative
d2fun <- deriv(D(expression(1/(1+exp(-x))),"x"),"x",
function.arg=TRUE)
x <- coef(lmfit)
d2val <- c(attr(d2fun(x),"gradient"))
s <- summary(lmfit)$sigma
return(plogis(x)+d2val*s^2/2)
}
m3 <- backtransf(g1)
The other way to solve this problem is to back-transform simulated data:
backtransf_sim <- function(lmfit,n=100000) {
x <- coef(lmfit)
s <- summary(lmfit)$sigma
sim <- lapply(x,rnorm,n=n,sd=s)
sapply(sim,function(x) mean(plogis(x)))
}
m4 <- backtransf_sim(g1)
cbind(raw=m1,bt=m2,bt_corr=m3,bt_corr_sim=m4)
## raw bt bt_corr bt_corr_sim
## f1 0.9540271 0.9680887 0.9446481 0.9407790
## f2 0.9947408 0.9978494 0.9961176 0.9952491