Modified Poisson to avoid non-collapsibility of odds ratio

References

https://rpubs.com/Hiro_macchan/53793

Odds ratio has non-collapsibility issue

Including (model 1) or not including (model 4) a non-confounding variable X3 associated with the outcome matters in the odds ratio of the variable of interest (Y axis value). The OR conditioning on X3 and the OR not conditioning on X3 are two different effects that do not agree when the effect measure is OR.

boot <- 100
beta1 <- 2
beta2 <- -2
beta3 <- 1.5
beta1_beta2 <- 1
res_mat <- matrix(,boot,4)
res_mat_2 <- matrix(,boot,2)
res_mat_3 <- matrix(,boot,3)
res_mat_4 <- matrix(,boot,3)
mf_mat <- matrix(,boot,4)
mf_mat_2 <- matrix(,boot,2)
mf_mat_3 <- matrix(,boot,3)
mf_mat_4 <- matrix(,boot,3)

for(i in 1:boot){
  n <- 10000
  x1 <- rnorm(n,mean = 0,sd=10)
  e2 <- rlogis(n,0,1)
  x2 <- beta1_beta2*x1 + e2
  x3 <- rnorm(n,mean = 0,sd=10)
  e <- rlogis(n,0,1)
  z <- beta1*x1+beta2*x2+ beta3*x3 +e
  y_pred <- 1/(1+exp(-z))
  y <- ifelse(y_pred<0.5,0,1)
  
  res <- glm(y~x1+x2+x3, family = binomial(logit))
  res_mat[i,] <- res$coefficients
  mf_mat[i,] <- coef(res) * mean((dlogis(predict(res,type = "link"))))
  
  res_2 <- glm(y~x1, family = binomial(logit))
  res_mat_2[i,] <- res_2$coefficients
  mf_mat_2[i,] <- coef(res_2) * mean((dlogis(predict(res_2,type = "link"))))
  
  res_3 <- glm(y~x1+x3, family = binomial(logit))
  res_mat_3[i,] <- res_3$coefficients
  mf_mat_3[i,] <- coef(res_3) * mean((dlogis(predict(res_3,type = "link"))))

  
  res_4 <- glm(y~x1+x2, family = binomial(logit))
  res_mat_4[i,] <- res_4$coefficients
  mf_mat_4[i,] <- coef(res_4) * mean((dlogis(predict(res_4,type = "link"))))
}
treat_coef <- cbind(res_mat[,2],res_mat_2[,2],res_mat_3[,2],res_mat_4[,2])
colnames(treat_coef) <- c("model1","model2","model3","model4")
boxplot(treat_coef, xlab = "model", ylab = "treat_coef")

Risk ratio does not have non-collapsibility issue

Including (model 1) or not including (model 4) a non-confounding variable associated with the outcome does not matter in the risk ratio of the variable of interest (Y axis value).

The RR conditioning on X3 and the RR not conditioning on X3 are two different effects that DO agree when the effect measure is RR.

Here Poisson regression is fit. The point estimate should be unbiased. The variance estimate will be wrong. Robust variance estimator should be used for variance estimation (modified Poisson regression of Zou; http://aje.oxfordjournals.org/content/159/7/702.full)

for(i in 1:boot){
  n <- 10000
  x1 <- rnorm(n,mean = 0,sd=10)
  e2 <- rlogis(n,0,1)
  x2 <- beta1_beta2*x1 + e2
  x3 <- rnorm(n,mean = 0,sd=10)
  e <- rlogis(n,0,1)
  z <- beta1*x1+beta2*x2+ beta3*x3 +e
  y_pred <- 1/(1+exp(-z))
  y <- ifelse(y_pred<0.5,0,1)
  
  res <- glm(y~x1+x2+x3, family = poisson(link = "log"))
  res_mat[i,] <- res$coefficients
  mf_mat[i,] <- coef(res) * mean((dlogis(predict(res,type = "link"))))
  
  res_2 <- glm(y~x1, family = poisson(link = "log"))
  res_mat_2[i,] <- res_2$coefficients
  mf_mat_2[i,] <- coef(res_2) * mean((dlogis(predict(res_2,type = "link"))))
  
  res_3 <- glm(y~x1+x3, family = poisson(link = "log"))
  res_mat_3[i,] <- res_3$coefficients
  mf_mat_3[i,] <- coef(res_3) * mean((dlogis(predict(res_3,type = "link"))))

  
  res_4 <- glm(y~x1+x2, family = poisson(link = "log"))
  res_mat_4[i,] <- res_4$coefficients
  mf_mat_4[i,] <- coef(res_4) * mean((dlogis(predict(res_4,type = "link"))))
}
treat_coef <- cbind(res_mat[,2],res_mat_2[,2],res_mat_3[,2],res_mat_4[,2])
colnames(treat_coef) <- c("model1","model2","model3","model4")
boxplot(treat_coef, xlab = "model", ylab = "treat_coef")

Top Page: http://rpubs.com/kaz_yos/
github: https://github.com/kaz-yos