Summary

This markdown reproduces the main results presented in the paper
Berlin JA, Longnecker MP, Greenland S. Meta-analysis of epidemiologic dose-response data. Epidemiology. 1993 May 1:218-28 (Berlin, Longnecker, and Greenland 1993).

The authors present methods for summarizing epidemiological studies of dose-response associations. In particular, they discuss methods for estimating the dose-response association from either single or multiple studies; for assigning levels to exposure categories; for investigating the effects of study design and subject characteristics on dose-response curves; and for choosing between a fixed and random-effects model. The topics are illustrated using data from case-control studies of the relation between duration of oral contraceptive use and risk of breast cancer.

Packages and data

The packages used throughout this markdown are dosresmeta, knitr, and the collection of packages in tidyverse.

The dataset are contained in the dosresmeta package: oc_breast and cc_ex.


Fixed Effects Dose-Response Models for a Single Study

The first topic is how to estimate a curve from summarized dose-response from a single study (Orsini et al. 2006). The methodology is illustrated using one study of the relation between the duration of oral contraceptive use and breast cancer risk (Meirik et al. 1986).

The data are part of the oc_breast data that contains the results from multiple studies.

library(dosresmeta)
data("oc_breast")

Reconstructing data presented in Table 1.

meirik <- subset(oc_breast, author == "Meirik")
meirik %>% 
  mutate(`Duration categories (assigned dose)` = c("Never (0)", "0-3 (1.5)", "4-7 (5.5)", "8-11 (9.5)", ">=12 (14.4)"),
         controls = n - cases, 
         ci = paste(lb, ub, sep = "-"),
         ci = replace(ci, is.na(se), NA)) %>%
  select(`Duration categories (assigned dose)`, cases, controls, or, ci) %>%
  kable()
Duration categories (assigned dose) cases controls or ci
Never (0) 96 156 1.0 NA
0-3 (1.5) 156 205 1.1 0.8-1.6
4-7 (5.5) 80 93 1.2 0.8-1.9
8-11 (9.5) 51 50 1.4 0.8-2.3
>=12 (14.4) 39 23 2.2 1.2-4

One of the features of summarized dose-response data is the correlation within set of estimated (log) relative risks. The authors discuss the implication of ignoring the covariance structure.

# Model assuming independece
tab1_ind <- dosresmeta(logor ~ duration, type = type, cases = cases, n = n,
                    se = se, data = meirik, covariance = "indep")
# Model taking into account the covariances
tab1_dep <- dosresmeta(logor ~ duration, type = type, cases = cases, n = n,
                       se = se, data = meirik, covariance = "gl")

data.frame(
  model = c("Weighted least squares", "Weighted least squares with adjustment for covariance"),
  slope = c(coef(tab1_dep), coef(tab1_ind)),
  `standard error` = c(vcov(tab1_dep)^.5, vcov(tab1_ind)^.5)
) %>% kable(digits = 4)
model slope standard.error
Weighted least squares 0.0454 0.0174
Weighted least squares with adjustment for covariance 0.0458 0.0156

The authors compared the results of an increment of 15 years of use.

rbind(
  c("Weighted least squares", predict(tab1_ind, delta = 15, expo = T)),
  c("Weighted least squares with adjustment for covariance", predict(tab1_dep, delta = 15, expo = T))
)
                                                             delta pred     ci.lb    ci.ub   
[1,] "Weighted least squares"                                15    1.989114 1.259205 3.142121
[2,] "Weighted least squares with adjustment for covariance" 15    1.976406 1.185243 3.29568 
rbind(
  c("Weighted least squares", predict(tab1_ind, delta = 10, expo = T)),
  c("Weighted least squares with adjustment for covariance", predict(tab1_dep, delta = 10, expo = T)))
                                                             delta pred     ci.lb    ci.ub   
[1,] "Weighted least squares"                                10    1.581635 1.166087 2.14527 
[2,] "Weighted least squares with adjustment for covariance" 10    1.574892 1.119966 2.214607

So the effect of the covariance adjustment in this example was to increase the width of the 95% confidence interval for the 10-year ln relative risk from 0.61 to 0.682 (12%).

Meta-analysis of Dose-Response Studies with Fixed Effects Model

When several studies investigated a common association, meta-analysis can be adopted for combining the study-specific results. The methodology is illustrated using the whole dataset.

oc_breast %>% slice(1:10) %>% kable()
id author year type duration cases n or lb ub logor se menopause period
1 Paffenbarger 1977 cc 0.00 230 718 1.00 NA NA 0.0000000 NA 1 1972
1 Paffenbarger 1977 cc 0.50 81 231 1.16 0.82 1.64 0.1484200 0.1768265 1 1972
1 Paffenbarger 1977 cc 1.50 23 68 0.91 0.50 1.67 -0.0943107 0.3076513 1 1972
1 Paffenbarger 1977 cc 3.00 49 102 1.76 1.07 2.94 0.5653138 0.2578494 1 1972
1 Paffenbarger 1977 cc 5.00 30 103 0.85 0.50 1.43 -0.1625189 0.2680717 1 1972
1 Paffenbarger 1977 cc 7.20 39 102 1.28 0.79 2.07 0.2468601 0.2457369 1 1972
2 Sartwell 1977 cc 0.00 262 595 1.00 NA NA 0.0000000 NA 1 1972
2 Sartwell 1977 cc 0.25 4 10 1.00 0.28 3.58 0.0000000 0.6500957 1 1972
2 Sartwell 1977 cc 0.75 3 8 0.92 0.22 3.89 -0.0833816 0.7328035 1 1972
2 Sartwell 1977 cc 1.50 2 5 0.82 0.14 4.94 -0.1984509 0.9090673 1 1972

The authors considered different models:
- model 2a and 2f: \(\log \textrm{RR} = \beta_1 \times \textrm{duration}\)
- model 2b and 2g: \(\log \textrm{RR} = \beta_1 \times \textrm{duration} + \beta_2 \times \textrm{duration}^2\)
- model 2d and 2if: \(\log \textrm{RR} = \beta_1 \times \sqrt{\textrm{duration}}\)

both ignoring and taking into accounts the covariances.

## Different dose-response models
mod2a <- dosresmeta(logor ~ duration, id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, covariance = "indep", method = "fixed")
mod2b <- dosresmeta(logor ~ duration + I(duration^2), id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, covariance = "indep", method = "fixed")
mod2d <- dosresmeta(logor ~I(sqrt(duration)), id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, covariance = "indep", method = "fixed")
mod2f <- dosresmeta(logor ~ duration, id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, covariance = "gl", method = "fixed")
mod2g <- dosresmeta(logor ~ duration + I(duration^2), id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, covariance = "gl", method = "fixed")
mod2i <- dosresmeta(logor ~I(sqrt(duration)), id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, covariance = "gl", method = "fixed")

## Auxiliary function
tab2f <- function(mod){
  tab <- cbind(t(t(coef(mod))), diag(mod$vcov)^.5, abs(t(t(coef(mod)))/diag(mod$vcov)^.5))
  colnames(tab) <- c("coef", "se", "z")
  tab
}
data.frame(model = c("2a.", "2b.", "", "2d.", "2f.", "2g.", "", "2i.")) %>% 
  cbind(do.call("rbind", lapply(list(mod2a, mod2b, mod2d, mod2f, mod2g, mod2i), tab2f))) %>%
  kable(digits = c(0, 4, 4, 2))
Warning in data.row.names(row.names, rowsi, i): some row.names duplicated: 4,5,6,7,8 --> row.names NOT used
model coef se z
2a. -0.0102 0.0026 3.87
2b. -0.0272 0.0067 4.08
0.0020 0.0007 2.79
2d. -0.0298 0.0068 4.36
2f. -0.0125 0.0029 4.36
2g. -0.0328 0.0066 4.98
0.0023 0.0007 3.42
2i. -0.0411 0.0080 5.12

Calculating and comparing the pooled relative risks for selected duration values.

data.frame(duration = c(0, 5, 16)) %>%
  cbind(
    quadratic = predict(mod2g, ., expo = T)[, -c(1:2)],
    square_root = predict(mod2i, ., expo = T)[, -1]
    ) %>%
  kable(digits = 2)
duration quadratic.pred quadratic.ci.lb quadratic.ci.ub square_root.pred square_root.ci.lb square_root.ci.ub
0 1.00 1.00 1.00 1.00 1.00 1.00
5 0.90 0.86 0.93 0.91 0.88 0.94
16 1.05 0.89 1.25 0.85 0.80 0.90

Exposure Scale Issues

The authors compared different approaches (means, medians, and midpoints) to assign doses to exposure categories. They use data from a case-control study on the relation between alcohol and breast cancer to make the point.

data("cc_ex")
tab3 <- cc_ex %>%
  mutate(
    mean = c(0, 1.5, 5.1, 21.3),
    median = c(0, 1.7, 5.1, 16.3),
    mid = c(0, 1.9, 6.0, 10.8),
    ci = paste(round(lb, 1), round(ub, 1), sep = "-"),
    var = ((log(ub)-log(lb))/(2*qnorm(.975)))^2
  ) 
tab3 %>% 
  select(gday, mean, median, mid, adjrr, ci, logrr, var) %>%
  kable(digits = c(0, 1, 1, 1, 1, 0, 1, 2))
gday mean median mid adjrr ci logrr var
Ref. 0.0 0.0 0.0 1.0 1-1 0.0 0.00
<2.5 1.5 1.7 1.9 0.8 0.5-1.3 -0.2 0.05
2.5-9.3 5.1 5.1 6.0 1.2 0.7-1.9 0.1 0.06
>9.3 21.3 16.3 10.8 1.6 1-2.5 0.5 0.06

Reproducing the results in Table 3

mod_mean <- dosresmeta(formula = logrr ~ mean, type = "cc", cases = case,
                       n = n, lb = lb, ub = ub, data= tab3)
mod_median <- dosresmeta(formula = logrr ~ median, type = "cc", cases = case,
                       n = n, lb = lb, ub = ub, data= tab3)
mod_mid <- dosresmeta(formula = logrr ~ mid, type = "cc", cases = case,
                         n = n, lb = lb, ub = ub, data= tab3)
do.call("rbind", lapply(list(mod_mean, mod_median, mod_mid), tab2f)) %>%
  kable()
coef se z
mean 0.0244134 0.0106534 2.291593
median 0.0321280 0.0140842 2.281140
mid 0.0462183 0.0209631 2.204740

Identifying Sources of Heterogeneity with Fixed Effects Models

The authors considered the case of meta-regression, where covariates at the study level may explain the heterogeneity across study findings. In the practical example based on the oc_breast data, the two potential sources of heterogeneity were whether or not a study included postmenopausal women, and the final year of case accrual, thought to be a surrogate for the changing formulations of oral contraceptives over time.

The different estimated models were:
- model 4a: \(\log \textrm{RR} = \beta_1 \times \sqrt{\textrm{duration}}\)
- model 4b: \(\log \textrm{RR} = \beta_1 \times \sqrt{\textrm{duration}} + \beta_2 \times \textrm{menopause}\)
- model 4c: \(\log \textrm{RR} = \beta_1 \times \sqrt{\textrm{duration}} + \beta_2 \times \textrm{(period - 1986)}\)
- model 4d: \(\log \textrm{RR} = \beta_1 \times \sqrt{\textrm{duration}} + \beta_2 \times \textrm{menopause} + \beta_3 \times \textrm{(period - 1986)}\)
- model 4e: \(\log \textrm{RR} = \beta_1 \times \sqrt{\textrm{duration}} + \beta_2 \times \textrm{menopause} + \beta_3 \times \textrm{(period - 1986)} + \beta_4 \times \textrm{menopause} \times \textrm{(period - 1986)}\)

mod4a <- dosresmeta(logor ~ I(sqrt(duration)), id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, method = "fixed")
mod4b <- dosresmeta(logor ~ I(sqrt(duration)), id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, method = "fixed",
                    mod = ~ menopause)
mod4c <- dosresmeta(logor ~ I(sqrt(duration)), id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, method = "fixed",
                    mod = ~ I(period - 1986))
mod4d <- dosresmeta(logor ~ I(sqrt(duration)), id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, method = "fixed",
                    mod = ~ menopause + I(period - 1986))
mod4e <- dosresmeta(logor ~ I(sqrt(duration)), id = id, type = type, cases = cases, n = n,
                    se = se, data = oc_breast, method = "fixed",
                    mod = ~ menopause*I(period - 1986))

tab4f <- function(mod){
  gof <- dosresmeta:::gof(mod)
  tab <- cbind(t(t(coef(mod))), diag(mod$vcov)^.5, gof$deviance$D, gof$deviance$df)
  colnames(tab) <- c("coef", "se", "deviance", "df")
  round(tab, 4)
}
data.frame(model = c("4a.", "4b.", "", "4c.", "", "4d.", "", "", 
                     "4e.", "", "", "")) %>% 
  cbind(do.call("rbind", lapply(list(mod4a, mod4b, mod4c, mod4d, mod4e), tab4f))) %>%
  mutate(
    deviance = replace(deviance, model == "", NA),
    df = replace(df, model == "", NA)
    ) %>% kable(digits = c(0, 4, 4, 0, 0))
Warning in data.row.names(row.names, rowsi, i): some row.names duplicated: 2,4,6,7,8,9,10,11 --> row.names NOT used
model coef se deviance df
4a. -0.0411 0.0080 218 88
4b. 0.1175 0.0251 174 87
-0.1766 0.0265 NA NA
4c. -0.0929 0.0112 174 87
-0.0170 0.0025 NA NA
4d. 0.0480 0.0281 144 86
-0.1473 0.0270 NA NA
-0.0142 0.0026 NA NA
4e. 0.1718 0.0383 121 85
-0.2883 0.0401 NA NA
0.0111 0.0059 NA NA
-0.0313 0.0066 NA NA

Meta-analysis of Dose-Response Studies with Random Effects Models

The final topic covered is the choice between a fixed- and a random-effects model.
Reproducing the results in table 5.

mod5_fixed <- dosresmeta(logor ~ I(sqrt(duration)), id = id, type = type, cases = cases, n = n,
                     se = se, data = oc_breast, method = "fixed",
                     mod = ~ menopause*I(period - 1986))
mod5_mm <- dosresmeta(logor ~ I(sqrt(duration)), id = id, type = type, cases = cases, n = n,
                         se = se, data = oc_breast, method = "mm",
                         mod = ~ menopause*I(period - 1986))

data.frame(coef_mm = coef(mod5_mm), se_mm = diag(vcov(mod5_mm))^.5,
           coef_fixed = coef(mod5_fixed), se_fixed = diag(vcov(mod5_fixed))^.5) %>%
  .[c(2:4, 1),] %>%
  round(4)
                           coef_mm  se_mm coef_fixed se_fixed
menopause                  -0.2468 0.0502    -0.2883   0.0401
I(period - 1986)            0.0111 0.0070     0.0111   0.0059
menopause:I(period - 1986) -0.0245 0.0083    -0.0313   0.0066
(Intercept)                 0.1720 0.0421     0.1718   0.0383

Appendix Table A1

oc_breast %>%
  group_by(id) %>%
  filter(row_number()==1) %>%
  select(author) %>%
  cbind(coef = mod4a$bi, se = unlist(mod4a$Si)^.5) %>%
  as.data.frame() %>%
  kable(digits = c(0, 0, 5, 5))
id author coef se
1 Paffenbarger 0.076091876 0.07208
2 Sartwell -0.090724132 0.16455
3 Lees 0.087833324 0.08967
4 Ravnihar -0.062128495 0.18727
5 Paffenbarger 0.126795605 0.05546
6 Kelsey -0.225524649 0.14299
7 Harris -0.059723518 0.09848
8 Brinton 0.062919337 0.04149
9 Janerich 0.063684420 0.04054
10 Pike 0.186057120 0.13185
11 Vessey 0.024426839 0.04597
12 Rosenberg 0.015408363 0.03907
13 Miller 0.071203657 0.10590
14 Meirik 0.148012840 0.06204
15 Ellery -0.042445450 0.11687
16 La Vecchia 0.034509702 0.11228
17 Paul -0.111716697 0.01090
18 Lee 0.070644893 0.08961
19 CASH 0.007116616 0.02046
20 McPherson YW 0.091471024 0.06911
21 McPherson OW -0.040747827 0.05640
22 Miller 0.287003221 0.08290
23 Jick -0.013089555 0.14784
24 Olsson 0.185356837 0.07595

Information about R session:

R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bindrcpp_0.2     dosresmeta_2.0.0 mvmeta_0.4.7     dplyr_0.7.2      purrr_0.2.2.2    readr_1.1.1     
 [7] tidyr_0.6.3      tibble_1.3.3     ggplot2_2.2.1    tidyverse_1.1.1  knitr_1.16      

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12     highr_0.6        cellranger_1.1.0 compiler_3.4.1   plyr_1.8.4       bindr_0.1       
 [7] forcats_0.2.0    tools_3.4.1      digest_0.6.12    lubridate_1.6.0  jsonlite_1.5     evaluate_0.10.1 
[13] nlme_3.1-131     gtable_0.2.0     lattice_0.20-35  pkgconfig_2.0.1  rlang_0.1.1      psych_1.7.5     
[19] yaml_2.1.14      parallel_3.4.1   haven_1.1.0      xml2_1.1.1       httr_1.2.1       stringr_1.2.0   
[25] hms_0.3          rprojroot_1.2    grid_3.4.1       glue_1.1.1       R6_2.2.2         readxl_1.0.0    
[31] foreign_0.8-69   rmarkdown_1.6    modelr_0.1.1     reshape2_1.4.2   magrittr_1.5     backports_1.1.0 
[37] scales_0.4.1     htmltools_0.3.6  rvest_0.3.2      assertthat_0.2.0 mnormt_1.5-5     colorspace_1.3-2
[43] stringi_1.1.5    lazyeval_0.2.0   munsell_0.4.3    broom_0.4.2     

References

Berlin, Jesse A, Matthew P Longnecker, and Sander Greenland. 1993. “Meta-Analysis of Epidemiologic Dose-Response Data.” Epidemiology. JSTOR, 218–28.

Meirik, Olav, Hans-Olov Adami, Thoralf Christoffersen, Eiliv Lund, Reinhold Bergström, and Per Bergsjö. 1986. “Oral Contraceptive Use and Breast Cancer in Young Women: A Joint National Case-Control Study in Sweden and Norway.” The Lancet 328 (8508). Elsevier: 650–54.

Orsini, Nicola, Rino Bellocco, Sander Greenland, and others. 2006. “Generalized Least Squares for Trend Estimation of Summarized Dose-Response Data.” Stata Journal 6 (1). Stata press 4905 LAKEWAY PARKWAY, COLLEGE STATION, TX 77845 USA: 40.