Meta Analysis in R

Here is an example of using the metafor package to conduct meta analyses in R. First, we create an artificial data set containing means, sd’s, and n’s from both treatment and control groups across 15 studies. We set each of these values equal to their corresponding values in the escalc function as displayed below. We set append to TRUE to stack the 15 values for each of the parameters into six variables. The data (i.e. dat in the example below) also contains two possible moderators cat, which is a generic categorical variable with a 1 and 0, and a contin variable which is a generic continuous indicator. Finally, the escalc package produces two important values yi and vi. yi is the differenced effect for each study divided by the pooled standard deviation (i.e. Cohen’s D), which we indicated by selecting the “SMD” measure and vi is the variance associated with that effect size.

library(metafor)
set.seed(12345)
datExample = cbind(mean1 = rnorm(15), mean2 = rnorm(15), sd1 = abs(rnorm(15)), sd2 = abs(rnorm(15)), n1 = abs(rnorm(15)*20), n2 = abs(rnorm(15)*20), cat = c(rep(1, 7), rep(0,8)), contin = rnorm(15))

dat = escalc(measure = "SMD", m1i = mean1, m2i = mean2, sd1i = sd1,  sd2i = sd2, n1i = n1, n2i = n2, data = datExample, append = TRUE)
dat

##         mean1      mean2        sd1        sd2        n1        n2 cat
## 1   0.5855288  0.8168998 0.81187318 1.46072940  2.991840 18.852017   1
## 2   0.7094660 -0.8863575 2.19683355 1.41309878 26.850630 16.525166   1
## 3  -0.1093033 -0.3315776 2.04919034 0.56740325 11.066062 16.230810   1
## 4  -0.4534972  1.1207127 1.63244564 0.58318765 31.799257  9.524966   1
## 5   0.6058875  0.2987237 0.25427119 1.30679883 11.737592 20.425168   1
## 6  -1.8179560  0.7796219 0.49118828 0.54038607 36.647546 12.907661   1
## 7   0.6300986  1.4557851 0.32408658 1.94769266 17.762789 20.862871   1
## 8  -0.2761841 -0.6443284 1.66205024 0.05359027 31.869769  6.087382   0
## 9  -0.2841597 -1.5531374 1.76773385 0.35166284 10.337093 49.542218   0
## 10 -0.9193220 -1.5977095 0.02580105 0.67097654 25.913434 19.424413   0
## 11 -0.1162478  1.8050975 1.12851083 0.27795369  1.092312 37.341984   0
## 12  1.8173120 -0.4816474 2.38035806 0.69117127 15.692987 13.440849   0
## 13  0.3706279  0.6203798 1.06026555 0.82379533 20.987056  6.159068   0
## 14  0.5202165  0.6121235 0.93714054 2.14506502 46.610239 10.730474   0
## 15 -0.7505320 -0.1623110 0.85445172 2.34694398 28.054108 16.497401   0
##        contin      yi     vi
## 1  -0.9639015 -0.1579 0.3879
## 2  -0.8550825  0.8075 0.1053
## 3   1.8869469  0.1579 0.1524
## 4  -0.3918194 -1.0503 0.1498
## 5  -0.9806329  0.2826 0.1354
## 6   0.6873321 -5.0726 0.3644
## 7  -0.5050435 -0.5573 0.1082
## 8   2.1577198  0.2340 0.1964
## 9  -0.5997976  1.6065 0.1385
## 10 -0.6945467  1.5221 0.1156
## 11  0.2239254 -6.6400 1.5158
## 12 -1.1562233  1.2326 0.1642
## 13  0.4224185 -0.2383 0.2111
## 14 -1.3247553 -0.0732 0.1147
## 15  0.1410843 -0.3676 0.0978

Now we can combine the effect sizes and the estimate overall impact and if that impact is significantly different from zero. The first model is a fixed effect, which assumes that the intercepts (i.e. starting values) are the same across studies. This assumption seems unlikely, since most studies are conducted in different conditions. However, the metafor package provides a test of heterogeneity to evaluate whether a random effects model is a better fit. In the fixed effect analysis, the overall effect size is .0818.

rma(yi, vi, data = dat, method = "FE")

## 
## Fixed-Effects Model (k = 15)
## 
## Test for Heterogeneity: 
## Q(df = 14) = 166.2080, p-val < .0001
## 
## Model Results:
## 
## estimate       se     zval     pval    ci.lb    ci.ub          
##   0.0818   0.1020   0.8020   0.4226  -0.1181   0.2818          
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next, we have the random effects model, which is almost identical, except that we drop the method = “FE” going to the default of random effects. The random effects model includes several heterogeneity estimators. The I^2 evaluates the percentage of variability in the effect sizes that is accounted for by the variability between the estimates. The H^2 statistic evaluates the total variability divided by the within-study variability. Values closer to one indicate that most of the variability is due to within study variation and values above one indicate larger portions of the variability are not due to within study variation. All other results are interpreted the same as the fixed effect model.

rma(yi, vi, data = dat)

## 
## Random-Effects Model (k = 15; tau^2 estimator: REML)
## 
## tau^2 (estimated amount of total heterogeneity): 3.8872 (SE = 1.5575)
## tau (square root of estimated tau^2 value):      1.9716
## I^2 (total heterogeneity / total variability):   96.09%
## H^2 (total variability / sampling variability):  25.58
## 
## Test for Heterogeneity: 
## Q(df = 14) = 166.2080, p-val < .0001
## 
## Model Results:
## 
## estimate       se     zval     pval    ci.lb    ci.ub          
##  -0.4334   0.5246  -0.8262   0.4087  -1.4616   0.5948          
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Finally, we have a model with two moderators a binary categorical variable (cat) and a continuous variable (contin). Although the categorical variable is already properly coded, I used the factor(cat) to demonstrate that instead of creating dummy codes for a categorical variable with more than two categories, researchers can use the factor() function to create n-1 dummy variable automatically. Then we include the continuous variable, contin, set the data, and select the knha = TRUE, because that is the appropriate model when including both categorical and continuous variable in a random effects model. With this example, there is an omnibus test of the included moderator’s effects. Then there are individual estimates and test for each of the included moderators. For example, the effect size after accounting for the cat variable is -.6509.

rma(yi,vi, mods = ~ factor(cat) + contin, data = dat, knha = TRUE)

## 
## Mixed-Effects Model (k = 15; tau^2 estimator: REML)
## 
## tau^2 (estimated amount of residual heterogeneity):     4.1248 (SE = 1.7825)
## tau (square root of estimated tau^2 value):             2.0310
## I^2 (residual heterogeneity / unaccounted variability): 96.25%
## H^2 (unaccounted variability / sampling variability):   26.66
## R^2 (amount of heterogeneity accounted for):            0.00%
## 
## Test for Residual Heterogeneity: 
## QE(df = 12) = 140.1416, p-val < .0001
## 
## Test of Moderators (coefficient(s) 2,3): 
## F(df1 = 2, df2 = 12) = 0.6067, p-val = 0.5610
## 
## Model Results:
## 
##               estimate      se     tval    pval    ci.lb   ci.ub   
## intrcpt        -0.2066  0.7927  -0.2606  0.7988  -1.9337  1.5205   
## factor(cat)1   -0.6509  1.1506  -0.5657  0.5820  -3.1579  1.8561   
## contin         -0.5348  0.5589  -0.9568  0.3575  -1.7526  0.6830   
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1