R Problem Set #3: Systematic Reviews and Meta-Analysis

Note: The dataset that we will be working with in this problem set is for practice only- the data are similar to those published in a meta-analysis on this subject, but are not identical.

You will need to install the metafor package for this assignment. To do so, follow these steps one time:

install.packages("metafor")

And then run the following code each time you use R to load the package into the working memory:

library(metafor)

## Warning: package 'metafor' was built under R version 3.2.5

The research question that we will try to answer is whether there is a significant difference in patient mortality when comparing patients randomized to beta-blockers and patients randomized to placebo. All patients were individuals who had a myocardial infarction. Only randomized trials were used for this meta analysis, and odds ratios (OR) were calculated. An OR less than 1 would indicate that the odds of mortality in those receiving beta-blockers are less than the odds of mortality in the placebo group.

Open SRMA-Dataset1.dta in R.

setwd("C:/your working directory")
require(foreign)
SRMA1<-read.dta("SRMA-Dataset1.dta")

#Examine the data
head(SRMA1)

##                       study year  or oruci     orlci      logor   varlogor
## 1                    Barber 1967 0.9   1.3 0.6230769 -0.1053605 0.03519927
## 2                  Reynolds 1972 1.0   5.0 0.2000000  0.0000000 0.67427385
## 3               Wilhelmsson 1974 0.5   1.2 0.2083333 -0.6931472 0.19951205
## 4                   Ahlmark 1974 0.6   1.8 0.2000000 -0.5108256 0.31417871
## 5 Multicentre International 1975 0.8   1.1 0.5818182 -0.2231435 0.02639858
## 6                     Yusuf 1979 1.0  10.0 0.1000000  0.0000000 1.38012755
##     sdlogor
## 1 0.1876147
## 2 0.8211418
## 3 0.4466677
## 4 0.5605165
## 5 0.1624764
## 6 1.1747884

Variables:

study is the name of the study.
year is the study year.
or is the odds ratio.
oruci is the upper bounds of the 95% confidence interval.
orlci is the lower bounds of the 95% confidence interval.
logor is the log-odds ratio.
varlogor is the variance of the log-odds ratio.
sdlogor is the standard deviation of the log-odds.

Question 1:

Open the SRMA-Dataset1.xls file with Excel. Before running any commands, look at the raw data. Does it look like there is a protective effect of beta-blockers against mortality events? If there appears to be an effect, does it look statistically significant?

Click below to show answer:

The raw data seem to have more ORs with values less than 1, indicating that the use of beta-blockers may have a protective effect against mortality. However, several studies indicate non-significant effects of beta-blockers (95% confidence intervals of the OR include the null value of OR = 1), while other studies indicate a significant effect (95% confidence intervals do not include the null value of OR=1). Without formal statistical methods, it would be difficult to say that there is a significant effect of beta-blockers on mortality in patients with history of myocardial infarction.

Question 2:

Create a Forest Plot of the data using the rma() and forest() commands. The rma command is a function to fit the meta-analytic fixed- and random/mixed-effects models used to calculate the overall effect size of the pooled studies. The rma command requires five arguements: the effect estimate yi, the standard deviation/standard error of that effect estimate vi, the name of the dataframe data, a vector of the study names slab, and the type of model to be fitted method. In this problem set, we will use the logarithm of the odds ratio (for bonus points, try to figure out why we would use the logarithm of the odds ratio, and not just the odds ratio). For the method argument, we will first fit a fixed effect model with method="FE". This assumes that there isn’t heterogeneity between studies, which is a large assumption that we will test later.

### fit random-effects model (use slab argument to define study labels)
fes <- rma(yi=logor , vi=varlogor ,data=SRMA1 ,slab=study, method="FE")
fes

## 
## Fixed-Effects Model (k = 33)
## 
## Test for Heterogeneity: 
## Q(df = 32) = 42.5751, p-val = 0.1002
## 
## Model Results:
## 
## estimate       se     zval     pval    ci.lb    ci.ub          
##  -0.1496   0.0228  -6.5467   <.0001  -0.1943  -0.1048      *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next, we will use the forest() command to plot out the effect sizes and confidence intervals for the studies in the meta-analysis, as well as the pooled effects. The arguments are, first, the name of the model fit from the rma() function (this is the only required argument, the rest is for formatting), at, which marks where to put x-axis marks, atransf is used to exponentiate the log-odds back to odds, xlab is the x-axis label, the mlab is the label for the pooled effect, and psize is the size of the mean effect markers.

forest(fes, at=log(c(.05,.25,1,4,16)), atransf=exp, xlab="Log-odds ratio", mlab="FE Model for All Studies", psize=1)

Click below to show answer to why log-odds are used:

The log of the OR is used because it is more normally distributed than just the OR. An OR is only allowed to be between 0 and infinity, and the null hypothesis is that the OR = 1. Thus, the OR is much closer to 0 than infinity, and will have a skewed distribution if you repeat an experiment randomly many times and create a histogram of all the OR’s. However, if you take the log of the OR, the null hypothesis is now that the log(OR) = 0, the log(OR) can now range between negative and positive infinity, thus giving it a more symmetric and normal distribution. 

One might ask, why do we want the OR to have a normal distribution? The answer lies in the fact that you can calculate a 95% confidence interval much more easily for a symmetric distribution than for an asymmetric distribution. You would usually say that the 95% confidence interval is mean ± 1.96*standard deviation, and you would like the remaining 5% of your confidence interval to lie evenly on the left side and right side of your distribution- if it is normal, this is true. If the distribution is skewed, this is not true.

Question 3:

Do you think there is heterogeneity between studies? Calculate the appropriate summary estimate based on your conclusion about heterogeneity.

Click below to show answer

The Q statistic above (from the "Test for Heterogeneity") has a p-value of 0.1. We typically use a conservative p-value cut-off of 0.2 to assess heterogeneity so we would conclude that there is significant heterogeneity between study results. A random effect summary estimate should therefore be used. Note that the confidence interval around the summary estimate is wider when we use the random effects model.  Use method="REML" to fit a random-effects model.

### fit random-effects model (use slab argument to define study labels)
res <- rma(yi=logor , vi=varlogor ,data=SRMA1 ,slab=study, method="REML")
res

## 
## Random-Effects Model (k = 33; tau^2 estimator: REML)
## 
## tau^2 (estimated amount of total heterogeneity): 0.0099 (SE = 0.0110)
## tau (square root of estimated tau^2 value):      0.0996
## I^2 (total heterogeneity / total variability):   23.65%
## H^2 (total variability / sampling variability):  1.31
## 
## Test for Heterogeneity: 
## Q(df = 32) = 42.5751, p-val = 0.1002
## 
## Model Results:
## 
## estimate       se     zval     pval    ci.lb    ci.ub          
##  -0.2162   0.0435  -4.9741   <.0001  -0.3014  -0.1310      *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

forest(res, at=log(c(.05,.25,1,4,16)), atransf=exp, xlab="Log-odds ratio", mlab="RE Model for All Studies", psize=1)

One large threat to the validity of a meta-analysis is the question of whether or not there is publication bias. If only significant findings in a certain direction (protective effects) are published, and non-significant findings or findings in the other direction (the drug is harmful) are not published, then a meta-analysis might report a biased answer. One way to assess if there is publication bias is the use of the funnel plot- if there is publication bias, you would see an asymmetric funnel, and if there is no publication bias, there would be a symmetric funnel plot.

Question 4:

Create a funnel plot in R Use the command funnel() and as before, note that the output of the rma() command is used to create the funnel plot.

A final question that we might want to answer is, “If we had done a meta-analysis every year, in which year would we have discovered that there was a significant summary OR?” Once armed with this answer, it might be possible to argue either for or against use of the intervention starting in that year. Delayed implementation of a helpful drug (as determined by meta-analysis) might be seen as unethical, and continued use of a non-effective or harmful drug (as determined by meta-analysis) might also be seen as unethical.

funnel(res, main="Standard Error")

Question 5:

Determine in which year the meta-analysis summary OR became statistically significant (a significant protective effect was found, p<0.05). The way to do this is to conduct a “cumulative” analysis. In R, use the cumul() command. After fitting the model with the rma() function, a cumulative meta-analysis can be conducted with the cumul() function. The results obtained that way can then be passed to the forest() function, which will draw a cumulative forest plot.

### cumulative meta-analysis (in the order of publication year)
tmp <- cumul(res, order=order(SRMA1$year))
 
### cumulative forest plot
forest(tmp, xlim=c(-2,2), at=log(c( .5, 1, 2)), 
       atransf=exp, digits=c(2,3), cex=1)
 
### switch to bold font
par(cex=.75, font=2)
 
### add column headings to the plot
text(-4, 35, "Author(s) and Year", pos=4)
text( 2, 35, "Odds Ratio [95% CI]", pos=2)

Click below to show answer

The answer seems to be that in 1981 the Hjalmarson study was the study that tipped the summary OR into the significantly protective realm, and from then on the summary ORs in subsequent years stayed significantly protective. Thus, one might argue that beta-antagonists should have been prescribed to patients with a prior heart attack to reduce the odds of mortality.

Systematic Review and Meta-Analysis R-lab

Andrew Mertens

August 21, 2016

R Problem Set #3: Systematic Reviews and Meta-Analysis

Open SRMA-Dataset1.dta in R.

Variables:

Question 1:

Click below to show answer:

Question 2:

Click below to show answer to why log-odds are used:

Question 3:

Click below to show answer

Question 4:

Question 5:

Click below to show answer