This file is the continuous outcome practical. It uses an eGFR dataset comparing people with sickle cell disease (SCD) with controls.
Training question: Is eGFR different between people with SCD and controls?
Primary effect measure: Mean difference (MD) because all studies report eGFR on the same scale.
Additional teaching effect measure: Standardized mean difference (SMD) to show what to do when studies measure the same construct on different scales.
Direction of effect: In this dataset, a positive MD means the mean eGFR is higher in the SCD group than in the control group.
Training note: Some studies are simulated for workshop teaching. Use the dataset to learn workflow and interpretation, not to make clinical recommendations.
To keep the training simple and consistent, all three practical files use the same small package set:
- {readxl} to import Excel files
- {dplyr} to clean and inspect data
- {meta} to run the standard meta-analyses
- {knitr} to display clean tables

Teaching note: {metafor} is a powerful advanced package and is excellent for custom analyses. However, for introductory hands-on training, {meta} is more direct because it provides purpose-built functions such as metabin(), metaprop(), and metacont(). We do not use {rmeta} in these final files because it is an older package and is less suitable for a modern, reproducible training workflow.
packages <- c("readxl", "dplyr", "meta", "knitr")
missing <- packages[!sapply(packages, requireNamespace, quietly = TRUE)]
if (length(missing) > 0) install.packages(missing)
library(readxl)
library(dplyr)
library(meta)
library(knitr)
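Teaching note: later chunks call three small helpers, summary_cont(), fmt_num(), and fmt_p(), whose definitions do not appear in this rendering. The sketch below is one possible reconstruction, not the original code. It assumes the helpers only need to produce the columns shown in the summary tables, and it reads standard elements of a {meta} object (k, TE.random, lower.random, upper.random, tau2, I2, pval.Q).

```r
# Hypothetical reconstruction of the helpers used in later chunks.
# Only the column names are taken from the tables in this document.

# Format a number with a fixed number of decimal places (default 3)
fmt_num <- function(x, digits = 3) {
  formatC(x, format = "f", digits = digits)
}

# Format a p-value with three decimals; very small values become "<0.001"
fmt_p <- function(p) {
  ifelse(p < 0.001, "<0.001", formatC(p, format = "f", digits = 3))
}

# Collect the random-effects results of a meta object in a one-row data frame
summary_cont <- function(m, label) {
  data.frame(
    Analysis    = label,
    Studies     = m$k,
    Estimate    = m$TE.random,
    Lower_95_CI = m$lower.random,
    Upper_95_CI = m$upper.random,
    Tau2        = m$tau2,
    I2_percent  = 100 * m$I2,   # meta stores I^2 as a proportion
    Q_p_value   = m$pval.Q
  )
}
```

Because summary_cont() returns an ordinary data frame, several fits can be stacked with bind_rows() and formatted in one mutate() call.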
# Path to the workshop dataset (expected in the same folder as this Rmd)
data_file <- "SCA_Africa_Meta_10studies_final_dataset.xlsx"
if (!file.exists(data_file)) stop("Put the SCA Excel file in the same folder as this Rmd.")
# Read the header row (row 4) plus the 10 study rows
cont_df <- read_excel(data_file, sheet = "1_StudyData", range = "A4:L14", col_names = TRUE)
# Replace the spreadsheet headers with short, code-friendly names
names(cont_df) <- c(
  "No", "FirstAuthorYear", "Country", "StudyDesign", "AgeGroup",
  "nSCD", "nControl", "OverallROBINSE", "MeaneGFRSCD", "SDeGFRSCD",
  "MeaneGFRControl", "SDeGFRControl"
)
# Trim stray whitespace in text columns and make sure the statistics are numeric
cont_df <- cont_df %>%
  mutate(
    FirstAuthorYear = trimws(FirstAuthorYear),
    AgeGroup = trimws(AgeGroup),
    OverallROBINSE = trimws(OverallROBINSE),
    nSCD = as.numeric(nSCD),
    nControl = as.numeric(nControl),
    MeaneGFRSCD = as.numeric(MeaneGFRSCD),
    SDeGFRSCD = as.numeric(SDeGFRSCD),
    MeaneGFRControl = as.numeric(MeaneGFRControl),
    SDeGFRControl = as.numeric(SDeGFRControl)
  )
# Show the columns used in the analysis
cont_df %>%
  select(FirstAuthorYear, Country, AgeGroup, nSCD, nControl,
         MeaneGFRSCD, SDeGFRSCD, MeaneGFRControl, SDeGFRControl, OverallROBINSE) %>%
  kable(caption = "eGFR dataset: SCD versus control")
| FirstAuthorYear | Country | AgeGroup | nSCD | nControl | MeaneGFRSCD | SDeGFRSCD | MeaneGFRControl | SDeGFRControl | OverallROBINSE |
|---|---|---|---|---|---|---|---|---|---|
| Ndour et al., 2022 * | Senegal | Mixed (4–57 y) | 163 | 177 | 138.4 | 41.2 | 112.6 | 27.4 | Low |
| Olawale et al., 2021 * | Nigeria | Paediatric (5–15 y) | 150 | 50 | 152.3 | 38.6 | 131.5 | 24.7 | Moderate |
| Suliman et al., 2020 * | Sudan | Adults (≥18 y) | 32 | 23 | 118.6 | 30.8 | 98.2 | 20.3 | Serious |
| Bolarinwa et al., 2012 (sim) | Nigeria | Adults | 72 | 50 | 104.8 | 28.4 | 96.4 | 21.6 | Moderate |
| Ephraim et al., 2015 (sim) | Ghana | Mixed | 138 | 80 | 126.8 | 35.4 | 101.2 | 24.8 | Moderate |
| Aloni et al., 2013 (sim) | DR Congo | Paediatric (2–15 y) | 68 | 68 | 158.4 | 44.6 | 120.6 | 26.1 | Low |
| Nwogoh et al., 2018 (sim) | Nigeria | Adults | 60 | 55 | 109.4 | 27.8 | 94.8 | 20.6 | Moderate |
| Ngo Sock et al., 2021 (sim) | Cameroon | Adults | 45 | 40 | 112.4 | 27.8 | 94.6 | 21.3 | Moderate |
| Fakunle et al., 2018 (sim) | Nigeria | Paediatric (3–16 y) | 80 | 80 | 144.1 | 40.8 | 119.4 | 25.2 | Moderate |
| Diaw et al., 2019 (sim) | Senegal | Adults | 55 | 50 | 116.2 | 29.4 | 97.8 | 20.6 | Moderate |
# Count rows with missing or impossible values before running the meta-analysis
cont_df %>%
  summarise(
    studies = n(),
    missing_means = sum(is.na(MeaneGFRSCD) | is.na(MeaneGFRControl)),
    missing_sds = sum(is.na(SDeGFRSCD) | is.na(SDeGFRControl)),
    nonpositive_sample_size = sum(nSCD <= 0 | nControl <= 0, na.rm = TRUE),
    negative_sd = sum(SDeGFRSCD < 0 | SDeGFRControl < 0, na.rm = TRUE)
  ) %>%
  kable(caption = "Basic checks for continuous outcome data")
| studies | missing_means | missing_sds | nonpositive_sample_size | negative_sd |
|---|---|---|---|---|
| 10 | 0 | 0 | 0 | 0 |
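m_md <- metacont(
  n.e = nSCD,                  # "e" arguments: SCD group
  mean.e = MeaneGFRSCD,
  sd.e = SDeGFRSCD,
  n.c = nControl,              # "c" arguments: control group
  mean.c = MeaneGFRControl,
  sd.c = SDeGFRControl,
  studlab = FirstAuthorYear,
  data = cont_df,
  sm = "MD",                   # mean difference (SCD minus control)
  common = FALSE,              # report the random-effects model only
  random = TRUE,
  method.tau = "REML",         # between-study variance estimator
  method.random.ci = "HK"      # Hartung-Knapp confidence interval
)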
summary(m_md)
## MD 95%-CI %W(random)
## Ndour et al., 2022 * 25.8000 [18.2968; 33.3032] 12.2
## Olawale et al., 2021 * 20.8000 [11.5788; 30.0212] 10.5
## Suliman et al., 2020 * 20.4000 [ 6.8831; 33.9169] 7.1
## Bolarinwa et al., 2012 (sim) 8.4000 [-0.4813; 17.2813] 10.8
## Ephraim et al., 2015 (sim) 25.6000 [17.5740; 33.6260] 11.7
## Aloni et al., 2013 (sim) 37.8000 [25.5177; 50.0823] 8.0
## Nwogoh et al., 2018 (sim) 14.6000 [ 5.7051; 23.4949] 10.8
## Ngo Sock et al., 2021 (sim) 17.8000 [ 7.3336; 28.2664] 9.4
## Fakunle et al., 2018 (sim) 24.7000 [14.1916; 35.2084] 9.4
## Diaw et al., 2019 (sim) 18.4000 [ 8.7577; 28.0423] 10.1
##
## Number of studies: k = 10
## Number of observations: o = 1536 (o.e = 863, o.c = 673)
##
## MD 95%-CI t p-value
## Random effects model 21.1223 [15.6679; 26.5767] 8.76 < 0.0001
##
## Quantifying heterogeneity (with 95%-CIs):
## tau^2 = 31.3808 [2.3198; 176.3573]; tau = 5.6019 [1.5231; 13.2800]
## I^2 = 56.9% [12.7%; 78.7%]; H = 1.52 [1.07; 2.17]
##
## Test of heterogeneity:
## Q d.f. p-value
## 20.87 9 0.0133
##
## Details of meta-analysis methods:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Calculation of I^2 based on Q
## - Hartung-Knapp adjustment for random effects model (df = 9)
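Teaching note: the first row of the output above can be reproduced by hand, which helps to show what metacont() does with each study before pooling. This is a minimal sketch for one study (Ndour et al., 2022), using the summary statistics from the data table. The variable names are illustrative only.

```r
# Summary statistics for Ndour et al., 2022, copied from the data table
n_scd  <- 163; mean_scd  <- 138.4; sd_scd  <- 41.2
n_ctrl <- 177; mean_ctrl <- 112.6; sd_ctrl <- 27.4

# Mean difference: SCD minus control
md <- mean_scd - mean_ctrl

# Standard error of the MD (group variances are not pooled)
se_md <- sqrt(sd_scd^2 / n_scd + sd_ctrl^2 / n_ctrl)

# 95% confidence interval based on the normal distribution
ci <- md + c(-1, 1) * qnorm(0.975) * se_md

round(c(MD = md, lower = ci[1], upper = ci[2]), 4)
```

The result should agree with the printed row: 25.8000 [18.2968; 33.3032].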
summary_cont(m_md, "Mean difference: eGFR SCD vs control") %>%
  mutate(
    Estimate = fmt_num(Estimate),
    Lower_95_CI = fmt_num(Lower_95_CI),
    Upper_95_CI = fmt_num(Upper_95_CI),
    Tau2 = fmt_num(Tau2, 4),
    I2_percent = fmt_num(I2_percent, 1),
    Q_p_value = fmt_p(Q_p_value)
  ) %>%
  kable(caption = "Pooled mean difference using random-effects REML and Hartung-Knapp CI")
| Analysis | Studies | Estimate | Lower_95_CI | Upper_95_CI | Tau2 | I2_percent | Q_p_value |
|---|---|---|---|---|---|---|---|
| Mean difference: eGFR SCD vs control | 10 | 21.122 | 15.668 | 26.577 | 31.3808 | 56.9 | 0.013 |
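Teaching note: the I2_percent value in the table is calculated from Cochran's Q (the output lists "Calculation of I^2 based on Q"). A short sketch using the rounded Q from the printed output, so the last digit may differ slightly from what {meta} reports:

```r
Q  <- 20.87   # Cochran's Q from the heterogeneity test above
df <- 9       # degrees of freedom: number of studies minus 1

# Share of the total variability that is not explained by sampling error
I2 <- max(0, (Q - df) / Q) * 100

# H is the square root of Q divided by its degrees of freedom
H <- sqrt(Q / df)

round(c(I2_percent = I2, H = H), 2)
```

Both values should round to the figures in the output (I^2 = 56.9%, H = 1.52).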
forest(
  m_md,
  prediction = TRUE,
  print.tau2 = TRUE,
  print.I2 = TRUE,
  leftcols = c("studlab", "n.e", "mean.e", "sd.e", "n.c", "mean.c", "sd.c"),
  leftlabs = c("Study", "SCD n", "SCD mean", "SCD SD", "Control n", "Control mean", "Control SD"),
  rightcols = c("effect", "ci", "w.random"),
  rightlabs = c("MD", "95% CI", "Weight"),
  xlab = "Mean difference in eGFR"
)
This section is included for teaching. Use SMD when studies measure the same construct using different scales. Here, MD remains the preferred primary analysis because all studies report eGFR on the same scale.
m_smd <- metacont(
  n.e = nSCD,
  mean.e = MeaneGFRSCD,
  sd.e = SDeGFRSCD,
  n.c = nControl,
  mean.c = MeaneGFRControl,
  sd.c = SDeGFRControl,
  studlab = FirstAuthorYear,
  data = cont_df,
  sm = "SMD",                  # standardized mean difference (Hedges' g)
  common = FALSE,
  random = TRUE,
  method.tau = "REML",
  method.random.ci = "HK"
)
summary(m_smd)
## SMD 95%-CI %W(random)
## Ndour et al., 2022 * 0.7417 [ 0.5217; 0.9618] 23.1
## Olawale et al., 2021 * 0.5809 [ 0.2558; 0.9061] 10.6
## Suliman et al., 2020 * 0.7464 [ 0.1915; 1.3013] 3.6
## Bolarinwa et al., 2012 (sim) 0.3230 [-0.0401; 0.6862] 8.5
## Ephraim et al., 2015 (sim) 0.7989 [ 0.5133; 1.0845] 13.7
## Aloni et al., 2013 (sim) 1.0287 [ 0.6704; 1.3869] 8.7
## Nwogoh et al., 2018 (sim) 0.5890 [ 0.2150; 0.9630] 8.0
## Ngo Sock et al., 2021 (sim) 0.7067 [ 0.2672; 1.1463] 5.8
## Fakunle et al., 2018 (sim) 0.7249 [ 0.4048; 1.0451] 10.9
## Diaw et al., 2019 (sim) 0.7137 [ 0.3183; 1.1091] 7.1
##
## Number of studies: k = 10
## Number of observations: o = 1536 (o.e = 863, o.c = 673)
##
## SMD 95%-CI t p-value
## Random effects model 0.7042 [0.5830; 0.8253] 13.15 < 0.0001
##
## Quantifying heterogeneity (with 95%-CIs):
## tau^2 = 0 [0.0000; 0.0745]; tau = 0 [0.0000; 0.2729]
## I^2 = 0.0% [0.0%; 62.4%]; H = 1.00 [1.00; 1.63]
##
## Test of heterogeneity:
## Q d.f. p-value
## 8.87 9 0.4490
##
## Details of meta-analysis methods:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Calculation of I^2 based on Q
## - Hartung-Knapp adjustment for random effects model (df = 9)
## - Hedges' g (bias corrected standardised mean difference; using exact formulae)
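Teaching note: the output reports Hedges' g with exact formulae. The sketch below recomputes the point estimate for the first study (Ndour et al., 2022) from the data table. It covers the estimate only; the confidence interval is left out because variance formulas for g differ between sources. Variable names are illustrative.

```r
# Summary statistics for Ndour et al., 2022, copied from the data table
n_scd  <- 163; mean_scd  <- 138.4; sd_scd  <- 41.2
n_ctrl <- 177; mean_ctrl <- 112.6; sd_ctrl <- 27.4

# Pooled within-group standard deviation
df_within <- n_scd + n_ctrl - 2
sd_pooled <- sqrt(((n_scd - 1) * sd_scd^2 + (n_ctrl - 1) * sd_ctrl^2) / df_within)

# Cohen's d: the mean difference expressed in pooled-SD units
d <- (mean_scd - mean_ctrl) / sd_pooled

# Exact small-sample correction factor J (computed on the log scale for stability)
J <- exp(lgamma(df_within / 2) - log(sqrt(df_within / 2)) - lgamma((df_within - 1) / 2))

# Hedges' g is the bias-corrected d
g <- J * d
round(g, 4)
```

With 338 degrees of freedom the correction is tiny, so g is almost identical to d; the correction matters more in small studies.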
forest(m_smd, prediction = TRUE, print.tau2 = TRUE, print.I2 = TRUE, xlab = "Standardized mean difference")
This section asks: Would the mean-difference result change if we used a different estimator for between-study variance?
The only thing we change below is method.tau.
# Refit the MD model with a chosen tau^2 estimator; everything else stays fixed
fit_md <- function(data, tau_method = "REML") {
  metacont(
    n.e = nSCD,
    mean.e = MeaneGFRSCD,
    sd.e = SDeGFRSCD,
    n.c = nControl,
    mean.c = MeaneGFRControl,
    sd.c = SDeGFRControl,
    studlab = FirstAuthorYear,
    data = data,
    sm = "MD",
    common = FALSE,
    random = TRUE,
    method.tau = tau_method,
    method.random.ci = "HK"
  )
}
# One summary row per estimator: REML, DerSimonian-Laird (DL), Sidik-Jonkman (SJ)
md_tau <- bind_rows(
  summary_cont(fit_md(cont_df, "REML"), "REML"),
  summary_cont(fit_md(cont_df, "DL"), "DL"),
  summary_cont(fit_md(cont_df, "SJ"), "SJ")
)
md_tau %>%
  mutate(
    Estimate = fmt_num(Estimate),
    Lower_95_CI = fmt_num(Lower_95_CI),
    Upper_95_CI = fmt_num(Upper_95_CI),
    Tau2 = fmt_num(Tau2, 4),
    I2_percent = fmt_num(I2_percent, 1),
    Q_p_value = fmt_p(Q_p_value)
  ) %>%
  kable(caption = "Sensitivity of pooled MD to tau-squared estimator")
| Analysis | Studies | Estimate | Lower_95_CI | Upper_95_CI | Tau2 | I2_percent | Q_p_value |
|---|---|---|---|---|---|---|---|
| REML | 10 | 21.122 | 15.668 | 26.577 | 31.3808 | 56.9 | 0.013 |
| DL | 10 | 21.121 | 15.668 | 26.575 | 31.1663 | 56.9 | 0.013 |
| SJ | 10 | 21.155 | 15.679 | 26.632 | 40.6733 | 56.9 | 0.013 |
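Teaching note: of the three estimators, DL is the only one with a simple closed form, so it can be checked by hand. The sketch below recomputes Q, the DL tau-squared, and the DL pooled MD in base R from the values in the data table. It does not reproduce the Hartung-Knapp interval, and the variable names are illustrative.

```r
# Study-level data copied from the data table (same row order)
n_scd  <- c(163, 150, 32, 72, 138, 68, 60, 45, 80, 55)
n_ctrl <- c(177, 50, 23, 50, 80, 68, 55, 40, 80, 50)
m_scd  <- c(138.4, 152.3, 118.6, 104.8, 126.8, 158.4, 109.4, 112.4, 144.1, 116.2)
s_scd  <- c(41.2, 38.6, 30.8, 28.4, 35.4, 44.6, 27.8, 27.8, 40.8, 29.4)
m_ctrl <- c(112.6, 131.5, 98.2, 96.4, 101.2, 120.6, 94.8, 94.6, 119.4, 97.8)
s_ctrl <- c(27.4, 24.7, 20.3, 21.6, 24.8, 26.1, 20.6, 21.3, 25.2, 20.6)

yi <- m_scd - m_ctrl                        # study mean differences
vi <- s_scd^2 / n_scd + s_ctrl^2 / n_ctrl   # their sampling variances

# Cochran's Q around the common-effect (inverse-variance) estimate
wi <- 1 / vi
Q  <- sum(wi * (yi - sum(wi * yi) / sum(wi))^2)

# DerSimonian-Laird moment estimator of tau^2 (truncated at zero)
C_dl    <- sum(wi) - sum(wi^2) / sum(wi)
tau2_dl <- max(0, (Q - (length(yi) - 1)) / C_dl)

# Random-effects pooled MD with weights 1 / (vi + tau^2)
wi_re  <- 1 / (vi + tau2_dl)
est_dl <- sum(wi_re * yi) / sum(wi_re)

round(c(Q = Q, tau2_DL = tau2_dl, MD_DL = est_dl), 4)
```

The values should be close to the DL row of the table (Tau2 = 31.1663, Estimate = 21.121). A larger tau-squared, as under SJ, makes the study weights more equal, but here that barely moves the pooled MD.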
loo_md <- metainf(m_md, pooled = "random")
summary(loo_md)
## Leave-one-out meta-analysis
##
## MD 95%-CI p-value
## Omitting Ndour et al., 2022 * 20.4958 [14.3775; 26.6142] < 0.0001
## Omitting Olawale et al., 2021 * 21.1914 [14.9377; 27.4452] < 0.0001
## Omitting Suliman et al., 2020 * 21.2010 [15.0621; 27.3399] < 0.0001
## Omitting Bolarinwa et al., 2012 (sim) 22.6062 [17.8249; 27.3875] < 0.0001
## Omitting Ephraim et al., 2015 (sim) 20.5545 [14.4286; 26.6804] < 0.0001
## Omitting Aloni et al., 2013 (sim) 19.7301 [15.1625; 24.2978] < 0.0001
## Omitting Nwogoh et al., 2018 (sim) 21.9172 [15.9853; 27.8491] < 0.0001
## Omitting Ngo Sock et al., 2021 (sim) 21.4931 [15.3479; 27.6383] < 0.0001
## Omitting Fakunle et al., 2018 (sim) 20.7715 [14.6382; 26.9048] < 0.0001
## Omitting Diaw et al., 2019 (sim) 21.4558 [15.2668; 27.6448] < 0.0001
##
## Random effects model 21.1223 [15.6679; 26.5767] < 0.0001
## tau^2 tau I^2
## Omitting Ndour et al., 2022 * 35.0045 5.9165 57.9%
## Omitting Olawale et al., 2021 * 39.1461 6.2567 61.7%
## Omitting Suliman et al., 2020 * 36.7829 6.0649 61.6%
## Omitting Bolarinwa et al., 2012 (sim) 10.2410 3.2001 34.2%
## Omitting Ephraim et al., 2015 (sim) 35.5302 5.9607 58.7%
## Omitting Aloni et al., 2013 (sim) 16.0349 4.0044 39.4%
## Omitting Nwogoh et al., 2018 (sim) 32.4092 5.6929 57.1%
## Omitting Ngo Sock et al., 2021 (sim) 37.1589 6.0958 60.9%
## Omitting Fakunle et al., 2018 (sim) 36.5497 6.0456 60.7%
## Omitting Diaw et al., 2019 (sim) 37.9332 6.1590 61.1%
##
## Random effects model 31.3808 5.6019 56.9%
##
## Details of meta-analysis methods:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Calculation of I^2 based on Q
## - Hartung-Knapp adjustment for random effects model (df = {8, 9})
forest(loo_md, xlab = "Mean difference in eGFR")
galbraith_plot(m_md, main = "Galbraith plot: eGFR mean difference")
funnel(m_md, xlab = "Mean difference", main = "Funnel plot: eGFR mean difference")
egger_md <- metabias(m_md, method.bias = "linreg", k.min = 3)
egger_md
## Linear regression test of funnel plot asymmetry
##
## Test result: t = 0.40, df = 8, p-value = 0.6974
## Bias estimate: 1.2409 (SE = 3.0777)
##
## Details:
## - multiplicative residual heterogeneity variance (tau^2 = 2.5562)
## - predictor: standard error
## - weight: inverse variance
## - reference: Egger et al. (1997), BMJ
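Teaching note: the Details lines above (predictor: standard error; weight: inverse variance) describe a weighted linear regression, so the test can be sketched with lm(). Here the standard errors are recovered from the printed study confidence intervals, so small rounding differences from the metabias() output are expected.

```r
# Study MDs and 95% CI limits copied from the summary(m_md) output
yi    <- c(25.8, 20.8, 20.4, 8.4, 25.6, 37.8, 14.6, 17.8, 24.7, 18.4)
lower <- c(18.2968, 11.5788, 6.8831, -0.4813, 17.5740,
           25.5177, 5.7051, 7.3336, 14.1916, 8.7577)
upper <- c(33.3032, 30.0212, 33.9169, 17.2813, 33.6260,
           50.0823, 23.4949, 28.2664, 35.2084, 28.0423)

# Recover each study's standard error from the width of its CI
sei <- (upper - lower) / (2 * qnorm(0.975))

# Regress the effect on its standard error, weighting by inverse variance;
# the slope on sei is the funnel-asymmetry ("bias") estimate
egger_fit <- lm(yi ~ sei, weights = 1 / sei^2)
egger_row <- coef(summary(egger_fit))["sei", ]

round(egger_row, 4)
```

A slope near zero means that smaller (less precise) studies do not report systematically larger or smaller mean differences than larger ones.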
tf_md <- trimfill(m_md)
summary(tf_md)
## MD 95%-CI %W(random)
## Ndour et al., 2022 * 25.8000 [18.2968; 33.3032] 12.2
## Olawale et al., 2021 * 20.8000 [11.5788; 30.0212] 10.5
## Suliman et al., 2020 * 20.4000 [ 6.8831; 33.9169] 7.1
## Bolarinwa et al., 2012 (sim) 8.4000 [-0.4813; 17.2813] 10.8
## Ephraim et al., 2015 (sim) 25.6000 [17.5740; 33.6260] 11.7
## Aloni et al., 2013 (sim) 37.8000 [25.5177; 50.0823] 8.0
## Nwogoh et al., 2018 (sim) 14.6000 [ 5.7051; 23.4949] 10.8
## Ngo Sock et al., 2021 (sim) 17.8000 [ 7.3336; 28.2664] 9.4
## Fakunle et al., 2018 (sim) 24.7000 [14.1916; 35.2084] 9.4
## Diaw et al., 2019 (sim) 18.4000 [ 8.7577; 28.0423] 10.1
##
## Number of studies: k = 10 (with 0 added studies)
## Number of observations: o = 1536 (o.e = 863, o.c = 673)
##
## MD 95%-CI t p-value
## Random effects model 21.1223 [15.6679; 26.5767] 8.76 < 0.0001
##
## Quantifying heterogeneity (with 95%-CIs):
## tau^2 = 31.3808 [2.3198; 176.3573]; tau = 5.6019 [1.5231; 13.2800]
## I^2 = 56.9% [12.7%; 78.7%]; H = 1.52 [1.07; 2.17]
##
## Test of heterogeneity:
## Q d.f. p-value
## 20.87 9 0.0133
##
## Details of meta-analysis methods:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Calculation of I^2 based on Q
## - Hartung-Knapp adjustment for random effects model (df = 9)
## - Trim-and-fill method to adjust for funnel plot asymmetry (L-estimator)
funnel(tf_md, xlab = "Mean difference", main = "Trim-and-fill: eGFR mean difference")
m_md_rob <- metacont(
  n.e = nSCD,
  mean.e = MeaneGFRSCD,
  sd.e = SDeGFRSCD,
  n.c = nControl,
  mean.c = MeaneGFRControl,
  sd.c = SDeGFRControl,
  studlab = FirstAuthorYear,
  data = cont_df,
  sm = "MD",
  common = FALSE,
  random = TRUE,
  method.tau = "REML",
  method.random.ci = "HK",
  subgroup = OverallROBINSE    # one pooled estimate per risk-of-bias category
)
summary(m_md_rob)
## MD 95%-CI %W(random)
## Ndour et al., 2022 * 25.8000 [18.2968; 33.3032] 12.2
## Olawale et al., 2021 * 20.8000 [11.5788; 30.0212] 10.5
## Suliman et al., 2020 * 20.4000 [ 6.8831; 33.9169] 7.1
## Bolarinwa et al., 2012 (sim) 8.4000 [-0.4813; 17.2813] 10.8
## Ephraim et al., 2015 (sim) 25.6000 [17.5740; 33.6260] 11.7
## Aloni et al., 2013 (sim) 37.8000 [25.5177; 50.0823] 8.0
## Nwogoh et al., 2018 (sim) 14.6000 [ 5.7051; 23.4949] 10.8
## Ngo Sock et al., 2021 (sim) 17.8000 [ 7.3336; 28.2664] 9.4
## Fakunle et al., 2018 (sim) 24.7000 [14.1916; 35.2084] 9.4
## Diaw et al., 2019 (sim) 18.4000 [ 8.7577; 28.0423] 10.1
## OverallROBINSE
## Ndour et al., 2022 * Low
## Olawale et al., 2021 * Moderate
## Suliman et al., 2020 * Serious
## Bolarinwa et al., 2012 (sim) Moderate
## Ephraim et al., 2015 (sim) Moderate
## Aloni et al., 2013 (sim) Low
## Nwogoh et al., 2018 (sim) Moderate
## Ngo Sock et al., 2021 (sim) Moderate
## Fakunle et al., 2018 (sim) Moderate
## Diaw et al., 2019 (sim) Moderate
##
## Number of studies: k = 10
## Number of observations: o = 1536 (o.e = 863, o.c = 673)
##
## MD 95%-CI t p-value
## Random effects model 21.1223 [15.6679; 26.5767] 8.76 < 0.0001
##
## Quantifying heterogeneity (with 95%-CIs):
## tau^2 = 31.3808 [2.3198; 176.3573]; tau = 5.6019 [1.5231; 13.2800]
## I^2 = 56.9% [12.7%; 78.7%]; H = 1.52 [1.07; 2.17]
##
## Test of heterogeneity:
## Q d.f. p-value
## 20.87 9 0.0133
##
## Results for subgroups (random effects model):
## k MD 95%-CI tau^2 tau Q
## OverallROBINSE = Low 2 30.7744 [-44.3408; 105.8896] 45.0373 6.7110 2.67
## OverallROBINSE = Moderate 7 18.5789 [ 12.9506; 24.2072] 16.6230 4.0771 10.30
## OverallROBINSE = Serious 1 20.4000 [ 6.8831; 33.9169] -- -- 0.00
## I^2
## OverallROBINSE = Low 62.6%
## OverallROBINSE = Moderate 41.8%
## OverallROBINSE = Serious --
##
## Test for subgroup differences (random effects model):
## Q d.f. p-value
## Between groups 3.70 2 0.1575
##
## Details of meta-analysis methods:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Calculation of I^2 based on Q
## - Hartung-Knapp adjustment for random effects model (df = 9)
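Teaching note: the very wide interval for the Low risk-of-bias subgroup (k = 2) deserves a second look. The sketch below assumes that the Hartung-Knapp interval within a subgroup uses a t-distribution with k - 1 degrees of freedom. This is an assumption about how the subgroup intervals were built, although it is consistent with the printed numbers.

```r
# 97.5% t quantiles for a Hartung-Knapp interval with df = k - 1
t_low      <- qt(0.975, df = 2 - 1)   # "Low" subgroup, k = 2
t_moderate <- qt(0.975, df = 7 - 1)   # "Moderate" subgroup, k = 7

# Back out the standard error of the "Low" estimate from its printed CI
half_width_low <- (105.8896 - (-44.3408)) / 2
se_low <- half_width_low / t_low

round(c(t_low = t_low, t_moderate = t_moderate, se_low = se_low), 3)
```

With one degree of freedom the multiplier is about 12.7 instead of about 2.4, so the interval is wide mainly because the subgroup contains only two studies, not because its standard error is unusually large.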
forest(m_md_rob, xlab = "Mean difference in eGFR", prediction = TRUE, print.tau2 = TRUE)
Using a random-effects mean difference meta-analysis with REML estimation of the between-study variance and Hartung-Knapp confidence intervals, the pooled mean difference in eGFR between the SCD and control groups was 21.122 (95% CI 15.668 to 26.577). Positive values indicate higher mean eGFR in the SCD group. Heterogeneity was I² = 56.9%, τ² = 31.3808, Q-test p = 0.013. The pooled estimate was essentially unchanged under the DL and SJ estimators (21.121 and 21.155) and ranged from 19.730 to 22.606 across the leave-one-out analyses. Because the studies are observational and some entries are simulated for training, the result should be interpreted as a methodological exercise rather than a clinical conclusion.
Use metacont() for continuous outcomes with group means, SDs, and sample sizes.