title: "Approximate Ballesteros (2013) Correlation & Adding to GSHt Blood vs. Total Symptoms Meta-Analysis"
author: "Danilo Assis Pereira, Ph.D."
date: "2025-01-31"
output: html_document
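From Ballesteros et al. (2013), we only have the overall F-test of a regression model with two predictors (GSHr and GSSG): F = 0.30 with 2 and 26 degrees of freedom.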
They did not provide a separate t or r for GSH alone, which would have been ideal for a meta‐analysis. Instead, we can treat the entire 2‐predictor model as an approximation of “GSH_total” (since GSH_total = GSHr + GSSG). Below, we convert that overall F to a multiple correlation R, which turns out to be small and non‐significant.
For multiple regression with \(p\) predictors, sample size \(n\), and \(\mathrm{df}_{\text{resid}} = n - p - 1\) residual degrees of freedom:
\[ F = \frac{R^2 / p}{(1 - R^2)/\mathrm{df}_{\text{resid}}} \;\Longrightarrow\; R^2 = \frac{p \cdot F}{p \cdot F + \mathrm{df}_{\text{resid}}}. \]
Here:
p <- 2
f_value <- 0.30
df_resid <- 26
R_sq <- (p * f_value) / ((p * f_value) + df_resid)
R <- sqrt(R_sq)
cat("Overall R^2 =", R_sq, "\n")
Overall R^2 = 0.02255639
cat("Overall R =", R, "\n")
Overall R = 0.1501879
# The sign is unknown (multiple correlation is always non-negative).
# So we treat it as +0.15 unless context suggests negative.
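Result: \(R \approx 0.15\). Because R is a multiple (model-level) correlation, it is non-negative by definition, so the direction of the association between GSH and symptoms cannot be recovered from it. It is also an upper bound on the absolute correlation between the outcome and any single linear combination of the two predictors, including GSHr + GSSG. The model is not significant, but if you want a single r for “GSH_total vs. BPRS total,” 0.15 is the best available approximation.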
Caveat: This lumps GSHr and GSSG together. We don’t have a partial or zero‐order correlation for GSH alone.
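The conversion can be wrapped in a small helper so the same steps are reusable for other papers that report only an omnibus F. This is a minimal sketch: the function name `f_to_multiple_r` and its return fields are hypothetical, and it assumes the F is the overall test of an ordinary least-squares regression with an intercept, so that df_resid = n - p - 1.

```r
# Hypothetical helper (not from the paper): convert an overall regression F
# into the multiple correlation R, assuming OLS with an intercept.
f_to_multiple_r <- function(f_value, p, df_resid) {
  r_sq <- (p * f_value) / (p * f_value + df_resid)
  list(
    r_sq      = r_sq,
    r         = sqrt(r_sq),        # multiple R, non-negative by construction
    n_implied = df_resid + p + 1,  # sample size implied by the residual df
    p_value   = pf(f_value, p, df_resid, lower.tail = FALSE)  # upper-tail F test
  )
}

# Values taken from the text above (F = 0.30, 2 predictors, 26 residual df)
ball <- f_to_multiple_r(f_value = 0.30, p = 2, df_resid = 26)
str(ball)
```

With these inputs the implied sample size is 29, and the p-value of the F-test is far above 0.05, which agrees with describing the model as non-significant.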
Let’s say you have a data frame with correlations (r) between GSHt in the blood and total symptom measures from multiple studies:
df_gsht <- data.frame(
authors = c("Tsai et al. 2013",
"Tuncel et al. 2015",
"Nucifora et al. 2017",
"Hendouei et al. 2018",
"Hendouei et al. 2018*",
"Hendouei et al. 2018**",
"Kizilpinar et al. 2023",
"Lin et al. 2023",
"Lin et al. 2023"),
r = c(-0.413, -0.106, -0.311, -0.1, -0.1, 0.1, 0.016, 0.068, -0.047),
n = c(41, 18, 51, 34, 34, 32, 26, 92, 219)
)
df_gsht
authors r n
1 Tsai et al. 2013 -0.413 41
2 Tuncel et al. 2015 -0.106 18
3 Nucifora et al. 2017 -0.311 51
4 Hendouei et al. 2018 -0.100 34
5 Hendouei et al. 2018* -0.100 34
6 Hendouei et al. 2018** 0.100 32
7 Kizilpinar et al. 2023 0.016 26
8 Lin et al. 2023 0.068 92
9 Lin et al. 2023 -0.047 219
Now we’d like to add the approximate Ballesteros “r = +0.15” to see whether it changes the meta‐analysis. We must pick an n for Ballesteros. According to the paper, they had 29 medicated patients, or 52 total participants including controls. Usually, we only want the patient subset if the correlation is specifically in patients. The residual df point the same way: with p = 2 and df_resid = 26, n = df_resid + p + 1 = 29. Let’s assume n = 29 (the patient sample).
df_ball_approx <- data.frame(
authors = c("Ballesteros et al. 2013 (approx)"),
r = c(0.15), # from the F conversion
n = c(29) # approximate number of SCZ patients
)
df_gsht2 <- rbind(df_gsht, df_ball_approx)
df_gsht2
authors r n
1 Tsai et al. 2013 -0.413 41
2 Tuncel et al. 2015 -0.106 18
3 Nucifora et al. 2017 -0.311 51
4 Hendouei et al. 2018 -0.100 34
5 Hendouei et al. 2018* -0.100 34
6 Hendouei et al. 2018** 0.100 32
7 Kizilpinar et al. 2023 0.016 26
8 Lin et al. 2023 0.068 92
9 Lin et al. 2023 -0.047 219
10 Ballesteros et al. 2013 (approx) 0.150 29
Note: If you have a more accurate sample size from the original text, use that instead of 29.
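To see how much the choice of n matters, the sketch below computes the Fisher z confidence interval for r = 0.15 under the two candidate sample sizes mentioned above (29 and 52). The helper name `fisher_ci` is hypothetical, and the `weight` column is the plain inverse sampling variance (n - 3), before any between-study variance is added in a random-effects model.

```r
# Fisher z and its 95% CI for a single correlation, back-transformed to r.
# n = 29 and n = 52 are the two candidate sample sizes discussed in the text.
fisher_ci <- function(r, n, level = 0.95) {
  z    <- atanh(r)                     # same as 0.5 * log((1 + r) / (1 - r))
  se   <- sqrt(1 / (n - 3))            # large-sample SE of Fisher's z
  crit <- qnorm(1 - (1 - level) / 2)   # two-sided normal critical value
  c(r = r, n = n, weight = n - 3,
    ci.lb = tanh(z - crit * se),
    ci.ub = tanh(z + crit * se))
}

ci_tab <- rbind(fisher_ci(0.15, 29), fisher_ci(0.15, 52))
print(round(ci_tab, 3))
```

Under either sample size the interval includes zero; the larger n only narrows it and increases the study's weight.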
We can do a standard correlation meta‐analysis using metafor:
library(metafor)
Loading required package: Matrix
Loading required package: metadat
Loading required package: numDeriv
Loading the 'metafor' package (version 4.8-0). For an
introduction to the package please type: help(metafor)
# 1) Convert r -> Fisher's z
df_gsht2$z <- 0.5 * log((1 + df_gsht2$r) / (1 - df_gsht2$r))
df_gsht2$var_z <- 1 / (df_gsht2$n - 3)
# 2) Fit a random-effects model, e.g. using DerSimonian-Laird
res_dl <- rma(yi = z, vi = var_z, data = df_gsht2, method="DL")
summary(res_dl)
Random-Effects Model (k = 10; tau^2 estimator: DL)
logLik deviance AIC BIC AICc
2.9744 12.3820 -1.9488 -1.3436 -0.2345
tau^2 (estimated amount of total heterogeneity): 0.0079 (SE = 0.0139)
tau (square root of estimated tau^2 value): 0.0886
I^2 (total heterogeneity / total variability): 27.35%
H^2 (total variability / sampling variability): 1.38
Test for Heterogeneity:
Q(df = 9) = 12.3880, p-val = 0.1923
Model Results:
estimate se zval pval ci.lb ci.ub
-0.0763 0.0555 -1.3739 0.1695 -0.1850 0.0325
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
cat("\nBack-transformed to correlation:\n")
Back-transformed to correlation:
print(predict(res_dl, transf=transf.ztor))
pred ci.lb ci.ub pi.lb pi.ub
-0.0761 -0.1830 0.0325 -0.2740 0.1280
Check whether adding the Ballesteros row changes your overall effect size or p‐value meaningfully. If it does not, the choice to include or exclude this approximate data point has little practical consequence; either way, note its limitations in your paper.
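One way to run that check is to pool the studies with and without the approximate row and compare the results side by side. The sketch below does this in base R with a hand-written DerSimonian-Laird estimator, so it runs without metafor; `dl_pool` is a hypothetical helper written for this illustration, not a metafor function. The r and n values are copied from the data frame above.

```r
# Base-R sketch of a DerSimonian-Laird random-effects pool on Fisher's z.
# dl_pool() is a hypothetical helper; it follows the standard DL formulas.
dl_pool <- function(r, n) {
  z     <- atanh(r)                       # Fisher's z
  v     <- 1 / (n - 3)                    # sampling variance of z
  w     <- 1 / v                          # fixed-effect weights
  k     <- length(z)
  mu_fe <- sum(w * z) / sum(w)            # fixed-effect mean
  Q     <- sum(w * (z - mu_fe)^2)         # Cochran's Q
  C     <- sum(w) - sum(w^2) / sum(w)
  tau2  <- max(0, (Q - (k - 1)) / C)      # DL between-study variance
  w_re  <- 1 / (v + tau2)                 # random-effects weights
  est   <- sum(w_re * z) / sum(w_re)
  se    <- sqrt(1 / sum(w_re))
  c(k = k, tau2 = tau2, z = est, se = se,
    pval = 2 * pnorm(-abs(est / se)),
    r    = tanh(est),
    r.lb = tanh(est - qnorm(0.975) * se),
    r.ub = tanh(est + qnorm(0.975) * se))
}

# Same values as df_gsht2; the last entry is the approximate
# Ballesteros et al. (2013) row.
r <- c(-0.413, -0.106, -0.311, -0.1, -0.1, 0.1, 0.016, 0.068, -0.047, 0.15)
n <- c(41, 18, 51, 34, 34, 32, 26, 92, 219, 29)

comparison <- rbind(
  with_ballesteros    = dl_pool(r, n),
  without_ballesteros = dl_pool(r[-10], n[-10])
)
print(round(comparison, 4))
```

With all ten rows this should agree with the DL estimate and tau^2 in the metafor output above (z of about -0.076, tau^2 of about 0.008); the second row gives the pooled result without the Ballesteros estimate. If you prefer to stay inside metafor, its `leave1out()` function performs the analogous check for every study in a fitted model.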
You mentioned that GSHr (reduced form) plus GSSG (oxidized form) typically sums to total GSH. While that’s chemically valid, the original paper’s regression approach might or might not reflect exactly “GSH_total.” They simply included 2 separate measures in the model. Without direct data on “(GSHr + GSSG) correlation with BPRS,” we can’t be certain. So do interpret r=0.15 with caution.
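The reason for caution can be illustrated with simulated data. The multiple R of a two-predictor model is the largest correlation attainable between the outcome and any weighted combination of the predictors, so it is at least as large as |r| for the unweighted sum GSHr + GSSG, and it carries no sign. Everything in this sketch (variable names, coefficients, seed) is made up for illustration and has nothing to do with the actual Ballesteros data.

```r
# Simulated data only: these are NOT the Ballesteros et al. (2013) values.
set.seed(1)
n_sim <- 29
gshr  <- rnorm(n_sim)                             # stand-in for reduced glutathione
gssg  <- 0.3 * gshr + rnorm(n_sim)                # stand-in for oxidized glutathione
bprs  <- 0.2 * gshr - 0.1 * gssg + rnorm(n_sim)   # stand-in for symptom score

fit        <- lm(bprs ~ gshr + gssg)              # two-predictor model
multiple_R <- sqrt(summary(fit)$r.squared)        # model-level correlation
r_total    <- cor(bprs, gshr + gssg)              # zero-order r for the summed measure

cat("Multiple R from two-predictor model:", round(multiple_R, 3), "\n")
cat("r between outcome and GSHr + GSSG:  ", round(r_total, 3), "\n")
```

Whatever the simulated values turn out to be, `multiple_R` cannot be smaller than `abs(r_total)`, which is why the 0.15 above is better read as a ceiling on the magnitude of the GSH_total correlation than as an estimate of it.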
Bottom Line:
- The most rigorous approach is to avoid including it unless the paper
explicitly reported GSH_total vs. total symptom severity as a single
correlation.
- If you do include it, set r=+0.15, n=29 (approx.), and label it as
“estimated from overall F.” State these limitations in your Methods
section.