R Markdown created by Joshua Polanin on August 3, 2024
In consultation with Laura Michaelson
Updated and submitted on August 13, 2024
Adapted from ClubSandwich and metafor vignettes
Copy/paste from form, use once ready: We did not have a strong ‘a priori’ hypothesis about (a) the underlying relationships or (b) the best modeling approach. As a result, we sought to use model-building techniques that would allow for some flexibility. We limited testing only to models that advanced our understanding of the relationship, using sensitivity analyses to test the robustness of our conclusions.
## Registered S3 method overwritten by 'clubSandwich':
## method from
## bread.mlm sandwich
## Loading required package: Matrix
## Loading required package: metadat
## Loading required package: numDeriv
##
## Loading the 'metafor' package (version 4.6-0). For an
## introduction to the package please type: help(metafor)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ tidyr::expand() masks Matrix::expand()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tidyr::pack() masks Matrix::pack()
## ✖ tidyr::unpack() masks Matrix::unpack()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
##
## Attaching package: 'Hmisc'
##
##
## The following objects are masked from 'package:dplyr':
##
## src, summarize
##
##
## The following objects are masked from 'package:base':
##
## format.pval, units
dat <- read.csv("ds_as_race_analysts_rev_10july24.csv", header = T)
str(dat)
## 'data.frame': 103 obs. of 9 variables:
## $ author_year : chr "Guion (2009)" "Guion (2009)" "Guion (2009)" "Guion (2009)" ...
## $ study_id : int 100 100 100 100 100 100 200 200 300 300 ...
## $ sample_id : int 101 101 102 102 102 101 201 202 301 301 ...
## $ pub_type : chr "dissertation" "dissertation" "dissertation" "dissertation" ...
## $ as_inst : chr "CASQ-R" "CASQ-R" "CASQ-R" "CASQ-R" ...
## $ valence : chr "composite" "positive" "composite" "positive" ...
## $ corr : num -0.469 -0.286 -0.283 -0.186 0.309 0.485 0.257 0.271 -0.55 0.09 ...
## $ n : int 113 113 20 20 20 113 292 297 49 49 ...
## $ race..0...Black.: int 1 1 0 0 0 1 1 0 1 1 ...
We start with some simple data cleaning, renaming the race levels.
dat$race..0...Black. <- as.factor(dat$race..0...Black.)
levels(dat$race..0...Black.) <- c("Black", "White")
We want the correlation to ‘mean’ the same thing, regardless of sign
So now, every correlation means that, higher ‘negative’ attribution style, higher depression
dat$corr_flip <- ifelse(dat$valence != "negative", dat$corr * -1, dat$corr)
Using metafor’s ‘escalc’ to transform the bivariate correlations to Fisher’s Z Take the ‘flipped’ correlation form above
dat <- escalc(measure = "ZCOR",
ri = corr_flip,
ni = n,
data = dat,
var.names = c("z_cor", "z_var"))
Create a numeric variable of the composite variable for correlation testing
dat$valence_num <- ifelse(dat$valence == "composite", 0,
ifelse(dat$valence == "positive", 1, -1))
str(dat)
## Classes 'escalc' and 'data.frame': 103 obs. of 13 variables:
## $ author_year : chr "Guion (2009)" "Guion (2009)" "Guion (2009)" "Guion (2009)" ...
## $ study_id : int 100 100 100 100 100 100 200 200 300 300 ...
## $ sample_id : int 101 101 102 102 102 101 201 202 301 301 ...
## $ pub_type : chr "dissertation" "dissertation" "dissertation" "dissertation" ...
## $ as_inst : chr "CASQ-R" "CASQ-R" "CASQ-R" "CASQ-R" ...
## $ valence : chr "composite" "positive" "composite" "positive" ...
## $ corr : num -0.469 -0.286 -0.283 -0.186 0.309 0.485 0.257 0.271 -0.55 0.09 ...
## $ n : int 113 113 20 20 20 113 292 297 49 49 ...
## $ race..0...Black.: Factor w/ 2 levels "Black","White": 2 2 1 1 1 2 2 1 2 2 ...
## $ corr_flip : num 0.469 0.286 0.283 0.186 0.309 0.485 0.257 0.271 0.55 -0.09 ...
## $ z_cor : num 0.509 0.294 0.291 0.188 0.319 ...
## ..- attr(*, "ni")= int [1:103] 113 113 20 20 20 113 292 297 49 49 ...
## ..- attr(*, "measure")= chr "ZCOR"
## $ z_var : num 0.00909 0.00909 0.05882 0.05882 0.05882 ...
## $ valence_num : num 0 1 0 1 -1 -1 -1 -1 0 0 ...
## - attr(*, "digits")= Named num [1:9] 4 4 4 4 4 4 4 4 4
## ..- attr(*, "names")= chr [1:9] "est" "se" "test" "pval" ...
## - attr(*, "yi.names")= chr "z_cor"
## - attr(*, "vi.names")= chr "z_var"
It’ll be important to conduct sensitivity analyses, so we split the dataset
dat.comp <- subset(dat, dat$valence == "composite")
dat.posneg <- subset(dat, dat$valence != "composite")
Similarly, we split the dataset into only black and only white participants
dat.black <- subset(dat, dat$race..0...Black. == "Black")
dat.white <- subset(dat, dat$race..0...Black. == "White")
Observe the distribution and test for multicolinearity among valence and sample
t1 <- table(dat$valence, dat$race..0...Black.) ## look at distribution
t1
##
## Black White
## composite 8 20
## negative 19 32
## positive 9 15
t2 <- chisq.test(t1) ## test for differences in distribution ## result indicates no differences
t2
##
## Pearson's Chi-squared test
##
## data: t1
## X-squared = 0.68889, df = 2, p-value = 0.7086
cor(dat$valence_num, as.numeric(dat$race..0...Black.), use = "complete.obs") ## no correlation among predictors, either
## [1] 0.01411934
Notes about this model: Includes the two focal predictors plus the pub_type
Includes a random effect on the study_id
Does NOT include a random effect on sample_id so we can keep the models the same when conducting the subset sensitivity analyses
After running the rma.mv model, we run the coef_test function which adjusts the SEs for within-study correlation
m1 <- rma.mv(z_cor ~ factor(valence) + factor(race..0...Black.) + factor(dat$pub_type) + valence*race..0...Black. ,
V = z_var, random = list(~ 1 | study_id),
data = dat)
## Warning: Redundant predictors dropped from the model.
summary(m1)
##
## Multivariate Meta-Analysis Model (k = 103; method: REML)
##
## logLik Deviance AIC BIC AICc
## -4.9996 9.9993 25.9993 46.5141 27.6545
##
## Variance Components:
##
## estim sqrt nlvls fixed factor
## sigma^2 0.0161 0.1268 36 no study_id
##
## Test for Residual Heterogeneity:
## QE(df = 96) = 282.3150, p-val < .0001
##
## Test of Moderators (coefficients 2:7):
## QM(df = 6) = 54.8304, p-val < .0001
##
## Model Results:
##
## estimate se zval pval
## intrcpt 0.3416 0.0749 4.5619 <.0001
## factor(valence)negative 0.0926 0.0700 1.3234 0.1857
## factor(valence)positive -0.1890 0.0790 -2.3921 0.0168
## factor(race..0...Black.)White 0.1794 0.0602 2.9813 0.0029
## factor(dat$pub_type)journal -0.0106 0.0564 -0.1888 0.8502
## valencenegative:race..0...Black.White -0.1675 0.0721 -2.3228 0.0202
## valencepositive:race..0...Black.White 0.0271 0.0821 0.3297 0.7416
## ci.lb ci.ub
## intrcpt 0.1949 0.4884 ***
## factor(valence)negative -0.0446 0.2298
## factor(valence)positive -0.3438 -0.0341 *
## factor(race..0...Black.)White 0.0614 0.2973 **
## factor(dat$pub_type)journal -0.1211 0.0999
## valencenegative:race..0...Black.White -0.3088 -0.0262 *
## valencepositive:race..0...Black.White -0.1339 0.1880
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m1_ct <- coef_test(m1, "CR2")
m1_ct
## Coef. Estimate SE t-stat d.f. (Satt)
## intrcpt 0.3416 0.0880 3.880 6.84
## factor(valence)negative 0.0926 0.0856 1.082 5.65
## factor(valence)positive -0.1890 0.0818 -2.309 4.57
## factor(race..0...Black.)White 0.1794 0.0797 2.252 3.60
## factor(dat$pub_type)journal -0.0106 0.0612 -0.174 16.71
## valencenegative:race..0...Black.White -0.1675 0.0811 -2.064 6.00
## valencepositive:race..0...Black.White 0.0271 0.0728 0.372 4.92
## p-val (Satt) Sig.
## 0.00634 **
## 0.32333
## 0.07391 .
## 0.09486 .
## 0.86405
## 0.08458 .
## 0.72540
In addition, we conduct a Wald_test on the interaction because it is what we are interested in - any differnces in relationships across race
m1_wt <- Wald_test(m1, "CR2", constraints = constrain_zero(6:7), cluster = dat$study_id)
m1_wt
## test Fstat df_num df_denom p_val sig
## HTZ 7.68 2 4.09 0.0412 *
m2 <- rma.mv(z_cor ~ factor(race..0...Black.),
V = z_var, random = list(~ 1 | study_id),
data = dat.comp)
summary(m2)
##
## Multivariate Meta-Analysis Model (k = 28; method: REML)
##
## logLik Deviance AIC BIC AICc
## 0.7044 -1.4089 4.5911 8.3654 5.6820
##
## Variance Components:
##
## estim sqrt nlvls fixed factor
## sigma^2 0.0171 0.1306 13 no study_id
##
## Test for Residual Heterogeneity:
## QE(df = 26) = 76.3825, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 9.1284, p-val = 0.0025
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.3580 0.0668 5.3609 <.0001 0.2271 0.4888
## factor(race..0...Black.)White 0.1830 0.0606 3.0213 0.0025 0.0643 0.3018
##
## intrcpt ***
## factor(race..0...Black.)White **
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m2_ct <- coef_test(m2, "CR2")
m2_ct
## Coef. Estimate SE t-stat d.f. (Satt) p-val (Satt)
## intrcpt 0.358 0.0745 4.80 4.78 0.00547
## factor(race..0...Black.)White 0.183 0.0775 2.36 3.55 0.08582
## Sig.
## **
## .
m3 <- rma.mv(z_cor ~ factor(race..0...Black.),
V = z_var, random = list(~ 1 | study_id),
data = dat.posneg)
summary(m3)
##
## Multivariate Meta-Analysis Model (k = 75; method: REML)
##
## logLik Deviance AIC BIC AICc
## -21.0915 42.1831 48.1831 55.0545 48.5309
##
## Variance Components:
##
## estim sqrt nlvls fixed factor
## sigma^2 0.0122 0.1103 29 no study_id
##
## Test for Residual Heterogeneity:
## QE(df = 73) = 247.4694, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 4.1961, p-val = 0.0405
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.3418 0.0376 9.0886 <.0001 0.2681 0.4155
## factor(race..0...Black.)White 0.0700 0.0342 2.0484 0.0405 0.0030 0.1370
##
## intrcpt ***
## factor(race..0...Black.)White *
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m3_ct<- coef_test(m3, "CR2")
m3_ct
## Coef. Estimate SE t-stat d.f. (Satt) p-val (Satt)
## intrcpt 0.342 0.0393 8.69 9.55 <0.001
## factor(race..0...Black.)White 0.070 0.0463 1.51 5.76 0.183
## Sig.
## ***
##
m4 <- rma.mv(z_cor ~ factor(valence),
V = z_var, random = list(~ 1 | study_id),
data = dat.black)
summary(m4)
##
## Multivariate Meta-Analysis Model (k = 36; method: REML)
##
## logLik Deviance AIC BIC AICc
## 5.2684 -10.5369 -2.5369 3.4492 -1.1083
##
## Variance Components:
##
## estim sqrt nlvls fixed factor
## sigma^2 0.0058 0.0759 20 no study_id
##
## Test for Residual Heterogeneity:
## QE(df = 33) = 34.1994, p-val = 0.4099
##
## Test of Moderators (coefficients 2:3):
## QM(df = 2) = 18.3776, p-val = 0.0001
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.3742 0.0581 6.4374 <.0001 0.2603 0.4882
## factor(valence)negative 0.0255 0.0681 0.3743 0.7081 -0.1080 0.1590
## factor(valence)positive -0.2346 0.0790 -2.9716 0.0030 -0.3894 -0.0799
##
## intrcpt ***
## factor(valence)negative
## factor(valence)positive **
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m4_ct <- coef_test(m4, "CR2")
m4_ct
## Coef. Estimate SE t-stat d.f. (Satt) p-val (Satt) Sig.
## intrcpt 0.3742 0.0547 6.847 3.59 0.00349 **
## factor(valence)negative 0.0255 0.0709 0.359 5.78 0.73203
## factor(valence)positive -0.2346 0.0940 -2.496 4.69 0.05802 .
m5 <- rma.mv(z_cor ~ factor(valence),
V = z_var, random = list(~ 1 | study_id),
data = dat.white)
summary(m5)
##
## Multivariate Meta-Analysis Model (k = 67; method: REML)
##
## logLik Deviance AIC BIC AICc
## -9.3227 18.6453 26.6453 35.2809 27.3233
##
## Variance Components:
##
## estim sqrt nlvls fixed factor
## sigma^2 0.0169 0.1301 34 no study_id
##
## Test for Residual Heterogeneity:
## QE(df = 64) = 261.5820, p-val < .0001
##
## Test of Moderators (coefficients 2:3):
## QM(df = 2) = 22.4734, p-val < .0001
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.5063 0.0363 13.9310 <.0001 0.4351 0.5776
## factor(valence)negative -0.0743 0.0332 -2.2388 0.0252 -0.1393 -0.0093
## factor(valence)positive -0.1598 0.0346 -4.6166 <.0001 -0.2276 -0.0919
##
## intrcpt ***
## factor(valence)negative *
## factor(valence)positive ***
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m5_ct <- coef_test(m5, "CR2")
m5_ct
## Coef. Estimate SE t-stat d.f. (Satt) p-val (Satt) Sig.
## intrcpt 0.5063 0.0330 15.33 11.41 <0.001 ***
## factor(valence)negative -0.0743 0.0292 -2.55 4.94 0.0520 .
## factor(valence)positive -0.1598 0.0495 -3.23 4.36 0.0282 *
m6 <- rma.mv(z_cor ~ factor(valence) + factor(race..0...Black.) + factor(dat$pub_type) + valence*race..0...Black. ,
V = z_var, random = list(~ 1 | study_id, ~ 1 | sample_id),
data = dat)
## Warning: Redundant predictors dropped from the model.
summary(m6)
##
## Multivariate Meta-Analysis Model (k = 103; method: REML)
##
## logLik Deviance AIC BIC AICc
## -4.1906 8.3812 26.3812 49.4603 28.4742
##
## Variance Components:
##
## estim sqrt nlvls fixed factor
## sigma^2.1 0.0138 0.1176 36 no study_id
## sigma^2.2 0.0024 0.0489 56 no sample_id
##
## Test for Residual Heterogeneity:
## QE(df = 96) = 282.3150, p-val < .0001
##
## Test of Moderators (coefficients 2:7):
## QM(df = 6) = 48.8478, p-val < .0001
##
## Model Results:
##
## estimate se zval pval
## intrcpt 0.3508 0.0780 4.4986 <.0001
## factor(valence)negative 0.0764 0.0739 1.0335 0.3014
## factor(valence)positive -0.2035 0.0828 -2.4597 0.0139
## factor(race..0...Black.)White 0.1636 0.0660 2.4788 0.0132
## factor(dat$pub_type)journal -0.0077 0.0561 -0.1378 0.8904
## valencenegative:race..0...Black.White -0.1490 0.0773 -1.9274 0.0539
## valencepositive:race..0...Black.White 0.0435 0.0869 0.5013 0.6162
## ci.lb ci.ub
## intrcpt 0.1980 0.5036 ***
## factor(valence)negative -0.0685 0.2213
## factor(valence)positive -0.3657 -0.0414 *
## factor(race..0...Black.)White 0.0342 0.2930 *
## factor(dat$pub_type)journal -0.1177 0.1023
## valencenegative:race..0...Black.White -0.3004 0.0025 .
## valencepositive:race..0...Black.White -0.1267 0.2138
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
tst1 <- coef_test(m6, "CR2")
tst1
## Coef. Estimate SE t-stat d.f. (Satt)
## intrcpt 0.35081 0.0767 4.574 6.61
## factor(valence)negative 0.07641 0.0733 1.043 5.59
## factor(valence)positive -0.20354 0.0835 -2.437 4.55
## factor(race..0...Black.)White 0.16360 0.0650 2.518 4.05
## factor(dat$pub_type)journal -0.00773 0.0606 -0.128 16.54
## valencenegative:race..0...Black.White -0.14896 0.0672 -2.216 6.09
## valencepositive:race..0...Black.White 0.04355 0.0635 0.686 4.98
## p-val (Satt) Sig.
## 0.00296 **
## 0.34007
## 0.06379 .
## 0.06478 .
## 0.89992
## 0.06788 .
## 0.52325
Wald_test(m6, "CR2", constraints = constrain_zero(6:7), cluster = dat$study_id)
## test Fstat df_num df_denom p_val sig
## HTZ 4.77 2 4.01 0.087 .
# Extract model coefficients and their standard errors
estimates <- c(
intrcpt = m1_ct$beta[1],
val_neg = m1_ct$beta[2],
val_pos = m1_ct$beta[3],
race_white = m1_ct$beta[4],
neg_white_inter = m1_ct$beta[6],
pos_white_inter = m1_ct$beta[7]
)
m1_ct$SE
## [1] 0.08803916 0.08562223 0.08183845 0.07965741 0.06122304 0.08113888 0.07280778
ses <- c(
intrcpt = m1_ct$SE[1],
val_neg = m1_ct$SE[2],
val_pos = m1_ct$SE[3],
race_white = m1_ct$SE[4],
neg_white_inter = m1_ct$SE[6],
pos_white_inter = m1_ct$SE[7]
)
# Calculate estimates and CIs for each combination
data <- data.frame(
race = rep(c("Black", "White"), each = 3),
valence = rep(c("composite", "negative", "positive"), 2),
estimate = c(
estimates["intrcpt"],
estimates["intrcpt"] + estimates["val_neg"],
estimates["intrcpt"] + estimates["val_pos"],
estimates["intrcpt"] + estimates["race_white"],
estimates["intrcpt"] + estimates["race_white"] + estimates["val_neg"] + estimates["neg_white_inter"],
estimates["intrcpt"] + estimates["race_white"] + estimates["val_pos"] + estimates["pos_white_inter"]
),
se = c(
ses["intrcpt"],
sqrt(ses["intrcpt"]^2 + ses["val_neg"]^2),
sqrt(ses["intrcpt"]^2 + ses["val_pos"]^2),
sqrt(ses["intrcpt"]^2 + ses["race_white"]^2),
sqrt(ses["intrcpt"]^2 + ses["race_white"]^2 + ses["val_neg"]^2 + ses["neg_white_inter"]^2),
sqrt(ses["intrcpt"]^2 + ses["race_white"]^2 + ses["val_pos"]^2 + ses["pos_white_inter"]^2)
)
)
data$ci.lb <- data$estimate - 1.96 * data$se
data$ci.ub <- data$estimate + 1.96 * data$se
# Plotting the point estimates with their confidence intervals and connecting lines
#p1 <- ggplot(data, aes(x = valence, y = estimate, group = race, color = race)) +
# geom_point() +
# geom_line() +
# geom_errorbar(aes(ymin = ci.lb, ymax = ci.ub), width = 0.1) +
# scale_x_discrete(limits = c("composite", "negative", "positive"),
# labels=c("composite" = "Composite", "negative" = "Negative", "positive" = "Positive")) +
# scale_y_continuous(limits = c(-0.8, 0.8)) +
# theme_minimal() +
# labs(title = "Model-Based Correlations without Confidence Intervals",
# x = "Valence",
# y = "Correlation",
# color = "Racial Group") +
# theme(axis.text.x = element_text(angle = 45, hjust = 1))
#p1
p2 <- ggplot(data, aes(x = valence, y = estimate, group = race, color = race)) +
geom_point() +
geom_line() +
#geom_errorbar(aes(ymin = ci.lb, ymax = ci.ub), width = 0.1) +
scale_x_discrete(limits = c("composite", "negative", "positive"),
labels=c("composite" = "Composite", "negative" = "Negative", "positive" = "Positive")) +
scale_y_continuous(limits = c(-0.8, 0.8)) +
theme_minimal() +
labs(title = "Model-Based Correlations without Confidence Intervals",
x = "Valence",
y = "Correlation",
color = "Racial Group") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
p2
Composite_Difference <- data[4,3] - data[1,3]
Negative_Difference <- data[5,3] - data[2,3]
Positive_Difference <- data[6,3] - data[3,3]
Composite_Difference
## [1] 0.1793604
Negative_Difference
## [1] 0.01187764
Positive_Difference
## [1] 0.2064421