Setup

library(pacman); p_load(dplyr, psych, dmetar, meta, metapower, brms, MCMCglmm)

Rationale

This is a reanalysis of Pesta et al. (2020). The point is to estimate the power of the meta-analysis. It uses the data from Table 2 and is restricted to the Black and White subgroups, since they are the only ones I would regard as plausibly well-represented. It compares two ways of thinking about meta-analyses of heritability: taking \(\sqrt{h^2}\) versus simply using \(h^2\) as r.

Analysis

First, create columns for the unstandardized (square-root) heritability estimates and their SEs.

data <- mutate(data,
  SA = sqrt(A),
  SESA = sqrt(ASE))
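
As a quick check on what these two columns contain, the sketch below applies the same square-root step to a single toy row in base R. The values A = 0.28 and ASE = 0.33 are assumptions, chosen only because they appear consistent with the first row of the output further down; they are not read from Table 2. The last column shows, purely for comparison, the first-order delta-method standard error for \(\sqrt{h^2}\); it is not what the analysis here uses.

```r
# Toy row: A = heritability estimate, ASE = its standard error (assumed values)
toy <- data.frame(A = 0.28, ASE = 0.33)

# Same transformation as in the mutate() call above
toy$SA   <- sqrt(toy$A)     # square root of the heritability
toy$SESA <- sqrt(toy$ASE)   # square root of the standard error

# For comparison only (not used in this analysis):
# delta-method SE of sqrt(A), i.e. ASE / (2 * sqrt(A))
toy$SESA_delta <- toy$ASE / (2 * sqrt(toy$A))

print(round(toy, 4))
```

The two standard errors differ noticeably for this row, which is worth keeping in mind when reading the width of the intervals below.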

Second, subset to the two groups for the analysis, Whites and Blacks.

data <- subset(data, Race == "White" | Race == "Black")

describe(data)

Third, perform the aggregate meta-analysis to assess the question, ‘What is the mean heritability of cognitive ability measure scores in Blacks and Whites in the United States?’ This has to be Z-scored in order to achieve the correct estimate variances. We should perform a power analysis beforehand. For this, I will assume heritability is 50% (unstandardized, this is \(\sqrt{0.50}\) = 0.707) since the sample is fairly young, and I will use the number of studies k (31) and the median Nh (465). I will use a moderate heterogeneity value of \(I^2\) = 0.50. The function mpower performs the r-to-z transformation on correlations (Griffin, 2021; this is just atanh(r)), so the expected effect size it reports will differ from the 0.707 passed in. The need for a transformation differs from the case of Cohen’s d but is similar to that of the odds ratio, which has to be logged.

mpower(effect_size = 0.707, study_size = 465, k = 31, i2 = 0.50, es_type = "r")
## 
##  Power Analysis for Meta-analysis 
## 
##  Effect Size Metric:                r 
##  Expected Effect Size:              0.88 
##  Expected Study Size:               465 
##  Expected Number of Studies:        31 
## 
##  Estimated Power: Mean Effect Size 
## 
##  Fixed-Effects Model                1 
##  Random-Effects Model (i2 = 50%):   1
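
The gap between the 0.707 passed in and the 0.88 reported as the expected effect size can be checked by hand. This is a minimal sketch in base R; it only assumes that the reported value is Fisher's z of the input, as described above.

```r
# Fisher's z for the value passed to mpower()
z <- atanh(0.707)
print(round(z, 4))

# atanh(r) is the same as 0.5 * log((1 + r) / (1 - r))
z_manual <- 0.5 * log((1 + 0.707) / (1 - 0.707))
print(all.equal(z, z_manual))
```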

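On to the meta-analysis, which is otherwise identical to the main study. Because sm = "COR", the values are pooled as untransformed correlations (see the method details at the end of the output), and backtransf = F leaves the printed estimates on the scale on which they were entered.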

AMeta <- metagen(data = data,
                 TE = SA,
                 seTE = SESA,
                 studlab = Study,
                 sm = "COR",
                 fixed = F,
                 random = T,
                 method.tau = "HS", 
                 hakn = T,
                 backtransf = F,
                 title = "The Heritability of Cognitive Ability Among Blacks and Whites in the U.S."); summary(AMeta)
## Review:     The Heritability of Cognitive Ability Among Blacks and Whites in ...
## 
##                                   COR            95%-CI %W(random)
## Scarr-Salapatek (1971)         0.5292 [-0.5968; 1.6551]        1.3
## Beaver et al. (2013)           0.6928 [ 0.0428; 1.3429]        4.0
## Hodges (1976)                  0.5831 [-0.2250; 1.3912]        2.6
## Osborne (1980)                 0.7348 [-0.0491; 1.5188]        2.8
## Scarr (1981)                   0.7280 [-0.2120; 1.6680]        1.9
## Scarr et al. (1993)            0.6403 [-0.5993; 1.8799]        1.1
## Beaver et al. (2013)           0.7211 [ 0.1667; 1.2755]        5.5
## Rhemtulla & Tucker-Drob (2012) 0.6083 [-0.0418; 1.2583]        4.0
## Hart et al. (2013)             0.8944 [ 0.2746; 1.5142]        4.4
## Woodley of Menie et al. (2015) 0.3742 [-0.2759; 1.0242]        4.0
## Figlio et al. (2017)           0.7550 [ 0.3167; 1.1932]        8.8
## Figlio et al. (2017)           0.7681 [ 0.2496; 1.2867]        6.3
## Mollon et al. (2018)           0.8485 [ 0.3300; 1.3671]        6.3
## Engelhardt et al. (2019)       0.6782 [ 0.0584; 1.2980]        4.4
## Pesta et al. (2019)            0.8185 [ 0.1685; 1.4686]        4.0
## Pesta et al. (2019)            0.7810 [-0.4272; 1.9892]        1.2
## Scarr-Salapatek (1971)         0.5568 [-0.4987; 1.6122]        1.5
## Beaver et al. (2013)           0.7141 [ 0.0943; 1.3339]        4.4
## Hodges (1976)                  0.4472 [-0.5130; 1.4074]        1.8
## Osborne (1980)                 0.7681 [-0.2874; 1.8236]        1.5
## Scarr (1981)                   0.6928 [-0.2674; 1.6530]        1.8
## Scarr et al. (1993)            0.7211 [-0.5790; 2.0212]        1.0
## Beaver et al. (2013)           0.6708 [-0.2057; 1.5473]        2.2
## Hart et al. (2013)             0.9434 [ 0.1594; 1.7274]        2.8
## Woodley of Menie et al. (2015) 0.9695 [-0.8375; 2.7765]        0.5
## Figlio et al. (2017)           0.7483 [ 0.1603; 1.3363]        4.9
## Figlio et al. (2017)           0.6928 [ 0.0428; 1.3429]        4.0
## Mollon et al. (2018)           0.7810 [ 0.0477; 1.5144]        3.1
## Engelhardt et al. (2019)       0.3606 [-0.7130; 1.4341]        1.5
## Pesta et al. (2019)            0.9000 [ 0.3120; 1.4880]        4.9
## Pesta et al. (2019)            0.7874 [-0.2681; 1.8429]        1.5
## 
## Number of studies combined: k = 31
## 
##                         COR           95%-CI     t  p-value
## Random effects model 0.7246 [0.6763; 0.7729] 30.64 < 0.0001
## 
## Quantifying heterogeneity:
##  tau^2 = 0; tau = 0; I^2 = 0.0% [0.0%; 40.2%]; H = 1.00 [1.00; 1.29]
## 
## Test of heterogeneity:
##     Q d.f. p-value
##  3.81   30  1.0000
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Hunter-Schmidt estimator for tau^2
## - Hartung-Knapp adjustment for random effects model
## - Untransformed correlations

The mean heritability is, thus, \(0.7246^2\), or

0.7246^2
## [1] 0.5250452
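
The same squaring can be applied to the confidence limits to express the interval in heritability units. This is a sketch under the assumption that squaring the limits is acceptable here because the transformation is monotone for positive values; the inputs are the rounded numbers printed in the summary above.

```r
# Pooled sqrt(h^2) estimate and its 95% CI, copied from the summary above
est <- 0.7246
ci  <- c(lower = 0.6763, upper = 0.7729)

# Squaring is monotone on positive values, so the limits map directly
h2    <- est^2
h2_ci <- ci^2

print(round(c(h2 = h2, h2_ci), 4))
```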

which is close to the 0.60 for all groups in the study. Next, we can perform a subgroup analysis. Because many of these estimates came from the same studies and used the same instruments, \(\tau^2\) should probably be regarded as shared. However, because this is not wholly true, it may be worthwhile to also run the meta-analysis without a shared \(\tau^2\). Because few of the studies lack characteristically overlapping samples, the results should barely differ even at high levels of \(\tau^2\).

update.meta(AMeta, 
            subgroup = Race,
            tau.common = T)
## Review:     The Heritability of Cognitive Ability Among Blacks and Whites in ...
## 
## Number of studies combined: k = 31
## 
##                         COR           95%-CI     t  p-value
## Random effects model 0.7246 [0.6763; 0.7729] 30.64 < 0.0001
## 
## Quantifying heterogeneity:
##  tau^2 = 0; tau = 0; I^2 = 0.0% [0.0%; 40.2%]; H = 1.00 [1.00; 1.29]
## 
## Quantifying residual heterogeneity:
##  tau^2 = 0; tau = 0; I^2 = 0.0% [0.0%; 40.8%]; H = 1.00 [1.00; 1.30]
## 
## Test of heterogeneity:
##     Q d.f. p-value
##  3.81   30  1.0000
## 
## Results for subgroups (random effects model):
##                k    COR           95%-CI tau^2 tau    Q  I^2
## Race = White  16 0.7183 [0.6505; 0.7861]     0   0 2.16 0.0%
## Race = Black  15 0.7350 [0.6555; 0.8146]     0   0 1.64 0.0%
## 
## Test for subgroup differences (random effects model):
##                   Q d.f. p-value
## Between groups 0.12    1  0.7322
## Within groups  3.80   29  1.0000
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Hunter-Schmidt estimator for tau^2 (assuming common tau^2 in subgroups)
## - Hartung-Knapp adjustment for random effects model
## - Untransformed correlations
update.meta(AMeta,
            subgroup = Race,
            tau.common = F)
## Review:     The Heritability of Cognitive Ability Among Blacks and Whites in ...
## 
## Number of studies combined: k = 31
## 
##                         COR           95%-CI     t  p-value
## Random effects model 0.7246 [0.6763; 0.7729] 30.64 < 0.0001
## 
## Quantifying heterogeneity:
##  tau^2 = 0; tau = 0; I^2 = 0.0% [0.0%; 40.2%]; H = 1.00 [1.00; 1.29]
## 
## Test of heterogeneity:
##     Q d.f. p-value
##  3.81   30  1.0000
## 
## Results for subgroups (random effects model):
##                k    COR           95%-CI tau^2 tau    Q  I^2
## Race = White  16 0.7183 [0.6505; 0.7861]     0   0 2.16 0.0%
## Race = Black  15 0.7350 [0.6555; 0.8146]     0   0 1.64 0.0%
## 
## Test for subgroup differences (random effects model):
##                     Q d.f. p-value
## Between groups   0.12    1  0.7322
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Hunter-Schmidt estimator for tau^2
## - Hartung-Knapp adjustment for random effects model
## - Untransformed correlations
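
The p-value for the between-groups test can be approximated from the printed Q statistic. This is a sketch that assumes Q is referred to a chi-square distribution with (number of subgroups - 1) degrees of freedom; because it starts from the rounded Q of 0.12, it will only roughly match the printed 0.7322.

```r
# Between-groups Q and its degrees of freedom, from the output above
Q_between <- 0.12
df        <- 1

# Upper-tail chi-square probability
p_between <- pchisq(Q_between, df = df, lower.tail = FALSE)
print(round(p_between, 3))
```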

It could also be interesting to see whether the aggregate effect size is moderated by type of test.

AMeta <- metagen(data = data,
                 TE = SA,
                 seTE = SESA,
                 studlab = Study,
                 sm = "COR",
                 fixed = F,
                 random = T,
                 method.tau = "REML", 
                 hakn = T,
                 backtransf = F,
                 subgroup = Test.Types,
                 title = "The Heritability of Cognitive Ability Among Blacks and Whites in the U.S. (By Test Type)"); summary(AMeta)
## Review:     The Heritability of Cognitive Ability Among Blacks and Whites in ...
## 
##                                   COR            95%-CI %W(random)  Test.Types
## Scarr-Salapatek (1971)         0.5292 [-0.5968; 1.6551]        1.3     IQ or g
## Beaver et al. (2013)           0.6928 [ 0.0428; 1.3429]        4.0     IQ or g
## Hodges (1976)                  0.5831 [-0.2250; 1.3912]        2.6     IQ or g
## Osborne (1980)                 0.7348 [-0.0491; 1.5188]        2.8     IQ or g
## Scarr (1981)                   0.7280 [-0.2120; 1.6680]        1.9     IQ or g
## Scarr et al. (1993)            0.6403 [-0.5993; 1.8799]        1.1     IQ or g
## Beaver et al. (2013)           0.7211 [ 0.1667; 1.2755]        5.5     IQ or g
## Rhemtulla & Tucker-Drob (2012) 0.6083 [-0.0418; 1.2583]        4.0 Achievement
## Hart et al. (2013)             0.8944 [ 0.2746; 1.5142]        4.4 Achievement
## Woodley of Menie et al. (2015) 0.3742 [-0.2759; 1.0242]        4.0     IQ or g
## Figlio et al. (2017)           0.7550 [ 0.3167; 1.1932]        8.8 Achievement
## Figlio et al. (2017)           0.7681 [ 0.2496; 1.2867]        6.3 Achievement
## Mollon et al. (2018)           0.8485 [ 0.3300; 1.3671]        6.3     IQ or g
## Engelhardt et al. (2019)       0.6782 [ 0.0584; 1.2980]        4.4     IQ or g
## Pesta et al. (2019)            0.8185 [ 0.1685; 1.4686]        4.0     IQ or g
## Pesta et al. (2019)            0.7810 [-0.4272; 1.9892]        1.2     IQ or g
## Scarr-Salapatek (1971)         0.5568 [-0.4987; 1.6122]        1.5     IQ or g
## Beaver et al. (2013)           0.7141 [ 0.0943; 1.3339]        4.4     IQ or g
## Hodges (1976)                  0.4472 [-0.5130; 1.4074]        1.8     IQ or g
## Osborne (1980)                 0.7681 [-0.2874; 1.8236]        1.5     IQ or g
## Scarr (1981)                   0.6928 [-0.2674; 1.6530]        1.8     IQ or g
## Scarr et al. (1993)            0.7211 [-0.5790; 2.0212]        1.0     IQ or g
## Beaver et al. (2013)           0.6708 [-0.2057; 1.5473]        2.2     IQ or g
## Hart et al. (2013)             0.9434 [ 0.1594; 1.7274]        2.8 Achievement
## Woodley of Menie et al. (2015) 0.9695 [-0.8375; 2.7765]        0.5     IQ or g
## Figlio et al. (2017)           0.7483 [ 0.1603; 1.3363]        4.9 Achievement
## Figlio et al. (2017)           0.6928 [ 0.0428; 1.3429]        4.0 Achievement
## Mollon et al. (2018)           0.7810 [ 0.0477; 1.5144]        3.1     IQ or g
## Engelhardt et al. (2019)       0.3606 [-0.7130; 1.4341]        1.5     IQ or g
## Pesta et al. (2019)            0.9000 [ 0.3120; 1.4880]        4.9     IQ or g
## Pesta et al. (2019)            0.7874 [-0.2681; 1.8429]        1.5     IQ or g
## 
## Number of studies combined: k = 31
## 
##                         COR           95%-CI     t  p-value
## Random effects model 0.7246 [0.6763; 0.7729] 30.64 < 0.0001
## 
## Quantifying heterogeneity:
##  tau^2 = 0; tau = 0; I^2 = 0.0% [0.0%; 40.2%]; H = 1.00 [1.00; 1.29]
## 
## Test of heterogeneity:
##     Q d.f. p-value
##  3.81   30  1.0000
## 
## Results for subgroups (random effects model):
##                            k    COR           95%-CI tau^2 tau    Q  I^2
## Test.Types = IQ or g      24 0.7028 [0.6415; 0.7640]     0   0 2.97 0.0%
## Test.Types = Achievement   7 0.7648 [0.6752; 0.8544]     0   0 0.64 0.0%
## 
## Test for subgroup differences (random effects model):
##                     Q d.f. p-value
## Between groups   1.74    1  0.1873
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Hartung-Knapp adjustment for random effects model
## - Untransformed correlations

Next, we can calculate the amount of power and the minimum detectable effect. Let's assume an \(I^2\) equal to the upper bound from the total meta-analysis, 0.40. We need to use an even number for the study size, so let's reduce study_size by 1, since increasing it would overestimate our power, if only marginally. Let's also set k equal to 15, since there were 15 studies for the Black group.

BWPower <- subgroup_power(n_groups = 2,
                          effect_sizes = c(0.7183, 0.7350),
                          study_size = 464,
                          k = 15,
                          i2 = 0.4,
                          es_type = "r"); BWPower; plot_subgroup_power(BWPower)
## 
##  Power Analysis for Subgroup analysis: 
## 
##  Effect Size Metric:                r 
##  Number of Subgroups:               2 
##  Groups:                            
##  Expected Effect Sizes:             0.904124 0.9395164 
##  Expected Study Size:               464 
##  Expected Number of Studies:        15 
## 
##  Esimated Power to detect subgroup differences 
## 
##  Fixed-Effects Model:               0.3112391 
##  Random-Effects Model (i2 = 40%):   0.2060019

The power to detect differences is quite low! How low? One way to tell is to find the smallest number of studies, at the median N used above, for which power exceeds 0.80, and subtract the 15 we have. That is

BWPower$subgroup_power_range$k_v[which.max(BWPower$subgroup_power_range$fixed_power_b > 0.80)] - 15
## [1] 41
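
The indexing idiom used above relies on which.max() returning the position of the first TRUE in a logical vector. The sketch below shows it on made-up vectors; k_v and power_v here are hypothetical stand-ins, not the contents of BWPower.

```r
# Hypothetical power curve: number of studies and the power at each
k_v     <- 2:8
power_v <- c(0.20, 0.35, 0.55, 0.70, 0.79, 0.81, 0.90)

# which.max() on a logical vector gives the index of the first TRUE.
# Guard with any(): if nothing exceeds the threshold, which.max() returns 1,
# which would silently point at the wrong element.
if (any(power_v > 0.80)) {
  k_needed <- k_v[which.max(power_v > 0.80)]
} else {
  k_needed <- NA_integer_
}

# Extra studies needed beyond a hypothetical current count of 5
extra <- k_needed - 5
print(c(k_needed = k_needed, extra = extra))
```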

So, to have 80% power with the observed differences, in a fixed effects model (which is appropriate given \(I^2\)), we would need 41 more studies in the meta-analysis. The estimand for that robustness check is the ability to reliably detect the small observed effect. Another way to do this is to calculate the minimum detectable effect explicitly. The estimand for this analysis is how large an effect we can discover with our data. With \(I^2\) still equal to 0.40, if we assume the first group has an r of 0.7183, we achieve 80% power with an r difference of +0.03117 or -0.03434. For 80% power in a random effects model, this would be +0.03966 or -0.04495. These are very small differences, and they translate to heritability differences of

0.03117^2
## [1] 0.0009715689
0.03434^2
## [1] 0.001179236
0.03966^2
## [1] 0.001572916
0.04495^2
## [1] 0.002020502

But this is naive. Because squaring is nonlinear (it is convex, the property behind Jensen’s inequality), you need to add these values to the unsquared heritability values and then square the sums to get the real extent of the effect; for example

0.707^2 + 0.005^2 == 0.712^2
## [1] FALSE

But

(0.707+0.005)^2 == 0.712^2
## [1] TRUE
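
The conversion can be wrapped in a small helper. This is a sketch; h2_diff is a hypothetical name introduced here for illustration. It also makes the algebra explicit: \((r_0 + \delta)^2 - r_0^2 = 2 r_0 \delta + \delta^2\), so the naive \(\delta^2\) misses the cross term \(2 r_0 \delta\), which dominates when \(\delta\) is small.

```r
# Difference in h^2 implied by shifting sqrt(h^2) from r0 to r0 + delta
h2_diff <- function(r0, delta) (r0 + delta)^2 - r0^2

r0    <- 0.7183
delta <- 0.03117

d_full  <- h2_diff(r0, delta)        # correct difference on the h^2 scale
d_naive <- delta^2                   # what squaring delta alone gives
d_cross <- 2 * r0 * delta + delta^2  # expanded form of the same quantity

print(c(full = d_full, naive = d_naive))

# Compare with a tolerance rather than ==, to avoid floating-point surprises
print(isTRUE(all.equal(d_full, d_cross)))
```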

So, the real detectable difference is

abs(0.7183^2-(0.7183+0.03117)^2)
## [1] 0.04575039
0.7183^2-(0.7183-0.03434)^2
## [1] 0.04815361
abs(0.7183^2-(0.7183+0.03966)^2)
## [1] 0.05854847
0.7183^2-(0.7183-0.04495)^2
## [1] 0.06255467

There was seemingly plenty of power to detect rather modest heritability differences. The minimum detectable effects were all on the order of \(|r|\) = 0.03-0.05. It’s rather easy to do this. Because of that, it might be interesting to see what we would be able to detect if k = 10, the N per study was 200, and \(h^2\) and \(I^2\) were 0.50. With the size difference for the correlation from the analysis:

PowerTest <- subgroup_power(n_groups = 2,
                          effect_sizes = c(0.707, 0.7237),
                          study_size = 200,
                          k = 10,
                          i2 = 0.5,
                          es_type = "r"); PowerTest; plot_subgroup_power(PowerTest)
## 
##  Power Analysis for Subgroup analysis: 
## 
##  Effect Size Metric:                r 
##  Number of Subgroups:               2 
##  Groups:                            
##  Expected Effect Sizes:             0.8811601 0.9153706 
##  Expected Study Size:               200 
##  Expected Number of Studies:        10 
## 
##  Esimated Power to detect subgroup differences 
## 
##  Fixed-Effects Model:               0.1171319 
##  Random-Effects Model (i2 = 50%):   0.08309647

But, to achieve 80% power, the difference only needs to increase to +0.05809 or -0.06951 for fixed effects and +0.07908 or -0.10191 for random effects.

subgroup_power(n_groups = 2,
                          effect_sizes = c(0.707, 0.76509),
                          study_size = 200,
                          k = 10,
                          i2 = 0.5,
                          es_type = "r")
## 
##  Power Analysis for Subgroup analysis: 
## 
##  Effect Size Metric:                r 
##  Number of Subgroups:               2 
##  Groups:                            
##  Expected Effect Sizes:             0.8811601 1.008377 
##  Expected Study Size:               200 
##  Expected Number of Studies:        10 
## 
##  Esimated Power to detect subgroup differences 
## 
##  Fixed-Effects Model:               0.8000241 
##  Random-Effects Model (i2 = 50%):   0.5084634
subgroup_power(n_groups = 2,
                          effect_sizes = c(0.707, 0.63749),
                          study_size = 200,
                          k = 10,
                          i2 = 0.5,
                          es_type = "r")
## 
##  Power Analysis for Subgroup analysis: 
## 
##  Effect Size Metric:                r 
##  Number of Subgroups:               2 
##  Groups:                            
##  Expected Effect Sizes:             0.8811601 0.7539339 
##  Expected Study Size:               200 
##  Expected Number of Studies:        10 
## 
##  Esimated Power to detect subgroup differences 
## 
##  Fixed-Effects Model:               0.8000794 
##  Random-Effects Model (i2 = 50%):   0.5085191
subgroup_power(n_groups = 2,
                          effect_sizes = c(0.707, 0.78608),
                          study_size = 200,
                          k = 10,
                          i2 = 0.5,
                          es_type = "r")
## 
##  Power Analysis for Subgroup analysis: 
## 
##  Effect Size Metric:                r 
##  Number of Subgroups:               2 
##  Groups:                            
##  Expected Effect Sizes:             0.8811601 1.061088 
##  Expected Study Size:               200 
##  Expected Number of Studies:        10 
## 
##  Esimated Power to detect subgroup differences 
## 
##  Fixed-Effects Model:               0.9773868 
##  Random-Effects Model (i2 = 50%):   0.8000934
subgroup_power(n_groups = 2,
                          effect_sizes = c(0.707, 0.60509),
                          study_size = 200,
                          k = 10,
                          i2 = 0.5,
                          es_type = "r")
## 
##  Power Analysis for Subgroup analysis: 
## 
##  Effect Size Metric:                r 
##  Number of Subgroups:               2 
##  Groups:                            
##  Expected Effect Sizes:             0.8811601 0.7011386 
##  Expected Study Size:               200 
##  Expected Number of Studies:        10 
## 
##  Esimated Power to detect subgroup differences 
## 
##  Fixed-Effects Model:               0.9774969 
##  Random-Effects Model (i2 = 50%):   0.8004997

Or

abs(0.7183^2-(0.7183+0.05809)^2)
## [1] 0.08682654
0.7183^2-(0.7183-0.06951)^2
## [1] 0.09502643
abs(0.7183^2-(0.7183+0.07908)^2)
## [1] 0.11986
0.7183^2-(0.7183-0.10191)^2
## [1] 0.1360183

So, even with far fewer and smaller studies, the detectable differences in heritability are modest if 80% power is considered sufficient.

Another way to consider heritability meta-analyses is to leave heritability squared and treat it directly as a correlation coefficient. This was done by Dochtermann et al. (2019). While this seems inconsistent with heritability being a variance term, it is nonetheless what they did and what some might consider more appropriate. It would certainly add at least a modicum of inferential confidence to see two ways of considering heritability converge on similar results. The proof that Dochtermann et al. (2019) Fisher-transformed heritability values can be found where their data are hosted (https://archive.ph/ePtUE). Consider rows one and two of their sheet. The heritability values are 0.68 and 0.27. Transformed, they exactly match the Zr values in the same rows. See:

atanh(c(0.68, 0.27))
## [1] 0.8291140 0.2768638

So we will do this, too.

data$ZR = atanh(data$A)
data$ZRSE = atanh(data$ASE)
AMeta <- metagen(data = data,
                 TE = ZR,
                 seTE = ZRSE,
                 studlab = Study,
                 sm = "ZCOR",
                 fixed = F,
                 random = T,
                 method.tau = "HS", 
                 hakn = T,
                 backtransf = T,
                 title = "The Heritability of Cognitive Ability Among Blacks and Whites in the U.S."); summary(AMeta)
## Review:     The Heritability of Cognitive Ability Among Blacks and Whites in ...
## 
##                                   COR            95%-CI %W(random)
## Scarr-Salapatek (1971)         0.2800 [-0.3664; 0.7441]        1.5
## Beaver et al. (2013)           0.4800 [ 0.2973; 0.6288]        4.3
## Hodges (1976)                  0.3400 [ 0.0176; 0.5983]        3.3
## Osborne (1980)                 0.5400 [ 0.2801; 0.7261]        3.5
## Scarr (1981)                   0.5300 [ 0.1304; 0.7815]        2.5
## Scarr et al. (1993)            0.4100 [-0.3754; 0.8527]        1.1
## Beaver et al. (2013)           0.5200 [ 0.3963; 0.6252]        4.8
## Rhemtulla & Tucker-Drob (2012) 0.3700 [ 0.1703; 0.5405]        4.3
## Hart et al. (2013)             0.8000 [ 0.7173; 0.8605]        4.5
## Woodley of Menie et al. (2015) 0.1400 [-0.0754; 0.3429]        4.3
## Figlio et al. (2017)           0.5700 [ 0.5001; 0.6325]        5.1
## Figlio et al. (2017)           0.5900 [ 0.4932; 0.6724]        4.9
## Mollon et al. (2018)           0.7200 [ 0.6471; 0.7799]        4.9
## Engelhardt et al. (2019)       0.4600 [ 0.2919; 0.6005]        4.5
## Pesta et al. (2019)            0.6700 [ 0.5330; 0.7728]        4.3
## Pesta et al. (2019)            0.6100 [-0.0750; 0.9039]        1.2
## Scarr-Salapatek (1971)         0.3100 [-0.2586; 0.7191]        1.8
## Beaver et al. (2013)           0.5100 [ 0.3506; 0.6407]        4.5
## Hodges (1976)                  0.2000 [-0.2701; 0.5931]        2.4
## Osborne (1980)                 0.5900 [ 0.0922; 0.8518]        1.8
## Scarr (1981)                   0.4800 [ 0.0432; 0.7627]        2.4
## Scarr et al. (1993)            0.5200 [-0.3357; 0.9055]        0.9
## Beaver et al. (2013)           0.4500 [ 0.0871; 0.7074]        2.9
## Hart et al. (2013)             0.8900 [ 0.8025; 0.9400]        3.5
## Woodley of Menie et al. (2015) 0.9400 [-0.6194; 0.9996]        0.2
## Figlio et al. (2017)           0.5600 [ 0.4268; 0.6694]        4.6
## Figlio et al. (2017)           0.4800 [ 0.2973; 0.6288]        4.3
## Mollon et al. (2018)           0.6100 [ 0.4076; 0.7553]        3.8
## Engelhardt et al. (2019)       0.1300 [-0.4430; 0.6276]        1.8
## Pesta et al. (2019)            0.8100 [ 0.7399; 0.8627]        4.6
## Pesta et al. (2019)            0.6200 [ 0.1389; 0.8643]        1.8
## 
## Number of studies combined: k = 31
## 
##                         COR           95%-CI     t  p-value
## Random effects model 0.5595 [0.4821; 0.6283] 12.12 < 0.0001
## 
## Quantifying heterogeneity:
##  tau^2 = 0.0459 [0.0257; 0.1147]; tau = 0.2142 [0.1605; 0.3387]
##  I^2 = 77.4% [68.3%; 83.9%]; H = 2.10 [1.78; 2.49]
## 
## Test of heterogeneity:
##       Q d.f.  p-value
##  132.90   30 < 0.0001
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Hunter-Schmidt estimator for tau^2
## - Q-profile method for confidence interval of tau^2 and tau
## - Hartung-Knapp adjustment for random effects model
## - Fisher's z transformation of correlations
update.meta(AMeta, 
            subgroup = Race,
            tau.common = T)
## Review:     The Heritability of Cognitive Ability Among Blacks and Whites in ...
## 
## Number of studies combined: k = 31
## 
##                         COR           95%-CI     t  p-value
## Random effects model 0.5595 [0.4821; 0.6283] 12.12 < 0.0001
## 
## Quantifying heterogeneity:
##  tau^2 = 0.0459 [0.0257; 0.1147]; tau = 0.2142 [0.1605; 0.3387]
##  I^2 = 77.4% [68.3%; 83.9%]; H = 2.10 [1.78; 2.49]
## 
## Quantifying residual heterogeneity:
##  tau^2 = 0.0446 [0.0261; 0.1198]; tau = 0.2111 [0.1616; 0.3461]
##  I^2 = 77.7% [68.6%; 84.2%]; H = 2.12 [1.78; 2.51]
## 
## Test of heterogeneity:
##       Q d.f.  p-value
##  132.90   30 < 0.0001
## 
## Results for subgroups (random effects model):
##                k    COR           95%-CI  tau^2    tau     Q   I^2
## Race = White  16 0.5389 [0.4390; 0.6257] 0.0446 0.2111 71.62 79.1%
## Race = Black  15 0.5887 [0.4484; 0.7006] 0.0446 0.2111 58.44 76.0%
## 
## Test for subgroup differences (random effects model):
##                     Q d.f.  p-value
## Between groups   0.45    1   0.5034
## Within groups  130.05   29 < 0.0001
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Hunter-Schmidt estimator for tau^2 (assuming common tau^2 in subgroups)
## - Q-profile method for confidence interval of tau^2 and tau
## - Hartung-Knapp adjustment for random effects model
## - Fisher's z transformation of correlations
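
To see how the per-study intervals in the Fisher-z output arise, the sketch below rebuilds the interval for one study by hand: form a Wald interval on the z scale, then map it back to the r scale with tanh(). The inputs A = 0.28 and ASE = 0.33 are assumptions inferred from the printed output for the first row, not values read from Table 2, and the standard error is constructed the same way as in this analysis (atanh of the SE).

```r
# Assumed inputs for the first row (inferred from the printed output)
A   <- 0.28
ASE <- 0.33

z    <- atanh(A)     # effect on the Fisher-z scale
se_z <- atanh(ASE)   # SE as constructed in this analysis

# Wald interval on the z scale, then back-transform to the r scale
ci_z <- z + c(-1, 1) * qnorm(0.975) * se_z
ci_r <- tanh(ci_z)

print(round(c(estimate = tanh(z), lower = ci_r[1], upper = ci_r[2]), 4))
```

Because tanh() is nonlinear, the back-transformed interval is asymmetric around the estimate, which is why the printed intervals are not centered on the COR column.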

This approach is a lot noisier! Let's see how powerful it is given this new \(I^2\).

BWPower <- subgroup_power(n_groups = 2,
                          effect_sizes = c(0.5389, 0.5887),
                          study_size = 464,
                          k = 15,
                          i2 = 0.777,
                          es_type = "r"); BWPower; plot_subgroup_power(BWPower)
## 
##  Power Analysis for Subgroup analysis: 
## 
##  Effect Size Metric:                r 
##  Number of Subgroups:               2 
##  Groups:                            
##  Expected Effect Sizes:             0.6026041 0.6756742 
##  Expected Study Size:               464 
##  Expected Number of Studies:        15 
## 
##  Esimated Power to detect subgroup differences 
## 
##  Fixed-Effects Model:               0.8572991 
##  Random-Effects Model (i2 = 77.7%):   0.2984232

So, we have 86% power to detect a difference of this magnitude with fixed effects. This is of course not very meaningful, as it’s just a reiteration of the p value, as are all post-hoc power estimates. How large would the difference need to be to achieve 80% power with random effects? It could be +0.09372 or -0.10927 (run those values!). Since those are in heritability units already, it is apparent that we can detect quite small differences in heritability, of roughly 10 percentage points on either side of our meta-analytic mean. But we can also set \(I^2\) to 80%, \(h^2\) to 50%, k to 10, and the typical study size to 100 to go further. For fixed effects models, this results in detectable (with 80% power, at p = 0.05) changes in heritability of -0.14903 or +0.12433. For random effects, these would be a worrying -0.36027 and +0.24337. These values would be problematic for sure, but the real difference observed in the data was a small fraction of that, and the minimum detectable effects were much smaller.

Dochtermann et al. also used a robustness check in the form of setting all sample N’s to 100 for their meta-analysis to disallow overweighting any particular sample. I will attempt this in a different way, by setting all SEs equal to half of the largest SE, so 0.85/2 = 0.425.

data$NC = rep(c(0.425), 31)
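
With a constant standard error, inverse-variance weighting gives every estimate the same weight, which is the point of this robustness check. The sketch below shows this for 31 estimates; it uses only the fixed SE of 0.425 and no study data.

```r
# 31 estimates, all assigned the same standard error
se <- rep(0.425, 31)

# Inverse-variance weights, expressed as percentages
w   <- 1 / se^2
pct <- 100 * w / sum(w)

# Every weight equals 100 / 31
print(round(unique(round(pct, 6)), 1))
```

Adding a common between-study variance to every term would not change this, since the weights would still all be identical.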

AMeta <- metagen(data = data,
                 TE = ZR,
                 seTE = NC,
                 studlab = Study,
                 sm = "ZCOR",
                 fixed = F,
                 random = T,
                 method.tau = "HS", 
                 hakn = T,
                 backtransf = T,
                 title = "The Heritability of Cognitive Ability Among Blacks and Whites in the U.S."); summary(AMeta)
## Review:     The Heritability of Cognitive Ability Among Blacks and Whites in ...
## 
##                                   COR            95%-CI %W(random)
## Scarr-Salapatek (1971)         0.2800 [-0.4970; 0.8078]        3.2
## Beaver et al. (2013)           0.4800 [-0.3004; 0.8755]        3.2
## Hodges (1976)                  0.3400 [-0.4454; 0.8297]        3.2
## Osborne (1980)                 0.5400 [-0.2249; 0.8931]        3.2
## Scarr (1981)                   0.5300 [-0.2382; 0.8903]        3.2
## Scarr et al. (1993)            0.4100 [-0.3777; 0.8534]        3.2
## Beaver et al. (2013)           0.5200 [-0.2512; 0.8874]        3.2
## Rhemtulla & Tucker-Drob (2012) 0.3700 [-0.4174; 0.8401]        3.2
## Hart et al. (2013)             0.8000 [ 0.2596; 0.9589]        3.2
## Woodley of Menie et al. (2015) 0.1400 [-0.5993; 0.7504]        3.2
## Figlio et al. (2017)           0.5700 [-0.1834; 0.9016]        3.2
## Figlio et al. (2017)           0.5900 [-0.1541; 0.9071]        3.2
## Mollon et al. (2018)           0.7200 [ 0.0745; 0.9403]        3.2
## Engelhardt et al. (2019)       0.4600 [-0.3236; 0.8693]        3.2
## Pesta et al. (2019)            0.6700 [-0.0222; 0.9280]        3.2
## Pesta et al. (2019)            0.6100 [-0.1234; 0.9124]        3.2
## Scarr-Salapatek (1971)         0.3100 [-0.4718; 0.8189]        3.2
## Beaver et al. (2013)           0.5100 [-0.2639; 0.8844]        3.2
## Hodges (1976)                  0.2000 [-0.5582; 0.7762]        3.2
## Osborne (1980)                 0.5900 [-0.1541; 0.9071]        3.2
## Scarr (1981)                   0.4800 [-0.3004; 0.8755]        3.2
## Scarr et al. (1993)            0.5200 [-0.2512; 0.8874]        3.2
## Beaver et al. (2013)           0.4500 [-0.3349; 0.8662]        3.2
## Hart et al. (2013)             0.8900 [ 0.5291; 0.9782]        3.2
## Woodley of Menie et al. (2015) 0.9400 [ 0.7188; 0.9884]        3.2
## Figlio et al. (2017)           0.5600 [-0.1975; 0.8988]        3.2
## Figlio et al. (2017)           0.4800 [-0.3004; 0.8755]        3.2
## Mollon et al. (2018)           0.6100 [-0.1234; 0.9124]        3.2
## Engelhardt et al. (2019)       0.1300 [-0.6058; 0.7459]        3.2
## Pesta et al. (2019)            0.8100 [ 0.2859; 0.9611]        3.2
## Pesta et al. (2019)            0.6200 [-0.1076; 0.9151]        3.2
## 
## Number of studies combined: k = 31
## 
##                         COR           95%-CI     t  p-value
## Random effects model 0.5597 [0.4663; 0.6407] 10.17 < 0.0001
## 
## Quantifying heterogeneity:
##  tau^2 = 0 [0.0000; 0.0337]; tau = 0 [0.0000; 0.1836]
##  I^2 = 0.0% [0.0%; 40.2%]; H = 1.00 [1.00; 1.29]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  19.93   30  0.9184
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Hunter-Schmidt estimator for tau^2
## - Q-profile method for confidence interval of tau^2 and tau
## - Hartung-Knapp adjustment for random effects model
## - Fisher's z transformation of correlations
update.meta(AMeta, 
            subgroup = Race,
            tau.common = T)
## Review:     The Heritability of Cognitive Ability Among Blacks and Whites in ...
## 
## Number of studies combined: k = 31
## 
##                         COR           95%-CI     t  p-value
## Random effects model 0.5597 [0.4663; 0.6407] 10.17 < 0.0001
## 
## Quantifying heterogeneity:
##  tau^2 = 0 [0.0000; 0.0337]; tau = 0 [0.0000; 0.1836]
##  I^2 = 0.0% [0.0%; 40.2%]; H = 1.00 [1.00; 1.29]
## 
## Quantifying residual heterogeneity:
##  tau^2 = 0 [0.0000; 0.0376]; tau = 0 [0.0000; 0.1940]
##  I^2 = 0.0% [0.0%; 40.8%]; H = 1.00 [1.00; 1.30]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  19.93   30  0.9184
## 
## Results for subgroups (random effects model):
##                k    COR           95%-CI tau^2 tau     Q  I^2
## Race = White  16 0.5212 [0.4224; 0.6078]     0   0  4.75 0.0%
## Race = Black  15 0.5982 [0.4216; 0.7311]     0   0 14.64 4.4%
## 
## Test for subgroup differences (random effects model):
##                    Q d.f. p-value
## Between groups  0.78    1  0.3771
## Within groups  19.39   29  0.9110
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Hunter-Schmidt estimator for tau^2 (assuming common tau^2 in subgroups)
## - Q-profile method for confidence interval of tau^2 and tau
## - Hartung-Knapp adjustment for random effects model
## - Fisher's z transformation of correlations
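
The between-groups test can be approximated from the printed subgroup rows alone. This is a rough sketch, not the computation meta performs internally: it assumes each subgroup interval is t-based with k - 1 degrees of freedom, recovers the standard errors from the interval widths on the z scale, and compares the two estimates with a one-degree-of-freedom chi-square test. All object names are illustrative.

```r
# Subgroup estimates and 95% CIs copied from the printed output
sub <- data.frame(
  race  = c("White", "Black"),
  k     = c(16, 15),
  r     = c(0.5212, 0.5982),
  lower = c(0.4224, 0.4216),
  upper = c(0.6078, 0.7311)
)

# Move to Fisher's z and recover each standard error from the CI width
z    <- atanh(sub$r)
se_z <- (atanh(sub$upper) - atanh(sub$lower)) / (2 * qt(0.975, df = sub$k - 1))

# Difference between subgroups and its standard error
z_diff  <- z[2] - z[1]
se_diff <- sqrt(sum(se_z^2))

# Chi-square statistic with 1 d.f. for the subgroup difference
q_between <- (z_diff / se_diff)^2
p_between <- pchisq(q_between, df = 1, lower.tail = FALSE)
round(c(Q = q_between, p = p_between), 3)
```

The result can be compared against the "Between groups" row printed above; small discrepancies are expected because the inputs are rounded.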

They also performed a Bayesian Beta regression because Zr is not normally distributed. This is essentially a mixed-effects meta-analysis. Note that the moderation effects are fairly dubious. For example, “Study” likely proxies to some extent for age, which is known to moderate heritability (see ‘the Wilson effect’).

APrior <- get_prior(A + 0.001 ~ 1 + (1|Study) + (1|Test.Types) + (1|Analysis.Type) + (1|Race), data = data, family = Beta)
AMCMC <- brm(A + 0.001 ~ 1 + (1|Study) + (1|Test.Types) + (1|Analysis.Type) + (1|Race), data = data, family = Beta, prior = APrior, 
             control = list(adapt_delta = 0.99), 
             chains = 32, cores = 16, iter = 10000, warmup = 2500, 
             seed = 1)
## Compiling Stan program...
## Start sampling
plot(AMCMC)

AX <- as_draws_df(AMCMC, variable = "b_Intercept")
summary(AMCMC)
##  Family: beta 
##   Links: mu = logit; phi = identity 
## Formula: A + 0.001 ~ 1 + (1 | Study) + (1 | Test.Types) + (1 | Analysis.Type) + (1 | Race) 
##    Data: data (Number of observations: 31) 
##   Draws: 32 chains, each with iter = 10000; warmup = 2500; thin = 1;
##          total post-warmup draws = 240000
## 
## Group-Level Effects: 
## ~Analysis.Type (Number of levels: 4) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.48      0.53     0.01     1.90 1.00    93394   117833
## 
## ~Race (Number of levels: 2) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.90      1.11     0.02     3.91 1.00    80093   106548
## 
## ~Study (Number of levels: 13) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.49      0.28     0.04     1.12 1.00    58978    75587
## 
## ~Test.Types (Number of levels: 2) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     1.15      1.22     0.04     4.37 1.00    98471   104835
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     0.16      1.16    -2.28     2.57 1.00   110642   116255
## 
## Family Specific Parameters: 
##     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## phi     6.88      2.09     3.63    11.71 1.00   105574   160641
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
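
The summary reports the mean on the logit scale and the precision phi on the identity scale, so it can help to translate them into an ordinary Beta distribution. The sketch below assumes the usual mean-precision parameterization (shape1 = mu * phi, shape2 = (1 - mu) * phi) and sets every group-level intercept to zero, so it describes a "typical" study rather than the full model. It uses only the point estimates printed above and ignores posterior uncertainty.

```r
# Point estimates copied from the printed summary
eta <- 0.16   # population-level intercept, logit scale
phi <- 6.88   # Beta precision parameter

mu <- plogis(eta)            # inverse logit: mean on the 0-1 scale

# Mean-precision parameterization of the Beta distribution
shape1 <- mu * phi
shape2 <- (1 - mu) * phi

# Spread implied for a single observation at this mean and precision
implied_sd <- sqrt(mu * (1 - mu) / (1 + phi))
implied_95 <- qbeta(c(0.025, 0.975), shape1, shape2)

round(c(mu = mu, shape1 = shape1, shape2 = shape2, sd = implied_sd), 3)
round(implied_95, 3)
```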
posterior.mode(as.mcmc(plogis(AX$b_Intercept))); tanh(posterior.mode(as.mcmc(plogis(AX$b_Intercept))))
##      var1 
## 0.5365338
##      var1 
## 0.4903597
HPDinterval(as.mcmc(plogis(AX$b_Intercept)))
##         lower     upper
## var1 0.120232 0.9533259
## attr(,"Probability")
## [1] 0.95
plot(as.mcmc(plogis(AX$b_Intercept)))
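
For readers without the fitted model, the mode and highest-density interval can be illustrated in base R. The draws below are stand-ins simulated from a normal approximation to the intercept (mean 0.16, SD 1.16, taken from the summary), which is an assumption and not the actual posterior, so the numbers will not match the output above. The helper `hpd_interval` is a hypothetical name for a shortest-interval routine.

```r
set.seed(1)

# Stand-in draws on the logit scale (assumed normal approximation)
logit_draws <- rnorm(1e5, mean = 0.16, sd = 1.16)
h2_draws    <- plogis(logit_draws)   # transform each draw to the 0-1 scale

# Posterior mode estimated as the peak of a kernel density
dens      <- density(h2_draws, from = 0, to = 1)
post_mode <- dens$x[which.max(dens$y)]

# Shortest interval containing roughly `prob` of the draws
hpd_interval <- function(x, prob = 0.95) {
  x <- sort(x)
  n <- length(x)
  m <- floor(prob * n)                       # index gap spanned by the interval
  widths <- x[(m + 1):n] - x[1:(n - m)]      # width of every candidate interval
  i <- which.min(widths)
  c(lower = x[i], upper = x[i + m])
}

hpd      <- hpd_interval(h2_draws)
coverage <- mean(h2_draws >= hpd["lower"] & h2_draws <= hpd["upper"])
```

Transforming each draw before summarizing matters here: the interval is computed on the heritability scale, not by back-transforming a logit-scale interval.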

Conclusion

It has been claimed that this meta-analysis was underpowered to support negative conclusions about differences in heritability between racial groups in the United States. This claim has not been supported quantitatively, so I sought to assess whether it holds. At least for the Black-White comparison, which was the primary focus of the paper and almost the only comparison for which generalizations were made, the “low power” claims appear to be a matter of taste rather than of practical importance.

Depending on the method used to meta-analyze heritability, the size of the difference the meta-analysis had 80% power to detect was between 4% and 11%. The observed differences were well within this narrow range, which makes it difficult to regard them as practically concerning. For them to be of practical concern, one would have to suggest that, say, there are strong nonlinearities in the relationships between variances and mean differences, extreme sampling error, or some other artifact haunting the analysis. For group differences to be caused by something within the small region in which effects could be detected would imply that this small percentage has extremely outsized effects, and I do not believe anyone is able to support that conclusion. Since these claims are variously implausible or as yet without evidence, I will not consider them any further; without any reason to believe them, neither should anyone else.
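
The idea of "the difference detectable with 80% power" can be made concrete with a generic two-sided z-test. This is a sketch under stated assumptions, not the procedure used for the figures above: the function names are hypothetical, and the standard error of the difference is an invented value chosen only for illustration.

```r
# Smallest true difference detectable with the requested power in a
# two-sided z-test, given the standard error of the difference
min_detectable_diff <- function(se_diff, power = 0.80, alpha = 0.05) {
  (qnorm(1 - alpha / 2) + qnorm(power)) * se_diff
}

# Power of the same two-sided z-test for a given true difference
power_for_diff <- function(true_diff, se_diff, alpha = 0.05) {
  crit  <- qnorm(1 - alpha / 2)
  shift <- true_diff / se_diff
  pnorm(shift - crit) + pnorm(-shift - crit)
}

se_diff_example <- 0.03   # hypothetical value, not taken from the analysis
mdd          <- min_detectable_diff(se_diff_example)
power_at_mdd <- power_for_diff(mdd, se_diff_example)
round(c(mdd = mdd, power = power_at_mdd), 3)
```

The detectable difference scales linearly with the standard error, which is why the answer depends on whether heritability is analyzed as \(h^2\) or \(\sqrt{h^2}\).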

In general, if a person claims someone else’s analysis has “low power”, they ought to state exactly what that means. Unless we’re dealing with N = 3 and something like a negative claim about d = 0.2 or a positive claim about d = 50, the comment’s meaning or truth value cannot be taken for granted. “Low power” is always with respect to a given effect size. The meta-analysis I reanalyzed probably was not subject to meaningful power concerns with respect to the comparison I examined. This was true virtually regardless of how you think heritability should be treated. Even when I reran the meta-analysis exactly as Dochtermann et al. did, the results were substantially the same. Saying something has “low power” without a sufficiently rigorous examination of the analysis’s power is good grounds to dismiss someone’s subsequent views.
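
The dependence of power on the assumed effect size is easy to see with base R's `power.t.test`. The sketch below uses the extreme case mentioned above, three observations per group, and treats it as a two-sample t-test, which is an assumption about what "N = 3" means; the intermediate effect size of 0.8 is added purely for illustration.

```r
# Effect sizes (Cohen's d) to evaluate; 0.2 and 50 come from the text,
# 0.8 is an illustrative middle value
d_values <- c(0.2, 0.8, 50)

# Power of a two-sided, two-sample t-test with n = 3 per group
power_n3 <- sapply(d_values, function(d) {
  power.t.test(n = 3, delta = d, sd = 1, sig.level = 0.05)$power
})

data.frame(d = d_values, power = round(power_n3, 3))
```

The same design is hopeless for one effect size and more than adequate for another, which is the sense in which a bare "low power" claim is incomplete.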

As a final note, semantic disagreements over labels should be quantified, not only in terms of how they would affect the analysis, but also by the degree of correspondence, or lack thereof, between labels and, say, self-identification, ancestry, education level, or whatever is relevant. Semantic disagreements without this sort of quantification are basically useless.

References

Pesta, B. J., Kirkegaard, E. O. W., Nijenhuis, J. te, Lasker, J., & Fuerst, J. G. R. (2020). Racial and ethnic group differences in the heritability of intelligence: A systematic review and meta-analysis. Intelligence, 78, 1–12. https://doi.org/10.1016/j.intell.2019.101408

Griffin, J. W. (2021). Calculating statistical power for meta-analysis using metapower. The Quantitative Methods for Psychology, 17(1), 24–39. https://doi.org/10.20982/tqmp.17.1.p024

Dochtermann, N. A., Schwab, T., Anderson Berdal, M., Dalos, J., & Royaute, R. (2019). The heritability of behavior: A meta-analysis. Journal of Heredity, 110(4), 403–410. https://doi.org/10.1093/jhered/esz023