Setup

library(pacman); p_load(DT, sandwich, lmtest, pwr, InteractionPoweR)

CRITR <- function(n, alpha = .05, rnd = 3) {
  df <- n - 2; CRITT <- qt(alpha/2, df, lower.tail = F)
  CRITR <- sqrt((CRITT^2)/((CRITT^2) + df ))
  return(round(CRITR, rnd))}

CRITR(98) #Full Sample
## [1] 0.199
CRITR(30) #mid-Pleistocene
## [1] 0.361
CRITR(14) #Neanderthal
## [1] 0.532
CRITR(54) #Pleistocene
## [1] 0.268
CRITR(32) #mid-Pleistocene
## [1] 0.349
CRITR(19) #Neanderthal
## [1] 0.456
CRITR(47) #Pleistocene
## [1] 0.288
datatable(data, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 3)))

Background

Will, Krapp, Stock & Manica (2021) tested relationships between various environmental variables, body size, and brain size in Homo, with data from mid-Pleistocene Homo, Neanderthal, and Pleistocene Homo sapiens. They dubbed their model “LM-TC”, or “linear model with taxonomic differences plus a climate effect”, but as they note: “[F]or the real fossil analysis we used the model with the interaction term, LM-T*C”. They provided their data in their Source Data file. Above, I have posted their data from within the Source Data, which contained data on brain size, taxonomy (i.e., type of Homo), and mean temperature and coldest quarter temperature, respectively. The brain size spreadsheets (Figs 3c-d) were filtered by the mean temperature and coldest quarter temperature ones (Figs 3a-b). This yielded a dataset with brain volumes for 30 mid-Pleistocene Homo, 14 Neanderthal, and 54 Pleistocene Homo. Though they did not provide this data in a directly usable format, they reported that they had 32, 19, and 47 of each of these groups in their supplementary materials. So, seven of the Pleistocene Homo were taken from the other groups somewhere along the way. It makes little difference (see above). Here, I am only interested in the relationship of brain size, mean temperature, and coldest quarter temperature.

The authors performed a power analysis via simulation, but this hardly justified their interpretation of many of their results because they had very low power. Their power analysis ended up being cross-validation more so than power analysis. They also discussed differences between types of Homo, which meant they discussed interaction effects, but they did not do power analyses for interactions, and as will be shown below, they had little power anyway. This should be immediately obvious given how large they expected their effects to be. For them, strong effects were those that were \(\frac{1}{4}\) of the maximum slopes, while medium was \(\frac{1}{8}\) and small was \(\frac{1}{16}\). The maximum slope is given as follows:

MaxSlope <- function(VarX, VarY, Scale = .25, rnd = 3) {
  MaxSlope  = diff(range(VarY))/diff(range(VarX))
  Intercept = ((min(VarY) + max(VarY))/2) - (Scale * MaxSlope * (min(VarX) + max(VarX))/2)
  cat(paste0("Using Will et al.'s (2021) formulae, the expected maximum slope is ", round(MaxSlope, rnd), " with an intercept of ", round(Intercept, rnd), ". \n"))}

MaxSlope(data$BrainSize, data$MeanAnnualTemperature)
## Using Will et al.'s (2021) formulae, the expected maximum slope is 45.367 with an intercept of 198.733.
45.367/c(4, 8, 16) #Large, medium, and small effects
## [1] 11.341750  5.670875  2.835437
MaxSlope(data$BrainSize, data$ColdestQuarterTemperature)
## Using Will et al.'s (2021) formulae, the expected maximum slope is 69.215 with an intercept of 145.924.
69.215/c(4, 8, 16)
## [1] 17.303750  8.651875  4.325938

With these maximum slopes in mind, what do the actual regressions look like?

summary(lm(BrainSize ~ MeanAnnualTemperature, data))
## 
## Call:
## lm(formula = BrainSize ~ MeanAnnualTemperature, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.47190 -0.11168  0.00025  0.11523  0.32998 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            8.402939   0.546523   15.38   <2e-16 ***
## MeanAnnualTemperature -0.004228   0.001948   -2.17   0.0325 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1549 on 96 degrees of freedom
## Multiple R-squared:  0.04677,    Adjusted R-squared:  0.03684 
## F-statistic:  4.71 on 1 and 96 DF,  p-value: 0.03246
summary(lm(BrainSize ~ ColdestQuarterTemperature, data))
## 
## Call:
## lm(formula = BrainSize ~ ColdestQuarterTemperature, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.48624 -0.10486  0.00798  0.10997  0.33545 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                7.796094   0.366507  21.271   <2e-16 ***
## ColdestQuarterTemperature -0.002134   0.001350  -1.581    0.117    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1567 on 96 degrees of freedom
## Multiple R-squared:  0.02536,    Adjusted R-squared:  0.01521 
## F-statistic: 2.498 on 1 and 96 DF,  p-value: 0.1173

And when standardized? (Correlations!)

summary(lm(scale(BrainSize) ~ scale(MeanAnnualTemperature), data))
## 
## Call:
## lm(formula = scale(BrainSize) ~ scale(MeanAnnualTemperature), 
##     data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9894 -0.7075  0.0016  0.7300  2.0904 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                   2.852e-15  9.914e-02    0.00   1.0000  
## scale(MeanAnnualTemperature) -2.163e-01  9.965e-02   -2.17   0.0325 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9814 on 96 degrees of freedom
## Multiple R-squared:  0.04677,    Adjusted R-squared:  0.03684 
## F-statistic:  4.71 on 1 and 96 DF,  p-value: 0.03246
summary(lm(scale(BrainSize) ~ scale(ColdestQuarterTemperature), data))
## 
## Call:
## lm(formula = scale(BrainSize) ~ scale(ColdestQuarterTemperature), 
##     data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3.08026 -0.66424  0.05058  0.69662  2.12505 
## 
## Coefficients:
##                                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)                       2.344e-15  1.002e-01   0.000    1.000
## scale(ColdestQuarterTemperature) -1.593e-01  1.008e-01  -1.581    0.117
## 
## Residual standard error: 0.9924 on 96 degrees of freedom
## Multiple R-squared:  0.02536,    Adjusted R-squared:  0.01521 
## F-statistic: 2.498 on 1 and 96 DF,  p-value: 0.1173

The smallest effect is still many times what the data had. It is outrageous to expect anything like being well-powered. As shown above, the lowest correlations that are significant with this sample size are .199 in aggregate, .361 in the mid-Pleistocene sample, .532 in the Neanderthal sample, and .268 in the Pleistocene sample. How much power does this have if we have a modest correlation of .2? We have nothing else to expect and it’s just under the critical value, so it might as well be attempted.

pwr.f2.test(u = 1, v = (98 - 2), f2 = .04/(1 - .04), sig.level = .05)
## 
##      Multiple regression power calculation 
## 
##               u = 1
##               v = 96
##              f2 = 0.04166667
##       sig.level = 0.05
##           power = 0.5162005

The power without any interactions was about what would be expected under their supplementary power analysis of a weak effect size. And what is our power to detect interactions that cancel out the relationship within a given taxonomic grouping? For extra power, lets say there are only two groups and they have different brain sizes, with one of them having .2 d larger brains (which is rpb = .1 between group and size).

power_interaction_r2(
  alpha = .05,
  N = 98,
  r.x1x2.y = -.2,
  r.x1.y = .2,
  r.x2.y = .2,
  r.x1.x2 = .1)
##         pwr
## 1 0.5155657

And what if the interaction is larger or smaller, meaning that it’s a suppressor or just has a smaller effect?

power_interaction_r2(
  alpha = .05,
  N = 98,
  r.x1x2.y = -.1,
  r.x1.y = .2,
  r.x2.y = .2,
  r.x1.x2 = .1)
##         pwr
## 1 0.1699474
power_interaction_r2(
  alpha = .05,
  N = 98,
  r.x1x2.y = -.15,
  r.x1.y = .2,
  r.x2.y = .2,
  r.x1.x2 = .1)
##         pwr
## 1 0.3227544
power_interaction_r2(
  alpha = .05,
  N = 98,
  r.x1x2.y = -.25,
  r.x1.y = .2,
  r.x2.y = .2,
  r.x1.x2 = .1)
##         pwr
## 1 0.7049273
power_interaction_r2(
  alpha = .05,
  N = 98,
  r.x1x2.y = -.3,
  r.x1.y = .2,
  r.x2.y = .2,
  r.x1.x2 = .1)
##         pwr
## 1 0.8504296

There is no way this analysis is well-powered, but that can also be tested via simulation.

set.seed(1)

power_interaction(
  n.iter = 10000,
  alpha = .05,
  N = 98,
  r.x1x2.y = -.2,
  r.x1.y = .2,
  r.x2.y = .2,
  r.x1.x2 = .2,
  k.x2 = 2,
  skew.x1 = -.67, #value for BrainSize variable
  skew.x2 = .055 #average for mean temperature and coldest quarter
)
## Warning: executing %dopar% sequentially: no parallel backend registered
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
## Performing 10000 simulations
##    N    pwr
## 1 98 0.4815

Analysis

So what do robust regressions in this data look like?

Robust Regressions

Mod <- lm(BrainSize ~ MeanAnnualTemperature, data)
VC  <- vcovHC(Mod, type = "HC5"); RM <- coeftest(Mod, vcov = VC)
RM
## 
## t test of coefficients:
## 
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            8.4029395  0.5537292 15.1752  < 2e-16 ***
## MeanAnnualTemperature -0.0042281  0.0019920 -2.1226  0.03636 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ModInt <- lm(BrainSize ~ MeanAnnualTemperature * Taxonomy, data)
VCInt  <- vcovHC(ModInt, type = "HC5"); RMInt <- coeftest(ModInt, vcov = VCInt)
RMInt
## 
## t test of coefficients:
## 
##                                                          Estimate Std. Error
## (Intercept)                                             9.0079371  1.0186847
## MeanAnnualTemperature                                  -0.0068997  0.0036129
## TaxonomyNeanderthals                                   -3.4228683  4.8617231
## TaxonomyPleistocene Homo sapiens                       -1.4433561  1.1039315
## MeanAnnualTemperature:TaxonomyNeanderthals              0.0129005  0.0175053
## MeanAnnualTemperature:TaxonomyPleistocene Homo sapiens  0.0059334  0.0039246
##                                                        t value  Pr(>|t|)    
## (Intercept)                                             8.8427 6.213e-14 ***
## MeanAnnualTemperature                                  -1.9097   0.05928 .  
## TaxonomyNeanderthals                                   -0.7040   0.48318    
## TaxonomyPleistocene Homo sapiens                       -1.3075   0.19431    
## MeanAnnualTemperature:TaxonomyNeanderthals              0.7369   0.46303    
## MeanAnnualTemperature:TaxonomyPleistocene Homo sapiens  1.5119   0.13399    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Mod <- lm(BrainSize ~ ColdestQuarterTemperature, data)
VC  <- vcovHC(Mod, type = "HC5"); RM <- coeftest(Mod, vcov = VC)
RM
## 
## t test of coefficients:
## 
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                7.7960937  0.4000646 19.4871   <2e-16 ***
## ColdestQuarterTemperature -0.0021341  0.0014857 -1.4364   0.1541    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ModInt <- lm(BrainSize ~ ColdestQuarterTemperature * Taxonomy, data)
VCInt  <- vcovHC(ModInt, type = "HC5"); RMInt <- coeftest(ModInt, vcov = VCInt)
RMInt
## 
## t test of coefficients:
## 
##                                                              Estimate
## (Intercept)                                                 7.4577489
## ColdestQuarterTemperature                                  -0.0014572
## TaxonomyNeanderthals                                       -1.2074585
## TaxonomyPleistocene Homo sapiens                           -0.1241849
## ColdestQuarterTemperature:TaxonomyNeanderthals              0.0052193
## ColdestQuarterTemperature:TaxonomyPleistocene Homo sapiens  0.0013129
##                                                            Std. Error t value
## (Intercept)                                                 0.8277184  9.0100
## ColdestQuarterTemperature                                   0.0030197 -0.4826
## TaxonomyNeanderthals                                        5.9693612 -0.2023
## TaxonomyPleistocene Homo sapiens                            0.8759134 -0.1418
## ColdestQuarterTemperature:TaxonomyNeanderthals              0.0223792  0.2332
## ColdestQuarterTemperature:TaxonomyPleistocene Homo sapiens  0.0032055  0.4096
##                                                             Pr(>|t|)    
## (Intercept)                                                2.765e-14 ***
## ColdestQuarterTemperature                                     0.6306    
## TaxonomyNeanderthals                                          0.8401    
## TaxonomyPleistocene Homo sapiens                              0.8876    
## ColdestQuarterTemperature:TaxonomyNeanderthals                0.8161    
## ColdestQuarterTemperature:TaxonomyPleistocene Homo sapiens    0.6831    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Discussion

There was a marginally significant negative relationship between mean annual temperature and brain size in this sample. There was no significant interaction by taxon. The relationship between coldest quarter temperature and brain size was negative and nonsignificant. The power to detect a modest main effect in the form of a correlation of .2 or an explained variance of .04 was around 50% with this full sample. There was likely substantially less power to detect an interaction because interactions require very large sample sizes (see, e.g., Gelman, Hill & Vehtari, 2020). Moreover, power was lower due to how the data was structured, in a way that would have precipitated an interaction with larger sample sizes. This was because there were considerable range restriction in the mid-Pleistocene Homo and Neanderthal samples in terms of both brain sizes and mean annual and coldest quarter temperatures.

It is unsurprising that the significant environmental relationships were all with variables that were uniformly less range-restricted, and it is likewise unsurprising that the aggregate relationship was capped lower than what makes sense because the relatively small-headed and much older (and thus, likely less cognitively adapted) mid-Pleistocene Homo lived almost exclusively in hotter climates, the Neanderthal, whose head sizes came in the middle of the range, lived exclusively in relatively temperate climates, and the most modern and big-headed Pleistocene Homo sapiens had the largest heads and lived in a broader range of climates. If each group was equal in size, The paucity of samples from colder climates would have still attenuated results, but in any case, the overall analysis was limited by range restriction, and this could not be handled by subsetting because the individual groups were too small. With the data as it is, it’s a wonder that Will et al.’s (2021) relationships even trended in a sensible direction (see supplementary figures 6 - 7).

Overall, the only conclusion that can be gathered from this data is that more is needed. This was not enough, and power was much too low. If there were more granularity in temperatures, that could help, but the range-restricted data available also came with lots of clumping, which unfortunately weakened it further. No negative conclusions can be gathered from this data with any reasonable degree of confidence, and this conclusion can be reached even before considering measurement error when imputing brain size from cranial size.

References

Will, M., Krapp, M., Stock, J. T., & Manica, A. (2021). Different environmental variables predict body and brain size evolution in Homo. Nature Communications, 12(1), 4116. https://doi.org/10.1038/s41467-021-24290-7

Gelman, A., Hill, J., & Vehtari, A. (2020). Regression and Other Stories (1st ed.). Cambridge University Press. https://doi.org/10.1017/9781139161879