Setup

Packages

library(pacman)
p_load(MASS, mice, psych, HH, ggplot2, readr, dplyr, magrittr, visreg, broom, purrr, tidyr, kirkegaard, DT, scales, purrr, ggpubr, jtools, huxtable, broom.mixed)

Data

Data retrieved from http://supp.apa.org/psycarticles/supplemental/ebs0000219/ebs0000219_supp.html.

AL <- read.csv("MMA_data_sexed_checked.csv")

For some reason this isn’t already in metric - or as Richardson calls it, “more rational units” - So I have to convert. I’ll also make subsets for men and women.

AL$Height <- AL$Height * 2.54; AL$Armspan <- AL$Armspan * 2.54; AL$Weight <- AL$Weight * 0.544
MAL <- subset(AL, sex == "male"); WAL <- subset(AL, sex == "female")
AL$sex %>% table2()
GroupCountPercent
male1.48e+0389.1
female181       10.9
0       0  

For some reason, Richardson writes “I validated the [sex assignment] algorithm’s accuracy by manually checking 50 randomly selected fighters from each inferred sex. Forty-four inferred females (88%) were indeed females (the 7 that weren’t were corrected)….” but 7 + 44 = 51. Nonetheless, I went through and checked everyone’s sex based on (1) recognizing their names (2) whether male or female classified fighters had feminine or masculine sounding names, respectively. The men seemed to be correctly classified, but Aleksa Camur, Takeya Mizugaki, Yuri Villefort, Tai Tuivasa, Stevie Ray, Callan Potter, Sage Northcutt, and Bristol Marunde were not. I did not do a complete check because that’s needlessly time consuming for a reanalysis, but I am reasonably confident that I have found most, if not all, of the errors. The table below will allow the data with these individuals’ proper sex to be downloaded and it will be given in metric initially rather than requiring conversion. I will perform all analyses using the provided data; the corrected data are provided below.

ALC <- read.csv("MMACorrect.csv"); ALC$Height <- ALC$Height * 2.54; ALC$Armspan <- ALC$Armspan * 2.54; ALC$Weight <- ALC$Weight *0.544
datatable(ALC, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 2)))

Motivation

Having frequently cited the advantages of arm length in bouts myself, the result by Richardson (2020) was intuitively plausible. However, the intercept in his regression was above 0.50, which signaled a problem to me (saw the paper at 12:42 and remarked about the problem at 12:45). In the real world all fights between two people that are not draws end with one winner and one loser. Therefore, the expected win percentage across all fighters is always 50%. Therefore, I sought to investigate whether there was selection by winningness into his sample. There is no means of deducing a relationship to sexual selection with this result, but that is a separate issue. The effect of this few misclassifications will presumably be quite modest.

Regardless of the reliability of the results, I disapprove of retraction per se and only believe in doing it in cases of fraud or major error, regardless of the topic or its perceived social importance.

Analysis

Replicate First

Before anything, all coefficients must be replicated in order to verify reporting accuracy. I am unsure why this is not done during peer review. Anecdotally, I always ask for data and code and attempt replication during review, but only two reviewers have ever asked me for the same. In the table below, I list in order all reported coefficients from the study.

ID Type Test Value Correct?
1 Correlation (Armspan, Height | Men) r = 0.86 Yes
2 Correlation (Armspan, Height | Women) r = 0.86 Yes
3 Correlation (AHR, Weight) r = 0.11 No
4 Correlation (AHR, Height) r = 0.06 Yes
5a Multiple Regression (Armspan ~ Height + Weight + Sex) b = 0.88 Yes
5b Multiple Regression (Armspan ~ Height + Weight + Sex) b = 0.08 Yes
5c Multiple Regression (Armspan ~ Height + Weight + Sex) b = 2.73 Yes
5d Multiple Regression (Armspan ~ Height + Weight + Sex) r^2 = 0.79 Yes
6 Multiple Regression (Win Percentage ~ Armspan * Sex + Height + Weight) p = 0.59 Yes
7 Multiple Regression (Win Percentage ~ Armspan + Sex + Height + Weight) b = 0.002
7 Multiple Regression (Win Percentage ~ Armspan + Sex + Height + Weight) b = -0.002 Yes
7 Multiple Regression (Win Percentage ~ Armspan + Sex + Height + Weight) b = 0.026 Yes
7 Multiple Regression (Win Percentage ~ Armspan + Sex + Height + Weight) b = 0.00007 Yes

Note: that the correlation between armspan and height is the same to four decimal places for men and women is extraordinary. Besides this interesting detail, there is only one errant value, the correlation between AHR and height.

An important thing to check with variables as highly correlated as height and weight are (a staggering r = 0.79) is whether there’s variance inflation. Given the value computed below, there is likely to be meaningful variance inflation. Therefore, it may be worthwhile to at the least check the result without height and weight simultaneously entered into the model (there are other ways to do this, I’m just rushing this because it’s no big deal). Note that this still leaves the interaction with sex insignificant in either case. As shown below, it’s not likely a huge effect. The results are probably reliable (not the right method, but the right ones won’t differ much).

# Correlations

cor(MAL$Armspan, MAL$Height); cor(WAL$Armspan, WAL$Height); cor(AL$AHR, AL$Weight); cor(AL$AHR, AL$Height); cor(AL$Height, AL$Weight)
## [1] 0.85622
## [1] 0.85615
## [1] 0.15805
## [1] 0.055325
## [1] 0.78729
# Regressions

L1 <- lm(Armspan ~ Height + Weight + sex, AL); summ(L1, model.info = F, digits = 5)
F(3,1656) 2118.68587
0.79331
Adj. R² 0.79294
Est. S.E. t val. p
(Intercept) 15.91722 3.24492 4.90528 0.00000
Height 0.88121 0.02234 39.44996 0.00000
Weight 0.07945 0.01031 7.70571 0.00000
sexmale 2.73361 0.41830 6.53510 0.00000
Standard errors: OLS
L2 <- lm(Win_Percentage ~ Armspan * sex + Height + Weight, AL)#; summ(L2, model.info = F, digits = 3)
L3 <- lm(Win_Percentage ~ Armspan + sex + Height + Weight, AL)#; summ(L3, model.info = F, digits = 3)
L4 <- lm(Win_Percentage ~ Armspan + sex + Height, AL)#; summ(L4, model.info = F, digits = 3)
L5 <- lm(Win_Percentage ~ Armspan + sex + Weight, AL)#; summ(L5, model.info = F, digits = 3)

export_summs(L2, L3, L4, L5, scale = F, error_format = "[{conf.low}, {conf.high}]", digits = getOption("jtools-digits", 5))
Model 1Model 2Model 3Model 4
(Intercept)0.56254 ***0.63870 ***0.65355 ***0.53729 ***
[0.25292, 0.87215]   [0.49393, 0.78347]   [0.54325, 0.76386]   [0.42326, 0.65133]   
Armspan0.00247 *  0.00200 ***0.00197 ***0.00114 ** 
[0.00046, 0.00448]   [0.00091, 0.00309]   [0.00090, 0.00304]   [0.00036, 0.00193]   
sexmale0.11491    0.02593 ** 0.02591 ** 0.02446 *  
[-0.20535, 0.43516]   [0.00717, 0.04470]   [0.00715, 0.04467]   [0.00572, 0.04321]   
Height-0.00159 *  -0.00156 *  -0.00165 *            
[-0.00297, -0.00021]   [-0.00294, -0.00019]   [-0.00291, -0.00039]             
Weight-0.00006    -0.00007              -0.00029    
[-0.00053, 0.00041]   [-0.00054, 0.00039]             [-0.00071, 0.00014]   
Armspan:sexmale-0.00052                                  
[-0.00240, 0.00135]                                 
N1660          1660          1660          1660          
R20.01894    0.01876    0.01870    0.01582    
*** p < 0.001; ** p < 0.01; * p < 0.05.
# Variance Inflation

HH::vif(L3)
## Armspan sexmale  Height  Weight 
##  4.8382  1.2549  5.4513  2.7284

How Selected is the Sample?

One way to indicate whether the sampling is biased is to check the win percentage. If it meaningfully differs from 0.50, it is clearly selected for winners (>0.50) or losers (<0.50). If the variables included in the analysis influence winning, the result could be attributed to conditioning on a collider (or other forms of selection bias).

mean(AL$Win_Percentage); mean(MAL$Win_Percentage); mean(WAL$Win_Percentage) # Mean win percentage
## [1] 0.74214
## [1] 0.74602
## [1] 0.71037
1-mean(AL$Win_Percentage); 1-mean(MAL$Win_Percentage); 1-mean(WAL$Win_Percentage) # Excess winners
## [1] 0.25786
## [1] 0.25398
## [1] 0.28963
0.5/(1-mean(AL$Win_Percentage)); MO <- 0.5/(1-mean(MAL$Win_Percentage)); WO <- 0.5/(1-mean(WAL$Win_Percentage)); MO; WO # Odds
## [1] 1.939
## [1] 1.9687
## [1] 1.7263
OR <- MO / WO; OR # Odds ratio
## [1] 1.1404
# Are inclusion odds different for men and women?

exp(1)^(log(OR) + 1.96*sqrt((1/0.5) + (1/MO) + (1/0.5) + (1/WO))) 
## [1] 94.833
exp(1)^(log(OR) - 1.96*sqrt((1/0.5) + (1/MO) + (1/0.5) + (1/WO))) 
## [1] 0.013713

There are far too many winners in this dataset; therefore, losers are excluded from the sample for some (or potentially no) reason. This does not appear to be significantly related to sex. Depending on the reason, the coefficients discovered could be affected in ways that increase or decrease the support for the hypothesis (since the idea is to relate this to the general population) that was tested or they could do nothing. If there is selection for winning and winning is affected by armspan in the expected direction, there should be attenuation. Unfortunately, the effect can only be theorized about in the absence of knowledge about the type of selection (or lack thereof) affecting the data. As an example of how this sort of thing can affect results, take the relationship between point scoring and height in the National Basketball Association (NBA).

NBA Result

Data came from https://stats.nba.com/players/bio and https://stats.nba.com/players/traditional. I used each season from 2014-2015 to replicate the finding and since 2019-2020 was cut short so I didn’t want to use it alone. I removed five people for whom there were no height data in the 2014-2015 season. I could have just facet wrapped a single df by season but I didn’t.

alpha = 0.5
NBA14 <- read.csv("NBA1415.csv")
NBA15 <- read.csv("NBA1516.csv")
NBA16 <- read.csv("NBA1617.csv")
NBA17 <- read.csv("NBA1718.csv")
NBA18 <- read.csv("NBA1819.csv")
NBA19 <- read.csv("NBA1920.csv")

seasons <- lapply(mget(paste0("NBA", 14:19)), transform,
       Height = ((HeightInch * 2.54)))

NBA14 <- seasons$NBA14; NBA15 <- seasons$NBA15; NBA16 <- seasons$NBA16; NBA17 <- seasons$NBA17; NBA18 <- seasons$NBA18; NBA19 <- seasons$NBA19
TotN = count(NBA14) + count(NBA15) + count(NBA16) + count(NBA17) + count(NBA18) + count(NBA19)
# Personal Fouls

TOTPF <- ((count(NBA14)/TotN) * cor(NBA14$Height, NBA14$PF)) + ((count(NBA15)/TotN) * cor(NBA15$Height, NBA15$PF)) + ((count(NBA16)/TotN) * cor(NBA16$Height, NBA16$PF)) + ((count(NBA17)/TotN) * cor(NBA17$Height, NBA17$PF)) + ((count(NBA18)/TotN) * cor(NBA18$Height, NBA18$PF)) + ((count(NBA19)/TotN) * cor(NBA19$Height, NBA19$PF))

TOTPF$n
## [1] 0.064535
# Blocks

TOTBLK <- ((count(NBA14)/TotN) * cor(NBA14$Height, NBA14$BLK)) + ((count(NBA15)/TotN) * cor(NBA15$Height, NBA15$BLK)) + ((count(NBA16)/TotN) * cor(NBA16$Height, NBA16$BLK)) + ((count(NBA17)/TotN) * cor(NBA17$Height, NBA17$BLK)) + ((count(NBA18)/TotN) * cor(NBA18$Height, NBA18$BLK)) + ((count(NBA19)/TotN) * cor(NBA19$Height, NBA19$BLK))

TOTBLK$n
## [1] 0.19572
# Steals

TOTSTL <- ((count(NBA14)/TotN) * cor(NBA14$Height, NBA14$STL)) + ((count(NBA15)/TotN) * cor(NBA15$Height, NBA15$STL)) + ((count(NBA16)/TotN) * cor(NBA16$Height, NBA16$STL)) + ((count(NBA17)/TotN) * cor(NBA17$Height, NBA17$STL)) + ((count(NBA18)/TotN) * cor(NBA18$Height, NBA18$STL)) + ((count(NBA19)/TotN) * cor(NBA19$Height, NBA19$STL))

TOTSTL$n
## [1] -0.10404
REG0LM <- lm(PTS ~ Height, NBA14)
REG1LM <- lm(PTS ~ Height, NBA15)
REG2LM <- lm(PTS ~ Height, NBA16)
REG3LM <- lm(PTS ~ Height, NBA17)
REG4LM <- lm(PTS ~ Height, NBA18)
REG5LM <- lm(PTS ~ Height, NBA19)
COEFS <- plot_summs(REG0LM, REG1LM, REG2LM, REG3LM, REG4LM, REG5LM, scale = F, model.names = c("2014-2015", "2015-2016", "2016-2017", "2017-2018", "2018-2019", "2019-2020"), legend.title = "Season")

RUNC <- COEFS + apatheme + labs(x = "\n Pearson Correlation \n", y = NULL) + theme(legend.position = c(0.9, 0.75))

REG0 <- NBA14 %>% ggplot(aes(Height, PTS)) + geom_point(alpha = alpha) + geom_smooth(method = 'lm', color = "lightblue") + apatheme + theme(axis.title.x = element_blank(), axis.title.y = element_blank()) 
REG1 <- NBA15 %>% ggplot(aes(Height, PTS)) + geom_point(alpha = alpha) + geom_smooth(method = 'lm', color = "orange") + apatheme + theme(axis.title.x = element_blank(), axis.title.y = element_blank())
REG2 <- NBA16 %>% ggplot(aes(Height, PTS)) + geom_point(alpha = alpha) + geom_smooth(method = 'lm', color = "#09FDFB") + apatheme + theme(axis.title.x = element_blank(), axis.title.y = element_blank())
REG3 <- NBA17 %>% ggplot(aes(Height, PTS)) + geom_point(alpha = alpha) + geom_smooth(method = 'lm', color = "violet") + apatheme + theme(axis.title.x = element_blank(), axis.title.y = element_blank())
REG4 <- NBA18 %>% ggplot(aes(Height, PTS)) + geom_point(alpha = alpha) + geom_smooth(method = 'lm', color = "blue") + apatheme + theme(axis.title.x = element_blank(), axis.title.y = element_blank())
REG5 <- NBA19 %>% ggplot(aes(Height, PTS)) + geom_point(alpha = alpha) + geom_smooth(method = 'lm', color = "yellow") + apatheme + theme(axis.title.x = element_blank(), axis.title.y = element_blank())
#png("NBAPlot.png", width = 16, height = 16, units = 'in', res=250)
ggarrange(RUNC, ggarrange(REG0, REG1, ncol = 2, labels = c("2014-2015", "2015-2016"), label.x = 0.02, font.label = list(size = 12, color = "black", face = "plain", family = 'serif')), ggarrange(REG2, REG3, ncol = 2, labels = c("2016-2017", "2017-2018"), label.x = 0.02, font.label = list(size = 12, color = "black", face = "plain", family = 'serif')), ggarrange(REG4, REG5, ncol = 2, labels = c("2018-2019", "2019-2020"), label.x = 0.02, font.label = list(size = 12, color = "black", face = "plain", family = 'serif')), nrow = 4)
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'

#dev.off()

Replication in the WNBA?

It would be interesting to get similar data for the WNBA since the gameplay dynamics are a bit different and the heights a bit shorter (in 2019, almost 6 inches/15 cm; strangely enough, the variability is quite similar). I found the relevant data from the 2019 season to assess the relationship between height and points. Two players with missing heights were removed. Data came from here: https://stats.wnba.com/players/bio.

WNBA19 <- read.csv("WNBA19.csv")
WNBA19$Height = (WNBA19$HeightInch * 2.54)
cor(WNBA19$Height, WNBA19$PTS)
## [1] 0.0065044
WNBA19 %>% ggplot(aes(Height, PTS)) + geom_point(alpha = alpha) + geom_smooth(method = 'lm', color = "orangered") + apatheme + labs(y = "Points (Per Game)", x = "Height (Centimeters)")
## `geom_smooth()` using formula 'y ~ x'

The result replicates in the 2019 WNBA season. I may check others later.

Over six seasons (the 2019-2020 season is set to resume on July 30, 2020), the relationship between height and points scored in the NBA is around zero and, in fact, is slightly negative in four of those seasons. The weighted correlation between height and personal fouls - possibly a measure of skill unrelated to height - is r = 0.06 - not much. For blocks, where one would expect taller players to perform much better, the relationship is modest at r = 0.20. For steals, where height could be especially detrimental, the relationship is not very strongly negative, at r = -0.10. One could use these as a sort of disproof that height is affecting gameplay performance in the expected directions and that selection has occurred.

Just as well, one could say that taller players practice harder to compensate for their height in steals and shorter players try harder to get good at blocks, and their overall greater practice due to their shorter stature (or, perhaps, the clumsiness of taller players) helps to explain why they have fewer fouls. But substantive explanations requiring substantive knowledge to understand might also be at play. For instance, positions affect gameplay; as an example, guards are dribbling so they’ll be the ones going for steals. Taller players may also have more players going up against them in the paint so they ring up more intentional fouls at the end of games to spare the important shooters. All of these (and more, like bench times versus points scored) confound our capabilities to make inferences regarding what selection occurred if any.

Since it is likely that height and basketball skills are associated in the general population, this example should urge caution. This result actually looks quite similar to the expected result with a dummy variable associated with two outcomes of interest. If, for instance, NBA players were 20% of the population and height and point scoring (say, ability?) correlated with NBA membership at r = 0.4 each, we might see no relationship in the NBA and the non-NBA public but in a sample of both, the relationship would not have disappeared. Notably, range restriction in the classical sense cannot explain this result because there is around 50% greater variability in height in the NBA than in the general population. The difference is that in the NBA, the heights are shifted to the right.

Discussion

I initially wanted to continue this and elaborate on types of selection biases with DAGs I put together with dagitty but I ended up redacting most of that because it seemed futile to try to propose a lot of different ways the sampling could affect the estimate when I can’t determine any of them yet; it might be useful to do that for illustrative purposes later, so if I ever get around to cleaning it up, I can post it. Stated another way, though it may be possible to use classical range restriction corrections, they are married to the Gaussian theory of errors and we do not know that the sampling bias lines up with the assumptions that entails. It could be that in the total population, armspan and winning (for all fights, not just UFC ones, to extend the example and its relevance) enjoy a minor negative correlation if height matters for fighting, for instance. This would certainly be strange and I do not think it is true, but the possibilities that the result is overstated (from the null) or incorrect (sign reversal), or even understated (it may need to be stronger) for that matter, remain on the table barring a population-representative sample, or, to at least make a bit of progress, one that at least does not show clear signs of being selected for winning - our dependent variable. Subsetting to a part of the sample with the correct win rate will not work since that can lead to its own biases and it doesn’t correct the existing bias which may be in the sample.

There are numerous other examples of selection possibly removing “obvious” relationships, some tested, some not. In the former category, IQ and conscientiousness (Murray et al., 2014), education and liberal values (various longitudinal studies showing no or small effects and, recently, a behavior genetic analysis showing a small effect), GRE scores and success (Huitema & Stein, 1993), and now, NBA heights and point scoring - though I am not the first to note this -, and in the latter, say, the WEMP and success (Dennison & Ogilvie, 2014; like that study, I could probably reasonably claim based on the NBA results that height is a negative predictor of basketball skills).

The association is weak, the sample is selected for winning, and there is no real connection between these metrics or tests and evolution clearly drawn so the result is not very convincing overall.

References

Richardson, T. (2020). Is arm length a sexually selected trait in humans? Evidence from mixed martial arts. Evolutionary Behavioral Sciences. https://doi.org/10.1037/ebs0000219

Murray, A. L., Johnson, W., McGue, M., & Iacono, W. G. (2014). How are conscientiousness and cognitive ability related to one another? A re-examination of the intelligence compensation hypothesis. Personality and Individual Differences, 70, 17-22. https://doi.org/10.1016/j.paid.2014.06.014

Huitema, B. E., & Stein, C. R. (1993). Validity of the GRE without Restriction of Range. Psychological Reports. https://doi.org/10.2466/pr0.1993.72.1.123

Dennison, T., & Ogilvie, S. (2014). Does the European Marriage Pattern Explain Economic Growth? The Journal of Economic History, 74(3), 651-693. https://doi.org/10.1017/S0022050714000564

NBA/WNBA Data

datatable(NBA14, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 2)))
datatable(NBA15, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 2)))
datatable(NBA16, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 2)))
datatable(NBA17, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 2)))
datatable(NBA18, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 2)))
datatable(NBA19, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 2)))
datatable(WNBA19, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 2)))