Setup

Packages

library(pacman)
p_load(DT, metafor, metaviz, jtools, ggplot2, knitr, meta, dmetar)

Data

AM <- read.csv("AsianIQs.csv")
datatable(AM, extensions = c("Buttons", "FixedColumns"),
          options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'),
                         scrollX = TRUE, fixedColumns = list(leftColumns = 3)))

Rationale

The average IQs of northeast Asians (Japanese, Koreans, and Chinese; henceforth just Asians) are considered important by several prominent group difference researchers (e.g., Jensen, Rushton, Lynn). Asians are typically found to score between three and ten points (average \(\approx\) six) above whites of European extraction (henceforth just whites). For example, Lynn (1996) found a 4.52-point advantage in g over whites (n = 1692; ages 6-17) for the Asian (n = 48) Differential Ability Scales sample. In another case, Kane (2008) matched samples (all n's = 77, all ages = 10) of African-Americans (henceforth just blacks), Hispanic-Americans (henceforth just Hispanics), Asians, and whites on parental socioeconomic status, age, first language, and disability status using the Universal Nonverbal Intelligence Test norming data and found strict factorial invariance; full-scale IQs for the groups were 92.25, 99.58, 113.22, and 106.33, respectively. Interestingly, across all subtests, mean differences relative to the white group correlated with the (white) g loadings at r = 0.471, 0.403, and -0.393 for the black, Hispanic, and Asian groups, respectively (the negative value indicates an Asian advantage).
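
Kane reports those correlated-vectors coefficients directly, but the underlying computation is simple: correlate the vector of subtest g loadings with the vector of standardized group differences. A minimal sketch with made-up numbers, not Kane's actual values:

# Method of correlated vectors, sketched with hypothetical values (not Kane's data)
g_loadings <- c(0.75, 0.68, 0.62, 0.55, 0.48)   # hypothetical white-sample g loadings per subtest
d_values <- c(-0.40, -0.35, -0.20, -0.10, 0.05) # hypothetical white-minus-Asian subtest d's
cor(g_loadings, d_values) # a negative r means the Asian advantage grows with the g loading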

For some reason, people seem to think that, with regard to means, Asians differ from other groups: instead of having the same relative mean in childhood and adulthood, they supposedly converge to the white mean in adulthood. There is no existing evidence for this claim; instead, it seems aimed at sowing doubt about an area of research where there is, apparently, a paucity of evidence. To address this, I will meta-analyze the evidence presented by Lynn (2006), Lynn & Becker (2019), and the Japanese Wechsler norming sample data provided to me by a colleague. The Chinese and Korean norming sample data deliver virtually the same results as the Japanese data, but I do not know whether I am allowed to share them; the Japanese data, at least, I know I can share. Before beginning, there are some things to note:

  1. I will consider a sample "adult" if it has a mean age of 16 or greater. Latent factor stability from this age onward is usually extreme. For example, in the Vietnam Experience Study, I found that the g factors from the high school (age \(\approx\) 18) and follow-up (age \(\approx\) 35) testings correlated at r = 0.94 (here: https://rpubs.com/JLLJ/EDUVES). In that same sample, the participants took the Army Classification Battery verbal and arithmetic tests at both intervals. For the "Asian" group (I do not know their ethnicities, so they are excluded from this meta-analysis), verbal scores increased by 9.42 points (\(\Delta\) SD = 1.12) between testing intervals and arithmetic scores by 0.60 (-0.19) points. For the black sample, the changes were 8.54 (3.89) and -2.53 (3.72) points; for Hispanics, 3.83 (2.62) and -1.80 (2.17); for Native Americans, 6.16 (2.74) and -0.23 (0.24); and finally, for whites, 3.45 (0.78) and 0.65 (1.98), with n's of 34, 525, 20, 49, and 3654, respectively. Stability was quite similar across the groups, so on its face, I don't put much credence in the idea that the change would differ for them. If it did, we would expect noninvariance in adult comparisons (which is not observed), since we have invariance for childhood comparisons.

  2. Invariance has not been tested for the majority of samples. I only know it holds in a handful of studies and, with minor exceptions, in the Japanese norming data. For example, across the tests from all three WAIS iterations, I found a total of two noninvariant loadings, one noninvariant intercept, and three residual variances that had to be freed (with similar numbers for the WISCs, for which the mean differences were basically identical). All parameter noninvariance involved the block design and object assembly tests, which suggests to me that the issue resulted from the intentionally higher difficulty of those tests in Japan. The effects of the differences in the loadings and intercepts were < 0.1 and 0.3 g, respectively. The freed residual variances accounted for approximately 50% of the smaller (two cases) and larger (one case) observed variances in the Japanese samples relative to white Americans on the same WAIS versions. I may reassess this in the future with dynamically computed MI cutoffs and, hopefully someday, raw data; a sketch of the invariance-testing workflow appears just before the Analysis section.

  3. I used Lynn's provided estimates and Flynn effect mean correction method, as well as the method for converting between norming SDs/means described in the British WAIS-IV manual. There are more white American WAIS/WAIS-R/WAIS-III samples (I hope to get the WAIS-IV) that could be used, as well as many from places like Australia, Greece, the United Kingdom, Denmark, Russia, Sudan, and so on, but for my purposes, I will only use the white American comparison here. The Flynn effect-corrected means are compared to the white norming sample means and SDs (100/15), and I assume the Asian SD is the same because, in all cases where I had data, the latent variances and most of the residual variances could be constrained to equality. With this in mind, the mean difference used in the meta-analysis was just the difference from 100 divided by 15, and the final IQs were obtained by multiplying the pooled standardized difference by 15 and adding 100 (this pipeline is sketched below). Of the six viable sources in Lynn & Becker (2019), the two Raven norming samples were excluded because I could not find their n's; in order of publication, they presented IQs of 113.81 and 92.39.
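
The score pipeline in point 3 can be made concrete with a couple of helper functions. This is only a sketch of my own; the Flynn rate and the example values are placeholders, not figures taken from any particular sample:

# Hypothetical helpers illustrating the conversion described above
flynn_correct <- function(iq, years_since_norming, rate = 3) {
  iq - rate * (years_since_norming / 10) # subtract ~3 points per decade of norm inflation
}
iq_to_d <- function(iq) (iq - 100) / 15 # difference from the white norms (100/15) in SD units
d_to_iq <- function(d) 100 + 15 * d     # recover the IQ metric from a pooled d
iq_to_d(flynn_correct(108, years_since_norming = 10)) # e.g., an observed 108 becomes d = 1/3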

I did not attempt to expand the American data beyond what Lynn listed in 2006 because I was more interested in the results from Asia. With that said, the subsequent American results are an underestimate of the Asian advantage in American samples. These results are corroborated by achievement tests like the SAT, ACT, LSAT, MCAT, PISA, TIMSS, and so on, for which bias generally cancels across subtests or items (at least in what I have inspected; I have not yet investigated some of these). I meta-analyzed all the data together, the Asian national data alone, the American data alone, and the Japanese norming data alone. I excluded the UK datapoint from Scott & Anderson's 2003 presentation at the EAWOP Conference in Lisbon because the British definition of "Asian" includes groups like Pakistanis and Afghans; as indicated by the mean differences (a 1.02 d disadvantage in g, 0.92 in verbal, and 0.76 in quantitative/mathematical ability), these were probably not northeast Asians as I've defined the term above.
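
Because the raw norming data cannot be shared, the invariance workflow referenced in point 2 can only be sketched. The data frame (wais_df), grouping variable, and indicator names below are hypothetical stand-ins, not the actual model:

p_load(lavaan, semTools)
g.model <- 'g =~ Information + Vocabulary + Arithmetic + BlockDesign + ObjectAssembly' # illustrative indicators
configural <- cfa(g.model, data = wais_df, group = "country")
metric <- cfa(g.model, data = wais_df, group = "country", group.equal = "loadings")
scalar <- cfa(g.model, data = wais_df, group = "country", group.equal = c("loadings", "intercepts"))
strict <- cfa(g.model, data = wais_df, group = "country", group.equal = c("loadings", "intercepts", "residuals"))
summary(compareFit(configural, metric, scalar, strict)) # compare nested models; free noninvariant parameters one at a time for partial invariance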

Analysis

AM$se <- 1/sqrt(AM$N) # approximate standard error of a d computed against fixed norms (100/15)
JAM <- subset(AM, Sample == "Norming") # Japanese Wechsler norming samples
AAM <- subset(AM, Continent == "Asia"); UAM <- subset(AM, Continent == "NorthAmerica") # Asian national and American subsets
Amean <- rma(yi = IQd, sei = se, measure = "SMD", ni = N, data = AM); Amean # all data pooled
## 
## Random-Effects Model (k = 23; tau^2 estimator: REML)
## 
## tau^2 (estimated amount of total heterogeneity): 0.1767 (SE = 0.0547)
## tau (square root of estimated tau^2 value):      0.4203
## I^2 (total heterogeneity / total variability):   99.39%
## H^2 (total variability / sampling variability):  164.65
## 
## Test for Heterogeneity:
## Q(df = 22) = 2001.1486, p-val < .0001
## 
## Model Results:
## 
## estimate      se    zval    pval   ci.lb   ci.ub 
##   0.3463  0.0888  3.8990  <.0001  0.1722  0.5203  *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
JAmean <- rma(yi = IQd, sei = se, measure = "SMD", ni = N, data = JAM); JAmean # Japanese norming samples only
## 
## Random-Effects Model (k = 3; tau^2 estimator: REML)
## 
## tau^2 (estimated amount of total heterogeneity): 0.0038 (SE = 0.0045)
## tau (square root of estimated tau^2 value):      0.0618
## I^2 (total heterogeneity / total variability):   84.97%
## H^2 (total variability / sampling variability):  6.65
## 
## Test for Heterogeneity:
## Q(df = 2) = 13.6286, p-val = 0.0011
## 
## Model Results:
## 
## estimate      se     zval    pval   ci.lb   ci.ub 
##   0.5994  0.0387  15.4925  <.0001  0.5236  0.6752  *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
AAmean <- rma(yi = IQd, sei = se, measure = "SMD", ni = N, data = AAM); AAmean # Asian national data only
## 
## Random-Effects Model (k = 17; tau^2 estimator: REML)
## 
## tau^2 (estimated amount of total heterogeneity): 0.1521 (SE = 0.0553)
## tau (square root of estimated tau^2 value):      0.3900
## I^2 (total heterogeneity / total variability):   99.29%
## H^2 (total variability / sampling variability):  140.45
## 
## Test for Heterogeneity:
## Q(df = 16) = 1311.7190, p-val < .0001
## 
## Model Results:
## 
## estimate      se    zval    pval   ci.lb   ci.ub 
##   0.4723  0.0959  4.9243  <.0001  0.2843  0.6603  *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
UAmean <- rma(yi = IQd, sei = se, measure = "SMD", ni = N, data = UAM, control = list(maxiter = 500)); UAmean #convergence problems with small samples
## 
## Random-Effects Model (k = 6; tau^2 estimator: REML)
## 
## tau^2 (estimated amount of total heterogeneity): 0.0842 (SE = 0.0566)
## tau (square root of estimated tau^2 value):      0.2902
## I^2 (total heterogeneity / total variability):   97.99%
## H^2 (total variability / sampling variability):  49.71
## 
## Test for Heterogeneity:
## Q(df = 5) = 576.0774, p-val < .0001
## 
## Model Results:
## 
## estimate      se     zval    pval    ci.lb   ci.ub 
##  -0.0160  0.1223  -0.1308  0.8959  -0.2556  0.2236    
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
viz_forest(x = AM[, c("IQd", "se")],
           study_labels = AM[, "Author"],
           summary_label = "Summary Effect", xlab = "Cohen's d",
           variant = "rain", method = "DL", col = "Reds")

viz_forest(x = AM[, c("IQd", "se")], 
           group = AM[, "Continent"], 
           study_labels = AM[, "Author"], 
           summary_label = c("Summary (Asia)", "Summary (North America)"), 
           xlab = "Cohen's d",
           variant = "rain", method = "DL",
           col = "Purples")

ASE <- lm(IQd ~ se, AM)
AY <- lm(IQd ~ StudyYear, AM)
summ(ASE, scale = T) #, n.sd = 2 <- Gelman method, doesn't really change the result
## MODEL INFO:
## Observations: 23
## Dependent Variable: IQd
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(1,21) = 0.02, p = 0.89
## R² = 0.00
## Adj. R² = -0.05 
## 
## Standard errors: OLS
## -----------------------------------------------
##                     Est.   S.E.   t val.      p
## ----------------- ------ ------ -------- ------
## (Intercept)         0.35   0.09     3.84   0.00
## se                  0.01   0.09     0.14   0.89
## -----------------------------------------------
## 
## Continuous predictors are mean-centered and scaled by 1 s.d.
summ(AY, scale = T) 
## MODEL INFO:
## Observations: 23
## Dependent Variable: IQd
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(1,21) = 1.11, p = 0.30
## R² = 0.05
## Adj. R² = 0.00 
## 
## Standard errors: OLS
## -----------------------------------------------
##                     Est.   S.E.   t val.      p
## ----------------- ------ ------ -------- ------
## (Intercept)         0.35   0.09     3.93   0.00
## StudyYear           0.09   0.09     1.05   0.30
## -----------------------------------------------
## 
## Continuous predictors are mean-centered and scaled by 1 s.d.
ggplot(AM, aes(x = StudyYear, y = IQd)) +
  geom_point(aes(size = N)) +
  geom_smooth(method = lm, color = "steelblue4", formula = 'y ~ x') +
  labs(x = "(Study) Year", y = "Cohen's d") +
  theme_minimal() +
  theme(legend.position = "none", text = element_text(size = 12, family = "serif"),
        plot.title = element_text(hjust = 0.5))

It seems more recent studies do not yield higher estimates.
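
As a cross-check on the OLS regressions above, metafor can run the standard Egger-type asymmetry test and a random-effects meta-regression on study year directly; this sketch reuses the objects already defined rather than reproducing the original analysis:

regtest(Amean) # Egger-type regression test for funnel plot asymmetry
rma(yi = IQd, sei = se, mods = ~ StudyYear, ni = N, data = AM) # RE meta-regression on study year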

viz_funnel(AM[, c("IQd", "se")],
           method = "DL",
           contours_col = "Greens",
           xlab = "Cohen's d")

viz_funnel(AM[, c("IQd", "se")],
           method = "DL",
           contours_col = "Purples",
           xlab = "Cohen's d",
           trim_and_fill = T,
           egger = T)

viz_sunset(AM[, c("IQd", "se")],
           method = "DL",
           xlab = "Cohen's d",
           power_contours = "continuous")

Ameta <- metagen(TE = IQd, seTE = se, data = AM, sm = "SMD") # dmetar's pcurve() expects a meta-package object
pcurve(Ameta)

## P-curve analysis 
##  ----------------------- 
## - Total number of provided studies: k = 23 
## - Total number of p<0.05 studies included into the analysis: k = 19 (82.61%) 
## - Total number of studies with p<0.025: k = 19 (82.61%) 
##    
## Results 
##  ----------------------- 
##                     pBinomial   zFull pFull   zHalf pHalf
## Right-skewness test         0 -27.198     0 -26.275     0
## Flatness test               1  27.064     1  27.697     1
## Note: p-values of 0 or 1 correspond to p<0.001 and p>0.999, respectively.   
## Power Estimate: 99% (99%-99%)
##    
## Evidential value 
##  ----------------------- 
## - Evidential value present: yes 
## - Evidential value absent/inadequate: no
adf <- data.frame("Group" = c("Total", "Japanese Norming", "Asian", "American"),
                  "Lower" = c(Amean$ci.lb*15 + 100, JAmean$ci.lb*15 + 100, AAmean$ci.lb*15 + 100, UAmean$ci.lb*15 + 100),
                  "Mean" = c(Amean$b*15 + 100, JAmean$b*15 + 100, AAmean$b*15 + 100, UAmean$b*15 + 100),
                  "Upper" = c(Amean$ci.ub*15 + 100, JAmean$ci.ub*15 + 100, AAmean$ci.ub*15 + 100, UAmean$ci.ub*15 + 100))
kable(adf)
Group              Lower      Mean       Upper
Total              102.58310  105.19412  107.8051
Japanese Norming   107.85363  108.99110  110.1286
Asian              104.26510  107.08513  109.9052
American            96.16574   99.76012  103.3545

Discussion

The adult northeast Asian IQ seems to be quite similar to the estimates from younger samples. There is presently no reason to think that older Asian individuals would be relatively less intelligent than younger samples, excepting, say, results affected by measurement bias due to illiteracy in countries like China and Singapore, which have only recently become highly literate. The argument by a certain contentious blogger that there should be discrimination against Asians in admissions because tests will overestimate their skills never had any basis in fact; now, anyone seriously humoring it should be expected to contradict the weight of the presented evidence before being taken seriously in the slightest. And, to head off the person who originally made that argument continuing to press it on the basis of non-cognitive data of no certain psychometric character, it should be said that such an argument carries no real weight and never has. If serious people start taking these specious anti-Asian arguments seriously, I may be duty-bound to write a more forceful and comprehensive response. I may add to this in the future, but there is little point: it is boring, low-reward, and the data are too inaccessible. If data become more accessible, I would like to assess the equality of variances more comprehensively.

References

Lynn, R. (1996). Racial and ethnic differences in intelligence in the United States on the Differential Ability Scale. Personality and Individual Differences, 20(2), 271-273. https://doi.org/10.1016/0191-8869(95)00158-1

Kane, H. (2008). Race Differences on the UNIT: Evidence from Multi-Sample Confirmatory Analysis. Mankind Quarterly, 48(3). http://mankindquarterly.org/archive/issue/48-3/2

Lynn, R. (2006). Race Differences in Intelligence: An Evolutionary Analysis. Washington Summit Publishers. http://archive.org/details/RaceDifferencesInIntelligenceAnEvolutionaryAnalysis

Lynn, R., & Becker, D. (2019). The Intelligence of Nations. Ulster Institute for Social Research.

Postscript: Japanese Matrices

Apparently few people know that there are large norming samples from many different countries as diverse as Zambia, Japan, and Brazil. These matrices are taken directly from their manuals, which should be consulted for more information.

p_load(lavaan, corrplot)

WAIS <- '
1                                       
0.62    1                                   
0.65    0.48    1                               
0.7 0.61    0.54    1                           
0.44    0.37    0.49    0.39    1                       
0.66    0.54    0.5 0.58    0.41    1                   
0.5 0.4 0.55    0.47    0.42    0.34    1               
0.59    0.5 0.55    0.56    0.39    0.48    0.49    1           
0.32    0.42    0.59    0.45    0.42    0.43    0.48    0.53    1       
0.43    0.37    0.4 0.38    0.32    0.3 0.46    0.49    0.4 1   
0.4 0.34    0.51    0.37    0.33    0.33    0.38    0.52    0.56    0.39    1'

Names <- list("Information", "Comprehension", "Arithmetic", "Similarities", "Digit Span", "Vocabulary", "Letter-Number Sequence", "Picture Completion", "Block Design", "Visual Puzzle", "Matrix Reasoning")
WAIS.cor <- getCov(WAIS, names = Names)

WAISR <- '
1                                       
0.37    1                                   
0.77    0.41    1                               
0.58    0.46    0.63    1                           
0.54    0.27    0.64    0.45    1                       
0.63    0.35    0.68    0.54    0.55    1                   
0.61    0.31    0.44    0.39    0.39    0.43    1               
0.41    0.35    0.4 0.41    0.34    0.41    0.45    1           
0.39    0.35    0.42    0.51    0.45    0.42    0.49    0.44    1       
0.29    0.28    0.33    0.38    0.28    0.34    0.49    0.41    0.54    1   
0.36    0.32    0.32    0.38    0.25    0.33    0.39    0.31    0.39    0.3 1'

Names <- list("Information", "Digit Span", "Vocabulary", "Arithmetic", "Comprehension", "Similarities", "Picture Completion", "Visual Puzzle", "Block Design", "Matrix Reasoning", "Letter-Number Sequence")
WAISR.cor <- getCov(WAISR, names = Names)

WAISIII <- '
1                                                   
0.69    1                                               
0.55    0.53    1                                           
0.37    0.33    0.47    1                                       
0.71    0.6 0.55    0.38    1                                   
0.63    0.57    0.45    0.28    0.55    1                               
0.39    0.39    0.48    0.5 0.4 0.37    1                           
0.4 0.41    0.34    0.24    0.4 0.37    0.28    1                       
0.41    0.41    0.44    0.33    0.41    0.36    0.4 0.33    1                   
0.41    0.44    0.5 0.36    0.42    0.37    0.37    0.41    0.41    1               
0.44    0.45    0.52    0.33    0.45    0.37    0.36    0.41    0.35    0.45    1           
0.38    0.4 0.4 0.25    0.44    0.34    0.31    0.44    0.29    0.41    0.41    1       
0.38    0.39    0.44    0.36    0.39    0.36    0.41    0.37    0.64    0.46    0.33    0.34    1   
0.33    0.34    0.32    0.24    0.29    0.3 0.26    0.4 0.29    0.51    0.38    0.4 0.36    1'

Names <- list("Vocabulary", "Similarities", "Arithmetic", "Digit Span", "Information", "Comprehension", "Letter-Number Sequence", "Picture Completion", "Coding", "Block Design", "Matrix Reasoning", "Picture Arrangement", "Symbol Search", "Object Assembly")
WAISIII.cor <- getCov(WAISIII, names = Names)

col4 <- colorRampPalette(c("#7F0000", "red", "#FF7F00", "yellow", "#7FFF7F", "cyan", "#007FFF", "blue", "#00007F")) # palette from the old corrplot examples; previously undefined here
corrplot(WAIS.cor, order = "hclust", cl.pos = "n", tl.pos = "n", col = col4(100))
corrplot(WAISR.cor, order = "hclust", cl.pos = "n", tl.pos = "n", col = col4(100))
corrplot(WAISIII.cor, order = "hclust", cl.pos = "n", tl.pos = "n", col = col4(100))
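
As an example of what these matrices can be used for, here is a minimal sketch fitting a one-factor g model to the Japanese WAIS-III matrix with lavaan; sample.nobs = 1000 is a placeholder, so substitute the norming n from the manual before interpreting the fit:

W3 <- WAISIII.cor
colnames(W3) <- rownames(W3) <- make.names(colnames(W3)) # lavaan model syntax cannot contain spaces
g.model <- paste("g =~", paste(colnames(W3), collapse = " + "))
g.fit <- cfa(g.model, sample.cov = W3, sample.nobs = 1000, std.lv = TRUE) # placeholder n, not the true norming n
summary(g.fit, fit.measures = TRUE, standardized = TRUE)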