Setup

Functions

#Not mine; multiple people have used this. Good for subsetting a data frame to the rows that are complete in the chosen column(s)

CF <- function(D, DC) {
  CV <- complete.cases(D[, DC])
  return(D[CV, ])}
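
As a quick illustration of what CF does, here is a small sketch on a made-up data frame. The column names mimic the ones used later, but the values are invented and are not from the analysis data.

```r
# Same helper as above, restated so this block runs on its own
CF <- function(D, DC) {
  CV <- complete.cases(D[, DC])
  return(D[CV, ])
}

# Toy data (hypothetical): h2r is missing for the second study,
# Npairs is missing for the third
toy <- data.frame(
  Authors = c("Study A", "Study B", "Study C"),
  h2r     = c(0.10, NA, -0.05),
  Npairs  = c(400, 250, NA)
)

# Keep rows where h2r is observed; gaps in other columns are ignored
kept <- CF(toy, "h2r")
print(kept)

# Several columns can be checked at once
kept_both <- CF(toy, c("h2r", "Npairs"))
print(kept_both)
```

Rows are only dropped for missingness in the named columns, which is why the later call on "h2r" keeps studies that lack other fields.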

#Not mine, good non-Grob for assembling plots

multiplot <- function(..., plotlist=NULL, file, cols=1, layout=NULL) {
  library(grid)
  plots <- c(list(...), plotlist)
  numPlots = length(plots)
  if (is.null(layout)) {
    layout <- matrix(seq(1, cols * ceiling(numPlots/cols)),
                    ncol = cols, nrow = ceiling(numPlots/cols))}
  if (numPlots == 1) {
    print(plots[[1]])
  } else {
    grid.newpage()
    pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout))))
    for (i in 1:numPlots) {
      matchidx <- as.data.frame(which(layout == i, arr.ind = TRUE))
      print(plots[[i]], vp = viewport(layout.pos.row = matchidx$row,
                                      layout.pos.col = matchidx$col))}}}

Packages

library(pacman)
p_load(dplyr, DT, meta, DescTools, ggplot2)

Rationale

This is a prelude to a more comprehensive meta-analysis of the effects of violating the assumptions underlying twin studies. For now (May 13, 2020), I will only be reviewing monozygote/twin-specific effects, that is, violations of the so-called “equal environments assumption” (EEA). In the future, this will be extended to generational effects, age, range restriction in environments, interactions, prenatal environmental effects, and other issues. Regarding prenatal environmental effects, I recommend Emil’s blogpost on them, which shows that they look quite small and act against inducing similarity. Elsewhere, this has been said by Martin, Boomsma & Machin (1997), among others, and its original observation by Wilson led Bouchard (2013) to name the increase in the heritability of intelligence with age the “Wilson effect”. That effect had been noted for years before the Chipuer, Rovine & Plomin (1990) and Devlin, Daniels & Roeder (1997) studies, whose results rested on improper assumptions about how prenatal effects and other primary biases act and, as pointed out by numerous contemporary comments, on a neglect of the effect of age.

Briefly, twin studies exploit naturally existing variation in the levels of relatedness between siblings in order to statistically identify the variance in measured traits which is attributable to theoretically-specified sources of variance whose means of statistical identification are also derived from theory. In this way, they can be considered to be exploiting a natural experiment for parameter identification and they amount to causal evidence in the typical sense used in econometrics. Modern twin studies typically utilize the 100% genetic relatedness of monozygotic (identical) twins and the average of 50% relatedness for dizygotic (fraternal) twins to partition variance into three components in a structural equation model. Dizygotic twins are preferred over ordinary siblings, despite the same degree of relatedness, in order to avoid confounding by generational and age effects on traits. The first of these variance components is \(A\), or the additive heritability, otherwise known as \(h^2\); the second is \(C\), or the shared environment, which is all sources of variance, other than additive genetic ones, that increase sibling similarity beyond genetic expectations; and the third is \(E\), or the nonshared environment, which is all sources of variance that reduce the similarity between siblings. Estimates are biased away from the theory-based causal model by the effects of omitted sources of variance including, but not limited to:

  1. Assortative mating (increase \(C\); can increase \(E\) if disassortative)
  2. Epigenetics (increase \(C\) and, sometimes, \(E\))
  3. Gene-environment covariance (assumed to increase \(A\) and sometimes the other components but can wildly alter estimates in both directions; see Goldberger, 1979)
  4. Dominance (increase \(A\) and \(C\))
  5. Epistasis (increase \(A\) and \(C\))
  6. Prenatal effects (decrease \(A\) and typically fade)
  7. Range restriction in environmental effects (can increase or decrease \(A\), \(C\), or \(E\) depending on their method of action)
  8. Differential treatment by zygosity or appearance (increase \(A\))
  9. Differential drive for similarity or differences in the level of contact leading to similarity (increase \(A\))
  10. Selection on being a twin (can increase or decrease \(A\), \(C\), or \(E\) depending on their method of action)
  11. Interactions (can increase or decrease \(A\), \(C\), or \(E\) depending on their method of action and may act on trait variance for MZs and DZs by level of a moderator, like, most commonly, socioeconomic status)
  12. Measurement error (increase \(E\), decrease \(A\) and \(C\))
  13. Differences in the relatedness of fraternal twins or full-siblings (decrease \(A\), increase \(C\) and \(E\) depending on direction)
  14. Somatic mutations (can increase \(C\), but more often \(E\))
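
To make the direction of one of these biases concrete, here is a minimal sketch. It uses Falconer's classical approximations (\(A = 2(r_{MZ} - r_{DZ})\), \(C = 2r_{DZ} - r_{MZ}\), \(E = 1 - r_{MZ}\)) rather than the structural equation model described above, so it is a simplification swapped in for illustration. The twin correlations and the size of the extra MZ similarity are invented.

```r
# Falconer-style ACE estimates from twin correlations.
# A simplification of the SEM approach, used only for illustration.
falconer <- function(r_mz, r_dz) {
  c(A = 2 * (r_mz - r_dz),   # additive genetic variance
    C = 2 * r_dz - r_mz,     # shared environment
    E = 1 - r_mz)            # nonshared environment (plus measurement error)
}

# Hypothetical correlations with no violation of the EEA
base <- falconer(r_mz = 0.70, r_dz = 0.45)

# Hypothetical EEA violation: MZ pairs are made 0.02 more similar by
# treatment that DZ pairs do not receive (item 8 in the list)
delta  <- 0.02
biased <- falconer(r_mz = 0.70 + delta, r_dz = 0.45)

print(round(base, 3))
print(round(biased, 3))
print(round(biased - base, 3))  # A rises by 2 * delta; C and E each fall by delta
```

Under this approximation, any MZ-only boost to similarity is doubled in \(A\), which is why even small violations are worth estimating rather than assuming away.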

The effects of many of these things have been empirically estimated and they generally return no or small effects. For example, assortative mating usually does act and tends to considerably inflate \(C\) (see, e.g., Hugh-Jones et al., 2016, pp. 106-107, who observed that the \(C\) in educational attainment in the meta-analysis by Polderman et al., 2015 could be derived from assortative mating (which they set to 0.5; other studies have found that the coefficient of assortative mating is often ~0.2 for traits like IQ, and it can also be negative; van Leeuwen, van den Berg & Boomsma, 2008 recorded \(r \approx 0.33\) for IQ)), range restriction in environments tends to have little effect (e.g., McGue et al., 2007 and the earlier work from other UMN scholars dealing with this; they have also, in my view, dispositively concluded against twin selection arguments), the effects of contact seem to reflect existing similarity more than to augment it (Lykken et al., 1990), look-alikes tend to be no more similar than chance (Segal, 2013), the inclusion of family or people in similar measured environments doesn’t seem to bias GREML-GCTA (Conley et al., 2014; various, including especially the supplements to some of James Lee’s work, which are always brilliant), twin misclassification by zygosity seems to have no meaningful effect (Conley et al., 2013; see also Scarr & Weinberg, 1976, pp. 732-733; Rowe, 2002), variation in relatedness in siblings can recover trait heritability (Visscher et al., 2006), the Scarr-Rowe effect is replicable but not huge (Tucker-Drob & Bates, 2016; later work has generally yielded smaller effects, but it is still replicable; anecdotally, I replicated it in the NCPP and found that the relationship between the vector of interaction effects and g loadings was \(r \approx 0.90\)), most empirical estimates of resemblance yield no evidence of epistasis or dominance effects (Hill, Goddard & Visscher, 2008), \(A\) often increases when assessing the heritability of latent variables rather than sumscores (e.g., van den Berg, Glas & Boomsma, 2007), and the Wilson effect appears for virtual twins as well (Segal et al., 2007). The assumptions of twin models and the extent of their violations are always an empirical rather than a merely theoretical matter, and in each case, they can be tested provided they can be specified and aren’t outrageous, whether by extending the models (see, e.g., extended-twin family models, which can model things like twin-specific effects or horizontal and vertical cultural transfer) or referring to alternative paradigms and methods. Bias may be more salient for certain traits than others, but it is most often estimated to be small. This observation, amidst the considerable discussion of model violations, led Bouchard (Bouchard, 1984; this was a topic I relished getting to sit down and discuss with him while eating lunch at a conference in late summer 2019) to say:

A principal feature of the many critiques of hereditarian research is an excessive concern for purity, both in terms of meeting every last assumption of the models being tested and in terms of eliminating all possible errors. The various assumptions and potential errors that may, or may not, be of concern are enumerated and discussed at great length. The longer the discussion of potential biasing factors, the more likely the critic is to conclude that they are actual sources of bias. By the time a chapter summary or conclusion section is reached, the critic asserts that it is impossible to learn anything using the design under discussion. There is often, however, a considerable amount known about the possible effect of the violation of assumptions. As my colleague Paul Meehl has observed, ‘Why these constraints are regularly treated as “assumptions” instead of refutable conjectures is itself a deep and fascinating question.’ (Meehl, 1978, p. 810). In addition, potential systematic errors sometimes have testable consequences that can be estimated. They are, unfortunately, seldom evaluated. In other instances the data themselves are simply abused. As I have pointed out elsewhere:

The data are subgrouped using a variety of criteria that, although plausible on their face, yield the smallest genetic estimates that can be squeezed out. Statistical significance tests are liberally applied and those favorable to the investigator’s prior position are emphasized. Lack of statistical significance is overlooked when it is convenient to do so, and multiple measurements of the same construct (constructive replication within a study) are ignored. There is repeated use of significance tests on data chosen post hoc. The sample sizes are often very small, and the problem of sampling error is entirely ignored. (Bouchard, 1982, p. 190)

This fallacious line of reasoning is so endemic that I have given it a name, ‘pseudo-analysis’. Pseudo-analysis has been very widely utilized in the critiques and reanalyses of data gathered on monozygotic twins reared apart. I will look closely at this particular kinship, but warn the reader that the general conclusion applies equally to most other kinships.

Perhaps the most disagreeable criticism of all is the consistent claim that IQ tests are systematically flawed (each test in a different way) and, consequently, are poor measures of anything. These claims are seldom supported by reasonable evidence. If this class of argument were true, one certainly would not expect the various types of IQ tests (some remarkably different in content) to correlate as highly with each other as they do, nor, given the small samples used, would we expect them to produce such consistent results from study to study. Different critics launch this argument to different degrees, but they are of a common class.

Bouchard goes on to deal with claims regarding bias resulting from “Experimenter Bias”, “Rearing Environment Similarity”, and “Twin Separation, Twin Contact and Twin Reunion”, and then jumps into the “pseudo-scientific flavor” of criticisms of studies of twins reared-apart (some of which have since been empirically addressed, like the effects of parental occupation; see, e.g., Johnson et al., 2007) before proposing a “philosophy” for conducting behavior genetic research which is curiously Lakatosian (or is it Galtonian? I’ve included his recommendations alongside Galton’s frontispiece to the Annals of Eugenics and a quotation from Macmillan’s Magazine in an ending footnote) in outlook and calling for more self-critical research.

What I am interested in presently is assessing all of these effects. To this end, I will perform a meta-analysis of published equal environments assumption violations and derive a meta-analytic estimate of the bias to heritability. Upon later revisions, I will add further assessments of additional biasing factors, more comprehensive looks at the literature, novel effect estimates, and so on. For now, my data are derived principally from the supplement of Barnes et al. (2014). This supplement is recommended reading for its discussion of some other assumptions found in standard twin models.

Analysis

The MaTCH/CaTCH values cited as Polderman et al. (2015) and Lakhani et al. (2019) are for the differences between same-sex and opposite-sex dizygotic and full-sibling (respectively) correlations. The idea is that sex should have a ubiquitous and strong effect on resemblance for many phenotypes if the equal environments assumption is violated by factors making pairs more similar. For phenotypes like height, of course, sex cannot be a valid test of social effects. For the former study, these effects are estimated for the “Psychiatric” category and for the latter, “Schizophrenia and other psychiatric disorders”. To meta-analyze effects on \(h^2\), I took \(\sqrt{h^2}\) and performed a meta-analysis of the resulting correlations, which were transformed back to variances in the end. Note that unstandardized coefficients are directly interpretable and variances are not; in truth, I believe that interpreting variances, more often than not, leads to confusion, such as in cases where a small variance is assumed to mean little, or different variances are assumed to be directly comparable. An exemplary manifestation of confusion about variances is when people note that a trait is 80% heritable; this does not imply that environments are four times less important, it implies that they are two times less important, because \(\sqrt{0.8} = 0.8944\) versus \(\sqrt{0.2} = 0.4472\), which is the actual effect measure (by the same token, the variance explained in educational attainment with known genes in European populations so far - 16% - is only 2.29 times less important than the remainder, which, if generalizable to cross-population causal variants, puts important caps on certain theories positing mean compensation in other populations). Numerous studies using extended twin family designs are capable of estimating twin-specific effects (which deserve their own meta-analysis alongside virtual sibling studies) or effects which are specific to a certain zygosity. I will not be using these effects since they are not directly relevant to the equal environments assumption if they do not test the effect by zygosity and, instead, only refer to the effect by twin/non-twin status; there are many of these papers (e.g., Klassen et al., 2018 through Bell, Kandler & Riemann, 2018, in my references below).
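
The variance-versus-effect arithmetic in this paragraph can be checked in a few lines of R. The helper name `effect_ratio` is mine, and the signed square root used for negative bias estimates is an assumption about how the transformation was applied, not something stated above; the bias values are illustrative only.

```r
# Ratio of effect sizes (square roots of variance shares) when variance
# is split into a share v and a remainder 1 - v
effect_ratio <- function(v) sqrt(max(v, 1 - v)) / sqrt(min(v, 1 - v))

effect_ratio(0.80)  # 80% heritable trait: genes vs. everything else
effect_ratio(0.16)  # 16% of variance explained vs. the remainder

# Variance -> correlation -> variance round trip.
# ASSUMPTION: negative bias estimates keep their sign through the square root.
h2_bias <- c(0.02, -0.02, 0.10)                 # illustrative values only
r_bias  <- sign(h2_bias) * sqrt(abs(h2_bias))   # scale used for the meta-analysis
back    <- sign(r_bias) * r_bias^2              # back to the variance scale

print(round(r_bias, 4))
print(back)
```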

#EEAM <- read.csv(".../EqualEnvironments.csv")
datatable(EEAM, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 3)))
#Subset to studies with empirical estimates

EMEEA <- CF(EEAM, "h2r")
#Raw correlations - not appropriate for meta-analysis since this misestimates the variance, but a good way of assessing the effect this might have, and sometimes interesting to compare to the proper variance estimate; Sidik-Jonkman is used for both analyses, but results are virtually unaltered with different estimators, from Hunter-Schmidt to empirical Bayes.
RCOR <- metacor(h2r,
                sqrt(Npairs),
                data = EMEEA,
                studlab = Authors,
                sm = "COR",
                method.tau = "SJ")
  
#r-to-z transformed for proper variance estimation
ZCOR <- metacor(h2r,
                sqrt(Npairs),
                data = EMEEA,
                studlab = Authors,
                sm = "ZCOR",
                method.tau = "SJ")
RAW <- forest(RCOR,
       sortvar = TE,
       xlim = c(-1, 1),
       rightlabs = c("Correlation", "95% CI", "Weight"),
       leftcols = c("Authors"),
       leftlabs = c("Study"),
       pooled.totals = F,
       smlab = "",
       text.random = "Overall Effect",
       print.tau2 = F,
       col.diamond = "orangered",
       col.diamond.lines = "black",
       col.predict = "black",
       print.I2.ci = F,
       digits.sd = 2,
       comb.fixed = F,
       col.square = "#00348E",
       overall = T)

STAND <- forest(ZCOR,
       sortvar = TE,
       xlim = c(-1, 1),
       rightlabs = c("Correlation", "95% CI", "Weight"),
       leftcols = c("Authors"),
       leftlabs = c("Study"),
       pooled.totals = F,
       smlab = "",
       text.random = "Overall Effect",
       print.tau2 = F,
       col.diamond = "gold",
       col.diamond.lines = "black",
       col.predict = "black",
       print.I2.ci = F,
       digits.sd = 2,
       comb.fixed = F,
       col.square = "#5F85E7",
       overall = T)

#funnel(RCOR, xlab = "Raw Relationship between g Factors"); funnel(ZCOR, xlab = "Meta-Analytic Relationship between g Factors")
trimfill(RCOR); trimfill(ZCOR)
##                                       COR             95%-CI %W(random)
## Cronk et al.                       0.1414 [-0.1510;  0.4339]        5.2
## Loehlin and Nichols                0.1871 [-0.1694;  0.5435]        4.4
## Martin et al.                      0.2345 [-0.0454;  0.5144]        5.3
## Kendler et al.                     0.4000 [ 0.1047;  0.6953]        5.1
## Tambs, Harris, and Magnus         -0.1414 [-0.4139;  0.1310]        5.4
## Kendler et al.                     0.2646 [-0.0498;  0.5790]        4.9
## Bailey, Dunne, and Martin          0.1414 [-0.1533;  0.4361]        5.2
## Bailey et al.                      0.1414 [-0.1533;  0.4361]        5.1
## Derks, Dolan, and Boomsma          0.3162 [ 0.0307;  0.6018]        5.3
## Mazzeo et al.                      0.1000 [-0.2485;  0.4485]        4.5
## Littvay                            0.1414 [-0.2429;  0.5257]        4.1
## Conley et al.                     -0.6083 [-0.9258; -0.2908]        4.9
## Felson                             0.3162 [-0.1704;  0.8028]        3.1
## Kandler, Gottschling & Spinath    -0.0837 [-0.5841;  0.4168]        3.0
## Polderman et al.                   0.0707 [ 0.0132;  0.1282]        8.0
## Lakhani et al.                     0.1095 [ 0.0431;  0.1760]        8.0
## Filled: Kendler et al.            -0.1079 [-0.4223;  0.2065]        4.9
## Filled: Derks, Dolan, and Boomsma -0.1596 [-0.4451;  0.1260]        5.3
## Filled: Felson                    -0.1596 [-0.6462;  0.3270]        3.1
## Filled: Kendler et al.            -0.2433 [-0.5386;  0.0519]        5.1
## 
## Number of studies combined: k = 20 (with 4 added studies)
## 
##                         COR            95%-CI    z p-value
## Random effects model 0.0564 [-0.0532; 0.1660] 1.01  0.3130
## 
## Quantifying heterogeneity:
##  tau^2 = 0.0381 [0.0069; 0.0944]; tau = 0.1952 [0.0834; 0.3072];
##  I^2 = 55.8% [27.0%; 73.2%]; H = 1.50 [1.17; 1.93]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  42.97   19  0.0013
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Sidik-Jonkman estimator for tau^2
## - Q-profile method for confidence interval of tau^2 and tau
## - Trim-and-fill method to adjust for funnel plot asymmetry
## - Untransformed correlations
##                                       COR             95%-CI %W(random)
## Cronk et al.                       0.1414 [-0.1618;  0.4202]        5.5
## Loehlin and Nichols                0.1871 [-0.1915;  0.5172]        4.5
## Martin et al.                      0.2345 [-0.0641;  0.4946]        5.5
## Kendler et al.                     0.4000 [ 0.0602;  0.6567]        4.7
## Tambs, Harris, and Magnus         -0.1414 [-0.4021;  0.1405]        5.8
## Kendler et al.                     0.2646 [-0.0774;  0.5509]        4.9
## Bailey, Dunne, and Martin          0.1414 [-0.1642;  0.4222]        5.5
## Bailey et al.                      0.1414 [-0.1642;  0.4223]        5.5
## Derks, Dolan, and Boomsma          0.3162 [ 0.0015;  0.5739]        5.2
## Mazzeo et al.                      0.1000 [-0.2577;  0.4336]        4.7
## Littvay                            0.1414 [-0.2603;  0.5014]        4.2
## Conley et al.                     -0.6083 [-0.8475; -0.1637]        3.0
## Felson                             0.3162 [-0.2540;  0.7233]        2.6
## Kandler, Gottschling & Spinath    -0.0837 [-0.5545;  0.4278]        3.0
## Polderman et al.                   0.0707 [ 0.0130;  0.1279]        9.0
## Lakhani et al.                     0.1095 [ 0.0427;  0.1754]        8.9
## Filled: Kendler et al.            -0.1012 [-0.4220;  0.2421]        4.9
## Filled: Derks, Dolan, and Boomsma -0.1567 [-0.4493;  0.1664]        5.2
## Filled: Felson                    -0.1567 [-0.6322;  0.4046]        2.6
## Filled: Kendler et al.            -0.2488 [-0.5494;  0.1088]        4.7
## 
## Number of studies combined: k = 20 (with 4 added studies)
## 
##                         COR            95%-CI    z p-value
## Random effects model 0.0698 [-0.0434; 0.1811] 1.21  0.2267
## 
## Quantifying heterogeneity:
##  tau^2 = 0.0364 [0.0000; 0.0886]; tau = 0.1907 [0.0000; 0.2976];
##  I^2 = 32.2% [0.0%; 60.6%]; H = 1.21 [1.00; 1.59]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  28.03   19  0.0829
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Sidik-Jonkman estimator for tau^2
## - Q-profile method for confidence interval of tau^2 and tau
## - Trim-and-fill method to adjust for funnel plot asymmetry
## - Fisher's z transformation of correlations
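
The per-study intervals differ between the two tables because of how each interval is built. Here is a minimal base-R sketch for a single study, with a made-up correlation and sample size (not a row from the data). The standard errors are the usual large-sample formulas; I am assuming they correspond to what metacor does for sm = "COR" and sm = "ZCOR".

```r
r <- 0.30   # hypothetical correlation
n <- 40     # hypothetical sample size

# Raw correlation: large-sample standard error, interval symmetric around r
se_r   <- (1 - r^2) / sqrt(n - 1)
ci_raw <- r + c(-1, 1) * qnorm(0.975) * se_r

# Fisher z: transform, build the interval on the z scale, back-transform
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)
ci_z <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

print(round(rbind(raw = ci_raw, fisher_z = ci_z), 4))
```

The back-transformed interval always stays inside (-1, 1) and is asymmetric around r, whereas the raw interval is symmetric and its standard error depends on r itself.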
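#dfZ holds the r-to-z results (stored on the z scale, so back-transformed here); dfR holds the raw-correlation results as they are
dfZ <- data.frame(ZCOR$seTE, ZCOR$TE); dfZ$se <- FisherZInv(dfZ$ZCOR.seTE); dfZ$r <- FisherZInv(dfZ$ZCOR.TE)
dfR <- data.frame(RCOR$seTE, RCOR$TE); dfR$se <- dfR$RCOR.seTE; dfR$r <- dfR$RCOR.TE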

estimateT = FisherZInv(ZCOR$TE.random); set = FisherZInv(ZCOR$seTE.random)
estimateT2 = RCOR$TE.random; set2 = RCOR$seTE.random
print(paste("The r-to-z meta-analysis yielded a bias of r =", estimateT, "with an SE of", set, "and the raw meta-analysis yielded a bias of r =", estimateT2, "with an SE of", set2, "which translate to biases at the level of heritability of", estimateT^2, "and", estimateT2^2, "respectively."))
## [1] "The r-to-z meta-analysis yielded a bias of r = 0.119881159345262 with an SE of 0.0626479417372227 and the raw meta-analysis yielded a bias of r = 0.10741125319668 with an SE of 0.0611314462098188 which translate to biases at the level of heritability of 0.0143714923659642 and 0.0115371773132813 respectively."
(1-(pnorm(estimateT/set)))*2; (1-(pnorm(estimateT2/set2)))*2 #both > 0.05
## [1] 0.05567525
## [1] 0.0789086
dfR; dfZ
se.seq = seq(0, max(dfZ$se), 0.001)
ll95 = estimateT - (1.96*se.seq)
ul95 = estimateT + (1.96*se.seq)
ll95a = FisherZInv(ZCOR$lower.random)
ul95a = FisherZInv(ZCOR$upper.random)
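#3.29 is the two-sided 99.9% normal critical value (qnorm(0.9995)); the "99" names are kept because the plot code below uses them
ll99 = estimateT - (3.29*se.seq)
ul99 = estimateT + (3.29*se.seq)
#1.67857 = 3.29/1.96; these two rescaled bounds are stored in dfZCI but not drawn
ll99a = 1.67857*FisherZInv(ZCOR$lower.random)
ul99a = 1.67857*FisherZInv(ZCOR$upper.random)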
meanll95 = estimateT - (1.96*set)
meanul95 = estimateT + (1.96*set)
dfZCI <- data.frame(ll95, ul95, ll99, ul99, se.seq, estimateT, meanll95, meanul95, ll95a, ul95a, ll99a, ul99a)

se.seq2 = seq(0, max(dfR$se), 0.001)
ll952 = estimateT2 - (1.96*se.seq2)
ul952 = estimateT2 + (1.96*se.seq2)
ll952a = RCOR$lower.random
ul952a = RCOR$upper.random
ll992 = estimateT2 - (3.29*se.seq2)
ul992 = estimateT2 + (3.29*se.seq2)
ll992a = 1.67857*(RCOR$lower.random)
ul992a = 1.67857*(RCOR$upper.random)
meanll952 = estimateT2 - (1.96*set2)
meanul952 = estimateT2 + (1.96*set2)
dfRCI <- data.frame(ll952, ul952, ll992, ul992, se.seq2, estimateT2, meanll952, meanul952, ll952a, ul952a, ll992a, ul992a)
STAND <- ggplot(aes(x = se, y = r), data = dfZ) + 
  geom_point(shape = 16, size = 3, colour = "#00348E") + 
  xlab('Standard Error') + ylab('r-to-z Correlations') + 
  geom_line(aes(x = se.seq, y = ll95), linetype = 'dotted', colour = "#666666", size = 1, data = dfZCI) +
  geom_line(aes(x = se.seq, y = ul95), linetype = 'dotted', colour = "#666666", size = 1, data = dfZCI) +
  geom_line(aes(x = se.seq, y = ll99), linetype = 'dashed', colour = "#666666", size = 1, data = dfZCI) +
  geom_line(aes(x = se.seq, y = ul99), linetype = 'dashed', colour = "#666666", size = 1, data = dfZCI) +
  geom_segment(aes(x = min(se.seq), y = estimateT, xend = max(se.seq), yend = estimateT), linetype='dotted', colour = "#E9C535", size = 1, data=dfZCI) +
  geom_segment(aes(x = min(se.seq), y = ll95a, xend = max(se.seq), yend = ll95a), linetype='dotted' , colour = "gold", size = 1, data=dfZCI) +
  geom_segment(aes(x = min(se.seq), y = ul95a, xend = max(se.seq), yend = ul95a), linetype='dotted' , colour = "gold", size = 1, data=dfZCI) +
  scale_x_reverse() +
  coord_flip() + 
  theme_bw() + 
  theme(text = element_text(family = "serif", size = 12))

RAW <- ggplot(aes(x = se, y = r), data = dfR) + 
  geom_point(shape = 16, size = 3, colour = "#5F85E7") + 
  xlab('Standard Error') + ylab('Raw Correlations') + 
  geom_line(aes(x = se.seq2, y = ll952), linetype = 'dotted', colour = "#666666", size = 1, data = dfRCI) +
  geom_line(aes(x = se.seq2, y = ul952), linetype = 'dotted', colour = "#666666", size = 1, data = dfRCI) +
  geom_line(aes(x = se.seq2, y = ll992), linetype = 'dashed', colour = "#666666", size = 1, data = dfRCI) +
  geom_line(aes(x = se.seq2, y = ul992), linetype = 'dashed', colour = "#666666", size = 1, data = dfRCI) +
  geom_segment(aes(x = min(se.seq2), y = estimateT2, xend = max(se.seq2), yend = estimateT2), linetype='dotted', colour = "#E9C535", size = 1, data=dfRCI) +
  geom_segment(aes(x = min(se.seq2), y = ll952a, xend = max(se.seq2), yend = ll952a), linetype='dotted' , colour = "gold", size = 1, data=dfRCI) +
  geom_segment(aes(x = min(se.seq2), y = ul952a, xend = max(se.seq2), yend = ul952a), linetype='dotted' , colour = "gold", size = 1, data=dfRCI) +
  scale_x_reverse() +
  coord_flip() + 
  theme_bw() +
  theme(text = element_text(family = "serif", size = 12))

multiplot(RAW, STAND, cols = 2)

The proportion reporting that the equal environments assumption is valid versus invalid and the number of reported violations may also be interesting.

#EMENE (non-empirical) and EMEEP (empirical) are subsets of EEAM that are not created in the code shown here
valf <- data.frame(
  "Valid" = c(mean(EEAM$Dichotomous.Conclusion), mean(EMENE$Dichotomous.Conclusion), mean(EMEEP$Dichotomous.Conclusion), mean(EMEEA$Dichotomous.Conclusion)),
  "Category" = c("All", "Non-Empirical", "Empirical", "Estimated"),
  "Violations" = c(mean(EEAM$NumViolations, na.rm = T), NA, mean(EMEEP$NumViolations, na.rm = T), mean(EMEEA$NumViolations)),
  "K" = c(66, 10, 56, 16)); valf
ggplot(valf, aes(x = Category, y = Valid, fill = Category)) + geom_bar(stat = "identity", show.legend = F) + coord_cartesian(ylim = c(0, 1)) + xlab("Category") + ylab("Proportion \"Valid\"") + theme_bw() + scale_fill_hue(c = 40) + theme(text = element_text(family = "serif", size = 16))

The majority of the work gathered claimed to support validity: empirical work was more likely to report validity, while 80% of non-empirical work claimed a lack of validity. Interestingly, all work where empirical estimates were supplied reported validity of this standard twin model assumption. Excluding the same-sex/opposite-sex DZ twin results I added would not change this conclusion; it would slightly alter the overall validity conclusions but, again, the disparity between assessments with estimates and non-empirical studies would remain. This outcome should be reassessed with all literature included, but it’s doubtful that much would change given that this is a reasonably comprehensive view of the literature on this assumption. If all literature were included, the proportion of non-empirical work claiming a lack of validity would presumably rise because of conspicuous “repeat offenders”: people whose careers involve repeating (in many different publications and books) that this assumption does not work despite rarely, if ever, supplying any proof or novel reasons to consider the assumption invalid.

Discussion

Existing empirically-derived estimates of the effect of violating the equal environments assumption on heritability estimates are typically minor (+1.4% \(h^2, p \approx 0.06, k = 16\)). This is intuitively unsurprising since there’s no clear way in which things like how one is treated based on appearance would affect variables like blood pressure, pulse, intelligence, spirometric measures, clinical blood measures, urinalysis results, anthropometry, nutritional habits, dental status, personality, height, weight, fears, self-reported luck or driving skills, and so on. People often propose considerable effects, but there’s vanishingly little evidence and - as far as I’m aware - no one specifies how and to what extent these biases are supposed to affect trait similarity. With regard to the gene-environment covariance at the heart of the equal environments discussion, the estimates delivered by twin models probably remain well-identified, but more violation estimates need to be produced and violations for different phenotypes tested. It would be particularly useful to find out whether there are differences between psychological, social, and physical phenotypes, construed broadly.
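
For reference, the headline figures can be reproduced from the two numbers printed by the r-to-z meta-analysis in the Analysis section. This is only a restatement of that arithmetic, with the values copied from the printed output.

```r
# Values copied from the printed output of the r-to-z meta-analysis
r_bias <- 0.119881159345262
se     <- 0.0626479417372227

h2_bias <- r_bias^2                       # bias on the variance (heritability) scale
p_two   <- 2 * (1 - pnorm(r_bias / se))   # two-sided p-value, normal approximation

print(round(c(h2_bias = h2_bias, p = p_two), 4))
```

Squaring the pooled correlation gives the roughly 1.4% bias to \(h^2\) quoted above, and the p-value is the one rounded to 0.06.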

To-Do

  • In the sibling regression method, it is the slope, not the \(r^2\) which corresponds to the heritability.
  • Add details specified above.
  • Add additional MaTCH/CaTCH phenotypes.
  • Add all outcomes from cited studies and estimates on all components (\(C\), \(E\), etc.).
  • Estimates for specific phenotypes and effect of phenotypic heterogeneity discussion (need more data to get more estimates, expand).
  • Expand meta-analysis to allow for moderator testing (for EEA testing and heritability calculation method, age, publication year, country, etc.).

References

Martin, N., Boomsma, D., & Machin, G. (1997). A twin-pronged attack on complex traits. Nature Genetics, 17(4), 387-392. https://doi.org/10.1038/ng1297-387

Bouchard, T. J. (2013). The Wilson Effect: The increase in heritability of IQ with age. Twin Research and Human Genetics: The Official Journal of the International Society for Twin Studies, 16(5), 923-930. https://doi.org/10.1017/thg.2013.54

Chipuer, H. M., Rovine, M. J., & Plomin, R. (1990). LISREL modeling: Genetic and environmental influences on IQ revisited. Intelligence, 14(1), 11-29. https://doi.org/10.1016/0160-2896(90)90011-H

Devlin, B., Daniels, M., & Roeder, K. (1997). The heritability of IQ. Nature, 388(6641), 468-471. https://doi.org/10.1038/41319

Goldberger, A. S. (1979). Heritability. Economica, 46(184), 327-347. https://doi.org/10.2307/2553675

Hugh-Jones, D., Verweij, K. J. H., St. Pourcain, B., & Abdellaoui, A. (2016). Assortative mating on educational attainment leads to genetic spousal resemblance for polygenic scores. Intelligence, 59, 103-108. https://doi.org/10.1016/j.intell.2016.08.005

Polderman, T. J. C., Benyamin, B., de Leeuw, C. A., Sullivan, P. F., van Bochoven, A., Visscher, P. M., & Posthuma, D. (2015). Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nature Genetics, 47(7), 702-709. https://doi.org/10.1038/ng.3285

van Leeuwen, M., van den Berg, S. M., & Boomsma, D. I. (2008). A twin-family study of general IQ. Learning and Individual Differences, 18(1), 76-88. https://doi.org/10.1016/j.lindif.2007.04.006

McGue, M., Keyes, M., Sharma, A., Elkins, I., Legrand, L., Johnson, W., & Iacono, W. G. (2007). The Environments of Adopted and Non-adopted Youth: Evidence on Range Restriction From the Sibling Interaction and Behavior Study (SIBS). Behavior Genetics, 37(3), 449-462. https://doi.org/10.1007/s10519-007-9142-7

Lykken, D. T., McGue, M., Bouchard, T. J., & Tellegen, A. (1990). Does contact lead to similarity or similarity to contact? Behavior Genetics, 20(5), 547-561. https://doi.org/10.1007/BF01065871

Segal, N. L. (2013). Personality similarity in unrelated look-alike pairs: Addressing a twin study challenge. Personality and Individual Differences, 54(1), 23-28. https://doi.org/10.1016/j.paid.2012.07.031

Conley, D., Siegal, M. L., Domingue, B., Harris, K. M., McQueen, M., & Boardman, J. (2014). Testing the Key Assumption of Heritability Estimates Based on Genome-wide Genetic Relatedness. Journal of Human Genetics, 59(6), 342-345. https://doi.org/10.1038/jhg.2014.14

Conley, D., Rauscher, E., Dawes, C., Magnusson, P. K. E., & Siegal, M. L. (2013). Heritability and the Equal Environments Assumption: Evidence from Multiple Samples of Misclassified Twins. Behavior Genetics, 43(5), 415-426. https://doi.org/10.1007/s10519-013-9602-1

Scarr, S., & Weinberg, R. A. (1976). IQ test performance of Black children adopted by White families. American Psychologist, 31(10), 726-739. https://doi.org/10.1037/0003-066X.31.10.726

Rowe, D. C. (2002). IQ, birth weight, and number of sexual partners in White, African American, and mixed race adolescents. Population and Environment: A Journal of Interdisciplinary Studies, 23(6), 513-524. https://doi.org/10.1023/A:1016313718644

Visscher, P. M., Medland, S. E., Ferreira, M. A. R., Morley, K. I., Zhu, G., Cornes, B. K., Montgomery, G. W., & Martin, N. G. (2006). Assumption-Free Estimation of Heritability from Genome-Wide Identity-by-Descent Sharing between Full Siblings. PLOS Genetics, 2(3), e41. https://doi.org/10.1371/journal.pgen.0020041

Tucker-Drob, E. M., & Bates, T. C. (2016). Large Cross-National Differences in Gene × Socioeconomic Status Interaction on Intelligence. Psychological Science, 27(2), 138-149. https://doi.org/10.1177/0956797615612727

Hill, W. G., Goddard, M. E., & Visscher, P. M. (2008). Data and Theory Point to Mainly Additive Genetic Variance for Complex Traits. PLOS Genetics, 4(2), e1000008. https://doi.org/10.1371/journal.pgen.1000008

van den Berg, S. M., Glas, C. A. W., & Boomsma, D. I. (2007). Variance Decomposition Using an IRT Measurement Model. Behavior Genetics, 37(4), 604-616. https://doi.org/10.1007/s10519-007-9156-1

Segal, N. L., McGuire, S. A., Havlena, J., Gill, P., & Hershberger, S. L. (2007). Intellectual similarity of virtual twin pairs: Developmental trends. Personality and Individual Differences, 42(7), 1209-1219. https://doi.org/10.1016/j.paid.2006.09.028

Bouchard, T. (1984). The Hereditarian Research Program: Triumphs and Tribulations. In S. Modgil & C. Modgil (Eds.), Arthur Jensen: Consensus and Controversy. Lewes, Sussex: Falmer Press.

Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46(4), 806-834. https://doi.org/10.1037/0022-006X.46.4.806

Bouchard Jr., T. J. (1982). Identical Twins Reared Apart: Reanalysis or Pseudo-analysis? Contemporary Psychology, 27(3), 190-191. https://doi.org/10.1037/021001

Johnson, W., Bouchard, T. J., McGue, M., Segal, N. L., Tellegen, A., Keyes, M., & Gottesman, I. I. (2007). Genetic and environmental influences on the Verbal-Perceptual-Image Rotation (VPR) model of the structure of mental abilities in the Minnesota study of twins reared apart. Intelligence, 35(6), 542-562. https://doi.org/10.1016/j.intell.2006.10.003

Barnes, J. C., Wright, J. P., Boutwell, B. B., Schwartz, J. A., Connolly, E. J., Nedelec, J. L., & Beaver, K. M. (2014). Demonstrating the Validity of Twin Research in Criminology. Criminology, 52(4), 588-626. https://doi.org/10.1111/1745-9125.12049

Lakhani, C. M., Tierney, B. T., Manrai, A. K., Yang, J., Visscher, P. M., & Patel, C. J. (2019). Repurposing large health insurance claims data to estimate genetic and environmental contributions in 560 phenotypes. Nature Genetics, 51(2), 327-334. https://doi.org/10.1038/s41588-018-0313-7

Klassen, L., Eifler, E. F., Hufer, A., & Riemann, R. (2018). WHY DO PEOPLE DIFFER IN THEIR ACHIEVEMENT MOTIVATION? A NUCLEAR TWIN FAMILY STUDY. Primenjena Psihologija, 11(4), 433-450. https://doi.org/10.19090/pp.2018.4.433-450

Bleidorn, W., Hufer, A., Kandler, C., Hopwood, C. J., & Riemann, R. (2018). A Nuclear Twin Family Study of Self-Esteem. European Journal of Personality, 32(3), 221-232. https://doi.org/10.1002/per.2136

Slawinski, B. L., Klump, K. L., & Burt, S. A. (2019). The etiology of social aggression: A nuclear twin family study. Psychological Medicine, 49(1), 162-169. https://doi.org/10.1017/S0033291718000697

Hufer, A., Kornadt, A. E., Kandler, C., & Riemann, R. (2020). Genetic and environmental variation in political orientation in adolescence and early adulthood: A Nuclear Twin Family analysis. Journal of Personality and Social Psychology, 118(4), 762-776. https://doi.org/10.1037/pspp0000258

Kandler, C., Gottschling, J., & Spinath, F. M. (2016). Genetic and Environmental Parent-Child Transmission of Value Orientations: An Extended Twin Family Study. Child Development, 87(1), 270-284. https://doi.org/10.1111/cdev.12452

Martin, N. G., Eaves, L. J., Heath, A. C., Jardine, R., Feingold, L. M., & Eysenck, H. J. (1986). Transmission of social attitudes. Proceedings of the National Academy of Sciences, 83(12), 4364-4368. https://doi.org/10.1073/pnas.83.12.4364

Eaves, L., Heath, A., Martin, N., Maes, H., Neale, M., Kendler, K., Kirk, K., & Corey, L. (1999). Comparing the biological and cultural inheritance of personality and social attitudes in the Virginia 30 000 study of twins and their relatives. Twin Research and Human Genetics, 2(2), 62-80. https://doi.org/10.1375/twin.2.2.62

Wadsworth, S. J., Corley, R. P., Hewitt, J. K., Plomin, R., & DeFries, J. C. (2002). Parent-offspring resemblance for reading performance at 7, 12 and 16 years of age in the Colorado Adoption Project. Journal of Child Psychology and Psychiatry, 43(6), 769-774. https://doi.org/10.1111/1469-7610.00085

Hatemi, P. K., Hibbing, J. R., Medland, S. E., Keller, M. C., Alford, J. R., Smith, K. B., Martin, N. G., & Eaves, L. J. (2010). Not by Twins Alone: Using the Extended Family Design to Investigate Genetic Influence on Political Beliefs. American Journal of Political Science, 54(3), 798-814. https://doi.org/10.1111/j.1540-5907.2010.00461.x

Boomsma, D. I., Saviouk, V., Hottenga, J.-J., Distel, M. A., de Moor, M. H. M., Vink, J. M., Geels, L. M., van Beek, J. H. D. A., Bartels, M., de Geus, E. J. C., & Willemsen, G. (2010). Genetic Epidemiology of Attention Deficit Hyperactivity Disorder (ADHD Index) in Adults. PLOS ONE, 5(5), e10621. https://doi.org/10.1371/journal.pone.0010621

Faraone, S. V., & Larsson, H. (2019). Genetics of attention deficit hyperactivity disorder. Molecular Psychiatry, 24(4), 562-575. https://doi.org/10.1038/s41380-018-0070-0

Vinkhuyzen, A. A. E., van der Sluis, S., Maes, H. H. M., & Posthuma, D. (2012). Reconsidering the Heritability of Intelligence in Adulthood: Taking Assortative Mating and Cultural Transmission into Account. Behavior Genetics, 42(2), 187-198. https://doi.org/10.1007/s10519-011-9507-9

Kandler, C., Bleidorn, W., & Riemann, R. (2012). Left or right? Sources of political orientation: the roles of genetic factors, cultural transmission, assortative mating, and personality. Journal of Personality and Social Psychology, 102(3), 633-645. https://doi.org/10.1037/a0025560

Lyngstad, T. H., Ystroem, E., & Zambrana, I. M. (2017). An Anatomy of Intergenerational Transmission: Learning from the educational attainments of Norwegian twins and their parents [Preprint]. SocArXiv. https://doi.org/10.31235/osf.io/fby2t

Swagerman, S. C., van Bergen, E., Dolan, C., de Geus, E. J. C., Koenis, M. M. G., Hulshoff Pol, H. E., & Boomsma, D. I. (2017). Genetic transmission of reading ability. Brain and Language, 172, 3-8. https://doi.org/10.1016/j.bandl.2015.07.008

Kornadt, A. E., Hufer, A., Kandler, C., & Riemann, R. (2018). On the genetic and environmental sources of social and political participation in adolescence and early adulthood. PLOS ONE, 13(8), e0202518. https://doi.org/10.1371/journal.pone.0202518

Bell, E., Kandler, C., & Riemann, R. (2018). Genetic and environmental influences on sociopolitical attitudes: Addressing some gaps in the new paradigm. Politics and the Life Sciences, 37(2), 236-249. https://doi.org/10.1017/pls.2018.17

Note

From Galton, F. General impressions are never to be trusted. Ann Eugenics 1925;1:i.

General impressions are never to be trusted. Unfortunately when they are of long standing they become fixed rules of life, and assume a prescriptive right not to be questioned. Consequently those who are not accustomed to original inquiry entertain a hatred and a horror of statistics. They cannot endure the idea of submitting their sacred impressions to cold-blooded verification. But it is the triumph of scientific men to rise superior to such superstitions, to desire tests by which the value of beliefs may be ascertained, and to feel sufficiently masters of themselves to discard contemptuously whatever may be found untrue.

From Galton, F. Hereditary Character and Talent. Macmillan’s Magazine, vol. 12, 1865, pp. 157-166.

Resemblance frequently fails where we might have expected it to hold; but we may fairly ascribe the failure to the influence of conditions that we do not yet comprehend. So long as we have a plenitude of evidence in favour of the hypothesis of the hereditary descent of talent, we need not be disconcerted when negative evidence is brought against us.

Bouchard’s (1984) description of the “basic philosophy” of meta-analysis, which should be adopted for behavior genetics:

  1. No single study in social science research is definitive. Individual studies sample a portion of the universe of cases, partially sample the constructs of theoretical interest, and tend to have other flaws that may, or may not, influence the meaning of the data.

  2. Almost all studies are based on samples (sometimes very small) and, consequently, statistics based on these samples have associated sampling error.

  3. All measures of constructs have error of measurement associated with them.

  4. Empirical studies should be more carefully reported than they have been in the recent past. All basic statistical information should be published (means, standard deviations, correlation matrices, etc.) or made readily available in archival form. [emphasis mine]

  5. All characteristics of a study, including methods of sample selection, age, sex, and other demographic characteristics of the cases should be carefully reported.

  6. Hypotheses regarding biasing factors should be tested systematically and quantitatively, rather than on an ad hoc basis. Reviewers sometimes report conclusions based on ‘good studies’ which, in fact, often represent an a priori selection of studies without demonstrating that any study characteristics are related to outcome.

  7. Compute confidence intervals. The null hypothesis is almost always false and simply a function of statistical power. [emphasis also mine]
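
As a minimal illustration of point 7 (this is not part of Bouchard’s text), one common way to put a confidence interval on a correlation is the Fisher z-transformation. The sketch assumes a Pearson correlation \(r\) from \(n\) independent pairs and approximate bivariate normality; the numbers are hypothetical and not taken from any study cited here.

\[
z = \tanh^{-1}(r) = \tfrac{1}{2}\ln\frac{1+r}{1-r}, \qquad
SE_z = \frac{1}{\sqrt{n-3}}, \qquad
CI_{95\%} = \tanh\left(z \pm 1.96\, SE_z\right)
\]

For \(r = 0.75\) and \(n = 103\) pairs, \(z = \tfrac{1}{2}\ln 7 \approx 0.973\) and \(SE_z = 1/\sqrt{100} = 0.1\), so the interval on the \(z\) scale is roughly \(0.777\) to \(1.169\), which back-transforms to approximately \(0.65\) to \(0.82\) on the correlation scale. The interval is asymmetric around \(r\), and it narrows only with the square root of the number of pairs, which is why small samples say little on their own.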