Setup

Functions

MPH2 <- function(p, h = 0.8){
  MPH <- function(g){1/sqrt(2*pi*h*(1-h))*exp(-(g-(h*p))^2/(2*h*(1-h)))}
  if(p>=0){
    FU <- integrate(MPH, lower = p/2, upper = Inf)} else{
    FU <- integrate(MPH, lower = -Inf, upper = p/2)}
return(FU$value)}

NH2 <- function(h){
iFUN <- function(g, p){exp(-(g-(h*p))^2/(2*h*(1-h)))}
iINT = Vectorize(function(p){return(integrate(iFUN, p/2, Inf, p = p)$value)})
iFUN2 <- function(p){exp(-p^2/2)*iINT(p)}
oINT = integrate(iFUN2, 0, Inf)$value
N = 1/(pi*sqrt(h*(1-h)))*oINT
return(N)}

[MPH2] The probability that genetic factors contributed more than environmental factors to a particular individual's mean deviation:

\[M_p(h^2)=P(G>E)=\begin{cases} & \int_{p/2}^{\infty}\frac{1}{\sqrt{2\pi h^2(1-h^2)}}e^{-(g-h^2p)/2h^2(1-h^2)}dg\text{ if } p\geq 0 \\ & \int_{-\infty}^{p/2}\frac{1}{\sqrt{2\pi h^2(1-h^2)}}e^{-(g-h^2p)/2h^2(1-h^2)}dg\text{ if } p <0 \end{cases}\]

[NH2] The probability that genetic factors contributed more than environmental factors to an arbitrary individual's mean deviation:

\[N(h^2)=\frac{1}{\pi\sqrt{h^2(1-h^2)}}\int_{0}^{\infty}e^{-p^2/2}\left ( \int_{p/2}^{\infty}e^{-(g-h^2p)^2/2h^2(1-h^2)} dg\right )dp\]

[Not Done Yet] The probability that genetic factors contributed more than environmental factors averaged over p with gene-environment correlation and interaction included:

\[\bar{M}_{\rho,\gamma^2}(h^2) = P(|G| > |E|) = E(M(p))=\\1-\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{\frac{-p^2}{2}}\times\\\left ( \int_{-\infty}^{0} \int_{g}^{-g}\frac{1}{2\pi\sigma_1\sigma_2\sqrt{(1-\rho_{12}^2)}}e^{\frac{-1}{2*(1-\rho_{12}^2)}\left [\left ( \frac{g-\mu_1}{\sigma_1} \right )^2 - 2\rho_{12}\left ( \frac{g-\mu_1}{\sigma_1} \right ) \left ( \frac{g-\mu_2}{\sigma_2} \right ) + \left ( \frac{g-\mu_2}{\sigma_2} \right )\right ]}dedg + \int_{0}^{\infty} \int_{-g}^{g}\frac{1}{2\pi\sigma_1\sigma_2\sqrt{(1-\rho_{12}^2)}}e^{\frac{-1}{2*(1-\rho_{12}^2)}\left [\left ( \frac{g-\mu_1}{\sigma_1} \right )^2 - 2\rho_{12}\left ( \frac{g-\mu_1}{\sigma_1} \right ) \left ( \frac{g-\mu_2}{\sigma_2} \right ) + \left ( \frac{g-\mu_2}{\sigma_2} \right )\right ]}dedg \right )dp\]

With the pdf abbreviated as F(g, e), this should reduce to

\[\left ( \int_{-\infty}^{\infty} \int_{-g}^{g} F(g, e)de dg \right ) \times \int_{-\infty}^{\infty}e^{\frac{-p^2}{2}}dp\]

thanks to the additivity of integrals. Unfortunately, when I attempted to implement this, it did not work in R; I'm probably doing something wrong, but R gives absolutely no hint as to what that might be and it has the annoying habit of overloading numeric literals to implicitly convert to vectorized form for arguments expecting that and, because that happens invisibly, it looks like the function just accepts numbers as inputs when, in fact, it does not, and as soon as you try to pass an actual numeric variable, it breaks because it won't coerce the numeric variable into a vector form implicitly. It was trivial to implement in Python but it just never seemed to work here. Integrals are needlessly complicated in R. I also did not implement the same formula without interaction and the formula for a particular individual in R.

Rationale

Tal (2009) gave formulae for relating individual heritability estimates to the probabilities that an arbitrary or particular individual's phenotypic values were "in all likelihood" more so a result of genetic than environmental effects. In a later publication (Tal, 2012), this was extended to include gene-environment correlation and interaction. I've given functions for the former two above and may add the others later. I've also verified the outputs with the numbers used by Tal and reproduced his graphs.

Examples, Graphs

#Test values
MPH2(0.4); MPH2(1.33); NH2(0.8)

## [1] 0.6179114

## [1] 0.8407391

## [1] 0.7048328

g <- Vectorize(NH2); x <- seq(0.001, 0.999, 0.001)
adf <- data.frame("x" = x, "gx" = g(x))
ggplot(adf, aes(x = x, y = gx)) + geom_smooth(color = "steelblue4", formula = 'y ~ x', method = 'loess') + labs(x = "Heritability", y = "Probability A > E") + theme_bw() + theme(legend.position = "none", text = element_text(size = 12, family = "serif"), plot.title = element_text(hjust = 0.5)) + ylim(0, 1)

f <- Vectorize(MPH2); x <- seq(0.001, 0.999, 0.001)
bdf <- data.frame("x" = x, "fx1" = f(1.178, x), "fx2" = f(0.5, x), "fx3" = f(3, x), "fx4" = f(0.8, x), "fx5" = f(2, x), "fx6" = f(0.2, x), "fx7" = f(1, x), "fx8" = f(0, x))
ggplot(bdf, aes(x = x, y = fx1)) + labs(x = "Heritability", y = "Probability A > E | p") + theme_bw() + theme(legend.position = "none", text = element_text(size = 12, family = "serif"), plot.title = element_text(hjust = 0.5)) + ylim(0, 1) + xlim(0, 1) + 
  geom_smooth(color = "orangered1", formula = 'y ~ x', method = 'loess') +
  geom_smooth(aes(x = x, y = fx2), color = "green", formula = 'y ~ x', method = 'loess') +
  geom_smooth(aes(x = x, y = fx3), color = "purple", formula = 'y ~ x', method = 'loess') +
  geom_smooth(aes(x = x, y = fx4), color = "steelblue", formula = 'y ~ x', method = 'loess') +
  geom_smooth(aes(x = x, y = fx5), color = "gold1", formula = 'y ~ x', method = 'loess') +
  geom_smooth(aes(x = x, y = fx6), color = "cyan", formula = 'y ~ x', method = 'loess') +
  geom_smooth(aes(x = x, y = fx7), color = "darkblue", formula = 'y ~ x', method = 'loess') + 
  geom_smooth(aes(x = x, y = fx8), color = "gray", formula = 'y ~ x', method = 'loess')

## Warning: Removed 7 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 28 rows containing non-finite values (stat_smooth).

## Warning: Removed 3 rows containing non-finite values (stat_smooth).

## Warning: Removed 8 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 6 rows containing missing values (geom_smooth).

## Warning: Removed 19 rows containing missing values (geom_smooth).

## Warning: Removed 2 rows containing missing values (geom_smooth).

## Warning: Removed 13 rows containing missing values (geom_smooth).

## Warning: Removed 5 rows containing missing values (geom_smooth).

Discussion

These are an interesting proof that heritability estimates contain some information about the causes of phenotypic variance. As shown in Tal (2012), this applies even with G \(\times\) E and rGE. Note that this, definitionally, does not inform about conditions such as idiopathic intellectual disability (Galasso et al., 2010). Because the low end of the cognitive bell curve does seem to be, to some degree, discontinuous, that should always be kept in mind. The same is not true for the top-end, as registry, consortium (GHCA in particular), and extremes studies have shown (Haworth et al., 2009; Shakeshaft et al., 2015; Spain et al., 2016). In the early part of the last century, it was common to use regression to the mean to support continuities and discontinuities for normal and idiopathic disability (see, e.g., Urbach, 1974, pp. 129-130), and in the former case, just for ability.

References

Tal, O. (2009). From heritability to probability. Biology & Philosophy, 24(1), 81-105. https://doi.org/10.1007/s10539-008-9129-7

Tal, O. (2012). The Impact of Gene-Environment Interaction and Correlation on the Interpretation of Heritability. Acta Biotheoretica, 60(3), 225-237. https://doi.org/10.1007/s10441-011-9139-8

Galasso, C., Lo-Castro, A., El-Malhany, N., & Curatolo, P. (2010). "Idiopathic" mental retardation and new chromosomal abnormalities. Italian Journal of Pediatrics, 36, 17. https://doi.org/10.1186/1824-7288-36-17

Haworth, C. M. A., Wright, M. J., Martin, N. W., Martin, N. G., Boomsma, D. I., Bartels, M., Posthuma, D., Davis, O. S. P., Brant, A. M., Corley, R. P., Hewitt, J. K., Iacono, W. G., McGue, M., Thompson, L. A., Hart, S. A., Petrill, S. A., Lubinski, D., & Plomin, R. (2009). A Twin Study of the Genetics of High Cognitive Ability Selected from 11,000 Twin Pairs in Six Studies from Four Countries. Behavior Genetics, 39(4), 359-370. https://doi.org/10.1007/s10519-009-9262-3

Shakeshaft, N. G., Trzaskowski, M., McMillan, A., Krapohl, E., Simpson, M. A., Reichenberg, A., Cederloef, M., Larsson, H., Lichtenstein, P., & Plomin, R. (2015). Thinking positively: The genetics of high intelligence. Intelligence, 48, 123-132. https://doi.org/10.1016/j.intell.2014.11.005

Spain, S. L., Pedroso, I., Kadeva, N., Miller, M. B., Iacono, W. G., McGue, M., Stergiakouli, E., Smith, G. D., Putallaz, M., Lubinski, D., Meaburn, E. L., Plomin, R., & Simpson, M. A. (2016). A genome-wide analysis of putative functional and exonic variation associated with extremely high intelligence. Molecular Psychiatry, 21(8), 1145-1151. https://doi.org/10.1038/mp.2015.108

Urbach, P. (1974). Progress and Degeneration in the 'IQ Debate' (I). The British Journal for the Philosophy of Science, 25(2), 99-135. https://doi.org/10.1093/bjps/25.2.99

Probability from Heritability

Jay Lasker

August 14, 2020