- Multivariate models
- Bayesian methods
- Machine learning
- G x E interactions
October 26, 2018
Mixed models also enable us to evaluate multiple traits:
The model preserves the same formulation:
\[y = Xb+Zu+e\] However, we now stack the traits together:
\(y=\{y_1,y_2,...,y_k\}\), \(X=\{X_1, X_2, ... , X_k\}'\), \(b=\{b_1,b_2,...,b_k\}\), \(Z=\{Z_1, Z_2, ... , Z_k\}'\), \(u=\{u_1,u_2,...,u_k\}\), \(e=\{e_1,e_2,...,e_k\}\).
The multivariate variance looks nice at first \[ Var(y) = Var(u) + Var(e) \]
But can get ugly with a closer look:
\[ Var(u) = Z ( G \otimes \Sigma_a ) Z' = \left[\begin{array}{rr} Z_1 G Z_1'\sigma^2_{a_1} & Z_1 G Z_2'\sigma_{a_{12}} \\ Z_2 G Z_1' \sigma_{a_{21}} & Z_2 G Z_2'\sigma^2_{a_2} \end{array}\right] \]
and
\[ Var(e) = R \otimes \Sigma_e = \left[\begin{array}{rr} R\sigma^2_{e_1} & R\sigma_{e_1e_2} \\ R\sigma_{e_2e_1} & R\sigma^2_{e_2} \end{array}\right] \]
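To see the block structure concretely, here is a tiny sketch with toy numbers (two traits, three records, \(R\) taken as an identity):
Se = matrix(c(1.0, 0.3, 0.3, 0.8), 2, 2)   # assumed residual (co)variances for two traits
R  = diag(3)                               # residual structure among 3 records (toy size)
Ve = kronecker(Se, R)                      # 6x6 Var(e) with blocks R*s2_e1, R*s_e12, ..., as displayed above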
You can still think of the multivariate mixed model as
\[y=Wg+e\]
Where
\[ y = \left[\begin{array}{rr} y_1 \\ y_2 \end{array}\right], W = \left[\begin{array}{rr} X_1 & 0 & Z_1 & 0 \\ 0 & X_2 & 0 & Z_2 \\ \end{array}\right], g = \left[\begin{array}{rr} b_1 \\ b_2 \\ u_1 \\ u_2 \end{array}\right], e = \left[\begin{array}{rr} e_1 \\ e_2 \end{array}\right] \]
Left-hand side (\(W'R^{-1}W+\Sigma\)):
\[ \left[\begin{array}{cccc} X_1'X_1\Sigma^{-1}_{e_{11}} & X_1'X_2\Sigma^{-1}_{e_{12}} & X_1'Z_1\Sigma^{-1}_{e_{11}} & X_1'Z_2\Sigma^{-1}_{e_{12}}\\ X_2'X_1\Sigma^{-1}_{e_{12}} & X_2'X_2\Sigma^{-1}_{e_{22}} & X_2'Z_1\Sigma^{-1}_{e_{12}} & X_2'Z_2\Sigma^{-1}_{e_{22}}\\ Z_1'X_1\Sigma^{-1}_{e_{11}} & Z_1'X_2\Sigma^{-1}_{e_{12}} & G^{-1}\Sigma^{-1}_{a_{11}} + Z_1'Z_1\Sigma^{-1}_{e_{11}} & G^{-1}\Sigma^{-1}_{a_{12}} + Z_1'Z_2\Sigma^{-1}_{e_{12}}\\ Z_2'X_1\Sigma^{-1}_{e_{12}} & Z_2'X_2\Sigma^{-1}_{e_{22}} & G^{-1}\Sigma^{-1}_{a_{12}} + Z_2'Z_1\Sigma^{-1}_{e_{12}} & G^{-1}\Sigma^{-1}_{a_{22}} + Z_2'Z_2\Sigma^{-1}_{e_{22}}\\ \end{array}\right] \]
Right-hand side (\(W'R^{-1}y\)):
\[ \left[\begin{array}{c} X_1'(y_1\Sigma^{-1}_{e_{11}}+y_2\Sigma^{-1}_{e_{12}}) \\ X_2'(y_1\Sigma^{-1}_{e_{12}}+y_2\Sigma^{-1}_{e_{22}}) \\ Z_1'(y_1\Sigma^{-1}_{e_{11}}+y_2\Sigma^{-1}_{e_{12}}) \\ Z_2'(y_1\Sigma^{-1}_{e_{12}}+y_2\Sigma^{-1}_{e_{22}}) \\ \end{array}\right] \]
Here \(\Sigma^{-1}_{e_{ij}}\) and \(\Sigma^{-1}_{a_{ij}}\) denote the \(ij\) elements of \(\Sigma_e^{-1}\) and \(\Sigma_a^{-1}\).
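To make the stacked notation concrete, here is a toy sketch (simulated numbers and hypothetical object names) that builds \(W\), the stacked \(y\), and solves the two-trait mixed model equations directly, assuming the same \(X\) and \(Z\) for both traits and known \(\Sigma_a\) and \(\Sigma_e\):
n = 50; q = 10                                             # toy sizes: records per trait, genotypes
X  = matrix(1, n, 1)                                       # intercept only
Z  = diag(q)[sample(q, n, replace = TRUE), ]               # incidence of records on genotypes
G  = diag(q)                                               # stand-in relationship matrix
Sa = matrix(c(1, .5, .5, 1), 2, 2)                         # assumed genetic (co)variance between traits
Se = matrix(c(1, .2, .2, 1), 2, 2)                         # assumed residual (co)variance between traits
y  = rnorm(2*n)                                            # stacked phenotypes (y1 then y2), simulated
W  = cbind(kronecker(diag(2), X), kronecker(diag(2), Z))   # [X1 0 Z1 0; 0 X2 0 Z2]
iR  = kronecker(solve(Se), diag(n))                        # Var(e)^-1: blocks I times elements of Se^-1
iGS = kronecker(solve(Sa), solve(G))                       # Var(u)^-1: blocks G^-1 times elements of Sa^-1
Sig = as.matrix(Matrix::bdiag(matrix(0, 2, 2), iGS))       # zeros for b, Var(u)^-1 for u
sol = solve(t(W) %*% iR %*% W + Sig, t(W) %*% iR %*% y)    # g = [b1, b2, u1, u2]
Below, the same kind of multivariate fit is obtained by REML on the wheat data.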
data(wheat, package = 'BGLR')                       # wheat data: markers (wheat.X) and yield in four environments (wheat.Y)
G = NAM::GRM(wheat.X)                               # genomic relationship matrix
Y = wheat.Y; colnames(Y) = c('E1','E2','E3','E4')   # treat each environment as a trait
mmm = NAM::reml( y = Y, K = G )                     # multivariate REML fit
knitr::kable( round(mmm$VC$GenCor,2) )              # genetic correlations among environments
|  | E1 | E2 | E3 | E4 |
|---|---|---|---|---|
| E1 | 1.00 | -0.25 | -0.22 | -0.50 |
| E2 | -0.25 | 1.00 | 0.96 | 0.55 |
| E3 | -0.22 | 0.96 | 1.00 | 0.72 |
| E4 | -0.50 | 0.55 | 0.72 | 1.00 |
mmm$VC$Vg
##            E1         E2         E3         E4
## E1  0.6277835 -0.1446924 -0.1102175 -0.2743640
## E2 -0.1446924  0.5440731  0.4419945  0.2822577
## E3 -0.1102175  0.4419945  0.3919626  0.3130735
## E4 -0.2743640  0.2822577  0.3130735  0.4828705
mmm$VC$Ve
##             E1         E2         E3         E4
## E1  0.53504246 0.08247812 -0.1159118 0.06882868
## E2  0.08247812 0.56214755  0.2973841 0.15795801
## E3 -0.11591175 0.29738408  0.6714234 0.11086214
## E4  0.06882868 0.15795801  0.1108621 0.59405228
biplot(princomp(mmm$VC$GenCor,cor=T),xlim=c(-.4,.4),ylim=c(-.11,.11))
The general framework of a hierarchical Bayesian model follows:
\[p(\theta|x)\propto p(x|\theta) p(\theta)\]
Where \(p(\theta|x)\) is the posterior, \(p(x|\theta)\) is the likelihood, and \(p(\theta)\) is the prior.
For the model:
\[y=Xb+Zu+e, \space\space\space u\sim N(0,K\sigma^2_a), \space e \sim N(0,I\sigma^2_e)\]
Probabilistic model:
\[p(b,u,\sigma^2_a,\sigma^2_e|y,X,Z,K) \propto N(y|X,Z,K,b,u,\sigma^2_a,\sigma^2_e) \times N(b,u|\sigma^2_a,\sigma^2_e) \times \chi^{-2}(\sigma^2_a,\sigma^2_e|S_a,S_e,\nu_a,\nu_e)\]
REML: the priors \((S_a,S_e,\nu_a,\nu_e)\) are estimated from data.
Hierarchical Bayes: you provide the priors. Here is how the variance components are sampled:
\[\sigma^2_a=\frac {u'K^{-1}u+S_a\nu_a} {\chi^2(q+\nu_a)}\]
sigma2a=(t(u)%*%iK%*%u+Sa*dfa)/rchisq(df=ncol(Z)+dfa,n=1)   # iK = K^{-1}, dfa = nu_a, q = ncol(Z)
\[\sigma^2_e=\frac {e'e+S_e\nu_e} {\chi^2(n+\nu_e)}\]
sigma2e=(t(e)%*%e+Se*dfe)/rchisq(df=length(y)+dfe,n=1)      # dfe = nu_e, n = length(y)
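Putting the two conditional updates above into a complete sampler, here is a minimal base-R sketch (gibbs_sketch and its argument names are hypothetical; MASS::mvrnorm is assumed available for the multivariate normal draws):
gibbs_sketch = function(y, X, Z, K, Sa, Se, dfa = 5, dfe = 5, iter = 1000){
  n = length(y); q = ncol(Z); iK = solve(K)
  b = rep(0, ncol(X)); u = rep(0, q); sigma2a = Sa; sigma2e = Se
  out = matrix(NA, iter, 2, dimnames = list(NULL, c('Va','Ve')))
  for(i in 1:iter){
    # fixed effects: b | . ~ N( (X'X)^-1 X'(y - Zu), (X'X)^-1 sigma2e )
    XtX = crossprod(X)
    b = MASS::mvrnorm(1, solve(XtX, crossprod(X, y - Z %*% u)), solve(XtX) * sigma2e)
    # random effects: u | . ~ N( C^-1 Z'(y - Xb), C^-1 sigma2e ), with C = Z'Z + iK * sigma2e/sigma2a
    iC = solve(crossprod(Z) + iK * (sigma2e / sigma2a))
    u = MASS::mvrnorm(1, iC %*% crossprod(Z, y - X %*% b), iC * sigma2e)
    # variance components from the scaled inverse chi-square updates shown above
    sigma2a = c(t(u) %*% iK %*% u + Sa * dfa) / rchisq(1, df = q + dfa)
    e = y - X %*% b - Z %*% u
    sigma2e = c(crossprod(e) + Se * dfe) / rchisq(1, df = n + dfe)
    out[i,] = c(sigma2a, sigma2e)
  }
  out
}
In practice, samplers such as NAM::gibbs (used later in the RKHS example) and BGLR play this role in compiled, much faster code.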
What does this mean for you? Suppose your prior knowledge tells you that a given trait has approximately \(h^2=0.5\) (nothing unreasonable): half of the phenotypic variance is due to genetics and the other half is due to error. Your prior shape is then:
\[S_a = S_e= \sigma^2_y\times0.5\]
We usually assign a small number of prior degrees of freedom, something like four or five. Assuming \(\nu_0=5\), you are giving the model the equivalent of 5 data points that support a heritability of 0.5:
\[\nu_a=\nu_e=5\]
Example of prior influence: in a dataset with 300 data points, 1.6% of the variance-component information comes from the prior (5/305) and 98.4% comes from the data (300/305).
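In code, the elicitation above is just a couple of lines; a minimal sketch reusing the wheat phenotypes loaded earlier (the sample variance stands in for \(\sigma^2_y\)):
Sa = Se = var(rowMeans(Y)) * 0.5   # prior shapes consistent with h2 = 0.5
nu_a = nu_e = 5                    # small prior degrees of freedom (~5 pseudo-observations)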
For whole-genome regression models
\[y=\mu+Ma+e, \space\space\space a\sim N(0,I\sigma^2_b), \space e\sim N(0,I\sigma^2_e)\]
We scale the prior genetic variance based on allele frequencies
\[ S_b = \frac{\sigma^2_y\times0.5}{2 \sum{p_j(1-p_j)} }\]
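As a minimal sketch, assuming a hypothetical marker matrix M coded 0/1/2 and a phenotype vector y:
p  = colMeans(M) / 2                           # allele frequencies from 0/1/2 genotypes (M is hypothetical)
Sb = (var(y) * 0.5) / (2 * sum(p * (1 - p)))   # prior shape for the marker-effect variance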
Two common settings:
L1-L2 machines encompass all the mixed and Bayesian models we have seen so far. The basic framework is driven by a single (random) term model:
\[y=Xb+e\]
The univariate solution indicates how the model is solved. A model without regularization yields the least squares (LS) solution. If we regularize by deflating the numerator, we get the L1 regularization (LASSO). If we regularize by inflating the denominator, we get the L2 regularization (Ridge). For any combination of both, we get an elastic net (EN). Thus:
\[b_{LS}=\frac{x'y}{x'x}, \space\space b_{Lasso}=\frac{x'y-\lambda}{x'x}, \space\space b_{Ridge}=\frac{x'y}{x'x+\lambda}, \space\space b_{EN}=\frac{x'y-\lambda_1}{x'x+\lambda_2}\]
Whereas the Bayesian and mixed model framework resolves the regularization as \(\lambda=\sigma^2_e/\sigma^2_b\), ML methods search for \(\lambda\) through (k-fold) cross-validation.
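Written out in R, the four univariate estimators above are one-liners (x is a single centered predictor, y the response, and the lambdas are assumed given; practical LASSO solvers use the soft-thresholded version of this update, shrinking toward zero and truncating at zero):
b_ls    = function(x, y)         sum(x*y) / sum(x*x)
b_lasso = function(x, y, l)      (sum(x*y) - l) / sum(x*x)
b_ridge = function(x, y, l)      sum(x*y) / (sum(x*x) + l)
b_en    = function(x, y, l1, l2) (sum(x*y) - l1) / (sum(x*x) + l2)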
Common loss functions in L1-L2 machines
Other losses that are less popular
Cross-validation to search for the best value of lambda:
lasso = glmnet::cv.glmnet(x=wheat.X, y=rowMeans(Y), alpha=1)
ridge = glmnet::cv.glmnet(x=wheat.X, y=rowMeans(Y), alpha=0)
par(mfrow=c(1,2)); plot(ridge); plot(lasso)
Re-fit the model using this best value
lasso = glmnet::glmnet(x=wheat.X, y=rowMeans(Y), lambda=lasso$lambda.min, alpha=1)
ridge = glmnet::glmnet(x=wheat.X, y=rowMeans(Y), lambda=ridge$lambda.min, alpha=0)
par(mfrow=c(1,2)); plot(lasso$beta, main='LASSO'); plot(ridge$beta, main='RIDGE')
Of course, the losses presented above are not limited to prediction and classification. Below we see an example of deploying the LASSO for a graphical model (Markov random field), showing how the traits of the multivariate model relate in terms of additive genetics:
ADJ = huge::huge(mmm$VC$GenCor, .3, method='glasso', verbose=F)$path[[1]]
plot(igraph::graph.adjacency(adjmatrix=ADJ), vertex.label.cex=3)
Reproducing kernel Hilbert spaces (RKHS) regression is a generalization of GBLUP. Instead of the linear kernel (\(ZZ'\alpha\)), RKHS commonly uses one or more Gaussian or exponential kernels:
\[K=\exp(-\theta D^2)\]
Where \(D^2\) is the squared Euclidean distance and \(\theta\) is a bandwidth parameter, which can be set, for example, from a quantile of the squared distances:
q = quantile(D2, 0.05)
We can use REML, PRESS (= cross-validation), or a Bayesian approach to solve the RKHS model:
# Make the kernel
D2 = as.matrix(dist(wheat.X)^2)
K = exp(-D2/mean(D2))
# Below we are going to calibrate models on Env 2 and predict Env 3
rkhs_press = NAM::press(y=Y[,2],K=K)$hat
rkhs_reml = NAM::reml(y=Y[,2],K=K)$EBV
rkhs_bgs = NAM::gibbs(y=Y[,2],iK=solve(K))$Fit.mean
par(mfrow=c(1,3))
plot(rkhs_press,Y[,3],main=paste('PRESS, cor =',round(cor(rkhs_press,Y[,3]),4) ))
plot(rkhs_reml,Y[,3],main=paste('REML, cor =',round(cor(rkhs_reml,Y[,3]),4) ))
plot(rkhs_bgs,Y[,3],main=paste('Bayes, cor =',round(cor(rkhs_bgs,Y[,3]),4) ))
RKHS for epistasis and variance component analysis
Ks = NAM::G2A_Kernels(wheat.X) # Get all sorts of linear kernels
FIT = BGLR::BGLR(rowMeans(Y),verbose=FALSE,
ETA=list(A=list(K=Ks$A,model='RKHS'), AA=list(K=Ks$AA,model='RKHS')))  # additive and additive-by-additive kernels
pie(c(Va=FIT$ETA$A$varU,Vaa=FIT$ETA$AA$varU,Ve=FIT$varE),main='Epistasis')
For the same task (calibrating on E2 to predict E3), let's check some members of the Bayesian alphabet:
fit_BRR = bWGR::wgr(Y[,2],wheat.X); cor(c(fit_BRR$hat),Y[,3])
## [1] 0.58123
fit_BayesB = bWGR::BayesB(Y[,2],wheat.X); cor(fit_BayesB$hat,Y[,3])
## [1] 0.5345673
fit_emBayesA = bWGR::emBA(Y[,2],wheat.X); cor(fit_emBayesA$hat,Y[,3])
## [1] 0.6388318
Tree regression and classifiers
fit_tree = party::ctree(y~.,data.frame(y=Y[,2],wheat.X)); plot(fit_tree)
cor(c(fit_tree@predict_response()),Y[,3])
## [1] 0.265622
fit_rf = ranger::ranger(y~., data.frame(y=Y[,2], wheat.X))
plot(fit_rf$predictions, Y[,3], xlab='RF predictions from E2', ylab='Yield E3', pch=20)
cor(fit_rf$predictions,Y[,3])
## [1] 0.4022794
# Stack the four environments; Zge is the block-diagonal marker matrix for GxE effects
y = as.vector(wheat.Y); Z = wheat.X; Zge = as.matrix(Matrix::bdiag(Z,Z,Z,Z))
fit_g   = bWGR::BayesRR(rowMeans(wheat.Y), Z)     # G model: average performance across environments
fit_ge  = bWGR::BayesRR(y, Zge)                   # GE model: environment-specific marker effects
fit_gge = bWGR::BayesRR2(y, rbind(Z,Z,Z,Z), Zge)  # G+GE model: main effects plus GxE deviations
fit_g$h2
## [1] 0.4534988
fit_ge$h2
## [1] 0.68082
fit_gge$h2
## [1] 0.6715808
GE1 = matrix(fit_ge$hat, ncol=4)
GE2 = matrix(fit_gge$hat, ncol=4)
plot(data.frame(G=fit_g$hat, GE=rowMeans(GE1), G_and_GE=rowMeans(GE2)), main='GEBV across E')
par(mfrow=c(1,3))
plot(fit_g$hat, rowMeans(Y), main=round(cor(fit_g$hat,rowMeans(Y)),4), xlab='G model', ylab='Observed', pch=20)
plot(c(GE1), y, main=round(cor(c(GE1),y),4), xlab='GE model', ylab='Observed', pch=20)
plot(c(GE2), y, main=round(cor(c(GE2),y),4), xlab='G+GE model', ylab='Observed', pch=20)