print(Sys.time())
## [1] "2021-04-02 17:23:36 CST"
\(y|\mathbf{\beta},e=\mathbf{X}\beta+e\)
\(e|\sigma^2_e \sim N(0,\sigma^2_eI_N)\)
\(\beta|\sigma^2_g \sim N(0,\frac{\sigma^2_g}{M}I_M)\)
\(y\) is centered.
\[ \begin{align} cov(y)&=\mathbb{E}(yy^T)-\mathbb{E}(y)\mathbb{E}(y)^T\\ &=\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N \end{align} \]
The target to be minimized is
\(Q=argmin_{(\sigma^2_a,\sigma^2_e)}|| yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)||_F^2\)
Here the subscript \(_{F}\) denotes the Frobenius norm of a matrix \(\mathbf{A}\), \(||\mathbf{A}||_F=\sqrt{tr(\mathbf{AA}^T)}\).
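A one-line numeric check of this identity (a sketch with an arbitrary matrix; `norm(, "F")` is base R):

A=matrix(rnorm(12), 3, 4)
print(c(norm(A, "F"), sqrt(sum(diag(A%*%t(A))))))  # the two should match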
\[ \begin{align} Q&=argmin_{(\sigma^2_a,\sigma^2_e)}|| yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)||_F^2\\ &=tr\{[yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)][yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)]^T\} \end{align} \]
Minimizing \(Q\) by ordinary least squares (OLS) leads to the equations below
\[ \left\{ \begin{array}{lll} \frac{\partial{Q}}{\partial{\sigma^2_a}} & =tr\{\frac{yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)^2}{\partial{\sigma^2_a}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_a\mathbf{\Omega}_a^T+\sigma^2_e\mathbf{\Omega}_a-yy^T\mathbf{\Omega}_a\}=0\\ \frac{\partial{Q}}{\partial{\sigma^2_e}} & =tr\{\frac{yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)^2}{\partial{\sigma^2_e}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_a+\sigma^2_e\mathbf{I}_N-yy^T\mathbf{I}_N\}=0 \end{array} \right. \]
\[ \left\{ \begin{array}{lll} tr\{\sigma^2_a\mathbf{\Omega}_a\mathbf{\Omega}_a^T+\sigma^2_e\mathbf{\Omega}_a\}=tr\{yy^T\mathbf{\Omega}_a\}=tr\{y^T\mathbf{\Omega}_ay\}\\ tr\{\sigma^2_a\mathbf{\Omega}_a+\sigma^2_e\mathbf{I}_N\}=tr\{yy^T\mathbf{I}_N\}=tr\{y^T\mathbf{I}_Ny\} \end{array} \right. \] The last step uses the cyclic property of the trace.
The above result can be organized into the normal equation below
\[ \begin{bmatrix} tr[\mathbf{\Omega}_a\mathbf{\Omega}_a^T] & tr[\mathbf{\Omega}_a] \\ tr[\mathbf{\Omega}_a] & N \end{bmatrix} \left[ \begin{array}{c} \hat{\sigma}^2_a \\ \hat{\sigma}^2_e \end{array} \right] = \left[ \begin{array}{c} tr[y^T\mathbf{\Omega}_ay] \\ tr[y^TI_Ny] \end{array} \right] \]
It is easy to find that
\[\color{green}{\boxed{\hat{\sigma}^2_a=\frac{y^T(\mathbf{\Omega}_a-I_N)y}{tr[\mathbf{\Omega}_a\mathbf{\Omega}_a^T]-N}}}\]
Note: \(y^T(\mathbf{\Omega}_a-I_N)y\) is a quadratic form (wiki link).
\(var[y^T(\mathbf{\Omega}_a-I_N)y]=2tr[H(\mathbf{\Omega}_a-I_N)H(\mathbf{\Omega}_a-I_N)]\)
in which \(H\) can be replaced by the expected correlation matrix for \(y\), which here is \(I_N\) for unrelated individuals. So,
\(var[y^T(\mathbf{\Omega}_a-I_N)y]=2tr[(\mathbf{\Omega}_a-I_N)(\mathbf{\Omega}_a-I_N)]=2tr[\mathbf{\Omega}^2_a-2\mathbf{\Omega}_aI_N+I_N^2]=2[tr(\mathbf{\Omega}_a^2)-N]=2N^2/M_e\). In addition, \(E[y^T(\mathbf{\Omega}_a-I_N)y]=[tr(\mathbf{\Omega}_a^2)-N]\sigma^2_a\).
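As a sanity check, the sketch below (all names ad hoc; genotypes and a phenotype are freshly simulated under the model above) solves the \(2\times2\) normal equation numerically and compares it with the boxed closed form; the two agree closely since \(tr(\mathbf{\Omega}_a)\approx N\).

set.seed(42)
N=100
M=2000
X=matrix(rbinom(N*M, 2, 0.5), N, M)
S=apply(X, 2, scale)
Omega=S%*%t(S)/M
h2=0.5
y=scale(S%*%rnorm(M, 0, sqrt(h2/M))+rnorm(N, 0, sqrt(1-h2)))
LHS=matrix(c(sum(Omega*Omega), sum(diag(Omega)), sum(diag(Omega)), N), 2, 2)
RHS=c(t(y)%*%Omega%*%y, t(y)%*%y)
print(solve(LHS, RHS))  # (sigma2_a, sigma2_e) from the normal equation
print((t(y)%*%(Omega-diag(N))%*%y)/(sum(Omega*Omega)-N))  # boxed closed form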
sourceCpp("~/git/Notes/R/RLib/Shotgun.cpp")
M=10000
N=200
SIMU=50
frq=runif(M, 0.2, 0.8)
Dp=c(runif(M/2, -0.95, -0.5), runif(M/2, 0.5, 0.95))
Dp=Dp[1:(M-1)]
Amat=matrix(0, SIMU, 2)
gMat=GenerateGenoDprimeRcpp(frq, N=N, Dp = Dp)
sG=scale(gMat)
K=sG%*%t(sG)/(M-1)
Me=1/var(K[row(K)<col(K)])
I=diag(1, N, N)
KI=K-I
for(i in 1:SIMU) {
y=scale(rnorm(N)) # null phenotype
yy=y%*%t(y)
yy_KI=yy%*%KI
Amat[i,1]=t(y)%*%KI%*%y # quadratic form y^T(K-I)y
ss=I%*%KI%*%I%*%KI # H(K-I)H(K-I) with H=I for unrelated individuals
# ss=KI%*%KI
Amat[i,2]=2*sum(diag(ss)) # theoretical variance 2tr[H(K-I)H(K-I)]
}
print(paste("Obs mean:", mean(Amat[,1]), " vs expected mean 0"))
## [1] "Obs mean: -0.496725938666893 vs expected mean 0"
print(paste("Obs var:", var(Amat[,1]), "vs expected var:", mean(Amat[,2]), "vs expected var (eq2):", N^2/Me*2))
## [1] "Obs var: 16.4799793466223 vs expected var: 15.7523433069965 vs expected var (eq2): 13.74480721206"
n=500 #sample size
m=1000 #marker
h2=0.3 #heritability
b=rnorm(m, 0, sqrt(h2/m)) #effect
SIMU=5
#simu g
fq=runif(m, 0.1, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
x[,i]=rbinom(n, 2, fq[i])
}
sx=apply(x, 2, scale)
K=sx%*%t(sx)/m
me=var(K[col(K)<row(K)])
#simu y
H2=matrix(0, SIMU, 2)
for(i in 1:SIMU) {
y=sx%*%b+rnorm(n, 0, sqrt(1-h2))
yy=y%*%t(y)
h2Mod=lm(yy[col(yy)<row(yy)]~K[col(yy)<row(yy)]) # HE regression on the off-diagonal pairs
H2[i,1]=summary(h2Mod)$coefficients[2,1]
ss=matrix(0, m, 5)
for(j in 1:m) {
mod=lm(y~x[,j])
ss[j,1:4]=summary(mod)$coefficient[2,]
}
ss[,5]=ss[,3]^2
H2[i,2]=((mean(ss[,5])-1)*n)/(n*n*me) # summary-statistics h2: (mean chi2-1)*n/(n^2/Me), with me=1/Me
}
barplot(t(H2), beside = T)
abline(h=h2)
\(y|\mathbf{\beta},e=\mathbf{X}_a\beta_a+\mathbf{X}_d\beta_d+e\)
\(e|\sigma^2_e \sim N(0,\sigma^2_eI_N)\)
\(\beta_a|\sigma^2_a \sim N(0,\frac{\sigma^2_a}{M}I_M)\)
\(\beta_d|\sigma^2_d \sim N(0,\frac{\sigma^2_d}{M}I_M)\)
\(y\) is centered.
\[ \begin{align} cov(y)&=\mathbb{E}(yy^T)-\mathbb{E}(y)\mathbb{E}(y)^T\\ &=\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N \end{align} \]
When the dominance variance component is included, the target to be minimized is
\(Q=argmin_{(\sigma^2_g,\sigma^2_d,\sigma^2_e)}|| yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)||_F^2\)
To minimize the target function below
\[ \begin{align} Q&=argmin_{(\sigma^2_a,\sigma^2_d,\sigma^2_e)}|| yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)||_F^2\\ &=tr\{[yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)][yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)]^T\} \end{align} \] After taking derivatives with respect to each component, we have \[ \left\{ \begin{array}{lll} \frac{\partial{Q}}{\partial{\sigma^2_a}} & =tr\{\frac{yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)^2}{\partial{\sigma^2_a}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_a\mathbf{\Omega}_a^T+\sigma^2_d\mathbf{\Omega}_a\mathbf{\Omega}_d^T+\sigma^2_e\mathbf{\Omega}_a-yy^T\mathbf{\Omega}_a\}=0\\ \frac{\partial{Q}}{\partial{\sigma^2_d}} & =tr\{\frac{yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)^2}{\partial{\sigma^2_d}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_d\mathbf{\Omega}_a^T+\sigma^2_d\mathbf{\Omega}_d\mathbf{\Omega}_d^T+\sigma^2_e\mathbf{\Omega}_d-yy^T\mathbf{\Omega}_d\}=0\\ \frac{\partial{Q}}{\partial{\sigma^2_e}} & =tr\{\frac{yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)^2}{\partial{\sigma^2_e}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_e\mathbf{I}_N-yy^T\mathbf{I}_N\}=0 \end{array} \right. \] then \[ \left\{ \begin{array}{lll} tr\{\sigma^2_a\mathbf{\Omega}_a\mathbf{\Omega}_a^T+\sigma^2_d\mathbf{\Omega}_a\mathbf{\Omega}_d^T+\sigma^2_e\mathbf{\Omega}_a\}=tr\{yy^T\mathbf{\Omega}_a\}=tr\{y^T\mathbf{\Omega}_ay\}\\ tr\{\sigma^2_a\mathbf{\Omega}_d\mathbf{\Omega}_a^T+\sigma^2_d\mathbf{\Omega}_d\mathbf{\Omega}_d^T+\sigma^2_e\mathbf{\Omega}_d\}=tr\{yy^T\mathbf{\Omega}_d\}=tr\{y^T\mathbf{\Omega}_dy\}\\ tr\{\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_e\mathbf{I}_N\}=tr\{yy^T\mathbf{I}_N\}=tr\{y^T\mathbf{I}_Ny\} \end{array} \right. \]
We need to solve the equation below \[ \begin{bmatrix} tr[\mathbf{\Omega}_a\mathbf{\Omega}_a^T] & tr[\mathbf{\Omega}_a\mathbf{\Omega}_d^T]& tr[\mathbf{\Omega}_a] \\ tr[\mathbf{\Omega}_d\mathbf{\Omega}_a^T] & tr[\mathbf{\Omega}_d\mathbf{\Omega}_d^T] & tr[\mathbf{\Omega}_d] \\ tr[\mathbf{\Omega}_a] & tr[\mathbf{\Omega}_d] & N \end{bmatrix} \left[ \begin{array}{c} \hat{\sigma}^2_a \\ \hat{\sigma}^2_d \\ \hat{\sigma}^2_e \end{array} \right] = \left[ \begin{array}{c} tr[y^T\mathbf{\Omega}_ay] \\ tr[y^T\mathbf{\Omega}_dy]\\tr[y^TI_Ny] \end{array} \right] \] Of note, \(E[tr(\mathbf{\Omega}_a)]=E[tr(\mathbf{\Omega}_d)]=N\).
\[ \left[ \begin{array}{c} \hat{\sigma}^2_a \\ \hat{\sigma}^2_d \\ \hat{\sigma}^2_e \\ \end{array} \right ]=\Lambda^{-1} \left[ \begin{array}{c} y^T\mathbf{\Omega}_ay \\ y^T\mathbf{\Omega}_dy\\ y^TI_Ny \end{array} \right] \]
In particular, \[ \begin{align} \hat{\sigma}^2_a&=\frac{(ei-fh)y^T\mathbf{\Omega}_ay-(bi-ch)y^T\mathbf{\Omega}_dy+(bf-ce)y^TI_Ny}{det(\mathbf{\Lambda})}\\ &=\frac{1}{det(\mathbf{\Lambda})}\{[Ntr(\mathbf \Omega^2_d)-N^2]y^T\mathbf{\Omega}_ay-[Ntr(\mathbf \Omega^2_d)-N^2]y^TI_Ny\}\\ &=\frac{1}{det(\mathbf{\Lambda})}\{[Ntr(\mathbf \Omega^2_d)-N^2]y^T(\mathbf{\Omega}_a-I_N)y\} \end{align} \] The \(y^T\mathbf{\Omega}_dy\) term vanishes because \(b=tr(\mathbf{\Omega}_a\mathbf{\Omega}_d^T)\approx N\), so \(bi-ch\approx N^2-N^2=0\),
in which \[ \begin{align} det(\mathbf{\Lambda})&=aei+bfg+cdh-ceg-afh-bdi\\ &=\color{red}{tr(\mathbf{\Omega}^2_a}\color{blue}{\mathbf{\Omega}^2_d})N+N^3+N^3-N^2tr(\mathbf{\Omega}^2_a)-N^2tr(\mathbf{\Omega}^2_d)-N^3\\ &=N[\color{red}{tr(\mathbf{\Omega^2}_a} \color{blue}{\mathbf{\Omega^2}_d)}-tr(\mathbf{\Omega^2}_a)N-tr(\mathbf{\Omega}^2_d)N+N^2] \end{align} \].
Under orthogonal coding for additive and dominance scheme, \(\color{red}{tr(\mathbf{\Omega}^2_a}\color{blue}{\mathbf{\Omega}^2_d)}=\color{red}{tr(\mathbf{\Omega}^2_a)}\color{blue}{tr{(\mathbf{\Omega}^2_d)}}\), and \(det(\mathbf {\Lambda})=N[\color{red}{tr(\mathbf{\Omega}^2_a)}-N][\color{blue}{tr(\mathbf{\Omega}^2_d)}-N]\)
Then, \(\hat{\sigma}^2_a=\frac{y^T[\mathbf{\Omega}_a-I_N]y}{tr(\mathbf{\Omega}^2_a)-N}\).
Similarly, \(\hat{\sigma}^2_d=\frac{y^T[\mathbf{\Omega}_d-I_N]y}{tr(\mathbf{\Omega}^2_d)-N}\).
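The closed forms above rely on the orthogonality approximation; a minimal sketch (the function name `solveNE3` is ad hoc) that instead solves the full \(3\times3\) normal equation directly, e.g. with `K`, `Kd`, and the scaled `y` from the simulation chunk below:

solveNE3=function(Ka, Kd, y) {
N=length(y)
# sum(Ka*Kd) equals tr(Ka Kd^T) for symmetric GRMs
L=matrix(c(sum(Ka*Ka), sum(Ka*Kd), sum(diag(Ka)),
sum(Kd*Ka), sum(Kd*Kd), sum(diag(Kd)),
sum(diag(Ka)), sum(diag(Kd)), N), 3, 3, byrow=TRUE)
r=c(t(y)%*%Ka%*%y, t(y)%*%Kd%*%y, sum(y^2))
solve(L, r)  # (sigma2_a, sigma2_d, sigma2_e)
}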
n=500 #sample size
m=1000 #marker
h2=0.3 #heritability
h2d=0.3
b=rnorm(m, 0, sqrt(h2/m)) #effect
d=rnorm(m, 0, sqrt(h2d/m))
SIMU=5
#simu g
fq=runif(m, 0.3, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
x[,i]=rbinom(n, 2, fq[i])
}
FQ=colMeans(x)/2
sA=apply(x, 2, scale)
K=sA%*%t(sA)/m
me=var(K[col(K)<row(K)])
##dominance
xd=matrix(0, n, m)
for(i in 1:m) {
cd=c(0, 2*FQ[i], 4*FQ[i]-2)
xd[,i]=cd[x[,i]+1]
}
dCt=matrix(rep(2*FQ^2, n), n, m, byrow = T)
dV=matrix(rep(sqrt(4*FQ^2*(1-FQ)^2), n), n, m, byrow = T)
sD=(xd-dCt)/dV
Kd=sD%*%t(sD)/m
med=var(Kd[col(K)<row(K)])
Dm=ifelse(x==1, 1, 0)
#simu y
H2=matrix(0, SIMU, 4)
for(i in 1:SIMU) {
y=x%*%b+Dm%*%d
vy=var(y)
y=y+rnorm(n, 0, sqrt(vy/(h2+h2d)*(1-h2-h2d)))
y=scale(y)
yy=y%*%t(y)
h2Mod=lm(yy[col(yy)<row(yy)]~K[col(yy)<row(yy)]+Kd[col(yy)<row(yy)])
H2[i,1]=summary(h2Mod)$coefficients[2,1]
H2[i,2]=summary(h2Mod)$coefficients[3,1]
ss=matrix(0, m, 5)
ssd=matrix(0, m, 5)
for(j in 1:m) {
mod=lm(y~sA[,j]+sD[,j])
ss[j,1:4]=summary(mod)$coefficient[2,]
ssd[j,1:4]=summary(mod)$coefficient[3,]
}
ss[,5]=ss[,3]^2
H2[i,3]=((mean(ss[,5])-1)*n)/(n*n*me)
ssd[,5]=ssd[,3]^2
H2[i,4]=((mean(ssd[,5])-1)*n)/(n*n*med)
}
barplot(t(H2), beside = T)
abline(h=c(h2, h2d))
After rearrangement, it is easy to see that \(y^T\mathbf{\Omega}y=\frac{1}{M}y^T\mathbf{SS}^Ty\), and \(E(\frac{1}{M}y^T\mathbf{SS}^Ty)=NE(\chi^2_{1,h^2})=N(1+N\frac{h^2}{M_q}\sum_{m=1}^Mr^2_{k,m})\). **It is available from summary statistics!** If further adjustment for summary statistics is requested, a modified LSE can be used to adjust summary statistics (see LSE phenome).
It is easy to see \(E(y^TI_Ny)=N\), so the expectation of the numerator is \(N^2\frac{h^2}{M_q}\sum^{M_{q}}_{m=1}r^2_{k,m}=N^2h^2E(\bar{r}^2)\), in which \(M_q\) is the number of causal variants.
The denominator can be derived that \(tr[\mathbf{\Omega}^2]=N^2E^2(\rho)+N\), in which \[E^2(\rho)=\frac{\sum_{k_1=1}^M\sum_{k_2=1}^M \rho^2_{k_1k_2}}{M^2} = \frac{\sum_{k=1}^M1+\sum_{k_1=1}^M\sum_{k_2 \ne k_1}^M \rho^2_{k_1k_2}}{M^2} \geq \frac{1}{M}\]
Or, \[\frac{[E(\sum_i^M\chi^2_1)-1]N}{N^2/M_e}\] Analogously, the dominance variance is estimated as \[\color{cyan}{\boxed{\frac{[E(\sum_i^M\chi^2_{1.dom})-1]N}{N^2/M_{e.dom}}}}\]
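A minimal sketch of this summary-statistics estimator (the function name `h2_ss` is ad hoc; `chisq` is the vector of 1-df GWAS chi-squares, `N` the GWAS sample size, and `Me` the effective number of markers):

h2_ss=function(chisq, N, Me) (mean(chisq)-1)*N/(N^2/Me)
# e.g. the Oxford row of the cohort table below: mean chi-square 1.072,
# N=12260, Me=55143.14, gives ~0.324, close to the tabulated 0.328
# (the gap comes from the rounded chi-square)
h2_ss(1.072, N=12260, Me=55143.14)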
For a realized example, please refer to \(\color{red} {/public/home/xuhm/gc5k/UKB\_cohort/ss}\), \(\color{green} {/public/home/xuhm/gc5k/UKB\_cohort/ss\_adj}\)
library("knitr")
library("kableExtra")
ch2=read.table("cohortH2.txt", as.is = T, header = T)
knitr::kable(ch2, caption = "cohort h2") %>%
kable_styling("striped", full_width = T) %>%
row_spec(row=16:16, color="white", background="red")
| Cohort | N | Me | chisq | h2 | chisq.1 | h2.adjust_sex. |
|---|---|---|---|---|---|---|
| Barts | 7699 | 51936.00 | 1.116 | 0.794 | 1.2230 | 1.5222 |
| Birmingham | 19529 | 55594.92 | 1.100 | 0.287 | 1.2102 | 0.6050 |
| Bristol | 39172 | 56774.84 | 1.218 | 0.320 | 1.4353 | 0.6380 |
| Bury | 25404 | 55092.41 | 1.154 | 0.338 | 1.3046 | 0.6686 |
| Cardiff | 16226 | 52069.39 | 1.111 | 0.361 | 1.2325 | 0.7549 |
| Croydon | 18475 | 57405.47 | 1.107 | 0.337 | 1.2260 | 0.7099 |
| Edinburgh | 15441 | 53041.79 | 1.111 | 0.386 | 1.2156 | 0.7490 |
| Glasgow | 16299 | 51940.17 | 1.156 | 0.502 | 1.2900 | 0.9351 |
| Hounslow | 18420 | 56773.33 | 1.133 | 0.414 | 1.2382 | 0.7427 |
| Leeds | 39651 | 56474.83 | 1.242 | 0.349 | 1.4597 | 0.6623 |
| Liverpool | 29604 | 54490.45 | 1.183 | 0.341 | 1.3665 | 0.6826 |
| Manchester | 11502 | 53621.53 | 1.099 | 0.468 | 1.1650 | 0.7782 |
| Middlesborough | 19819 | 53409.79 | 1.133 | 0.362 | 1.2228 | 0.6074 |
| Newcastle | 34663 | 54675.71 | 1.179 | 0.286 | 1.3564 | 0.5686 |
| Nottingham | 30760 | 56132.08 | 1.173 | 0.319 | 1.3407 | 0.6287 |
| Oxford | 12260 | 55143.14 | 1.072 | 0.328 | 1.1345 | 0.6112 |
| Reading | 25886 | 58027.47 | 1.109 | 0.247 | 1.2400 | 0.5440 |
| Sheffield | 27745 | 55244.01 | 1.145 | 0.293 | 1.3013 | 0.6069 |
| Stoke | 18211 | 51302.58 | 1.114 | 0.326 | 1.2534 | 0.7220 |
/public/home/xuhm/gc5k/UKB_cohort/ss_chr
library("knitr")
library("kableExtra")
ch2=read.table("oxChr.txt", as.is = T, header = T)
knitr::kable(ch2, caption = "Oxford h2") %>%
kable_styling("striped", full_width = T) %>%
row_spec(row=6:6, bold=T, color = "white", background = "red")
| Chr | M | Me | chisq | h2 |
|---|---|---|---|---|
| 1 | 32913 | 7204.5646 | 1.0755 | 0.04480 |
| 2 | 33622 | 6403.3527 | 1.0321 | 0.01700 |
| 3 | 28040 | 5839.5143 | 1.0861 | 0.04140 |
| 4 | 26836 | 5632.8855 | 1.0545 | 0.02530 |
| 5 | 25243 | 5368.6945 | 1.0443 | 0.01960 |
| 6 | 30405 | 857.1236 | 1.3061 | 0.02170 |
| 7 | 23260 | 5020.4250 | 1.0346 | 0.01433 |
| 8 | 21530 | 3909.6782 | 1.0736 | 0.02370 |
| 9 | 18799 | 4277.3328 | 1.0470 | 0.01660 |
| 10 | 21313 | 4328.9556 | 1.0324 | 0.01160 |
| 11 | 20746 | 3269.6384 | 1.0349 | 0.00940 |
| 12 | 20194 | 4213.4118 | 1.0714 | 0.02480 |
| 13 | 14769 | 3508.2313 | 1.0363 | 0.01050 |
| 14 | 13741 | 3106.6733 | 1.0716 | 0.01830 |
| 15 | 13502 | 2872.9585 | 1.0998 | 0.02360 |
| 16 | 14848 | 3146.7089 | 1.0180 | 0.00470 |
| 17 | 14267 | 1796.0199 | 1.0360 | 0.00530 |
| 18 | 12880 | 3243.4674 | 1.0560 | 0.01500 |
| 19 | 12394 | 2346.5845 | 1.0938 | 0.01820 |
| 20 | 11181 | 2683.5395 | 1.0824 | 0.01820 |
| 21 | 6532 | 2683.5395 | 1.0076 | 0.00170 |
| 22 | 7087 | 1597.1183 | 1.0161 | 0.00210 |
\(\mathbf{\Omega}=\frac{1}{M}\mathbf{ss}^T\), in which \(\mathbf{s}\) is the matrix of standardized genotypes as defined in the last page. In particular, for a pair of individuals \(i\) and \(j\), their relatedness is \(\Omega_{ij}=\frac{1}{M}\sum_{k=1}^Ms_{i,k}s_{j,k}\)
\(\mathbf{\Omega}\) is a Hermitian matrix (symmetric), and the diagonal element is \(\Omega_{ii}=\frac{1}{M}\sum_{k=1}^Ms_{i,k}^2\).
There exists a simple relationship that \(\frac{1}{M_e}=var(\Omega_{off\text{-}diag})=\frac{M+\sum_{l_1 \ne l_2}r_{l_1l_2}^2}{M^2}\), and the sampling standard deviation of \(\hat{M}_e\) is approximately \(\frac{2M_e}{n}\) (see the simulation below).
sourceCpp("~/git/Notes/R/RLib/Shotgun.cpp")
M=1000
N=100
frq=runif(M, 0.2, 0.8)
Dp=c(runif(M/2, -0.95, -0.5), runif(M/2, 0.5, 0.95))
Dp=Dp[1:(M-1)]
Amat=matrix(0, 100, 2)
MeMat=matrix(0, 100, 1)
for(i in 1:100) {
gMat=GenerateGenoDprimeRcpp(frq, N=N, Dp = Dp)
sG=scale(gMat)
K=sG%*%t(sG)/(M-1)
MeMat[i,1]=1/var(K[row(K)>col(K)])
}
print(paste("Me:", mean(MeMat)))
## [1] "Me: 590.274088578811"
print(paste("V(Me):", sd(MeMat), "E[V(Me)]:", (2/N*mean(MeMat))))
## [1] "V(Me): 12.0302061579735 E[V(Me)]: 11.8054817715762"
\(E(s_{i,k}^2)=\Omega_{ii}=1\), interpreting that an individual is identical to himself under random mating (no inbreeding, otherwise \(1+F\)).
\[\begin{align} E(\Omega_{ii}^2) &= \frac{1}{M^2}[\color{red} {\sum_{k=1}^M s_{ik}^4}+\color{green}{\sum_{k_1}\sum_{k_2 \ne k_1} s_{i,k_1}^2s_{i,k_2}^2}] \\ &=\frac{1}{M^2}\{\color{red} {\sum_{k=1}^M \frac{1}{2p_kq_k}}+\color{green}{\sum_{k_1}\sum_{k_2 \ne k_1} [(1+\rho^2_{k_1,k_2})+\frac{\rho_{k_1,k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}}]}\} \end{align}\]
With a large sample and many loci, \(E(\frac{\rho_{k_1,k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}})=0\)
So, we have
\[\begin{align} E(\Omega_{ii}^2)&=\frac{1}{M^2}[M\frac{1}{E(\sigma_k^2)}+M(M-1)(1+E(\rho^2_{k_1k_2}))]\\ &=\frac{1}{E(\sigma_k^2)}\frac{1}{M}+\frac{M-1}{M}[1+\color{red}{E(\rho^2_{k_1k_2})}] \end{align}\]
\(\Omega_{ij}=\frac{1}{M}\sum_{k=1}^Ms_{ik}s_{jk}\)
and \(\Omega_{ij}^2=\frac{1}{M^2}[\sum_{k=1}^Ms^2_{ik}s^2_{jk}+\sum_{k_1=1}^M\sum_{k_2=1}^Ms_{ik_1}s_{ik_2}s_{jk_1}s_{jk_2}]\)
in which \(E(s_{ik}^2s_{jk}^2)=E(s_{ik}^2)E(s_{jk}^2)\), if the individuals are not related.
\(E(s_{ik_1}s_{ik_2}s_{jk_1}s_{jk_2})=E(s_{ik_1}s_{ik_2})E(s_{jk_1}s_{jk_2})=E(\rho_{i,k_1k_2})E(\rho_{j,k_1k_2})=E^2(\rho_{k_1k_2})\).
So
\[E(\mathbf{\Omega}_{ij}^2)=\frac{1}{M}+\frac{M-1}{M}\color{green}{E^2(\rho_{k_1k_2})}\]
\[E(s^2_{ii,k})=1\]
\[E(\color{red} {s^4_{ii,k}})=\frac{1}{2p_kq_k}\] The expectation \(\color{red} {E(s^4_{ik}) = \frac{1}{2p_kq_k}}\), in which \(2p_kq_k\) is the expected variance of the \(k^{th}\) locus under random mating, can be derived as below,
| Genotype | \(a_ka_k\) | \(A_ka_k\) | \(A_kA_k\) |
|---|---|---|---|
| Frequency | \(q_k^2\) | \(2p_kq_k\) | \(p_k^2\) |
| \(\Omega_{ii}=s^2_{ii}\) | \(\frac{4p_k^2}{2p_kq_k}\) | \(\frac{(q_k-p_k)^2}{2p_kq_k}\) | \(\frac{4q_k^2}{2p_kq_k}\) |
| \(\Omega_{ii}^2=s^4_{ii}\) | \(\frac{16p_k^4}{4p_k^2q_k^2}\) | \(\frac{(q_k-p_k)^4}{4p_k^2q_k^2}\) | \(\frac{16q_k^4}{4p_k^2q_k^2}\) |
\[var(s_{ii}^2)=E(s_{ii}^4)-E^2(s_{ii}^2)=\frac{1}{2p_kq_k}-1\]
library(Rcpp)
sourceCpp("~/git/Notes/R/RLib/Shotgun.cpp")
source("~/git/Notes/R/RLib/shotgun.R")
M=1000
N=500
frq=runif(M, 0.1, 0.5)
Dp=sample(c(runif(M/2, 0, .5), runif(M/2, -0.5, 0)), M)
Dp=Dp[-1]
G=GenerateGenoDprimeRcpp(frq, Dp, N)
s=apply(G, 2, scale)
#Kcpp=CorMatrixRcpp(s)
FQ=colMeans(G)/2
s2=s^2
s4=s^4
layout(matrix(1:3, 1, 3))
hist(main="Simulation test for s^2", colMeans(s2), xlab="E(s^2)")
plot(main="Simulation test for s^4", 1/(2*FQ*(1-FQ)), colMeans(s4), pch=16, cex=0.5, xlab="1/(2pq)", ylab="Simulated s^4")
abline(a=0, b=1, col="red", lwd=2, lty=2)
vs2=apply(s2, 2, var)
plot(main="Siulation test for var(s^2)",1/(2*FQ*(1-FQ))-1, vs2, pch=16, cex=0.5, xlab="1/(2pq)-1", ylab="Simulated var(s^2)")
abline(a=0, b=1, col="red", lwd=2, lty=2)
#K
K=1/M * s %*% t(s)
\[\begin{align} E(\color{green}{s_{ii,k_1}^2s_{ii,k_2}^2})&= \color{purple}{q^2_{k_1}\frac{4p_{k_1}^2}{2p_{k_1}q_{k_1}} [\frac{4p_{k_2}^2r^2_{k_1k_2}+(q_{k_2}-p_{k_2})^2 2r_{k_1k_2}\bar{r}_{k_1k_2} + 4q_{k_2}^2\bar{r}_{k_1k_2}^2}{2p_{k_2}q_{k_2}}]}\\ &+2p_{k_1}q_{k_1}\frac{(q_{k_1}-p_{k_1})^2}{2p_{k_1}q_{k_1}}[\frac{4p_{k_2}^2r_{k_1k_2}\bar{R}_{k_1k_2}+(q_{k_2}-p_{k_2})^2 (r_{k_1k_2}R_{k_1k_2}+\bar{r}_{k_1k_2}\bar{R}_{k_1k_2}) + 4q_{k_2}^2\bar{r}_{k_1k_2}R_{k_1k_2}}{2p_{k_2}q_{k_2}}]\\ &+\color{purple}{p^2_{k_1}\frac{4q_{k_1}^2}{2p_{k_1}q_{k_1}}[\frac{4p_{k_2}^2\bar{R}^2_{k_1k_2}+(q_{k_2}-p_{k_2})^2 2R_{k_1k_2}\bar{R}_{k_1k_2} + 4q_{k_2}^2R_{k_1k_2}^2}{2p_{k_2}q_{k_2}}]}\\ &=\color{purple}{2p_{k_1}q_{k_1}[\frac{4p^2_{k_2}(r^2_{k_1k_2}+\bar{R}^2_{k_1k_2})+(q_{k_2}-p_{k_2})^2(2r_{k_1k_2}\bar{r}_{k_1k_2}+2R_{k_1k_2}\bar{R}_{k_1k_2})+4q^2_{k_2}(\bar{r}^2_{k_1k_2}+R^2_{k_1k_2})}{2p_{k_2}q_{k_2}}]}\\ &+(1-4p_{k_1}q_{k_1})\frac{4p^2_{k_2}r_{k_1k_2}\bar{R}_{k_1k_2}+(q_{k_2}-p_{k_2})^2(r_{k_1k_2}R_{k_1k_2}+\bar{r}_{k_1k_2}\bar{R}_{k_1k_2})+4q^2_{k_2}\bar{r}_{k_1k_2}R_{k_1k_2}}{2p_{k_2}q_{k_2}}\\ &=1+\frac{D_{k_1k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2p_{k_1}q_{k_1}p_{k_2}q_{k_2}}+\frac{D^2_{k_1k_2}}{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}\\ &=1+\rho^2_{k_1k_2}+\frac{\rho_{k_1k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}} \end{align}\]
And consequently, we can derive \[\begin{align} cov(s_{ii,k_1}^2,s_{ii,k_2}^2)&=E(s_{i,k_1}^2s_{i,k_2}^2)-E(s_{i,k_1}^2)E(s_{i,k_2}^2)\\ &=1+\rho^2_{k_1k_2}+\frac{\rho_{k_1k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}}-1\\ &=\rho^2_{k_1k_2}+\rho_{k_1k_2}\frac{(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}} \end{align}\]
When \(k_1 = k_2\), it is obvious that \(\rho_{k_1,k_2}=1, p_{k_1}=p_{k_2}\), and \[cov(s_{ii,k_1}^2,s_{ii,k_2}^2)=1+1\times\frac{(p_{k}-q_{k})^2}{2p_{k}q_{k}}=1+\frac{1-4p_{k}q_{k}}{2p_{k}q_{k}}=\frac{1}{2p_{k}q_{k}}-1=var(s_{ii,k}^2)\]
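A Monte Carlo sketch of the covariance above (assumptions: two loci generated from haplotype frequencies set by an arbitrary \(D\); all names ad hoc):

set.seed(1)
n=2e5
p1=0.3; p2=0.4; D=0.05
q1=1-p1; q2=1-p2
hapF=c(p1*p2+D, p1*q2-D, q1*p2-D, q1*q2+D)  # haplotypes AB, Ab, aB, ab
hp1=sample(1:4, n, replace=TRUE, prob=hapF)
hp2=sample(1:4, n, replace=TRUE, prob=hapF)
x1=(hp1<=2)+(hp2<=2)  # A-allele count at locus 1
x2=(hp1%%2==1)+(hp2%%2==1)  # B-allele count at locus 2
s1=scale(x1); s2=scale(x2)
rho=D/sqrt(p1*q1*p2*q2)
print(cov(s1^2, s2^2))  # empirical
print(rho^2+rho*(p1-q1)*(p2-q2)/(2*sqrt(p1*q1*p2*q2)))  # theoretical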
UKB cohorts test, “/public/home/xuhm/gc5k/UKB_cohort/cohort/grm/sub”
ukMe=read.table("meMarker.txt", as.is = T)
me=ukMe[,-1]
rownames(me)=ukMe[,1]
layout(matrix(1:3, 3, 1))
me1=me[, 1:10]
MeS1=matrix(c(apply(me1, 1, mean), apply(me1, 1, sd)), nrow(me1), 2, byrow = F)
barplot(t(as.matrix(me1)), beside = T)
me2=me[,11:20]
MeS2=matrix(c(apply(me2, 1, mean), apply(me2, 1, sd)), nrow(me2), 2, byrow = F)
barplot(t(as.matrix(me2)), beside = T)
me3=me[,21:30]
MeS3=matrix(c(apply(me3, 1, mean), apply(me3, 1, sd)), nrow(me3), 2, byrow = F)
barplot(t(as.matrix(me3)), beside = T)
MeMean=matrix(0, nrow=nrow(me), 3)
MeMean[,1]=apply(me1, 1, mean)
MeMean[,2]=apply(me2, 1, mean)
MeMean[,3]=apply(me3, 1, mean)
MeSd=matrix(0, nrow=nrow(me), 3)
MeSd[,1]=apply(me1, 1, sd)
MeSd[,2]=apply(me2, 1, sd)
MeSd[,3]=apply(me3, 1, sd)
layout(matrix(1,1,1))
barP=barplot(MeMean, beside = T, ylim=c(0, 60000))
segments(barP, MeMean-MeSd*2, barP, MeMean+MeSd*2)
| Genotype | \(aa\) | \(Aa\) | \(AA\) |
|---|---|---|---|
| Additive \(x_A\) | 0 | 1 | 2 |
| Dominance \(x_d\) | 0 | \(2p\) | \(4p-2\) |
| Frequency | \(q^2\) | \(2pq\) | \(p^2\) |
We have \(E(x_A)=2p\), \(E(x_d)=2p^2\), and \(E(x_Ax_d)=4p^2q+8p^3-4p^2=4p^3\), and consequently, \(cov(x_A, x_d)=E(x_Ax_d)-E(x_A)E(x_d)=0\).
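A quick numeric confirmation of the zero covariance (a sketch at an arbitrary \(p\)):

p=0.3
g=rbinom(1e5, 2, p)
xA=g
xd=c(0, 2*p, 4*p-2)[g+1]
print(cov(xA, xd))  # ~0 under HWE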
For a pair of individuals, their relatedness in terms of dominance effect is \[ \Omega_{d_{ij}}=\frac{1}{M}\sum_{k=1}^M\frac{(x_{d_{i,k}}-2p_k^2)(x_{d_{j,k}}-2p_k^2)}{4p_k^2(1-p_k)^2} \] The effective number of markers is \(E(m_{e_d})=\frac{M^2}{\sum_{l_1=1}^M\sum_{l_2=1}^M\rho_{l_1l_2}^4}\).
frq=0.25
N=1000
g=rbinom(N, 2, frq)
bv=g*1+1*ifelse(g==1, 1, 0)
vg=var(bv)
h2=0.3
y=bv+rnorm(N, 0, sqrt(vg/h2*(1-h2)))
eF=mean(g)/2
#model 1: naive model
aCode1=g
dT1=c(0,1,0)
dCode1=dT1[(g+1)]
md1=lm(y~aCode1+dCode1)
#model 2: orthogonal model
aCode2=(g-2*eF)/(2*eF*(1-eF))
dT2=c(0, 2*eF, 4*eF-2)
dCode2=(dT2[(g+1)]-2*eF^2)/(4*eF^2*(1-eF)^2)
md2=lm(y~aCode2+dCode2)
#model 3: scale will not change t-test value (compare t-test values of dominance in model 2 and model 3)
aCode2=(g-2*eF)/(2*eF*(1-eF))
dT2=c(0, 2*eF, 4*eF-2)
dCode3=(dT2[(g+1)]-2*eF^2)
md3=lm(y~aCode2+dCode3)
aT=c(summary(md1)$coefficients[2,3], summary(md2)$coefficients[2,3], summary(md3)$coefficients[2,3])
dT=c(summary(md1)$coefficients[3,3], summary(md2)$coefficients[3,3], summary(md3)$coefficients[3,3])
For a linear model \(\mathbf{y}=\mathbf{X\beta}+e\),
We have the target function \[Q=e^Te=argmin_{\mathbf{\beta}}(\mathbf{y}-\mathbf{X\beta})^T(\mathbf{y}-\mathbf{X\beta})\] It can be minimized with OLS as below \[\begin{align} \frac{\partial Q}{\partial \beta}&=\frac{\partial(\mathbf{y}-\mathbf{X\beta})^T(\mathbf{y}-\mathbf{X\beta})}{\partial \beta}\\ &=\frac{\partial [y^Ty-\color{red}{\mathbf{\beta}^T\mathbf{X}^Ty}-\color{green}{y^T\mathbf{X\beta}}+\color{blue}{\mathbf{\beta}^T\mathbf{X}^T\mathbf{X}\beta}]}{\partial \mathbf{\beta}}\\ &=0-\color{red}{\mathbf{X}^T\mathbf{y}}-\color{green}{\mathbf{X}^T\mathbf{y}}+\color{blue}{[\mathbf{X}^T\mathbf{X}+(\mathbf{X}^T\mathbf{X})^T]\mathbf{\beta}}, (side\ notes, \mathbf{X}^T\mathbf{X}=(\mathbf{X}^T\mathbf{X})^T)\\ &=-2\mathbf{X}^T\mathbf{y}+2\mathbf{X^TX}\beta \end{align} \] Then, setting it to zero, we have
\[\hat{\mathbf{\beta}}=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}\]
1: more topics about matrix calculus please refer to “Matrix Calculus Wiki”.
2: Or read a gentle introduction “Matrix Calculus You Need for Deep Learning”.
If it is a general linear model, the LSE minimizes \(Q=argmin_{\mathbf{\beta}}{(\mathbf{y}-\mathbf{X\beta})^TV^{-1}(\mathbf{y}-\mathbf{X\beta})}\), and its corresponding \(\hat{\mathbf{\beta}}=(\mathbf{X}^T\mathbf{V}^{-1}\mathbf{X})^{-1}(\mathbf{X}^T\mathbf{V}^{-1}\mathbf{y})\).
\[ \begin{align} SSE&=\color{red}{e^T}\color{green}{e}\\ &=\color{red}{(\mathbf{y}-\mathbf{X\hat{\beta}})^T}\color{green}{(\mathbf{y}-\mathbf{X\hat{\beta}})}\\ &=\color{red}{[\mathbf{y}-\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y}]^T}\color{green}{[\mathbf{y}-\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}]}\\ &=\color{red}{\mathbf{y}^T}\color{green}{\mathbf{y}}-\color{red}{\mathbf{y}^T}\color{green}{\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}}-\color{red}{\mathbf{y}^T{\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T}}\color{green}{\mathbf{y}}+\color{red}{\mathbf{y}^T\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^{T}}\color{green}{\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y}}\\ &=\mathbf{y}^T\mathbf{y}-\mathbf{y}^T\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}\\ &=\mathbf{y}^T(I-\mathbf{P}_{\mathbf{X}})\mathbf{y} \end{align} \] Notes: \(\mathbf{P}_{\mathbf{X}}=\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\) is the projection matrix (wiki), for which \(\mathbf{P}^n=\mathbf{P}\).
\[\hat{\sigma}_{\beta}^2=\frac{SSE}{n-m}(\mathbf{X}^T\mathbf{X})^{-1}=\hat{\sigma}^2_e(\mathbf{X}^T\mathbf{X})^{-1}\]
There is a connection between the regression coefficients of simple linear regression and multiple linear regression.
\[ \begin{align} \hat{\mathbf{\beta}}&=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}\\ &=(\Lambda_{\sigma_X}\Sigma\Lambda_{\sigma_X})^{-1}\Lambda_{\sigma_X}\rho_{Xy}\sigma_y\\ &=\Lambda_{\sigma_X}^{-1}\Sigma^{-1}\Lambda_{\sigma_X}^{-1}\Lambda_{\sigma_X}\rho_{Xy}\sigma_y\\ &=\Lambda_{\sigma_X}^{-1}\Sigma^{-1}\rho_{Xy}\sigma_y\\ &=\Lambda_{\sigma_X}^{-1}\Sigma^{-1}\Lambda_{\sigma_X}\hat{b}_{Xy}\\ \end{align} \] which reduces to \(\Sigma^{-1}\hat{b}_{Xy}\) when \(\mathbf{X}\) is standardized (\(\Lambda_{\sigma_X}=I\)),
in which \(\hat{b}_{[i]}=\frac{cov(y, x_i)}{var(x_i)}=\rho_{y,x_i}\frac{\sigma_y}{\sigma_{x_i}}\), estimated from the simple linear regression \(y=a+b_ix_i+e\).
And similarly, \(\hat{\sigma}^2_{\beta}=(\frac{y^Ty-\hat{\mathbf{\beta}}^T\hat{b}}{n-m})(\mathbf{\Lambda_{\sigma_X}\Sigma\Lambda_{\sigma_X}})^{-1}\)
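A minimal sketch of recovering the joint estimates from the marginal ones (all names ad hoc; predictors are standardized so \(\Lambda_{\sigma_X}=I\) and \(\Sigma\) is the correlation matrix):

set.seed(7)
n=1000
m=5
R=0.5+0.5*diag(m)  # equicorrelated predictors
X=scale(matrix(rnorm(n*m), n, m)%*%chol(R))
y=X%*%c(1, -0.5, 0.25, 0, 0)+rnorm(n)
bhat=apply(X, 2, function(x) coef(lm(y~x))[2])  # marginal (simple regression) slopes
print(solve(cor(X), bhat))  # joint estimates via Sigma^{-1} b-hat
print(coef(lm(y~X))[-1])  # multiple-regression reference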
‘/public/home/xuhm/gc5k/UKB_cohort/ss_adj’
library("knitr")
library("kableExtra")
ch2=read.table("CorUK.txt", as.is = T, header = T)
knitr::kable(ch2, caption = "cohort h2") %>%
kable_styling("striped", full_width = T) %>%
row_spec(row=16:16, color="white", background="red")
| Cohort | Gender | ht | cor |
|---|---|---|---|
| Barts | 1 | 1 | -0.6918 |
| Birmingham | 1 | 1 | -0.7139 |
| Bristol | 1 | 1 | -0.7183 |
| Bury | 1 | 1 | -0.7131 |
| Cardiff | 1 | 1 | -0.7244 |
| Croydon | 1 | 1 | -0.7133 |
| Edinburgh | 1 | 1 | -0.7114 |
| Glasgow | 1 | 1 | -0.7160 |
| Hounslow | 1 | 1 | -0.7170 |
| Leeds | 1 | 1 | -0.7136 |
| Liverpool | 1 | 1 | -0.7170 |
| Manchester | 1 | 1 | -0.6995 |
| Middlesborough | 1 | 1 | -0.7199 |
| Newcastle | 1 | 1 | -0.7173 |
| Nottingham | 1 | 1 | -0.7177 |
| Oxford | 1 | 1 | -0.7153 |
| Reading | 1 | 1 | -0.7252 |
| Sheffield | 1 | 1 | -0.7180 |
| Stoke | 1 | 1 | -0.7164 |
It is adjusted variance.
Let \(x\) be a vector of \(n\) random variables, and \(A\) be an \(n\times n\) symmetric matrix; the scalar quantity \(x^TAx\) is known as a quadratic form. If \[x \sim MVN(\mu,\Sigma)\], it can be shown that
\[E[x^TAx]=tr[A\Sigma]+\mu^TA\mu\]
It can be proved as below: \(x^TAx = tr[x^TAx]\), and by the cyclic property of the trace operator \(E[tr(x^TAx)]=E[tr(Axx^T)]\). Since the trace is a linear operator, it therefore follows from the linearity of the expectation operator that \[E[tr(Axx^T)]=tr[AE(xx^T)]=tr[A(\Sigma+\mu\mu^T)]=tr(A\Sigma)+tr(A\mu\mu^T)=tr(A\Sigma)+tr(\mu^TA\mu)=tr(A\Sigma)+\mu^TA\mu\].
\[var[x^TAx]=2tr[A\Sigma A\Sigma]+4\mu^TA\Sigma A\mu\]
\[cov[x^TA_1x, x^TA_2x]=2tr[A_1\Sigma A_2\Sigma]+4\mu^TA_1\Sigma A_2\mu\] For both variance and covariance, \(A\) is required to be symmetric.
When \(A\) is not symmetric
When \(A\) is asymmetric, \[x^T\tilde{A}x=x^T[(A+A^T)/2]x=x^TAx\] and the expressions for var and cov are the same, provided \(A\) is replaced by \(\tilde{A}=(A+A^T)/2\).
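A Monte Carlo sketch of the mean and variance formulas (assumptions: an arbitrary symmetric \(A\), PD \(\Sigma\), and nonzero \(\mu\); all names ad hoc):

set.seed(3)
n=4
A=crossprod(matrix(rnorm(n*n), n, n))  # a symmetric A
Sig=crossprod(matrix(rnorm(n*n), n, n))/n  # a PD covariance
mu=rnorm(n)
x=matrix(rnorm(2e5*n), 2e5, n)%*%chol(Sig)+matrix(mu, 2e5, n, byrow=TRUE)
q=rowSums((x%*%A)*x)  # x^T A x for each draw
print(c(mean(q), sum(diag(A%*%Sig))+t(mu)%*%A%*%mu))  # empirical vs theoretical mean
print(c(var(q), 2*sum(diag(A%*%Sig%*%A%*%Sig))+4*t(mu)%*%A%*%Sig%*%A%*%mu))  # variance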
Resources for quadratic forms wiki link and Lecture notes.
It may be related to high-order moments for multivariate normal distribution, see Isserlis Theorem, published in Biometrika, 12:134-139.
The \(X_i\)'s are jointly Gaussian variables, each following the standard Gaussian distribution \(N(0,1)\). The product below can be expanded as
\[ \begin{align} E(X_1X_2X_3X_4)&=E(X_1X_2)E(X_3X_4)+E(X_1X_3)E(X_2X_4)+E(X_1X_4)E(X_2X_3) \end{align} \] When all \(X_i\) are identical, it reduces to \(E(X_1^4)=3E^2(X_1^2)=3\).
\[ \begin{align} E(X_1X_2X_3X_4X_5X_6)&=E(X_1X_2)E(X_3X_4X_5X_6)+E(X_1X_3)E(X_2X_4X_5X_6)\\ &+E(X_1X_4)E(X_2X_3X_5X_6)+E(X_1X_5)E(X_2X_3X_4X_6)\\ &+E(X_1X_6)E(X_2X_3X_4X_5), (5\ terms)\\ &=E(X_1X_2)[E(X_3X_4)E(X_5X_6)\\ &+E(X_3X_5)E(X_4X_6)+E(X_3X_6)E(X_4X_5)]\\ &+...,(E(X_3X_4)E(X_5X_6)\ can\ be\ expanded\ as\ Example\ 1)\\ &=15, (when\ all\ X_i\ are\ identical) \end{align} \]
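A Monte Carlo sketch of the four-variable expansion above (assumption: an arbitrary \(4\times4\) PD covariance \(S\) with zero means, so \(E(X_iX_j)=S_{ij}\)):

set.seed(5)
S=crossprod(matrix(rnorm(16), 4, 4))/4
X=matrix(rnorm(5e5*4), 5e5, 4)%*%chol(S)
print(mean(X[,1]*X[,2]*X[,3]*X[,4]))  # empirical
print(S[1,2]*S[3,4]+S[1,3]*S[2,4]+S[1,4]*S[2,3])  # Isserlis expansion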
However, Isserlis' theorem works for the normal distribution only, and a standardized locus is not “normal”.
Another possible method is the “Cumulant”, see its Wiki. Or the “joint Cumulant” if covariance exists between two vectors, see Joint Cumulant.
\[\mathbf{A}^{-1}= \begin{bmatrix} a & b \\ c & d \end{bmatrix} ^{-1} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\]
\[\mathbf{A}^{-1}= \begin{bmatrix} a & b &c \\ d & e & f \\g&h&i \end{bmatrix} ^{-1} = \frac{1}{det(\mathbf{A})}\begin{bmatrix} A & D &G \\B&E&H \\C & F &I \end{bmatrix}\] in which \(A\) through \(I\) are scalar cofactors, not matrices: \(A=(ei-fh)\), \(B=-(di-fg)\), \(C=(dh-eg)\), \(D=-(bi-ch)\), \(E=(ai-cg)\), \(F=-(ah-bg)\), \(G=(bf-ce)\), \(H=-(af-cd)\), \(I=(ae-bd)\).
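A numeric sketch checking the cofactor formula against `solve()` for an arbitrary \(3\times3\) matrix (letters follow the notation above):

A3=matrix(c(2,1,1, 1,3,1, 1,1,4), 3, 3)
a=A3[1,1]; b=A3[1,2]; c=A3[1,3]
d=A3[2,1]; e=A3[2,2]; f=A3[2,3]
g=A3[3,1]; h=A3[3,2]; i=A3[3,3]
detA=a*(e*i-f*h)-b*(d*i-f*g)+c*(d*h-e*g)
invA=matrix(c(e*i-f*h, -(d*i-f*g), d*h-e*g,
-(b*i-c*h), a*i-c*g, -(a*h-b*g),
b*f-c*e, -(a*f-c*d), a*e-b*d), 3, 3)/detA  # columns are (A,B,C), (D,E,F), (G,H,I)
print(invA)
print(solve(A3))  # the two should match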
| | \(a_l\) | \(A_l\) | |
|---|---|---|---|
| \(a_k\) | \(r_{kl}=q_l+\frac{D_{kl}}{q_k}\) | \(\bar{r}_{kl}=p_l-\frac{D_{kl}}{q_k}\) | \(q_k\) |
| \(A_k\) | \(\bar{R}_{kl}=q_l-\frac{D_{kl}}{p_k}\) | \(R_{kl}=p_l+\frac{D_{kl}}{p_k}\) | \(p_k\) |
| | \(q_l\) | \(p_l\) | |
\(p_k\) and \(p_l\) are the minor allele frequencies of the two loci, respectively. There exists the transformation \(D_{kl}=\rho_{kl}\sqrt{p_kq_kp_lq_l}\). Of note, \(D_{kl}\) is restrained by min(\(p_kq_l\), \(q_kp_l\)), and consequently \(\rho_{kl}\le\sqrt{\frac{p_kq_l}{p_lq_k}}<1\).
| | | | Locus \(l\) | | |
|---|---|---|---|---|---|
| | Geno | | \(a_la_l\) | \(a_lA_l\) | \(A_lA_l\) |
| | Standardized geno | | \(s_{il}=\frac{-2p_l}{\sqrt{2p_lq_l}}\) | \(s_{il}=\frac{q_l-p_l}{\sqrt{2p_lq_l}}\) | \(s_{il}=\frac{2q_l}{\sqrt{2p_lq_l}}\) |
| | Frequency | | \(q_l^2\) | \(2p_lq_l\) | \(p_l^2\) |
| \(a_ka_k\) | \(s_{ik}=\frac{-2p_k}{\sqrt{2p_kq_k}}\) | \(q_k^2\) | \(r_{kl}^2\) | \(2r_{kl}\bar{r}_{kl}\) | \(\bar{r}_{kl}^2\) |
| Locus \(k\): \(A_ka_k\) | \(s_{ik}=\frac{q_k-p_k}{\sqrt{2p_kq_k}}\) | \(2p_kq_k\) | \(r_{kl}\bar{R}_{kl}\) | \(r_{kl}R_{kl}+\bar{r}_{kl}\bar{R}_{kl}\) | \(\bar{r}_{kl}R_{kl}\) |
| \(A_kA_k\) | \(s_{ik}=\frac{2q_k}{\sqrt{2p_kq_k}}\) | \(p_k^2\) | \(\bar{R}_{kl}^2\) | \(2R_{kl}\bar{R}_{kl}\) | \(R_{kl}^2\) |
See Eq 23 in Am J Hum Genet, 2011, 88:294-305.
\[h^2_l=h^2_o\frac{K(1-K)}{z^2}\frac{K(1-K)}{P(1-P)}\] in which \(K\) is the prevalence of the disease (binary), and \(P\) is the proportion of cases in the sample.
See a simple example below
K=0.01
P=0.5
ho2=0.5
z=dnorm(qnorm(K))
hl2=ho2*K*(1-K)*K*(1-K)/z^2/(P*(1-P))
print(paste("Observed scale", ho2))
## [1] "Observed scale 0.5"
print(paste("Liability scale", hl2))
## [1] "Liability scale 0.275953649031586"
Algorithm: it is a simple linear regression, but against the eigenvector.
1.1 Sample \(N_c\) individuals from the whole population
1.2 Generate the eigenvector \(pc\) from these \(N_c\) individuals
1.3 \(\beta_k=\frac{cov(pc,x_k)}{var(x_k)}\)
2.1 \(\tilde{pc}=\sum_{k=1}^M\hat{\beta}_kx_k\).
However, there are two technical issues here: SNP strand and the switch of the minor allele. See the summary [here](https://github.com/gc5k/Notes/blob/master/StG/TechNotes/PalindromeLoci.pdf)
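A minimal sketch of the projection algorithm above (all names ad hoc; a reference sample of \(N_c\) individuals and a target sample share the same loci, ignoring the strand/allele-switch issues just mentioned):

set.seed(9)
Nc=300
Nt=100
M=2000
frq=runif(M, 0.05, 0.5)
Xc=sapply(frq, function(p) rbinom(Nc, 2, p))  # 1.1 reference genotypes
Xt=sapply(frq, function(p) rbinom(Nt, 2, p))  # target genotypes
Sc=apply(Xc, 2, scale)
pc=eigen(Sc%*%t(Sc)/M)$vectors[,1]  # 1.2 the top eigenvector
beta=as.vector(t(Sc)%*%pc)/(Nc-1)  # 1.3 beta_k=cov(pc, x_k)/var(x_k)
print(abs(cor(Sc%*%beta, pc)))  # projecting the reference itself recovers pc
St=scale(Xt, center=colMeans(Xc), scale=apply(Xc, 2, sd))
pcTilde=St%*%beta  # 2.1 projected eigenvector for the target sample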
\[ \begin{align} L_B&=\frac{1}{B}\sum_{b=1}^Bz_b^T\mathbf{KK}^Tz_b, \quad \mathbb{E}(L_B)=tr(\mathbf{KK}^T)\\ &=\frac{1}{B}\frac{1}{M^2}\sum_{b=1}^Bz_b^T\mathbf{XX}^T\mathbf{XX}^Tz_b\\ &=\frac{1}{B}\frac{1}{M^2}\sum_{b=1}^B||\mathbf{XX}^Tz_b||_2^2\\ \end{align} \]
The sampling variance is
\[ \begin{aligned} var(L_B)&=\mathbb{E}\{[L_{B}-tr(\mathbf{KK}^T)]^2\}\\ &=var(\frac{1}{B}\sum_{b=1}^Bz_b^T\mathbf{KK}^Tz_b)\\ &=\frac{1}{B^2}var(\sum_{b=1}^Bz_b^T\mathbf{KK}^Tz_b) \\ &\text{For quadratic forms, } var(z^T\mathbf{A}z)=2tr(\Sigma_z\mathbf{A}\Sigma_z\mathbf{A})+4\mu_z^T\mathbf{A}\Sigma_z\mathbf{A}\mu_z\\ &=\frac{1}{B^2}\sum_{b=1}^B2tr(\Sigma_z\mathbf{KK}^T\Sigma_z\mathbf{KK}^T)\\ &\text{because }z\sim MVN(0, I_N)\\ &=\frac{2}{B}tr(\mathbf{KK}^T\mathbf{KK}^T)\\ &=\frac{2}{B}\sum_{l_1}\sum_{l_2}\mathcal{K}^2_{l_1,l_2} \end{aligned} \]
in which \(\mathcal{K}=\mathbf{KK}^T\).
An example
n=300 #sample size
m=500 #marker
SM=50 #simulation
BS=25 #randomization factor
fq=runif(m, 0.1, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
x[,i]=rbinom(n, 2, fq[i])
}
sx=apply(x, 2, scale)
k=sx%*%t(sx)/m
ksq=k^2
k2=k%*%t(k)
k4=k2%*%t(k2)
k4Alt=k2^2
#evaluate LB
LK=array(0, SM)
for(i in 1:SM) {
Lb=array(0, BS)
for(j in 1:BS) {
z=matrix(rnorm(n), n, 1)
x1=t(sx)%*%z
x2=sx%*%x1
Lb[j]=(t(x2)%*%x2)[1,1]
}
LK[i]=sum(Lb)/(BS*m^2)
}
print(paste("Obs mean", mean(LK), "vs expected mean", n^2/m+n))
## [1] "Obs mean 474.194724827433 vs expected mean 480"
print(paste("obs var", var(LK), " vs expected var", 2*sum(diag(k4))/BS))
## [1] "obs var 162.536901365627 vs expected var 164.233602459417"
print(paste("incorrected var in Wu's paper", (n^2/m+n)/BS))
## [1] "incorrected var in Wu's paper 19.2"
print(paste("sum(diag(K4))=", sum(diag(k4)), "sum[(kkT)^2]", sum(k4Alt)))
## [1] "sum(diag(K4))= 2052.92003074272 sum[(kkT)^2] 2052.92003074272"
Sum_D_k2_sq=sum(diag(k2))^2
ZX=n^2*m^4+(2*n^3+2*n^2+8*n)*m^3+(n^4+2*n^3+21*n^2+20*n)*m^2+(8*n^3+20*n^2+20*n)*m
print(paste(sum(diag(k2)^2)*n, ZX/m^4, sum(diag(k2))^2))
## [1] "229957.108221216 230990.126448 227459.251387183"
EU=n*m^2+(n^2+n)*m
Sum_D_k2=sum(diag(k2))
print(paste(EU/m^2, Sum_D_k2))
## [1] "480.6 476.926882642594"
About \(\mathbf{K}\) and its statistics.
\[ \mathbf{K}=\frac{1}{M} \begin{bmatrix} k_{\color{blue}{1},\color{red}{1}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{1},m}-2p_m)(\color{red}{x}_{\color{red}{1},m}-2p_m)}{2p_mq_m} & k_{\color{blue}{1},\color{red}{2}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{1},m}-2p_m)(\color{red}{x}_{\color{red}{2},m}-2p_m)}{2p_mq_m} & \cdots & k_{\color{blue}{1},\color{red}{n}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{1},m}-2p_m)(\color{red}{x}_{\color{red}{n},m}-2p_m)}{2p_mq_m}\\ k_{\color{blue}{2},\color{red}{1}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{2},m}-2p_m)(\color{red}{x}_{\color{red}{1},m}-2p_m)}{2p_mq_m} & k_{\color{blue}{2},\color{red}{2}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{2},m}-2p_m)(\color{red}{x}_{\color{red}{2},m}-2p_m)}{2p_mq_m} & \cdots & k_{\color{blue}{2},\color{red}{n}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{2},m}-2p_m)(\color{red}{x}_{\color{red}{n},m}-2p_m)}{2p_mq_m}\\ \cdots & \cdots & \cdots & \cdots\\ k_{\color{blue}{n},\color{red}{1}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{n},m}-2p_m)(x_{\color{red}{1},m}-2p_m)}{2p_mq_m} &k_{\color{blue}{n},\color{red}{2}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{n},m}-2p_m)(x_{\color{red}{2},m}-2p_m)}{2p_mq_m} & \cdots &k_{\color{blue}{n},\color{red}{n}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{n},m}-2p_m)(x_{\color{red}{n},m}-2p_m)}{2p_mq_m} \end{bmatrix} \]
in which \[ \mathbf{KK}^T=\frac{1}{M^2}\mathcal{K}= \begin{bmatrix} \boxed{\mathcal{K}_{\color{blue}{1},\color{red}{1}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{1}, \color{green}{l}}k_{\color{green}{l},\color{red}{1}}} &\mathcal{K}_{\color{blue}{1},\color{red}{2}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{1},\color{green}{l}}k_{\color{green}{l},\color{red}{2}} &\cdots &\mathcal{K}_{\color{blue}{1},\color{red}{n}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{1},\color{green}{l}}k_{\color{green}{l},\color{red}{n}}\\ \mathcal{K}_{\color{blue}{2},\color{red}{1}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{2}, \color{green}{l}}k_{\color{green}{l},\color{red}{1}} &\boxed{\mathcal{K}_{\color{blue}{2},\color{red}{2}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{2},\color{green}{l}}k_{\color{green}{l},\color{red}{2}}} &\cdots &\mathcal{K}_{\color{blue}{2},\color{red}{n}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{2},\color{green}{l}}k_{\color{green}{l},\color{red}{n}}\\ \vdots & \vdots & \vdots & \vdots\\ \mathcal{K}_{\color{blue}{n},\color{red}{1}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{n}, \color{green}{l}}k_{\color{green}{l},\color{red}{1}} &\mathcal{K}_{\color{blue}{n},\color{red}{2}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{n},\color{green}{l}}k_{\color{green}{l},\color{red}{2}} &\cdots &\boxed{\mathcal{K}_{\color{blue}{n},\color{red}{n}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{n},\color{green}{l}}k_{\color{green}{l},\color{red}{n}}}\\ \end{bmatrix} \]
Furthermore \[ \begin{aligned} \mathcal{K}_{\color{blue}{\mathcal{L}_1},\color{red}{\mathcal{L}_2}} &=\mathbf{K}_{[\color{blue}{\mathcal{L}_1},]}\mathbf{K}^T_{[,\color{red}{\mathcal{L}_2}]}\\ &=\sum_{\color{green}{l}=1}^nk_{\color{blue}{\mathcal{L}_1},\color{green}{l}}k_{\color{green}{l},\color{red}{\mathcal{L}_2}}\\ &=\frac{1}{M^2} \begin{bmatrix} k_{\color{blue}{\mathcal{L}_1},\color{green}{1}}=\sum_{m_1=1}^M\frac{(\color{blue}{x}_{\color{blue}{\mathcal{L}_1},m_1}-2p_{m_1})(\color{green}{x}_{\color{green}{1},m_1}-2p_{m_1})}{2p_{m_1}q_{m_1}}\\ k_{\color{blue}{\mathcal{L}_1},\color{green}{2}}=\sum_{m_1=1}^M\frac{(\color{blue}{x}_{\color{blue}{\mathcal{L}_1},m_1}-2p_{m_1})(\color{green}{x}_{\color{green}{2},m_1}-2p_{m_1})}{2p_{m_1}q_{m_1}}\\ \vdots\\ k_{\color{blue}{\mathcal{L}_1},\color{green}{n}}=\sum_{m_1=1}^M\frac{(\color{blue}{x}_{\color{blue}{\mathcal{L}_1},m_1}-2p_{m_1})(\color{green}{x}_{\color{green}{n},m_1}-2p_{m_1})}{2p_{m_1}q_{m_1}}\\ \end{bmatrix}^T \begin{bmatrix} k_{\color{green}{1},\color{red}{\mathcal{L}_2}}=\sum_{m_2=1}^M\frac{(\color{green}{x}_{\color{green}{1},m_2}-2p_{m_2})(\color{red}{x}_{\color{red}{\mathcal{L}_2},m_2}-2p_{m_2})}{2p_{m_2}q_{m_2}}\\ k_{\color{green}{2},\color{red}{\mathcal{L}_2}}=\sum_{m_2=1}^M\frac{(\color{green}{x}_{\color{green}{2},m_2}-2p_{m_2})(\color{red}{x}_{\color{red}{\mathcal{L}_2},m_2}-2p_{m_2})}{2p_{m_2}q_{m_2}}\\ \vdots\\ k_{\color{green}{n},\color{red}{\mathcal{L}_2}}=\sum_{m_2=1}^M\frac{(\color{green}{x}_{\color{green}{n},m_2}-2p_{m_2})(\color{red}{x}_{\color{red}{\mathcal{L}_2},m_2}-2p_{m_2})}{2p_{m_2}q_{m_2}} \end{bmatrix}\\ &=\frac{1}{M^2}\sum_{\color{green}{l}}^n \left\{ [\sum_{m_1=1}^M\color{blue}{s}_{\color{blue}{\mathcal{L}_1},m_1}\color{green}{s}_{\color{green}{l},m_1}] [\sum_{m_2=1}^M\color{green}{s}_{\color{green}{l},m_2}\color{red}{s}_{\color{red}{\mathcal{L}_2},m_2}] \right\}\\ &=\frac{1}{M^2}\sum_{\color{green}{l}}^n \left\{ \sum_{m_1,m_2}^M \color{blue}{s_{\mathcal{L}_1,m_1}} \color{green}{s}_{l,m_1}\color{green}{s}_{l,m_2} \color{red}{s_{{\mathcal{L}_2},m_2}} \right\} \end{aligned} \]
And
\[ \begin{aligned} \mathcal{K}_{\mathcal{L}_1,\mathcal{L}_2}^2&=\frac{1}{M^4} \left\{ \sum_{l}^n \left\{ \sum_{m_1,m_2}^M \color{blue}{s_{\mathcal{L}_1,m_1}} \color{green}{s}_{l,m_1} \color{red}{s_{\mathcal{L}_2,m_2}} \color{green}{s}_{l,m_2}\right\}\right\}^2\\ &=\frac{1}{M^4} \sum_{l_1,l_2}^n \sum_{m_1,\color{brown}{m_1^{\prime}}}^M \sum_{m_2,\color{pink}{m_2^{\prime}}}^M \underline{\color{blue}{s_{\mathcal{L}_1,m_1}} \color{blue}{s_{\mathcal{L}_1,\color{brown}{m_1^{\prime}}}}} \times\boxed{\color{green}{s}_{l_1,m_1} \color{green}{s}_{l_1,\color{brown}{m_1^{\prime}}}} \times\underline{\color{red}{s_{\mathcal{L}_2,m_2}} \color{red}{s}_{\color{red}{\mathcal{L}_2},\color{pink}{m_2^{\prime}}}} \times\boxed{\color{green}{s}_{l_2,m_2}\color{green}{s}_{l_2,\color{pink}{m_2^{\prime}}}} \end{aligned} \]
| | Additive | Dominance |
|---|---|---|
| classic | \(h^2=\beta HLH\beta\) | \(h^2_d=d H_dL_dH_dd\) |
| \(h^2_{SNP}\) | \(h^2_{SNP.A}=m\beta H\{\frac{ \sum_{k=1}^{m} v_k^Tv_k }{1^TP1}\} H \beta\) | \(h^2_{SNP.D}=md H_d\{\frac{ \sum_{k=1}^{m} v_{k,d}^T v_{k,d} }{1^TP_d1}\} H_d d\) |
| \(h^2_{SNP_w}\) | \(\tilde{h}^2_{SNP.A}=m\beta H\{\frac{ \sum_{k=1}^{m}w_k v_{k}^Tv_k }{w^TPw}\} H \beta\) | \(\tilde{h}^2_{SNP.D}=md H_d\{\frac{ \sum_{k=1}^{m} w_{k,d}v_{k.d}^Tv_{k.d} }{w_{d}^TP_dw_{d}}\} H_d d\) |
| \(h^2_{cor}\) | \(h^2_{SNP.A_2}=m\beta_1 H_1\{\frac{ \sum_{k=1}^{m} v_{k_1}^Tv_{k_2} }{1^TP1}\} H_2 \beta_2\) | \(h^2_{SNP.D_2}=md_1 H_{d_1}\{\frac{ \sum_{k=1}^{m} v_{k,d_1}^Tv_{k,d_2} }{1^TP_d1}\} H_{d_2} d_2\) |
| \(h^2_{cor_w}\) | \(\tilde{h}^2_{SNP.A_2}=m\beta_1 H_1\{\frac{ \sum_{k=1}^{m}w_k v_{k,1}^Tv_{k,2} }{w^TPw}\} H_2 \beta_2\) | \(\tilde{h}^2_{SNP.D_2}=md_1 H_{d_1}\{\frac{ \sum_{k=1}^{m} w_{k,d}v_{k,d_1}^Tv_{k,d_2} }{w_{d}^TP_dw_{d}}\} H_{d_2} d_2\) |
\[\Omega_w=\frac{\sum_{k=1}^M(x_{i,k}-2p_k)(x_{j,k}-2p_k)}{\sum_{k=1}^M2p_kq_k}\] Like \(M_e\), \(M_{e_w}\) now can be defined accordingly.
\[h^2_w=\frac{ (\frac{\sum_{k=1}^M2p_kq_k\chi^2_{1.k}}{\sum_{k=1}^M2p_kq_k}-1)n }{n^2/M_{e_w}}\]
An example for \(h^2_{SNP}\), and \(h^2_{SNP_w}\)
n=500 #sample size
m=1000 #marker
h2=0.3 #heritability
b=rnorm(m, 0, sqrt(h2/m)) #effect
SIMU=5
#simu g
fq=runif(m, 0.3, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
x[,i]=rbinom(n, 2, fq[i])
}
FQ=colMeans(x)/2
sA=apply(x, 2, scale)
K=sA%*%t(sA)/m
me=var(K[col(K)<row(K)])
sW=x-2*FQ
Kw=sW%*%t(sW)/sum(2*FQ*(1-FQ))
meW=var(Kw[col(Kw)<row(Kw)])
#simu y
H2=matrix(0, SIMU, 4)
H2w=matrix(0, SIMU, 2)
for(i in 1:SIMU) {
y=x%*%b
vy=var(y)
y=y+rnorm(n, 0, sqrt(vy/(h2)*(1-h2)))
y=scale(y)
yy=y%*%t(y)
h2Mod=lm(yy[col(yy)<row(yy)]~K[col(yy)<row(yy)])
H2[i,1]=summary(h2Mod)$coefficients[2,1]
ss=matrix(0, m, 5)
for(j in 1:m) {
mod=lm(y~sA[,j])
ss[j,1:4]=summary(mod)$coefficient[2,]
}
ss[,5]=ss[,3]^2
H2[i,2]=((mean(ss[,5])-1)*n)/(n*n*me)
h2wMod=lm(yy[col(yy)<row(yy)]~Kw[col(yy)<row(yy)])
H2[i,3]=summary(h2wMod)$coefficients[2,1]
chiW=ss[,5]*2*FQ*(1-FQ)
H2[i,4]=((mean(chiW)/mean(2*FQ*(1-FQ))-1)*n)/(n*n*meW)
}
barplot(t(H2), beside = T)
abline(h=c(h2))
legend("topleft", legend=c("h2", "ssh2", "h2w", "ssh2w"))
\[var(h^2)=var(\frac{A}{B})=\frac{1}{\mu_B^2}\sigma^2_A+\frac{\mu^2_A}{\mu_B^4}\sigma^2_B\] in which \(\mu_A=\frac{n^2}{\hat{m}_e}\hat{h}^2\), \(\sigma_A^2=\frac{2n^2}{\hat{m}_e}\), \(\mu_B=\frac{n^2}{\hat{m}_e}\), and \(\sigma^2_B=\frac{2\hat{m}_e}{n}\). After some algebra, \(var(h^2)=2\frac{\hat{m}_e}{n^2}+2\frac{\hat{m}_e^3}{n^5}\hat{h^2}^2\), and after further approximation \(\sigma_{\hat{h}^2}\approx \frac{\sqrt{2\hat{m}_e}}{n}\)
n=500 #sample size
m=1000 #marker
h2=0.3 #heritability
b=rnorm(m, 0, sqrt(h2/m)) #effect
SIMU=10
#simu g
fq=runif(m, 0.3, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
x[,i]=rbinom(n, 2, fq[i])
}
FQ=colMeans(x)/2
sA=apply(x, 2, scale)
K=sA%*%t(sA)/m
me=var(K[col(K)<row(K)])
sW=x-2*FQ
Kw=sW%*%t(sW)/sum(2*FQ*(1-FQ))
meW=var(Kw[col(Kw)<row(Kw)])
#simu y
H2=matrix(0, SIMU, 4)
H2w=matrix(0, SIMU, 2)
for(i in 1:SIMU) {
y=x%*%b
vy=var(y)
y=y+rnorm(n, 0, sqrt(vy/(h2)*(1-h2)))
y=scale(y)
yy=y%*%t(y)
h2Mod=lm(yy[col(yy)<row(yy)]~K[col(yy)<row(yy)])
H2[i,1]=summary(h2Mod)$coefficients[2,1]
ss=matrix(0, m, 5)
for(j in 1:m) {
mod=lm(y~sA[,j])
ss[j,1:4]=summary(mod)$coefficient[2,]
}
ss[,5]=ss[,3]^2
H2[i,2]=((mean(ss[,5])-1)*n)/(n*n*me)
h2wMod=lm(yy[col(yy)<row(yy)]~Kw[col(yy)<row(yy)])
H2[i,3]=summary(h2wMod)$coefficients[2,1]
chiW=ss[,5]*2*FQ*(1-FQ)
H2[i,4]=((mean(chiW)/mean(2*FQ*(1-FQ))-1)*n)/(n*n*meW)
}
barplot(t(H2), beside = T)
abline(h=c(h2))
legend("topleft", legend=c("h2", "ssh2", "h2w", "ssh2w"))
ME=1/me
##delta
s=sqrt(2*ME/n^2+2*ME^3/n^5*h2^2)
print(paste("Obs sd for h2:", sd(H2[,2])))
## [1] "Obs sd for h2: 0.104546753516027"
print(paste("Exp sd for h2:", s))
## [1] "Exp sd for h2: 0.0899163419622116"
Under the first circumstance, both traits have the same subjects, and \(N_1=N_2=N\).
The target function can be
\[\begin{align} Q&=argmin_{\{\Delta\}}|| \mathbf{yy}^T-( \begin{bmatrix} \sigma^2_{a_1}\mathbf{K} & \gamma_{g}\mathbf{K}\\ \gamma_{g}\mathbf{K} & \sigma^2_{a_2}\mathbf{K} \\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{e_1}\mathbf{I}_N & \gamma_{e}\mathbf{I}_N\\ \gamma_{e}\mathbf{I}_N & \sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} )||^2_F\\ &=tr\{ [\mathbf{yy}^T-( \begin{bmatrix} \sigma^2_{a_1}\mathbf{K} & \gamma_{g}\mathbf{K}\\ \gamma_{g}\mathbf{K} & \sigma^2_{a_2}\mathbf{K} \\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{e_1}\mathbf{I}_N & \gamma_{e}\mathbf{I}_N\\ \gamma_{e}\mathbf{I}_N & \sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} )]^2\} \end{align} \]
in which \(\Delta=(\sigma^2_{a_1},\sigma^2_{e_1},\sigma^2_{a_2},\sigma^2_{e_2},\gamma_g,\gamma_e)\).
Setting the gradient of \(Q\) to zero with respect to each component, \[\color{red}{\frac{\partial Q}{\partial\Delta}} \color{blue}{\frac{\partial \Delta}{\partial\sigma^2_.}}=tr\{ \color{red}{2[\mathbf{yy}^T-( \begin{bmatrix} \sigma^2_{a_1}\mathbf{K} & \gamma_{g}\mathbf{K}\\ \gamma_{g}\mathbf{K} & \sigma^2_{a_2}\mathbf{K} \\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{e_1}\mathbf{I}_N & \gamma_{e}\mathbf{I}_N\\ \gamma_{e}\mathbf{I}_N & \sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} )]} \cdot \color{blue}{\frac{\partial{\Delta}}{\partial \sigma^2_.}} \} \]
For each element in \(\Delta\), its partial can be derived as below
\[\begin{bmatrix} \frac{\partial \Delta}{\partial \sigma^2_{a_1}}=\begin{bmatrix} \mathbf{K}&0\\ 0&0\\ \end{bmatrix}, \frac{\partial \Delta}{\partial \sigma^2_{a_2}}=\begin{bmatrix} 0&0\\ 0&\mathbf{K} \end{bmatrix}, \frac{\partial \Delta}{\partial \gamma_g}=\begin{bmatrix} 0&\mathbf{K}\\ \mathbf{K}&0 \end{bmatrix} \\ \frac{\partial \Delta}{\partial \sigma^2_{e_1}}=\begin{bmatrix} \mathbf{I}_N&0\\ 0&0 \end{bmatrix}, \frac{\partial \Delta}{\partial \sigma^2_{e_2}}=\begin{bmatrix} 0&0\\ 0&\mathbf{I}_N \end{bmatrix}, \frac{\partial \Delta}{\partial \gamma_e}=\begin{bmatrix} 0&\mathbf{I}_N\\ \mathbf{I}_N&0 \end{bmatrix} \end{bmatrix} \]
Then, the \(\frac{\partial Q}{\partial \Delta}\frac{\partial \Delta}{\partial \sigma^2}\) becomes
\[\left \{ \begin{align} \frac{\partial Q}{\partial \sigma^2_{a_1}}&=2\{tr(\mathbf{y_1y_1}^T\mathbf{K})-\sigma^2_{a_1}tr(\mathbf{KK})-\sigma^2_{e_1}tr(\mathbf{K})\}=0 \\ \frac{\partial Q}{\partial \sigma^2_{e_1}}&=2\{tr(\mathbf{y_1y_1}^T\mathbf{I}_N)-\sigma^2_{a_1}tr(\mathbf{KI}_N)-\sigma^2_{e_1}tr(\mathbf{I}_N)\}=0 \\ \frac{\partial Q}{\partial \sigma^2_{a_2}}&=2\{tr(\mathbf{y_2y_2}^T\mathbf{K})-\sigma^2_{a_2}tr(\mathbf{KK})-\sigma^2_{e_2}tr(\mathbf{K})\}=0 \\ \frac{\partial Q}{\partial \sigma^2_{e_2}}&=2\{tr(\mathbf{y_2y_2}^T\mathbf{I}_N)-\sigma^2_{a_2}tr(\mathbf{KI}_N)-\sigma^2_{e_2}tr(\mathbf{I}_N)\}=0 \\ \frac{\partial Q}{\partial \gamma_{g}}&=2\{tr(\mathbf{y_1y_2}^T\mathbf{K}+\mathbf{y_1y_2}^T\mathbf{K})-\gamma_{g}tr(\mathbf{KK}+\mathbf{KK})-\gamma_{e}tr(\mathbf{K}+\mathbf{K})\}=0 \\ \frac{\partial Q}{\partial \gamma_{e}}&=2\{tr(\mathbf{y_1y_2}^T\mathbf{I}_N+\mathbf{y_1y_2}^T\mathbf{I}_N)-\gamma_{g}tr(\mathbf{K}+\mathbf{K})-\gamma_{e}tr(\mathbf{I}_N+\mathbf{I}_N)\}=0 \\ \end{align} \right . \]
And, it turns out a normal equation as below
\[\begin{bmatrix} tr(\mathbf{KK}) & tr(\mathbf{K}) & 0 & 0 &0 &0 \\ tr(\mathbf{K}) & tr(\mathbf{I}_N) & 0 & 0 &0 &0 \\ 0& 0& tr(\mathbf{KK}) & tr(\mathbf{K}) & 0 & 0 \\ 0& 0&tr(\mathbf{K}) & tr(\mathbf{I}_N) & 0 & 0 \\ 0& 0 & 0 & 0& tr(\mathbf{KK}) & tr(\mathbf{K}) \\ 0& 0& 0 & 0 &tr(\mathbf{K}) & tr(\mathbf{I}_N) \\ \end{bmatrix} \left[ \begin{array}{c} \hat{\sigma}^2_{a_1} \\ \hat{\sigma}^2_{e_1} \\ \hat{\sigma}^2_{a_2} \\ \hat{\sigma}^2_{e_2} \\ \hat{\gamma}_g \\ \hat{\gamma}_e \\ \end{array} \right] = \left [ \begin{array}{c} tr[y^T_1\mathbf{K}y_1] \\ tr[y^T_1\mathbf{I}_Ny_1]\\ tr[y^T_2\mathbf{K}y_2] \\ tr[y^T_2\mathbf{I}_Ny_2]\\tr[y^T_1\mathbf{K}y_2]\\tr[y^T_1\mathbf{I}_Ny_2] \end{array} \right ] \]
Then we have \[ \hat{\gamma}_{g}=\frac{y_1^T\mathbf{K}y_2-y_1^Ty_2}{tr[\mathbf{KK^T}]-N}=\frac{N(Nr_g\sqrt{h^2_1h^2_2}/M_e+r_e)-Nr_e}{\frac{N^2}{M_e}}=\frac{N^2r_g\sqrt{h^2_1h^2_2}/M_e}{\frac{N^2}{M_e}}=r_g\sqrt{h^2_1h^2_2} \]
\[\begin{align} \mathcal{Q}&=argmin_{\gamma_g, \gamma_e, \sigma^2_{g_1},\sigma^2_{e_1}, \sigma^2_{g_2}, \sigma^2_{e_2}}||yy^T-( \begin{bmatrix} \sigma^2_{g_1}\mathbf{K}&\sigma^2_{\gamma_g}\mathbf{K}\\ \sigma^2_{\gamma_g}\mathbf{K}&\sigma^2_{g_2}\mathbf{K}\\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{e_1}\mathbf{I}_N&\gamma_{e}\mathbf{I}_N\\ \gamma_{e}\mathbf{I}_N&\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} )||_F^2\\ &=tr\{yy^Tyy^T-2yy^T \begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix}^2 \}\\ \end{align}\]
in which \(y=[y_1, y_2]\) and \[yy^T= \begin{bmatrix} y_1y_1^T&y_1y_2^T\\ y_2y_1^T&y_2y_2^T \end{bmatrix} \]
\[ \begin{align} \frac{\partial \mathcal{Q}}{\sigma^2_{g_1}}&=tr\{-2yy^T \begin{bmatrix} \mathbf{K}&0\\ 0&0\\ \end{bmatrix} + 2\begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} \begin{bmatrix} \mathbf{K}&0\\ 0&0\\ \end{bmatrix}\}\\ &=-2tr(y_1y_1^T\mathbf{K})+2\sigma^2_{g_1}tr(\mathbf{K}^2)+2\sigma^2_{e_1}tr(\mathbf{K}) \end{align} \]
\[ \begin{align} \frac{\partial \mathcal{Q}}{\sigma^2_{e_1}}&=tr\{-2yy^T \begin{bmatrix} \mathbf{I}_N&0\\ 0&0\\ \end{bmatrix} + 2\begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} \begin{bmatrix} \mathbf{I}_N&0\\ 0&0\\ \end{bmatrix}\}\\ &=-2tr(y_1y_1^T\mathbf{I}_N)+2\sigma^2_{g_1}tr(\mathbf{K})+2\sigma^2_{e_1}tr(\mathbf{I}_N) \end{align} \]
Similarly, \(\frac{\partial \mathcal{Q}}{\sigma^2_{g_{2}}}=-2tr(y_2y_2^T\mathbf{K})+2\sigma^2_{g_2}tr(\mathbf{KK^T})+2\sigma^2_{e_2}tr(\mathbf{K})\) and \(\frac{\partial \mathcal{Q}}{\sigma^2_{e_{2}}}=-2tr(y_2y_2^T\mathbf{I}_N)+2\sigma^2_{g_2}tr(\mathbf{K})+2\sigma^2_{e_2}tr(\mathbf{I}_N)\),
\[ \begin{align} \frac{\partial \mathcal{Q}}{\sigma^2_{\gamma_g}}&=tr\{-2yy^T \begin{bmatrix} 0&\mathbf{K}\\ \mathbf{K}&0\\ \end{bmatrix} + 2\begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_{N}&\sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}\\ \sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_{N}&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}\\ \end{bmatrix} \begin{bmatrix} 0&\mathbf{K}\\ \mathbf{K}&0\\ \end{bmatrix}\}\\ &=-4tr(y_1y_2^T\mathbf{K})+4\sigma^2_{\gamma_g}tr(\mathbf{K}\mathbf{K}^T)+4\sigma^2_{\gamma_e}tr(\mathbf{K}) \end{align} \]
\[ \begin{align} \frac{\partial \mathcal{Q}}{\sigma^2_{\gamma_e}}&=tr\{-2yy^T \begin{bmatrix} 0&\mathbf{I}_N\\ \mathbf{I}_N&0\\ \end{bmatrix} + 2\begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \sigma^2_{\gamma_g}\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} \begin{bmatrix} 0&\mathbf{I}_N\\ \mathbf{I}_N&0\\ \end{bmatrix}\}\\ &=-4tr(y_1y_2^T\mathbf{I}_N)+4\sigma^2_{\gamma_g}tr(\mathbf{K})+4\sigma^2_{\gamma_e}tr(\mathbf{I}_N) \end{align} \]
hT1=0.3
hT2=0.5
rg=0.5
re=0.3
n=1000 #sample size
m=500 #marker
frq=runif(m, 0.1, 0.5)
x=matrix(rbinom(m*n, 2, frq), n, m, byrow=T)
b1=rnorm(m)
b2=rg*b1+sqrt(1-rg^2)*rnorm(m)
e1=rnorm(n)
e2=re*e1+sqrt(1-re^2)*rnorm(n)
bv1=x%*%b1
vG1=var(bv1[,1])
y1=bv1+sqrt(vG1/hT1*(1-hT1))*e1
sy1=scale(y1)
bv2=x%*%b2
vG2=var(bv2[,1])
y2=bv2+sqrt(vG2/hT2*(1-hT2))*e2
sy2=scale(y2)
ss1=matrix(0, m, 4)
ss2=matrix(0, m, 4)
for(i in 1:m) {
mod1=lm(sy1~x[,i])
ss1[i,]=summary(mod1)$coefficients[2,]
mod2=lm(sy2~x[,i])
ss2[i,]=summary(mod2)$coefficients[2,]
}
rh2=(mean(ss1[,3]*ss2[,3])-mean(sy1*sy2))/(n/m)
hT1est=(mean(ss1[,3]^2)-mean(sy1*sy1))/(n/m)
hT2est=(mean(ss2[,3]^2)-mean(sy2*sy2))/(n/m)
rGEst=rh2/sqrt(hT1est*hT2est)
\[ \hat{\gamma}_g=\frac{y_1^T\mathbf{K}y_2-y_1^Ty_2}{tr(\mathbf{KK^T})-N_c}=\frac{\sqrt{N_1N_2}(\gamma_g\sqrt{h^2_1h^2_2}/M_e+\frac{N_c}{N_1N_2}r_e)-\frac{N_c}{\sqrt{N_1N_2}}r_e}{\frac{N_1N_2}{M_e}} \]
The correlation is \[\hat{r}_g=\frac{\hat{\gamma_g}}{\sqrt{\hat{\sigma}^2_{g_1}\hat{\sigma}^2_{g_2}}}\]
The sampling variance of \(\hat{r}_g\), according to Delta method (see eq 11), is
\[ var(\hat{r}_g)=\hat{r}^2_g (\frac{var(\sigma^2_{g_1})}{4(\sigma^2_{g_1})^2}+\frac{var(\sigma^2_{g_2})}{4(\sigma^2_{g_2})^2}+\frac{var(\gamma_g)}{(\gamma_g)^2}+\frac{2cov(\sigma^2_{g_1},\sigma^2_{g_2})}{4\sigma^2_{g_1}\sigma^2_{g_2}}-\frac{2cov(\sigma^2_{g_1},\gamma_g)}{2\sigma^2_{g_1}\gamma_g}-\frac{2cov(\sigma^2_{g_2},\gamma_g)}{2\sigma^2_{g_2}\gamma_g}) \]
S=10
SIMU=array(0, dim=c(S, 4))
for(s in 1:S) {
hT1=0.3
hT2=0.5
rg=0.5
re=0.3
nc=500
n1=500
n2=200
N1=nc+n1
N2=nc+n2
m=500 #marker
frq=runif(m, 0.1, 0.5)
x=matrix(rbinom(m*nc, 2, frq), nc, m, byrow=T)
x1=matrix(rbinom(m*n1, 2, frq), n1, m, byrow = T)
x2=matrix(rbinom(m*n2, 2, frq), n2, m, byrow = T)
X1=rbind(x, x1)
X2=rbind(x, x2)
b1=rnorm(m)
b2=rg*b1+sqrt(1-rg^2)*rnorm(m)
e0=rnorm(nc)
e1=rnorm(n1)
e2=rnorm(n2)
E1=c(e0, e1)
E2=c(re*e0+sqrt(1-re^2)*rnorm(nc), e2)
bv1=X1%*%b1
vG1=var(bv1[,1])
y1=bv1+sqrt(vG1/hT1*(1-hT1))*E1
sy1=scale(y1)
sy1c=sy1[1:nc]
bv2=X2%*%b2
vG2=var(bv2[,1])
y2=bv2+sqrt(vG2/hT2*(1-hT2))*E2
sy2=scale(y2)
sy2c=sy2[1:nc]
ss1=matrix(0, m, 4)
ss2=matrix(0, m, 4)
for(i in 1:m) {
mod1=lm(sy1~X1[,i])
ss1[i,]=summary(mod1)$coefficients[2,]
mod2=lm(sy2~X2[,i])
ss2[i,]=summary(mod2)$coefficients[2,]
}
Q=(mean(ss1[,3]*ss2[,3])-nc/sqrt(N1*N2)*mean(sy1c*sy2c)) / (sqrt(N1*N2)/m)
EQ=((sqrt(N1*N2)*rg*sqrt(hT1*hT2)/m+nc/sqrt(N1*N2)*re) - nc/sqrt(N1*N2)*re) / (sqrt(N1*N2)/m)
hT1est=(mean(ss1[,3]^2)-mean(sy1*sy1))/(N1/m)
hT2est=(mean(ss2[,3]^2)-mean(sy2*sy2))/(N2/m)
rG=Q/sqrt(hT1est*hT2est)
rGEst=EQ/sqrt(hT1est*hT2est)
SIMU[s, ]=c(hT1est, hT2est, rG, rGEst)
}
colMeans(SIMU)
## [1] 0.2950739 0.4926739 0.4578649 0.5139267
We have \(\mathbf{M}\), a \((C+1)\times(C+1)\) symmetric matrix,
\[\mathbf{M}=\begin{bmatrix} T_1+N&N&\cdots&N\\ N&T_2+N&\cdots&N\\ \vdots&\vdots&\ddots&\vdots\\ N&N&\cdots&N \end{bmatrix}\]
in which \(N\) is the sample size, and \(T_i=\frac{N^2}{M_{e.i}}\), with \(M_{e.i}\) the effective number of markers for the \(i^{th}\) chromosome.
Using Gaussian elimination, we can find the inverse of \(\mathbf{M}\) as below
\[\begin{bmatrix} T_1+N&N&...&N\\ N&T_2+N&...&N\\ \vdots&\vdots&\cdots&\vdots\\ N&N&\cdots&N \end{bmatrix} \color{red}{\begin{bmatrix} 1&0&...&0\\ 0&1&...&0\\ \vdots&\vdots&\cdots&\vdots\\ 0&0&\cdots&1 \end{bmatrix}} \]
Step 1: subtract \(row_{C+1}\) from \(row_i\), \(i\ne (C+1)\), and we have \[ \begin{bmatrix} T_1&0&...&0\\ 0&T_2&...&0\\ \vdots&\vdots&\cdots&\vdots\\ N&N&\cdots&N \end{bmatrix} \color{red}{\begin{bmatrix} 1&0&...&-1\\ 0&1&...&-1\\ \vdots&\vdots&\cdots&\vdots\\ 0&0&\cdots&1 \end{bmatrix}} \]
Step 2: scale rows \(1\) to \(C\) by \(1/T_i\), and the last row by \(1/N\) \[ \begin{bmatrix} 1&0&...&0\\ 0&1&...&0\\ \vdots&\vdots&\cdots&\vdots\\ 1&1&\cdots&1 \end{bmatrix} \color{red}{\begin{bmatrix} 1/T_1&0&...&-1/T_1\\ 0&1/T_2&...&-1/T_2\\ \vdots&\vdots&\cdots&\vdots\\ 0&0&\cdots&1/N \end{bmatrix}} \]
Step 3: subtract rows \(1\) to \(C\) from the last row, and we finish the inversion and get \(\color{red}{\mathbf{M}^{-1}}\) \[ \begin{bmatrix} 1&0&...&0\\ 0&1&...&0\\ \vdots&\vdots&\cdots&\vdots\\ 0&0&\cdots&1 \end{bmatrix} \color{red}{\begin{bmatrix} 1/T_1&0&...&-1/T_1\\ 0&1/T_2&...&-1/T_2\\ \vdots&\vdots&\cdots&\vdots\\ -1/T_1&-1/T_2&\cdots&1/N+\sum_{i=1}^C\frac{1}{T_i} \end{bmatrix}} \]
Furthermore, \(\color{red}{\mathbf{M}^{-1}}\) can be written as
\[\color{red}{\begin{bmatrix} \frac{M_{e.1}}{N^2}&0&\cdots&0&-\frac{M_{e.1}}{N^2}\\ 0&\frac{M_{e.2}}{N^2}&\cdots&0&-\frac{M_{e.2}}{N^2}\\ \vdots&\vdots&\cdots&0&\vdots\\ 0&0&\cdots&\frac{M_{e.C}}{N^2}&-\frac{M_{e.C}}{N^2}\\ -\frac{M_{e.1}}{N^2}&-\frac{M_{e.2}}{N^2}&\cdots&-\frac{M_{e.C}}{N^2}&\frac{1}{N}(1+\frac{\sum_{i=1}^C{M_{e.i}}}{N}) \end{bmatrix}} \] It involves only \(M_{e.i}\) and \(N\), and all elements are known, so no matrix inversion is needed. The original computational cost of inverting \(\mathbf{M}\), \(\mathcal{O}[(C+1)^3]\), is thus reduced to a linear scale, \(\mathcal{O}(C)\).
n=100
Chr=20
Me=(1:Chr)*10
Mat=matrix(0, Chr+1, Chr+1)
D_mat=n^2/Me
diag(Mat)=c(D_mat,0)
Mat=Mat+n
Mat_I=solve(Mat)
Mat_IE=matrix(0, Chr+1, Chr+1)
Mat_IE[1:Chr, ncol(Mat_IE)]=-1/D_mat
diag(Mat_IE)=c(1/D_mat, 1/n+sum(Me)/n^2)
Mat_IE[nrow(Mat_IE), 1:(ncol(Mat_IE)-1)]=-1/D_mat
layout(matrix(1:3, 1, 3))
plot(main="diag",diag(Mat_IE), diag(Mat_I), xlab="Expected", ylab="Inversed")
abline(a=0, b=1)
plot(main="last colunm",Mat_IE[1:Chr, ncol(Mat_IE)], Mat_I[1:Chr, ncol(Mat_I)], xlab="Expected", ylab="Inversed")
abline(a=0, b=1)
plot(main="last row",Mat_IE[nrow(Mat_IE), ], Mat_I[nrow(Mat_I), ], xlab="Expected", ylab="Inversed")
abline(a=0, b=1)
sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] kableExtra_1.2.1 knitr_1.30 Rcpp_1.0.5
##
## loaded via a namespace (and not attached):
## [1] rstudioapi_0.11 xml2_1.3.2 magrittr_1.5 rvest_0.3.6
## [5] munsell_0.5.0 colorspace_1.4-1 viridisLite_0.3.0 R6_2.4.1
## [9] rlang_0.4.8 stringr_1.4.0 httr_1.4.2 highr_0.8
## [13] tools_3.6.2 webshot_0.5.2 xfun_0.18 htmltools_0.5.0
## [17] yaml_2.2.1 digest_0.6.26 lifecycle_0.2.0 glue_1.4.2
## [21] evaluate_0.14 rmarkdown_2.7 stringi_1.5.3 compiler_3.6.2
## [25] scales_1.1.1