
print(Sys.time())
## [1] "2021-04-02 17:23:36 CST"

1 VC-MoM

Additive model

\(y|\mathbf{\beta},e=\mathbf{X}\mathbf{\beta}+e\)

\(e|\sigma^2_e\sim N(0,\sigma^2_eI_N)\)

\(\beta|\sigma^2_g\sim N(0,\frac{\sigma^2_g}{M}I_M)\)

\(y\) is centered.

\[ \begin{align} cov(y)&=\mathbb{E}(yy^T)-\mathbb{E}(y)\mathbb{E}(y)^T\\ &=\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N \end{align} \]

The target to be minimized is

\(Q=argmin_{(\sigma^2_a,\sigma^2_e)}|| yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)||_F^2\)

Here the subscript \(_{F}\) is for Frobenius norm of a matrix \(\mathbf{A}\), \(||\mathbf{A}||_F=\sqrt{tr(\mathbf{AA}^T)}\).

\[ \begin{align} Q&=argmin_{(\sigma^2_a,\sigma^2_e)}|| yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)||_F^2\\ &=tr\{[yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)][yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)]^T\} \end{align} \]

Minimizing \(Q\) is an ordinary least squares (OLS) problem, solved by setting the partial derivatives below to zero

\[ \left\{ \begin{array}{lll} \frac{\partial{Q}}{\partial{\sigma^2_a}} & =tr\{\frac{\partial[yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)^2]}{\partial{\sigma^2_a}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_a\mathbf{\Omega}_a^T+\sigma^2_e\mathbf{\Omega}_a-yy^T\mathbf{\Omega}_a\}=0\\ \frac{\partial{Q}}{\partial{\sigma^2_e}} & =tr\{\frac{\partial[yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_eI_N)^2]}{\partial{\sigma^2_e}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_a+\sigma^2_e\mathbf{I}_N-yy^T\mathbf{I}_N\}=0 \end{array} \right. \]

\[ \left\{ \begin{array}{lll} tr\{\sigma^2_a\mathbf{\Omega}_a\mathbf{\Omega}_a^T+\sigma^2_e\mathbf{\Omega}_a\}=tr\{yy^T\mathbf{\Omega}_a\}=tr\{y^T\mathbf{\Omega}_ay\}\\ tr\{\sigma^2_a\mathbf{\Omega}_a+\sigma^2_e\mathbf{I}_N\}=tr\{yy^T\mathbf{I}_N\}=tr\{y^T\mathbf{I}_Ny\} \end{array} \right. \] The last step uses the cyclic property of the \(trace\).

The above result can be organized into the normal equations below

\[ \begin{bmatrix} tr[\mathbf{\Omega}_a\mathbf{\Omega}_a^T] & tr[\mathbf{\Omega}_a] \\ tr[\mathbf{\Omega}_a] & N \end{bmatrix} \left[ \begin{array}{c} \hat{\sigma}^2_a \\ \hat{\sigma}^2_e \end{array} \right] = \left[ \begin{array}{c} tr[y^T\mathbf{\Omega}_ay] \\ tr[y^TI_Ny] \end{array} \right] \]

Using \(tr[\mathbf{\Omega}_a]=N\) for standardized genotypes, it is easy to find that

\[\color{green}{\boxed{\hat{\sigma}^2_a=\frac{y^T(\mathbf{\Omega}_a-I_N)y}{tr[\mathbf{\Omega}_a\mathbf{\Omega}_a^T]-N}}}\]

Note: \(y^T(\mathbf{\Omega}_a-I_N)y\) is a quadratic form (wiki link; see also the Quadratic form section below).

\(var[y^T(\mathbf{\Omega}_a-I_N)y]=2tr[H(\mathbf{\Omega}_a-I_N)H(\mathbf{\Omega}_a-I_N)]\)

in which \(H\) can be replaced by the expected correlation matrix of \(y\), which is \(I_N\) here for unrelated individuals. So,

\(var[y^T(\mathbf{\Omega}_a-I_N)y]=2tr[(\mathbf{\Omega}_a-I_N)(\mathbf{\Omega}_a-I_N)]=2tr[\mathbf{\Omega}^2_a-2\mathbf{\Omega}_aI_N+I_N^2]=2[tr(\mathbf{\Omega}_a^2)-N]\approx\frac{2N^2}{M_e}\). In addition, \(E[y^T(\mathbf{\Omega}_a-I_N)y]=[tr(\mathbf{\Omega}_a^2)-N]\sigma^2_a\).
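As a quick sanity check of the boxed point estimate itself (a minimal sketch with simulated unlinked genotypes; all object names here are illustrative, not from the notes). The chunk below it then checks the mean and variance of the quadratic form.

n=500 #sample size
m=2000 #marker
fq=runif(m, 0.1, 0.5)
x=matrix(rbinom(n*m, 2, fq), n, m, byrow=TRUE)
sx=apply(x, 2, scale)
K=sx%*%t(sx)/m #Omega_a
h2=0.4
y=scale(sx%*%rnorm(m, 0, sqrt(h2/m))+rnorm(n, 0, sqrt(1-h2)))
sa2=t(y)%*%(K-diag(n))%*%y/(sum(K*K)-n) #boxed formula; tr(Omega_a Omega_a^T)=sum(K*K)
print(sa2) #should be close to h2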

library(Rcpp)
sourceCpp("~/git/Notes/R/RLib/Shotgun.cpp")
M=10000
N=200
SIMU=50
frq=runif(M, 0.2, 0.8)
Dp=c(runif(M/2, -0.95, -0.5), runif(M/2, 0.5, 0.95))
Dp=Dp[1:(M-1)]

Amat=matrix(0, SIMU, 2)
gMat=GenerateGenoDprimeRcpp(frq, N=N, Dp = Dp)

sG=scale(gMat)
K=sG%*%t(sG)/(M-1)
Me=1/var(K[row(K)<col(K)])
I=diag(1, N, N)

KI=K-I

for(i in 1:SIMU) {
  y=scale(rnorm(N))
  Amat[i,1]=t(y)%*%KI%*%y   #observed quadratic form y'(K-I)y
  ss=I%*%KI%*%I%*%KI        #H(K-I)H(K-I), with H=I for unrelated individuals
  #  ss=KI%*%KI
  Amat[i,2]=2*sum(diag(ss)) #theoretical variance 2tr[H(K-I)H(K-I)]
}
print(paste("Obs mean:", mean(Amat[,1]), " vs expected mean 0"))
## [1] "Obs mean: -0.496725938666893  vs expected mean 0"
print(paste("Obs var:", var(Amat[,1]), "vs expected var:", mean(Amat[,2]), "vs expected var (eq2):", N^2/Me*2))
## [1] "Obs var: 16.4799793466223 vs expected var: 15.7523433069965 vs expected var (eq2): 13.74480721206"

See a simulation example (additive)

n=500 #sample size
m=1000 #marker
h2=0.3 #heritability
b=rnorm(m, 0, sqrt(h2/m)) #effect
SIMU=5

#simu g
fq=runif(m, 0.1, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
  x[,i]=rbinom(n, 2, fq[i])
}
sx=apply(x, 2, scale)

K=sx%*%t(sx)/m
me=var(K[col(K)<row(K)]) #me = 1/Me, the reciprocal of the effective number of markers

#simu y
H2=matrix(0, SIMU, 2)
for(i in 1:SIMU) {
  y=sx%*%b+rnorm(n, 0, sqrt(1-h2))

  yy=y%*%t(y)
  h2Mod=lm(yy[col(yy)<row(yy)]~K[col(yy)<row(yy)])
  H2[i,1]=summary(h2Mod)$coefficients[2,1]
  ss=matrix(0, m, 5)
  for(j in 1:m) {
    mod=lm(y~x[,j])
    ss[j,1:4]=summary(mod)$coefficient[2,]
  }
  ss[,5]=ss[,3]^2
  H2[i,2]=((mean(ss[,5])-1)*n)/(n*n*me)
}
barplot(t(H2), beside = T)
abline(h=h2)

Dominance VC

\(y|\mathbf{\beta},e=\mathbf{X}_a\beta_a+\mathbf{X}_d\beta_d+e\)

\(e|\sigma^2_e\sim N(0,\sigma^2_eI_N)\)

\(\beta_a|\sigma^2_a\sim N(0,\frac{\sigma^2_a}{M}I_M)\)

\(\beta_d|\sigma^2_d\sim N(0,\frac{\sigma^2_d}{M}I_M)\)

\(y\) is centered.

\[ \begin{align} cov(y)&=\mathbb{E}(yy^T)-\mathbb{E}(y)\mathbb{E}(y)^T\\ &=\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N \end{align} \]

When the dominance variance component is included, the target to be minimized is

\(Q=argmin_{(\sigma^2_a,\sigma^2_d,\sigma^2_e)}|| yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)||_F^2\)

To minimize the target function below

\[ \begin{align} Q&=argmin_{(\sigma^2_a,\sigma^2_d,\sigma^2_e)}|| yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)||_F^2\\ &=tr\{[yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)][yy^T-(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)]^T\} \end{align} \] After taking derivatives with respect to each component, we have \[ \left\{ \begin{array}{lll} \frac{\partial{Q}}{\partial{\sigma^2_a}} & =tr\{\frac{\partial[yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)^2]}{\partial{\sigma^2_a}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_a\mathbf{\Omega}_a^T+\sigma^2_d\mathbf{\Omega}_a\mathbf{\Omega}_d^T+\sigma^2_e\mathbf{\Omega}_a-yy^T\mathbf{\Omega}_a\}=0\\ \frac{\partial{Q}}{\partial{\sigma^2_d}} & =tr\{\frac{\partial[yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)^2]}{\partial{\sigma^2_d}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_d\mathbf{\Omega}_a^T+\sigma^2_d\mathbf{\Omega}_d\mathbf{\Omega}_d^T+\sigma^2_e\mathbf{\Omega}_d-yy^T\mathbf{\Omega}_d\}=0\\ \frac{\partial{Q}}{\partial{\sigma^2_e}} & =tr\{\frac{\partial[yy^Tyy^T-2(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)yy^T+(\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_eI_N)^2]}{\partial{\sigma^2_e}}\}&=tr\{\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_e\mathbf{I}_N-yy^T\mathbf{I}_N\}=0 \end{array} \right. \] then \[ \left\{ \begin{array}{lll} tr\{\sigma^2_a\mathbf{\Omega}_a\mathbf{\Omega}_a^T+\sigma^2_d\mathbf{\Omega}_a\mathbf{\Omega}_d^T+\sigma^2_e\mathbf{\Omega}_a\}=tr\{yy^T\mathbf{\Omega}_a\}=tr\{y^T\mathbf{\Omega}_ay\}\\ tr\{\sigma^2_a\mathbf{\Omega}_d\mathbf{\Omega}_a^T+\sigma^2_d\mathbf{\Omega}_d\mathbf{\Omega}_d^T+\sigma^2_e\mathbf{\Omega}_d\}=tr\{yy^T\mathbf{\Omega}_d\}=tr\{y^T\mathbf{\Omega}_dy\}\\ tr\{\sigma^2_a\mathbf{\Omega}_a+\sigma^2_d\mathbf{\Omega}_d+\sigma^2_e\mathbf{I}_N\}=tr\{yy^T\mathbf{I}_N\}=tr\{y^T\mathbf{I}_Ny\} \end{array} \right. \]

We need to solve the equation below \[ \begin{bmatrix} tr[\mathbf{\Omega}_a\mathbf{\Omega}_a^T] & tr[\mathbf{\Omega}_a\mathbf{\Omega}_d^T]& tr[\mathbf{\Omega}_a] \\ tr[\mathbf{\Omega}_d\mathbf{\Omega}_a^T] & tr[\mathbf{\Omega}_d\mathbf{\Omega}_d^T] & tr[\mathbf{\Omega}_d] \\ tr[\mathbf{\Omega}_a] & tr[\mathbf{\Omega}_d] & N \end{bmatrix} \left[ \begin{array}{c} \hat{\sigma}^2_a \\ \hat{\sigma}^2_d \\ \hat{\sigma}^2_e \end{array} \right] = \left[ \begin{array}{c} tr[y^T\mathbf{\Omega}_ay] \\ tr[y^T\mathbf{\Omega}_dy]\\tr[y^TI_Ny] \end{array} \right] \] Of note, \(E[tr(\mathbf{\Omega}_a)]=E[tr(\mathbf{\Omega}_d)]=N\).

Writing \(\mathbf{\Lambda}\) for the coefficient matrix above,

\[ \left[ \begin{array}{c} \hat{\sigma}^2_a \\ \hat{\sigma}^2_d \\ \hat{\sigma}^2_e \\ \end{array} \right ]=\Lambda^{-1} \left[ \begin{array}{c} y^T\mathbf{\Omega}_ay \\ y^T\mathbf{\Omega}_dy\\ y^TI_Ny \end{array} \right] \]

In particular, labeling the entries of \(\mathbf{\Lambda}\) as \(a,b,\ldots,i\) row by row (as in the \(3\times3\) inversion section below), and noting that \(bi-ch=0\) under the orthogonal coding (where \(tr[\mathbf{\Omega}_a\mathbf{\Omega}_d^T]=N\)), \[ \begin{align} \hat{\sigma}^2_a&=\frac{(ei-fh)y^T\mathbf{\Omega}_ay-(bi-ch)y^T\mathbf{\Omega}_dy+(bf-ce)y^TI_Ny}{det(\mathbf{\Lambda})}\\ &=\frac{1}{det(\mathbf{\Lambda})}\{[Ntr(\mathbf \Omega^2_d)-N^2]y^T\mathbf{\Omega}_ay-[Ntr(\mathbf \Omega^2_d)-N^2]y^TI_Ny\}\\ &=\frac{1}{det(\mathbf{\Lambda})}\{[Ntr(\mathbf \Omega^2_d)-N^2]y^T(\mathbf{\Omega}_a-I_N)y\} \end{align} \]

in which \[ \begin{align} det(\mathbf{\Lambda})&=aei+bfg+cdh-ceg-afh-bdi\\ &=\color{red}{tr(\mathbf{\Omega}^2_a)}\color{blue}{tr(\mathbf{\Omega}^2_d)}N+N^3+N^3-N^2tr(\mathbf{\Omega}^2_d)-N^2tr(\mathbf{\Omega}^2_a)-N^3\\ &=N[\color{red}{tr(\mathbf{\Omega^2}_a)}\color{blue}{tr(\mathbf{\Omega^2}_d)}-tr(\mathbf{\Omega^2}_a)N-tr(\mathbf{\Omega}^2_d)N+N^2] \end{align} \]

Under the orthogonal coding for the additive and dominance scheme, \(tr[\mathbf{\Omega}_a\mathbf{\Omega}_d^T]=N\), which is used for the \(b\) and \(d\) entries above, and hence \(det(\mathbf {\Lambda})=N[\color{red}{tr(\mathbf{\Omega}^2_a)}-N][\color{blue}{tr(\mathbf{\Omega}^2_d)}-N]\)

Then, \(\hat{\sigma}^2_a=\frac{y^T[\mathbf{\Omega}_a-I_N]y}{tr(\mathbf{\Omega}^2_a)-N}\).

Similarly, \(\hat{\sigma}^2_d=\frac{y^T[\mathbf{\Omega}_d-I_N]y}{tr(\mathbf{\Omega}^2_d)-N}\).

See a dominance example

n=500 #sample size
m=1000 #marker
h2=0.3 #heritability
h2d=0.3
b=rnorm(m, 0, sqrt(h2/m)) #effect
d=rnorm(m, 0, sqrt(h2d/m))
SIMU=5

#simu g
fq=runif(m, 0.3, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
  x[,i]=rbinom(n, 2, fq[i])
}
FQ=colMeans(x)/2
sA=apply(x, 2, scale)

K=sA%*%t(sA)/m
me=var(K[col(K)<row(K)])


##dominance
xd=matrix(0, n, m)

for(i in 1:m) {
  cd=c(0, 2*FQ[i], 4*FQ[i]-2)
  xd[,i]=cd[x[,i]+1]
}
dCt=matrix(rep(2*FQ^2, n), n, m, byrow = T)
dV=matrix(rep(sqrt(4*FQ^2*(1-FQ)^2), n), n, m, byrow = T)
sD=(xd-dCt)/dV
Kd=sD%*%t(sD)/m
med=var(Kd[col(Kd)<row(Kd)])

Dm=ifelse(x==1, 1, 0)
#simu y
H2=matrix(0, SIMU, 4)
for(i in 1:SIMU) {
  y=x%*%b+Dm%*%d
  vy=var(y)
  y=y+rnorm(n, 0, sqrt(vy/(h2+h2d)*(1-h2-h2d)))
  y=scale(y)
  yy=y%*%t(y)
  h2Mod=lm(yy[col(yy)<row(yy)]~K[col(yy)<row(yy)]+Kd[col(yy)<row(yy)])
  H2[i,1]=summary(h2Mod)$coefficients[2,1]
  H2[i,2]=summary(h2Mod)$coefficients[3,1]
  ss=matrix(0, m, 5)
  ssd=matrix(0, m, 5)
  for(j in 1:m) {
    mod=lm(y~sA[,j]+sD[,j])
    ss[j,1:4]=summary(mod)$coefficient[2,]
    ssd[j,1:4]=summary(mod)$coefficient[3,]
  }
  ss[,5]=ss[,3]^2
  H2[i,3]=((mean(ss[,5])-1)*n)/(n*n*me)
  ssd[,5]=ssd[,3]^2
  H2[i,4]=((mean(ssd[,5])-1)*n)/(n*n*med)
}
barplot(t(H2), beside = T)
abline(h=c(h2, h2d))

UKB case

For a realized example, please refer to \(\color{red} {/public/home/xuhm/gc5k/UKB\_cohort/ss}\), \(\color{green} {/public/home/xuhm/gc5k/UKB\_cohort/ss\_adj}\)

library("knitr")
library("kableExtra")

ch2=read.table("cohortH2.txt", as.is = T, header = T)
knitr::kable(ch2, caption = "cohort h2") %>%
kable_styling("striped", full_width = T) %>%
row_spec(row=16:16, color="white", background="red")
cohort h2
Cohort N Me chisq h2 chisq.1 h2.adjust_sex.
Barts 7699 51936.00 1.116 0.794 1.2230 1.5222
Birmingham 19529 55594.92 1.100 0.287 1.2102 0.6050
Bristol 39172 56774.84 1.218 0.320 1.4353 0.6380
Bury 25404 55092.41 1.154 0.338 1.3046 0.6686
Cardiff 16226 52069.39 1.111 0.361 1.2325 0.7549
Croydon 18475 57405.47 1.107 0.337 1.2260 0.7099
Edinburgh 15441 53041.79 1.111 0.386 1.2156 0.7490
Glasgow 16299 51940.17 1.156 0.502 1.2900 0.9351
Hounslow 18420 56773.33 1.133 0.414 1.2382 0.7427
Leeds 39651 56474.83 1.242 0.349 1.4597 0.6623
Liverpool 29604 54490.45 1.183 0.341 1.3665 0.6826
Manchester 11502 53621.53 1.099 0.468 1.1650 0.7782
Middlesborough 19819 53409.79 1.133 0.362 1.2228 0.6074
Newcastle 34663 54675.71 1.179 0.286 1.3564 0.5686
Nottingham 30760 56132.08 1.173 0.319 1.3407 0.6287
Oxford 12260 55143.14 1.072 0.328 1.1345 0.6112
Reading 25886 58027.47 1.109 0.247 1.2400 0.5440
Sheffield 27745 55244.01 1.145 0.293 1.3013 0.6069
Stoke 18211 51302.58 1.114 0.326 1.2534 0.7220

Oxford

/public/home/xuhm/gc5k/UKB_cohort/ss_chr

library("knitr")
library("kableExtra")
ch2=read.table("oxChr.txt", as.is = T, header = T)
knitr::kable(ch2, caption = "Oxford h2") %>%
kable_styling("striped", full_width = T) %>%
row_spec(row=6:6, bold=T, color = "white", background = "red")
Oxford h2
Chr M Me chisq h2
1 32913 7204.5646 1.0755 0.04480
2 33622 6403.3527 1.0321 0.01700
3 28040 5839.5143 1.0861 0.04140
4 26836 5632.8855 1.0545 0.02530
5 25243 5368.6945 1.0443 0.01960
6 30405 857.1236 1.3061 0.02170
7 23260 5020.4250 1.0346 0.01433
8 21530 3909.6782 1.0736 0.02370
9 18799 4277.3328 1.0470 0.01660
10 21313 4328.9556 1.0324 0.01160
11 20746 3269.6384 1.0349 0.00940
12 20194 4213.4118 1.0714 0.02480
13 14769 3508.2313 1.0363 0.01050
14 13741 3106.6733 1.0716 0.01830
15 13502 2872.9585 1.0998 0.02360
16 14848 3146.7089 1.0180 0.00470
17 14267 1796.0199 1.0360 0.00530
18 12880 3243.4674 1.0560 0.01500
19 12394 2346.5845 1.0938 0.01820
20 11181 2683.5395 1.0824 0.01820
21 6532 2683.5395 1.0076 0.00170
22 7087 1597.1183 1.0161 0.00210

2 GRM statistics \(\mathbf{\Omega}_a\)

\(\mathbf{\Omega}=\frac{1}{M}\mathbf{ss}^T\), in which \(\mathbf{s}\) is the \(N\times M\) standardized genotype matrix (see the coding tables below). In particular, for a pair of individuals \(i\) and \(j\), their relatedness is \(\Omega_{ij}=\frac{1}{M}\sum_{k=1}^Ms_{i,k}s_{j,k}\)

\(\mathbf{\Omega}\) is a Hermitian (symmetric) matrix, and the diagonal element is \(\Omega_{ii}=\frac{1}{M}\sum_{k=1}^Ms_{i,k}^2\).

There exists a simple relationship, \(M_e=1/var(\Omega_{ij, i\ne j})=\frac{M^2}{M+\sum_{l_1 \ne l_2}r_{l_1l_2}^2}\), and empirically \(sd(\hat{M}_e)\approx\frac{2M_e}{N}\), as checked below.

library(Rcpp)
sourceCpp("~/git/Notes/R/RLib/Shotgun.cpp")
M=1000
N=100
frq=runif(M, 0.2, 0.8)
Dp=c(runif(M/2, -0.95, -0.5), runif(M/2, 0.5, 0.95))
Dp=Dp[1:(M-1)]

Amat=matrix(0, 100, 2)

MeMat=matrix(0, 100, 1)
for(i in 1:100) {
gMat=GenerateGenoDprimeRcpp(frq, N=N, Dp = Dp)

sG=scale(gMat)
K=sG%*%t(sG)/(M-1)
MeMat[i,1]=1/var(K[row(K)>col(K)])
}
print(paste("Me:", mean(MeMat)))
## [1] "Me: 590.274088578811"
print(paste("V(Me):", sd(MeMat), "E[V(Me)]:", (2/N*mean(MeMat))))
## [1] "V(Me): 12.0302061579735 E[V(Me)]: 11.8054817715762"

Diagonal elements

\(E(s_{ii}^2)=E(\Omega_{ii})=1\), reflecting that an individual is identical to himself under random mating (no inbreeding; otherwise \(1+F\)).

\[\begin{align} E(\Omega_{ii}^2) &= \frac{1}{M^2}[\color{red} {\sum_{k=1}^M s_{ik}^4}+\color{green}{\sum_{k_1}\sum_{k_2 \ne k_1} s_{i,k_1}^2s_{i,k_2}^2}] \\ &=\frac{1}{M^2}\{\color{red} {\sum_{k=1}^M \frac{1}{2p_kq_k}}+\color{green}{\sum_{k_1}\sum_{k_2 \ne k_1} [(1+\rho^2_{k_1,k_2})+\frac{\rho_{k_1,k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}}]}\} \end{align}\]

Under large sample and many loci, \(E(\frac{\rho_{k_1,k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}})=0\)

So, we have

\[\begin{align} E(\Omega_{ii}^2)&=\frac{1}{M^2}[M\frac{1}{E(\sigma_k^2)}+M(M-1)(1+E(\rho^2_{k_1k_2}))]\\ &=\frac{1}{E(\sigma_k^2)}\frac{1}{M}+\frac{M-1}{M}[1+\color{red}{E(\rho^2_{k_1k_2})}] \end{align}\]

Off-diagonal elements

\(\Omega_{ij}=\frac{1}{M}\sum_{k=1}^Ms_{ik}s_{jk}\)

and \(\Omega_{ij}^2=\frac{1}{M^2}[\sum_{k=1}^Ms^2_{ik}s^2_{jk}+\sum_{k_1=1}^M\sum_{k_2\ne k_1}s_{ik_1}s_{ik_2}s_{jk_1}s_{jk_2}]\)

in which \(E(s_{ik}^2s_{jk}^2)=E(s_{ik}^2)E(s_{jk}^2)\), if the individuals are not related.

\(E(s_{ik_1}s_{ik_2}s_{jk_1}s_{jk_2})=E(s_{ik_1}s_{ik_2})E(s_{jk_1}s_{jk_2})=\rho_{k_1k_2}\times\rho_{k_1k_2}=\rho^2_{k_1k_2}\).

So

\[E(\mathbf{\Omega}_{ij}^2)=\frac{1}{M}+\frac{M-1}{M}\color{green}{E(\rho^2_{k_1k_2})}\]
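A quick Monte Carlo check of this expression (a sketch with unlinked loci, so \(E(\rho^2_{k_1k_2})\approx0\) and \(E(\Omega_{ij}^2)\approx\frac{1}{M}\); the names are illustrative):

M=2000
N=300
x=matrix(rbinom(N*M, 2, 0.3), N, M) #unlinked loci
s=apply(x, 2, scale)
K=s%*%t(s)/M
print(c(var(K[row(K)<col(K)]), 1/M)) #E(Omega_ij^2)=var(Omega_ij) since E(Omega_ij)=0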

\(E(s^2_{ii,k}),E(s^4_{ii,k}), var(s_{ii}^2)\)

Some useful identities for \(s\).

\[E(s^2_{ii,k})=1\]

\[E(\color{red} {s^4_{ii,k}})=\frac{1}{2p_kq_k}\] Here \(2p_kq_k\) is the expected variance of the \(k^{th}\) locus under random mating. \(\color{red} {E(s^4_{ii,k})}\) can be derived from the table below,

\(a_ka_k\) \(A_ka_k\) \(A_kA_k\)
\(q_k^2\) \(2p_kq_k\) \(p_k^2\)
\(\Omega_{ii}=s^2_{ii}\) \(\frac{4p_k^2}{2p_kq_k}\) \(\frac{(q_k-p_k)^2}{2p_kq_k}\) \(\frac{4q_k^2}{2p_kq_k}\)
\(\Omega_{ii}^2=s^4_{ii}\) \(\frac{16p_k^4}{4p_k^2q_k^2}\) \(\frac{(q_k-p_k)^4}{4p_k^2q_k^2}\) \(\frac{16q_k^4}{4p_k^2q_k^2}\)

\[var(s_{ii}^2)=E(s_{ii}^4)-E^2(s_{ii}^2)=\frac{1}{2p_kq_k}-1\]

library(Rcpp)
sourceCpp("~/git/Notes/R/RLib/Shotgun.cpp")
source("~/git/Notes/R/RLib/shotgun.R")
M=1000
N=500
frq=runif(M, 0.1, 0.5)
Dp=sample(c(runif(M/2, 0, .5), runif(M/2, -0.5, 0)), M)
Dp=Dp[-1]

G=GenerateGenoDprimeRcpp(frq, Dp, N)
s=apply(G, 2, scale)
#Kcpp=CorMatrixRcpp(s)
FQ=colMeans(G)/2
s2=s^2
s4=s^4
layout(matrix(1:3, 1, 3))
hist(main="Simulation test for s^2", colMeans(s2), xlab="E(s^2)")
plot(main="Simulation test for s^4", 1/(2*FQ*(1-FQ)), colMeans(s4), pch=16, cex=0.5, xlab="1/(2pq)", ylab="Simulated s^4")
abline(a=0, b=1, col="red", lwd=2, lty=2)

vs2=apply(s2, 2, var)
plot(main="Siulation test for var(s^2)",1/(2*FQ*(1-FQ))-1, vs2, pch=16, cex=0.5, xlab="1/(2pq)-1", ylab="Simulated var(s^2)")
abline(a=0, b=1, col="red", lwd=2, lty=2)

#K
K=1/M * s %*% t(s)

3 \(E(s_{ii,k_1}^2s_{ii,k_2}^2), cov(s_{ii,k_1}^2,s_{ii,k_2}^2)\)

\[\begin{align} E(\color{green}{s_{ii,k_1}^2s_{ii,k_2}^2})&= \color{purple}{q^2_{k_1}\frac{4p_{k_1}^2}{2p_{k_1}q_{k_1}} [\frac{4p_{k_2}^2r^2_{k_1k_2}+(q_{k_2}-p_{k_2})^2 2r_{k_1k_2}\bar{r}_{k_1k_2} + 4q_{k_2}^2\bar{r}_{k_1k_2}^2}{2p_{k_2}q_{k_2}}]}\\ &+2p_{k_1}q_{k_1}\frac{(q_{k_1}-p_{k_1})^2}{2p_{k_1}q_{k_1}}[\frac{4p_{k_2}^2r_{k_1k_2}\bar{R}_{k_1k_2}+(q_{k_2}-p_{k_2})^2 (r_{k_1k_2}R_{k_1k_2}+\bar{r}_{k_1k_2}\bar{R}_{k_1k_2}) + 4q_{k_2}^2\bar{r}_{k_1k_2}R_{k_1k_2}}{2p_{k_2}q_{k_2}}]\\ &+\color{purple}{p^2_{k_1}\frac{4q_{k_1}^2}{2p_{k_1}q_{k_1}}[\frac{4p_{k_2}^2\bar{R}^2_{k_1k_2}+(q_{k_2}-p_{k_2})^2 2R_{k_1k_2}\bar{R}_{k_1k_2} + 4q_{k_2}^2R_{k_1k_2}^2}{2p_{k_2}q_{k_2}}]}\\ &=1+\frac{D_{k_1k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2p_{k_1}q_{k_1}p_{k_2}q_{k_2}}+\frac{D^2_{k_1k_2}}{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}\\ &=1+\rho^2_{k_1k_2}+\frac{\rho_{k_1k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}} \end{align}\] where the collected line follows after substituting \(r,\bar{r},R,\bar{R}\) with their \(D_{k_1k_2}\) expressions (see the conditional probability tables below).

And consequently, we can derive \[\begin{align} cov(s_{ii,k_1}^2,s_{ii,k_2}^2)&=E(s_{ii,k_1}^2s_{ii,k_2}^2)-E(s_{ii,k_1}^2)E(s_{ii,k_2}^2)\\ &=1+\rho^2_{k_1k_2}+\frac{\rho_{k_1k_2}(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}}-1\\ &=\rho^2_{k_1k_2}+\rho_{k_1k_2}\frac{(p_{k_1}-q_{k_1})(p_{k_2}-q_{k_2})}{2\sqrt{p_{k_1}q_{k_1}p_{k_2}q_{k_2}}} \end{align}\]

When \(k_1 = k_2\), obviously \(\rho_{k_1,k_2}=1\) and \(p_{k_1}=p_{k_2}\), and \[cov(s_{ii,k_1}^2,s_{ii,k_2}^2)=1+1\times\frac{(p_{k}-q_{k})^2}{2p_{k}q_{k}}=1+\frac{1-4p_{k}q_{k}}{2p_{k}q_{k}}=\frac{1}{2p_{k}q_{k}}-1=var(s_{ii,k}^2)\]
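The \(k_1\ne k_2\) case can also be checked numerically by drawing two loci in LD from explicit haplotypes (a sketch; \(D\) is chosen within its bound, and all names are illustrative):

N=1000000
p1=0.3; p2=0.4; D=0.05 #D within min(p1*(1-p2), (1-p1)*p2)
h=c(p1*p2+D, p1*(1-p2)-D, (1-p1)*p2-D, (1-p1)*(1-p2)+D) #haplotypes A1A2, A1a2, a1A2, a1a2
hap=matrix(sample(1:4, 2*N, replace=TRUE, prob=h), N, 2) #two haplotypes per individual
x1=rowSums(hap==1|hap==2) #allele count at locus 1
x2=rowSums(hap==1|hap==3) #allele count at locus 2
s1=scale(x1); s2=scale(x2)
rho=D/sqrt(p1*(1-p1)*p2*(1-p2))
print(c(mean(s1^2*s2^2),
  1+rho^2+rho*(2*p1-1)*(2*p2-1)/(2*sqrt(p1*(1-p1)*p2*(1-p2)))))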

4 \(M_{E_a}\)

UKB cohorts test, “/public/home/xuhm/gc5k/UKB_cohort/cohort/grm/sub”

ukMe=read.table("meMarker.txt", as.is = T)
me=ukMe[,-1]
rownames(me)=ukMe[,1]
layout(matrix(1:3, 3, 1))

me1=me[, 1:10]
MeS1=matrix(c(apply(me1, 1, mean), apply(me1, 1, sd)), nrow(me1), 2, byrow = F)
barplot(t(as.matrix(me1)), beside = T)

me2=me[,11:20]
MeS2=matrix(c(apply(me2, 1, mean), apply(me2, 1, sd)), nrow(me2), 2, byrow = F)
barplot(t(as.matrix(me2)), beside = T)

me3=me[,21:30]
MeS3=matrix(c(apply(me3, 1, mean), apply(me3, 1, sd)), nrow(me3), 2, byrow = F)
barplot(t(as.matrix(me3)), beside = T)

MeMean=matrix(0, nrow=nrow(me), 3)
MeMean[,1]=apply(me1, 1, mean)
MeMean[,2]=apply(me2, 1, mean)
MeMean[,3]=apply(me3, 1, mean)

MeSd=matrix(0, nrow=nrow(me), 3)
MeSd[,1]=apply(me1, 1, sd)
MeSd[,2]=apply(me2, 1, sd)
MeSd[,3]=apply(me3, 1, sd)


layout(matrix(1,1,1))
barP=barplot(MeMean, beside = T, ylim=c(0, 60000))
segments(barP, MeMean-MeSd*2, barP, MeMean+MeSd*2)

GRM \(\mathbf{\Omega}_d\)

Coding for genotypes
Genotype \(aa\) \(Aa\) \(AA\)
Additive \(x_A\) 0 1 2
Dominance \(x_d\) 0 \(2p\) \(4p-2\)
Frequency \(q^2\) \(2pq\) \(p^2\)

We have \(E(x_A)=2p\), \(E(x_d)=2p^2\), and \(E(x_Ax_d)=4p^2q+8p^3-4p^2=4p^3\); consequently, \(cov(x_A, x_d)=E(x_Ax_d)-E(x_A)E(x_d)=4p^3-2p\cdot2p^2=0\).

For a pair of individuals, their relatedness in terms of dominance effect is \[ \Omega_{d_{ij}}=\frac{1}{M}\sum_{k=1}^M\frac{(x_{d_{i,k}}-2p_k^2)(x_{d_{j,k}}-2p_k^2)}{4p_k^2(1-p_k)^2} \] The effective number of markers is \(E(m_{e_d})=\frac{M^2}{\sum_{l_1=1}^M\sum_{l_2=1}^M\rho_{l_1l_2}^4}\).

frq=0.25
N=1000
g=rbinom(N, 2, frq)
bv=g*1+1*ifelse(g==1, 1, 0)
vg=var(bv)
h2=0.3
y=bv+rnorm(N, 0, sqrt(vg/h2*(1-h2)))

eF=mean(g)/2

#model 1: naive model
aCode1=g
dT1=c(0,1,0)
dCode1=dT1[(g+1)]
md1=lm(y~aCode1+dCode1)

#model 2: orthogonal model
aCode2=(g-2*eF)/(2*eF*(1-eF))
dT2=c(0, 2*eF, 4*eF-2)
dCode2=(dT2[(g+1)]-2*eF^2)/(4*eF^2*(1-eF)^2)
md2=lm(y~aCode2+dCode2)

#model 3: scale will not change t-test value (compare t-test values of dominance in model 2 and model 3)
aCode2=(g-2*eF)/(2*eF*(1-eF))
dT2=c(0, 2*eF, 4*eF-2)
dCode3=(dT2[(g+1)]-2*eF^2)
md3=lm(y~aCode2+dCode3)

aT=c(summary(md1)$coefficients[2,3], summary(md2)$coefficients[2,3], summary(md3)$coefficients[2,3])
dT=c(summary(md1)$coefficients[3,3], summary(md2)$coefficients[3,3], summary(md3)$coefficients[3,3])

LSE (Phenome)

For a linear model \(\mathbf{y}=\mathbf{X\beta}+e\),

We have the target function \[Q=e^Te=argmin_{\mathbf{\beta}}(\mathbf{y}-\mathbf{X\beta})^T(\mathbf{y}-\mathbf{X\beta})\] It can be minimized with OLS as below \[\begin{align} \frac{\partial Q}{\partial \beta}&=\frac{\partial(\mathbf{y}-\mathbf{X\beta})^T(\mathbf{y}-\mathbf{X\beta})}{\partial \beta}\\ &=\frac{\partial [y^Ty-\color{red}{\mathbf{\beta}^T\mathbf{X}^Ty}-\color{green}{y^T\mathbf{X\beta}}+\color{blue}{\mathbf{\beta}^T\mathbf{X}^T\mathbf{X}\beta}]}{\partial \mathbf{\beta}}\\ &=0-\color{red}{\mathbf{X}^T\mathbf{y}}-\color{green}{\mathbf{X}^T\mathbf{y}}+\color{blue}{[\mathbf{X}^T\mathbf{X}+(\mathbf{X}^T\mathbf{X})^T]\mathbf{\beta}}, (side\ notes, \mathbf{X}^T\mathbf{X}=(\mathbf{X}^T\mathbf{X})^T)\\ &=-2\mathbf{X}^T\mathbf{y}+2\mathbf{X^TX}\beta \end{align} \] Setting it to zero, we have

\[\hat{\mathbf{\beta}}=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}\]
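A minimal numeric check of this closed form against R's lm() (names illustrative):

set.seed(1)
n=100
X=cbind(1, rnorm(n), rnorm(n)) #design matrix with intercept
y=X%*%c(1, 2, -1)+rnorm(n)
bhat=solve(t(X)%*%X)%*%t(X)%*%y #closed form
print(cbind(bhat, coef(lm(y~X[,2]+X[,3])))) #identical columns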

Notes for matrix calculus

1: more topics about matrix calculus please refer to “Matrix Calculus Wiki”.

2: Or read a gentle introduction “Matrix Calculus You Need for Deep Learning”.

GLSE

If it is general linear model, the LSE is to minimize \(Q=argmin_{\mathbf{\beta}}{(\mathbf{y}-\mathbf{X\beta})^TV^{-1}(\mathbf{y}-\mathbf{X\beta})}\), and its corresponding \(\hat{\mathbf{\beta}}=(\mathbf{X}^T\mathbf{V}^{-1}\mathbf{X})^{-1}(\mathbf{X}^T\mathbf{V}^{-1}\mathbf{y})\).

The sampling variance of \(\hat{\mathbf{\beta}}\) is obtained via the \(SSE\),

\[ \begin{align} SSE&=\color{red}{e^T}\color{green}{e}\\ &=\color{red}{(\mathbf{y}-\mathbf{X\hat{\beta}})^T}\color{green}{(\mathbf{y}-\mathbf{X\hat{\beta}})}\\ &=\color{red}{[\mathbf{y}-\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y}]^T}\color{green}{[\mathbf{y}-\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}]}\\ &=\color{red}{\mathbf{y}^T}\color{green}{\mathbf{y}}-\color{red}{\mathbf{y}^T}\color{green}{\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}}-\color{red}{\mathbf{y}^T{\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T}}\color{green}{\mathbf{y}}+\color{red}{\mathbf{y}^T\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^{T}}\color{green}{\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y}}\\ &=\mathbf{y}^T\mathbf{y}-\mathbf{y}^T\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}\\ &=\mathbf{y}^T(I-\mathbf{P}_{\mathbf{X}})\mathbf{y} \end{align} \] Notes: \(\mathbf{P}_{\mathbf{X}}=\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\) is the projection matrix (wiki), with \(\mathbf{P}^n=\mathbf{P}\).

\[\hat{\sigma}_{\beta}^2=\frac{SSE}{n-m}(\mathbf{X}^T\mathbf{X})^{-1}=\hat{\sigma}^2_e(\mathbf{X}^T\mathbf{X})^{-1}\]
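And a matching check of \(SSE\) and \(\hat{\sigma}^2_{\beta}\) via the projection matrix, against summary(lm()) (a sketch; names illustrative):

set.seed(2)
n=200
X=cbind(1, rnorm(n))
y=X%*%c(0.5, 1.5)+rnorm(n)
P=X%*%solve(t(X)%*%X)%*%t(X) #projection matrix P_X
SSE=as.numeric(t(y)%*%(diag(n)-P)%*%y)
Vb=SSE/(n-2)*solve(t(X)%*%X) #sigma^2_e (X'X)^{-1}, with m=2 columns
print(cbind(sqrt(diag(Vb)), summary(lm(y~X[,2]))$coefficients[,2]))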

Variation for LSE

SLR and MLR

There is a connection between the regression coefficients of simple linear regression (SLR) and multiple linear regression (MLR).

\[ \begin{align} \hat{\mathbf{\beta}}&=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}\\ &=(\Lambda_{\sigma_X}\Sigma\Lambda_{\sigma_X})^{-1}\Lambda_{\sigma_X}\rho_{Xy}\sigma_y\\ &=\Lambda_{\sigma_X}^{-1}\Sigma^{-1}\Lambda_{\sigma_X}^{-1}\Lambda_{\sigma_X}\rho_{Xy}\sigma_y\\ &=\Lambda_{\sigma_X}^{-1}\Sigma^{-1}\rho_{Xy}\sigma_y\\ &=\Lambda_{\sigma_X}^{-1}\Sigma^{-1}\Lambda_{\sigma_X}\hat{b}_{Xy}\\ \end{align} \]

in which \(\hat{b}_{[i]}=\frac{cov(y, x_i)}{var(x_i)}=\rho_{y,x_i}\frac{\sigma_y}{\sigma_{x_i}}\) is estimated from the simple linear regression \(y=a+b_ix_i+e\). When the predictors are standardized (\(\Lambda_{\sigma_X}=I\)), this reduces to \(\hat{\mathbf{\beta}}=\Sigma^{-1}\hat{b}_{Xy}\).

And similarly, \(\hat{\sigma}^2_{\beta}=(\frac{y^Ty-\hat{\mathbf{\beta}}^T\hat{b}}{n-m})(\mathbf{\Lambda_{\sigma_X}\Sigma\Lambda_{\sigma_X}})^{-1}\)
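A small check that \(\hat{\mathbf{\beta}}=\Sigma^{-1}\hat{b}_{Xy}\) for standardized predictors (a sketch; the correlation matrix below is illustrative):

set.seed(3)
n=500
R=matrix(c(1,.5,.2, .5,1,.3, .2,.3,1), 3, 3) #predictor correlation
X=scale(matrix(rnorm(n*3), n, 3)%*%chol(R))
y=scale(X%*%c(1,-1,0.5)+rnorm(n))
b=sapply(1:3, function(i) coef(lm(y~X[,i]))[2]) #SLR slopes b_hat
print(cbind(solve(cor(X), b), coef(lm(y~X))[-1])) #Sigma^{-1} b_hat vs MLR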

Real example

‘/public/home/xuhm/gc5k/UKB_cohort/ss_adj’

library("knitr")
library("kableExtra")

ch2=read.table("CorUK.txt", as.is = T, header = T)
knitr::kable(ch2, caption = "cohort h2") %>%
kable_styling("striped", full_width = T) %>%
row_spec(row=16:16, color="white", background="red")
cohort h2
Cohort Gender ht cor
Barts 1 1 -0.6918
Birmingham 1 1 -0.7139
Bristol 1 1 -0.7183
Bury 1 1 -0.7131
Cardiff 1 1 -0.7244
Croydon 1 1 -0.7133
Edinburgh 1 1 -0.7114
Glasgow 1 1 -0.7160
Hounslow 1 1 -0.7170
Leeds 1 1 -0.7136
Liverpool 1 1 -0.7170
Manchester 1 1 -0.6995
Middlesborough 1 1 -0.7199
Newcastle 1 1 -0.7173
Nottingham 1 1 -0.7177
Oxford 1 1 -0.7153
Reading 1 1 -0.7252
Sheffield 1 1 -0.7180
Stoke 1 1 -0.7164

It is adjusted variance.

Quadratic form

Let \(x\) be a vector of \(n\) random variables, and \(A\) be an \(n\times n\) symmetric matrix; the scalar quantity \(x^TAx\) is known as a quadratic form. If \[x \sim MVN(\mu,\Sigma)\], it can be shown that

\[E[x^TAx]=tr[A\Sigma]+\mu^TA\mu\]

It can be proved as below: \(x^TAx = tr[x^TAx]\), and by the cyclic property of the trace operator, \(E[tr(x^TAx)]=E[tr(Axx^T)]\). Since the trace is a linear operator, it follows from the linearity of the expectation operator that \[E[tr(Axx^T)]=tr[AE(xx^T)]=tr[A(\Sigma+\mu\mu^T)]=tr(A\Sigma)+tr(A\mu\mu^T)=tr(A\Sigma)+\mu^TA\mu\].

Variance

\[var[x^TAx]=2tr[A\Sigma A\Sigma]+4\mu^TA\Sigma A\mu\]

Covariance

\[cov[x^TA_1x, x^TA_2x]=2tr[A_1\Sigma A_2\Sigma]+4\mu^TA_1\Sigma A_2\mu\] For both variance and covariance, \(A\) is required to be symmetric.
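A Monte Carlo check of the expectation and variance identities (a sketch; names illustrative):

set.seed(4)
n=4; B=200000
A=crossprod(matrix(rnorm(n*n), n)) #symmetric A
Sg=crossprod(matrix(rnorm(n*n), n)) #covariance Sigma
mu=rnorm(n)
x=matrix(rnorm(B*n), B, n)%*%chol(Sg)+matrix(mu, B, n, byrow=TRUE)
q=rowSums((x%*%A)*x) #x'Ax for each draw
print(c(mean(q), sum(diag(A%*%Sg))+t(mu)%*%A%*%mu))
print(c(var(q), 2*sum(diag(A%*%Sg%*%A%*%Sg))+4*t(mu)%*%A%*%Sg%*%A%*%mu))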

When \(A\) is not symmetric

When \(A\) is asymmetric, note that \[x^TAx=x^T\tilde{A}x,\quad \tilde{A}=(A+A^T)/2\] and the expressions for var and cov are the same, provided \(A\) is replaced by \(\tilde{A}\).

Resources for quadratic forms wiki link and Lecture notes.

Isserlis theorem

It relates to higher-order moments of the multivariate normal distribution; see Isserlis Theorem, published in Biometrika, 12:134-139.

Let the \(X_i\) be jointly Gaussian variables, each following the standard Gaussian distribution \(N(0,1)\). The products below can be expanded as

Example 1 (4th-order moments)

\[ \begin{align} E(X_1X_2X_3X_4)&=E(X_1X_2)E(X_3X_4)+E(X_1X_3)E(X_2X_4)+E(X_1X_4)E(X_2X_3) \end{align} \] When all four variables are the same, say \(X_1\), it is \(E(X_1^4)=3E^2(X_1^2)=3\).
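A quick check of the pairing identity with correlated standard normals (a sketch; the correlation matrix below is illustrative):

set.seed(5)
Sg=matrix(c(1,.3,.2,.1, .3,1,.2,.1, .2,.2,1,.3, .1,.1,.3,1), 4, 4)
X=matrix(rnorm(2e5*4), 2e5, 4)%*%chol(Sg) #rows ~ MVN(0, Sg), unit variances
print(c(mean(X[,1]*X[,2]*X[,3]*X[,4]),
  Sg[1,2]*Sg[3,4]+Sg[1,3]*Sg[2,4]+Sg[1,4]*Sg[2,3]))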

Example 2 (6th-order moments)

\[ \begin{align} E(X_1X_2X_3X_4X_5X_6)&=E(X_1X_2)E(X_3X_4X_5X_6)+E(X_1X_3)E(X_2X_4X_5X_6)\\ &+E(X_1X_4)E(X_2X_3X_5X_6)+E(X_1X_5)E(X_2X_3X_4X_6)\\ &+E(X_1X_6)E(X_2X_3X_4X_5), (5\ terms)\\ &=E(X_1X_2)[E(X_3X_4)E(X_5X_6)\\ &+E(X_3X_5)E(X_4X_6)+E(X_3X_6)E(X_4X_5)]\\ &+...,(each\ 4th\ moment\ expanded\ as\ in\ Example\ 1)\\ \end{align} \] giving \(15\) pairing terms in total; when all six variables are the same standard normal, \(E(X_1^6)=15\).

However, Isserlis' theorem works for the normal distribution only, and a standardized locus is not "normal".

Cumulant

Another possible method is the "Cumulant", see its Wiki, or the "joint Cumulant" if covariance exists between two vectors, see Joint Cumulant.

Inversion

\(2 \times 2\)

\[\mathbf{A}^{-1}= \begin{bmatrix} a & b \\ c & d \end{bmatrix} ^{-1} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\]

\(3 \times 3\)

\[\mathbf{A}^{-1}= \begin{bmatrix} a & b &c \\ d & e & f \\g&h&i \end{bmatrix} ^{-1} = \frac{1}{det(\mathbf{A})}\begin{bmatrix} A & D &G \\B&E&H \\C & F &I \end{bmatrix}\] in which \(A,B,\ldots,I\) are scalar cofactors (not matrices): \(A=(ei-fh)\), \(B=-(di-fg)\), \(C=(dh-eg)\), \(D=-(bi-ch)\), \(E=(ai-cg)\), \(F=-(ah-bg)\), \(G=(bf-ce)\), \(H=-(af-cd)\), \(I=(ae-bd)\).
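A small check of the cofactor formula against solve() (the letters follow the notation above; c0 is used to avoid masking base c()):

A0=matrix(c(2,1,0, 1,3,1, 0,1,4), 3, 3, byrow=TRUE)
a=A0[1,1]; b=A0[1,2]; c0=A0[1,3]; d=A0[2,1]; e=A0[2,2]; f=A0[2,3]
g=A0[3,1]; h=A0[3,2]; i=A0[3,3]
dt=a*(e*i-f*h)-b*(d*i-f*g)+c0*(d*h-e*g) #det(A)
Ainv=matrix(c(e*i-f*h, -(b*i-c0*h), b*f-c0*e,
  -(d*i-f*g), a*i-c0*g, -(a*f-c0*d),
  d*h-e*g, -(a*h-b*g), a*e-b*d), 3, 3, byrow=TRUE)/dt
print(max(abs(Ainv-solve(A0)))) #numerically zero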

Conditional probability

Haplotype for a pair of biallelic loci


 \(a_l\) \(A_l\)
\(a_k\) \(r_{kl}=q_l+\frac{D_{kl}}{q_k}\) \(\bar{r}_{kl}=p_l-\frac{D_{kl}}{q_k}\) \(q_k\)
\(A_k\) \(\bar{R}_{kl}=q_l-\frac{D_{kl}}{p_k}\) \(R_{kl}=p_l+\frac{D_{kl}}{p_k}\) \(p_k\)
 \(q_l\) \(p_l\)

Each entry is the conditional probability of the allele at locus \(l\) given the allele at locus \(k\); the margins are the allele frequencies.

\(p_k\) and \(p_l\) are the frequencies of alleles \(A_k\) and \(A_l\) at the two loci, respectively. The transformation \(D_{kl}=\rho_{kl}\sqrt{p_kq_kp_lq_l}\) holds. Of note, \(D_{kl}\) is constrained by min(\(p_kq_l\), \(q_kp_l\)), and consequently, at the bound, \(\rho_{kl}=\sqrt{\frac{p_kq_l}{p_lq_k}}\le1\).

Frequencies for a pair of loci for the \(i^{th}\) individual


Locus \(l\)
Geno \(a_la_l\) \(a_lA_l\) \(A_lA_l\)
Standardized geno \(s_{il}=\frac{-2p_l}{\sqrt{2p_lq_l}}\) \(s_{il}=\frac{q_l-p_l}{\sqrt{2p_lq_l}}\) \(s_{il}=\frac{2q_l}{\sqrt{2p_lq_l}}\)
Frequency \(q_l^2\) \(2p_lq_l\) \(p_l^2\)
\(a_ka_k\) \(s_{ik}=\frac{-2p_k}{\sqrt{2p_kq_k}}\) \(q_k^2\) \(r_{kl}^2\) \(2r_{kl}\bar{r}_{kl}\) \(\bar{r}_{kl}^2\)
Locus \(k\) \(A_ka_k\) \(s_{ik}=\frac{q_k-p_k}{\sqrt{2p_kq_k}}\) \(2p_kq_k\) \(r_{kl}\bar{R}_{kl}\) \(r_{kl}R_{kl}+\bar{r}_{kl}\bar{R}_{kl}\) \(\bar{r}_{kl}R_{kl}\)
\(A_kA_k\) \(s_{ik}=\frac{2q_k}{\sqrt{2p_kq_k}}\) \(p_k^2\) \(\bar{R}_{kl}^2\) \(2R_{kl}\bar{R}_{kl}\) \(R_{kl}^2\)

Liability scale

See Eq 23 in Am J Hum Genet, 2011, 88:294-305.

\[h^2_l=h^2_o\frac{K(1-K)}{z^2}\frac{K(1-K)}{P(1-P)}\] in which \(K\) is the prevalence of the disease (binary), and \(P\) is the proportion of cases in the sample.

See a simple example below

K=0.01
P=0.5
ho2=0.5
z=dnorm(qnorm(K))

hl2=ho2*K*(1-K)*K*(1-K)/z^2/(P*(1-P))
print(paste("Observed scale", ho2))
## [1] "Observed scale 0.5"
print(paste("Liability scale", hl2))
## [1] "Liability scale 0.275953649031586"

Projected PC

Algorithm. It is a simple linear regression, but against the eigenvector.

1.1 Sample \(N_c\) individuals from the whole population

1.2 Generate eigenvectors (\(pc\)) from these \(N_c\) individuals

1.3 \(\beta_k=\frac{cov(pc,x_k)}{var(x_k)}\) for each SNP \(k\)

2.1 Project the remaining individuals: \(\tilde{pc}=\sum_{k=1}^M\hat{\beta}_kx_k\).

However, there are two technical issues here, about the SNP strand and the switch of the minor allele. See the summary [here](https://github.com/gc5k/Notes/blob/master/StG/TechNotes/PalindromeLoci.pdf), and a sketch below.
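A minimal sketch of steps 1.1-2.1 (reference PCs regressed on each SNP, then projected into a target sample; all names are illustrative and the strand/allele-switch issues above are ignored):

nc=200 #reference sample
nt=100 #target sample
m=500
frq=runif(m, 0.1, 0.5)
Xc=matrix(rbinom(nc*m, 2, frq), nc, m, byrow=TRUE) #step 1.1
Xt=matrix(rbinom(nt*m, 2, frq), nt, m, byrow=TRUE)
sc=apply(Xc, 2, scale)
pc=eigen(sc%*%t(sc)/m)$vectors[,1] #reference eigenvector (step 1.2)
beta=as.vector(cov(pc, sc))/apply(sc, 2, var) #per-SNP regression (step 1.3)
st=(Xt-matrix(colMeans(Xc), nt, m, byrow=TRUE))/matrix(apply(Xc, 2, sd), nt, m, byrow=TRUE)
pcProj=st%*%beta/m #projected PC (step 2.1), up to a scale factor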

A note on \(L_B\)

\[ \begin{align} L_B&=\frac{1}{B}\sum_{b=1}^Bz_b^T\mathbf{KK}^Tz_b,\quad \mathbb{E}(L_B)=tr(\mathbf{KK}^T)\\ &=\frac{1}{B}\frac{1}{M^2}\sum_{b=1}^Bz_b^T\mathbf{XX}^T\mathbf{XX}^Tz_b\\ &=\frac{1}{B}\frac{1}{M^2}\sum_{b=1}^B||\mathbf{XX}^Tz_b||_2^2\\ \end{align} \]

The sampling variance is

\[ \begin{aligned} var(L_B)&=\mathbb{E}\{[L_{B}-tr(\mathbf{KK}^T)]^2\}\\ &=var(\frac{1}{B}\sum_{b=1}^Bz_b^T\mathbf{KK}^Tz_b)\\ &=\frac{1}{B^2}var(\sum_{b=1}^Bz_b^T\mathbf{KK}^Tz_b) \\ &\text{For quadratic forms, } var(z^T\mathbf{A}z)=2tr(\Sigma_z\mathbf{A}\Sigma_z\mathbf{A})+4\mu_z^T\mathbf{A}\Sigma_z\mathbf{A}\mu_z\\ &=\frac{1}{B^2}\sum_{b=1}^B2tr(\Sigma_z\mathbf{KK}^T\Sigma_z\mathbf{KK}^T)\\ &\text{because }z\sim MVN(0, I_N)\\ &=\frac{2}{B}tr(\mathbf{KK}^T\mathbf{KK}^T)\\ &=\frac{2}{B}\sum_{l_1}\sum_{l_2}\mathcal{K}_{l_1,l_2}^2 \end{aligned} \]

in which \(\mathcal{K}=\mathbf{KK}^T\).

An example

n=300 #sample size
m=500 #marker
SM=50 #simulation
BS=25 #randomization factor

LK=0

fq=runif(m, 0.1, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
  x[,i]=rbinom(n, 2, fq[i])
}
sx=apply(x, 2, scale)
k=sx%*%t(sx)/m
ksq=k^2
k2=k%*%t(k)
k4=k2%*%t(k2)
k4Alt=k2^2
#evaluate LB

LK=array(0, SM)
for(i in 1:SM) {
  Lb=array(0, BS)
  for(j in 1:BS) {
    z=matrix(rnorm(n), n, 1)
    x1=t(sx)%*%z
    x2=sx%*%x1
    Lb[j]=(t(x2)%*%x2)[1,1]
  }
  LK[i]=sum(Lb)/(BS*m^2)
}
print(paste("Obs mean", mean(LK), "vs expected mean", n^2/m+n)) 
## [1] "Obs mean 474.194724827433 vs expected mean 480"
print(paste("obs var", var(LK), " vs expected var", 2*sum(diag(k4))/BS))
## [1] "obs var 162.536901365627  vs expected var 164.233602459417"
print(paste("incorrected var in Wu's paper", (n^2/m+n)/BS))
## [1] "incorrected var in Wu's paper 19.2"
print(paste("sum(diag(K4))=", sum(diag(k4)), "sum[(kkT)^2]", sum(k4Alt)))
## [1] "sum(diag(K4))= 2052.92003074272 sum[(kkT)^2] 2052.92003074272"
Sum_D_k2_sq=sum(diag(k2))^2

ZX=n^2*m^4+(2*n^3+2*n^2+8*n)*m^3+(n^4+2*n^3+21*n^2+20*n)*m^2+(8*n^3+20*n^2+20*n)*m

print(paste(sum(diag(k2)^2)*n, ZX/m^4, sum(diag(k2))^2))
## [1] "229957.108221216 230990.126448 227459.251387183"
EU=n*m^2+(n^2+n)*m
Sum_D_k2=sum(diag(k2))
print(paste(EU/m^2, Sum_D_k2))
## [1] "480.6 476.926882642594"

About \(\mathbf{KK}^T\)

About \(\mathbf{K}\) and its statistics.

\[ \mathbf{K}=\frac{1}{M} \begin{bmatrix} k_{\color{blue}{1},\color{red}{1}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{1},m}-2p_m)(\color{red}{x}_{\color{red}{1},m}-2p_m)}{2p_mq_m} & k_{\color{blue}{1},\color{red}{2}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{1},m}-2p_m)(\color{red}{x}_{\color{red}{2},m}-2p_m)}{2p_mq_m} & \cdots & k_{\color{blue}{1},\color{red}{n}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{1},m}-2p_m)(\color{red}{x}_{\color{red}{n},m}-2p_m)}{2p_mq_m}\\ k_{\color{blue}{2},\color{red}{1}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{2},m}-2p_m)(\color{red}{x}_{\color{red}{1},m}-2p_m)}{2p_mq_m} & k_{\color{blue}{2},\color{red}{2}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{2},m}-2p_m)(\color{red}{x}_{\color{red}{2},m}-2p_m)}{2p_mq_m} & \cdots & k_{\color{blue}{2},\color{red}{n}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{2},m}-2p_m)(\color{red}{x}_{\color{red}{n},m}-2p_m)}{2p_mq_m}\\ \cdots & \cdots & \cdots & \cdots\\ k_{\color{blue}{n},\color{red}{1}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{n},m}-2p_m)(x_{\color{red}{1},m}-2p_m)}{2p_mq_m} &k_{\color{blue}{n},\color{red}{2}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{n},m}-2p_m)(x_{\color{red}{2},m}-2p_m)}{2p_mq_m} & \cdots &k_{\color{blue}{n},\color{red}{n}}=\sum_{m=1}^M\frac{(\color{blue}{x}_{\color{blue}{n},m}-2p_m)(x_{\color{red}{n},m}-2p_m)}{2p_mq_m} \end{bmatrix} \]

in which \[ \mathcal{K}=\mathbf{KK}^T=\frac{1}{M^2} \begin{bmatrix} \boxed{\mathcal{K}_{\color{blue}{1},\color{red}{1}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{1}, \color{green}{l}}k_{\color{green}{l},\color{red}{1}}} &\mathcal{K}_{\color{blue}{1},\color{red}{2}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{1},\color{green}{l}}k_{\color{green}{l},\color{red}{2}} &\cdots &\mathcal{K}_{\color{blue}{1},\color{red}{n}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{1},\color{green}{l}}k_{\color{green}{l},\color{red}{n}}\\ \mathcal{K}_{\color{blue}{2},\color{red}{1}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{2}, \color{green}{l}}k_{\color{green}{l},\color{red}{1}} &\boxed{\mathcal{K}_{\color{blue}{2},\color{red}{2}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{2},\color{green}{l}}k_{\color{green}{l},\color{red}{2}}} &\cdots &\mathcal{K}_{\color{blue}{2},\color{red}{n}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{2},\color{green}{l}}k_{\color{green}{l},\color{red}{n}}\\ \vdots & \vdots & \vdots & \vdots\\ \mathcal{K}_{\color{blue}{n},\color{red}{1}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{n}, \color{green}{l}}k_{\color{green}{l},\color{red}{1}} &\mathcal{K}_{\color{blue}{n},\color{red}{2}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{n},\color{green}{l}}k_{\color{green}{l},\color{red}{2}} &\cdots &\boxed{\mathcal{K}_{\color{blue}{n},\color{red}{n}}=\sum_{\color{green}{l}=1}^nk_{\color{blue}{n},\color{green}{l}}k_{\color{green}{l},\color{red}{n}}}\\ \end{bmatrix} \] where the \(k\) entries are the unnormalized sums above.

Furthermore \[ \begin{aligned} \mathcal{K}_{\color{blue}{\mathcal{L}_1},\color{red}{\mathcal{L}_2}} &=\mathbf{K}_{[\color{blue}{\mathcal{L}_1},]}\mathbf{K}^T_{[,\color{red}{\mathcal{L}_2}]}\\ &=\sum_{\color{green}{l}=1}^nk_{\color{blue}{\mathcal{L}_1},\color{green}{l}}k_{\color{green}{l},\color{red}{\mathcal{L}_2}}\\ &=\frac{1}{M^2} \begin{bmatrix} k_{\color{blue}{\mathcal{L}_1},\color{green}{1}}=\sum_{m_1=1}^M\frac{(\color{blue}{x}_{\color{blue}{\mathcal{L}_1},m_1}-2p_{m_1})(\color{green}{x}_{\color{green}{1},m_1}-2p_{m_1})}{2p_{m_1}q_{m_1}}\\ k_{\color{blue}{\mathcal{L}_1},\color{green}{2}}=\sum_{m_1=1}^M\frac{(\color{blue}{x}_{\color{blue}{\mathcal{L}_1},m_1}-2p_{m_1})(\color{green}{x}_{\color{green}{2},m_1}-2p_{m_1})}{2p_{m_1}q_{m_1}}\\ \vdots\\ k_{\color{blue}{\mathcal{L}_1},\color{green}{n}}=\sum_{m_1=1}^M\frac{(\color{blue}{x}_{\color{blue}{\mathcal{L}_1},m_1}-2p_{m_1})(\color{green}{x}_{\color{green}{n},m_1}-2p_{m_1})}{2p_{m_1}q_{m_1}}\\ \end{bmatrix}^T \begin{bmatrix} k_{\color{green}{1},\color{red}{\mathcal{L}_2}}=\sum_{m_2=1}^M\frac{(\color{green}{x}_{\color{green}{1},m_2}-2p_{m_2})(\color{red}{x}_{\color{red}{\mathcal{L}_2},m_2}-2p_{m_2})}{2p_{m_2}q_{m_2}}\\ k_{\color{green}{2},\color{red}{\mathcal{L}_2}}=\sum_{m_2=1}^M\frac{(\color{green}{x}_{\color{green}{2},m_2}-2p_{m_2})(\color{red}{x}_{\color{red}{\mathcal{L}_2},m_2}-2p_{m_2})}{2p_{m_2}q_{m_2}}\\ \vdots\\ k_{\color{green}{n},\color{red}{\mathcal{L}_2}}=\sum_{m_2=1}^M\frac{(\color{green}{x}_{\color{green}{n},m_2}-2p_{m_2})(\color{red}{x}_{\color{red}{\mathcal{L}_2},m_2}-2p_{m_2})}{2p_{m_2}q_{m_2}} \end{bmatrix}\\ &=\frac{1}{M^2}\sum_{\color{green}{l}}^n \left\{ [\sum_{m_1=1}^M\color{blue}{s}_{\color{blue}{\mathcal{L}_1},m_1}\color{green}{s}_{\color{green}{l},m_1}] [\sum_{m_2=1}^M\color{green}{s}_{\color{green}{l},m_2}\color{red}{s}_{\color{red}{\mathcal{L}_2},m_2}] \right\}\\ &=\frac{1}{M^2}\sum_{\color{green}{l}}^n \left\{ \sum_{m_1,m_2}^M \color{blue}{s_{\mathcal{L}_1,m_1}} \color{green}{s}_{l,m_1}\color{green}{s}_{l,m_2} \color{red}{s_{{\mathcal{L}_2},m_2}} \right\} \end{aligned} \]

And

\[ \begin{aligned} \mathcal{K}_{\mathcal{L}_1,\mathcal{L}_2}^2&=\frac{1}{M^4} \left\{ \sum_{l}^n \left\{ \sum_{m_1,m_2}^M \color{blue}{s_{\mathcal{L}_1,m_1}} \color{green}{s}_{l,m_1} \color{red}{s_{\mathcal{L}_2,m_2}} \color{green}{s}_{l,m_2}\right\}\right\}^2\\ &=\frac{1}{M^4} \sum_{l_1,l_2}^n \sum_{m_1,\color{brown}{m_1^{\prime}}}^M \sum_{m_2,\color{pink}{m_2^{\prime}}}^M \underline{\color{blue}{s_{\mathcal{L}_1,m_1}} \color{blue}{s_{\mathcal{L}_1,\color{brown}{m_1^{\prime}}}}} \times\boxed{\color{green}{s}_{l_1,m_1} \color{green}{s}_{l_1,\color{brown}{m_1^{\prime}}}} \times\underline{\color{red}{s_{\mathcal{L}_2,m_2}} \color{red}{s}_{\color{red}{\mathcal{L}_2},\color{pink}{m_2^{\prime}}}} \times\boxed{\color{green}{s}_{l_2,m_2}\color{green}{s}_{l_2,\color{pink}{m_2^{\prime}}}} \end{aligned} \]

\(\Omega_w\)

TopTable

In the population-based design, the relatedness is measured in IBS.

 Additive Dominance
classic \(h^2=\beta HLH\beta\) \(h^2_d=d H_dL_dH_dd\)
\(h^2_{SNP}\) \(h^2_{SNP.A}=m\beta H\{\frac{ \sum_{k=1}^{m} v_k^Tv_k }{1^TP1}\} H \beta\) \(h^2_{SNP.D}=md H_d\{\frac{ \sum_{k=1}^{m} v_{k,d}^T v_{k,d} }{1^TP_d1}\} H_d d\)
\(h^2_{SNP_w}\) \(\tilde{h}^2_{SNP.A}=m\beta H\{\frac{ \sum_{k=1}^{m}w_k v_{k}^Tv_k }{w^TPw}\} H \beta\) \(\tilde{h}^2_{SNP.D}=md H_d\{\frac{ \sum_{k=1}^{m} w_{k,d}v_{k.d}^Tv_{k.d} }{w_{d}^TP_dw_{d}}\} H_d d\)
\(h^2_{cor}\) \(h^2_{SNP.A_2}=m\beta_1 H_1\{\frac{ \sum_{k=1}^{m} v_{k_1}^Tv_{k_2} }{1^TP1}\} H_2 \beta_2\) \(h^2_{SNP.D_2}=md_1 H_{d_1}\{\frac{ \sum_{k=1}^{m} v_{k,d_1}^Tv_{k,d_2} }{1^TP_d1}\} H_{d_2} d_2\)
\(h^2_{cor_w}\) \(\tilde{h}^2_{SNP.A_2}=m\beta_1 H_1\{\frac{ \sum_{k=1}^{m}w_k v_{k,1}^Tv_{k,2} }{w^TPw}\} H_2 \beta_2\) \(\tilde{h}^2_{SNP.D_2}=md_1 H_{d_1}\{\frac{ \sum_{k=1}^{m} w_{k,d}v_{k,d_1}^Tv_{k,d_2} }{w_{d}^TP_dw_{d}}\} H_{d_2} d_2\)

\[\Omega_w=\frac{\sum_{k=1}^M(x_{i,k}-2p_k)(x_{j,k}-2p_k)}{\sum_{k=1}^M2p_kq_k}\] Like \(M_e\), \(M_{e_w}\) can now be defined accordingly.

\[h^2_w=\frac{ (\frac{\sum_{k=1}^M2p_kq_k\chi^2_{1.k}}{\sum_{k=1}^M2p_kq_k}-1)n }{n^2/M_{e_w}}\]

An example for \(h^2_{SNP}\) and \(h^2_{SNP_w}\)

n=500 #sample size
m=1000 #marker
h2=0.3 #heritability

b=rnorm(m, 0, sqrt(h2/m)) #effect
SIMU=5

#simu g
fq=runif(m, 0.3, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
  x[,i]=rbinom(n, 2, fq[i])
}
FQ=colMeans(x)/2
sA=apply(x, 2, scale)
K=sA%*%t(sA)/m
me=var(K[col(K)<row(K)])

sW=x-2*FQ
Kw=sW%*%t(sW)/sum(2*FQ*(1-FQ))
meW=var(Kw[col(Kw)<row(Kw)])

#simu y
H2=matrix(0, SIMU, 4)
H2w=matrix(0, SIMU, 2)
for(i in 1:SIMU) {
  y=x%*%b
  vy=var(y)
  y=y+rnorm(n, 0, sqrt(vy/(h2)*(1-h2)))
  y=scale(y)
  yy=y%*%t(y)
  h2Mod=lm(yy[col(yy)<row(yy)]~K[col(yy)<row(yy)])
  H2[i,1]=summary(h2Mod)$coefficients[2,1]
  ss=matrix(0, m, 5)
  for(j in 1:m) {
    mod=lm(y~sA[,j])
    ss[j,1:4]=summary(mod)$coefficient[2,]
  }
  ss[,5]=ss[,3]^2
  H2[i,2]=((mean(ss[,5])-1)*n)/(n*n*me)

  h2wMod=lm(yy[col(yy)<row(yy)]~Kw[col(yy)<row(yy)])
  H2[i,3]=summary(h2wMod)$coefficients[2,1]
  chiW=ss[,5]*2*FQ*(1-FQ)
  H2[i,4]=((mean(chiW)/mean(2*FQ*(1-FQ))-1)*n)/(n*n*meW)
}
barplot(t(H2), beside = T)
abline(h=c(h2))
legend("topleft", legend=c("h2", "ssh2", "h2w", "ssh2w"))

Delta method

\[var(h^2)=var(\frac{A}{B})=\frac{1}{\mu_B^2}\sigma^2_A+\frac{\mu^2_A}{\mu_B^4}\sigma^2_B\] in which \(\mu_A=\frac{n^2}{\hat{m}_e}\hat{h}^2\), \(\sigma_A^2=\frac{2n^2}{\hat{m}_e}\), \(\mu_B=\frac{n^2}{\hat{m}_e}\), and \(\sigma^2_B=\frac{2\hat{m}_e}{n}\). After some algebra, \(var(h^2)=2\frac{\hat{m}_e}{n^2}+2\frac{\hat{m}_e^3}{n^5}\hat{h^2}^2\), and after further approximation \(\sigma_{\hat{h}^2}\approx \frac{\sqrt{2\hat{m}_e}}{n}\)

n=500 #sample size
m=1000 #marker
h2=0.3 #heritability

b=rnorm(m, 0, sqrt(h2/m)) #effect
SIMU=10

#simu g
fq=runif(m, 0.3, 0.5)
x=matrix(0, n, m)
for(i in 1:m) {
  x[,i]=rbinom(n, 2, fq[i])
}
FQ=colMeans(x)/2
sA=apply(x, 2, scale)
K=sA%*%t(sA)/m
me=var(K[col(K)<row(K)])

sW=x-2*FQ
Kw=sW%*%t(sW)/sum(2*FQ*(1-FQ))
meW=var(Kw[col(Kw)<row(Kw)])

#simu y
H2=matrix(0, SIMU, 4)
H2w=matrix(0, SIMU, 2)
for(i in 1:SIMU) {
  y=x%*%b
  vy=var(y)
  y=y+rnorm(n, 0, sqrt(vy/(h2)*(1-h2)))
  y=scale(y)
  yy=y%*%t(y)
  h2Mod=lm(yy[col(yy)<row(yy)]~K[col(yy)<row(yy)])
  H2[i,1]=summary(h2Mod)$coefficients[2,1]
  ss=matrix(0, m, 5)
  for(j in 1:m) {
    mod=lm(y~sA[,j])
    ss[j,1:4]=summary(mod)$coefficient[2,]
  }
  ss[,5]=ss[,3]^2
  H2[i,2]=((mean(ss[,5])-1)*n)/(n*n*me)
  
  h2wMod=lm(yy[col(yy)<row(yy)]~Kw[col(yy)<row(yy)])
  H2[i,3]=summary(h2wMod)$coefficients[2,1]
  chiW=ss[,5]*2*FQ*(1-FQ)
  H2[i,4]=((mean(chiW)/mean(2*FQ*(1-FQ))-1)*n)/(n*n*meW)
}
barplot(t(H2), beside = T)
abline(h=c(h2))
legend("topleft", legend=c("h2", "ssh2", "h2w", "ssh2w"))

ME=1/me
##delta
s=sqrt(2*ME/n^2+2*ME^3/n^5*h2^2)
print(paste("Obs sd for h2:", sd(H2[,2])))
## [1] "Obs sd for h2: 0.104546753516027"
print(paste("Exp sd for h2:", s))
## [1] "Exp sd for h2: 0.0899163419622116"

A pair of traits

\(N_1=N_2=N\)

Under the first circumstance, both traits are measured on the same subjects, and \(N_1=N_2=N\).

The target function can be

\[\begin{align} Q&=argmin_{\{\Delta\}}|| \mathbf{yy}^T-( \begin{bmatrix} \sigma^2_{a_1}\mathbf{K} & \gamma_{g}\mathbf{K}\\ \gamma_{g}\mathbf{K} & \sigma^2_{a_2}\mathbf{K} \\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{e_1}\mathbf{I}_N & \gamma_{e}\mathbf{I}_N\\ \gamma_{e}\mathbf{I}_N & \sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} )||^2_F\\ &=tr\{ [\mathbf{yy}^T-( \begin{bmatrix} \sigma^2_{a_1}\mathbf{K} & \gamma_{g}\mathbf{K}\\ \gamma_{g}\mathbf{K} & \sigma^2_{a_2}\mathbf{K} \\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{e_1}\mathbf{I}_N & \gamma_{e}\mathbf{I}_N\\ \gamma_{e}\mathbf{I}_N & \sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} )]^2\} \end{align} \]

in which \(\Delta=(\sigma^2_{a_1},\sigma^2_{e_1},\sigma^2_{a_2},\sigma^2_{e_2},\gamma_g,\gamma_e)\).

Setting the gradient of \(Q\) to zero with respect to each component, and writing \(\mathbf{V}\) for the modeled covariance matrix, \[\color{red}{\frac{\partial Q}{\partial\sigma^2_.}}=-2tr\{ \color{red}{[\mathbf{yy}^T-( \begin{bmatrix} \sigma^2_{a_1}\mathbf{K} & \gamma_{g}\mathbf{K}\\ \gamma_{g}\mathbf{K} & \sigma^2_{a_2}\mathbf{K} \\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{e_1}\mathbf{I}_N & \gamma_{e}\mathbf{I}_N\\ \gamma_{e}\mathbf{I}_N & \sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} )]} \cdot \color{blue}{\frac{\partial{\mathbf{V}}}{\partial \sigma^2_.}} \}=0 \]

For each element in \(\Delta\), the partial derivative \(\frac{\partial\mathbf{V}}{\partial\cdot}\) can be derived as below

\[\begin{bmatrix} \frac{\partial \mathbf{V}}{\partial \sigma^2_{a_1}}=\begin{bmatrix} \mathbf{K}&0\\ 0&0\\ \end{bmatrix}, \frac{\partial \mathbf{V}}{\partial \sigma^2_{a_2}}=\begin{bmatrix} 0&0\\ 0&\mathbf{K} \end{bmatrix}, \frac{\partial \mathbf{V}}{\partial \gamma_g}=\begin{bmatrix} 0&\mathbf{K}\\ \mathbf{K}&0 \end{bmatrix} \\ \frac{\partial \mathbf{V}}{\partial \sigma^2_{e_1}}=\begin{bmatrix} \mathbf{I}_N&0\\ 0&0 \end{bmatrix}, \frac{\partial \mathbf{V}}{\partial \sigma^2_{e_2}}=\begin{bmatrix} 0&0\\ 0&\mathbf{I}_N \end{bmatrix}, \frac{\partial \mathbf{V}}{\partial \gamma_e}=\begin{bmatrix} 0&\mathbf{I}_N\\ \mathbf{I}_N&0 \end{bmatrix} \end{bmatrix} \]

Then, each \(\frac{\partial Q}{\partial \sigma^2_.}=0\) becomes

\[\left \{ \begin{align} \frac{\partial Q}{\partial \sigma^2_{a_1}}&=2\{tr(\mathbf{y_1y_1}^T\mathbf{K})-\sigma^2_{a_1}tr(\mathbf{KK})-\sigma^2_{e_1}tr(\mathbf{K})\}=0 \\ \frac{\partial Q}{\partial \sigma^2_{e_1}}&=2\{tr(\mathbf{y_1y_1}^T\mathbf{I}_N)-\sigma^2_{a_1}tr(\mathbf{KI}_N)-\sigma^2_{e_1}tr(\mathbf{I}_N)\}=0 \\ \frac{\partial Q}{\partial \sigma^2_{a_2}}&=2\{tr(\mathbf{y_2y_2}^T\mathbf{K})-\sigma^2_{a_2}tr(\mathbf{KK})-\sigma^2_{e_2}tr(\mathbf{K})\}=0 \\ \frac{\partial Q}{\partial \sigma^2_{e_2}}&=2\{tr(\mathbf{y_2y_2}^T\mathbf{I}_N)-\sigma^2_{a_2}tr(\mathbf{KI}_N)-\sigma^2_{e_2}tr(\mathbf{I}_N)\}=0 \\ \frac{\partial Q}{\partial \gamma_g}&=2\{tr(\mathbf{y_1y_2}^T\mathbf{K}+\mathbf{y_1y_2}^T\mathbf{K})-\gamma_{g}tr(\mathbf{KK}+\mathbf{KK})-\gamma_{e}tr(\mathbf{K}+\mathbf{K})\}=0 \\ \frac{\partial Q}{\partial \gamma_e}&=2\{tr(\mathbf{y_1y_2}^T\mathbf{I}_N+\mathbf{y_1y_2}^T\mathbf{I}_N)-\gamma_{g}tr(\mathbf{K}+\mathbf{K})-\gamma_{e}tr(\mathbf{I}_N+\mathbf{I}_N)\}=0 \\ \end{align} \right . \]

And it turns out to be the normal equation below

\[\begin{bmatrix} tr(\mathbf{KK}) & tr(\mathbf{K}) & 0 & 0 &0 &0 \\ tr(\mathbf{K}) & tr(\mathbf{I}_N) & 0 & 0 &0 &0 \\ 0& 0& tr(\mathbf{KK}) & tr(\mathbf{K}) & 0 & 0 \\ 0& 0&tr(\mathbf{K}) & tr(\mathbf{I}_N) & 0 & 0 \\ 0& 0 & 0 & 0& tr(\mathbf{KK}) & tr(\mathbf{K}) \\ 0& 0& 0 & 0 &tr(\mathbf{K}) & tr(\mathbf{I}_N) \\ \end{bmatrix} \left[ \begin{array}{c} \hat{\sigma}^2_{a_1} \\ \hat{\sigma}^2_{e_1} \\ \hat{\sigma}^2_{a_2} \\ \hat{\sigma}^2_{e_2} \\ \hat{\gamma}_g \\ \hat{\gamma}_e \\ \end{array} \right] = \left [ \begin{array}{c} tr[y^T_1\mathbf{K}y_1] \\ tr[y^T_1\mathbf{I}_Ny_1]\\ tr[y^T_2\mathbf{K}y_2] \\ tr[y^T_2\mathbf{I}_Ny_2]\\tr[y^T_1\mathbf{K}y_2]\\tr[y^T_1\mathbf{I}_Ny_2] \end{array} \right ] \]

Then we have \[ \hat{\gamma}_{g}=\frac{y_1^T\mathbf{K}y_2-y_1^Ty_2}{tr[\mathbf{KK^T}]-N}=\frac{N(Nr_g\sqrt{h^2_1h^2_2}/M_e+r_e)-Nr_e}{\frac{N^2}{M_e}}=r_g\sqrt{h^2_1h^2_2} \]

\[\begin{align} \mathcal{Q}&=argmin_{\gamma_g, \gamma_e, \sigma^2_{g_1},\sigma^2_{e_1}, \sigma^2_{g_2}, \sigma^2_{e_2}}||yy^T-( \begin{bmatrix} \sigma^2_{g_1}\mathbf{K}&\gamma_g\mathbf{K}\\ \gamma_g\mathbf{K}&\sigma^2_{g_2}\mathbf{K}\\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{e_1}\mathbf{I}_N&\gamma_{e}\mathbf{I}_N\\ \gamma_{e}\mathbf{I}_N&\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} )||_F^2\\ &=tr\{yy^Tyy^T-2yy^T \begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} + \begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix}^2 \}\\ \end{align}\]

in which \(y=[y_1, y_2]\) and \[yy^T= \begin{bmatrix} y_1y_1^T&y_1y_2^T\\ y_2y_1^T&y_2y_2^T \end{bmatrix} \]

\[ \begin{align} \frac{\partial \mathcal{Q}}{\partial\sigma^2_{g_1}}&=tr\{-2yy^T \begin{bmatrix} \mathbf{K}&0\\ 0&0\\ \end{bmatrix} + 2\begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} \begin{bmatrix} \mathbf{K}&0\\ 0&0\\ \end{bmatrix}\}\\ &=-2tr(y_1y_1^T\mathbf{K})+2\sigma^2_{g_1}tr(\mathbf{K}^2)+2\sigma^2_{e_1}tr(\mathbf{K}) \end{align} \]

\[ \begin{align} \frac{\partial \mathcal{Q}}{\partial\sigma^2_{e_1}}&=tr\{-2yy^T \begin{bmatrix} \mathbf{I}_N&0\\ 0&0\\ \end{bmatrix} + 2\begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} \begin{bmatrix} \mathbf{I}_N&0\\ 0&0\\ \end{bmatrix}\}\\ &=-2tr(y_1y_1^T\mathbf{I}_N)+2\sigma^2_{g_1}tr(\mathbf{K})+2\sigma^2_{e_1}tr(\mathbf{I}_N) \end{align} \]

Similarly, \(\frac{\partial \mathcal{Q}}{\partial\sigma^2_{g_{2}}}=-2tr(y_2y_2^T\mathbf{K})+2\sigma^2_{g_2}tr(\mathbf{KK^T})+2\sigma^2_{e_2}tr(\mathbf{K})\) and \(\frac{\partial \mathcal{Q}}{\partial\sigma^2_{e_{2}}}=-2tr(y_2y_2^T\mathbf{I}_N)+2\sigma^2_{g_2}tr(\mathbf{K})+2\sigma^2_{e_2}tr(\mathbf{I}_N)\),

\[ \begin{align} \frac{\partial \mathcal{Q}}{\partial\gamma_g}&=tr\{-2yy^T \begin{bmatrix} 0&\mathbf{K}\\ \mathbf{K}&0\\ \end{bmatrix} + 2\begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_{N}&\gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_{N}&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} \begin{bmatrix} 0&\mathbf{K}\\ \mathbf{K}&0\\ \end{bmatrix}\}\\ &=-4tr(y_1y_2^T\mathbf{K})+4\gamma_gtr(\mathbf{K}\mathbf{K}^T)+4\gamma_etr(\mathbf{K}) \end{align} \]

\[ \begin{align} \frac{\partial \mathcal{Q}}{\partial\gamma_e}&=tr\{-2yy^T \begin{bmatrix} 0&\mathbf{I}_N\\ \mathbf{I}_N&0\\ \end{bmatrix} + 2\begin{bmatrix} \sigma^2_{g_1}\mathbf{K}+\sigma^2_{e_1}\mathbf{I}_N&\gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N\\ \gamma_g\mathbf{K}+\gamma_{e}\mathbf{I}_N&\sigma^2_{g_2}\mathbf{K}+\sigma^2_{e_2}\mathbf{I}_N\\ \end{bmatrix} \begin{bmatrix} 0&\mathbf{I}_N\\ \mathbf{I}_N&0\\ \end{bmatrix}\}\\ &=-4tr(y_1y_2^T\mathbf{I}_N)+4\gamma_gtr(\mathbf{K})+4\gamma_etr(\mathbf{I}_N) \end{align} \]

hT1=0.3
hT2=0.5
rg=0.5
re=0.3
n=1000 #sample size
m=500 #marker
frq=runif(m, 0.1, 0.5)

x=matrix(rbinom(m*n, 2, frq), n, m, byrow=T)

b1=rnorm(m)
b2=rg*b1+sqrt(1-rg^2)*rnorm(m)

e1=rnorm(n)
e2=re*e1+sqrt(1-re^2)*rnorm(n)

bv1=x%*%b1
vG1=var(bv1[,1])
y1=bv1+sqrt(vG1/hT1*(1-hT1))*e1
sy1=scale(y1)

bv2=x%*%b2
vG2=var(bv2[,1])
y2=bv2+sqrt(vG2/hT2*(1-hT2))*e2
sy2=scale(y2)

ss1=matrix(0, m, 4)
ss2=matrix(0, m, 4)
for(i in 1:m) {
  mod1=lm(sy1~x[,i])
  ss1[i,]=summary(mod1)$coefficients[2,]
  
  mod2=lm(sy2~x[,i])
  ss2[i,]=summary(mod2)$coefficients[2,]
}

rh2=(mean(ss1[,3]*ss2[,3])-mean(sy1*sy2))/(n/m)
hT1est=(mean(ss1[,3]^2)-mean(sy1*sy1))/(n/m)
hT2est=(mean(ss2[,3]^2)-mean(sy2*sy2))/(n/m)

rGEst=rh2/sqrt(hT1est*hT2est)

\[ \hat{\gamma}_g=\frac{y_1^T\mathbf{K}y_2-y_1^Ty_2}{tr(\mathbf{KK^T})-N_c}=\frac{\sqrt{N_1N_2}(\gamma_g\sqrt{h^2_1h^2_2}/M_e+\frac{N_c}{N_1N_2}r_e)-\frac{N_c}{\sqrt{N_1N_2}}r_e}{\frac{N_1N_2}{M_e}} \]

The correlation is \[\hat{r}_g=\frac{\hat{\gamma_g}}{\sqrt{\hat{\sigma}^2_{g_1}\hat{\sigma}^2_{g_2}}}\]

The sampling variance of \(\hat{r}_g\), according to Delta method (see eq 11), is

\[ var(\hat{r}_g)=\hat{r}^2_g (\frac{var(\sigma^2_{g_1})}{4(\sigma^2_{g_1})^2}+\frac{var(\sigma^2_{g_2})}{4(\sigma^2_{g_2})^2}+\frac{var(\gamma_g)}{(\gamma_g)^2}+\frac{2cov(\sigma^2_{g_1},\sigma^2_{g_2})}{4\sigma^2_{g_1}\sigma^2_{g_2}}-\frac{2cov(\sigma^2_{g_1},\gamma_g)}{2\sigma^2_{g_1}\gamma_g}-\frac{2cov(\sigma^2_{g_2},\gamma_g)}{2\sigma^2_{g_2}\gamma_g}) \]

S=10

SIMU=array(0, dim=c(S, 4))
for(s in 1:S) {

hT1=0.3
hT2=0.5
rg=0.5
re=0.3
nc=500
n1=500
n2=200

N1=nc+n1
N2=nc+n2

m=500 #marker
frq=runif(m, 0.1, 0.5)

x=matrix(rbinom(m*nc, 2, frq), nc, m, byrow=T)
x1=matrix(rbinom(m*n1, 2, frq), n1, m, byrow = T)
x2=matrix(rbinom(m*n2, 2, frq), n2, m, byrow = T)

X1=rbind(x, x1)
X2=rbind(x, x2)

b1=rnorm(m)
b2=rg*b1+sqrt(1-rg^2)*rnorm(m)

e0=rnorm(nc)
e1=rnorm(n1)
e2=rnorm(n2)
E1=c(e0, e1)
E2=c(re*e0+sqrt(1-re^2)*rnorm(nc), e2)

bv1=X1%*%b1
vG1=var(bv1[,1])
y1=bv1+sqrt(vG1/hT1*(1-hT1))*E1
sy1=scale(y1)
sy1c=sy1[1:nc]

bv2=X2%*%b2
vG2=var(bv2[,1])
y2=bv2+sqrt(vG2/hT2*(1-hT2))*E2
sy2=scale(y2)
sy2c=sy2[1:nc]

ss1=matrix(0, m, 4)
ss2=matrix(0, m, 4)
for(i in 1:m) {
  mod1=lm(sy1~X1[,i])
  ss1[i,]=summary(mod1)$coefficients[2,]
  
  mod2=lm(sy2~X2[,i])
  ss2[i,]=summary(mod2)$coefficients[2,]
}

Q=(mean(ss1[,3]*ss2[,3])-nc/sqrt(N1*N2)*mean(sy1c*sy2c)) / (sqrt(N1*N2)/m)

EQ=((sqrt(N1*N2)*rg*sqrt(hT1*hT2)/m+nc/sqrt(N1*N2)*re) - nc/sqrt(N1*N2)*re) / (sqrt(N1*N2)/m)
hT1est=(mean(ss1[,3]^2)-mean(sy1*sy1))/(N1/m)
hT2est=(mean(ss2[,3]^2)-mean(sy2*sy2))/(N2/m)
rG=Q/sqrt(hT1est*hT2est)
rGEst=EQ/sqrt(hT1est*hT2est)

SIMU[s, ]=c(hT1est, hT2est, rG, rGEst)
}
colMeans(SIMU)
## [1] 0.2950739 0.4926739 0.4578649 0.5139267

A useful trick on joint matrix inversion

Let \(\mathbf{M}\) be a \((C+1)\times(C+1)\) symmetric matrix, with

  • We have its diagonal elements \(\mathbf{M}_{ii}=[T_1+N, T_2+N, ..., T_C+N, N]\),
  • We have its off-diagonal elements \(\mathbf{M}_{ij}=N\), \(i\ne j\)

in which \(N\) is the sample size, and \(T_i=\frac{N^2}{M_{e.i}}\), where \(M_{e.i}\) is the effective number of markers for the \(i^{th}\) chromosome.

Using Gaussian elimination, we can find the inverse of \(\mathbf{M}\) as below

\[\begin{bmatrix} T_1+N&N&...&N\\ N&T_2+N&...&N\\ \vdots&\vdots&\cdots&\vdots\\ N&N&\cdots&N \end{bmatrix} \color{red}{\begin{bmatrix} 1&0&...&0\\ 0&1&...&0\\ \vdots&\vdots&\cdots&\vdots\\ 0&0&\cdots&1 \end{bmatrix}} \]

Step 1: subtract \(row_{C+1}\) from \(row_i\), \(i\ne (C+1)\), and we have \[ \begin{bmatrix} T_1&0&...&0\\ 0&T_2&...&0\\ \vdots&\vdots&\cdots&\vdots\\ N&N&\cdots&N \end{bmatrix} \color{red}{\begin{bmatrix} 1&0&...&-1\\ 0&1&...&-1\\ \vdots&\vdots&\cdots&\vdots\\ 0&0&\cdots&1 \end{bmatrix}} \]

Step 2: scale rows \(1\sim C\) by \(1/T_i\), and the last row by \(1/N\) \[ \begin{bmatrix} 1&0&...&0\\ 0&1&...&0\\ \vdots&\vdots&\cdots&\vdots\\ 1&1&\cdots&1 \end{bmatrix} \color{red}{\begin{bmatrix} 1/T_1&0&...&-1/T_1\\ 0&1/T_2&...&-1/T_2\\ \vdots&\vdots&\cdots&\vdots\\ 0&0&\cdots&1/N \end{bmatrix}} \]

Step 3: subtract rows \(1\sim C\) from the last row; this finishes the inversion and gives \(\color{red}{\mathbf{M}^{-1}}\) \[ \begin{bmatrix} 1&0&...&0\\ 0&1&...&0\\ \vdots&\vdots&\cdots&\vdots\\ 0&0&\cdots&1 \end{bmatrix} \color{red}{\begin{bmatrix} 1/T_1&0&...&-1/T_1\\ 0&1/T_2&...&-1/T_2\\ \vdots&\vdots&\cdots&\vdots\\ -1/T_1&-1/T_2&\cdots&1/N+\sum_{i=1}^C\frac{1}{T_i} \end{bmatrix}} \]

Furthermore, \(\color{red}{\mathbf{M}^{-1}}\) can be written as

\[\color{red}{\begin{bmatrix} \frac{M_{e.1}}{N^2}&0&\cdots&0&-\frac{M_{e.1}}{N^2}\\ 0&\frac{M_{e.2}}{N^2}&\cdots&0&-\frac{M_{e.2}}{N^2}\\ \vdots&\vdots&\cdots&0&\vdots\\ 0&0&\cdots&\frac{M_{e.C}}{N^2}&-\frac{M_{e.C}}{N^2}\\ -\frac{M_{e.1}}{N^2}&-\frac{M_{e.2}}{N^2}&\cdots&-\frac{M_{e.C}}{N^2}&\frac{1}{N}(1+\frac{\sum_{i=1}^C{M_{e.i}}}{N}) \end{bmatrix}} \] It involves only \(M_{e.i}\) and \(N\), all of which are known, so no matrix inversion is needed. The original computational cost of \(\mathcal{O}[(C+1)^3]\) is thus reduced to a linear \(\mathcal{O}(C)\).

n=100
Chr=20
Me=(1:Chr)*10
Mat=matrix(0, Chr+1, Chr+1)
D_mat=n^2/Me
diag(Mat)=c(D_mat,0)
Mat=Mat+n

Mat_I=solve(Mat)

Mat_IE=matrix(0, Chr+1, Chr+1)
Mat_IE[1:Chr, ncol(Mat_IE)]=-1/D_mat
diag(Mat_IE)=c(1/D_mat, 1/n+sum(Me)/n^2)
Mat_IE[nrow(Mat_IE), 1:(ncol(Mat_IE)-1)]=-1/D_mat

layout(matrix(1:3, 1, 3))
plot(main="diag",diag(Mat_IE), diag(Mat_I), xlab="Expected", ylab="Inversed")
abline(a=0, b=1)
plot(main="last colunm",Mat_IE[1:Chr, ncol(Mat_IE)], Mat_I[1:Chr, ncol(Mat_I)], xlab="Expected", ylab="Inversed")
abline(a=0, b=1)

plot(main="last row",Mat_IE[nrow(Mat_IE), ], Mat_I[nrow(Mat_I), ], xlab="Expected", ylab="Inversed")
abline(a=0, b=1)

R sess

sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS  10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] kableExtra_1.2.1 knitr_1.30       Rcpp_1.0.5      
## 
## loaded via a namespace (and not attached):
##  [1] rstudioapi_0.11   xml2_1.3.2        magrittr_1.5      rvest_0.3.6      
##  [5] munsell_0.5.0     colorspace_1.4-1  viridisLite_0.3.0 R6_2.4.1         
##  [9] rlang_0.4.8       stringr_1.4.0     httr_1.4.2        highr_0.8        
## [13] tools_3.6.2       webshot_0.5.2     xfun_0.18         htmltools_0.5.0  
## [17] yaml_2.2.1        digest_0.6.26     lifecycle_0.2.0   glue_1.4.2       
## [21] evaluate_0.14     rmarkdown_2.7     stringi_1.5.3     compiler_3.6.2   
## [25] scales_1.1.1