- Teaching philosophy
- Background
- Teaching experiences
- Intro to mixed models
April 12, 2022
In a world in transition, students and teachers both need to teach themselves one essential skill– learning how to learn. - Carl Sagan, Demon-haunted world
B.Sc (Tribhuvan University) : Agriculture
Plant Breeding and Genetics
M.S (University of Georgia) : Horticulture
Comparative Genomics, Molecular Biology, Apomixis
Ph.D (Clemson University) : Plant and Environmental Sciences
Genomics-Assisted Breeding in Sorghum
Research Scientist (Clemson University): Advanced Plant Technology Program
Genomics-Assisted Selection, High Throughput Phenotying
University of Georgia
Clemson University
Molecular Biology (GEN 3040): Fall 2015
Population and Quantitative Genetics (GEN 4110): Spring 2016, 2017
Molecular Genetics and Gene Regulation (GEN 4210): Fall 2016
Syllabus - outline
I. Plant breeding and genetics of population
II. Means of genotypes and breeding populations
Syllabus - outline
III. Variation in breeding populations
IV. Selection in breeding populations
Approach

Part 1: Concepts
Part 2: Applications
Francis Galton - 1886: Regression and heritability
Ronald Fisher - 1918: Infinitesimal model (P = G + E)
Sewall Wright - 1922: Genetic relationship
Charles Henderson - 1968: BLUP using relationship
Which famous scientist was Francis Galton related to?
“… the ordinary genealogical course of a race consists in a constant outgrowth from its centre, a constant dying away at its margins, and a tendency of the scanty remains of all exceptional stock to revert to that mediocrity, whence the majority of their ancestors originally sprang.” - Galton, Typical Laws of Heredity
When did Mendel publish his laws of inheritance?
When did Mendel publish his laws of inheritance?
1866
When was Mendelism rediscovered?
When was Mendelism rediscovered?
1900
Which of these was not the one to rediscover Mendelism?
Hugo De Vries
William Bateson
Carl Correns
Erich von Tschermak
Which of these was not the one to rediscover Mendelism?
Hugo De Vries
William Bateson
Carl Correns
Erich von Tschermak
Vigorous debates (sometimes on personal level) ensued
Biometricians
Weldon, Pearson
Mendelians
Bateson, De Vries
Vigorous debates (sometimes on personal level) ensued
Biometricians
Weldon, Pearson
Mendelians
Bateson, De Vries
G. Udny Yule (1902): New Phytologist 1 (9):226-27.
“Mendel’s Laws, so far from being in any way inconsistent with the Law of Ancestral Heredity, lead directly to a special case of that law.”
Simple linear regression
\(y = \beta_0 + \beta_1x + e\)
Multiple linear regression
\(y = \beta_0 + \beta_1x_1 + \beta_2x_2 + ... + \beta_px_p + e\)
Combination of fixed and random terms
\(y=Xb+Zu+e\)
Fixed effect
Fixed effect
Random effects
\(y=Xb+Zu+e\)
Assuming: \(u \sim N(0,K\sigma^{2}_{u})\) and \(e \sim N(0,I\sigma^{2}_{e})\)
\(y=Xb+Zu+e\)
Assuming: \(u \sim N(0,K\sigma^{2}_{u})\) and \(e \sim N(0,I\sigma^{2}_{e})\)
Summary:
Henderson’s equation
\[ \left[\begin{array}{rr} X'X & Z'X \\ X'Z & Z'Z+\lambda K^{-1} \end{array}\right] \left[\begin{array}{r} b \\ u \end{array}\right] =\left[\begin{array}{r} X'y \\ Z'y \end{array}\right] \] \(\lambda=\sigma^{2}_{e}/\sigma^{2}_{u}\) (Regularization parameters) (1)
The mixed model can also be notated as follows
\[y=Wg+e\]
Solved as
\[[W'W+\Sigma] g = [W'y]\]
Where
\(W=[X,Z]\)
\(g=[b,u]\)
\(\Sigma = \left[\begin{array}{rr} 0 & 0 \\ 0 & \lambda K^{-1} \end{array}\right]\)
1 Genetic values
1 Genetic values
2 Breeding values
1 Genetic values
2 Breeding values
3 Genomic Breeding values
Data:
## Env Gen Phe ## 1 E1 G1 47 ## 2 E1 G2 51 ## 3 E1 G3 46 ## 4 E1 G4 58 ## 5 E2 G1 52 ## 6 E2 G2 46 ## 7 E2 G3 52 ## 8 E2 G4 54 ## 9 E3 G1 53 ## 10 E3 G2 48 ## 11 E3 G3 58 ## 12 E3 G4 52
Model: \(Phenotype = Environment_{(F)} + Genotype_{(R)}\)
\[y=Wg+e\] Design matrix \(W=[X,Z]\):
## EnvE1 EnvE2 EnvE3 GenG1 GenG2 GenG3 GenG4 ## 1 1 0 0 1 0 0 0 ## 2 1 0 0 0 1 0 0 ## 3 1 0 0 0 0 1 0 ## 4 1 0 0 0 0 0 1 ## 5 0 1 0 1 0 0 0 ## 6 0 1 0 0 1 0 0 ## 7 0 1 0 0 0 1 0 ## 8 0 1 0 0 0 0 1 ## 9 0 0 1 1 0 0 0 ## 10 0 0 1 0 1 0 0 ## 11 0 0 1 0 0 1 0 ## 12 0 0 1 0 0 0 1
Mixed-model: \[[W'W+\Sigma] g = [W'y]\] \(W'W\):
WW = crossprod(W) print(WW)
## EnvE1 EnvE2 EnvE3 GenG1 GenG2 GenG3 GenG4 ## EnvE1 4 0 0 1 1 1 1 ## EnvE2 0 4 0 1 1 1 1 ## EnvE3 0 0 4 1 1 1 1 ## GenG1 1 1 1 3 0 0 0 ## GenG2 1 1 1 0 3 0 0 ## GenG3 1 1 1 0 0 3 0 ## GenG4 1 1 1 0 0 0 3
Mixed-model: \[[W'W+\Sigma] g = [W'y]\]
Left-hand side (\(W'W+\Sigma\)):
## EnvE1 EnvE2 EnvE3 GenG1 GenG2 GenG3 GenG4 ## EnvE1 4 0 0 1.00 1.00 1.00 1.00 ## EnvE2 0 4 0 1.00 1.00 1.00 1.00 ## EnvE3 0 0 4 1.00 1.00 1.00 1.00 ## GenG1 1 1 1 3.17 0.00 0.00 0.00 ## GenG2 1 1 1 0.00 3.17 0.00 0.00 ## GenG3 1 1 1 0.00 0.00 3.17 0.00 ## GenG4 1 1 1 0.00 0.00 0.00 3.17
Assuming independent individuals: \(K=I\)
Regularization: \(\lambda=\sigma^{2}_{e}/\sigma^{2}_{u}=1.64/9.56=0.17\)
Mixed-model: \[[W'W+\Sigma] g = [W'y]\]
Right-hand side (\(W'y\)):
## [,1] ## EnvE1 202 ## EnvE2 204 ## EnvE3 211 ## GenG1 152 ## GenG2 145 ## GenG3 156 ## GenG4 164
Mixed-model: \[[W'W+\Sigma] g = [W'y]\]
We can find coefficients through least-square solution
\(g=(LHS)^{-1}(RHS)=(W'W+\Sigma)^{-1}W'y\)
## [,1] ## EnvE1 50.50 ## EnvE2 51.00 ## EnvE3 52.75 ## GenG1 -0.71 ## GenG2 -2.92 ## GenG3 0.55 ## GenG4 3.08
\(BLUE=\frac{w'y}{w'w}=\frac{sum}{n}\) = simple average
\(BLUP=\frac{w'y}{w'w+\lambda}=\frac{sum}{n+\lambda}\) = biased average = \(BLUE\times h^2\)
Note:
If we know the relationship among individuals:
## GenG1 GenG2 GenG3 GenG4 ## GenG1 1.00 0.64 0.23 0.48 ## GenG2 0.64 1.00 0.33 0.67 ## GenG3 0.23 0.33 1.00 0.31 ## GenG4 0.48 0.67 0.31 1.00
Then we estimate \(\lambda K^{-1}\)
## GenG1 GenG2 GenG3 GenG4 ## GenG1 0.15 -0.09 0.00 -0.01 ## GenG2 -0.09 0.22 -0.02 -0.10 ## GenG3 0.00 -0.02 0.10 -0.02 ## GenG4 -0.01 -0.10 -0.02 0.17
Regularization: \(\lambda=\sigma^{2}_{e}/\sigma^{2}_{u}=1.64/17.70=0.09\)
And the left-hand side becomes
## EnvE1 EnvE2 EnvE3 GenG1 GenG2 GenG3 GenG4 ## EnvE1 4 0 0 1.00 1.00 1.00 1.00 ## EnvE2 0 4 0 1.00 1.00 1.00 1.00 ## EnvE3 0 0 4 1.00 1.00 1.00 1.00 ## GenG1 1 1 1 3.15 -0.09 0.00 -0.01 ## GenG2 1 1 1 -0.09 3.22 -0.02 -0.10 ## GenG3 1 1 1 0.00 -0.02 3.10 -0.02 ## GenG4 1 1 1 -0.01 -0.10 -0.02 3.17
We can find coefficients through least-square solution
\(g=(LHS)^{-1}(RHS)=(W'W+\Sigma)^{-1}W'y\)
## [,1] ## EnvE1 51.05 ## EnvE2 51.55 ## EnvE3 53.30 ## GenG1 -1.32 ## GenG2 -3.34 ## GenG3 0.03 ## GenG4 2.45
Genetic coefficients shrink more: Var(A) < Var(G)
Expectation-Maximization REML (1977)
\(\sigma^{2}_{u} = \frac{u'K^{-1}u}{q-\lambda tr(K^{-1}C^{22})}\) and \(\sigma^{2}_{e} = \frac{e'y}{n-p}\)
Bayesian Gibbs Sampling (1993)
\(\sigma^{2}_{u} = \frac{u'K^{-1}u+S_u\nu_u}{\chi^2(q+\nu_u)}\) and \(\sigma^{2}_{e} = \frac{e'e+S_e\nu_e}{\chi^2(n+\nu_e)}\)
Kernel methods:
Ridge methods:
Kernel
\(y=Xb+Zu+e\), \(u\sim N(0,K\sigma^2_u)\)
Ridge
\(y=Xb+Ma+e\), \(a\sim N(0,I\sigma^2_a)\)
Where
Kernel model
\[ \left[\begin{array}{rr} X'X & Z'X \\ X'Z & Z'Z+K^{-1}(\sigma^2_e/\sigma^2_u) \end{array}\right] \left[\begin{array}{r} b \\ u \end{array}\right] =\left[\begin{array}{r} X'y \\ Z'y \end{array}\right] \]
Ridge model
\[ \left[\begin{array}{rr} X'X & M'X \\ X'M & M'M+I^{-1}(\sigma^2_e/\sigma^2_a) \end{array}\right] \left[\begin{array}{r} b \\ a \end{array}\right] =\left[\begin{array}{r} X'y \\ M'y \end{array}\right] \]
Both models capture same genetic signal (de los Campos 2015)
K = tcrossprod(M)/ncol(M) GBLUP = reml(y=y,K=K); Kernel_GEBV = GBLUP$EBV RRBLUP = reml(y=y,Z=M); Ridge_GEBV = M%*%RRBLUP$EBV plot(Kernel_GEBV,Ridge_GEBV, main='Comparing results')