Estimation of variance components and heritability

In this demonstration you will estimate variance components using a dataset consisting of phenotypes on individual plants (GID) belonging to 30 half-sib families (HSfam). The data are from two locations

library(lme4)

## Loading required package: Matrix

library(rrBLUP)
library(pedigreemm)
library(arm)

## Loading required package: MASS

## 
## arm (Version 1.13-1, built: 2022-8-25)

## Working directory is /Users/jrut/Library/Mobile Documents/com~apple~CloudDocs/Documents/Teaching/CPSC554/2024/Class 7

Load the data objects

setwd("~/Library/Mobile Documents/com~apple~CloudDocs/Documents/Teaching/CPSC554")
load('HS example data.RData')

Look at the phenotypic data

The GID column contains indivudal unique IDs. The HSfam indicates to which half-sib family each individual belongs. The l

head(pheno)

##     GID HSfam loc      pheno           bv
## 51   81     1   1 -0.9553381 -0.463061920
## 80  110     1   1 -1.0753473 -1.379104345
## 87  117     1   1  3.1308638  1.377296145
## 88  118     1   1  0.7319875 -1.186054227
## 162 192     1   1 -1.0200256 -0.005779062
## 174 204     1   1 -0.1315346 -1.224348681

str(pheno)

## 'data.frame':    500 obs. of  5 variables:
##  $ GID  : int  81 110 117 118 192 204 206 232 337 358 ...
##  $ HSfam: num  1 1 1 1 1 1 1 1 1 1 ...
##  $ loc  : num  1 1 1 1 1 1 1 1 2 2 ...
##  $ pheno: num  -0.955 -1.075 3.131 0.732 -1.02 ...
##  $ bv   : num  -0.46306 -1.3791 1.3773 -1.18605 -0.00578 ...

Change some variables to factors

The loc, HSfam, and GID variables were recognized as integer or numeric, but they are actually factors. They need to be changed to factors using as.factor()

pheno$loc<- as.factor(pheno$loc)
pheno$HSfam<- as.factor(pheno$HSfam)
pheno$GID<- as.factor(pheno$GID)

Fit a half-sib family model

In this model, pheno is the response variable, location is a fixed factor, and HSfam is a random factor. This allows us to estimate how much of the total phenotypic variance can be attributed to half-sib family, \(\sigma^2_{f}\). This is also the same as the genetic covariance variance between individuals within a half-sib family

mod<- lmer(pheno~loc+(1|HSfam), data=pheno)

Get the variance components

vc<- data.frame(VarCorr(mod))
Varfam<- vc[1,'vcov']
Varresid<- vc[2,'vcov']

Calculate the additive genetic variance in the base population.

Recall that \(\sigma^2_{f}=r_{xy}\times\sigma^2_{a}\) where \(\sigma^2_{f}\) is the variance due to half-sib family, \(\sigma^2_{a}\) is the additive genetic variance in the base population, and \(r_{xy}\) is the coefficient of relationship between inividuals within a family.

Varabase<- 4*Varfam

Estimate heritabilitites

First we estimate the what is often referred to as the ‘family-mean heritability.’ Better known as the reliability. This tells us how well whole family phenotypic means or BLUPs correspond with the true family-mean breeding values

nreps<- mean(table(pheno$HSfam))
H2fam<- Varfam/(Varfam+ Varresid/nreps)
H2fam

## [1] 0.6434489

Next we estimate the actual narrow-sense heritability. This tells us the correspondence of phenotypes and genotypes observed on a single non-inbred individual in a single environment.

Varabase/(Varabase+Varresid)

## [1] 0.3022194

Get BLUPs of the Half-sib families

BLUPs are Best Linear Unbiased Predictors, they are random effect coefficients.These values are an estimate of true average breeding value for each family. In our example, this also tells us the 1/2 the breeding value of the female parents (the mother of each half-sib family). Multiply by 2 to get the breeding value estimate of the female parent of the half-sib family

Parentebv1<- ranef(mod)$HSfam[,1] *2

Next we can calculate the reliability of these BLUPs using the prediction error variance (PEV) and the half-sib family variance

sep<- se.ranef(mod)$HSfam[,1]
pev<- sep^2
rel<- 1-pev/Varfam
mean(rel)

## [1] 0.6382942

Fit a pedigree BLUP model

Now we will use complete pedigree information to estimate the same parameters from the half-sib family data. These estimates will be more accurate because we will account for the fact that some half-sib families are more related to eachother than others.

Estimate pedigree relationships

We first need to generate the pedigree relationship matrix to use it in the model

p2<- editPed(ped$p2, ped$p1, ped$GID) #order the pedigree
p3<- pedigree(p2$sire, p2$dam, p2$label) #create pedigree object
A<- as.matrix(getA(p3)) #create pedigree relationship matrix

## 'as(<dtTMatrix>, "dtCMatrix")' is deprecated.
## Use 'as(., "CsparseMatrix")' instead.
## See help("Deprecated") and help("Matrix-deprecated").

Generate fixed and random effect design matrices

First the fixed effects design matrix

X<- model.matrix(pheno~1+loc, pheno)

Next the random effects design matrix

pheno$GID<- factor(pheno$GID, levels= colnames(A))
Z<- model.matrix(pheno~0+GID, pheno) 
colnames(Z)<- gsub("GID", "", colnames(Z)) #remove the prefix from the columns

Ensure that columns of Z match with rows of A

A<- A[colnames(Z),]

Fit the model

The vector of phenotypes, design matrices, and relationship matrices are used as input to the mixed model solver, mixed.solve()

mod2<- mixed.solve(y= pheno$pheno, Z, K=A, X)

Look at the results

First look at the outputs from the model

str(mod2)

## List of 5
##  $ Vu  : num 1.67
##  $ Ve  : num 2.65
##  $ beta: num [1:2(1d)] -1.15 -0.226
##   ..- attr(*, "dimnames")=List of 1
##   .. ..$ : chr [1:2] "(Intercept)" "loc2"
##  $ u   : num [1:557(1d)] 0.389 -0.533 -0.264 -0.92 -0.171 ...
##   ..- attr(*, "dimnames")=List of 1
##   .. ..$ : chr [1:557] "-1" "-10" "-11" "-12" ...
##  $ LL  : num -1058

Next look at the distribution if the BLUPs. Unlike in the previous model, there is a BLUP for each individual.

hist(mod2$u)

Get the variance components

The estimate of the additive genetic variance in the base population is in Vu, and the estimate of the residual error variance is in Ve

VarabaseMX<- mod2$Vu
VarresidMX<- mod2$Ve

Next we can estimate the heritability in the base population

VarabaseMX/c(VarabaseMX+VarresidMX)

## [1] 0.3868276

Note that the value is similar to what we estimated earlier.

Get BLUPs of the female parents

We can also get the BLUPs of the female parents (the mothers) of each of the half-sib families. These are GIDs 1-30.

Parentebv2<- mod2$u[as.character(1:30)]

Compare these breeding value estimates with the ones estimated previously.

plot(Parentebv1, Parentebv2)