We will use the mixed model equations (Equation 4 from lecture 12 notes) to predict breeding values for days-to-heading measured on 557 wheat lines in two environments, there is no replication within environment. Then we will calculate the reliabilities and the selection accuracies.
load("Class16data.RData")
Lets assume the the error and genetic variance in the base population is 115 and 13 respectively
Ve<- 115
Va<- 13
Calculate alpha, the shrinkage parameter
alpha<- Ve/Va
alpha
## [1] 8.846154
Create the design matrix X using the model.matrix function
X<- model.matrix(DTHD~environment, data=pheno)
head(X)
## (Intercept) environmentoptimal
## 1 1 0
## 2 1 0
## 3 1 0
## 4 1 0
## 5 1 0
## 6 1 0
Create the design matrix Z using the model.matrix function
Z<- model.matrix(DTHD~genotype_identifier-1, data=pheno)
Z[1:3, 1:3]
## genotype_identifier6569128 genotype_identifier6688880
## 1 1 0
## 2 0 1
## 3 0 0
## genotype_identifier6688916
## 1 0
## 2 0
## 3 1
The model.matrix function has appended ‘genotype_identifier’ to the column names of Z. So that the column names of Z will match the column names of A, we will remove ‘genotype_identifier’ from the column names of Z using the gsub() function.
colnames(Z)<- gsub("genotype_identifier", "", colnames(Z))
Make sure A rows and columns are in the same order as Z columns
A<- A[colnames(Z),colnames(Z)]
A[1:3, 1:3]
## 6569128 6688880 6688916
## 6569128 1.9930 0.3972 0.3736
## 6688880 0.3972 1.9892 1.0466
## 6688916 0.3736 1.0466 1.9618
Calculate the inverse of A
Ainv<- solve(A)
Ainv[1:3, 1:3]
## 6569128 6688880 6688916
## 6569128 1.2624028157 0.0009917869 -0.002926766
## 6688880 0.0009917869 0.9279932611 -0.149256306
## 6688916 -0.0029267663 -0.1492563058 1.385856866
Create the vector y
y<-as.vector(pheno$DTHD)
Create the matrix on the left hand side
C11<- t(X)%*%X
C12<- t(X)%*%Z
C21<- t(Z)%*%X
C22<- t(Z)%*%Z + Ainv*alpha
upper_LHS<- cbind(C11, C12)
lower_LHS<- cbind(C21, C22)
LHS= rbind(upper_LHS, lower_LHS)
Create the matrix on the right hand side
RHS<- rbind(t(X)%*%y,t(Z)%*%y)
Calculate the inverse of the left hand side
invLHS<- solve(LHS)
Calculate the solutions for the fixed and random effects
b_u<- invLHS %*%RHS
Calculate PEV matrix
PEVmat<- invLHS*Ve
Get the diagonals of the PEV mat
dPEV<- diag(PEVmat)
Subset the diagonals of the PEV mat to get the PEVs of the EBVs only. (The first two PEVs correspond to the fixed effect estimates in this example)
pev_ebv<-dPEV[-c(1:2)]
Calculate the reliabilities (remember to use the Va among individuals, not Va in the base population)
Vf<-diag(A)*Va
rel<- 1-pev_ebv/Vf
Make a histogram of the reliabilities
hist(rel)
Calculate the median of the reliabilities
mean(rel)
## [1] 0.4801748
Calculate the median selection accuracy
mean(sqrt(rel))
## [1] 0.6897284
Compare this to the accuracy of phenotypic selection among the lines and compare with the selection accuracy for the EBVs
Vf<- Va*median(diag(A))
sqrt(Vf/c(Vf+Ve/2))
## [1] 0.5567591