In this demonstration we will use data on three traits to estimate an economic selection index and the accuracy of the index. The data are on a set of individual genotypes that are not replicated. Assumptions of trait heritabilities and genetic correlations are provided. The three traits are grain yield (gryld), days to harvest (dth), and a vegetitaive index called gndvi. The economic values of grain yield and days to harvest are provided. Gndvi has no economic value.
First load the data into memory
load("IndexData1.RData")
Look at the data
head(data)
## gid dth gryld gndvi
## 1 4905617 52 2.308036 0.4609720
## 2 6176013 52 2.872769 0.4192635
## 3 6569128 52 1.573661 0.4730780
## 4 6688880 51 3.062500 0.4748669
## 5 6688916 53 2.174107 0.4581281
## 6 6688933 49 2.544643 0.4597804
Check that the data is numeric
str(data)
## 'data.frame': 30 obs. of 4 variables:
## $ gid : Factor w/ 1097 levels "3827768","4905617",..: 2 5 6 10 11 12 13 14 15 16 ...
## $ dth : num 52 52 52 51 53 49 50 53 53 54 ...
## $ gryld: num 2.31 2.87 1.57 3.06 2.17 ...
## $ gndvi: num 0.461 0.419 0.473 0.475 0.458 ...
Check the dimensions
dim(data)
## [1] 30 4
Now we need to calculate the phenotypic covariance matrix ‘P’, this can be done using the cov() function or we can calculate it the long way. To calculate P the long way, first we need to subtract the mean phenotype value from each trait, we can do this using the scale() function.
X<- scale(data[,c('dth','gryld','gndvi')], center=TRUE, scale=FALSE)
Next for each pair of traits we can calculate the product of the deviations from the mean for trait 1 with that of trait 2, and then take the average like this:
P<- t(X) %*% X *1/(nrow(X)-1)
P looks like this:
## dth gryld gndvi
## dth 1.443678161 -0.439183041 -0.0098963633
## gryld -0.439183041 0.293670688 0.0014278642
## gndvi -0.009896363 0.001427864 0.0003314357
Notice that P is symmetric and that the diagonals contain the variances, and the off-diagonals contain the covariances
Now we will estimate the matrix G which is a matrix that contains the genetic covariances between the traits in the index and the traits in the aggregate genotype. To do this we will assume the following heritabilities and genetic correlations
Genetic correlation between dth and gryld
r_dth.gryld<- -0.6
Genetic correlation between dth and ndvi
r_dth.gndvi<- -0.3
Genetic correlation between gryld and gndvi
r_gryld.gndvi<- 0.3
Heritability of dth
h2_dth<- 0.4
Heritability of gndvi
h2_gndvi<- 0.6
Heritability of gryld
h2_gryld<- 0.2
Now we make an empty matrix, G, where we will fill in the values
G<- matrix(NA, 3, 2, dimnames=list(c('dth','gryld','gndvi'),c('dth', 'gryld')))
G looks like this
## dth gryld
## dth NA NA
## gryld NA NA
## gndvi NA NA
Fill in matrix G using the phenotypic variances, the heritabilities, and the genetic correlations
Calculate the genetic variance for each trait
va_dth<- h2_dth*P['dth','dth'] #genetic variance of days to heading
va_gndvi<- h2_gndvi*P['gndvi','gndvi'] #genetic variance of GNDVI
va_gryld<- h2_gryld*P['gryld','gryld'] #genetic variance of grain yield
Calculate the genetic covariances for each trait
cv_dth.gryld<- r_dth.gryld*sqrt(va_dth)*sqrt(va_gryld) #days to heading and yield
cv_dth.gndvi<- r_dth.gndvi*sqrt(va_dth)*sqrt(va_gndvi) #days to heading and GNDVI
cv_gryld.gndvi<- r_gryld.gndvi*sqrt(va_gryld)*sqrt(va_gndvi) #grain yield and GNDVI
Fill in G matrix variances
G['dth', 'dth']<-va_dth
G['gryld', 'gryld']<-va_gryld
Fill in G matrix covariances
G['dth','gryld']<- cv_dth.gryld
G['gryld', 'dth']<- cv_dth.gryld
G['gndvi', 'dth']<- cv_dth.gndvi
G['gndvi','gryld']<- cv_gryld.gndvi
Look at G
G
## dth gryld
## dth 0.577471264 -0.110499862
## gryld -0.110499862 0.058734138
## gndvi -0.003214858 0.001025278
v<- as.matrix(c(-5, 60))
b<- solve(P)%*%G %*% v
b
## [,1]
## dth -3.391515
## gryld 8.338143
## gndvi 96.916516
I<-as.matrix(data[,c('dth','gryld','gndvi')]) %*%b
Plot histogram of net merit
hist(I)
Correlation between I and yield
cor(I, data$gryld)
## [,1]
## [1,] 0.8757228
cor(I, data$dth)
## [,1]
## [1,] -0.9221152
Correlation between I and days to heading
cor(I, data$dth)
## [,1]
## [1,] -0.9221152
First make the matrix C, genetic covariance matrix of traits in the aggregate genotype
C<- G[c('dth','gryld'),c('dth','gryld')]
numer<- t(b) %*% G %*% v
denom<- t(v) %*% C %*% v
r<- sqrt(numer/denom)
r
## [,1]
## [1,] 0.50254
Genetic correlation between gryld and gndvi
r_gryld.gndvi<- 0.6
Heritability of dth
h2_dth<- 0.4
Heritability of gndvi
h2_gndvi<- 0.6
Heritability of gryld
h2_gryld<- 0.2
Now we make an empty matrix, G, where we will fill in the values
G<- matrix(NA, 3, 2, dimnames=list(c('dth','gryld','gndvi'),c('dth', 'gryld')))
G looks like this
## dth gryld
## dth NA NA
## gryld NA NA
## gndvi NA NA
Calculate the genetic variance for each trait
va_dth<- h2_dth*P['dth','dth'] #genetic variance of days to heading
va_gndvi<- h2_gndvi*P['gndvi','gndvi'] #genetic variance of GNDVI
va_gryld<- h2_gryld*P['gryld','gryld'] #genetic variance of grain yield
Calculate the genetic covariances for each trait
cv_dth.gryld<- r_dth.gryld*sqrt(va_dth)*sqrt(va_gryld) #days to heading and yield
cv_dth.gndvi<- r_dth.gndvi*sqrt(va_dth)*sqrt(va_gndvi) #days to heading and GNDVI
cv_gryld.gndvi<- r_gryld.gndvi*sqrt(va_gryld)*sqrt(va_gndvi) #grain yield and GNDVI
Fill in G matrix variances and covariances
Look at G
G
## dth gryld
## dth 0.577471264 -0.110499862
## gryld -0.110499862 0.058734138
## gndvi -0.003214858 0.002050557
Solve for the selection index weights (vector b)
v<- as.matrix(c(-5, 60))
b<- solve(P)%*%G %*% v
Calculate the accuracy of the index. First make the matrix C, genetic covariance matrix of traits in the aggregate genotype
C<- G[c('dth','gryld'),c('dth','gryld')]
numer<- t(b) %*% G %*% v
denom<- t(v) %*% C %*% v
r<- sqrt(numer/denom)
r
## [,1]
## [1,] 0.5878723