In this demonstration we will use data on three traits to estimate an economic selection index and the accuracy of the index. The data are on a set of individual genotypes that are not replicated. Assumptions of trait heritabilities and genetic correlations are provided. The three traits are grain yield (gryld), days to harvest (dth), and a vegetitaive index called gndvi. The economic values of grain yield and days to harvest are provided. Gndvi has no economic value.

First load the data into memory

load("IndexData1.RData")

Look at the data

head(data)
##       gid dth    gryld     gndvi
## 1 4905617  52 2.308036 0.4609720
## 2 6176013  52 2.872769 0.4192635
## 3 6569128  52 1.573661 0.4730780
## 4 6688880  51 3.062500 0.4748669
## 5 6688916  53 2.174107 0.4581281
## 6 6688933  49 2.544643 0.4597804

Check that the data is numeric

str(data)
## 'data.frame':    30 obs. of  4 variables:
##  $ gid  : Factor w/ 1097 levels "3827768","4905617",..: 2 5 6 10 11 12 13 14 15 16 ...
##  $ dth  : num  52 52 52 51 53 49 50 53 53 54 ...
##  $ gryld: num  2.31 2.87 1.57 3.06 2.17 ...
##  $ gndvi: num  0.461 0.419 0.473 0.475 0.458 ...

Check the dimensions

dim(data)
## [1] 30  4

Calculate the phenotypic covariance matrix

Now we need to calculate the phenotypic covariance matrix ‘P’, this can be done using the cov() function or we can calculate it the long way. To calculate P the long way, first we need to subtract the mean phenotype value from each trait, we can do this using the scale() function.

X<- scale(data[,c('dth','gryld','gndvi')], center=TRUE, scale=FALSE)

Next for each pair of traits we can calculate the product of the deviations from the mean for trait 1 with that of trait 2, and then take the average like this:

P<- t(X) %*% X *1/(nrow(X)-1)

P looks like this:

##                dth        gryld         gndvi
## dth    1.443678161 -0.439183041 -0.0098963633
## gryld -0.439183041  0.293670688  0.0014278642
## gndvi -0.009896363  0.001427864  0.0003314357

Notice that P is symmetric and that the diagonals contain the variances, and the off-diagonals contain the covariances

Calculate the genetic covariance matrix

Now we will estimate the matrix G which is a matrix that contains the genetic covariances between the traits in the index and the traits in the aggregate genotype. To do this we will assume the following heritabilities and genetic correlations

Genetic correlation between dth and gryld

r_dth.gryld<- -0.6 

Genetic correlation between dth and ndvi

r_dth.gndvi<- -0.3

Genetic correlation between gryld and gndvi

r_gryld.gndvi<- 0.3

Heritability of dth

h2_dth<- 0.4

Heritability of gndvi

h2_gndvi<- 0.6

Heritability of gryld

h2_gryld<- 0.2

Now we make an empty matrix, G, where we will fill in the values

G<- matrix(NA, 3, 2, dimnames=list(c('dth','gryld','gndvi'),c('dth', 'gryld')))

G looks like this

##       dth gryld
## dth    NA    NA
## gryld  NA    NA
## gndvi  NA    NA

Fill in matrix G using the phenotypic variances, the heritabilities, and the genetic correlations

Calculate the genetic variance for each trait

va_dth<- h2_dth*P['dth','dth'] #genetic variance of days to heading
va_gndvi<- h2_gndvi*P['gndvi','gndvi'] #genetic variance of GNDVI
va_gryld<- h2_gryld*P['gryld','gryld'] #genetic variance of grain yield

Calculate the genetic covariances for each trait

cv_dth.gryld<- r_dth.gryld*sqrt(va_dth)*sqrt(va_gryld) #days to heading and yield
cv_dth.gndvi<- r_dth.gndvi*sqrt(va_dth)*sqrt(va_gndvi) #days to heading and GNDVI
cv_gryld.gndvi<- r_gryld.gndvi*sqrt(va_gryld)*sqrt(va_gndvi) #grain yield and GNDVI

Fill in G matrix variances

G['dth', 'dth']<-va_dth
G['gryld', 'gryld']<-va_gryld

Fill in G matrix covariances

G['dth','gryld']<- cv_dth.gryld
G['gryld', 'dth']<- cv_dth.gryld
G['gndvi', 'dth']<- cv_dth.gndvi
G['gndvi','gryld']<- cv_gryld.gndvi

Look at G

G
##                dth        gryld
## dth    0.577471264 -0.110499862
## gryld -0.110499862  0.058734138
## gndvi -0.003214858  0.001025278

Solve for the selection index weights (vector b)

v<- as.matrix(c(-5, 60))
b<- solve(P)%*%G %*% v
b
##            [,1]
## dth   -3.391515
## gryld  8.338143
## gndvi 96.916516

Calculate index values

I<-as.matrix(data[,c('dth','gryld','gndvi')]) %*%b

Plot histogram of net merit

hist(I)

Check correlations between the index and the economic traits

Correlation between I and yield

cor(I, data$gryld)
##           [,1]
## [1,] 0.8757228
cor(I, data$dth)
##            [,1]
## [1,] -0.9221152

Correlation between I and days to heading

cor(I, data$dth)
##            [,1]
## [1,] -0.9221152

Calculate the accuracy of the index.

First make the matrix C, genetic covariance matrix of traits in the aggregate genotype

C<- G[c('dth','gryld'),c('dth','gryld')]
numer<- t(b) %*% G %*% v
denom<- t(v) %*% C %*% v
r<- sqrt(numer/denom)
r
##         [,1]
## [1,] 0.50254

Now recalculate with genetic correlation between gndvi and gryld equal to 0.6

Genetic correlation between gryld and gndvi

r_gryld.gndvi<- 0.6

Heritability of dth

h2_dth<- 0.4

Heritability of gndvi

h2_gndvi<- 0.6

Heritability of gryld

h2_gryld<- 0.2

Now we make an empty matrix, G, where we will fill in the values

G<- matrix(NA, 3, 2, dimnames=list(c('dth','gryld','gndvi'),c('dth', 'gryld')))

G looks like this

##       dth gryld
## dth    NA    NA
## gryld  NA    NA
## gndvi  NA    NA

Calculate the genetic variance for each trait

va_dth<- h2_dth*P['dth','dth'] #genetic variance of days to heading
va_gndvi<- h2_gndvi*P['gndvi','gndvi'] #genetic variance of GNDVI
va_gryld<- h2_gryld*P['gryld','gryld'] #genetic variance of grain yield

Calculate the genetic covariances for each trait

cv_dth.gryld<- r_dth.gryld*sqrt(va_dth)*sqrt(va_gryld) #days to heading and yield
cv_dth.gndvi<- r_dth.gndvi*sqrt(va_dth)*sqrt(va_gndvi) #days to heading and GNDVI
cv_gryld.gndvi<- r_gryld.gndvi*sqrt(va_gryld)*sqrt(va_gndvi) #grain yield and GNDVI

Fill in G matrix variances and covariances

Look at G

G
##                dth        gryld
## dth    0.577471264 -0.110499862
## gryld -0.110499862  0.058734138
## gndvi -0.003214858  0.002050557

Solve for the selection index weights (vector b)

v<- as.matrix(c(-5, 60))
b<- solve(P)%*%G %*% v

Calculate the accuracy of the index. First make the matrix C, genetic covariance matrix of traits in the aggregate genotype

C<- G[c('dth','gryld'),c('dth','gryld')]
numer<- t(b) %*% G %*% v
denom<- t(v) %*% C %*% v
r<- sqrt(numer/denom)
r
##           [,1]
## [1,] 0.5878723