Introduction to the A Matrix

The numerator relationship matrix (A) has been used in animal breeding to help us ‘borrow information’ from relatives. This means that if we know the parents, each parents should provide a random sample of one out of every two alleles. One allele would be randomly sampled from the two from each parent to create a new offspring. Statistics prior to animal breeding was not very concerned with predicting random effects. These were somewhat seen as nuisance parameters. Random effects at the beginning would have been expected to come from the same distribution have the same variance (\(\boldsymbol{I}\sigma^{2}_{x}\)).


Setting up the pedigree

We will use the following example of a pedigree taken from Mrode’s chapter 2 (Linear Models book).

# create original pedigree
ped <- matrix(cbind(c(3:6), c(1,1,4,5), c(2,0,3,2)), ncol=3)

# change row/col names
rownames(ped) <- 3:6
colnames(ped) <- c("Animal", "Sire", "Dam")

# print ped
print(ped)
##   Animal Sire Dam
## 3      3    1   2
## 4      4    1   0
## 5      5    4   3
## 6      6    5   2

What is wrong with this pedigree?

It contains parents (2nd and 3rd column), that are not in the first column! This is a problem and is solved by taking the parents and stacking them on top of the pedigree because the pedigree needs to be sorted from oldest (top) to youngest (bottom).

# create pedigree for Chapter 2. 
ped <- matrix(cbind(c(1:6), c(0,0,1,1,4,5), c(0,0,2,0,3,2)), ncol=3)

# change row/column names
rownames(ped) <- 1:6
colnames(ped) <- c("Animal", "Sire", "Dam")

# print matrix
print(ped)
##   Animal Sire Dam
## 1      1    0   0
## 2      2    0   0
## 3      3    1   2
## 4      4    1   0
## 5      5    4   3
## 6      6    5   2

Keep in mind that sorting and renumbering a pedigree is a job itself. This is not trival. In fact, some software still has you provide year of birth or another numeric variable to sort on. Methods exist for sorting, but is beyond the scope of this tutorial.



Calculation of A

The calculation of A is very simple. For an ordered pedigree, the diagonal elemnts are simply 1 + half the A relationship between it’s sire and dam (off-diagonal). The off-diagonals are calculated from the relationship (previous off-diagonals) between the individual and the parents of the other individual.

Diagonals of A

\[a_{diag}=1+0.5(a_{sire,dam})\]

For example, the diagonal element for animal 6 would be:

\[a_{6,6}=1+0.5(a_{5,2})\]

(or element \(a_{2,5}\), symmetric) because the parents of 6 are animals 5 and 2.

Off-Diagonals of A

Off-diagonals are the numerator relationship between two individuals. The equation looks like:

\[a_{ind_1,ind_2}=0.5(a_{ind_1,sire_2}+a_{ind_1,dam_2})\]

where \(sire_2\) and \(dam_2\) refer to the sire and dam of individual 2 (\(ind_2\)).

For the numerator relationship between animals 1 and 6 (remember the pedigree is sorted from oldest to youngest) the equation is:

\[a_{1,6}=0.5(a_{1,5}+a_{1,2})\]

because animal 5 is the sire of 6 and animal 2 is the dam of 6. So the relationship between 1 and 6 is the average between the relationship of the individual to the parents of the other individual.

Creating A

Note: None of these functions are ‘production ready’ as they would be extremely slow in practice (especially with R). Quite a bit of time in animal breeding has been spent on how to speed up these calculations, but is beyond the scope of this tutorial.

Let’s look at Gota’s createA function in R (modified by myself).

#------------------------------------------------------------------------------#
# Gota's createA function (slightly modified by myself)
#------------------------------------------------------------------------------#
  
# Given the pedigree, return additive relationship matrix 'A'.
  
# Arguments
#       1) ped
#           ped needs to be in the format
#               column 1 = animal
#           column 2 = sire
#               column 3 = dam

####### NOTE: Unknown parents should be coded as ZERO (NOT 'NA')

# Literature: Henderson, C. R. 1976. 
# Simple Method for Computing the Inverse of a Numerator Relationship 
# Matrix Used in Prediction of Breeding Values. Biometrics 32:69-83.

# Author: Gota Morota <morota at wisc dot edu>
# Create: 16-Apr-2009
# Last-Modified: 1-Apr-2010
# License: GPLv3 or later

    `createA` <-
      
        function(ped){
          
            if (nargs() > 1 ) {
                stop("Only the pedigree is required (Animal, Sire, Dam)")
            }
            
            # This is changed from Gota's function
          # Extract the sire and dam vectors
            s = ped[, 2]
            d = ped[, 3]
            
            # Stop if they are different lengths
            if (length(s) != length(d)){
                stop("size of the sire vector and dam vector are different!")
            }
            
            # set number of animals and empty vector
            n <- length(s)
            N <- n + 1
            A <- matrix(0, ncol=N, nrow=N)
            
            # set sires and dams
            s <- (s == 0)*(N) + s
            d <- (d == 0)*N + d
            
            start_time <- Sys.time()
            # Begin for loop
            for(i in 1:n){
                
              # equation for diagonals
                A[i,i] <- 1 + A[s[i], d[i]]/2
                
                for(j in (i+1):n){    # only do half of the matrix (symmetric)
                    if (j > n) break
                    A[i,j] <- ( A[i, s[j]] + A[i, d[j]] ) / 2  # half relationship to parents
                    A[j,i] <- A[i,j]    # symmetric matrix, so copy to other off-diag
                }           
            }
            
        # print the time it took to complete
    cat("\t", sprintf("%-30s:%f", "Time it took (sec)", as.numeric(Sys.time() - start_time)), "\n")
            
    # return the A matrix
        return(A[1:n, 1:n])
    
        }

If you are new to R, this can look quite scary. I urge you to carefully go through the script, pulling out peice at a time, until you understand what is going on.

# create A
A <- Matrix(createA(ped))
##   Time it took (sec)            :0.000051
rownames(A) <- 1:6
colnames(A) <- 1:6

# convert ot sparce matrix for printing (not really sparce in this example)
A <- as(A, "sparseMatrix")

# print A
print(round(A, 4))
## 6 x 6 sparse Matrix of class "dsCMatrix"
##      1     2      3      4      5      6
## 1 1.00 .     0.5000 0.5000 0.5000 0.2500
## 2 .    1.000 0.5000 .      0.2500 0.6250
## 3 0.50 0.500 1.0000 0.2500 0.6250 0.5625
## 4 0.50 .     0.2500 1.0000 0.6250 0.3125
## 5 0.50 0.250 0.6250 0.6250 1.1250 0.6875
## 6 0.25 0.625 0.5625 0.3125 0.6875 1.1250

This matrix is not the relationship, but a (co)variance matrix meaning that on the diagonal are variances, and the off-diagonals represent the covariances (between breeding values). To obtain the relationships one would divide the off-diagonal elements by the square roots of the product of the cooresponding diagonals. This is taking a covariance matrix and reducing it to a correlation matrix. Thus, we call A the numerator relationship matrix.

# convert A to actual relationships
A_Rel <- cov2cor(A)

# print matrix
print(round(A_Rel, 4))
## 6 x 6 sparse Matrix of class "dsCMatrix"
##        1      2      3      4      5      6
## 1 1.0000 .      0.5000 0.5000 0.4714 0.2357
## 2 .      1.0000 0.5000 .      0.2357 0.5893
## 3 0.5000 0.5000 1.0000 0.2500 0.5893 0.5303
## 4 0.5000 .      0.2500 1.0000 0.5893 0.2946
## 5 0.4714 0.2357 0.5893 0.5893 1.0000 0.6111
## 6 0.2357 0.5893 0.5303 0.2946 0.6111 1.0000

Notice how element (1,6) is obtained by taking

\[0.2357 = \frac{0.25}{\sqrt(1.00*1.1250)}\]

Notice that animals 1 and 2 are not related, thus their breeding values should be independent (no information is shared between them).

The inverse of A looks like:

# A inverse
Ainv <- solve(A)

# print
print(Ainv)
## 6 x 6 sparse Matrix of class "dgCMatrix"
##                                                                   
## [1,]  1.8333333  0.5000000 -1.0 -0.6666667  5.551115e-17  .       
## [2,]  0.5000000  2.0333333 -1.0  .          5.333333e-01 -1.066667
## [3,] -1.0000000 -1.0000000  2.5  0.5000000 -1.000000e+00  .       
## [4,] -0.6666667  .          0.5  1.8333333 -1.000000e+00  .       
## [5,]  .          0.5333333 -1.0 -1.0000000  2.533333e+00 -1.066667
## [6,]  .         -1.0666667  .    .         -1.066667e+00  2.133333



Quick Methods to calculate A

You can decompse A using a matrix L (Henderson, 1976) so that A equals LL’.

# Choleski factorization to real symmetric positive-definite square matrix
L <- t(chol(A))

# print L
print(L)
## 6 x 6 sparse Matrix of class "dtCMatrix"
##      1     2         3         4         5         6
## 1 1.00 .     .         .         .         .        
## 2 .    1.000 .         .         .         .        
## 3 0.50 0.500 0.7071068 .         .         .        
## 4 0.50 .     0.0000000 0.8660254 .         .        
## 5 0.50 0.250 0.3535534 0.4330127 0.7071068 .        
## 6 0.25 0.625 0.1767767 0.2165064 0.3535534 0.6846532

You can recreate A by multiplying L by it’s transpose.

# recreate A
L %*% t(L)
## 6 x 6 sparse Matrix of class "dgCMatrix"
##      1     2      3      4      5      6
## 1 1.00 .     0.5000 0.5000 0.5000 0.2500
## 2 .    1.000 0.5000 .      0.2500 0.6250
## 3 0.50 0.500 1.0000 0.2500 0.6250 0.5625
## 4 0.50 .     0.2500 1.0000 0.6250 0.3125
## 5 0.50 0.250 0.6250 0.6250 1.1250 0.6875
## 6 0.25 0.625 0.5625 0.3125 0.6875 1.1250

We can further define A by separating it with TDT’.

\[\boldsymbol{A} = \boldsymbol{T}\boldsymbol{D}\boldsymbol{T}'\]

Now we need to create T. This only tracks parentage and ignores inbreeding.

    createT <- function(ped){

            if (nargs() > 1 ) {
                stop("Only the pedigree is required (Animal, Sire, Dam)")
            }
            
            # This is changed from Gota's function
            s = ped[, 2]
            d = ped[, 3]
            
            # check for NA
            
            # Stop if they are different lengths
            if (length(s) != length(d)){
                stop("size of the sire vector and dam vector are different!")
            }
            
            # set number of animals and empty vector
            n <- length(s)
            N <- n + 1
            
            # initialize T matrix
            T <- matrix(0, ncol=N, nrow=N)
            
            # set sires and dams
            s <- (s == 0)*(N) + s
            d <- (d == 0)*N + d
            
            # set start time
            start_time <- Sys.time()
            
            # Begin for loop
            for(i in 1:n){
                
              # diagonals are 1 (ignore inbreeding for this matrix)
                T[i,i] <- 1
                
                # loop to fill T
                for(j in (i+1):n){
                  
                  # 
                    if (j > n) break    # only do half of the matrix (symmetric)
                  
                    #A[i,j] <- ( A[i, s[j]] + A[i, d[j]] ) / 2  # half relationship of parents
                    T[i,j] <- 0
                    T[j,i] <- 0.5 * (T[s[j], i] + T[d[j], i])
                    
                }           
            }
            
    cat("\t", sprintf("%-30s:%f", "Time it took (sec)", as.numeric(Sys.time() - start_time)), "\n")
            
            return(T[1:n, 1:n])

    }

Now create T.

# create T
T <- createT(ped)
##   Time it took (sec)            :0.000080
# print T
print(T)
##      [,1]  [,2] [,3] [,4] [,5] [,6]
## [1,] 1.00 0.000 0.00 0.00  0.0    0
## [2,] 0.00 1.000 0.00 0.00  0.0    0
## [3,] 0.50 0.500 1.00 0.00  0.0    0
## [4,] 0.50 0.000 0.00 1.00  0.0    0
## [5,] 0.50 0.250 0.50 0.50  1.0    0
## [6,] 0.25 0.625 0.25 0.25  0.5    1

Now create D is not easy to create, because you first have to calculate the inbreeding, which basically means you have to create A first, but we want an easy way to calculate A, so it doesn’t make much sense for us.

# calculate inbreeding (F) from the A matrix
inbreeding <- diag(A) - 1

# print inbreeding
print(inbreeding)
## [1] 0.000 0.000 0.000 0.000 0.125 0.125
# function
createD <- function(ped, inbreeding){
 
  # subset sire and dam
  s <- ped[, 2]
  d <- ped[, 3]
  
  # start time
  start_time <- Sys.time()
  
  # add vec
  diagsD <- Matrix(-1, ncol=1, nrow=length(s))
  
  # add a 0 to the front of inbreeding
  
  # fill D
  for (i in 1:length(s)){
    
    if (s[i]!=0 & d[i]!=0){
      diagsD[i, 1] <- 0.5 - 0.25*(inbreeding[s[i]] + inbreeding[d[i]])
    } else if (s[i]!=0 & d[i]==0){
      diagsD[i, 1] <- 0.75 - 0.25*(inbreeding[s[i]])
    } else if (s[i]==0 & d[i]!=0){
      diagsD[i, 1] <- 0.75 - 0.25*(inbreeding[d[i]])
    } else {
      diagsD[i,1] <- 1
    }
  }
  
  # create D
  D <- diag(diagsD[,1])
            
  cat("\t", sprintf("%-30s:%f", "Time it took (sec)", as.numeric(Sys.time() - start_time)), "\n")
    
  # return
  return(D)
  
}
# create D
D <- Matrix(createD(ped, inbreeding))
##   Time it took (sec)            :0.017661
# print D
print(D)
##      [,1] [,2] [,3] [,4] [,5] [,6]   
## [1,]    1    .    .    .    .       .
## [2,]    .    1    .    .    .       .
## [3,]    .    .  0.5    .    .       .
## [4,]    .    .    . 0.75    .       .
## [5,]    .    .    .    .  0.5       .
## [6,]    .    .    .    .    . 0.46875

The D matrix represents the variance of the Mendelian sampling term. You can see that in general, as animals begin to become inbred, their Mendelian sampling variance should decrease.

Now we can try to recreate A using D and T.

# create A based on A = TDT'
newA <- T %*% D %*% t(T)

# print A
print(newA)
## 6 x 6 Matrix of class "dgeMatrix"
##      [,1]  [,2]   [,3]   [,4]   [,5]   [,6]
## [1,] 1.00 0.000 0.5000 0.5000 0.5000 0.2500
## [2,] 0.00 1.000 0.5000 0.0000 0.2500 0.6250
## [3,] 0.50 0.500 1.0000 0.2500 0.6250 0.5625
## [4,] 0.50 0.000 0.2500 1.0000 0.6250 0.3125
## [5,] 0.50 0.250 0.6250 0.6250 1.1250 0.6875
## [6,] 0.25 0.625 0.5625 0.3125 0.6875 1.1250



Creating the Inverse of A

One can create the A inverse using:

\[\boldsymbol{A} = (\boldsymbol{T^{-1}})'\boldsymbol{D^{-1}}\boldsymbol{T^{-1}}\]

\(\boldsymbol{D^{-1}}\) is easy to calculate because D is a diagonal matrix, so simply take the inverse of the diagonal elements (1/\(d_{ii}\)).

# calculate D
Dinv <- solve(D)

# print D inverse
print(Dinv)
##      [,1] [,2] [,3] [,4]     [,5] [,6]    
## [1,]    1    .    .        .    .        .
## [2,]    .    1    .        .    .        .
## [3,]    .    .    2        .    .        .
## [4,]    .    .    . 1.333333    .        .
## [5,]    .    .    .        .    2        .
## [6,]    .    .    .        .    . 2.133333

The \(\boldsymbol{T^{-1}}\) matrix is extremely easy to calculate using this function.

# Create T inverse function
    createTinv <- function(ped) {
      
      # start time
      start_time <- Sys.time()
        
      # create identity matrix
        Tinv <- diag(nrow(ped))
        
        # fill T inverse based on parental information
        for (i in 1:nrow(Tinv)) {
            for (j in 1:ncol(Tinv)) {
                if (i < j | i == j) next
                else 
                    if (j == ped[i, 2] | j == ped[i, 3]) Tinv[i, j] = -0.5
            }
        }
            
  cat("\t", sprintf("%-30s:%f", "Time it took (sec)", as.numeric(Sys.time() - start_time)), "\n")
    
        # return matrix
        return(Tinv)

    }

Now calculate T inverse.

# create T inverse
Tinv <- createTinv(ped)
##   Time it took (sec)            :0.000531
# print
print(Tinv)
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]  1.0  0.0  0.0  0.0  0.0    0
## [2,]  0.0  1.0  0.0  0.0  0.0    0
## [3,] -0.5 -0.5  1.0  0.0  0.0    0
## [4,] -0.5  0.0  0.0  1.0  0.0    0
## [5,]  0.0  0.0 -0.5 -0.5  1.0    0
## [6,]  0.0 -0.5  0.0  0.0 -0.5    1

You can see each row on the lower triangular part represents the parents of animal i. Parents of animal 3 are animals 1 and 2, thus you have a -0.5 in columns 1 and 2 for row 3.

Creating the Inverse of A without inbreeding

You can create the the \(\boldsymbol{A^{-1}}\) ignoring inbreeding. The \(\boldsymbol{T^{-1}}\) matrix will be the same, but you will have either 2, 4/3, or 1 on the diagonal of \(\boldsymbol{D^{-1}}\) if they have both, one, or two parents known, respectively.

createDnoF <- function(ped){
 
  # extract sire and dam
  s <- ped[, 2]
  d <- ped[, 3]
  
  # start time
  start_time <- Sys.time()
  
  # initiate D inverse
  Dinv <- Matrix(0, ncol=length(s), nrow=length(s))
  
  # set D inverse
  for (i in 1:length(s)){
    
    if (s[i]!=0 & d[i]!=0){
      Dinv[i, i] <- 2
    } else if (s[i]!=0 & d[i]==0){
      Dinv[i, i] <- (4/3)
    } else if (s[i]==0 & d[i]!=0){
      Dinv[i, i] <- (4/3)
    } else {
      Dinv[i, i] <- 1
    }
    
  }
            
  cat("\t", sprintf("%-30s:%f", "Time it took (sec)", as.numeric(Sys.time() - start_time)), "\n")
    
  # return D inverse
  return(Dinv)
  
}

# create D inverse with no inbreeding
DinvNoF <- createDnoF(ped)
##   Time it took (sec)            :0.032515
# print 
print(DinvNoF)
## 6 x 6 sparse Matrix of class "dsCMatrix"
##                        
## [1,] 1 . . .        . .
## [2,] . 1 . .        . .
## [3,] . . 2 .        . .
## [4,] . . . 1.333333 . .
## [5,] . . . .        2 .
## [6,] . . . .        . 2

We can again create \(\boldsymbol{A^{-1}}\) inverse, but there will be no inbreeding accounted for, making it a little differnt from the true \(\boldsymbol{A^{-1}}\).

# create A inverse with no inbreeding
Ainv2 <- t(Tinv) %*% DinvNoF %*% Tinv

# print A inverse
Ainv2
## 6 x 6 Matrix of class "dgeMatrix"
##            [,1] [,2] [,3]       [,4] [,5] [,6]
## [1,]  1.8333333  0.5 -1.0 -0.6666667  0.0    0
## [2,]  0.5000000  2.0 -1.0  0.0000000  0.5   -1
## [3,] -1.0000000 -1.0  2.5  0.5000000 -1.0    0
## [4,] -0.6666667  0.0  0.5  1.8333333 -1.0    0
## [5,]  0.0000000  0.5 -1.0 -1.0000000  2.5   -1
## [6,]  0.0000000 -1.0  0.0  0.0000000 -1.0    2

Lets look at the different between the inverse of A with and without inbreeding.

# difference between A inverse with or without inbreeding
print(round(Ainv - Ainv2, 4))
## 6 x 6 sparse Matrix of class "dgCMatrix"
##                                   
## [1,] .  .      . 0  0.0000  .     
## [2,] .  0.0333 . .  0.0333 -0.0667
## [3,] .  .      . .  .       .     
## [4,] .  .      . .  .       .     
## [5,] .  0.0333 . .  0.0333 -0.0667
## [6,] . -0.0667 . . -0.0667  0.1333


Creating the Inverse of A without inbreeding (Henderson’s)

We can apply an algorithm from Henderson 1976 to calculate the \(\boldsymbol{A^{-1}}\) directly.

createAinvHenderson <- function(ped){
  
  # set sire and dam
  s <- ped[, 2]
  d <- ped[, 3]
  
  # set total number in pedigree
  n <- length(s)
  
  # set n + 1 to create Matrix to grab 0's
  N <- n + 1 
  
  # calculate D inverse to start
  Dinv <- vector(mode="numeric", n)
  
  # create new vector of pedigree
  ped <- cbind(ped, 0)
  ped[, 4][ped[,2]==0 & ped[,3]==0] <- 0
  ped[, 4][ped[,2]!=0 & ped[,3]==0] <- 1
  ped[, 4][ped[,2]==0 & ped[,3]!=0] <- 2
  ped[, 4][ped[,2]!=0 & ped[,3]!=0] <- 3
  
  # number of parents
  parents <- ped[,4]
  
  # fill Dinverse
  Dinv[parents==0] <- 1
  Dinv[parents==1 | parents==2] <- (4/3)
  Dinv[parents==3] <- 2
  
  # set A inverse
  Ainv <- Matrix(0, ncol=n, nrow=n)
    
  # set start time
    start_time <- Sys.time()
    
  # begin loop
  for (i in 1:n){
    
    if (parents[i]==3){
      
      # add to diagonal
      Ainv[i, i]    <- Ainv[i, i]    + Dinv[i]
      
      # set factor to add
      factor <- (-1*(Dinv[i]/2))
      Ainv[s[i], i] <- Ainv[s[i], i] + factor
      Ainv[i, s[i]] <- Ainv[s[i], i]
      Ainv[d[i], i] <- Ainv[d[i], i] + factor
      Ainv[i, d[i]] <- Ainv[d[i], i]
      
      # set factor to add
      factor <- Dinv[i]/4
      Ainv[s[i], s[i]] <- Ainv[s[i], s[i]] + factor
      Ainv[d[i], d[i]] <- Ainv[d[i], d[i]] + factor
      Ainv[s[i], d[i]] <- Ainv[s[i], d[i]] + factor
      Ainv[d[i], s[i]] <- Ainv[s[i], d[i]]
      
    # check if only dam known
    } else if (parents[i]==2){
      
      # set factor
      Ainv[i, i] <- Ainv[i, i] + Dinv[i]
      
      # set factor
      factor <- (-1*(Dinv[i]/2))
      Ainv[d[i], i] <- Ainv[d[i], i] + factor
      Ainv[i, d[i]] <- Ainv[d[i], i]
      
      # set factor to add
      factor <- Dinv[i]/4
      Ainv[d[i], d[i]] <- Ainv[d[i], d[i]] + factor
      
    # check if only sire known
    } else if (parents[i]==1){
      
      # set factor
      Ainv[i, i] <- Ainv[i, i] + Dinv[i]
      
      # set factor
      factor <- (-1*(Dinv[i]/2))
      Ainv[s[i], i] <- Ainv[s[i], i] + factor
      Ainv[i, s[i]] <- Ainv[s[i], i]
      
      # set factor to add
      factor <- Dinv[i]/4
      Ainv[s[i], s[i]] <- Ainv[s[i], s[i]] + factor
      
    # check if neither is known
    } else if (parents[i]==0){
      
      # just add to diagonal
      Ainv[i, i] <- Ainv[i, i] + Dinv[i]
      
    }
    
    
  }
  
    # print time it took
  cat("\t", sprintf("%-30s:%f", "Time it took (sec)", 
    as.numeric(Sys.time() - start_time)), "\n")
            
  # return A inverse
  return(Ainv)
  
}

Use the function to create \(\boldsymbol{A^{-1}}\) without accounting for inbreeding. Notice it will be slightly different from the real \(\boldsymbol{A^{-1}}\).

# create inverse
AinvHend <- createAinvHenderson(ped)
##   Time it took (sec)            :0.178566
# print
print(AinvHend)
## 6 x 6 sparse Matrix of class "dgCMatrix"
##                                             
## [1,]  1.8333333  0.5 -1.0 -0.6666667  .    .
## [2,]  0.5000000  2.0 -1.0  .          0.5 -1
## [3,] -1.0000000 -1.0  2.5  0.5000000 -1.0  .
## [4,] -0.6666667  .    0.5  1.8333333 -1.0  .
## [5,]  .          0.5 -1.0 -1.0000000  2.5 -1
## [6,]  .         -1.0  .    .         -1.0  2


Using Dick Quaas’s Method of L (accounting for inbreeding)

Dick Quaas came up with a method of calculating the L matrix to account for inbreeding.

\[\boldsymbol{A} = \boldsymbol{TDT'}\] \[\boldsymbol{L} = \boldsymbol{T}\sqrt{\boldsymbol{D}}\] \[\boldsymbol{A} = \boldsymbol{LL'}\]

createLQuass <- function(ped){
 
  # get sire and dam
  s <- ped[, 2]
  d <- ped[, 3]
  
  # number of total animals in the pedigree
  n <- length(s)
  
  # this sets n + 1 - to set missing parents to that value, outside the main matrix
  N <- n+1
  
  # start time
  start_time <- Sys.time()
  
  # initiallize vector to store a_ii values!
  diagsA <- vector(mode="numeric", N)
  
  # sum of squares down row for each animal
  sumSquaresL <- vector(mode="numeric", N)
  
  # create L
  L <- Matrix(0, ncol=n+1, nrow=n+1)
  
  # row and col names
  rownames(L) <- 1:N
  colnames(L) <- 1:N
  
  # set sire 
  s[s==0] <- N
  d[d==0] <- N
  
  # double loop
  for (j in 1:n){        # column - 1st to fill down column
    for (i in 1:n){      # row - 2nd because we need to fill down the column,
                         #        not across row
      # print
      #cat("\n\n *** Round *** ", i, " ", j, "\n\n")
      
      #========================================#
      # Skip upper off-diagonals
      #========================================#
      
      # skip if column is greater than row number (upper-diags)
      if (j > i) { next }
      
      #========================================#
      # Fill diagonals
      #========================================#
      
      # fill diagonals of L first
      if (i == j) {                   # test if it's a diagonal element
        
        #cat("\nDIAGONAL\n")
        
        # test if both parents are not missing!
        if (s[i]!=N & d[i]!=N) {
          
          # set diagonal element of L
          L[i, j] <- sqrt((1 - 0.25*(sumSquaresL[s[i]] + sumSquaresL[d[i]])))
          
           # ** old way **
           # sum(L[s[i], 1:i]^2) + sum(L[d[i], 1:i]^2)  # see page 29 in Mrode
           # we don't need this because the parents if known, should
           # be above in the pedigree, therefore have a SS elements
           # across the L matrix!!
          
        # test if sire is not missing and the dam is
        } else if (s[i]!=N & d[i]==N) {
          
          # set diagonal element of L
          L[i, j] <- sqrt((1 - 0.25*(sumSquaresL[s[i]])))
          
        # test if dam is not missing and the sire is
        } else if (s[i]==N & d[i]!=N) {
          
          # set diagonal element of L
          L[i, j] <- sqrt((1 - 0.25*(sumSquaresL[d[i]])))
          
        # test if both are missing
        } else if (s[i]==N & d[i]==N){
          
          # set diagonal element of L
          L[i, j] <- 1
          
        } else {
          warning("\n\n\nSOMETHING BAD HAPPENED!!!\n\n\n")
        }
      
      # after we get done filling the diagonal of L (i.e. i = j) then
      # we can fill in the sum of squares for that animal!
      sumSquaresL[i] <- sum(L[i, 1:i]^2)
      
      } # end of the diagonal portion
      
      #========================================#
      # Fill off-diagonals
      #========================================#
      
      # fill off-diagonals of L next
      if (j < i) {
        
        #cat("\nOFF-DIAGONAL\n")
        
        # test if both parents are not missing!
        if (s[i]!=N | d[i]!=N) {
          
          # fill lower off-diags
          L[i, j] <- 0.5 * (L[s[i], j] + L[d[i], j])
          
        # test if sire is not missing and the dam is
        #} else if (s[i]!=N & d[i]==N) {
          
          # fill lower off-diags
          #L[i, j] <- 0.5 * (0 + L[s[i], j])
          
        # test if dam is not missing and the sire is
        #} else if (s[i]==N & d[i]!=N) {
          
          # fill lower off-diags
          #L[i, j] <- 0.5 * (L[d[i], j] + 0)
          
        # test if both are missing
        } else if (s[i]==N & d[i]==N) {
          
          # set to 0 if both parents missing
          L[i, j] <- 0
          
        } else {
          warning("\n\n\nSOMETHING BAD HAPPENED!!!\n\n\n")
        }
      
      }
      
      #cat("\n\n *** Done with Round ***", i, " ", j, "\n\n")
      
    }
  }
  
  # subset L
  L <- L[1:n, 1:n]
  
  # subset inbreeding
  inbreeding <- round((sumSquaresL - 1)[1:n], 5)
  
    # print time it took
  cat("\t", sprintf("%-30s:%f", "Time it took (sec)", 
    as.numeric(Sys.time() - start_time)), "\n")
            
  # return list
  return(list(L=L, inbreeding=inbreeding))
  
}

Now use the function to extract L.

# create the L matrix with Quass
outputQuass <- createLQuass(ped)
##   Time it took (sec)            :0.031293
# extract L matrix
L <- outputQuass$L

# print L (notice that I return a list of both the L matrix and inbreeding coefs
print(L)
## 6 x 6 sparse Matrix of class "dgCMatrix"
##      1     2         3         4         5         6
## 1 1.00 .     .         .         .         .        
## 2 .    1.000 .         .         .         .        
## 3 0.50 0.500 0.7071068 .         .         .        
## 4 0.50 .     .         0.8660254 .         .        
## 5 0.50 0.250 0.3535534 0.4330127 0.7071068 .        
## 6 0.25 0.625 0.1767767 0.2165064 0.3535534 0.6846532

Now print the inbreeding coefficents from L (diagonals of A - 1).

# extract inbreeding from function
inbreeding <- outputQuass$inbreeding

# print inbreeding coefs
print(inbreeding)
## [1] 0.000 0.000 0.000 0.000 0.125 0.125

Now we can recreate A using using LL’.

# create A with L
AwithL <- L %*% t(L)

# print A
print(AwithL)
## 6 x 6 sparse Matrix of class "dgCMatrix"
##      1     2      3      4      5      6
## 1 1.00 .     0.5000 0.5000 0.5000 0.2500
## 2 .    1.000 0.5000 .      0.2500 0.6250
## 3 0.50 0.500 1.0000 0.2500 0.6250 0.5625
## 4 0.50 .     0.2500 1.0000 0.6250 0.3125
## 5 0.50 0.250 0.6250 0.6250 1.1250 0.6875
## 6 0.25 0.625 0.5625 0.3125 0.6875 1.1250

Recreating A with this multiplication is not a good idea. What we really wanted was a way to calculate the inbreeding efficientyly without create A explicitly. Since the L matrix gives us a way to calculate the diagonals of A and therefore inbreeding coefficients (1 - diag), we can create \(\boldsymbol{A^{-1}}\).

# now we need a way to calculate D from inbreeding
Dinv <- solve(createD(ped, inbreeding))
##   Time it took (sec)            :0.001742

Now we can use the quick way to calculate \(\boldsymbol{A^{-1}}\) from above using \(\boldsymbol{T^{-1}}\) and \(\boldsymbol{D^{-1}}\).

# create A inverse
Ainv = round(t(Tinv) %*% Dinv %*% Tinv, 4)
Ainv = Matrix(Ainv)
Ainv = as(Ainv, "sparseMatrix")

# print A inverse
print(Ainv)
## 6 x 6 sparse Matrix of class "dsCMatrix"
##                                                  
## [1,]  1.8333  0.5000 -1.0 -0.6667  .       .     
## [2,]  0.5000  2.0333 -1.0  .       0.5333 -1.0667
## [3,] -1.0000 -1.0000  2.5  0.5000 -1.0000  .     
## [4,] -0.6667  .       0.5  1.8333 -1.0000  .     
## [5,]  .       0.5333 -1.0 -1.0000  2.5333 -1.0667
## [6,]  .      -1.0667  .    .      -1.0667  2.1333

 


A work by Austin Putz

putz.austin@gmail.com