This demonstration will show you 1) how to use a pedigree to get estimates of relationship between individuals and inbreeding coefficients, and 2) how alleles are transmitted in a pedigreed population to better understand the concept of identity-by-descent and how this leads to increasing levels of homozygosity (inbreeding) over time.
First you will learn how to compute pedigree based estimates of relationship using the package pedigreemm
library(pedigreemm)
## Loading required package: lme4
## Loading required package: Matrix
This pedigree consists of half-siblings G and H that have been self pollinated for one generation
ped<- data.frame(id=c('F', 'G', 'H', 'I', 'J'),
p1=c(NA, 'F', 'F', 'G', 'H'),
p2= c(NA, NA, NA, 'G', 'H'))
ped
## id p1 p2
## 1 F <NA> <NA>
## 2 G F <NA>
## 3 H F <NA>
## 4 I G G
## 5 J H H
The pedigree must be put in order starting with the base population and moving forward. The editPed() function does this.
ped2<- editPed(ped$p1, ped$p2, ped$id)
ped2
## label sire dam gene
## 1 F <NA> <NA> 0
## 2 G F <NA> 1
## 3 H F <NA> 1
## 4 I G G 2
## 5 J H H 2
This function encodes the pedigree in what is called ‘sparse matrix format’ which is a trick to make computations with the matrix faster
ped3<- pedigree(ped2$sire, ped2$dam, ped2$label)
This is 2x the kinship matrix
A<- getA(ped3)
A<- as.matrix(A) #converts the A matrix to a regular matrix
A
## F G H I J
## F 1.0 0.50 0.50 0.50 0.50
## G 0.5 1.00 0.25 1.00 0.25
## H 0.5 0.25 1.00 0.25 1.00
## I 0.5 1.00 0.25 1.50 0.25
## J 0.5 0.25 1.00 0.25 1.50
Now you will learn about allele transmission in pedigreed populations to understand better the concept of identity-by-descent and inbreeding. First we will simulate a small pedigreed population that mates randomly for several generations. We also simulate genotypic data at one locus. We will keep track of the pedigree and genotypic data on all individuals generated each generation. Then we will observe the results and discuss. If time allows you will estimate the rate of inbreeding using the pedigree based estimates of the inbreeding coefficient.
library(pedigreemm)
library(doBy)
This time we record ID numbers, parents, and the generation number in a data frame. We will also record genotypes for a single locus. Note that in the base population each allele is unique because all individuals are assumed unrelated
nbase<- 1000
base<- data.frame(id=c(1:nbase), p1=NA, p2=NA, allele1=NA, allele2=NA, gen=0)
base$allele1<- paste('A', c(1:nbase), "-1", sep="")
base$allele2<- paste('A', c(1:nbase), "-2", sep="")
Before doing anything else we create a data.frame where info on all individuals will be stored in this simulation
allpop<- base
We will sample ‘N’ individuals for random mating each generation for ‘ngen’ generations
N<- 10
ngen<- 75
We sample N individuals from the base as object pop
pop<- base[sample(base$id, size=N),]
pop
## id p1 p2 allele1 allele2 gen
## 896 896 NA NA A896-1 A896-2 0
## 483 483 NA NA A483-1 A483-2 0
## 38 38 NA NA A38-1 A38-2 0
## 623 623 NA NA A623-1 A623-2 0
## 265 265 NA NA A265-1 A265-2 0
## 962 962 NA NA A962-1 A962-2 0
## 71 71 NA NA A71-1 A71-2 0
## 57 57 NA NA A57-1 A57-2 0
## 612 612 NA NA A612-1 A612-2 0
## 568 568 NA NA A568-1 A568-2 0
We are using nested for loops, the inner loop select two random parents and random mates them. In the random mating process, one allele per parent is selected at random and given to a progeny. The progeny gets its own row in the information table indicating its parentage and its one locus phenotype. This is repeated N times to form the next generation of the population. The outer loop repeats the random mating process ngen times to form many generations
#loop for each generation
for(j in 1:ngen){
##create the empty data table to record information on new individuals created
popnew<- data.frame(id=c(nrow(allpop)+1):c(nrow(allpop)+N), p1=NA, p2=NA, allele1=NA, allele2=NA, gen=j)
##loop to create each new individual through random mating
for(i in 1:nrow(popnew)){
##randomly select two parents and one allele per parent
ixp1<- sample(1:N,1) #row index for parent 1
ixp2<- sample(1:N,1) #row index for parent 2
rowp1<- pop[ixp1,] #data on parent 1
rowp2<- pop[ixp2,] #data in parent 2
alsp1<- rowp1[,c('allele1', 'allele2')] #alleles p1
alsp2<- rowp2[,c('allele1', 'allele2')] #alleles p2
al1samp<- sample(alsp1,1) #randomly select one allele from parent 1
al2samp<- sample(alsp2,1) #randomly select one allele from parent 2
##record information
popnew[i,c('p1', 'p2')]<- pop[c(ixp1, ixp2),'id'] #parent information
popnew[i,c('allele1','allele2')]<- c(al1samp,al2samp) #allele information
}#new individuals 'popnew' created
##add new individuals to the information data.frame
allpop<- rbind(allpop, popnew)
##replace the population
pop<- popnew
}#ngen generations completed
View some of the data generated for generation 3
head(allpop[allpop$gen==3,])
## id p1 p2 allele1 allele2 gen
## 1021 1021 1013 1011 A612-1 A483-2 3
## 1022 1022 1014 1018 A38-2 A962-2 3
## 1023 1023 1014 1011 A38-2 A483-2 3
## 1024 1024 1019 1017 A483-2 A896-2 3
## 1025 1025 1016 1014 A896-2 A38-2 3
## 1026 1026 1019 1011 A623-1 A265-1 3
Using pedigreemm, we compute the inbreeding coefficient I per individual. Once computed, we then add I to our data.frame of pedigrees and genotypes
ped<- pedigree(allpop$p1, allpop$p2, allpop$id)
I<- inbreeding(ped)
allpop<- data.frame(allpop, I)
mnI<- summaryBy(I~gen, allpop)
head(mnI)
## gen I.mean
## 1 0 0.0000000
## 2 1 0.1500000
## 3 2 0.0125000
## 4 3 0.1343750
## 5 4 0.1687500
## 6 5 0.2224609
Use View(allpop) to look at the data, what do you notice about the alleles and the inbreeding coefficients. How is allele identity-by-descent and inbreeding related?
Use the mean I per generation that we just computed.
Estimate the actual inbreeding rate based using \(\Delta F= (F_{t+1} – F_{t})/(1 – F_{t})\). Compute the the expected inbreeding rate based on the population size and compare the two values.