Random Genetic Drift in R

Simulating Random Genetic Drift

We’ve seen in class that small inbred populations will become increasingly homogeneous over time. We have also modeled births as marbles selected randomly from urns with marbles of different colors representing different alleles. In this lab, we will simulate how a population of alleles changes over time. For the sake of simplicity, we are going to make two important assumptions:

Every allele is equally likely to reproduce itself. In other words, there are no selective advantages (for now!).
The generations are non-overlapping (imagine a new generation of plants sprouting each summer).

In R code, our population of alleles will be a vector. Here’s a population of 10 diploid organisms with 10 A alleles and 10 a alleles:

alleles <- c(rep("A",10), rep("a",10))
table(alleles)

## alleles
##  a  A 
## 10 10

To create the next generation, we’ll simply sample 20 alleles with replacement from the last generation:

alleles <- sample(alleles, 20, replace=TRUE)
table(alleles)

## alleles
##  a  A 
## 12  8

Now, let’s simulate 100 generations and, for each generation, count the number of A alleles and plot it. Note that in the code below, the second line creates and empty plot and the final line (within the for loop) adds a point to the plot. You may want to run this entire section of code several times. What happens to the population of alleles in every simulation?

alleles <- c(rep("A",10), rep("a",10))

plot(1,0, type="n", xlim=c(1,100), ylim=c(0,20), xlab="generation", ylab="number of A alleles")
for(i in 1:100){
alleles <- sample(alleles, 20, replace=TRUE)
points(i, length(alleles[alleles=="A"]), pch=19, col="red")
}

Time to Fixation

How long does it take for fixation to occur – for one of the two alleles to go extinct? And how does this time depend on the number of alleles and the proportiosn of alleles?

Let’s start answering this by writing code that counts generations until fixation occurs. If instead of re-running a section of code a fixed number of times we want to keep running it until some condition (in this case, fixation) is met, we need to use a while loop instead of a for loop. In the code below, we keep simulating generations so long as the most common allele is still smaller in number than the total population of alleles.

alleles <- c(rep("A",10), rep("a",10))

n <- sum(table(alleles))
generations <- 0

while(max(table(alleles))<n){
alleles <- sample(alleles, 20, replace=TRUE)
generations <- generations + 1
}

generations

Of course, the time to fixation varies between simulations. To get a sense of the possibilities, we can simulate the fate of our population of alleles large number of times, plot the distribution of time until fixation and summarize this distribution by finding the mean fixation time and standard deviation in fixation time. The code below creates a function that does just that.

Notice that our function, sim.fixation, takes two inputs. The first input is the population of alleles (a vector) and the second is the number of simulations, which defaults to 100. Within the function, we first count the number of alleles, and create a vector, generations, which will hold our fixation times. Next, we use a for loop that to simulate the the fortunes of our population a fixed number of times. Within the for loop, we have the same while loop we used above which creates new generations until one allele dominates. Finally, this function creates a histogram of fixation times and returns the number of generations in each simulated history, the mean number of generations and the standard deviation in the number of generations.

sim.fixation <- function(alleles, num.sims=100){
n <- length(alleles)
generations <- rep(0, num.sims)

for(i in 1:num.sims){
temp.alleles <- alleles
while(max(table(temp.alleles))<n){
temp.alleles <- sample(temp.alleles, n, replace=TRUE)
generations[i] <- generations[i] + 1
}
}
hist(generations, main="Generations until Fixation")
return(g = list(generations=generations, mean.gen = mean(generations), sd.gen = sd(generations)))

}

Now, let’s finally see this function in action! Please complete the appropriate row in the Fixation Times chart each time you run the sim.fixation function. After you’ve run the code below, try creating your own populations of alleles and finding the fixation times.

alleles <- c(rep("A",10), rep("a",10))
sim.fixation(alleles, 200)

alleles <- c(rep("A",20), rep("a",20))
sim.fixation(alleles, 200)

alleles <- c(rep("A",40), rep("a",40))
sim.fixation(alleles, 200)

alleles <- c(rep("A",5), rep("a",15))
sim.fixation(alleles, 200)

alleles <- c(rep("A",2), rep("a",18))
sim.fixation(alleles, 200)

alleles <- c(rep("A",2), rep("a",18), rep("B", 20))
sim.fixation(alleles, 200)

Losing Rare Alleles

In small inbreeding populations, rare alleles may be lost well before one allele completely dominates. Let’s rewrite our function from above so that it keeps track of the number of unique alleles. This time, our while loop will be designed to keep spouting new generation until one allele has been lost.

sim.lose.rare.allele <- function(alleles, num.sims=100){

n <- length(alleles)
num.unique.alleles <- length(unique(alleles))
generations <- rep(0, num.sims)

for(i in 1:num.sims){
temp.alleles <- alleles
while(length(unique(temp.alleles))==num.unique.alleles){
temp.alleles <- sample(temp.alleles, n, replace=TRUE)
generations[i] <- generations[i] + 1
}
}
hist(generations, main="Generations until an Allele is Lost")
return(g = list(generations=generations, mean.gen = mean(generations), sd.gen = sd(generations)))

}

Please try each of the following simulations and complete the Losing Rare Alleles chart. Then try several simulations of your choosing before answering the follow-up questions.

alleles <- c(rep("A",200), rep("a",200), rep("B", 2))
sim.lose.rare.allele(alleles, 200)

alleles <- c(rep("A",4000), rep("a",4000), rep("B", 2))
sim.lose.rare.allele(alleles, 200)

alleles <- c(rep("A",200), rep("a",200), rep("B", 2), rep("C", 2))
sim.lose.rare.allele(alleles, 200)

alleles <- c(rep("A",200), rep("a",200), rep("B", 4))
sim.lose.rare.allele(alleles, 200)

Random Genetic Drift in R

Population Genetics

1/17/2017

Simulating Random Genetic Drift

Time to Fixation

Losing Rare Alleles