To a population geneticist such as yourself, migration is about the movement of genes between populations. We can imagine subpopulations in adjacent ponds, islands or wooded areas that are partially but not entirely isolated from each other. Using “island”" or “stepping stone” model to simulation the movement of genes between these populations.
(a) Island (b) Stepping Stone
In the Island Model genes are equally likely to migrate between any pair of islands while in the Stepping Stone model, genes can only migrate to one of the neighboring stones.
Let \(m\) be the proportion of new migrants in an island population in any generation. In this case \((1-m)\) is the proportion of “natives”. If the allelic frequency of some gene is \(q_0\) in the native population and \(q_m\) in the migrant population, then the allelic frequency after one generation becomes:
\[q_1 = (1-m) \cdot q_0 + m \cdot q_m\]
Let’s imagine that an alleles for red eyes has frequencies of 0.8, 0.3, 0.2 and 0.7 in four subpopulations A, B, C and D and that there’s a mutation rate of 0.1.
q0 <- c(0.8, 0.3, 0.2, 0.7)
m <- 0.1
Note that we can select the allelic frequency of population B as:
q0[2]
the allelic frequencies of all of subpopulation as:
q0[-2]
and the average allelic frequency in other populations as:
mean(q0[-2])
Therefore, using our formula (\(q_1 = (1-m) \cdot q_0 + m \cdot q_m\)) we can calculate the allelic frequency in population B in the next generation, \(q_{2A}\) as:
q0[2]*(1-m) + mean(q0[-2])*m
More generally, we can calcuate the allelic frequency for all subpopulations in the next generation as:
q1 <- vector()
for (i in 1:length(q0)) {q1[i] <- q0[i]*(1-m) + mean(q0[-i])*m}
q1
If instead of just looking one generation ahead, we want to look 50 generations head we can add a second for loop:
q <- q0
for (t in 1:20){
qlast <- q
for (i in 1:length(q)) {q[i] <- qlast[i]*(1-m) + mean(qlast[-i])*m}
print(q)
}
Now, let’s use these ideas to create a tidy function that allows us to tweak the allelic frequecies and the mutation rate. This function will also store the allelic frequencies for each generation in a data.frame, q.history so that we can graph our results. Other than, saving values to q.history this code performs the same function as the code above.
sim.island <- function(q, migration.rate=0.05, num.gens=100){
num.islands <- length(q)
q.history <- data.frame(generation=numeric(), island=numeric(), q=numeric())
q.history[1:num.islands, ] <- cbind(rep(0, num.islands), 1:num.islands, q)
for (n in 1:num.gens){
q.last <- q
for(i in 1:num.islands){
q[i] <- (1-migration.rate)*q.last[i] + migration.rate*(mean(q.last[-i]))
}
q.history[n*num.islands+(1:num.islands), ] <- cbind(rep(n, num.islands), 1:num.islands, q)
}
q.history$island <- as.factor(q.history$island)
return(q.history)
}
Let’s try out our function using the islands described above:
q0 <- c(0.8, 0.3, 0.2, 0.7)
results <- sim.island(q= q0, migration.rate = 0.1, num.gens = 50)
View(results)
library(ggplot2)
ggplot(results, aes(generation, q, group=island))+geom_line(aes(color=island))
How would you describe the results on the Island Model simulation?
How could you calculate the equilbrium allelic frequencies for the islands?
What determines how quickly the populations approach the equilibrium value?
The code for the Stepping Stone model looks much like the code for the except that the migrants for each island have the allelic frequencies of only the neighboring islands as opposed to all islands.
Going back to our examples with four subpopulations with allelic frequencies of 0.8, 0.3, 0.2, and 0.7, we can look one generation into the future with the code below. Note that it’s a little messier than the code for the island migrations since we have to account for the fact that the edge islands have fewer alleles both coming and going and account for them separately.
q0 <- c(0.8, 0.3, 0.2, 0.7)
m <- 0.1
q1 <- vector()
q1[1] <- (1-0.5*m)*q0[1] + 0.5*m*q0[2]
for(i in 2:3){
q1[i] <- (1-m)*q0[i] + 0.5*m*(q0[i-1]+q0[i+1])
}
q1[4] <- (1-0.5*m)*q0[4] + 0.5*m*q0[3]
q1
We could simulate 20 generations the same way that we did before. Note that we only need to change the code for how we calculate “q[i]” from “qlast”.
q <- q0
for (t in 1:20){
q.last <- q
q[1] <- (1-0.5*m)*q.last[1] + 0.5*m*q.last[2]
for(i in 2:4){
q[i] <- (1-m)*q.last[i] + 0.5*m*(q.last[i-1]+q.last[i+1])
}
q[4] <- (1-0.5*m)*q.last[4] + 0.5*m*q.last[3]
print(q)
}
sim.stepping.stone <- function(q, migration.rate=0.05, num.gens=100){
num.islands <- length(q)
q.history <- data.frame(generation=numeric(), island=numeric(), q=numeric())
q.history[1:num.islands, ] <- cbind(rep(0, num.islands), 1:num.islands, q)
for (n in 1:num.gens){
q.last <- q
q[1] <- (1-0.5*migration.rate)*q.last[1] + 0.5*migration.rate*q.last[2]
for(i in 2:(num.islands-1)){
q[i] <- (1-migration.rate)*q.last[i] + 0.5*migration.rate*(q.last[i-1]+q.last[i+1])
}
q[num.islands] <- (1-0.5*migration.rate)*q.last[num.islands] + 0.5*migration.rate*q.last[num.islands-1]
q.history[n*num.islands+(1:num.islands), ] <- cbind(rep(n, num.islands), 1:num.islands, q)
}
q.history$island <- as.factor(q.history$island)
return(q.history)
}
results <- sim.stepping.stone(c(1,0,0,0))
View(results)
ggplot(results, aes(generation, q, group=island))+geom_line(aes(color=island))
Is is possible to create a stepping stone simulation where the lines cross? In other words, is it possible to create a scenario where island B started with a lower allelic frequency than island C but ended with a higher freqency? Was this possible in the Island Simulation?
Challenge: Create a simulation where lines cross twice. In other words, you should create a scenario where one of two islands of interest started with a higher allelic frequency, then for some number of generations had a lower allelic frequency, before later still higher allelic frequency once again. What started conditions makes this possible?
Using the Island Model equation, \(q_1 = (1-m) \cdot q_0 + m \cdot q_m\), derive an equation for the difference between the allelic frequency on the island in generation t and the allelic frequency of migrants, that is \(q_t - q_m\), in terms of \(m\), \(t\), \(q_0\) and \(q_m\). You can assume that \(q_m\) is constant across generations. (Hint: First write an equation for \(q_2\) in terms of \(m\), \(q_1\) and \(q_m\) and then use your equation for \(q_1\) to substitue for \(q_1\).)
An island population of butterflies has an allelic frequency of 0.75 and the allelic frequency of the surrounding mainland population is 0.25 for the same allele. How many generation will it take for the island population to reach a frequency of 0.50 if the migration rate is 0.05?
Four neighboring populations of equal size have allelic frequencies of 0.2, 0.5, 0.8 and 0.9 and migrate according to the island model with m=0.05. What are the expected allelic frequencies after 5 geneations?