Chapter 3: Genetic Drift and Effective Population Size



This Chapter is really about changes in allele frequency that occur when (always) population size is finite Popuation size is a fundamental consideration when examining changes in allele frequency through time: There is a huge variation in population size across species.

Genetic drift results from Sampling Error, due to a finite sample of individuals, gametes and alleles. Drift not only impacts allele frequencies but also genotype frequencies.

We can use a Wright-Fisher model to simulate drift, with some assumptions about biology:

  1. Generations are discrete.
  2. Number of males and females is equal.
  3. Population Size remains constant over-time.
  4. All individuals produce equal number of gametes.

BASIC Genetic Drift simulator (comes with absolutely NO WARRANTY)

Under a Wright-Fisher Model there are a few assumptions of genetic drift, these are:

  1. The direction of change in allele frequency is random.
  2. The magnitude of change is inversely proportional to population size.
  3. If there is no other process acting on allele frequncy then fixation/loss is the equilibrium state.
  4. Genetic drift changes allele frequencies and therefore genotype frequencies.
  5. The probability of fixation of an allele through drift is its frequency.

Three models discussed in tthis chapter:

  1. The binomial distribution: magnitude of genetic drift depends on the allele frequencies.
  2. Markov Chain: The rate of change of allele frequencies.
  3. Diffusion estimation: genetic drift modelled as diffusion of particles.

Modelling Genetic Drift: The Binomial distribution

Genetic drift is a stochastic process, individual outcomes are dictated by chance, but a large number of outcomes can be modelled as a probability distribution.

The binomial formula describes the probability distribution of \(i\) successes in \(N\) trails, where the probability of success is \(p\). This is quite like sampling alleles from a population. Consider a population of A and a alleles. For any given trial (or sample) the probability of sampling an A allele is the frequency of the A allele. But what if we took 10 samples, what is the probability of getting 6, or 7 A alleles? This can be described by the binomial probability distribution and is useful for describing the magnitidue af allele frequency changes expected under a Wright-Fisher model of drift.

\[P_{(i=A)}=\binom{2N}{i}p^{i}q^{2N-i}\] where \[\binom{2N}{i} = \frac{(2N)!}{i!(2N-i)!}\] and \(p^{i}\) is the probability of drawing i alleles if the frequency is \(p\), and \(q^{2N-i}\) is the probability of the remainder of the drawn alleles being B (i.e. not A) given the frequency \(q\).

\(\binom{2N}{i}\) is pronounced “\(2N\) draw \(i\)” and represents the probabiltity of the number of successes, \(i\), given \(2N\) trials.

So what is the probability of drawing 7 A alleles from in 10 trials, given that \(p\) = 0.5?

\(\binom{2N}{i}\) enumerates all the possible ways to draw 7 A from ten trials, and so we need this number first.

\[\binom{10}{7}=\frac{(10)!}{7!(10-7)!} = \frac{3.6288\times 10^{6}}{3.024\times 10^{4}} = 120\]

\(2N\) actually refers to the number of draws you are making, so if you make 10 draws it is written as \(\binom{10}{2}\) for example. The notation here is \(2N\) because in a Wright-Fisher model of drift you are usually drawing two alleles from each individual (static pop size), thus you make twice as many draws as there are individuals, hence \(2N\).

Adding this into the equation we get:

\[P_{(i=A)}=120 \times p^{i}q^{2N-i} = 120 \times 0.5^{7}0.5^{13}= 0.1171875 \]

double check the logic is working by matching up the numbers to the example in the book \(\binom{4}{2}\):

\[P_{(i=A)}=6 \times p^{i}q^{2N-i} = 6 \times 0.5^{2}0.5^{2}= 0.375 \]

Yes, it works!!

Now plot the probabiltiy denstiy for samples with \(N\) of 10 draws, and number of success ranging from \(0 .. N\), with \(p = 0.5\).

#set up the draw sizes
draws<-1000
p<-0.5
range<-1:draws
probs<-NULL
for ( i in range ) {
  #also try it with rbinom
  prob<-dbinom(x=i, size=draws, prob=p)
  probs<-c(probs,prob)
}#for
plot(1:length(probs),probs, type = "n", xlab="number of successes", ylab="probability", main=paste("prbability density of success at p = 0.5, trials = ", draws))
lines(1:length(probs),probs, lwd=3, col = "red")

The above calculations give the chance of observing x number of success in N trials. Importantly, we can see that the odds of extreme changes in allele frequency (i.e. an unusually large number of success of failures is reduced with increased number of trails, i.e. icreased popluation size). The below plot contains fewer trials than the one above, and you can see the chance of getting a “wild” number of draws increases.

But how variable should allele frequency changes be under genetic drift. The varience of a Bernoulli or binomial random variable is:

  • \[\sigma^{2}=pq\]

The maxium variability will occur when \(p=q=\frac{1}{2}\)

The Standard deviation is \(\sigma=\sqrt{pq}\).

and the Standard error is:

  • \[SE=\sqrt{\frac{pq}{2N}}\]

Thus the variability of allele changes decrease as a population approaches fixation, this is because \(pq\) approaches zero (one of the allele frequencies is almost zero, so x times a number near zero is small, thus the variace approaches 0 too).

Also the SE in allele frequency changes also reduces in population where alleles approach fixation/loss. For example, see the plot below:

Markov chain models

Instead of looking at one population, like we have done before, it can be useful to look and numerous identical populations, to find a little bit about how drift works on average.

For example for a given allele we can calculate the transition probability of one state to another (i.e. from one frequency to another) using the binomial formula:

  • \[P_{(i \rightarrow j)}=\binom{2N}{j}p^{j}q^{2N-j}\]

where \(i\) is the initial number of number of the allele and \(j\) is the new number of alleles, after the proposed transition.

By assumin an infinite number of popultaions one can start of with founding populations and then, generation by generation, modify the average frequncy of each allele across all populations. For example, imagine a infinate number of populations, each of one individual with A and a alleles (therefore and Aa heterozygote), each allele frequencies of \(p=q=\frac{1}{2}\). After sampling to form another generation, with a single diploid idividual there are only three outcomes:

  • number of A alleles = 0 (lost).
  • number of A alleles = 1 (Heterozygote).
  • or number of A alleles = 2 (Fixed).

You can use the binomial equation to figure out the probability of each transition (0,1,2)

For example the liklihood of drawing 2 A alleles is:

\[P_{(1 \rightarrow 2)}=\binom{2}{2}p^{2}q^{0} = 0.25\]

given that each population will now have two A alleles at probability 0.25 so you can modify the frequency of populations in state i with the transition probability of moving to state \(j\) such that if there are 100 (\(i\)) population with one A allele, and one a allele, 25 will now be fixed for A.

This means that the frequency of alleles in a given generation is only effected by the allele frequency in the current generation, this is known as the Markov property.

Using fancy matrix algebra there is you can culculate that the rate at which genetic variation is lost from a collection of populations is:

  • \[1-\frac{1}{2N}\]

This means that genetic drift reduces genetic variaiton and a amount equal the inverse of twice the population size! Try to graph this according to population size.

This figure shows exactly how population size effects the rate at which genetic diversity is lost, a key insight from the Hamilton chapter.

The Diffusion Approximation of Genetic Drift

Markov chain models treat allelic state transitions and time as discrete processes, but these can be model as continuous variables using some very fancy math (only hinted qt in the book, thankfully).

Imagine diffusion of a dye in water:

Here ink particles can move in three dimensions, and move at rate that depends on the diffusion coefficient but it is simpler, and more applicable to allele frequencies to work in only one dimension (representing allele frequency).

In this case, each particle is an individual population affected by drift. Assume that:

  1. Particles (populations) move at a constant velocity
  2. Each particle moves \(\delta\) distance in set unit time.
  3. Half the particles move in one direction half the other, so \(p=q=0.5\).

So the, next movement of particals is:

\[\bar{x}= p(\delta) - q(\delta) = 0\]

In an ensample of replicate population we expect an equal number of them to move towards fixation and loss. The mean change in allele frequnecy will be \(0\) (remember the red line on the genetic drift simulations).

Although the mean is \(0\) the variance is not…

If a group of particals all start at position zero (\(\bar{x}^2_{i(t=1)} = 0\)) then the variance in particle position increases by \(\delta^2\) every time interval, or \(t\delta^2\) in \(t\) time intervals.

\[\frac{\delta^2}{2}\] is the diffusion coeffieint of real molecules abut has an analog in population genetics:

\[D=\frac{p-p^2}{4N}\]

The diffusion coefficient is determined by both \(N\) (population size) and \(p\) allele frequnecy (greatest when \(p=0.5\)).

Researchers have used diffusion theory to estimate the mean time to fixation for alleles at various frequencies.

fixation and time

Effective Population Size

  • consensus populations size \(\longrightarrow\) the total head count of the population.

  • The effective population size \(N_{e}\longrightarrow\) is approxiamtely the number of individuals that contribute gametes to the next generation. This is the the size of a Wright-Fisher population thata would produce the same magnitude of genetic drift.

An example of why \(N_{e}\) might be lower that the consensus pop size is demonstrated by fluctuating population size over time. For example, imagine a population of 100, then 10, then 100 indoividuals over three generations (a genetic bottleneck). The arithmetic mean is 70, but a better refelction of the effect of drift on this population would be to use the harmonic mean which is 25.

We expect this pop to behave like an ideal Wright-Fisher population with census size 25.

  • Founder Effect is the establishment of a popualtion by a few individuals, the population is therefor subject to the excess drift/sampling error of small populations.

Also mating behaviours also impact \(N_{e}\), when males compete for females for example, and typically few males contribute gametes to the next generation, thus \(N_{e}\) is lower than the consensus head count. Assuming all other conditions of a Wright-Fisher population are met then The Breeding Sex Ratio can alter \(N_{e}\) as such:

  • \[N_{e}=4\frac{N_{m}N_{f}}{N_{m}+N_{f}}\]

where \(N_{m}\) and \(N_{f}\) are the number of breeding males and females respectively.

For example in Elephant seals 21 males successfully mated (although 71 were observed, but didn’t mate) with 550 females. A highly skewed breeding ratio. The impact on \(N_{e}\) is really quite profound:

\[N_{e}=4\frac{(21)(550)}{21+550} = 80.910683\]

This essentially means that a consensus population of 625 individuals has an \(N_{e}\) equivalent of an ideal Wright-Fisher population of 40 males and 40 females when breeding ration is 1:1….

Variation in family size can also reduce Ne

Variation in the number of offspring produced by individuals..

\[Ne = \frac{4N_{t-1}}{var(k) + \bar{k}^2-k}\]

Where \(N_{t-1}\) is the size of the original population and \(k\) is the number of gametes resulting in progeny (family size). When Variance in family size is equal to the mean there is no family size bottleneck.

The Wright-Fisher model assumes that family size follows a Poisson distribution.

Paralelisms between Genetic Drift and Inbreeding

Mating in small pops can be thought of as a form of inbreeding. For example, in a infinite populaion you will not breed with your relatives, but as pop size decreases then you have a higher change of randonly picking your relative and the amount of inbreeding will increase. Genetic Drift and inbreeding have some commonalities, for example:

  1. Both have a greater impact in smaller populations and..
  2. Both lead to an increase in Homozygosity.

Inbreeding can be examined in terms of its effects on the Fixation Index \(F_{t}\).

  • \[F_{t} = \frac{1}{2N_{e}} + \bigg(1-\frac{1}{2N_{e}}\bigg)F_{t-1}\]

The first term is the chance of picking identical aleles by descent (autozygostity). The second term takes the probability of picking a non autozygous allele \(\big(1-\frac{1}{2N_{e}}\big)\), that is actually autozygous due to inbreeding in the previous generation \(F_{t-1}\).

After some re-arrangement the Heterozygosity of generation \(t\) is:

  • \[H_{t}=\bigg(1-\frac{1}{2N_{e}}\bigg)^{t}H_{0}\].

Where \(H_{0}\) is the initial heterozygosity and \(H_{t}\) is the heterozygosity in the new generation.

There is an important distiction between inbreeding and Genetic drift:

  • Drift causes decreases in \(H\) (heterozygosity) due to the fixation/loss of alleles.
  • Inbreeding causes decreases in \(H\) due to changes in genotype frequencies not via loss of alleles.


Estimating effective population size

Here we will consider three estimates of effective population size:

  • Inbreeding effective population size \(N^{i}_{e}\) which is the the size of an idealised Wright-Fisher population that exhibits the same probabiltiy of two alleles being identical by decent (IBD) as the actual REAL population.

  • Varience effective population size \(N^{i}_{v}\), which is the size of an idealised Wright-Fisher population that has the same sampling vareince in allele frequency as the REAL population.

For Inbreeding effective population size..

In an ideal population the probability of sampling an two alleles the are IBD is equal to \(\frac{1}{2N^{i}_{e}}\), so…

\[ P(IBD)=\frac{1}{2N^{i}_{e}}\]

After re-arranging for \(N^{i}_{e}\) we get..

\[N^{i}_{e}=\frac{1}{2p(IBD)}\]

THe second equation shows us the probability of IBD is dependent on the population size (remember IBD is a measure of inbreeding and is not the same as identity by state). Thus the lielyhood of draeing two alleles which are IBD is greater in small populations. Thus the \(N_{e}\) has been defined by the expected rate of inbreeding, it is the Inbreeding effective population size.

For Varience effective population size

This estimate makes use of the variation in allele frequencies in multiple replicate populations. For example the varience in allele frequency changes in a group of replicate populations can be defined as:

\[Varience(\Delta{p})=\frac{p_{t-1}q_{t-1}}{2N^{i}_{v}}\]

Where \(p\) and \(q\) are allele frequencies at a diallelic locus. The equation can be re-arranged to give \(N^{i}_{v}\).

\[N^{i}_{v}=\frac{pq}{2\times variance(\Delta{p})}\]

This is similar to the above, where we have defined \(N^{i}_{v}\) based on the expected varience in allele frequency changes for a given population.

These effective population sizes estimates are NOT always equivalent as they measure different aspects of the population.

For Breeding Effective Population Size

The above to parameter estimates assume descrete population entities, but what if a population is spread continously over a large geographical area and dispersal of individuals and/or gametes is limited (in a population of plants for example). In this case population size can be defined as the average dispersal and mating patterns. This approach is called the Breeding Effective Population Size.

In effect individuals in this population are essentially isolated by distance and dispersal can follow a normal distribution. For example

This can alse be visualized in 2-Dimensions..

library(mvtnorm)
## Warning: package 'mvtnorm' was built under R version 3.1.3
library(graphics)
## Choose the matrix dimensions
yticks <- xticks <- seq(-3, 3, length=100)
side <- diff(yticks[1:2])  # side length of squares
sigma <- diag(2)               # standard devs. for f2
mu <- c(0,0)                # means

## Using pnorm
f <- Vectorize(function(x, y, side, mu1, mu2, s1, s2)
    diff(pnorm(x+c(-1,1)*side/2, mu1, s1)) * diff(pnorm(y+c(-1,1)*side/2, mu2, s2)),
    vec=c("x", "y"))
## Using pmvnorm
f2 <- Vectorize(function(x, y, side, mu, sigma)
    pmvnorm(lower=c(x,y)-side/2, upper=c(x,y)+side/2, mean=mu, sigma=sigma),
                vec=c("x", "y"))

## get prob. of squares, mu are means, s are standards devs.
mat <- outer(xticks, yticks, f, side=side, mu1=0, mu2=0, s1=1,s2=1)
mat2 <- outer(xticks, yticks, f2, side=side, mu=mu, sigma=sigma)

## test equality
all(abs(mat2-mat) < 1e-11)  # TRUE
## [1] TRUE
all.equal(mat2, mat)        # TRUE
## [1] TRUE
## See how it looks
library(lattice)
## Warning: package 'lattice' was built under R version 3.1.3
persp(mat, col="lightblue", theta=35, phi=35, shade=0.1,xlab="x spatial dimension",ylab="y spatial dimension",zlab="probability of gamete dispersal",cex.lab=0.8)

In this case \(N^{b}_{e}\) can be defined as:

  • \[N^{b}_{e}=\pi\big(2\sqrt{dispersal variance}\big)^{2}d\]

Which can then be simplified to:

  • \[N^{b}_{e}=4\pi\big(dispersal\ SD\big)d\]

where \(d\) is the density of individuals. The area described by the above equation is called the genetic heighbourhood. Multiplying the genetic neighbourhood by the density of indivudals gives you the breeding effective population size.

Important terms

  • Breeding effective population size \(N^{b}_{e}\) = the number of individuals found in a genetic neighbourhood, which is defined as the vareince in gamete dispersal.

  • Deme The largest areas or collection of individuals where mating is random.

  • Genetic neighbourhood An areas or sub-unit of a population within whxih mating is random.

  • Isolation by distance The decrease in the probability of gamete union due to physical distance.

Effective population size of different genomes

The \(N_{e}\) of plastid and mitochondrial genomes is \(1/4\) of that of the nuclear genome. This is because of two independent reasons:

  1. There is only a haploid copy per organelle, and organelles are inherited singularly, give or take.

  2. They are only inherited through one line, usually the maternal line.

Each of there reduces \(N_{e}\) by \(1/2\).

Gene genealogies and the coalescent model

Some important terms:

  1. Coalescence or Coalescent event: The point in time where a pair of lienages of genealogies trace back to a single common lineage.

  2. Genealogy The record of ancestor-descendnet relationships for a family or locus.

  3. Waiting time The mean expected waiting time back in the past until a single coalescent even in a sample of lineages.

Under an idealised Wright-Fisher population the probability that two lineages coalescence is:

\[ \frac{1}{2N}\]

and the probability that they do not is:

\[ 1-\frac{1}{2N}\]

When the coalescent event occurs the number of lineages is reduced by one.

The coalescence displays the Markov property in that the outcome in the next generation depends only on the situation in the current generation. This means that the basic probabilities of coalescence can be caluculated over an arbitrary number of time-points (generations).

This sueful equation demostrates the probability of coalscence in \(t\) generations is:

\[\bigg(1-\frac{1}{2N}\bigg)^{t-1} \frac{1}{2N}\]

for example in a population of N=10 the chance that two radom alleles will coalesce in four generations is the liklighood they DO NOT coalesce in the first three generation (going backwars of course) \(\bigg(1-\frac{1}{2N}\bigg)^{3}\) but DO coalesce in the final fourth generation \(\frac{1}{2N}\).

The cumulative probability of two random alleles coalescing in t generations is estimated by the exponential function:

\[P(T_{C}\leq t) = 1-e^{-\frac{1}{2N}t}\]

Importantly, the average time for \(k\) lineages to coalesce in a population of \(2N\) indivduals is:

\[\frac{2N}{\frac{k(k-1)}{2}}\]

It is important to note that this is the time for any two of these \(k\) lineages to coalesce.

The basic natire of the coalesence is the same for all poplutations no matter what the size. Population size just scales the time along which coalescence occurs.

Height of the coalescent tree

The time from the present to the point in the past where all \(k\) lineages have coalescend and found their MRCA. This is hiehgt is simplay the sum of the waiting times for all coalescent events in the tree.

\[E(H_{k}) = 2\bigg(1-\frac{1}{k}\bigg)\].

Coalescent genalogies abd population bottle necks

Just like the effects of Genetic drift coalescent probabilities in populations can be calculated using the harmonic mean of \(N_{e}\) is the population fluctates in size.

For example the harmonic mean of a population with equal time at size 100-10-100 is ~25 and thge probability of two alleles NOT coalescing over 3 generations:

\[P_{NC}\approx e^{-\frac{3}{2(25)}}\].

Thus the chance of coalescing over the three generations is 1-P_{NC}

Coalescent genealogies in growing and shrinking populations

Working backwards in time, if a population is shrinking, then the probability of a coalescent is increasing as \(N\) is shrinking and the probability of a coalescent event is \(\frac{1}{2N}\).

The result is that expanding population tends to be the reverse of stable population in that the longest perionds between coalescent events occur towards the tips of the tree where \(N\) is larger (thus prob of coalescent is lower).

Take a look at these picutres to get a flavor of what I mean:

Coalescent tree and popluation growth 1 Coalescent tree and popluation growth 2