Response to selection in teosinte assimilation

Libraries

Only one for now

library(lattice)

Data

Download data from Dolores’s growouts. 360ppm and 215ppm branch lengths (3 years data lumped together).

system("wget -O b360.txt https://gist.github.com/rossibarra/5964105/raw/71adcff6492367aa7608ca561df3c4a6f4c2716f/bl360")
bl360 <- scan("b360.txt")
system("wget -O b215.txt https://gist.github.com/rossibarra/5964126/raw/980f82bc928e45529c12ccb0dc74075e74a6eea4/bl215")
bl215 <- scan("b215.txt")

Now we estimate pehnotypic variance \(\sigma^2P \approx 0.03\) and mean \(\mu \approx 0.42\) for branch length at 215ppm. Also look at data.

Vp = var(bl215)
Vp
## [1] 0.03434
bmu = mean(bl215)
bmu
## [1] 0.3707
hist(bl215, xlab = "branch length 215ppm", main = "")

plot of chunk unnamed-chunk-3

And repeatfor 360ppm, checking for normality here by comparing to curve.

Vp = var(bl360)
Vp
## [1] 0.0298
bmu = mean(bl360)
bmu
## [1] 0.4165
hist(bl360, xlab = "branch length 360ppm", main = "",freq=F)
curve(dnorm(x,mean=0.4165313,sd=0.1726368),from=0,to=1,add=T)

plot of chunk unnamed-chunk-4

Functions

This gets mean of shortest branch length selected plants of sample size x

get_s <- function(x,data) {
    mean(data) - mean(data[1:x])
} 

This gives response to selection

resp <- function(h2, s) {
    h2 * s
}

Following p. 210 in Falconer & Mackay to estimate sampling variance in response to selection. \(\sigma^2_R=\frac{\sigma^2_Pth^2}{Ne}+\frac{1}{M}\)

Vr <- function(Ne, h2, Vp, t, M) {
    Vp * t * h2/Ne + 1/M
}

Analysis

Params and sample data

We will fix some number of plants in a chamber (num_plants below). We have no idea what \(h^2\) or \(s\) are, so try lots. But \(s=\frac{s_f+s_m}{2}\) and because we phenotype after flowering, there is no selection on males (\(s_m=0\)) and \(s=\frac{s_f}{2}\). To get \(s\) we decide how many shortest branch plants we choose. We’ll try \(n=1-20\) shortest plants from our population of plants and estimate \(s\) as if we were sampling from a normal. Note that this \(n\) coresponds roughly to \(N_e\) so we’ll call it that!

num_plants = 48  # Ne in our theoretical experimental pop
Ne = c(1:20) # number of adults kept for next generation.
h2 = 1:10/10  # 5-100% h^2 in 5%

Get our simulated data by averaging over resamples from observed (ugly, yes, but it’ll do)

#sorted_bl360 = bl360[order(bl360)]  # sort observed data <- now using simulated data instead of observed
#really ugly way to get a "normal" set of plants of size n=24 drawn from the data (or from the normal, now commented out)
#for(i in 1:1000){ bob[[i]]=rnorm(24,mean=0.4165313,sd=0.1726368)}
bob=list()
for(i in 1:1000){ bob[[i]]=sample(bl360,num_plants,replace=TRUE)}
bob=lapply(1:1000,function(x) bob[[x]][order(bob[[x]])])
plants=(rep(0,num_plants));
for(i in 1:1000){ plants=plants+bob[[i]] }
plants=plants/1000

Now we can calculate our selection strength \(s\)

s = sapply(Ne, function(x) get_s(x,plants))/2  # divide by two because of s=sf/2 above

Results

\(R=h^2s\) is easy enough. This shows that the fewer plants you select (smaller \(N_e\)) the bigger your response to selection is – smaller branch lengths in next generation. This makes sense – we expect a bigger shift if we choose progeny only from the 2 plants with the shortest branch lengths than if we choose from the bottom 50% say.

R = sapply(h2, function(x) resp(h2 = x, s = s))
rownames(R) = Ne
colnames(R) = h2
wireframe(R, scales = list(arrows = F), xlab = expression(N[e]), ylab = expression(h^2))

plot of chunk unnamed-chunk-11

Calculate sampling variance in Response to selection

Assume \(t=1\) generations. Total plants measured \(M\) is our variable num_plants. Have to use range of \(h^2\) as above, and now substitute in \(N_e\) from above too. This gives us \(\sigma^2_R\).

est_vr = sapply(h2, function(x) Vr(h2 = x, Ne = Ne, t = 1, M=num_plants, Vp=var(plants)))
rownames(est_vr) = Ne
colnames(est_vr) = h2
wireframe(est_vr, scales = list(arrows = F), screen = list(z = 230, x = -70, 
    y = 0), zlab = expression(sigma[R]^2), ylab = expression(h^2), xlab = expression(N[e]))

plot of chunk unnamed-chunk-12

This graphs shows that as we select harder (smaller \(N_e\)) our variance goes up because of genetic drift.

Confidence Intervals

With \(h^2\) and \(s\) (and our bad estimate of \(N_e\) we can get \(R\) and \(\sigma^2_R\) which let’s us calculate a confidence interval.

We find the lower bound of the confidence interval, which we will call \(R_{min}\), assuming the confidence interval is \(R \pm 2∗\sigma_R\). We want \(R_{min}>0\) in order to be able to have confidence that our selection program will yield nonzero results.

sd_R = est_vr^0.5
Rmin = R - 2 * sd_R
rownames(Rmin) = Ne
colnames(Rmin) = h2
wireframe(Rmin, zlab = list("Rmin", rot = 90), scales = list(arrows = F), xlab = expression(N[e]), 
    ylab = expression(h^2))

plot of chunk unnamed-chunk-13

This graph shows that for even our confidence interval on the response to selection after one generation includes 0. Which means we can’t rule out genetic drift (let alone show statistical difference to a different selection experiment under different conditions). We get a peak in the middle as selecting harder gives a stronger response, but also leads to more drift, so there’s an otpimum level.

Multiple generations

Here we let the experiment run for \(t=5\) generations. We have to make some assumptions. We don’t know how strong selection (difference between final and initial mean) we can really do. Let’s assume \(3x\) original difference, but \(N_e\) stays same. This is equivalent of saying we get 3 times as much response over 5 generations as we see in the original 1 generation.

R = sapply(h2, function(x) resp(h2 = x, s = 3 * s))  #re-estimate response with 3X s 
est_vr = sapply(h2, function(x) Vr(h2 = x, Ne = Ne, M=num_plants, Vp=var(plants), t = 5))  #var. response now over 5 gens
rownames(est_vr) = Ne
colnames(est_vr) = h2
sd_R = est_vr^0.5
Rmin = R - 2 * sd_R
rownames(Rmin) = Ne
colnames(Rmin) = h2
wireframe(Rmin, zlab = list("Rmin_5gens", rot = 90), scales = list(arrows = F), 
    xlab = expression(N[e]), ylab = expression(h^2))

plot of chunk unnamed-chunk-14

Not so great outlook. \(R_{min}<0\) unless \(h^2\) is really high. This means that unless we have confidence that \(h^2\) is high (which it almost never is for single-plant traits), there is no gaurantee we will see any response to selection.

Finally, if we claim we can make 5 times the single generation progress in 5 generations, we see:

R = sapply(h2, function(x) resp(h2 = x, s = 5 * s))  #re-estimate response with 3X s 
est_vr = sapply(h2, function(x) Vr(h2 = x, Ne = Ne, M=num_plants, Vp=var(plants), t = 5))  #var. response now over 5 gens
rownames(est_vr) = Ne
colnames(est_vr) = h2
sd_R = est_vr^0.5
Rmin = R - 2 * sd_R
rownames(Rmin) = Ne
colnames(Rmin) = h2
wireframe(Rmin, zlab = list("Rmin_5gens", rot = 90), scales = list(arrows = F), 
    xlab = expression(N[e]), ylab = expression(h^2))

plot of chunk unnamed-chunk-15

This looks much more positive, but makes some very generous assumptions about how well we can select while maintaining \(N_e\) and still only works for \(h^2>0.4\).