1000 random values of \(\theta\).
thetas<-runif(1000,1,100)
write.table(thetas,file="~/src/msdir/file.tbs",col.names=F,row.names=F,quote=F)
Use these 1,000 diversity values to simulate a constant size neutral population. Then calculate summary statistics using msstats
system("~/src/msdir/ms 10 1000 -t tbs < ~/src/msdir/file.tbs | ~/src/msstats/src/msstats | cut -f 4,5,8 > ~/src/msdir/sumstats.txt")
Read in data, plot relationship between Tajima’s D and diversity. Do a simple linear regression.
sumstats<-read.table("~/src/msdir/sumstats.txt",header=T)
diversity<-scan("~/src/msdir/file.tbs")
data<-cbind(sumstats,diversity)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.4
ggplot(data,aes(y=tajd,x=diversity))+geom_point()+geom_smooth(method="loess")+coord_cartesian(ylim = c(-2, 2))
summary(lm(data=data,tajd~diversity))
##
## Call:
## lm(formula = tajd ~ diversity, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.94321 -0.59504 -0.03147 0.57605 2.69632
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.1024097 0.0572462 -1.789 0.0739 .
## diversity 0.0001055 0.0009706 0.109 0.9134
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8712 on 998 degrees of freedom
## Multiple R-squared: 1.184e-05, Adjusted R-squared: -0.0009901
## F-statistic: 0.01182 on 1 and 998 DF, p-value: 0.9134