Abstract: This analysis estimates the T-DNA insertion number of transgenic Arabidopsis lines. It infers the number of insertions from the 3:1 segregation ratio of a selectable marker (Kan+, for instance) in T1 seeds.
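Statistical model: The number of selectable-marker-free plants (x) follows a binomial distribution, where n is the total number of seeds tested and p equals 0.25 when there is only one insertion. With t independent insertions, p = 1/4^t. The null hypothesis is that the insertion number (t) is greater than or equal to 2, and the alternative hypothesis is that t is 1.
H0: t >= 2
H1: t = 1
X ~ binom(n, 1/4^t)
Set: alpha = 0.05
Because 1/4^t shrinks as t grows, t = 2 is the case under H0 that produces the most marker-free plants, so the cutoffs are computed with p = 1/16.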
When n = 55 and t = 2, X ~ binom(55, 1/16), and the distribution looks like this:
[Figure: probability distribution of X ~ binom(55, 1/16)]
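As a minimal sketch, the same distribution can be tabulated as a text histogram. The code below is Python using only the standard library; the helper names `marker_free_prob` and `binom_pmf` are illustrative and not part of the original analysis.

```python
from math import comb

def marker_free_prob(t):
    """Probability that a seedling carries no marker copy, assuming t independent insertions."""
    return 0.25 ** t

def binom_pmf(k, n, p):
    """P(X = k) for X ~ binom(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, t = 55, 2
p = marker_free_prob(t)  # 1/16 for two independent insertions
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Print the first few counts; one '#' per percentage point of probability.
for k in range(13):
    bar = "#" * round(pmf[k] * 100)
    print(f"{k:2d}  {pmf[k]:.4f}  {bar}")
```

Most of the probability mass sits at small counts, which is why a large number of marker-free plants is evidence against two or more insertions.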
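If there are 2 independent insertions, there is a 99% chance of observing 8 or fewer selectable-marker-free plants. So if we observe more than 8 selectable-marker-free plants out of 55, H0 is rejected and we assume the line carries a single insertion. Note that this 99% threshold corresponds to a significance level of 0.01, which is stricter than the alpha = 0.05 stated above; at alpha = 0.05 the cutoff for n = 55 would be 7.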
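The decision rule for n = 55 can be checked numerically. The following is a sketch under the model above (p = 1/16 for t = 2); `binom_cdf` is an illustrative helper and `observed = 9` is a hypothetical count, not data from the original analysis.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ binom(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n = 55
p = 0.25 ** 2  # marker-free probability with two independent insertions

# Probability of seeing 8 or fewer marker-free plants if t = 2.
print(f"P(X <= 8 | t = 2) = {binom_cdf(8, n, p):.4f}")

# Hypothetical observation: 9 marker-free plants out of 55.
observed = 9
p_value = 1 - binom_cdf(observed - 1, n, p)  # P(X >= observed | t = 2)
print(f"P(X >= {observed} | t = 2) = {p_value:.4f}")
print("reject H0 (assume single insertion)" if p_value < 0.01 else "do not reject H0")
```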
If you are looking for single-insertion lines, the cutoff numbers of selectable-marker-free plants are as follows:
##      n cutoff
## 1   30      5
## 2   35      6
## 3   40      7
## 4   45      7
## 5   50      8
## 6   55      8
## 7   60      9
## 8   65      9
## 9   70     10
## 10  75     10
## 11  80     11
## 12  85     11
## 13  90     12
## 14  95     12
## 15 100     12
## 16 105     13
## 17 110     13
## 18 115     14
## 19 120     14
In the table, “n” is the total number of plants you screened, and “cutoff” is the threshold: if you observe strictly more than this number of selectable-marker-free plants, you assume the line carries a single insertion.
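A cutoff table of this kind can be generated with a short script. The sketch below assumes each cutoff is the smallest k with P(X <= k) >= 0.99 under t = 2; that level is inferred from the 99% statement above rather than taken from the original code, and it reproduces the n = 30 and n = 55 rows. The function names are illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ binom(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def cutoff(n, t=2, level=0.99):
    """Smallest k such that P(X <= k) >= level, assuming t independent insertions."""
    p = 0.25 ** t
    k = 0
    while k < n and binom_cdf(k, n, p) < level:
        k += 1
    return k

# Same grid of sample sizes as the table: 30, 35, ..., 120.
table = [(n, cutoff(n)) for n in range(30, 121, 5)]

print("  n  cutoff")
for n, c in table:
    print(f"{n:3d}  {c:6d}")
```

Passing a different `level` (for example 0.95) shows how the cutoffs move with the chosen significance level.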