I would calculate the mean number of seeds that germinated for the wild-type plants and assume the counts are normally distributed. I would then calculate the mean number of seeds that germinated for the GMO plants and compute a p-value for that mean, to see whether the difference is statistically significant or could plausibly have arisen by chance.
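As a rough sketch of that plan, the comparison of two means could be run as a Welch two-sample t-test, which is a swapped-in, standard way to get a p-value under the normality assumption. The counts below are simulated purely for illustration (they are not the lab data), and the names `wild`, `gmo`, and `res` are hypothetical.

```r
set.seed(1)
# Simulated germination counts per replicate (illustrative only, NOT the lab data)
wild <- rnorm(20, mean = 25, sd = 5)
gmo <- rnorm(20, mean = 30, sd = 5)
# t.test() runs Welch's unequal-variance test by default
res <- t.test(gmo, wild, alternative = "two.sided")
res$p.value
```

A small p-value here would suggest the two means differ by more than chance alone would explain.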
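# Load the germination counts
d1 <- read.csv("http://faraway.neu.edu/biostats/lab3_dataset1.csv")
# Column means, standard errors, and 95% CI half-widths (normal approximation)
mean_table <- apply(d1, 2, mean)
se_table <- apply(d1, 2, sd) / sqrt(apply(d1, 2, length))
confidence_interval <- se_table * 1.96
# Columns 2:3 are plotted as the germinated means, columns 4:5 as the failed means
plot.heights <- as.matrix(cbind(as.numeric(mean_table[2:3]),
                                as.numeric(mean_table[4:5])))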
plot.heights
## [,1] [,2]
## [1,] 80.9 19.1
## [2,] 25.7 74.3
bp <- barplot(plot.heights,
              beside = TRUE,
              names.arg = c("GMO germ", "Wild germ", "GMO fail", "Wild fail"),
              # leave headroom for the upper error bars
              ylim = c(0, max(plot.heights + confidence_interval[2:5]) + 2),
              ylab = "Mean number of seeds")
# 95% confidence intervals drawn as error bars
arrows(x0 = bp, x1 = bp,
       y0 = plot.heights - confidence_interval[2:5],
       y1 = plot.heights + confidence_interval[2:5],
       angle = 90, code = 3, length = 0.1, col = "dark red")
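My extremely hard work in the lab was worth it. GMO plants germinated at a much higher rate: on average 80.9 GMO seeds germinated versus 25.7 wild-type seeds, and far fewer GMO seeds failed to germinate (19.1 versus 74.3).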
H0: The proportion of seeds that germinate is the same for GMO plants and wild plants (odds ratio = 1). HA: The proportion of seeds that germinate is higher for GMO plants than for wild plants (odds ratio > 1).
# Pool counts across rows. Table rows: GMO, wild; columns: germinated, failed
sum_table <- colSums(d1)[2:5]
fisher_table <- as.matrix(cbind(as.numeric(sum_table[1:2]),
                                as.numeric(sum_table[3:4])))
fisher.test(fisher_table, alternative="greater")
##
## Fisher's Exact Test for Count Data
##
## data: fisher_table
## p-value < 2.2e-16
## alternative hypothesis: true odds ratio is greater than 1
## 95 percent confidence interval:
## 10.18571 Inf
## sample estimates:
## odds ratio
## 12.22468
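There is overwhelming evidence that GMO seeds germinate at a higher rate than wild seeds. With a p-value below 2.2e-16 and an estimated odds ratio of about 12.2, a difference this large would be extremely unlikely if germination were independent of plant type, so we reject H0.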
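As a sanity check on the reported odds ratio, the sample odds ratio can be recomputed by hand from the mean counts printed in `plot.heights`. This sketch assumes every column has the same number of rows, so that means and sums give the same ratio; the names `germ`, `odds_gmo`, `odds_wild`, and `odds_ratio` are introduced here for illustration. `fisher.test` reports a conditional maximum likelihood estimate, so its value differs slightly from this simple cross-product ratio.

```r
# Mean counts copied from the plot.heights output above
germ <- matrix(c(80.9, 25.7, 19.1, 74.3), nrow = 2,
               dimnames = list(plant = c("GMO", "wild"),
                               outcome = c("germinated", "failed")))
# Odds of germinating within each plant type
odds_gmo <- germ["GMO", "germinated"] / germ["GMO", "failed"]
odds_wild <- germ["wild", "germinated"] / germ["wild", "failed"]
# Sample (cross-product) odds ratio
odds_ratio <- odds_gmo / odds_wild
odds_ratio
```

The result is close to 12.2, in line with the estimate from the test.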
d2 <- read.csv("http://faraway.neu.edu/biostats/lab3_dataset2.csv")
head(d2)
## countries gmo.disease gmo.nodisease nogmo.disease nogmo.nodisease
## 1 India 45 40 15 31
## 2 Vietnam 59 42 27 23
## 3 Brazil 58 44 30 31
## 4 South Africa 52 44 21 29
## 5 Cambodia 39 51 22 25
## 6 Ivory Coast 53 50 23 24
H0: There is no association between GMO use and disease incidence (odds ratio = 1). HA: Disease incidence is higher with GMO use (odds ratio > 1).
# Pool counts across countries. Table rows: disease, no disease; columns: GMO, no GMO
d2_sum_table <- colSums(d2[, -1])
d2_fisher_table <- as.matrix(cbind(as.numeric(d2_sum_table[1:2]),
                                   as.numeric(d2_sum_table[3:4])))
fisher.test(d2_fisher_table, alternative="greater")
##
## Fisher's Exact Test for Count Data
##
## data: d2_fisher_table
## p-value = 0.02954
## alternative hypothesis: true odds ratio is greater than 1
## 95 percent confidence interval:
## 1.027284 Inf
## sample estimates:
## odds ratio
## 1.240706
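There is evidence that GMO use is associated with disease incidence when the counts are pooled across countries. With a p-value of 0.030 and an alpha of 0.05, we can reject H0 and conclude that disease incidence is higher with GMO use, although the estimated odds ratio (1.24) indicates a modest effect.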
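# Two-sided Fisher's exact test for each country separately
pvals <- numeric(nrow(d2))
for (i in seq_len(nrow(d2))) {
  # Rows: disease, no disease; columns: GMO, no GMO
  fisher_matrix <- cbind(t(d2[i, 2:3]),
                         t(d2[i, 4:5]))
  pvals[i] <- fisher.test(fisher_matrix)$p.value
}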
pvals
## [1] 0.0288584 0.7271239 0.4170731 0.2219837 0.7204465 0.8606783 0.3042649
## [8] 0.7393854 0.7224751 1.0000000
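To make the per-country step concrete, here is a minimal sketch that builds the 2x2 table for a single country by hand, using India's counts as printed by `head(d2)`. The object names `india` and `india_test` and the dimension labels are introduced here for illustration only. The resulting p-value should match the first entry of `pvals`.

```r
# India's counts copied from head(d2): gmo.disease, gmo.nodisease,
# nogmo.disease, nogmo.nodisease (matrix() fills column by column)
india <- matrix(c(45, 40, 15, 31), nrow = 2,
                dimnames = list(outcome = c("disease", "no disease"),
                                crop = c("GMO", "no GMO")))
# fisher.test() is two-sided unless 'alternative' is set
india_test <- fisher.test(india)
india_test$p.value
```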
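Only one country (India, p = 0.029) shows a significant association between GMO use and disease incidence at an alpha of 0.05. In the other nine countries the p-values range from 0.22 to 1, so we fail to reject H0 there.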
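Because ten separate tests are run, one might also adjust for multiple comparisons. The sketch below applies a Bonferroni correction with `p.adjust`; this step is an addition for illustration and is not part of the analysis above, and the name `p_bonf` is hypothetical. The p-values are copied from the printed output.

```r
# Per-country p-values copied from the output above
pvals <- c(0.0288584, 0.7271239, 0.4170731, 0.2219837, 0.7204465,
           0.8606783, 0.3042649, 0.7393854, 0.7224751, 1.0000000)
# Bonferroni multiplies each p-value by the number of tests, capped at 1
p_bonf <- p.adjust(pvals, method = "bonferroni")
# Number of countries still significant at alpha = 0.05 after adjustment
sum(p_bonf < 0.05)
```

Under this adjustment India's p-value becomes about 0.29, so no single country shows a significant association on its own.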