d1=read.csv("http://faraway.neu.edu/biostats/lab5_dataset1.csv")
d1
## treatment mortality
## 1 control 1.4528777
## 2 control 3.2662526
## 3 control 1.1786520
## 4 control 13.4003497
## 5 control 3.7791827
## 6 control 1.1966567
## 7 control 4.4257026
## 8 control 5.6878067
## 9 control 4.8345176
## 10 control 2.0029310
## 11 control 12.3268667
## 12 control 4.0142207
## 13 control 1.4604716
## 14 control 0.2967991
## 15 control 8.3723191
## 16 control 2.5988431
## 17 control 2.6746265
## 18 control 6.9854975
## 19 control 6.1794001
## 20 control 4.9229174
## 21 control 6.8139867
## 22 control 5.9425379
## 23 control 2.9287186
## 24 control 0.3718177
## 25 control 5.0522099
## 26 control 2.5699110
## 27 control 2.3261266
## 28 control 0.6245322
## 29 control 1.6851422
## 30 control 4.1286132
## 31 t1 14.2774240
## 32 t1 3.3108742
## 33 t1 5.4068767
## 34 t1 3.4770873
## 35 t1 0.9258347
## 36 t1 2.4229976
## 37 t1 2.4736877
## 38 t1 3.4579869
## 39 t1 11.0234561
## 40 t1 7.8709262
## 41 t1 3.1126561
## 42 t1 2.8480607
## 43 t1 7.3666523
## 44 t1 6.4023378
## 45 t1 1.8427229
## 46 t1 1.8085128
## 47 t1 5.2834641
## 48 t1 7.9132053
## 49 t1 3.2793781
## 50 t1 8.8561110
## 51 t1 5.4635889
## 52 t1 1.9896796
## 53 t1 5.1609449
## 54 t1 1.1860600
## 55 t1 15.3793193
## 56 t1 26.5864024
## 57 t1 2.5415612
## 58 t1 1.2915788
## 59 t1 6.4864775
## 60 t1 3.2057478
## 61 t2 81.5827433
## 62 t2 7.1047246
## 63 t2 14.7278368
## 64 t2 7.5988898
## 65 t2 3.5139009
## 66 t2 8.9244286
## 67 t2 1.2153613
## 68 t2 31.9942072
## 69 t2 8.6128333
## 70 t2 64.8846884
## 71 t2 11.8877627
## 72 t2 3.6329812
## 73 t2 13.6089322
## 74 t2 2.9034578
## 75 t2 2.1093221
## 76 t2 9.8892295
## 77 t2 4.7431816
## 78 t2 7.3972281
## 79 t2 7.9593021
## 80 t2 4.0979181
## 81 t2 4.1842659
## 82 t2 6.4547829
## 83 t2 24.0007960
## 84 t2 1.6103205
## 85 t2 13.3824773
## 86 t2 10.3083101
## 87 t2 21.3937717
## 88 t2 5.4510927
## 89 t2 10.6975935
## 90 t2 9.6513595
## 91 t3 1.5800871
## 92 t3 9.0963006
## 93 t3 8.6746295
## 94 t3 5.4751170
## 95 t3 13.2876290
## 96 t3 4.7516239
## 97 t3 0.7583637
## 98 t3 1.5322459
## 99 t3 0.7988256
## 100 t3 1.6931647
## 101 t3 1.4617485
## 102 t3 2.8352096
## 103 t3 1.0931663
## 104 t3 3.1836514
## 105 t3 1.4125765
## 106 t3 15.9154012
## 107 t3 5.5661715
## 108 t3 6.7542655
## 109 t3 3.9915729
## 110 t3 14.6168662
## 111 t3 1.4394535
## 112 t3 1.7131868
## 113 t3 11.3848354
## 114 t3 1.4180797
## 115 t3 2.2091753
## 116 t3 1.8352708
## 117 t3 1.9738918
## 118 t3 2.0562557
## 119 t3 4.4557185
## 120 t3 2.2765691
## 121 t4 7.3451669
## 122 t4 46.6670723
## 123 t4 9.8298197
## 124 t4 10.1801879
## 125 t4 11.0210740
## 126 t4 24.8452430
## 127 t4 11.3184665
## 128 t4 11.7325359
## 129 t4 6.1616187
## 130 t4 8.8086107
## 131 t4 12.9378929
## 132 t4 6.7605586
## 133 t4 20.7282228
## 134 t4 2.6687386
## 135 t4 16.5528429
## 136 t4 2.6209849
## 137 t4 9.0162082
## 138 t4 7.1830214
## 139 t4 6.3465110
## 140 t4 11.5086994
## 141 t4 1.7961412
## 142 t4 39.5111659
## 143 t4 2.3048776
## 144 t4 7.6635061
## 145 t4 3.9911519
## 146 t4 5.7498916
## 147 t4 98.2157457
## 148 t4 12.3962700
## 149 t4 3.3659137
## 150 t4 2.3617302
H0: The biological control agent has no effect in increasing the mortality of the pest. HA: The biological control agent has a significant effect in increasing the mortality of the pest.
par(mfrow = c(1, 2))
hist(d1$mortality[d1$treatment == "t1"])
hist(d1$mortality[d1$treatment == "t2"])
The distribution is right skewed for all 4 treatments.
par(mfrow = c(1, 2))
hist(log10(d1$mortality[d1$treatment == "t1"]))
hist(log10(d1$mortality[d1$treatment == "t2"]))
I would use the transformed data, as this significantly helped normality.
mod.aov = aov(mortality~treatment, data=d1)
summary(mod.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## treatment 4 2959 739.8 5.069 0.000749 ***
## Residuals 145 21162 145.9
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There is a difference in the mean mortality of the pest across the different groups. With the F value at 5.069 and the significance code at ***, which means that the p about 0, that means that we can reject the null hypothesis.
come back to this – i actually don’t know
mod.tukey = TukeyHSD(mod.aov)
print(mod.tukey)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = mortality ~ treatment, data = d1)
##
## $treatment
## diff lwr upr p adj
## t1-control 1.6383709 -6.9782430 10.2549847 0.9846614
## t2-control 9.4007737 0.7841599 18.0173876 0.0249811
## t3-control 0.3913522 -8.2252617 9.0079661 0.9999435
## t4-control 9.9363128 1.3196989 18.5529266 0.0150140
## t2-t1 7.7624029 -0.8542110 16.3790168 0.0988879
## t3-t1 -1.2470187 -9.8636325 7.3695952 0.9945629
## t4-t1 8.2979419 -0.3186720 16.9145558 0.0650864
## t3-t2 -9.0094215 -17.6260354 -0.3928077 0.0355977
## t4-t2 0.5355390 -8.0810749 9.1521529 0.9998032
## t4-t3 9.5449606 0.9283467 18.1615744 0.0218409
plot(mod.tukey)
There are 3 mean pairs that have significant difference, t4 and t3, t3
and t2, t4 and the control, and t2 and the control. This is because t3
and the control are very close, and t4 and t2 are different from
both.
#install.packages("multcompView")
require(multcompView)
## Loading required package: multcompView
labels=multcompLetters(mod.tukey$treatment[, "p adj"])$Letters
labels
## t1 t2 t3 t4 control
## "ab" "a" "b" "a" "b"
# Compute means
means=aggregate(mortality ~ treatment, mean, data=d1)
means
## treatment mortality
## 1 control 4.116683
## 2 t1 5.755054
## 3 t2 13.517457
## 4 t3 4.508035
## 5 t4 14.052996
# Compute standard errors
se = function(x) {
sd(x)/sqrt(length(x))
}
stderrs=aggregate(mortality ~ treatment, FUN=se, data=d1)
#cis
ci.upper = means$mortality + 1.96 * stderrs$mortality
ci.lower = means$mortality - 1.96 * stderrs$mortality
# Create barplot figure
bp=barplot(means$mortality, names=means$treatment, ylab="Mean mortality of pest",ylim=c(0, max(ci.upper + 1.9)))
# Add standard error bars
arrows(y0=ci.lower, y1=ci.upper, x0=bp, x1=bp, angle=90, code=3, legnth=0.1)
## Warning in arrows(y0 = ci.lower, y1 = ci.upper, x0 = bp, x1 = bp, angle = 90, :
## "legnth" is not a graphical parameter
box()
# Add text labels
text(x=bp, y=ci.upper, c(labels), pos=3)
t2 and t4 seem equally as strong at increasing the mean mortality rate of pests, they are very close, but it seems like t4 has a slight edge, with slightly higher variance. t2 is close behind, followed far back from t1 and t3, which both have little effect, although it is slightly noticible. The control is expectedly the worst.
# Download the data
d2=read.csv("http://faraway.neu.edu/biostats/lab5_dataset2.csv")