Task 1

1.1

d1=read.csv("http://faraway.neu.edu/biostats/lab5_dataset1.csv")
d1
##     treatment  mortality
## 1     control  1.4528777
## 2     control  3.2662526
## 3     control  1.1786520
## 4     control 13.4003497
## 5     control  3.7791827
## 6     control  1.1966567
## 7     control  4.4257026
## 8     control  5.6878067
## 9     control  4.8345176
## 10    control  2.0029310
## 11    control 12.3268667
## 12    control  4.0142207
## 13    control  1.4604716
## 14    control  0.2967991
## 15    control  8.3723191
## 16    control  2.5988431
## 17    control  2.6746265
## 18    control  6.9854975
## 19    control  6.1794001
## 20    control  4.9229174
## 21    control  6.8139867
## 22    control  5.9425379
## 23    control  2.9287186
## 24    control  0.3718177
## 25    control  5.0522099
## 26    control  2.5699110
## 27    control  2.3261266
## 28    control  0.6245322
## 29    control  1.6851422
## 30    control  4.1286132
## 31         t1 14.2774240
## 32         t1  3.3108742
## 33         t1  5.4068767
## 34         t1  3.4770873
## 35         t1  0.9258347
## 36         t1  2.4229976
## 37         t1  2.4736877
## 38         t1  3.4579869
## 39         t1 11.0234561
## 40         t1  7.8709262
## 41         t1  3.1126561
## 42         t1  2.8480607
## 43         t1  7.3666523
## 44         t1  6.4023378
## 45         t1  1.8427229
## 46         t1  1.8085128
## 47         t1  5.2834641
## 48         t1  7.9132053
## 49         t1  3.2793781
## 50         t1  8.8561110
## 51         t1  5.4635889
## 52         t1  1.9896796
## 53         t1  5.1609449
## 54         t1  1.1860600
## 55         t1 15.3793193
## 56         t1 26.5864024
## 57         t1  2.5415612
## 58         t1  1.2915788
## 59         t1  6.4864775
## 60         t1  3.2057478
## 61         t2 81.5827433
## 62         t2  7.1047246
## 63         t2 14.7278368
## 64         t2  7.5988898
## 65         t2  3.5139009
## 66         t2  8.9244286
## 67         t2  1.2153613
## 68         t2 31.9942072
## 69         t2  8.6128333
## 70         t2 64.8846884
## 71         t2 11.8877627
## 72         t2  3.6329812
## 73         t2 13.6089322
## 74         t2  2.9034578
## 75         t2  2.1093221
## 76         t2  9.8892295
## 77         t2  4.7431816
## 78         t2  7.3972281
## 79         t2  7.9593021
## 80         t2  4.0979181
## 81         t2  4.1842659
## 82         t2  6.4547829
## 83         t2 24.0007960
## 84         t2  1.6103205
## 85         t2 13.3824773
## 86         t2 10.3083101
## 87         t2 21.3937717
## 88         t2  5.4510927
## 89         t2 10.6975935
## 90         t2  9.6513595
## 91         t3  1.5800871
## 92         t3  9.0963006
## 93         t3  8.6746295
## 94         t3  5.4751170
## 95         t3 13.2876290
## 96         t3  4.7516239
## 97         t3  0.7583637
## 98         t3  1.5322459
## 99         t3  0.7988256
## 100        t3  1.6931647
## 101        t3  1.4617485
## 102        t3  2.8352096
## 103        t3  1.0931663
## 104        t3  3.1836514
## 105        t3  1.4125765
## 106        t3 15.9154012
## 107        t3  5.5661715
## 108        t3  6.7542655
## 109        t3  3.9915729
## 110        t3 14.6168662
## 111        t3  1.4394535
## 112        t3  1.7131868
## 113        t3 11.3848354
## 114        t3  1.4180797
## 115        t3  2.2091753
## 116        t3  1.8352708
## 117        t3  1.9738918
## 118        t3  2.0562557
## 119        t3  4.4557185
## 120        t3  2.2765691
## 121        t4  7.3451669
## 122        t4 46.6670723
## 123        t4  9.8298197
## 124        t4 10.1801879
## 125        t4 11.0210740
## 126        t4 24.8452430
## 127        t4 11.3184665
## 128        t4 11.7325359
## 129        t4  6.1616187
## 130        t4  8.8086107
## 131        t4 12.9378929
## 132        t4  6.7605586
## 133        t4 20.7282228
## 134        t4  2.6687386
## 135        t4 16.5528429
## 136        t4  2.6209849
## 137        t4  9.0162082
## 138        t4  7.1830214
## 139        t4  6.3465110
## 140        t4 11.5086994
## 141        t4  1.7961412
## 142        t4 39.5111659
## 143        t4  2.3048776
## 144        t4  7.6635061
## 145        t4  3.9911519
## 146        t4  5.7498916
## 147        t4 98.2157457
## 148        t4 12.3962700
## 149        t4  3.3659137
## 150        t4  2.3617302

H0: The biological control agent has no effect in increasing the mortality of the pest. HA: The biological control agent has a significant effect in increasing the mortality of the pest.

1.2

par(mfrow = c(1, 2))
hist(d1$mortality[d1$treatment == "t1"])
hist(d1$mortality[d1$treatment == "t2"])

The distribution is right skewed for all 4 treatments.

1.3

par(mfrow = c(1, 2))
hist(log10(d1$mortality[d1$treatment == "t1"]))
hist(log10(d1$mortality[d1$treatment == "t2"]))

I would use the transformed data, as this significantly helped normality.

1.4

mod.aov = aov(mortality~treatment, data=d1)
summary(mod.aov)
##              Df Sum Sq Mean Sq F value   Pr(>F)    
## treatment     4   2959   739.8   5.069 0.000749 ***
## Residuals   145  21162   145.9                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

1.5

There is a difference in the mean mortality of the pest across the different groups. With the F value at 5.069 and the significance code at ***, which means that the p about 0, that means that we can reject the null hypothesis.

1.6

come back to this – i actually don’t know

1.7

mod.tukey = TukeyHSD(mod.aov)
print(mod.tukey)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = mortality ~ treatment, data = d1)
## 
## $treatment
##                  diff         lwr        upr     p adj
## t1-control  1.6383709  -6.9782430 10.2549847 0.9846614
## t2-control  9.4007737   0.7841599 18.0173876 0.0249811
## t3-control  0.3913522  -8.2252617  9.0079661 0.9999435
## t4-control  9.9363128   1.3196989 18.5529266 0.0150140
## t2-t1       7.7624029  -0.8542110 16.3790168 0.0988879
## t3-t1      -1.2470187  -9.8636325  7.3695952 0.9945629
## t4-t1       8.2979419  -0.3186720 16.9145558 0.0650864
## t3-t2      -9.0094215 -17.6260354 -0.3928077 0.0355977
## t4-t2       0.5355390  -8.0810749  9.1521529 0.9998032
## t4-t3       9.5449606   0.9283467 18.1615744 0.0218409

1.8

plot(mod.tukey)

There are 3 mean pairs that have significant difference, t4 and t3, t3 and t2, t4 and the control, and t2 and the control. This is because t3 and the control are very close, and t4 and t2 are different from both.

1.9

#install.packages("multcompView")
require(multcompView)
## Loading required package: multcompView

1.10

labels=multcompLetters(mod.tukey$treatment[, "p adj"])$Letters
labels
##      t1      t2      t3      t4 control 
##    "ab"     "a"     "b"     "a"     "b"

1.11

# Compute means
means=aggregate(mortality ~ treatment, mean, data=d1)
means
##   treatment mortality
## 1   control  4.116683
## 2        t1  5.755054
## 3        t2 13.517457
## 4        t3  4.508035
## 5        t4 14.052996
# Compute standard errors
se = function(x) {
  sd(x)/sqrt(length(x))
}
stderrs=aggregate(mortality ~ treatment, FUN=se, data=d1)

#cis
ci.upper = means$mortality + 1.96 * stderrs$mortality
ci.lower = means$mortality - 1.96 * stderrs$mortality

# Create barplot figure
bp=barplot(means$mortality, names=means$treatment, ylab="Mean mortality of pest",ylim=c(0, max(ci.upper + 1.9)))

# Add standard error bars
arrows(y0=ci.lower, y1=ci.upper, x0=bp, x1=bp, angle=90, code=3, legnth=0.1)
## Warning in arrows(y0 = ci.lower, y1 = ci.upper, x0 = bp, x1 = bp, angle = 90, :
## "legnth" is not a graphical parameter
box()
# Add text labels
text(x=bp, y=ci.upper, c(labels), pos=3)

1.12

t2 and t4 seem equally as strong at increasing the mean mortality rate of pests, they are very close, but it seems like t4 has a slight edge, with slightly higher variance. t2 is close behind, followed far back from t1 and t3, which both have little effect, although it is slightly noticible. The control is expectedly the worst.

Task 2

2.1

# Download the data
d2=read.csv("http://faraway.neu.edu/biostats/lab5_dataset2.csv")