Task 1

d1 <- read.csv("http://faraway.neu.edu/biostats/lab4_dataset1.csv")
d1
##      treatment age      score
## 1      control  15  4.3567227
## 2      control  23  3.8493535
## 3      control  23  6.9803791
## 4      control  19  1.0894147
## 5      control  11 23.9647335
## 6      control  20  1.3979789
## 7      control  21  4.6481514
## 8      control  20  2.4207560
## 9      control  16  3.4874855
## 10     control  24 59.5708610
## 11     control  19  7.5187178
## 12     control  15  2.0415299
## 13     control  16  1.4324617
## 14     control  16 11.5905151
## 15     control  19  7.2531813
## 16     control  14  5.3759302
## 17     control  21  2.9172397
## 18     control  20  1.6695259
## 19     control  21  2.5213834
## 20     control  21 20.0861155
## 21     control  18  3.9698698
## 22     control   9  1.8507170
## 23     control  20 47.9083889
## 24     control  22 11.3033640
## 25     control  15  5.8203064
## 26     control  15 21.2952288
## 27     control  16 17.9290563
## 28     control  15  3.9779116
## 29     control  19 67.0945262
## 30     control  17  5.7257467
## 31     control  12  1.7780288
## 32     control  22  6.3955370
## 33     control  16  9.0933042
## 34     control  23 74.2901520
## 35     control  16  8.2136908
## 36     control  25 11.6697358
## 37     control  20  6.8404058
## 38     control  16  5.2909571
## 39     control  14  7.1368676
## 40     control  15 16.2426355
## 41     control  11 58.8649012
## 42     control  21 20.6433336
## 43     control  17 24.7273124
## 44     control  21  2.1569099
## 45     control  19 19.7646615
## 46     control  15  9.2066385
## 47     control  14  1.7036108
## 48     control  24 12.4413144
## 49     control  26  6.3043848
## 50     control  16 31.9632662
## 51     control  13  3.4346602
## 52     control  12  4.8056305
## 53     control  28  2.9267439
## 54     control  17  6.1897583
## 55     control  20 11.0453749
## 56     control  18  3.5546330
## 57     control  14 16.9517855
## 58     control  11  2.2076249
## 59     control  24  2.5909266
## 60     control  18 31.2230844
## 61  vaccinated  27  0.9842774
## 62  vaccinated  20  4.1040517
## 63  vaccinated  14  1.8569288
## 64  vaccinated  17  4.0935061
## 65  vaccinated  20 14.7150869
## 66  vaccinated  22 13.2843737
## 67  vaccinated  18  1.9524641
## 68  vaccinated  17  0.2765854
## 69  vaccinated  14 33.0381049
## 70  vaccinated  15  5.2966056
## 71  vaccinated  23  4.6707859
## 72  vaccinated  21  2.6821011
## 73  vaccinated  17  4.5272216
## 74  vaccinated  16  2.3062531
## 75  vaccinated  16  4.1399953
## 76  vaccinated  22  1.8216693
## 77  vaccinated  16  0.6905908
## 78  vaccinated  23  7.2997366
## 79  vaccinated  20 12.4254281
## 80  vaccinated  23  1.9962281
## 81  vaccinated  14  0.7762429
## 82  vaccinated  22  5.1667368
## 83  vaccinated  20  2.5994266
## 84  vaccinated  24  0.4803605
## 85  vaccinated  20  2.7240830
## 86  vaccinated  12  1.4472999
## 87  vaccinated  18  1.9329192
## 88  vaccinated  22  0.8550696
## 89  vaccinated  21 16.4963956
## 90  vaccinated  14  1.9520263
## 91  vaccinated  14  0.5457941
## 92  vaccinated  13  3.3108119
## 93  vaccinated  25  3.5366348
## 94  vaccinated  21  1.0142742
## 95  vaccinated  21  0.1512350
## 96  vaccinated  19  1.4326391
## 97  vaccinated  25  4.8090888
## 98  vaccinated  15  2.5606899
## 99  vaccinated  18  2.4640868
## 100 vaccinated  18  4.7627285
## 101 vaccinated  10  0.8298929
## 102 vaccinated  16  8.1398931
## 103 vaccinated  16  2.7037940
## 104 vaccinated  16  5.5141122
## 105 vaccinated  20  7.6454273
## 106 vaccinated  17  3.3989971
## 107 vaccinated  15  1.1289550
## 108 vaccinated  13  8.6968819
## 109 vaccinated  17  0.3678188
## 110 vaccinated  17  1.5765032
## 111 vaccinated  21  2.1050291
## 112 vaccinated  17  2.3022317
## 113 vaccinated  17  7.5418228
## 114 vaccinated  15  3.1149774
## 115 vaccinated  24  4.0843704
## 116 vaccinated  15  2.5353842
## 117 vaccinated  20  2.1219504
## 118 vaccinated  11  5.4496468
## 119 vaccinated  13  8.5525404
## 120 vaccinated  16  0.2458346

1.1

HO: There is no difference in the mean age between the control and treatment groups. HA: There is a significant difference in the mean age between the control and treatment groups.

1,2

hist(d1$age)

1.3

The distribution seems very close to normal, with a few exceptions

par(mfrow=c(1,2))
hist(d1$age)
hist(log10(d1$age))

The data looks worse with the log. Transofrmation is not necessary in this case.

1.5

se = function(x) {
  sd(x)/sqrt(length(x))
}
mean.age = aggregate(age~treatment, data=d1, FUN=mean)
mean.age
##    treatment      age
## 1    control 17.96667
## 2 vaccinated 18.05000
mean(mean.age$age)
## [1] 18.00833
stderr.age = aggregate(age~treatment, data=d1, FUN=se)

ci.upper = mean.age$age + 1.96 * stderr.age$age
ci.lower = mean.age$age - 1.96 * stderr.age$age

1.6

H0: The vaccine has no effect on HPV rates. HA: The vaccine has a significant effect on HPV rates.

1.7

hist(d1$score)

This is very clearly not normal.

1.8

qqnorm(d1$score)
qqline(d1$score, col = "red")

This is clearly not following the qq plot. The distribution is not normal.

1.9

qqnorm(log10(d1$score))
qqline(log10(d1$score), col = "red")

While it still isn’t completely normal, it is significantly better.

1.10

control = log10(subset(d1, treatment == "control", select = "score"))
vaccinated = log10(subset(d1, treatment == "vaccinated", select = "score"))
t.test(control, vaccinated)
## 
##  Welch Two Sample t-test
## 
## data:  control and vaccinated
## t = 5.1839, df = 118, p-value = 9.099e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.2725156 0.6094143
## sample estimates:
## mean of x mean of y 
## 0.8543715 0.4134065

There is a significant difference in mean HPV scores. The p value of the t test is 9.099e-07, which means at a n alpha of 0.5, the null hypothesis has been rejected.

1.11

mean.score = aggregate(score~treatment, data=d1, FUN=mean)
mean.score
##    treatment    score
## 1    control 13.01176
## 2 vaccinated  4.32061
mean(mean.score$score)
## [1] 8.666184
stderr.score = aggregate(score~treatment, data=d1, FUN=se)

ci.upper = mean.score$score + 1.96 * stderr.score$score
ci.lower = mean.score$score - 1.96 * stderr.score$score

bp = barplot(mean.score$score, names=mean.score$treatment, ylim=c(0, max(ci.upper + 1.9)), ylab="HPV Score")
arrows(y0=ci.lower, y1=ci.upper, x0=bp, x1=bp, angle=90, code=3, legnth=0.1)
## Warning in arrows(y0 = ci.lower, y1 = ci.upper, x0 = bp, x1 = bp, angle = 90, :
## "legnth" is not a graphical parameter
text(x=bp, y=ci.upper, c("a", "b"), pos=3)

#### 1.12

I would appove this vaccine for HPV

1.13

avg = mean.score$score[mean.score$treatment == "control"] / mean.score$score[mean.score$treatment == "vaccinated"]
avg
## [1] 3.011556

The mean HPV score in the control group is over 3 times higher than the mean HPV score in the vaccinated group.

Task 2

2.1

# Download the data
d3 <- read.csv("http://faraway.neu.edu/biostats/lab4_dataset3.csv")
head(d3)
##   year    autism  vaccine
## 1 1950 0.8747092 5.079621
## 2 1951 1.0775450 4.979636
## 3 1952 0.9145069 5.272306
## 4 1953 1.4415051 5.080250
## 5 1954 1.2291669 5.694768
## 6 1955 1.0399880 5.906284

2.2

par(mfrow=c(1,2))
hist(d3$autism)
hist(d3$vaccine)

People might be convinced because they follow similar distributions. This doesn’t mean necessarily however, that with a higher vaccination rate, there is a higher autism rate.

2.3

There would be a treatment and placebo control group. Both will be brought into an exam room and injected, and told they are getting the vaccine. I will pick random participants with replacement. Both will have to be injected with some substance, because it could be the injection itself which causes autism. Both groups will come back in after a few months or years in order to check for autism, where the exact same room, examiner, and procedure will be used.

2.4

# Download the data
d4 <- read.csv("http://faraway.neu.edu/biostats/lab4_dataset4.csv")
d4
##     treatment     autism
## 1     control  3.5735024
## 2     control  8.0336850
## 3     control  2.8990161
## 4     control 32.9595418
## 5     control  9.2952895
## 6     control  2.9433006
## 7     control 10.8854720
## 8     control 13.9897470
## 9     control 11.8909945
## 10    control  4.9264153
## 11    control 30.3191998
## 12    control  9.8733898
## 13    control  3.5921806
## 14    control  0.7300079
## 15    control 20.5925821
## 16    control  6.3921226
## 17    control  6.5785196
## 18    control 17.1815513
## 19    control 15.1988717
## 20    control 12.1084229
## 21    control 16.7597030
## 22    control 14.6162847
## 23    control  7.2034853
## 24    control  0.9145239
## 25    control 12.4264311
## 26    control  6.3209610
## 27    control  5.7213483
## 28    control  1.5361014
## 29    control  4.1447810
## 30    control 10.1547498
## 31    control 26.0151628
## 32    control  6.0328062
## 33    control  9.8519717
## 34    control  6.3356661
## 35    control  1.6869808
## 36    control  4.4149894
## 37    control  4.5073529
## 38    control  6.3008630
## 39    control 20.0860465
## 40    control 14.3417627
## 41    control  5.6716291
## 42    control  5.1895050
## 43    control 13.4229157
## 44    control 11.6658200
## 45    control  3.3576600
## 46    control  3.2953252
## 47    control  9.6270993
## 48    control 14.4188002
## 49    control  5.9754164
## 50    control 16.1368863
## 51    control  9.9553080
## 52    control  3.6254326
## 53    control  9.4038548
## 54    control  2.1611423
## 55    control 28.0229468
## 56    control 48.4435837
## 57    control  4.6310264
## 58    control  2.3534101
## 59    control 11.8191326
## 60    control  5.8412534
## 61    vaccine 90.1628753
## 62    vaccine  7.8519350
## 63    vaccine 16.2767769
## 64    vaccine  8.3980720
## 65    vaccine  3.8834611
## 66    vaccine  9.8630189
## 67    vaccine  1.3431819
## 68    vaccine 35.3590673
## 69    vaccine  9.5186529
## 70    vaccine 71.7086707
## 71    vaccine 13.1380096
## 72    vaccine  4.0150651
## 73    vaccine 15.0401960
## 74    vaccine  3.2088171
## 75    vaccine  2.3311614
## 76    vaccine 10.9292888
## 77    vaccine  5.2420263
## 78    vaccine  8.1752014
## 79    vaccine  8.7963892
## 80    vaccine  4.5288999
## 81    vaccine  4.6243289
## 82    vaccine  7.1336383
## 83    vaccine 26.5249818
## 84    vaccine  1.7796793
## 85    vaccine 14.7899247
## 86    vaccine 11.3924445
## 87    vaccine 23.6437743
## 88    vaccine  6.0243891
## 89    vaccine 11.8226692
## 90    vaccine 10.6664019
## 91    vaccine  4.7468440
## 92    vaccine 27.3267973
## 93    vaccine 26.0600272
## 94    vaccine 16.4481605
## 95    vaccine 39.9182437
## 96    vaccine 14.2746670
## 97    vaccine  2.2782504
## 98    vaccine  4.6031212
## 99    vaccine  2.3998048
## 100   vaccine  5.0865478
## 101   vaccine  4.3913352
## 102   vaccine  8.5174404
## 103   vaccine  3.2840531
## 104   vaccine  9.5642173
## 105   vaccine  4.2436144
## 106   vaccine 47.8125075
## 107   vaccine 16.7217033
## 108   vaccine 20.2909349
## 109   vaccine 11.9913476
## 110   vaccine 43.9114928
## 111   vaccine  4.3243574
## 112   vaccine  5.1466976
## 113   vaccine 34.2019356
## 114   vaccine  4.2601469
## 115   vaccine  6.6367292
## 116   vaccine  5.5134583
## 117   vaccine  5.9298987
## 118   vaccine  6.1773334
## 119   vaccine 13.3857182
## 120   vaccine  6.8391915

2.5

hist(log10(d4$autism))

d4control = log10(subset(d4, treatment == "control", select = "autism"))
d4vaccinated = log10(subset(d4, treatment == "vaccine", select = "autism"))
t.test(d4control, d4vaccinated)
## 
##  Welch Two Sample t-test
## 
## data:  d4control and d4vaccinated
## t = -1.2634, df = 117.48, p-value = 0.209
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.22768101  0.05032468
## sample estimates:
## mean of x mean of y 
## 0.8718967 0.9605749

There is no significant difference in autism scores between treatment groups. The p value of the t test is 0.209, which means at a n alpha of 0.5, the null hypothesis has failed to have been rejected.

2.6

mean.autism = aggregate(autism~treatment, data=d4, FUN=mean)
mean.autism
##   treatment   autism
## 1   control 10.30590
## 2   vaccine 14.24099
mean(mean.autism$autism)
## [1] 12.27345
stderr.autism = aggregate(autism~treatment, data=d4, FUN=se)

ci.upper = mean.autism$autism + 1.96 * stderr.autism$autism
ci.lower = mean.autism$autism - 1.96 * stderr.autism$autism

bp = barplot(mean.autism$autism, names=mean.autism$treatment, ylim=c(0, max(ci.upper + 1.9)), ylab="Autism Score")
arrows(y0=ci.lower, y1=ci.upper, x0=bp, x1=bp, angle=90, code=3, legnth=0.1)
## Warning in arrows(y0 = ci.lower, y1 = ci.upper, x0 = bp, x1 = bp, angle = 90, :
## "legnth" is not a graphical parameter
text(x=bp, y=ci.upper, c("a", "b"), pos=3)

2.7

I would say that there is no significant relationship between the incidence of autism and vaccination in the study. The numbers are close enough together between the groups that the vaccine has no significant effect on causeing autism on people.