1.8 Smoking habits of UK residents
age, amtWeekends, and amtWeekdays are all numeric variables, although amtWeekends and amtWeekdays must first be recoded to extract the numeric data.
All three are discrete: age is reported in whole years, and amtWeekends and amtWeekdays are counts of cigarettes (I assume one cannot report that one half of a cigarette is smoked in this study).
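The recoding step mentioned above could look something like the sketch below. The raw strings are hypothetical stand-ins for the way the amounts are displayed in the exercise, not values taken from the actual data set, and the names amt_raw and amt_num are made up for illustration.

```r
# Hypothetical raw entries: a count followed by a unit, or "N/A" for non-smokers
amt_raw <- c("12 cig/day", "N/A", "6 cig/day", "N/A", "20 cig/day")

# Keep only the leading digits; entries without digits are left as-is
# and turn into NA when converted (the coercion warning is suppressed)
amt_num <- suppressWarnings(as.numeric(sub("^([0-9]+).*$", "\\1", amt_raw)))

amt_num
# Every non-missing value is a whole number, i.e. the variable is discrete
all(amt_num == round(amt_num), na.rm = TRUE)
```

After this step the variable is stored as numeric, with NA marking respondents who reported no amount.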
1.10 Cheaters, scope of inference.
1.28 Reading the paper.
1.36 Exercise and mental health.
1.48 Stats scores.
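Q1 and Q3 in the table given in the text do not match the values R computes below. Quartiles depend on the convention used: summary() interpolates between order statistics (R's default quantile type 7), so a small difference from a hand-computed table does not necessarily mean the table is wrong.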
score <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94)
score<-as.data.frame(score)
library(ggplot2)
library(ggthemes)
summary(score)
## score
## Min. :57.00
## 1st Qu.:72.75
## Median :78.50
## Mean :77.70
## 3rd Qu.:82.25
## Max. :94.00
# Horizontal boxplot with the five quantiles printed next to the box
ggplot(score, aes(y=score, fill=""))+
  geom_boxplot()+
  # fun and after_stat() replace the deprecated fun.y and ..y.. (ggplot2 >= 3.3.0)
  stat_summary(geom="text", fun=quantile,
               aes(x=.45, label=sprintf("%1.1f", after_stat(y))))+
  theme_bw()+
  ggtitle("Final Exam Scores", subtitle="20 Introductory Statistics Students")+
  xlab("")+
  ylab("score")+
  theme(legend.position="none")+
  theme(axis.text.y = element_blank())+
  coord_flip()
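One possible explanation for the Q1 and Q3 mismatch in 1.48 is the quartile convention. The sketch below compares R's default interpolated quantiles with Tukey's hinges (the medians of the lower and upper halves of the data). Whether the table in the text was built with the second convention is an assumption, not something the text states.

```r
score <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78,
           79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

# R's default (type 7) interpolates between order statistics
q_default <- quantile(score, probs = c(0.25, 0.75), names = FALSE)

# Tukey's hinges: medians of the lower and upper halves of the sorted data
hinges <- fivenum(score)[c(2, 4)]

q_default  # 72.75 82.25
hinges     # 72.50 82.50
```

Both pairs are legitimate quartiles for these 20 scores; they differ only in how the gap between the 5th and 6th (and 15th and 16th) ordered values is split.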
1.50 Mix-and-match.
1.56 Distributions and appropriate statistics, Part II.
1.70 Heart transplants.
rm(list=ls())
library(openintro)
library(plyr)
data(heartTr)
hrtcon<-prop.table(table(heartTr$survived,heartTr$transplant),2)
hrtcon
##
## control treatment
## alive 0.1176471 0.3478261
## dead 0.8823529 0.6521739
65.2% of those in the treatment group died while 88.2% of those in the control group died.
i. The claim being tested is that an experimental heart transplant program increases lifespan.
ii. We write alive on 28 cards representing patients who were alive at the end of the study, and dead on 75 cards representing patients who were not. Then, we shuffle these cards and split them into two groups: one group of size 69 representing treatment, and another group of size 34 representing control. We calculate the difference between the proportion of dead cards in the treatment and control groups (treatment - control) and record this value. We repeat this 100 times to build a distribution centered at zero. Lastly, we calculate the fraction of simulations where the simulated difference in proportions is at most the observed difference, .652 - .882 = -0.23. If this fraction is low, we conclude that it is unlikely to have observed such an outcome by chance and that the null hypothesis should be rejected in favor of the alternative.
iii. From the simulation results we can reject the null hypothesis and conclude that the transplant is effective.
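The card-shuffling procedure in part ii can be sketched in R as follows. This is a minimal illustration, not the simulation used in the text: the seed, the helper one_shuffle, and the variable names are assumptions, and the death counts 45 and 30 are back-calculated from the proportions in the table above (0.652 x 69 and 0.882 x 34). Because the difference is taken as treatment - control, the observed value is negative and "as extreme" means at or below it.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# One card per patient: 28 alive, 75 dead
cards <- c(rep("alive", 28), rep("dead", 75))
n_treatment <- 69
n_control <- 34

# Observed difference in proportion dead (treatment - control), about -0.23
obs_diff <- 45 / 69 - 30 / 34

# Shuffle the cards once, deal them into the two groups,
# and return the simulated difference in proportion dead
one_shuffle <- function() {
  shuffled <- sample(cards)
  treatment <- shuffled[1:n_treatment]
  control <- shuffled[(n_treatment + 1):(n_treatment + n_control)]
  mean(treatment == "dead") - mean(control == "dead")
}

# 100 repetitions, as in the exercise
sim_diffs <- replicate(100, one_shuffle())

# Fraction of shuffles at least as extreme as the observed difference
p_hat <- mean(sim_diffs <= obs_diff)
p_hat
```

With only 100 shuffles the estimated fraction is coarse (it moves in steps of 0.01); increasing the number of repetitions gives a steadier estimate.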