1:8
a) Each row represents a British person (participant) who was surveyed.
b) There were 1691 participants.
c) sex: categorical and not ordinal
age: numerical and discrete
marital: categorical and not ordinal
grossIncome: categorical and ordinal
smoke: categorical and not ordinal
amtWeekends: numerical and discrete
amtWeekdays: numerical and discrete
1:10
a) Population: children. Sample: the 160 children observed.
b) This is an experiment, which can provide evidence of a causal connection where an observational study would not. The children were not, however, randomly sampled, so it would not be an ideal study for showing a causal relationship in the population at large.
1:28
a) We cannot conclude that smoking and dementia have a causal relationship because the study was observational in nature. We can, however, establish a naturally occurring association between the variables.
b) The statement is not correct. We cannot conclude that sleep loss causes bullying because the study was observational in nature. It even seems likely that both are caused by a set of underlying factors (confounding variables) that are correlated with both. This is the kind of question that could be answered by a well-designed experiment. We can, however, establish a naturally occurring association between the variables.
1:36
a) This is an experiment studying the effect of exercise on mental health.
b) The treatment group was instructed to exercise twice a week. The control group was instructed not to exercise.
c) No, stratified sampling was used instead of blocking. The goal was to get representative proportions.
d) No. The patients knew what treatment they were being asked to apply (exercise).
They probably didn’t know what effect was being studied.
e) This study could be used to establish a causal relationship, because it was an experiment that assigned treatment or control randomly. The stratified sampling was meant to make the sample representative, so the results would be generalizable to the population at large.
f) I would be concerned about how exercise is defined for participants and whether mental health could reasonably be affected within the time frame of the study.
1:48
## Summary of c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, ...)
## Min. :57.00
## 1st Qu.:72.75
## Median :78.50
## Mean :77.70
## 3rd Qu.:82.25
## Max. :94.00
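The summary values above are enough to work out where the box plot's whiskers can reach. The sketch below assumes the usual 1.5 × IQR convention for flagging potential outliers (the exercise may intend a different rule), and the variable names are only illustrative.

```python
# Values copied from the summary output above.
minimum, q1, q3, maximum = 57.00, 72.75, 82.25, 94.00

# Interquartile range: the width of the box.
iqr = q3 - q1

# Assumed convention: whiskers reach at most 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# An observation outside the fences would be drawn as a separate point.
has_low_outlier = minimum < lower_fence
has_high_outlier = maximum > upper_fence

print(f"IQR = {iqr}")
print(f"fences = ({lower_fence}, {upper_fence})")
print(f"low outlier: {has_low_outlier}, high outlier: {has_high_outlier}")
```

Under that convention the minimum of 57 falls just below the lower fence of 58.5, so it would be plotted as a potential outlier, while the maximum of 94 stays inside the upper fence of 96.5.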
1:50
a) This goes with 2. It's bell-shaped, symmetric, unimodal and leptokurtic. As the mean is 60 and the kurtosis is high, it could not be a standard normal.
b) This goes with 3. It’s multimodal and symmetric and could show sampling from a uniform distribution (from 0 to 100).
c) This goes with 1. It is positively skewed and unimodal and could be from a gamma distribution.
1:56
a) This is right-skewed. Its domain is only positive values and it has outliers to the right. The median would be less affected by the outliers, as would the IQR.
b) This appears to be symmetric. There are few outliers. It looks to be close to a uniform distribution, so the mean and median would be very close, but the mean should be used. The IQR will also not differ much from the standard deviation, but the standard deviation should be used.
c) This is right skewed. A mean could be influenced too much by outlying data, so the median and IQR should be used.
d) This has right skew. Therefore, the IQR and median will be more meaningful.
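The preference for the median and IQR on skewed data can be checked numerically. The sketch below uses a small made-up data set (not from the exercise) and a hypothetical helper `describe` to compare how much each statistic moves when one large value is added.

```python
import statistics

def describe(values):
    """Return mean, median, standard deviation, and IQR for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "iqr": q3 - q1,
    }

# Made-up illustrative data: a roughly symmetric batch, then the same batch
# with a single large value appended to create a long right tail.
base = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]
with_outlier = base + [40]

before = describe(base)
after = describe(with_outlier)

# How far each statistic shifted once the outlier was added.
shift = {name: abs(after[name] - before[name]) for name in before}

for name, amount in shift.items():
    print(f"{name}: shifted by {amount:.2f}")
```

In this example the mean and standard deviation move far more than the median and IQR, which is the reasoning behind the answers to a), c) and d).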
1:70
a) No. Survival is not independent of transplant status. It appears the survival rate among the recipients is more than twice the survival rate among non-recipients.
b) The box plots suggest that transplant recipients survive many more days on average than non-recipients. A quarter of the recipients survived longer than 650 days, while 3/4 of the non-recipients were dead by around a month.
c) About 63% of the treatment group died, but around 85% of the control group died.
d) i) The claim is that receiving a heart transplant will help an official heart transplant candidate live longer.
ii) We write alive on 52 cards representing patients who were alive at the end of the study, and dead on 148 cards representing patients who were not. Then, we shuffle these cards and split them into two groups: one group of size 100 representing treatment, and another group of size 100 representing control. We calculate the difference between the proportion of dead cards in the treatment and control groups (treatment - control) and record this value. We repeat this 100 times to build a distribution centered at 0. Lastly, we calculate the fraction of simulations where the simulated differences in proportions are at least as extreme as our observed difference of 22 percentage points. If this fraction is low, we conclude that it is unlikely to have observed such an outcome by chance and that the null hypothesis should be rejected in favor of the alternative.
iii) The simulation results suggest the transplant program was effective. Under the null hypothesis, there is a 5% chance of seeing a difference between treatment and control at least as large as the 22 percentage point difference we observed.
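The card-shuffling procedure in d) ii) can be sketched as a short simulation. The card counts and group sizes are taken from the answer above, and the observed difference is taken from the percentages in c). The function name `simulate_differences`, the seed, and the choice to count simulated differences at or below the observed (negative) value are assumptions made for illustration.

```python
import random

def simulate_differences(n_alive=52, n_dead=148, n_treatment=100,
                         n_reps=100, seed=0):
    """Shuffle the cards n_reps times and record, for each shuffle, the
    difference in the proportion of dead cards (treatment - control)."""
    rng = random.Random(seed)  # fixed seed (assumed) so the run is repeatable
    cards = ["alive"] * n_alive + ["dead"] * n_dead
    n_control = len(cards) - n_treatment
    diffs = []
    for _ in range(n_reps):
        rng.shuffle(cards)
        treatment = cards[:n_treatment]
        control = cards[n_treatment:]
        p_treatment = treatment.count("dead") / n_treatment
        p_control = control.count("dead") / n_control
        diffs.append(p_treatment - p_control)
    return diffs

# Observed difference in death rates from part c): about 63% vs. 85%.
observed = 0.63 - 0.85

diffs = simulate_differences()

# Fraction of shuffles at least as extreme as the observed difference,
# i.e. with treatment deaths lower than control by that much or more.
fraction = sum(d <= observed for d in diffs) / len(diffs)

print(f"mean simulated difference: {sum(diffs) / len(diffs):.3f}")
print(f"fraction at least as extreme as observed: {fraction:.2f}")
```

A small fraction here would play the role described in d) ii): it would indicate that a difference this large is unlikely under random shuffling alone. With only 100 repetitions the estimate is coarse, so a larger `n_reps` would give a steadier value.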