sleep <- read.csv("sleep.csv")
sleep
##    improve    group
## 1     25.2   Unrest
## 2     14.5   Unrest
## 3     -7.0   Unrest
## 4     12.6   Unrest
## 5     34.5   Unrest
## 6     45.6   Unrest
## 7     11.6   Unrest
## 8     18.6   Unrest
## 9     12.1   Unrest
## 10    30.5   Unrest
## 11   -10.7 Deprived
## 12     4.5 Deprived
## 13     2.2 Deprived
## 14    21.3 Deprived
## 15   -14.7 Deprived
## 16   -10.7 Deprived
## 17     9.6 Deprived
## 18     2.4 Deprived
## 19    21.8 Deprived
## 20     7.2 Deprived
## 21    10.0 Deprived

The improve column tells you how much each subject improved on a cognitive task. The group column tells you which group the subject is in. Subjects were randomly assigned to one of two groups. In the “Unrest” group, subjects were allowed unrestricted sleep throughout. In the “Deprived” group, subjects were deprived of sleep and then allowed to try to catch up.

We are looking for evidence that subjects in the “Deprived” group perform worse on cognitive tasks, so we are going to frame our hypotheses.

Typically we design the experiment, hypotheses included, before we collect data. But we have already collected data, so let’s visualize it first.

library(ggplot2)
ggplot(data=sleep, mapping=aes(x=group, y=improve)) + geom_point()

This plot shows samples from two populations: sleep-deprived individuals and non-sleep-deprived individuals. We can see that there is a lot of variability within each group, and with samples of only 10 and 11 subjects, a different draw could show a different picture.
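To put numbers on what the plot suggests, here is a minimal sketch that summarizes each group. It rebuilds the data frame from the printed table above so it runs without sleep.csv; the names by_group and summary_table are ours, chosen for illustration.

```r
# Data copied from the printed table above (assumption: it matches sleep.csv)
sleep <- data.frame(
  improve = c(25.2, 14.5, -7.0, 12.6, 34.5, 45.6, 11.6, 18.6, 12.1, 30.5,
              -10.7, 4.5, 2.2, 21.3, -14.7, -10.7, 9.6, 2.4, 21.8, 7.2, 10.0),
  group = rep(c("Unrest", "Deprived"), times = c(10, 11))
)

# Split the improve scores into one vector per group
by_group <- split(sleep$improve, sleep$group)

# Sample size, mean, and standard deviation within each group
summary_table <- data.frame(
  n    = sapply(by_group, length),
  mean = sapply(by_group, mean),
  sd   = sapply(by_group, sd)
)
print(summary_table)
```

The sample means come out to 19.82 for “Unrest” and 3.90 for “Deprived”. The standard deviations show how much the scores spread within each group, which is what makes the raw difference hard to judge by eye.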

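The plot alone cannot settle the question. We need a hypothesis test and its p-value to judge whether the observed difference is larger than chance variation would explain.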

Now let’s frame the hypotheses.

Null hypothesis: the mean of the “Unrest” group is equal to the mean of the “Deprived” group (the difference of means is zero).

Alternative hypothesis: the means are different, OR the mean of the “Unrest” group is greater than the mean of the “Deprived” group. These statements are usually written in terms of differences (difference greater than 0, or difference not equal to 0).
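Written in symbols, the hypotheses might look like the following. The notation is our assumption, not something defined earlier: $\mu_U$ and $\mu_D$ stand for the population mean improvement in the “Unrest” and “Deprived” groups.

```latex
\begin{align*}
H_0 &: \mu_U - \mu_D = 0 \\
H_A &: \mu_U - \mu_D \neq 0 \quad \text{(two-sided)} \\
\text{or}\quad H_A &: \mu_U - \mu_D > 0 \quad \text{(one-sided)}
\end{align*}
```

Only one of the two alternatives is used in a given test.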

How would you choose between greater than (or less than) and not equal to? The conservative choice is not equal to, and that is what some recommend.

Decide on one sample or two sample: two sample, because we are comparing two separate groups of subjects.

Decide on the type of test (t-test or proportions test): t-test, because we are testing means of a quantitative variable.
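With those decisions made, a two-sample t-test could be run as sketched below. This is one possible way to do it, not necessarily the only one: it uses base R’s t.test() with its default settings, which give the Welch test (unequal variances). The vectors unrest and deprived are our names, filled in from the printed table so the example runs on its own.

```r
# Improvement scores copied from the printed table above
unrest   <- c(25.2, 14.5, -7.0, 12.6, 34.5, 45.6, 11.6, 18.6, 12.1, 30.5)
deprived <- c(-10.7, 4.5, 2.2, 21.3, -14.7, -10.7, 9.6, 2.4, 21.8, 7.2, 10.0)

# Two-sided alternative: the difference of means is not equal to 0
two_sided <- t.test(unrest, deprived, alternative = "two.sided")

# One-sided alternative: mean(unrest) - mean(deprived) is greater than 0
one_sided <- t.test(unrest, deprived, alternative = "greater")

print(two_sided)
print(one_sided)
```

Passing the two vectors explicitly makes the direction of the difference unambiguous: it is always the first vector minus the second. With the formula form improve ~ group, the order would instead follow the factor levels, which are alphabetical by default. Because the observed difference is positive here, the one-sided p-value is half the two-sided one.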