Content you should have understood before watching this video:
- Number 1, ‘Variables’
- Number 2, ‘Variation in data’
- Number 3, ‘Basic statistical metrics’
- Number 4, ‘Standard deviation and standard error’
- Number 5, ‘Populations, samples, hypotheses’
- Number 6, ‘Distributions’
- Number 8, ‘Quantiles and probabilities’
- Number 11, ‘Error types’
The simplest experiment
- Only one response variable, one predictor
- The predictor variable is binary, i.e. it has only two levels (e.g. treated, non-treated)
- For example, we can ask if the movie ‘Scream 2’ is scarier than the original ‘Scream’.
- We could measure heart rates (which indicate anxiety) during both films and compare them.
This situation can be analysed with a t-test:
scream1 <- c(180, 165, 122, 156, 170) # max heart rates during 'Scream'
scream2 <- c(190, 145, 100, 138, 166) # max heart rates during 'Scream 2'
t.test(scream1, scream2)
The types of t-tests
- Independent t-test (or simply t-test)
- Compares two means based on independent data.
- E.g. data from different groups of people
- Paired (or dependent) t-test
- Compares two means based on related data.
- E.g. data from the same people measured at different times.
- Data from ‘matched’ samples (e.g. before - after)
- One-tailed vs. two-tailed testing
Rationale for the t-test
- Two samples of data are collected and the sample means calculated. These means might differ by either a little or a lot.
- When assessing the difference between the samples, we look at the difference between the means, but also the spread
- As the observed difference gets larger and the spread smaller, we become more confident that the difference isn’t purely due to chance!
Rationale for the t-test
We need a metric (a test statistic!) that puts the difference between the samples into perspective with
- the difference between the samples that we would expect by chance, and
- the standard deviations of the two samples
This is called the t-statistic:
\[t = \frac{\text{observed difference - expected difference}}{\text{estimate of the standard deviations}}\]
Under the null hypothesis, the expected difference is usually zero (this is the case in the following examples)
The t-statistic, the test statistic for a t-test
\[ t = \frac{\bar{X_1}-\bar{X_2}}{\sqrt{\frac{s^2_p}{n_1} + \frac{s^2_p}{n_2}}} \]
\[ s^2_p = \frac{(n_1 - 1)s^2_1 + (n_2 - 1)s^2_2}{n_1 + n_2 -2} \]
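The two formulas above can be checked by hand. The following is a minimal sketch that reuses the heart-rate data from the 'Scream' example; the names `s2_p`, `t_manual` and `t_builtin` are chosen here for illustration only. Note that `t.test()` uses the Welch version by default, so `var.equal = TRUE` is set to request the pooled-variance statistic shown above.

```r
# Heart-rate data from the 'Scream' example
scream1 <- c(180, 165, 122, 156, 170)
scream2 <- c(190, 145, 100, 138, 166)

n1 <- length(scream1)
n2 <- length(scream2)

# Pooled variance: the two sample variances weighted by their degrees of freedom
s2_p <- ((n1 - 1) * var(scream1) + (n2 - 1) * var(scream2)) / (n1 + n2 - 2)

# t-statistic: difference between the means divided by its standard error
t_manual <- (mean(scream1) - mean(scream2)) / sqrt(s2_p / n1 + s2_p / n2)

# Built-in equivalent; var.equal = TRUE requests the pooled-variance t-test
t_builtin <- unname(t.test(scream1, scream2, var.equal = TRUE)$statistic)

t_manual
t_builtin
```

Both values should agree, which shows that `t.test()` is doing nothing more mysterious than the arithmetic in the formulas.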
The t-distribution
Use the equivalent commands to rnorm(), pnorm(), and qnorm()
rt(100, df = 10); pt(q = 0, df = 10); qt(p = .025, df = 10)
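As a small sketch of how these three functions relate to each other (the seed and the variable names are arbitrary choices made for this illustration):

```r
set.seed(1)  # arbitrary seed, only to make the random draw reproducible

# rt(): draw 100 random values from a t-distribution with 10 degrees of freedom
x <- rt(100, df = 10)

# pt(): probability of observing a value below q; 0 is the centre of the
# symmetric t-distribution, so half of the probability mass lies below it
p_below_zero <- pt(q = 0, df = 10)

# qt(): the value below which a proportion p of the distribution lies
lower <- qt(p = 0.025, df = 10)
upper <- qt(p = 0.975, df = 10)

p_below_zero
c(lower, upper)

# For comparison: the same quantile of the standard normal distribution.
# The t-distribution has heavier tails, so its quantile lies further out.
qnorm(p = 0.025)
```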
Arachnophobia example
- Is arachnophobia (fear of spiders) specific to real spiders or is a picture enough?
- Participants
- 12 arachnophobic individuals
- Manipulation
- 6 participants were exposed to a real spider
- 6 were exposed to a picture of the same spider
- Response variable: anxiety (using an imaginary anxiety meter…)
- Our null hypothesis is: There is no difference in anxiety between seeing a real spider or a picture of a spider
realspider <- c(3, 5, 3, 7, 8, 5)
spiderpicture <- c(5, 6, 3, 8, 7, 8)
Arachnophobia example
You can now organise your data in two ways, the so-called ‘wide’ and ‘long’ formats:
d_wide <- data.frame(realspider, spiderpicture)
d_long <- data.frame(treat = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1),
anxiety = c(realspider, spiderpicture))
head(d_wide, 3)
realspider spiderpicture
1 3 5
2 5 6
3 3 3
head(d_long, 3)
treat anxiety
1 0 3
2 0 5
3 0 3
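If the data already exist in the wide format, they do not have to be retyped. One possible way to convert them, sketched here with base R's `stack()`, is shown below; the name `d_long2` and the renamed columns are choices made for this illustration, and the grouping column contains the original column names instead of 0 and 1.

```r
realspider <- c(3, 5, 3, 7, 8, 5)
spiderpicture <- c(5, 6, 3, 8, 7, 8)
d_wide <- data.frame(realspider, spiderpicture)

# stack() puts all values into one column ('values') and records the
# column each value came from in a second column ('ind', a factor)
d_long2 <- stack(d_wide)
names(d_long2) <- c("anxiety", "treat")

head(d_long2, 3)
table(d_long2$treat)
```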
The t-test in R
- To do a t-test we use the function t.test()
- Depending on the format of your data, you can use this function in two ways:
t.test(d_wide$realspider, d_wide$spiderpicture)
t.test(d_long$anxiety ~ d_long$treat)
#or: t.test(anxiety ~ treat, data = d_long)
What does the dollar sign do again…?
We recommend the long format: it is more versatile and easier to use when you have a lot of data!
The results, however, look the same:
The t-test in R
t.test(d_long$anxiety ~ d_long$treat)
Welch Two Sample t-test
data: d_long$anxiety by d_long$treat
t = -0.86966, df = 9.9746, p-value = 0.4049
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
-3.562974 1.562974
sample estimates:
mean in group 0 mean in group 1
5.166667 6.166667
How do we interpret this output?
Comparing the t-value against a random t-distribution
Apart from a recall of the data, the group means and a confidence interval for the difference between groups, we obtain a t-value, the degrees of freedom, and a p-value
We now compare our obtained t-value against a random t-distribution: how rare is our t-value, which reflects the difference between the groups and their standard deviations?
pt(q = -0.87, df = 10)
[1] 0.20235
This is half of our p-value! It covers only the lower tail, so we multiply it by 2 to account for both tails (2 × 0.20235 ≈ 0.405, which matches the p-value in the output).
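The doubling step can be sketched as follows, using the t-value and the (Welch) degrees of freedom reported in the t-test output; the variable names are illustrative only.

```r
# Values taken from the t-test output
t_value  <- -0.86966
df_welch <- 9.9746

# Probability of a t-value at least this far out in ONE tail
one_tail <- pt(q = -abs(t_value), df = df_welch)

# A two-tailed test counts extreme values in both directions
p_two_tailed <- 2 * one_tail
p_two_tailed
```

Using `-abs(t_value)` makes the calculation work regardless of the sign of the t-value.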
The t-test in R
qt(p = .025, df = 10)
[1] -2.228139
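One way to read this number: it is the critical t-value for a two-tailed test at the 5% level. A minimal sketch of the resulting decision rule, assuming a significance level of 0.05 and using illustrative variable names:

```r
t_value <- -0.86966               # t-value from the t-test output
t_crit  <- qt(p = 0.025, df = 10) # lower critical value, 2.5% in each tail

# We reject the null hypothesis only if the observed t-value lies
# further from zero than the critical value
reject_null <- abs(t_value) > abs(t_crit)
reject_null
```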
Arachnophobia example: our conclusion
We cannot reject our null hypothesis (p = 0.40 is larger than 0.05)
We found no evidence for a difference in anxiety between seeing a real spider and seeing a picture of a spider
Importantly, this does NOT mean that there is no difference between the two groups, as we could be committing a type II error.
We cannot indicate a probability for such an error! Therefore we say ‘we fail to reject the null hypothesis’, rather than ‘we accept the null hypothesis’
Note the unreasonably low sample size though!
How to report the results of a t-test
On average, anxiety did not differ significantly between participants exposed to real spiders (5.2 \(\pm\) 0.83) and those exposed to pictures of spiders (6.2 \(\pm\) 0.79; mean \(\pm\) standard error; t-test, p = 0.4).
or, if it were significant:
On average, participants did experience greater anxiety from real spiders (xy \(\pm\) xy) than from pictures of spiders (xy \(\pm\) xy; t-test, p < xyz).
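The numbers for such a report can be computed directly from the data. The sketch below assumes that the value after \(\pm\) is the standard error of the mean; `se()` is a small helper defined here, not a built-in R function.

```r
realspider <- c(3, 5, 3, 7, 8, 5)
spiderpicture <- c(5, 6, 3, 8, 7, 8)

# Helper (defined for this example): standard error of the mean
se <- function(x) sd(x) / sqrt(length(x))

# Mean and standard error per group, rounded for reporting
round(c(mean = mean(realspider), se = se(realspider)), 2)
round(c(mean = mean(spiderpicture), se = se(spiderpicture)), 2)

# p-value of the t-test, rounded for reporting
round(t.test(realspider, spiderpicture)$p.value, 2)
```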
Paired t-test: example
- Does plant transpiration respond to treatment with carbon dioxide?
- Null hypothesis: plant transpiration is not affected by carbon dioxide
- Measure transpiration in 12 leaves before and after treatment with carbon dioxide
- The measurements are paired: the 12 leaves are the same before and after the treatment
In R:
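t.test(d1$transpiration ~ d1$co2, paired = TRUE) #set argument paired to TRUE
#note: R 4.4.0 and later reject 'paired' in the formula interface; pass the two groups as separate vectors instead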
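A paired t-test is equivalent to a one-sample t-test on the within-pair differences. The sketch below illustrates this with simulated data, since the transpiration measurements themselves are not shown here: the values, the seed, and the names `before` and `after` are all made up for illustration. The two groups are passed as separate vectors, which works across R versions.

```r
set.seed(42)  # arbitrary seed for reproducibility

# Simulated (NOT real) transpiration values for 12 leaves
before <- rnorm(12, mean = 5, sd = 1)
after  <- before - rnorm(12, mean = 0.5, sd = 0.3)

# Paired t-test: the i-th value of 'before' belongs to the i-th value of 'after'
paired_result <- t.test(before, after, paired = TRUE)

# Equivalent: one-sample t-test asking whether the mean difference is zero
diff_result <- t.test(before - after, mu = 0)

paired_result$statistic
diff_result$statistic
```

Because pairs are matched by position, the order of the observations in the two vectors must correspond.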
One- vs. two-tailed tests
- Depending on whether we expect differences between groups to occur in both directions or only in one direction, we use 1- or 2-tailed t-tests
- In the spider example, differences could occur in both directions (those shown a picture could be more afraid): 2-tailed
- If you test whether carrying backpacks makes people shorter (paired, before and after): 1-tailed (carrying backpacks can’t make you taller!)
In R, choose alternative = 'two.sided', 'greater' or 'less' (by default the argument is set to 'two.sided'):
t.test(d1$transpiration ~ d1$co2, paired = TRUE, alternative = 'greater')
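To see what the argument does to the p-value, here is a small sketch that reuses the spider data (purely for illustration, since a two-tailed test is the appropriate choice for that example). With two vectors, 'less' tests whether the mean of the first vector is smaller than the mean of the second, and 'greater' the opposite.

```r
realspider <- c(3, 5, 3, 7, 8, 5)
spiderpicture <- c(5, 6, 3, 8, 7, 8)

two_sided <- t.test(realspider, spiderpicture, alternative = "two.sided")
less      <- t.test(realspider, spiderpicture, alternative = "less")
greater   <- t.test(realspider, spiderpicture, alternative = "greater")

# The observed difference is negative, so 'less' looks in the direction
# of the observed effect and 'greater' looks in the opposite direction
c(two_sided = two_sided$p.value,
  less      = less$p.value,
  greater   = greater$p.value)
```

The one-tailed p-value in the direction of the observed effect is half the two-tailed one, which is why the direction must be chosen before looking at the data.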
Assumptions of a t-test
- Both the independent t-test and the paired t-test are parametric tests based on the normal distribution. Therefore, they assume:
- The sampling distribution is normally distributed. In the paired t-test this means that the sampling distribution of the differences between scores should be normal, not the scores themselves (assumption of normality).
- Small sample sizes (30 or fewer per group) are therefore generally problematic, because normality is difficult to assess with so few data points
- Variances in the two samples are roughly equal (assumption of homogeneity of variance). This assumption can be relaxed if unequal variances are accounted for, as in the Welch t-test that R uses by default
- Scores are independent (unless in a paired t-test) (assumption of independence)
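These assumptions can be explored in R. The sketch below shows one possible approach using `shapiro.test()` and `var.test()` from base R on the spider data; this is an illustration, not a prescribed procedure, and with only six values per group such tests have very little power.

```r
realspider <- c(3, 5, 3, 7, 8, 5)
spiderpicture <- c(5, 6, 3, 8, 7, 8)

# Normality: Shapiro-Wilk test per group (null hypothesis: data are normal)
sw_real <- shapiro.test(realspider)
sw_pic  <- shapiro.test(spiderpicture)

# Homogeneity of variance: F-test comparing the two sample variances
vt <- var.test(realspider, spiderpicture)

# Welch t-test (R's default, does not assume equal variances) versus
# the classic pooled-variance t-test
welch  <- t.test(realspider, spiderpicture)
pooled <- t.test(realspider, spiderpicture, var.equal = TRUE)

c(welch_df = unname(welch$parameter), pooled_df = unname(pooled$parameter))
```

The Welch version pays for dropping the equal-variance assumption with slightly fewer degrees of freedom.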
In a nutshell
- The t-test is based on the t-distribution, which is similar to the normal distribution
- You can specify a t-test in R using the long or wide format
- There are two types of t-tests, independent and paired; know how each is used
- Know how to formulate a null hypothesis for a t-test
- Know how to run a t-test in R
- Know how to interpret the output of a t-test
- Remember how the argument ‘alternative’ is used (one- and two-tailed tests), and what it means
- You can commit a type I or type II error in a t-test
- Some important assumptions need to be met in a t-test; do not trust the results if they are not met!