T-tests in R are relatively simple to use and only require one or two vectors of numerical data (depending on whether it is a one- or two-sample t-test) in order for the analysis to run. Below is an example of the R t-test code:
# Generic t.test() call showing each argument and its default value
t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0,
       paired = FALSE, var.equal = FALSE, conf.level = 0.95)
The parameters of the t-test are simple to use. x is a numeric vector of data while y is an additional numeric vector of data, although it is not required for the test. When running t.test, the default value for y is NULL. If you only specify x, it will run a one-sample t-test. Specifying y will override the default value and run a two-sample t-test.
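As a minimal sketch of this behavior, the snippet below runs t.test with and without y. The vectors x and y are made-up values used only for illustration; they do not come from the original article.

```r
# Hypothetical data, invented purely for illustration
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3)
y <- c(4.2, 4.8, 4.5, 5.0, 4.1, 4.6)

one_sample <- t.test(x)      # y is left at its default (NULL)
two_sample <- t.test(x, y)   # supplying y switches to a two-sample test

# The method element of the result records which test was run
one_sample$method
two_sample$method
```

Printing the method element is a quick way to confirm which variant of the test R actually ran.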
The other parameters of t.test are also important:
alternative sets the direction of the test. By default, t.test runs a two-tailed test (“two.sided”). You can override this by specifying “less” or “greater”, which will run the analysis as a one-tailed t-test with the criterion being located at the lower or upper regions of the distribution, respectively.
conf.level sets the confidence level of the reported interval. The default is 0.95; specifying conf.level will set a new confidence level in the t-test.
paired controls whether the two samples are treated as paired. The default is FALSE; to run a paired-samples t-test, specify paired = TRUE.
mu sets the value assumed under the null hypothesis (default 0). For one-sample t-tests, this is the hypothesized true mean. For two-sample t-tests, this will set the true difference of the two means to the specified value.
var.equal allows you to specify whether two sets of data have equal variance or not. (Note that this is only for two-sample t-tests.) The default value is FALSE, which assumes unequal variance between datasets and uses the Welch approximation for degrees of freedom. Specifying var.equal = TRUE will estimate variance using pooled variance.
Below is an example of a one-sample t-test using a dataset provided by R. The Sleep data show the effect of two soporific drugs on 10 patients, where the observed values are differences in hours of sleep compared to the control group (the “extra” column).
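Before running any test, it can help to look at how the dataset is laid out. The following is a small sketch using only base R functions; sleep ships with R's built-in datasets package, so nothing needs to be loaded.

```r
# Structure of the built-in sleep data: extra (numeric), group (factor), ID (factor)
str(sleep)

# First few rows
head(sleep)

# Number of observations recorded under each drug
table(sleep$group)
```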
Here, we are going to run a t-test on all observed data. To do this, we will specify x as sleep$extra, which pulls only the “extra” column from the “sleep” dataset, and submit it to a one-sample t-test. Our null hypothesis is that the average difference in sleep is equal to zero (which is the default mu value) while our alternative hypothesis is that the average difference in sleep is greater than zero (which is specified using the alternative parameter):
\(H_0: \mu_{change} = 0\)
\(H_A: \mu_{change} > 0\)
# One-sample t-test using Sleep data.
t.test(x = sleep$extra, alternative = "greater")
##
## One Sample t-test
##
## data: sleep$extra
## t = 3.413, df = 19, p-value = 0.001459
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
## 0.7597797 Inf
## sample estimates:
## mean of x
## 1.54
Using the statistics provided above, we can reject the null hypothesis and conclude that the mean change in hours of sleep, pooled across both soporific drugs, is significantly greater than zero, t(19) = 3.413, p = .001.
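To see where these numbers come from, the one-sample t statistic can be reproduced by hand as (sample mean minus mu) divided by the standard error. The sketch below is only a cross-check against t.test; the variable names (t_stat, p_value, res) are chosen here for illustration.

```r
x <- sleep$extra
n <- length(x)

# t = (sample mean - hypothesized mean) / standard error, with mu = 0
t_stat <- (mean(x) - 0) / (sd(x) / sqrt(n))
df <- n - 1

# Upper-tail probability, matching alternative = "greater"
p_value <- pt(t_stat, df, lower.tail = FALSE)

# Compare with the values stored in the t.test result
res <- t.test(x, alternative = "greater")
all.equal(unname(res$statistic), t_stat)
all.equal(res$p.value, p_value)
```

The statistic, parameter, and p.value elements of the result are also convenient when the values need to be reused in later code rather than read off the printed output.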
For a two-sample t-test, we are first going to create two separate data frames for Groups 1 and 2 (i.e., exposure to the two soporific drugs) so that we can compare their outcome values. To do this, we are just going to take subsets of the original Sleep dataset:
# Separate Groups 1 and 2
sleep1 <- subset(sleep, group == 1)
sleep2 <- subset(sleep, group == 2)
We will then submit these two subsets to a two-sample t-test. For demonstration purposes, we will assume equal variances between the groups. Our hypothesis is that Drug 2 is more effective than Drug 1 at increasing hours of sleep:
\(H_0: \mu_{change2} = \mu_{change1}\)
\(H_A: \mu_{change2} > \mu_{change1}\)
Or:
\(H_0: \mu_{change2} - \mu_{change1} = 0\)
\(H_A: \mu_{change2} - \mu_{change1} > 0\)
# Run two-sample t-test
t.test(x = sleep2$extra, y = sleep1$extra, alternative = "greater", var.equal = TRUE)
##
## Two Sample t-test
##
## data: sleep2$extra and sleep1$extra
## t = 1.8608, df = 18, p-value = 0.03959
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 0.1076222 Inf
## sample estimates:
## mean of x mean of y
## 2.33 0.75
Based on the results of this test, we can reject the null hypothesis and conclude that individuals who take Drug 2 show greater increases in hours of sleep relative to control than do individuals who take Drug 1, t(18) = 1.861, p = .04.
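Because equal variances were assumed only for demonstration, it may be worth comparing against R's default Welch test. The sketch below reruns the same comparison both ways; the object names pooled and welch are illustrative. With equal group sizes, as here, the two approaches yield the same t statistic and differ only in their degrees of freedom (and therefore in the p-value).

```r
sleep1 <- subset(sleep, group == 1)
sleep2 <- subset(sleep, group == 2)

# Pooled-variance test (equal variances assumed)
pooled <- t.test(sleep2$extra, sleep1$extra,
                 alternative = "greater", var.equal = TRUE)

# Welch test (R's default, var.equal = FALSE)
welch <- t.test(sleep2$extra, sleep1$extra, alternative = "greater")

# Degrees of freedom under each approach
pooled$parameter
welch$parameter

# p-values under each approach
pooled$p.value
welch$p.value
```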
Although we treated the Sleep dataset above as 20 participants split into two separate treatment groups (i.e., Drug 1 and Drug 2), the data can also be analyzed as within-subjects, where the same 10 participants took both drugs on separate occasions (the dataset's ID column identifies each participant). The hypotheses are unchanged:
\(H_0: \mu_{change2} - \mu_{change1} = 0\)
\(H_A: \mu_{change2} - \mu_{change1} > 0\)
In this case, the only required change to the code presented above is setting the paired parameter to TRUE:
# Run paired t-test (var.equal is ignored when paired = TRUE)
t.test(x = sleep2$extra, y = sleep1$extra, alternative = "greater", paired = TRUE, var.equal = TRUE)
##
## Paired t-test
##
## data: sleep2$extra and sleep1$extra
## t = 4.0621, df = 9, p-value = 0.001416
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 0.8669947 Inf
## sample estimates:
## mean of the differences
## 1.58
In this case, we can reject the null hypothesis that Drugs 1 and 2 result in the same changes in hours of sleep compared to control and conclude that participants who used Drug 2 gained more hours of sleep than when the same participants used Drug 1, t(9) = 4.062, p = .001.
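A paired t-test is equivalent to a one-sample t-test on the within-participant differences, which the following sketch illustrates. It assumes the rows of the two subsets are in the same participant order, which the first check confirms via the ID column; the names diffs, paired_res, and diff_res are illustrative.

```r
sleep1 <- subset(sleep, group == 1)
sleep2 <- subset(sleep, group == 2)

# Confirm that row i in both subsets belongs to the same participant
all(as.character(sleep1$ID) == as.character(sleep2$ID))

# Within-participant differences (Drug 2 minus Drug 1)
diffs <- sleep2$extra - sleep1$extra

paired_res <- t.test(sleep2$extra, sleep1$extra,
                     alternative = "greater", paired = TRUE)
diff_res <- t.test(diffs, alternative = "greater")

# Both approaches give the same statistic and p-value
all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
all.equal(paired_res$p.value, diff_res$p.value)
```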