t-Tests are a great way to identify whether two group means are statistically different. This can be done by comparing a sample mean to a hypothesized population mean (one-sample) or by comparing two different samples (two-sample). This tutorial will focus on the latter.
The military application of t-tests is especially common in Test and Evaluation (T&E). Acquisition programs are tested against requirements, and t-tests serve as a way to determine whether a program meets or exceeds the standard. In Operational T&E especially, where sample sizes are small due to cost, t-tests are a valuable tool. Another application within T&E is upgrades to current programs: t-tests can be used to determine whether the performance of the new system is at least as good as that of the legacy system.
The functions utilized for this tutorial come from the built-in stats package. However, dplyr will need to be loaded for one example.
library(dplyr)
We can use built-in R functions to perform t-tests. R has multiple functions for this, but for this tutorial we will look at two of them: t.test and pairwise.t.test.
We will leverage the built-in R datasets sleep and airquality. Snippets of each of these datasets are shown below.
head(sleep)
## extra group ID
## 1 0.7 1 1
## 2 -1.6 1 2
## 3 -0.2 1 3
## 4 -1.2 1 4
## 5 -0.1 1 5
## 6 3.4 1 6
head(airquality)
## Ozone Solar.R Wind Temp Month Day
## 1 41 190 7.4 67 5 1
## 2 36 118 8.0 72 5 2
## 3 12 149 12.6 74 5 3
## 4 18 313 11.5 62 5 4
## 5 NA NA 14.3 56 5 5
## 6 28 NA 14.9 66 5 6
As previously stated, two-sample t-tests are used when you want to compare two different samples. In order to do this in R, you can use t.test or pairwise.t.test. However, this section will focus on t.test.
The unpaired t-test is used when you want to compare two independent groups.
Suppose we want to test whether the average increase in sleep is equal for two different drugs, using the sleep dataset and the t.test function.
t.test(extra~group, data = sleep)
##
## Welch Two Sample t-test
##
## data: extra by group
## t = -1.8608, df = 17.776, p-value = 0.07939
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.3654832 0.2054832
## sample estimates:
## mean in group 1 mean in group 2
## 0.75 2.33
As you can see, t.test outputs all of the information you need for a two-sample t-test: the p-value, the alternative hypothesis (H1), a confidence interval for the true difference in means, and the sample mean of each group, to name a few. Notice that the p-value is > 0.05, implying that there is not enough evidence to reject the null hypothesis; that is, we cannot conclude that the two group means are statistically different.
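If you need these pieces programmatically (for example, to report the p-value or confidence interval elsewhere), you can store the result and pull out individual elements. A minimal sketch; the object name welch_result is just for illustration, and the element names are those of the list ("htest" object) returned by t.test:
# Store the test result and extract individual components
welch_result <- t.test(extra~group, data = sleep)
welch_result$p.value   # the p-value
welch_result$conf.int  # the confidence interval for the difference in means
welch_result$estimate  # the sample mean of each group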
Use paired t-tests when observations in one group are paired with observations in the other. This can be done easily in R by adding paired = TRUE when calling t.test and/or pairwise.t.test.
t.test(extra~group, data = sleep, paired = TRUE)
##
## Paired t-test
##
## data: extra by group
## t = -4.0621, df = 9, p-value = 0.002833
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.4598858 -0.7001142
## sample estimates:
## mean of the differences
## -1.58
pairwise.t.test(sleep$extra, sleep$group, paired = TRUE)
##
## Pairwise comparisons using paired t tests
##
## data: sleep$extra and sleep$group
##
## 1
## 2 0.0028
##
## P value adjustment method: holm
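As an aside, if your two groups are stored as separate vectors rather than in a single long-format data frame, the same paired test can be run by passing the two vectors directly. A minimal sketch; the vector names drug1 and drug2 are just for illustration, and the sleep rows are ordered so that the subject IDs line up across the two groups:
# Equivalent paired test using two vectors instead of a formula
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]
t.test(drug1, drug2, paired = TRUE)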
So far, this tutorial has demonstrated how to do unpaired and paired t-tests. There are other adjustments we can make to these functions, and this section explains a few of the available options. These adjustments can also be passed to pairwise.t.test, but this section will only demonstrate them using t.test.
You can adjust the alternative hypothesis easily when calling the function. The default alternative is two-sided, which means the null hypothesis is \(H_0: \mu_1 - \mu_2 = 0\) and the alternative is \(H_A: \mu_1 - \mu_2 \neq 0\). For one-sided tests,
set alternative = "less" to test \(H_0: \mu_1 - \mu_2 \ge 0\) against \(H_A: \mu_1 - \mu_2 < 0\), and
set alternative = "greater" to test \(H_0: \mu_1 - \mu_2 \le 0\) against \(H_A: \mu_1 - \mu_2 > 0\).
Suppose we want to test the null hypothesis that the difference in average sleep increase between the two groups is \(\le 0\) against the alternative that it is \(> 0\), using the sleep dataset:
t.test(extra~group, data = sleep, alternative = "greater")
##
## Welch Two Sample t-test
##
## data: extra by group
## t = -1.8608, df = 17.776, p-value = 0.9603
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## -3.053381 Inf
## sample estimates:
## mean in group 1 mean in group 2
## 0.75 2.33
Notice that the p-value is > 0.05, implying that there is not enough evidence to reject the null hypothesis that the difference of the means is \(\le 0\).
You can also change the hypothesized value of the mean difference being tested by setting mu = N, where N is the value of the difference under the null hypothesis.
Suppose we want to test the null hypothesis that the difference in average sleep increase between the two groups is \(\ge 1\) against the alternative that it is \(< 1\), using the sleep dataset:
t.test(extra~group, data = sleep, alternative = "less", mu = 1)
##
## Welch Two Sample t-test
##
## data: extra by group
## t = -3.0385, df = 17.776, p-value = 0.003571
## alternative hypothesis: true difference in means is less than 1
## 95 percent confidence interval:
## -Inf -0.1066185
## sample estimates:
## mean in group 1 mean in group 2
## 0.75 2.33
Notice that the p-value is < 0.05, implying that there is enough evidence to reject the null; that is, we conclude that the difference of the means is < 1.
The default for t.test is to assume unequal variances (the Welch test). If you would like to test under the assumption that the two variances are equal, add var.equal = TRUE when you call the function. (Note that pairwise.t.test, by contrast, pools the standard deviation across groups by default; set pool.sd = FALSE there if you want separate variances.)
Suppose we want to test whether the average increases in sleep for the two groups are equal, assuming the variances of the two groups are equal, using the sleep dataset:
t.test(extra~group, data = sleep, var.equal = TRUE)
##
## Two Sample t-test
##
## data: extra by group
## t = -1.8608, df = 18, p-value = 0.07919
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.363874 0.203874
## sample estimates:
## mean in group 1 mean in group 2
## 0.75 2.33
Note that the p-value is > 0.05; therefore, there is not enough evidence to conclude that the means of the two groups are not equal.
The default confidence level of the interval is 0.95 (95%). You can adjust the confidence level by adding conf.level = c, where \(0 < c < 1\).
Suppose we want to test whether the average increases in sleep for the two groups are equal, at an 80% confidence level, using the sleep dataset:
t.test(extra~group, data = sleep, conf.level = .8)
##
## Welch Two Sample t-test
##
## data: extra by group
## t = -1.8608, df = 17.776, p-value = 0.07939
## alternative hypothesis: true difference in means is not equal to 0
## 80 percent confidence interval:
## -2.7101645 -0.4498355
## sample estimates:
## mean in group 1 mean in group 2
## 0.75 2.33
The p-value is unchanged, but the standard we compare it to is not: at an 80% confidence level the significance level is 0.20, and the p-value of 0.079 falls below it. Equivalently, the 80% confidence interval does not contain 0. At this lower confidence level, we therefore reject the null hypothesis and conclude that the two group means are statistically different.
In the previous section, we looked solely at the application of t.test. This section will focus on the advantages of using pairwise.t.test. Unlike t.test, pairwise.t.test only reports p-values for the comparisons. However, whereas t.test can only compare two samples at a time, pairwise.t.test can perform multiple pairwise comparisons and output a triangular matrix of the p-values between all groups.
In addition, pairwise.t.test adjusts the p-values using the “holm” method by default. You can set p.adjust.method to any of the following options:
p.adjust.methods
## [1] "holm" "hochberg" "hommel" "bonferroni" "BH"
## [6] "BY" "fdr" "none"
Using any of the above methods besides “none” helps control the overall error rate across the multiple comparisons. To show how pairwise.t.test works with multiple groups, we will use the airquality dataset.
Suppose we want to test whether there is a difference in mean Ozone between months.
attach(airquality)
# Convert the numeric Month codes (5-9) to a factor labeled with month abbreviations
Month <- factor(Month, labels = month.abb[5:9])
pairwise.t.test(Ozone, Month)
##
## Pairwise comparisons using t tests with pooled SD
##
## data: Ozone and Month
##
## May Jun Jul Aug
## Jun 1.00000 - - -
## Jul 0.00026 0.05113 - -
## Aug 0.00019 0.04987 1.00000 -
## Sep 1.00000 1.00000 0.00488 0.00388
##
## P value adjustment method: holm
As you can see, it outputs a lower triangular matrix of all of the pairwise comparisons. You can use this to determine which differences are statistically significant, and then look at the group means to determine which group is higher. For instance, the p-value for the difference in Ozone between May and July is very small, indicating a statistically significant difference. We can compute the mean Ozone for each of those months and compare them.
MayAvg <- airquality %>%
filter(Month == 5) %>%
summarize(mean(Ozone,na.rm = TRUE))
JulyAvg <- airquality %>%
filter(Month == 7) %>%
summarize(mean(Ozone,na.rm = TRUE))
MayAvg
## mean(Ozone, na.rm = TRUE)
## 1 23.61538
JulyAvg
## mean(Ozone, na.rm = TRUE)
## 1 59.11538
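For a quick look at every month at once, a base-R alternative is tapply; a minimal sketch:
# Mean Ozone for each month (base-R alternative to the dplyr code above)
tapply(airquality$Ozone, airquality$Month, mean, na.rm = TRUE)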
In July, the mean ozone in parts per billion is more than twice that in May.
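If you also want a confidence interval for that particular difference, rather than just an adjusted p-value, you can follow up with an individual two-sample test restricted to those two months. A minimal sketch (note that this single comparison is not adjusted for the other month-to-month tests):
# Follow-up Welch test comparing Ozone in May (Month == 5) and July (Month == 7) only
t.test(Ozone~Month, data = airquality, subset = Month %in% c(5, 7))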
You can easily do two-sample t-tests with built-in R functions. Throughout this tutorial you have learned how to do unpaired, paired, and multiple pairwise t-tests. In addition, you have learned how to change the default settings in order to customize the t-tests to get the results you need.