This lecture corresponds to Section 7.3 (Difference of Two Means) of the textbook and provides a foundation for Analysis of Variance (ANOVA), a statistical technique for comparing many means.
The t-distribution can be used for inference when working with the standardized difference of two means if (1) each sample meets the usual conditions for using the t-distribution (independent observations and approximate normality), and (2) the two samples are independent of each other.
An instructor decided to run two slight variations of the same exam. Prior to passing out the exams, she shuffled them together to ensure each student received a random version. Anticipating complaints from students who took different versions, she would like to evaluate whether the difference observed between the groups is so large that it provides convincing evidence that one version was more difficult (on average) than the other.
In this case, we have two unrelated (i.e., independent or unpaired) groups of samples. Therefore, it is possible to use an independent two-sample t-test to evaluate whether the means are different.
Typical research questions are:

1. whether the mean of group A (\(\mu_A\)) is equal to the mean of group B (\(\mu_B\));
2. whether the mean of group A is less than the mean of group B;
3. whether the mean of group A is greater than the mean of group B.
We can define the corresponding null and alternative hypotheses as follows:

1. \(H_0: \mu_A = \mu_B\) versus \(H_A: \mu_A \neq \mu_B\) (two-sided);
2. \(H_0: \mu_A = \mu_B\) versus \(H_A: \mu_A < \mu_B\) (one-sided);
3. \(H_0: \mu_A = \mu_B\) versus \(H_A: \mu_A > \mu_B\) (one-sided).

In each case, the test statistic is
\[t = \frac{\bar{x}_A - \bar{x}_B}{SE_{\bar{x}_A - \bar{x}_B}}\]

Question: What should we use to estimate the standard error (\(SE\)) of our statistic?
Answer: It depends on whether we assume the variances of the two groups are equal.
Equal Variance (homoscedasticity)
The pooled variance (\(S_p^2\)) can be used to estimate the variance of both groups when we assume the variance of each group is the same. It can be calculated as follows:
\[S_p^2 = \frac{(n_A -1)s_A^2 + (n_B-1)s_B^2}{n_A + n_B -2}\]
The standard error for our statistic is
\[SE_{\bar{x}_A - \bar{x}_B} = \sqrt{\frac{S_p^2}{n_A}+ \frac{S_p^2}{n_B}}\]
with degrees of freedom \(df = n_A+n_B - 2\).
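To make these formulas concrete, here is a minimal R sketch that computes the pooled variance, standard error, and degrees of freedom. The summary values n_A, s_A, n_B, and s_B are hypothetical placeholders, not our exam data.

# Hypothetical summary statistics (placeholders, not the exam data)
n_A <- 30; s_A <- 5.0   # sample size and standard deviation for group A
n_B <- 25; s_B <- 6.0   # sample size and standard deviation for group B

# Pooled variance: a weighted average of the two sample variances
S_p2 <- ((n_A - 1) * s_A^2 + (n_B - 1) * s_B^2) / (n_A + n_B - 2)

# Standard error of the difference in means under equal variances
SE <- sqrt(S_p2 / n_A + S_p2 / n_B)

# Degrees of freedom
df <- n_A + n_B - 2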
Unequal Variance (heteroscedasticity)
The Welch t-test can be used when we assume unequal variances between the two groups. The standard error is
\[SE_{\bar{x}_A - \bar{x}_B} = \sqrt{\frac{s_A^2}{n_A}+ \frac{s_B^2}{n_B}}\]
but the degrees of freedom are calculated using a more complicated formula (the Welch-Satterthwaite approximation). The book suggests using the smaller of \(n_A-1\) and \(n_B-1\) as a conservative shortcut. We will let the software calculate the approximate degrees of freedom.
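For reference, the Welch-Satterthwaite approximation itself is straightforward to code. A minimal sketch, again using hypothetical summary values:

# Hypothetical summary statistics (placeholders, not the exam data)
n_A <- 30; s_A <- 5.0
n_B <- 25; s_B <- 6.0

# Per-group variance contributions
v_A <- s_A^2 / n_A
v_B <- s_B^2 / n_B

# Standard error without pooling
SE <- sqrt(v_A + v_B)

# Welch-Satterthwaite approximate degrees of freedom
df <- (v_A + v_B)^2 / (v_A^2 / (n_A - 1) + v_B^2 / (n_B - 1))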
The first step should always be to visualize the data if possible.
Scores = read.csv("C:\\Users\\jbloda\\Desktop\\STAT 3006\\Fall 2022\\Lessons\\Chapter 7\\7.3\\Scores.csv")
head(Scores)
## Class Grade
## 1 A 76
## 2 A 80
## 3 A 72
## 4 A 100
## 5 A 68
## 6 A 56
boxplot(Scores$Grade ~ Scores$Class)
Let’s create vectors of scores for each class.
Class_A = subset(Scores$Grade, Scores$Class=='A')
Class_B = subset(Scores$Grade, Scores$Class=='B')
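Before running the test, it is worth confirming the group sizes, since they determine the degrees of freedom:

# Group sizes; these determine the degrees of freedom of the test
table(Scores$Class)
length(Class_A)
length(Class_B)

Note that t.test() accepts either a formula (Grade ~ Class) or two separate vectors; both calls below give identical results.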
t.test(Scores$Grade ~ Scores$Class, mu = 0, alternative = 'two.sided', var.equal = TRUE)
##
## Two Sample t-test
##
## data: Scores$Grade by Scores$Class
## t = -6.9811, df = 301, p-value = 1.875e-11
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
## -12.461581 -6.981006
## sample estimates:
## mean in group A mean in group B
## 77.97101 87.69231
t.test(Class_A, Class_B, mu = 0, alternative = 'two.sided', var.equal = TRUE)
##
## Two Sample t-test
##
## data: Class_A and Class_B
## t = -6.9811, df = 301, p-value = 1.875e-11
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -12.461581 -6.981006
## sample estimates:
## mean of x mean of y
## 77.97101 87.69231
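As a sanity check, we can reproduce the equal-variance test statistic directly from the formulas above:

n_A <- length(Class_A); n_B <- length(Class_B)
s_A <- sd(Class_A);     s_B <- sd(Class_B)

# Pooled variance and standard error
S_p2 <- ((n_A - 1) * s_A^2 + (n_B - 1) * s_B^2) / (n_A + n_B - 2)
SE   <- sqrt(S_p2 / n_A + S_p2 / n_B)

# t statistic; should match t = -6.9811 in the output above
t_stat <- (mean(Class_A) - mean(Class_B)) / SE

# Two-sided p-value; should match 1.875e-11 in the output above
2 * pt(-abs(t_stat), df = n_A + n_B - 2)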
To check the normality assumption, we can use visual techniques such as QQ-plots or statistical tests. Visual techniques will suffice for this assumption.
I prefer using the normal QQ-plot from the ‘car’ package. You only need to install the package once.
install.packages("car")
The following will create QQ plots for each of the groups.
library(car)
## Loading required package: carData
qqPlot(Class_A, pch = 20)
## [1] 17 51
qqPlot(Class_B, pch = 20)
## [1] 199 227
For the assumption of Normality to be satisfied, the points should follow a straight line. The scores for class B clearly do not follow a Normal distribution, so a t-test is not strictly appropriate. For the purposes of this lesson, we will ignore the violation of this assumption.
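If a formal check is preferred, one common statistical test of normality is the Shapiro-Wilk test, available in base R as shapiro.test(); a small p-value is evidence against normality. A quick sketch (output not shown here):

# Shapiro-Wilk normality test for each class (H0: data are Normal)
shapiro.test(Class_A)
shapiro.test(Class_B)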
As seen in the boxplot, the equal-variance assumption does not appear to be satisfied either. The standard deviations of the two groups can be compared as well.
sd(Class_A)
## [1] 12.99091
sd(Class_B)
## [1] 9.17775
A statistical test for equal variances can also be performed. In R, the \(F\) test for equal variances is
var.test(Class_A, Class_B)
##
## F test to compare two variances
##
## data: Class_A and Class_B
## F = 2.0036, num df = 68, denom df = 233, p-value = 0.0001427
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 1.392117 3.004096
## sample estimates:
## ratio of variances
## 2.003581
The null hypothesis of the test is that the variances are equal, i.e., that the ratio of the variances equals 1. The test rejects the null hypothesis, so it would be unwise to use the two-sample t-test that assumes equal variance.
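Note that the reported F statistic is simply the ratio of the two sample variances:

# Ratio of sample variances; should match F = 2.0036 in the output above
var(Class_A) / var(Class_B)

We therefore rerun the test with var.equal = FALSE, which performs the Welch t-test.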
t.test(Scores$Grade ~ Scores$Class, mu = 0, alternative = 'two.sided', var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: Scores$Grade by Scores$Class
## t = -5.8036, df = 88.926, p-value = 9.838e-08
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
## -13.049633 -6.392953
## sample estimates:
## mean in group A mean in group B
## 77.97101 87.69231
The p-value of the test is \(9.8\times10^{-8}\), which is less than our significance level of \(\alpha=0.05\). We can reject our null hypothesis that the average exam grades for each class are the same and conclude that the exam grades of the two classes are significantly different, with class B averaging a higher score than class A.
Class B scored on average 9.72 points higher on the exam than Class A. We are \(95\%\) confident that the mean score for Class B is between 6.39 and 13.05 points higher than Class A.
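As a final check, the Welch confidence interval can be reconstructed from its pieces: the difference in means, the unpooled standard error, and the degrees of freedom (df = 88.926, taken from the Welch output above).

diff_means <- mean(Class_A) - mean(Class_B)   # about -9.72
SE_welch   <- sqrt(var(Class_A) / length(Class_A) + var(Class_B) / length(Class_B))
t_crit     <- qt(0.975, df = 88.926)          # critical value using the Welch df

# Should match the interval (-13.05, -6.39) reported above
diff_means + c(-1, 1) * t_crit * SE_welch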