Comparing Population Means Using Two Sample Hypothesis Tests

2025-03-16

Introduction to Hypothesis Tests

Hypothesis testing is a statistical method used to evaluate an assumption about a population. When discussing hypothesis tests, it is important to know the following vocabulary:

Null Hypothesis (\(H_0\)): The statement initially assumed to be true in a hypothesis test. It assumes there is no relationship between the variable of interest and the outcome being measured (for example, that two population means are equal)
Alternate Hypothesis (\(H_A\)): A contradiction of the null hypothesis. It assumes there is a relationship between the variable of interest and the outcome being measured
Significance Level (\(\alpha\)): The maximum allowable probability of rejecting the null hypothesis when it is actually true (a Type I error). For example, \(\alpha\) = 0.05 means that the test has a 5% chance of rejecting a true null hypothesis
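The meaning of the significance level can be illustrated with a small simulation. The sketch below assumes two samples drawn from the same normal population, so the null hypothesis is true by construction; the seed, sample sizes, population values, and number of repetitions are all hypothetical choices made for illustration. The share of tests that reject the null hypothesis should land close to \(\alpha\).

```r
# Simulate many two-sample t-tests in which the null hypothesis is true
# (both samples come from the same normal population).
set.seed(1)        # arbitrary seed, for reproducibility only
alpha <- 0.05
n_tests <- 2000    # hypothetical number of simulated experiments

p_values <- replicate(n_tests, {
  x <- rnorm(30, mean = 10, sd = 2)  # hypothetical population values
  y <- rnorm(30, mean = 10, sd = 2)
  t.test(x, y)$p.value
})

# Share of tests that (incorrectly) reject the true null hypothesis
false_rejection_rate <- mean(p_values < alpha)
false_rejection_rate
```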

Assumptions

In order to conduct a two sample hypothesis test, the following must be assumed:

Samples are independent of one another
The populations are normally distributed

The t-distribution

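Two sample hypothesis tests use the t distribution, which describes how the t-score is distributed when the null hypothesis is true. It is bell-shaped like the normal distribution but has heavier tails, and it is usually drawn as a plot of t-score versus probability density.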

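t-score: indicates how far the observed difference between the sample means is from the difference expected under the null hypothesis, measured in standard errors
p-value: the probability, assuming the null hypothesis is true, of obtaining a t-score at least as extreme as the one observed
critical t-score: the t-score corresponding to \(\alpha\); the null hypothesis is rejected only when the observed t-score is more extreme than this value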

Evaluating Hypotheses using the t-distribution

In the t distribution, the total area in the tails beyond the critical t-scores is \(\alpha\). If \(p < \alpha\) for a given test, then the corresponding t-score lies outside the critical values, in one of the tails. If \(p > \alpha\), then the t-score lies between the critical values.
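The relationship between \(\alpha\), the critical t-scores, and the tail area can be checked numerically. This is a minimal sketch for a two-sided test; the degrees of freedom and the example t-score are hypothetical values chosen only for illustration.

```r
alpha <- 0.05
df <- 20   # hypothetical degrees of freedom, chosen only for illustration

# Critical t-scores for a two-sided test: alpha / 2 in each tail
t_crit <- qt(1 - alpha / 2, df)
c(lower = -t_crit, upper = t_crit)

# The area in both tails beyond the critical values adds up to alpha
tail_area <- pt(-t_crit, df) + (1 - pt(t_crit, df))
tail_area

# A hypothetical t-score between the critical values has p > alpha
t_inside <- 1.5
p_inside <- 2 * pt(-abs(t_inside), df)
p_inside > alpha
```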

Results of the Hypothesis Test

To conduct a hypothesis test, the data for each sample is first used to calculate a t-score from the sample means, standard deviations, and sample sizes. The t-score is then converted into a p-value (via the pt() function in R); the critical t-score for a chosen \(\alpha\) can be found with the qt() function. The p-value is then evaluated according to the following guidelines:

\(p < \alpha\): \(H_0\) can be rejected because data this extreme would be unlikely if \(H_0\) were true, according to the selected significance level
\(p > \alpha\): \(H_0\) cannot be rejected because the data are not unusual enough, at the chosen significance level, to rule out \(H_0\)

A full hypothesis test can also be conducted using the t.test() function in R.
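The steps in this section can be sketched by hand and compared against t.test(). The example assumes the Welch form of the test (variances not assumed equal), which is what t.test() uses by default; the two samples are made-up values for illustration only.

```r
# Two small hypothetical samples (made-up values, for illustration only)
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5)
y <- c(4.2, 4.8, 5.0, 4.4, 5.3, 4.6)

# Squared standard error of each sample mean (variances not assumed equal)
vx <- var(x) / length(x)
vy <- var(y) / length(y)
se <- sqrt(vx + vy)

# t-score: difference in sample means, in units of standard error
t_score <- (mean(x) - mean(y)) / se

# Welch-Satterthwaite approximation for the degrees of freedom
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_score), df)

# t.test() should report the same numbers
check <- t.test(x, y)
c(manual = t_score, t.test = unname(check$statistic))
c(manual = p_value, t.test = check$p.value)
```

Doing the calculation manually is mainly useful for understanding; in practice t.test() is less error-prone.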

Example: Supplement Delivery and Tooth Growth

Using the R data set ToothGrowth, which measures the length of the cells responsible for tooth growth in guinea pigs given vitamin C supplements in varying doses via either orange juice or ascorbic acid, the following null and alternate hypotheses could be formulated:

\(H_0\): The mean cell length is the same for guinea pigs given supplements via orange juice and ascorbic acid
\(H_A\): The mean cell length is different for guinea pigs given supplements via orange juice versus ascorbic acid

Mathematical Expression: Supplement Delivery and Tooth Growth

These hypotheses can be mathematically formulated as shown below:

\[H_0: \mu_j = \mu_a\] \[H_A: \mu_j \neq \mu_a\]

where \(\mu_j\) refers to the mean cell length of guinea pigs supplemented via orange juice and \(\mu_a\) refers to the mean cell length of guinea pigs supplemented via ascorbic acid.

Verifying Assumptions

To ensure a hypothesis test can be conducted, the data set must meet the required assumptions:

Independent Samples: Each entry represents a unique guinea pig, so samples are independent
Normality: The samples are shown to be approximately normal in the Q-Q plots below
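Q-Q plots for checking normality could be produced along the following lines. This is a sketch using base R graphics, not necessarily the code behind the original plots; the variable name len_by_supp is introduced here for illustration.

```r
data("ToothGrowth")

# Split the cell lengths by delivery method ("OJ" and "VC")
len_by_supp <- split(ToothGrowth$len, ToothGrowth$supp)

# One Q-Q plot per group; points close to the line suggest approximate normality
par(mfrow = c(1, 2))
for (group in names(len_by_supp)) {
  qqnorm(len_by_supp[[group]], main = paste("Normal Q-Q plot:", group))
  qqline(len_by_supp[[group]])
}
par(mfrow = c(1, 1))  # reset the plotting layout
```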

Evaluation of Hypotheses: Supplement Delivery and Tooth Growth

The following code can be used to conduct a hypothesis test to determine whether the average cell length is related to supplement delivery method (assuming \(\alpha = 0.05\)):

library(dplyr)  # provides %>%, filter(), and pull()
data("ToothGrowth")
# separate the cell lengths for guinea pigs given orange juice 
# (listed as "OJ" in the data frame) and ascorbic acid 
# (listed as "VC" in the data frame)
# pull() returns each group as a numeric vector
length_j <- ToothGrowth %>% filter(supp == "OJ") %>% pull(len)
length_a <- ToothGrowth %>% filter(supp == "VC") %>% pull(len)
# run a Welch two sample t test
# not paired because each measurement comes from a different guinea pig
t.test(x = length_j, y = length_a, alternative = "two.sided", 
       mu = 0, conf.level = 0.95, paired = FALSE, var.equal = FALSE)

Evaluation of Hypotheses: Supplement Delivery and Tooth Growth

The results of this t-test (as shown below) indicate that \(p > \alpha\) (0.06063 > 0.05), therefore the null hypothesis cannot be rejected. This means that there is not statistically significant evidence that the mean cell length differs based on supplement delivery method.

## 
##  Welch Two Sample t-test
## 
## data:  length_j and length_a
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333
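The decision can also be read from the test object rather than from the printed output. The sketch below uses the formula interface of t.test() as an alternative way to run the same Welch test; the names alpha and result are chosen here for illustration. The 95 percent confidence interval for the difference in means includes 0, which is consistent with not rejecting the null hypothesis.

```r
data("ToothGrowth")
alpha <- 0.05

# Formula interface: compare len between the two levels of supp (OJ and VC)
result <- t.test(len ~ supp, data = ToothGrowth)

result$p.value            # p-value of the test
result$conf.int           # 95% confidence interval for the difference in means
result$p.value < alpha    # TRUE would mean rejecting the null hypothesis
```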

Further Visualizations: Supplement Delivery and Tooth Growth

Although the hypothesis test showed that there is not a statistically significant difference between the mean cell length of guinea pigs given supplements via juice versus ascorbic acid, the samples do have different means, as highlighted in the box plot below. A larger sample size, or a less strict significance level chosen in advance (for example \(\alpha = 0.10\)), could lead to the null hypothesis being rejected in future tests.
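A box plot comparing the two groups could be drawn with base R as sketched here. The axis labels and title are illustrative choices, and this is not necessarily the code used for the original figure.

```r
data("ToothGrowth")

# Side-by-side box plots of cell length for each delivery method
boxplot(len ~ supp, data = ToothGrowth,
        xlab = "Supplement delivery method",
        ylab = "Cell length",
        main = "Cell length by delivery method")

# Sample means for each group, for comparison with the t-test output
group_means <- tapply(ToothGrowth$len, ToothGrowth$supp, mean)
group_means
```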