2025-02-09

Hypothesis Testing

Hypothesis testing, also known as significance testing, uses sample evidence to assess a claim about a population parameter. The outcome is a decision to reject or fail to reject that claim; a test never confirms a hypothesis.

Components

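  • \(H_0\) (Null Hypothesis): The default claim about the population parameter, usually a statement of equality or no effect (for example, \(\mu = \mu_0\)).
  • \(H_a\) (Alternative Hypothesis): The claim that contradicts the null hypothesis (for example, \(\mu \neq \mu_0\)).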

Hypothesis testing lets researchers draw conclusions about a population from a sample while controlling how often a true null hypothesis is wrongly rejected.

Steps

1.) State the null and alternative hypotheses.

2.) Choose a significance level, \(\alpha\). (The probability we are willing to accept of rejecting the null hypothesis when it is actually true; 0.05 is a common choice)

3.) Calculate the test statistic.

4.) Determine the p-value or critical value.

5.) Draw a conclusion. (Reject or fail to reject the null hypothesis)

The test statistic for a one-sample t-test is calculated as:

\[ t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \]

where:

  • \(\bar{X}\) is the sample mean,
  • \(\mu_0\) is the hypothesized population mean,
  • \(s\) is the sample standard deviation,
  • \(n\) is the sample size.
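
The five steps and the formula above can be sketched directly in R. This is a minimal illustration, not a definitive implementation: the sample vector `x`, the hypothesized mean `mu0`, and the variable names are all assumptions made up for the example.

```r
# Hypothetical sample, invented for illustration only
x     <- c(21.5, 19.8, 22.1, 18.4, 20.9, 23.0, 19.2, 21.7)
mu0   <- 20    # Step 1: H0: mu = 20 versus Ha: mu != 20
alpha <- 0.05  # Step 2: significance level

# Step 3: test statistic t = (xbar - mu0) / (s / sqrt(n))
n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Step 4: two-sided p-value and critical value from the t distribution
# with n - 1 degrees of freedom
deg_free <- n - 1
p_value  <- 2 * pt(-abs(t_stat), deg_free)
t_crit   <- qt(1 - alpha / 2, deg_free)

# Step 5: reject H0 when p_value < alpha (equivalently, |t| > t_crit)
reject <- p_value < alpha

# Cross-check the hand calculation against R's built-in t.test()
check <- t.test(x, mu = mu0)
all.equal(unname(check$statistic), t_stat)
all.equal(check$p.value, p_value)
```

Computing the statistic by hand makes it clear that t.test() applies the same formula; in practice, calling t.test() alone is sufficient.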

Applying Hypothesis Testing

To demonstrate how to perform a hypothesis test, I will examine mtcars, a data set built into R.

As an example, I will test whether the mean miles per gallon (MPG) differs from a hypothesized value of 20.

Visualizing the Data

Code to Perform a T-test

# One-sample, two-sided t-test of H0: mu = 20 on the mpg column
result <- t.test(mtcars$mpg, mu = 20)
result
## 
##  One Sample t-test
## 
## data:  mtcars$mpg
## t = 0.08506, df = 31, p-value = 0.9328
## alternative hypothesis: true mean is not equal to 20
## 95 percent confidence interval:
##  17.91768 22.26357
## sample estimates:
## mean of x 
##  20.09062
  • t.test() performs a one-sample t-test when given a single numeric vector; the test is two-sided by default.
  • mu = 20 is the hypothesized mean value for the population.
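
The printed summary is a view of a list-like "htest" object, so individual values can be pulled out by name. The sketch below reruns the same test, reads its components, and recomputes the t statistic from the formula as a sanity check; the name `t_manual` is introduced here for illustration.

```r
result <- t.test(mtcars$mpg, mu = 20)

# Components stored on the returned "htest" object
result$statistic   # t statistic
result$parameter   # degrees of freedom (n - 1)
result$p.value     # two-sided p-value
result$conf.int    # 95 percent confidence interval for the mean
result$estimate    # sample mean

# Recompute the t statistic from (xbar - mu0) / (s / sqrt(n))
x        <- mtcars$mpg
t_manual <- (mean(x) - 20) / (sd(x) / sqrt(length(x)))
all.equal(unname(result$statistic), t_manual)
```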

Sample Mean Vs. Hypothesized Mean

Visualizing the Sample Distribution and Hypothesized Mean
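
The original figure is not reproduced here. As a rough stand-in, the following base R sketch draws one plausible version: a histogram of MPG with the sample mean and the hypothesized mean marked. The plot type, bin count, colors, and labels are all assumptions, not the author's original choices.

```r
x   <- mtcars$mpg
mu0 <- 20  # hypothesized population mean

# Histogram of the observed MPG values
hist(x, breaks = 10,
     main = "Distribution of MPG in mtcars",
     xlab = "Miles per gallon",
     col = "grey85", border = "white")

# Solid line: sample mean; dashed line: hypothesized mean
abline(v = mean(x), col = "blue", lwd = 2)
abline(v = mu0, col = "red", lwd = 2, lty = 2)
legend("topright",
       legend = c("Sample mean", "Hypothesized mean (20)"),
       col = c("blue", "red"), lwd = 2, lty = c(1, 2), bty = "n")
```

With a sample mean of about 20.09, the two vertical lines nearly overlap, which matches the very small t statistic.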

Conclusion

Null Hypothesis: The population mean MPG is equal to the hypothesized value of 20. \(H_0: \mu = 20\)

Alternative Hypothesis: The population mean MPG is not equal to the hypothesized value of 20. \(H_a: \mu \neq 20\)

From the results of t.test(), the t-statistic is 0.085 with 31 degrees of freedom and the p-value is 0.9328. t.test() reports a 95 percent confidence interval by default, which corresponds to a significance level of 0.05. Since the p-value (0.9328) is greater than 0.05, we fail to reject the null hypothesis. Equivalently, the 95 percent confidence interval (17.92 to 22.26) contains 20. There is insufficient evidence to conclude that the population mean MPG differs from 20. This does not prove that the population mean equals 20; it only means the sample is consistent with that value.
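
The decision can also be written out in code. The sketch below checks the same conclusion three equivalent ways for a two-sided test: by p-value, by critical value, and by confidence interval. The variable names are illustrative assumptions.

```r
alpha  <- 0.05
mu0    <- 20
result <- t.test(mtcars$mpg, mu = mu0, conf.level = 1 - alpha)

# p-value approach: reject H0 only when p < alpha
reject_by_p <- result$p.value < alpha

# Critical-value approach: reject H0 only when |t| exceeds the cutoff
t_crit      <- qt(1 - alpha / 2, df = unname(result$parameter))
reject_by_t <- abs(unname(result$statistic)) > t_crit

# Confidence-interval approach: reject H0 only when mu0 lies outside the interval
ci           <- result$conf.int
reject_by_ci <- mu0 < ci[1] || mu0 > ci[2]

c(p_value = reject_by_p, critical = reject_by_t, interval = reject_by_ci)
```

All three checks come out FALSE for this data, so each approach leads to the same decision: fail to reject the null hypothesis.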