Hypothesis Testing

03/16/2025

Introduction

Hypothesis testing is a statistical method used to make inferences about a population parameter based on a sample. It involves formulating a null hypothesis \(H_0\) and an alternative hypothesis \(H_A\), then using a statistical test to determine whether to reject \(H_0\).

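Hypothesis tests are often used to assess whether a difference between two samples reflects a real difference between the populations from which the samples were taken. A null hypothesis of ‘no difference’ is taken as the starting point, and we calculate the p-value: the probability of observing a difference at least as extreme as the one in the data if the null hypothesis were true. The p-value is not the probability that the null hypothesis is true. The same logic applies when a single sample is compared against a hypothesized value, as in the example below.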

Load Required Libraries

data(mtcars)

library(ggplot2)
library(plotly)


Example: Testing the Mean MPG

We will test, at the 5% significance level, whether the mean miles per gallon (mpg) of the 32 cars in the mtcars dataset differs significantly from 20.

Hypotheses:

\(H_0: \mu = 20\) (The true mean mpg is 20)
\(H_A: \mu \neq 20\) (The true mean mpg is not 20)
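
For reference, the test statistic for a one-sample t-test is conventionally written as below. The symbols are not taken from the original analysis: \(\bar{x}\) is assumed to denote the sample mean, \(s\) the sample standard deviation, \(n\) the sample size, and \(\mu_0 = 20\) the hypothesized mean.

\[
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1
\]

Under \(H_0\), and assuming the data are approximately normally distributed, this statistic follows a t distribution with \(n - 1\) degrees of freedom.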

Perform t-test

t_test <- t.test(mtcars$mpg, mu = 20)
t_test

## 
##  One Sample t-test
## 
## data:  mtcars$mpg
## t = 0.08506, df = 31, p-value = 0.9328
## alternative hypothesis: true mean is not equal to 20
## 95 percent confidence interval:
##  17.91768 22.26357
## sample estimates:
## mean of x 
##  20.09062
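
To see where these numbers come from, the same quantities can be recomputed by hand in base R. This is a minimal sketch rather than part of the original analysis; the variable names (`x`, `mu0`, `se`, `t_manual`, `p_manual`, `ci_manual`) are illustrative.

```r
# Recompute the one-sample t-test by hand (base R only)
x   <- mtcars$mpg
mu0 <- 20                        # hypothesized mean under H0
n   <- length(x)                 # sample size
se  <- sd(x) / sqrt(n)           # standard error of the mean

t_manual  <- (mean(x) - mu0) / se                          # t statistic
p_manual  <- 2 * pt(-abs(t_manual), df = n - 1)            # two-sided p-value
ci_manual <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se  # 95% CI

round(c(t = t_manual, p = p_manual), 4)
round(ci_manual, 4)
```

The results should agree with the statistic, p-value, and confidence interval reported by `t.test()`.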

Histogram of MPG (ggplot)
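
The figure itself is not included in this text. A minimal sketch of how such a histogram might be drawn with ggplot2 follows; the bin width, colors, and the dashed reference line at the hypothesized mean are assumptions, not the original settings.

```r
library(ggplot2)

# Histogram of mpg with the hypothesized mean (20) marked for reference.
# binwidth = 2 and the colors are illustrative choices.
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2, fill = "steelblue", color = "white") +
  geom_vline(xintercept = 20, linetype = "dashed", color = "red") +
  labs(title = "Histogram of MPG", x = "Miles per gallon", y = "Count")

p_hist
```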

Boxplot of MPG (ggplot)
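
The figure itself is not included in this text. One possible way to draw a single boxplot of mpg with ggplot2 is sketched here; the styling and the dashed line at 20 are assumptions added for illustration.

```r
library(ggplot2)

# Single boxplot of mpg; the empty string on the x axis gives one group.
# The dashed line marks the hypothesized mean of 20.
p_box <- ggplot(mtcars, aes(x = "", y = mpg)) +
  geom_boxplot(fill = "lightgray", width = 0.4) +
  geom_hline(yintercept = 20, linetype = "dashed", color = "red") +
  labs(title = "Boxplot of MPG", x = NULL, y = "Miles per gallon")

p_box
```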

3D Visualization (Plotly)
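
The figure itself is not included in this text, and the variables it showed are not stated. The sketch below assumes a 3D scatter plot of weight (`wt`), horsepower (`hp`), and `mpg`; that choice of axes is purely illustrative.

```r
library(plotly)

# Assumption: the original figure's variables are unknown; wt, hp and mpg
# are used here only as an example of a 3D scatter plot.
p_3d <- plot_ly(
  data = mtcars,
  x = ~wt, y = ~hp, z = ~mpg,
  type = "scatter3d", mode = "markers"
)

p_3d
```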

R Code for Hypothesis Test

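t_test <- t.test(mtcars$mpg, mu = 20)
# summary() on an "htest" object only lists its components and their types;
# print t_test itself to see the test results.
summary(t_test)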

##             Length Class  Mode     
## statistic   1      -none- numeric  
## parameter   1      -none- numeric  
## p.value     1      -none- numeric  
## conf.int    2      -none- numeric  
## estimate    1      -none- numeric  
## null.value  1      -none- numeric  
## stderr      1      -none- numeric  
## alternative 1      -none- character
## method      1      -none- character
## data.name   1      -none- character
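
Because the result of `t.test()` is a list of class "htest", the individual values are usually more useful than the summary table. A short sketch of pulling them out by name:

```r
t_test <- t.test(mtcars$mpg, mu = 20)

# An "htest" object is a list; its components are accessed by name.
names(t_test)

t_test$statistic   # t statistic
t_test$parameter   # degrees of freedom
t_test$p.value     # two-sided p-value
t_test$conf.int    # 95% confidence interval for the mean
t_test$estimate    # sample mean
```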

Conclusion

In this analysis, we performed a hypothesis test to determine whether the mean miles per gallon (mpg) in the mtcars dataset is significantly different from 20. We set up our null hypothesis \(H_0\) stating that the true mean mpg is 20 and our alternative hypothesis \(H_A\) stating that the mean mpg is not 20. Using a t-test, we computed the test statistic and p-value to assess the statistical significance of our findings.

The p-value obtained from the t-test is 0.9328, far above the 0.05 significance level, so we fail to reject the null hypothesis \(H_0\). Consistent with this, the 95% confidence interval (17.92, 22.26) contains 20. There is not enough statistical evidence to conclude that the mean mpg differs from 20; this does not prove that the mean is exactly 20.
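
The decision rule can also be written out in code. This is a sketch under the 0.05 significance level used in the text; the names `alpha`, `reject_h0`, and `mu0_in_ci` are illustrative.

```r
alpha  <- 0.05                           # significance level used in the text
t_test <- t.test(mtcars$mpg, mu = 20)

reject_h0 <- t_test$p.value < alpha      # TRUE means reject H0
ci        <- t_test$conf.int             # 95% confidence interval
mu0_in_ci <- ci[1] <= 20 && 20 <= ci[2]  # does the interval contain 20?

if (reject_h0) {
  message("Reject H0: the mean mpg differs from 20 at the 5% level.")
} else {
  message("Fail to reject H0: no evidence that the mean mpg differs from 20.")
}
```

For a two-sided test, failing to reject \(H_0\) at the 5% level is equivalent to the 95% confidence interval containing the hypothesized value, so the two checks always agree.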

This example shows the general workflow of a hypothesis test: state \(H_0\) and \(H_A\), choose a significance level, compute the test statistic and p-value, and draw a conclusion about the population from sample data.