Big Picture

Goal: Decide whether an observed pattern in data could plausibly be explained by random chance alone.

We will answer a concrete question using real data:

Do manual cars have higher MPG than automatic cars (on average)?

Hypotheses

We compare mean MPG between the two groups.

\[ H_0: \mu_{\text{manual}} - \mu_{\text{automatic}} = 0 \]

\[ H_A: \mu_{\text{manual}} - \mu_{\text{automatic}} > 0 \]

Significance level: \(\alpha = 0.05\)

Visual Comparison: MPG by transmission
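
One way such a comparison could be drawn is with side-by-side boxplots. This is a minimal sketch, not necessarily the code behind the original figure: it assumes the data are R's built-in `mtcars` dataset, with `am` recoded into the labels Automatic and Manual that appear in the test output. The names `cars` and `group_means` are illustrative.

```r
# Assumption: built-in mtcars data, where am is coded 0 = automatic, 1 = manual.
cars <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Side-by-side boxplots of MPG for each transmission type.
boxplot(mpg ~ am, data = cars,
        xlab = "Transmission", ylab = "Miles per gallon (MPG)",
        main = "MPG by transmission")

# Group means, to read alongside the plot.
group_means <- tapply(cars$mpg, cars$am, mean)
print(round(group_means, 2))
```

The boxplots show the spread within each group as well as the centers, which is what the t-test weighs the difference in means against.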

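Hypothesis test result

R orders the groups as Automatic, then Manual, so the output below tests whether the Automatic mean minus the Manual mean is greater than 0, the reverse of \(H_A\). For the stated \(H_A\) (manual higher), the one-sided p-value is the complement, \(1 - 0.9993 \approx 0.0007 < \alpha\), so we reject \(H_0\).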

## 
##  Welch Two Sample t-test
## 
## data:  mpg by am
## t = -3.7671, df = 18.332, p-value = 0.9993
## alternative hypothesis: true difference in means between group Automatic and group Manual is greater than 0
## 95 percent confidence interval:
##  -10.57662       Inf
## sample estimates:
## mean in group Automatic    mean in group Manual 
##                17.14737                24.39231
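
To test the stated alternative directly, the direction passed to `t.test` has to match the order of the factor levels. The sketch below assumes the built-in `mtcars` data and the Automatic/Manual recoding seen in the output; the source does not show the original call, so the variable names (`cars`, `res`, `reject_null`) are illustrative.

```r
# Assumption: built-in mtcars data, where am is coded 0 = automatic, 1 = manual.
cars <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# The formula interface takes the difference as (first level) - (second level),
# i.e. Automatic - Manual. "Manual has higher mean MPG" is therefore the
# "less" alternative. Welch's test (unequal variances) is the default.
res <- t.test(mpg ~ am, data = cars, alternative = "less", conf.level = 0.95)
print(res)

# Decision at the chosen significance level.
alpha <- 0.05
reject_null <- res$p.value < alpha
print(reject_null)
```

An equivalent choice is to reorder the factor levels so that Manual comes first and keep `alternative = "greater"`; either way, the alternative must describe the same difference that R computes.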

What is a p-value?

A p-value is defined as:

\[ p = P(\text{data at least as extreme as observed} \mid H_0 \text{ is true}) \]

If \(p < \alpha\), we reject \(H_0\): the data provide evidence against the null hypothesis at significance level \(\alpha\).
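
As an illustration of the definition, a one-sided p-value is a tail probability of the t distribution evaluated at the observed statistic. The sketch below plugs in the rounded `t` and `df` values reported in the test output, so the results are approximate; the variable names are illustrative.

```r
t_stat <- -3.7671  # t statistic from the output (Automatic - Manual)
df     <- 18.332   # Welch degrees of freedom from the output
alpha  <- 0.05

# Upper tail, P(T >= t_stat): the "greater" alternative (Automatic - Manual > 0).
p_greater <- pt(t_stat, df, lower.tail = FALSE)

# Lower tail, P(T <= t_stat): the "less" alternative (manual mean is higher).
p_less <- pt(t_stat, df)

print(c(greater = p_greater, less = p_less))
print(p_less < alpha)
```

The two tails sum to 1, which is why choosing the wrong direction turns a very small p-value into one close to 1.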

Relationship between MPG and weight

Plotly interactive scatter: MPG vs Weight
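
A scatter plot like this could be produced with the `plotly` R package. This is a hedged sketch, not the original figure code: it assumes `plotly` is installed and that the data are the built-in `mtcars` dataset, where `wt` is weight in units of 1000 lbs.

```r
library(plotly)  # assumption: the plotly package is installed

# Interactive scatter of MPG against weight; hovering shows the car model.
fig <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  text = rownames(mtcars)
)

# Axis labels and title.
fig <- layout(
  fig,
  title = "MPG vs Weight",
  xaxis = list(title = "Weight (1000 lbs)"),
  yaxis = list(title = "Miles per gallon (MPG)")
)

fig
```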

Conclusion

  • Hypothesis testing helps determine whether an observed difference could plausibly be due to chance alone.

  • A small p-value (below \(\alpha\)) is evidence against the null hypothesis; a large one means the data are consistent with it.

  • Visualizations help support and interpret statistical results.