Ben Wilcox DAT301: HW3

2025-03-25

Introduction to P-Value

The p-value is one of the most important concepts in hypothesis testing.
It is the probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true.

If we let \(H_0\) be the null hypothesis, then:

\[ p\text{-value} = P(\text{results at least as extreme as those observed} \mid H_0 \text{ is true}) \]

The p-value measures the strength of the evidence against \(H_0\): the smaller the p-value, the stronger the evidence.
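
As a rough illustration of this definition, the sketch below estimates a p-value by simulation. The coin-toss scenario (60 heads in 100 tosses, with \(H_0\): the coin is fair), the seed, and all variable names are hypothetical and not part of the original example. The estimate is compared with the exact value from base R's `binom.test()`.

```r
# Hypothetical scenario: 60 heads observed in 100 tosses; H0: the coin is fair.
set.seed(301)                      # arbitrary seed, for reproducibility only
n_tosses <- 100
observed_heads <- 60

# Simulate the number of heads in many experiments where H0 is true.
simulated_heads <- rbinom(100000, size = n_tosses, prob = 0.5)

# Two-sided p-value: share of simulated results at least as far from the
# expected 50 heads as the observed result.
expected_heads <- n_tosses * 0.5
p_simulated <- mean(abs(simulated_heads - expected_heads) >=
                      abs(observed_heads - expected_heads))

# Exact binomial p-value for comparison.
p_exact <- binom.test(observed_heads, n_tosses, p = 0.5)$p.value

print(c(simulated = p_simulated, exact = p_exact))
```

The simulated proportion should land close to the exact value, which shows that a p-value is simply a tail probability computed under the assumption that \(H_0\) holds.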

Framework for Hypothesis Testing

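The Null Hypothesis (\(H_0\)): The default assumption, typically that there is no effect or no difference.
The Alternative Hypothesis (\(H_a\)): The competing claim that an effect or difference exists.
We then use the p-value to decide whether to reject \(H_0\), usually by comparing it with a significance level \(\alpha\) chosen in advance (commonly 0.05).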
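
The decision step can be sketched as a small function. This is only an illustration: the helper `decide()` is hypothetical, and the 0.05 threshold is a common convention assumed here, not something fixed by the method. Whether the comparison is written `<` or `<=` is also a matter of convention.

```r
# Hypothetical helper: turn a p-value into a decision at significance level alpha.
decide <- function(p_value, alpha = 0.05) {
  stopifnot(p_value >= 0, p_value <= 1, alpha > 0, alpha < 1)
  if (p_value < alpha) "reject H0" else "fail to reject H0"
}

decide(0.001374)  # p-value reported by the mtcars t-test in this document
decide(0.20)      # made-up p-value above alpha
```

Note that "fail to reject" is deliberately not phrased as "accept": a large p-value means the data are compatible with \(H_0\), not that \(H_0\) has been shown to be true.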

What does P-Value look like?

Example of P-Value with mtcars dataset

3D Visualization of P-Value

R Code Example with mtcars dataset

# Welch two-sample t-test: does mean mpg differ by transmission type (am)?
t.test(mpg ~ am, data = mtcars)

## 
##  Welch Two Sample t-test
## 
## data:  mpg by am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means between group Automatic and group Manual is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean in group Automatic    mean in group Manual 
##                17.14737                24.39231
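
To connect this output back to the definition, the sketch below recomputes the two-sided p-value from the reported t statistic and degrees of freedom. The output above labels the groups "Automatic" and "Manual", which suggests `am` was recoded as a factor in a step that is not shown, so the recoding here is an assumption (in `mtcars`, 0 is automatic and 1 is manual). The copy `cars` and the other variable names are illustrative.

```r
# Work on a copy so the built-in mtcars data set is left untouched.
cars <- mtcars

# Assumed recoding: in mtcars, am = 0 is automatic and am = 1 is manual.
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Welch two-sample t-test (the default for t.test with two groups).
result <- t.test(mpg ~ am, data = cars)

# Pull the test statistic and the Welch degrees of freedom out of the result.
t_stat <- unname(result$statistic)
df     <- unname(result$parameter)

# Two-sided p-value: probability, under H0, of a t statistic at least as far
# from zero as the observed one.
p_manual <- 2 * pt(-abs(t_stat), df)

print(c(t = t_stat, df = df, p_manual = p_manual, p_reported = result$p.value))
```

The manually computed value should agree with `result$p.value`, since the p-value is the area in both tails of the t distribution beyond the observed statistic.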

Conclusion

Using the p-value, we can quantify how surprising our data would be if \(H_0\) were true.
The p-value must be interpreted in the context of the question being asked.
It is also important to remember that a low p-value does not prove the alternative hypothesis; it only indicates that the observed data would be unlikely if \(H_0\) were true.
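
One way to see why a low p-value is not proof is to simulate many experiments in which \(H_0\) really is true and count how often the test still reports a small p-value. The sketch below is hypothetical: the sample size of 20 per group, the number of simulations, the seed, and the 0.05 threshold are all arbitrary choices made for illustration.

```r
set.seed(301)    # arbitrary seed, for reproducibility only
n_sims <- 2000   # number of simulated experiments
alpha  <- 0.05   # assumed significance level

# In every experiment both groups come from the same normal distribution,
# so the null hypothesis of equal means is true by construction.
p_values <- replicate(n_sims, {
  group_a <- rnorm(20)
  group_b <- rnorm(20)
  t.test(group_a, group_b)$p.value
})

# Share of experiments that reject H0 even though it is true.
false_positive_rate <- mean(p_values < alpha)
print(false_positive_rate)
```

The share of rejections should come out close to the chosen \(\alpha\), so a single low p-value can arise by chance alone even when there is no real effect.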