HW 3

2025-09-17

Hypothesis Testing

Fundamental test in statistics, with two outcomes:
- reject the null hypothesis
- fail to reject the null hypothesis

Hypothesis testing is used to literally “test a hypothesis”: we start with a claim about a population parameter (such as a mean) and use sample data to decide whether there is enough evidence against it.

Null and Alternative Hypothesis

They are opposites (only one can be true)
The Alternative Hypothesis \((H_1)\) is the claim we want to find evidence for
The Null Hypothesis \((H_0)\) is what is assumed (default)
Tests are designed to check whether the data give enough evidence to reject \(H_0\) in favor of \(H_1\)

\[\text{(Null Hypothesis)} \hspace{5mm} H_0 : \mu = \mu_0 \\ \text{(Alternative Hypothesis)} \hspace{5mm} H_1 : \mu \neq \mu_0 \]
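As a minimal sketch of how these two hypotheses can be tested in R, the example below runs a two-sided one-sample t-test with `t.test()`. The sample `x`, the null value `mu0`, and the significance level `alpha` are made up for illustration and are not taken from the slides.

```r
# Hypothetical sample; the values are simulated for illustration only
set.seed(1)
x <- rnorm(30, mean = 5.2, sd = 1)
mu0 <- 5       # assumed null value, H0: mu = mu0
alpha <- 0.05  # assumed significance level

# Two-sided one-sample t-test of H0: mu = mu0 against H1: mu != mu0
result <- t.test(x, mu = mu0, alternative = "two.sided")

# The only two possible outcomes of the test
decision <- if (result$p.value < alpha) "reject H0" else "fail to reject H0"
print(result$p.value)
print(decision)
```

Note that `decision` can only take the two values listed at the start of these slides. There is no "accept \(H_0\)" outcome.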

Type I vs Type II Errors

A Type I Error, or false positive, occurs when the Null Hypothesis \((H_0)\) is rejected even though it is true
A Type II Error, or false negative, occurs when the Null Hypothesis \((H_0)\) is NOT rejected even though it is false

\[\text{(Type I Error)} \hspace{5mm} \alpha = P(\text{Reject } H_0 | H_0 \text{ is true}) \\ \text{(Type II Error)} \hspace{5mm} \beta = P(\text{Fail to reject } H_0 | H_0 \text{ is false}) \]
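The two error rates above can be estimated by simulation. The sketch below assumes a two-sided z-test with known standard deviation. The sample size `n`, the values `mu0` and `sigma`, the alternative mean `0.5`, and the helper `rejects_h0()` are all hypothetical choices made for this illustration.

```r
set.seed(42)
alpha <- 0.05                   # assumed significance level
n <- 25                         # assumed sample size
mu0 <- 0                        # assumed null value
sigma <- 1                      # assumed known standard deviation
z_crit <- qnorm(1 - alpha / 2)  # two-sided critical value

# Hypothetical helper: simulates one sample with the given true mean
# and returns TRUE if the two-sided z-test rejects H0
rejects_h0 <- function(true_mean) {
  x <- rnorm(n, mean = true_mean, sd = sigma)
  z <- (mean(x) - mu0) / (sigma / sqrt(n))
  abs(z) > z_crit
}

# Type I error rate: how often H0 is rejected when H0 is true
type1_rate <- mean(replicate(10000, rejects_h0(true_mean = mu0)))

# Type II error rate: how often H0 is NOT rejected when the true mean is 0.5
type2_rate <- mean(!replicate(10000, rejects_h0(true_mean = 0.5)))

print(type1_rate)
print(type2_rate)
```

The estimated Type I rate should land close to `alpha`. The Type II rate depends on how far the true mean is from `mu0` and on the sample size.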

R Code

Below is the code for the Normal Distribution graph on the next slide
The graph goes from -4 to 4, and the shaded areas run from -4 to -1.96 on the left side and from 1.96 to 4 on the right side (±1.96 are the critical values of a two-sided test at \(\alpha = 0.05\))
The shaded areas are filled with darkgray and the distribution line is colored red

library(ggplot2)

# Only the x-range is needed; stat_function() evaluates dnorm over it
df <- data.frame(x = c(-4, 4))
# Normal Curve
ggplot(df, aes(x = x)) +
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1), color = "red") +
  # Left Rejection
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1), xlim = c(-4, -1.96),
                geom = "area", fill = "darkgray", alpha = 0.5) +
  # Right Rejection
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1), xlim = c(1.96, 4),
                geom = "area", fill = "darkgray", alpha = 0.5)

Normal Distribution

Below is a Normal Distribution Curve
It shows how the test statistic is distributed when \(H_0\) is true
The shaded “critical regions” mark where \(H_0\) is rejected. In this specific graph most of the area is unshaded, so most values of the test statistic lead us to fail to reject \(H_0\).
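The cutoffs of the critical regions can be checked with a few lines of R. This is a small sketch assuming a two-sided test at \(\alpha = 0.05\) on a standard normal curve. The variable names are chosen here for illustration.

```r
alpha <- 0.05  # assumed significance level

# Critical value of a two-sided test: leaves alpha / 2 in each tail
z_crit <- qnorm(1 - alpha / 2)

# Area of each shaded tail under the standard normal curve
left_tail <- pnorm(-z_crit)
right_tail <- pnorm(z_crit, lower.tail = FALSE)

# The two critical regions together hold probability alpha
total_rejection_area <- left_tail + right_tail

print(round(z_crit, 2))
print(total_rejection_area)
```

The critical value rounds to 1.96, and the two tails together hold 5% of the area. The remaining 95% is the unshaded region where we fail to reject \(H_0\).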

Plotly plot

In this interactive version you can hover over the points on the plot to see the p-value
The vertical line represents the test statistic, the value computed from the sample. \(H_0\) is rejected if this value falls in a rejection region and is not rejected otherwise.
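The link between the test statistic and the p-value can be sketched as follows. The observed value `z_obs` is hypothetical, and the calculation assumes a two-sided test with a standard normal test statistic.

```r
z_obs <- 1.2   # hypothetical observed test statistic
alpha <- 0.05  # assumed significance level

# Two-sided p-value: probability, if H0 is true, of a test statistic
# at least as far from 0 as the one observed
p_value <- 2 * pnorm(-abs(z_obs))

# Reject H0 only when the p-value is below alpha
reject <- p_value < alpha

print(round(p_value, 2))
print(reject)
```

Here the p-value is about 0.23, well above 0.05, so this hypothetical test statistic would fall between the critical values and \(H_0\) would not be rejected.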

Sampling Distribution

Below is a histogram showing a distribution of ages
It shows the possible values of a test statistic if \(H_0\) is true
The histogram is centered at the null value. Our observed test statistic falls near the center, where most of the values are, so we fail to reject \(H_0\)
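A sampling distribution like this can be approximated by simulation. The sketch below is not the code behind the histogram on this slide. It assumes a hypothetical population of ages with mean `mu0` and standard deviation `sigma`, and draws many samples with \(H_0\) true.

```r
set.seed(123)
mu0 <- 40    # hypothetical null value for the mean age
sigma <- 10  # hypothetical known standard deviation
n <- 50      # hypothetical sample size

# Means of 5000 samples drawn from a population where H0 is true
sample_means <- replicate(5000, mean(rnorm(n, mean = mu0, sd = sigma)))

# Standardize each sample mean into a z test statistic
z_stats <- (sample_means - mu0) / (sigma / sqrt(n))

# Histogram of the test statistic with the two-sided 5% cutoffs marked
hist(z_stats, breaks = 40, main = "Test statistic if H0 is true", xlab = "z")
abline(v = c(-1.96, 1.96), lty = 2)

# Share of simulated test statistics between the cutoffs
inside <- mean(abs(z_stats) < 1.96)
print(inside)
```

Roughly 95% of the simulated test statistics should fall between the dashed lines, which is why an observed value near the center gives no reason to reject \(H_0\).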

Conclusion

Hypothesis Testing is a critical part of statistics
Interestingly, we never accept the null hypothesis; we either reject it or fail to reject it
By understanding these concepts we can judge whether the data provide enough evidence against a hypothesis.
Understanding these slides gives a base knowledge of the main topics and graphs related to hypothesis testing