Hypothesis Testing

October 19, 2024

Introduction to Hypothesis Testing

What is hypothesis testing?
Steps in hypothesis testing
Importance in statistics

Key Concepts of Hypothesis Testing

Null Hypothesis (H0): No effect or no difference
Alternative Hypothesis (H1): Effect or difference exists
P-value: Probability, assuming H0 is true, of a test statistic at least as extreme as the one observed
Significance level (α): Threshold, chosen before the test (commonly 0.05), against which the p-value is compared
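
To make the role of α concrete, the sketch below simulates many two-sample t-tests in which H0 is true by construction and counts how often the p-value falls below α. The sample size, number of repetitions, and seed are illustrative assumptions, not values from these slides.

```r
# Sketch: when H0 is true, a test at level alpha should reject roughly alpha of the time.
set.seed(123)    # assumed seed, only for reproducibility
alpha  <- 0.05   # significance threshold
n_sims <- 2000   # assumed number of simulated experiments

# Both groups are drawn from the same distribution, so H0 (equal means) holds.
p_values <- replicate(n_sims, {
  g1 <- rnorm(30, mean = 5, sd = 2)
  g2 <- rnorm(30, mean = 5, sd = 2)
  t.test(g1, g2)$p.value
})

# Proportion of false rejections (Type I errors); expected to be close to alpha.
type1_rate <- mean(p_values < alpha)
print(type1_rate)
```

The rejection rate will vary from run to run, but it should stay near the chosen α, which is what "significance threshold" means in practice.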

Visualizing the Normal Distribution

library(ggplot2)

# Draw 1,000 values from a standard normal distribution
x <- rnorm(1000, mean = 0, sd = 1)

# Density-scaled histogram with the theoretical N(0, 1) curve overlaid
ggplot(data.frame(x), aes(x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, fill = "skyblue", color = "black") +
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1), color = "red") +
  ggtitle("Normal Distribution with Overlaid Density Curve") +
  theme_minimal()

Plotly Interactive Visualization

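library(ggplot2)
library(plotly)
# Two independent standard normal variables
data <- data.frame(x = rnorm(100), y = rnorm(100))
p <- ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  ggtitle("Interactive Scatter Plot")
# Convert the static ggplot into an interactive plotly widget
ggplotly(p)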

Mathematical Expressions in Hypothesis Testing

\[ H_0: \mu_1 = \mu_2 \quad \text{(Null Hypothesis)} \]

\[ H_1: \mu_1 \neq \mu_2 \quad \text{(Alternative Hypothesis)} \]
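
For the two-sample comparison run later in these slides (the Welch test reported by R's `t.test`), these hypotheses are assessed with the statistic below. The symbols \(\bar{x}_i\), \(s_i^2\), and \(n_i\) (sample mean, sample variance, and sample size of group \(i\)) and \(\nu\) (approximate degrees of freedom) are notation introduced here for illustration.

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \]

\[ \nu \approx \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}} \]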

P-Value Interpretation

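For the two-sided test, the p-value is calculated as: \[ p = P(|T| \geq |t| \mid H_0) \]
Here T is the test statistic viewed as a random variable and t is its observed value. The p-value is then compared with α; the two are not equal in general.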
If the p-value is less than alpha, reject the null hypothesis.
If the p-value is greater than alpha, fail to reject the null hypothesis.
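
The decision rule above can be sketched as a small function. `decide` is a hypothetical helper written for this illustration, not part of base R, and the default α of 0.05 is an assumption. The value 0.2933 is the p-value reported in the t-test output in these slides.

```r
# Sketch of the decision rule; `decide` is a hypothetical helper, not a base R function.
decide <- function(p_value, alpha = 0.05) {
  stopifnot(is.numeric(p_value), p_value >= 0, p_value <= 1)
  if (p_value < alpha) "reject H0" else "fail to reject H0"
}

decide(0.2933)  # p-value from the Welch t-test output in these slides
decide(0.01)    # a smaller, made-up p-value for contrast
```

With α = 0.05, the first call returns "fail to reject H0" and the second returns "reject H0". Whether a p-value exactly equal to α counts as a rejection is a matter of convention; this sketch treats it as a failure to reject.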

R Code for Hypothesis Testing

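# Simulate two groups of 30 observations whose true means differ by 1
data1 <- rnorm(30, mean = 5, sd = 2)
data2 <- rnorm(30, mean = 6, sd = 2)
# Two-sample t-test; R uses Welch's unequal-variance version by default
t.test(data1, data2)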

## 
##  Welch Two Sample t-test
## 
## data:  data1 and data2
## t = -1.0611, df = 54.682, p-value = 0.2933
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.610265  0.495469
## sample estimates:
## mean of x mean of y 
##  5.681118  6.238516
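
To show where the numbers in this output come from, the sketch below recomputes the Welch t statistic, its degrees of freedom, and the two-sided p-value by hand and compares them with `t.test`. The seed is an assumption added for reproducibility (the slides do not set one), so the simulated values will differ from the output above.

```r
# Sketch: reproduce t.test()'s Welch statistic, degrees of freedom, and p-value by hand.
set.seed(42)  # assumed seed; not part of the original slides
data1 <- rnorm(30, mean = 5, sd = 2)
data2 <- rnorm(30, mean = 6, sd = 2)

# Squared standard error of each sample mean
v1 <- var(data1) / length(data1)
v2 <- var(data2) / length(data2)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(data1) - mean(data2)) / sqrt(v1 + v2)
df <- (v1 + v2)^2 / (v1^2 / (length(data1) - 1) + v2^2 / (length(data2) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_stat), df)

# Compare with the built-in test
res <- t.test(data1, data2)
print(all.equal(unname(res$statistic), t_stat))
print(all.equal(unname(res$parameter), df))
print(all.equal(res$p.value, p_manual))

# Same formula applied to the rounded t and df printed in the slides' output
p_from_output <- 2 * pt(-abs(-1.0611), df = 54.682)
print(round(p_from_output, 4))
```

Because 0.2933 is well above the usual α of 0.05, the test in the slides fails to reject H0 even though the simulated population means differ by 1; with 30 observations per group and a standard deviation of 2, a difference of that size is not reliably detected.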

Thank you!

Any questions?