2025-05-29

What is Hypothesis Testing?

  • Hypothesis testing is the process of examining sample data and inferring whether it provides sufficient evidence to reject a hypothesis, or whether we fail to reject it.
  • Important Definitions
    • Research Hypothesis: A statement that introduces a research question and proposes an expected result.
    • Null Hypothesis: A statement of “no effect” or “no difference”.
    • Alternative Hypothesis: The claim we want to provide evidence for when doing hypothesis testing; it contradicts the Null Hypothesis.

How do we go about Hypothesis Testing?

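  1. State your Research, Null, and Alternative Hypotheses, and choose a significance level (commonly 0.05).

  2. Collect data to test the hypothesis.

  3. Compute the appropriate test statistic from the collected data (for example, a t statistic for a one-sample test of a mean).

  4. Find the p-value: the probability, assuming the Null Hypothesis is true, of a result at least as extreme as the one observed.

  5. Decide whether you reject or fail to reject your Null Hypothesis by comparing the p-value to the significance level.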
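
The steps above can be sketched in R as follows. This is only an illustration under stated assumptions: it uses the built-in `mtcars` dataset, and the hypothesized mean of 20 mpg and the 0.05 significance level are arbitrary choices made for this sketch.

```r
# Step 1: state the hypotheses (assumed for illustration)
#   H0: the mean mpg equals 20
#   H1: the mean mpg differs from 20
mu0   <- 20     # hypothesized mean (assumption)
alpha <- 0.05   # significance level (assumption)

# Step 2: collect data (here, a built-in dataset stands in for collected data)
mpg <- mtcars$mpg

# Steps 3 and 4: compute the test statistic and its p-value
result <- t.test(mpg, mu = mu0)
p_value <- result$p.value

# Step 5: compare the p-value to the significance level
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
print(decision)
```

Because the sample mean of `mpg` is very close to 20, the p-value is large and the sketch ends with "fail to reject H0". Note that this wording is deliberate: a large p-value does not prove the Null Hypothesis.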

One-Sample Test R Example

  • Let's see if the average height of a sample differs from the worldwide population average (taken to be 170 cm in the code below).
set.seed(123)
sample_data <- rnorm(30, mean = 172, sd = 5)
t.test(sample_data, mu = 170)
## 
##  One Sample t-test
## 
## data:  sample_data
## t = 1.9703, df = 29, p-value = 0.05842
## alternative hypothesis: true mean is not equal to 170
## 95 percent confidence interval:
##  169.9329 173.5961
## sample estimates:
## mean of x 
##  171.7645
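
To make the output above less of a black box, the following sketch recomputes the t statistic, p-value, and confidence interval by hand. It assumes the same seed and simulated sample as the example; the variable names (`x_bar`, `s`, `dof`, and so on) are introduced here for illustration only.

```r
# Recreate the same simulated sample as in the example
set.seed(123)
sample_data <- rnorm(30, mean = 172, sd = 5)

mu0   <- 170                  # hypothesized population mean
n     <- length(sample_data)  # sample size
x_bar <- mean(sample_data)    # sample mean
s     <- sd(sample_data)      # sample standard deviation
dof   <- n - 1                # degrees of freedom

# t statistic: distance of the sample mean from mu0 in standard-error units
t_stat <- (x_bar - mu0) / (s / sqrt(n))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = dof)

# 95 percent confidence interval for the true mean
ci <- x_bar + c(-1, 1) * qt(0.975, df = dof) * s / sqrt(n)

print(c(t = t_stat, p = p_value))
print(ci)
```

These values should agree with the `t.test` output (t = 1.9703, p-value = 0.05842). Since the p-value is slightly above 0.05 and the confidence interval contains 170, we fail to reject the Null Hypothesis at the 5% level, even though the sample was simulated with a true mean of 172.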

Visualization of Data with Plot

Type 1 and 2 Errors with GGPlot
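
As a numeric companion to the Type 1 and Type 2 error picture, here is a minimal sketch using `power.t.test` from base R's `stats` package. It assumes the setup of the height example (n = 30, true mean 172 versus hypothesized mean 170, standard deviation 5) and a significance level of 0.05; these inputs are assumptions for illustration.

```r
alpha <- 0.05   # Type 1 error rate: P(reject H0 | H0 is true), set by us

# Power of a two-sided one-sample t-test under the assumed true effect
pw <- power.t.test(n = 30,
                   delta = 172 - 170,   # assumed true difference from mu0
                   sd = 5,              # assumed standard deviation
                   sig.level = alpha,
                   type = "one.sample",
                   alternative = "two.sided")

power <- pw$power    # P(reject H0 | H0 is false)
beta  <- 1 - power   # Type 2 error rate: P(fail to reject H0 | H0 is false)

print(c(alpha = alpha, beta = beta, power = power))
```

The Type 1 error rate is fixed in advance by the significance level, while the Type 2 error rate depends on the sample size, the variability, and how far the true mean is from the hypothesized one.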

Interactive 3D Visualization of Normal Distribution

R Code Example

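# Load ggplot2 for plotting
library(ggplot2)

# Generate sample data (same seed and parameters as the t-test example)
set.seed(123)
sample_data <- rnorm(30, mean = 172, sd = 5)

# Create histogram of the sample; the dashed line marks the hypothesized mean (170 cm)
# Note: `linewidth` replaces `size` for lines in ggplot2 3.4.0 and later
ggplot(data.frame(height = sample_data), aes(x = height)) +
  geom_histogram(binwidth = 2, fill = "#8C1D40", color = "black") +
  geom_vline(xintercept = 170, color = "blue", linetype = "dashed", linewidth = 1) +
  labs(title = "Sample Distribution vs Population Mean",
       x = "Height (cm)", y = "Frequency") +
  theme_minimal()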

LaTeX Equation

\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \]

LaTeX Equation Definitions

\(\bar{x}\) = sample mean

\(\mu_0\) = hypothesized population mean

\(\sigma\) = population standard deviation (assumed known; when it is estimated from the sample, the t statistic is used instead)

\(n\) = sample size
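
The z formula above can be evaluated directly in R. The sketch below reuses the simulated height sample and assumes the population standard deviation is known to be 5, which is the value used to simulate the data; with real data, \(\sigma\) is rarely known.

```r
# Same simulated sample as in the earlier example
set.seed(123)
sample_data <- rnorm(30, mean = 172, sd = 5)

x_bar <- mean(sample_data)    # sample mean
mu0   <- 170                  # hypothesized population mean
sigma <- 5                    # population standard deviation (assumed known)
n     <- length(sample_data)  # sample size

# z statistic from the formula above
z <- (x_bar - mu0) / (sigma / sqrt(n))

# Two-sided p-value from the standard normal distribution
p_value <- 2 * pnorm(-abs(z))

print(c(z = z, p = p_value))
```

Here z is roughly 1.93, close to but not identical to the t statistic, because the z-test uses the assumed \(\sigma\) while the t-test uses the sample standard deviation.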