2024-11-15


Hypothesis Testing Defined

  • In statistics, hypothesis testing is a form of analysis in which you test an assumption about a specific population parameter.
  • A statistical analyst evaluates that assumption by collecting and analyzing a data set.
  • Two hypotheses are used to put the assumption to the test: the null hypothesis and the alternative hypothesis.
  • Null hypothesis: The default claim, usually that there is no effect or no difference, which the researcher is attempting to disprove.
  • Alternative hypothesis: The claim the researcher is seeking evidence for through data and research.

Step 1: State Your Hypotheses

  • State your prediction as a pair: a null hypothesis and an alternative hypothesis.

    Example:
  • Null Hypothesis: On average, men are not taller than women.
  • Alternative Hypothesis: On average, men are taller than women.
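
In symbols, the same pair of hypotheses could be written as below. The notation \(\mu_{\text{men}}\) and \(\mu_{\text{women}}\) for the population mean heights is an assumption introduced here for illustration; it does not appear in the original example.

\[ H_0: \mu_{\text{men}} \leq \mu_{\text{women}} \qquad H_1: \mu_{\text{men}} > \mu_{\text{women}} \]

Because the alternative points in one direction only (men taller), this is a one-sided test.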

Step 2: Collect Relevant Data

  • Collect data that bears directly on your hypotheses, considering the scope of the study and any control variables that could affect your data collection.

    Example:
  • Collect height data for men and women on a global scale, drawing on published research and census information.
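
As a rough sketch of what the collected data might look like once it is loaded into R, the snippet below builds a small data frame and runs a few sanity checks before any testing. The measurements are simulated stand-ins, not real survey data; the means and standard deviations are borrowed from the plotting code later in this document, and the names `heights`, `group_sizes`, and `group_summary` are hypothetical.

```r
# Simulated stand-in for collected measurements (heights in cm); not real data.
set.seed(42)  # fixed seed so the sketch is reproducible
heights <- data.frame(
  height = c(rnorm(100, mean = 175.4, sd = 7),
             rnorm(100, mean = 161.74, sd = 6)),
  gender = factor(rep(c("Men", "Women"), each = 100))
)

# Basic checks before testing: no missing values, and known group sizes.
stopifnot(!anyNA(heights$height))
group_sizes <- table(heights$gender)

# Per-group mean and standard deviation, to spot obvious data problems early.
group_summary <- aggregate(height ~ gender, data = heights,
                           FUN = function(x) c(mean = mean(x), sd = sd(x)))

print(group_sizes)
print(group_summary)
```

With real data, the same checks would apply after reading the measurements from a file instead of simulating them.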

Step 3: Do a Statistical Test

  • A statistical test checks whether the observed data are consistent with the null hypothesis.
  • Frequently used statistical tests include the t-test, the z-test, the chi-square test, and analysis of variance (ANOVA).
  • A one-sample t-statistic, which compares a sample mean \(\bar{x}\) to a hypothesized mean \(\mu_0\) using the sample standard deviation \(s\) and sample size \(n\), is calculated as follows:
    \[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

    Example:
  • A two-sample t-test on the heights of men and women compares the means of the two groups.
  • In this example, the sample means are 175.4 cm for men and 161.7 cm for women, with a p-value of 0.002: the probability of observing a difference at least this large if the null hypothesis were true.
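
The comparison of the two groups can be sketched in R with the built-in `t.test()` function. This is a minimal illustration on simulated heights, not the analysis behind the figures quoted above, so its p-value will not match 0.002. It assumes a one-sided alternative ("men are taller") and relies on `t.test()` defaulting to Welch's unequal-variance test; the variable names are hypothetical.

```r
# Simulated heights in cm (assumed parameters, taken from this document's plotting code).
set.seed(42)
men   <- rnorm(100, mean = 175.4, sd = 7)
women <- rnorm(100, mean = 161.74, sd = 6)

# Welch two-sample t-test, one-sided: the alternative is that men are taller on average.
result <- t.test(men, women, alternative = "greater")

# The same statistic by hand: difference in sample means divided by its standard error.
se <- sqrt(var(men) / length(men) + var(women) / length(women))
t_manual <- (mean(men) - mean(women)) / se

print(result$statistic)  # t-statistic reported by t.test()
print(t_manual)          # should agree with the line above
print(result$p.value)    # one-sided p-value
```

The hand calculation mirrors the one-sample formula: an observed difference in the numerator and its standard error in the denominator.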

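Step 4: Reject or Fail to Reject the Null Hypothesis

  • Compare the p-value to the significance level (\(\alpha\)), which is chosen before the test and is commonly 0.05.
  • If \(p \leq \alpha\), reject the null hypothesis.
  • If \(p > \alpha\), fail to reject the null hypothesis. This does not show that the null hypothesis is true, only that the data do not provide enough evidence against it.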
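
The decision rule can be expressed as a small R function. The helper `decide()` and its default `alpha = 0.05` are hypothetical, introduced only for illustration. The non-rejection branch is labeled "fail to reject", the usual wording, since a large p-value is not evidence that the null hypothesis is true.

```r
# Hypothetical helper: turn a p-value into a decision at significance level alpha.
decide <- function(p_value, alpha = 0.05) {
  stopifnot(is.numeric(p_value), p_value >= 0, p_value <= 1)
  if (p_value <= alpha) "reject H0" else "fail to reject H0"
}

print(decide(0.002))  # p-value from the height example
print(decide(0.20))   # a p-value above the default alpha
```

With the p-value of 0.002 from the height example and \(\alpha = 0.05\), the rule rejects the null hypothesis.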

Step 5: Summarize and Present Your Findings

  • Present your findings to your peers in a research paper or other form of presentation.
  • The violin plot below, built with ggplot2, shows the height distributions for men and women:

R Code of Violin Plot Findings

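library(ggplot2)  # provides ggplot(), geom_violin(), and the other plotting functions

# Simulate 100 heights (cm) per group
set.seed(123)  # fixed seed so the simulated data are reproducible
male_heights <- rnorm(100, mean = 175.4, sd = 7)
female_heights <- rnorm(100, mean = 161.74, sd = 6)

# Combine both groups into one long-format data frame
height_data <- data.frame(
  height = c(male_heights, female_heights),
  gender = factor(rep(c("Men", "Women"), each = 100))
)

# Violin plot of each distribution, with a diamond marking the group mean
ggplot(height_data, aes(x = gender, y = height, fill = gender)) +
  geom_violin(trim = FALSE, alpha = 0.6) +
  stat_summary(fun = mean, geom = "point", shape = 23, size = 1,
               color = "black", fill = "white") +
  labs(title = "Findings of Average Height by Gender",
       x = "Gender", y = "Height in Centimeters") +
  scale_fill_manual(values = c(Men = "blue", Women = "pink")) +
  theme_minimal()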