Introduction to ANOVA

  • ANOVA is a statistical method used to compare means across three or more groups.
  • Helps determine if differences in group means are statistically significant.
  • Common applications: manufacturing, medicine, marketing.

Types of ANOVA 1. One-Way ANOVA: Compares means across one factor. 2. Two-Way ANOVA: Compares means across two factors. 3. Repeated Measures ANOVA: Measures the same subjects at different times.

Example: Comparing Production Lines

  • Objective: Compare defect rates across three production lines.
  • Groups: Line A, Line B, Line C
  • Hypotheses:
    • Null (H0): All production lines have the same average defect rate.
    • Alternative (H1): At least one production line has a different average defect rate.

Plotting Data with ggplot2

# Boxplot for defects
ggplot(data, aes(x=line, y=defects)) + 
  geom_boxplot(fill="blue") +
  labs(title="Defect Rates by Production Line", 
       x="Production Line", y="Defect Rate")

Calculating the F-statistic

The F-statistic measures the ratio of variance between groups to the variance within groups:

\[ F = \frac{\text{MS}_{\text{between}}}{\text{MS}_{\text{within}}} \]

Where: - \(\text{MS}_{\text{between}}\) is the mean square between the groups - \(\text{MS}_{\text{within}}\) is the mean square within the groups

Degrees of Freedom and P-value

  • Degrees of Freedom:
    • Between groups: \(df_{\text{between}} = k - 1\)
    • Within groups: \(df_{\text{within}} = N - k\)
  • P-value: If the p-value is less than 0.05, the null hypothesis is rejected, indicating a significant difference between group means.