2026-02-04

The Question

Statistics begins with curiosity:

Can vitamins actually make guinea pigs' teeth grow longer?

We will use real data to answer this question instead of just guessing.

The ToothGrowth Dataset

This dataset records the following for each of 60 guinea pigs:

  • Tooth length (‘len’)
  • Supplement type (‘supp’)
    • Orange Juice (OJ)
    • Vitamin C (VC)
  • Daily Dose (‘dose’)
    • 0.5 mg
    • 1 mg
    • 2 mg

Dataset Summary

Summary of all three variables:

##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

Mean tooth length by supplement:

##   supp      len
## 1   OJ 20.66333
## 2   VC 16.96333

Mean tooth length by dose:

##   dose    len
## 1  0.5 10.605
## 2  1.0 19.735
## 3  2.0 26.100
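The code behind this output is not shown in the slides. A minimal sketch that would produce tables of this shape, assuming base R's `summary()` and `aggregate()` were used (the variable names `mean_by_supp` and `mean_by_dose` are illustrative, not the author's):

```r
# ToothGrowth ships with base R (the datasets package), so no download is needed.
data(ToothGrowth)

# Min, quartiles, and mean for numeric columns; counts for the supp factor
summary(ToothGrowth)

# Mean tooth length within each supplement group
mean_by_supp <- aggregate(len ~ supp, data = ToothGrowth, FUN = mean)
print(mean_by_supp)

# Mean tooth length at each dose level
mean_by_dose <- aggregate(len ~ dose, data = ToothGrowth, FUN = mean)
print(mean_by_dose)
```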

The Hypothesis

We want to know: Does the type of supplement affect tooth growth?

Null Hypothesis: There is no difference in mean tooth length between the OJ and VC groups.

\[ H_0: \mu_{OJ} = \mu_{VC} \]

Alternative Hypothesis: There is a difference in mean tooth length between the OJ and VC groups.

\[ H_a: \mu_{OJ} \neq \mu_{VC} \]
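One common way to test this pair of hypotheses is a two-sample t-test. The slides do not say which test was used, so the sketch below is an assumption: it uses `t.test()` with its defaults (Welch's unequal-variance version, two-sided), and the 0.05 significance level and the names `test_supp`, `alpha`, and `reject_h0` are illustrative choices.

```r
# Welch two-sample t-test comparing mean len between the OJ and VC groups.
# The default alternative is "two.sided", matching H_a: mu_OJ != mu_VC.
test_supp <- t.test(len ~ supp, data = ToothGrowth)
print(test_supp)

# Decision rule: reject H0 when the p-value falls below the chosen level
alpha <- 0.05  # assumed significance level, not stated in the slides
reject_h0 <- test_supp$p.value < alpha
reject_h0
```

The printed result includes the test statistic, the degrees of freedom, the p-value, and a confidence interval for the difference in means.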

Tooth Length by Supplement (ggplot)

Tooth Length by Dose (ggplot)

R Code Example: Tooth Length by Dose

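library(ggplot2)  # provides ggplot(), geom_boxplot(), and the theme helpers

# Treat dose as a factor so each dose level gets its own pair of boxes
ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = supp)) +
  geom_boxplot(alpha = 0.8) +
  # Fixed fill colors for the two supplement types
  scale_fill_manual(
    name = "Supplement",
    values = c("OJ" = "#E69F00", "VC" = "#56B4E9")) +
  labs(
    title = "Tooth Length by Dose and Supplement",
    x = "Dose (mg/day)",
    y = "Tooth Length"
  ) +
  theme_minimal() +
  # Keep the legend on the right and center the title
  theme(legend.position = "right",
        plot.title = element_text(hjust = 0.5))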

Understanding the p-value

The p-value helps answer:

“If there were really no difference between supplements, how likely would it be to see a difference at least as large as the one we observed?”

\[ p = P(\text{data at least as extreme as observed} \mid H_0 \text{ is true}) \]

Interpretation:

  • Small p-value → strong evidence against \(H_0\)
  • Large p-value → data is consistent with \(H_0\)
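To make the definition concrete, the p-value can be approximated by simulation. The sketch below is a permutation test, which is a swapped-in illustration and not a method the slides use: if \(H_0\) is true, the supplement labels are interchangeable, so shuffling them shows which differences arise by chance alone. The seed, the number of shuffles, and all variable names are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed so the shuffles are reproducible

# Observed difference in mean tooth length (OJ minus VC)
group_means <- tapply(ToothGrowth$len, ToothGrowth$supp, mean)
obs_diff <- unname(group_means["OJ"] - group_means["VC"])

# Shuffle the supplement labels many times and recompute the difference
n_perm <- 5000
perm_diffs <- replicate(n_perm, {
  shuffled <- sample(ToothGrowth$supp)
  m <- tapply(ToothGrowth$len, shuffled, mean)
  unname(m["OJ"] - m["VC"])
})

# Two-sided p-value: share of shuffles at least as extreme as the observed one
p_perm <- mean(abs(perm_diffs) >= abs(obs_diff))
p_perm
```

The last line is the definition in action: the fraction of label-shuffled datasets whose difference in means is at least as large, in absolute value, as the observed one.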

Scatter Plot (plotly)

Note: this plot displays in RStudio but does not appear in the knitted document or on RPubs.

Final Takeaways

  • Look at the data first
    • Visualizations reveal patterns in the data before formal testing.
  • For ToothGrowth:
    • Orange Juice is associated with a somewhat longer mean tooth length than Vitamin C (20.66 vs. 16.96).
    • Mean tooth length rises with dose (10.6 at 0.5 mg, 19.7 at 1 mg, 26.1 at 2 mg).
  • Workflow: Data → Visuals → Hypothesis → Decision