Data (synthetic)

I don’t have a real otter dataset, so I’m using a synthetic dataset with plausible values:

  • Sea otters heavier (~30 kg on average)
  • River otters lighter (~9 kg on average)
  • Sample size: Sea otter n = 120, River otter n = 120

Summary of body mass (kg) by species:
## $`River otter`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   3.585   7.673   8.787   9.009  10.460  15.525 
## 
## $`Sea otter`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   15.03   27.12   30.45   30.15   33.27   43.51
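
A dataset like this could be simulated along the following lines. This is only a sketch: the slides do not show how the data were generated, so the normal distribution, the standard deviations (4.5 and 2 kg) and the seed are assumptions. The column names `species` and `body_mass_kg` are taken from the R code later in the slides.

```r
set.seed(1)  # hypothetical seed; the slides' seed for the data is not shown
n <- 120     # per-species sample size stated on the slide

otters <- data.frame(
  species = factor(rep(c("Sea otter", "River otter"), each = n)),
  body_mass_kg = c(
    rnorm(n, mean = 30, sd = 4.5),  # sea otters; SD is an assumption
    rnorm(n, mean = 9,  sd = 2)     # river otters; SD is an assumption
  )
)

# Per-species summary, returned as a list with one entry per species
mass_summary <- tapply(otters$body_mass_kg, otters$species, summary)
mass_summary
```

The exact numbers will differ from the summary above, since the seed and spreads are guesses.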

Hypotheses

  • Compare mean body mass (kg) between species
  • Sample sizes: Sea otter n = 120, River otter n = 120

\[ H_0: \mu_{Sea} = \mu_{River}, \qquad H_1: \mu_{Sea} \ne \mu_{River}. \]

Test statistic & p-value

Welch two-sample t-test (does not assume equal variances).

\[ t = \frac{\bar{Y}_{Sea}-\bar{Y}_{River}}{\sqrt{\tfrac{s^2_{Sea}}{n_{Sea}} + \tfrac{s^2_{River}}{n_{River}} }}, \quad p = 2\,P\big(|T_{\nu}| \ge |t_{obs}|\big). \]
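
The degrees of freedom \(\nu\) of the reference distribution \(T_{\nu}\) are not spelled out on the slide. For Welch's test they are usually taken from the Welch-Satterthwaite approximation, which is also what R's `t.test` uses when `var.equal = FALSE` (the default):

\[ \nu \approx \frac{\left(\tfrac{s^2_{Sea}}{n_{Sea}} + \tfrac{s^2_{River}}{n_{River}}\right)^2}{\tfrac{\left(s^2_{Sea}/n_{Sea}\right)^2}{n_{Sea}-1} + \tfrac{\left(s^2_{River}/n_{River}\right)^2}{n_{River}-1}}. \]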

Boxplot (ggplot2)

Compare medians & spread.

Goal: show group difference visually.
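
The plot itself is not reproduced here. A minimal ggplot2 sketch of such a boxplot might look like this; the stand-in data (means from the Data slide, assumed SDs) and all styling choices are assumptions, not the slides' actual code.

```r
library(ggplot2)

# Stand-in data: means from the Data slide, SDs and seed are assumptions
set.seed(1)
otters <- data.frame(
  species = rep(c("Sea otter", "River otter"), each = 120),
  body_mass_kg = c(rnorm(120, 30, 4.5), rnorm(120, 9, 2))
)

# One box per species: the centre line is the median, the box spans the IQR
p_box <- ggplot(otters, aes(x = species, y = body_mass_kg, fill = species)) +
  geom_boxplot(alpha = 0.7) +
  labs(x = NULL, y = "Body mass (kg)") +
  theme_minimal() +
  theme(legend.position = "none")

p_box
```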

Histogram (ggplot2)

Distribution overlap.

The shapes of the two distributions hint at effect size, not just significance.
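
An overlaid histogram of this kind could be drawn roughly as follows. Again a sketch under assumptions: the data are a simulated stand-in, and the 1 kg bin width and transparency are illustrative choices.

```r
library(ggplot2)

# Stand-in data: means from the Data slide, SDs and seed are assumptions
set.seed(1)
otters <- data.frame(
  species = rep(c("Sea otter", "River otter"), each = 120),
  body_mass_kg = c(rnorm(120, 30, 4.5), rnorm(120, 9, 2))
)

# position = "identity" overlays the two histograms instead of stacking them,
# so any overlap between species is visible directly
p_hist <- ggplot(otters, aes(x = body_mass_kg, fill = species)) +
  geom_histogram(binwidth = 1, alpha = 0.6, position = "identity") +
  labs(x = "Body mass (kg)", y = "Count", fill = "Species") +
  theme_minimal()

p_hist
```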

Interactive 3D (plotly)

Mass vs length & whiskers.

Context for variability beyond group label.

Permutation p-value (result)

Randomize labels (two-sided).

Permutation test to validate the t-test result: shuffle the species labels and recompute the difference in means.

## Observed diff (kg): 21.14 
## Permutation p-value: 0
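
A p-value of exactly 0 only says that none of the shuffles reached the observed difference. One common finite-sample convention is to add one to both the count and the number of permutations, so the estimate can never be 0. The sketch below assumes B = 1000, as in the slides' R code, and a count of 0, as implied by the result above.

```r
B <- 1000        # number of permutations (from the slides' R code)
n_extreme <- 0   # shuffles at least as extreme as observed (implied by p-hat = 0)

# Add-one estimate: treats the observed labelling as one of the permutations
p_adj <- (n_extreme + 1) / (B + 1)
p_adj
```

With these inputs the estimate is 1/1001, just under 0.001, which is the smallest value this convention can report for 1000 shuffles.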

Permutation null (base R)

Observed diff marked.

Dashed lines mark the observed difference.
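
A base R sketch of such a plot is given below. The data are a simulated stand-in (assumed SDs), and the number of breaks and the axis limits are illustrative. Because the observed difference lies far outside the permutation distribution, the x-axis has to be widened for the dashed lines to be visible.

```r
set.seed(7)

# Stand-in data: means from the Data slide, SDs are assumptions
otters <- data.frame(
  species = rep(c("Sea otter", "River otter"), each = 120),
  body_mass_kg = c(rnorm(120, 30, 4.5), rnorm(120, 9, 2))
)
vals <- otters$body_mass_kg
labels <- otters$species
obs <- mean(vals[labels == "Sea otter"]) - mean(vals[labels == "River otter"])

# Null distribution: difference in means after shuffling the species labels
B <- 1000
perm <- replicate(B, {
  lab <- sample(labels)
  mean(vals[lab == "Sea otter"]) - mean(vals[lab == "River otter"])
})

# Widen the x-axis so both +obs and -obs fall inside the plotting region
hist(perm, breaks = 40, xlim = range(c(perm, -obs, obs)),
     main = "Permutation null distribution",
     xlab = "Difference in mean body mass (kg)")
abline(v = c(-obs, obs), lty = 2, lwd = 2)  # dashed: observed difference, both tails
```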

R Code Example

Core steps used in the slides.

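# Welch two-sample t-test (t.test() does not assume equal variances by default)
sea <- otters$body_mass_kg[otters$species == "Sea otter"]
riv <- otters$body_mass_kg[otters$species == "River otter"]
t.test(sea, riv)

# Quick permutation test (two-sided)
set.seed(7)                    # reproducible shuffles
B <- 1000                      # number of permutations
obs <- mean(sea) - mean(riv)   # observed difference in means (kg)
perm <- numeric(B)
labels <- as.character(otters$species)
vals <- otters$body_mass_kg
for (i in seq_len(B)) {
  lab <- sample(labels)        # shuffle the species labels
  perm[i] <- mean(vals[lab == "Sea otter"]) - mean(vals[lab == "River otter"])
}
# Share of shuffles at least as extreme as the observed difference;
# a value of 0 means none of the B shuffles reached it
mean(abs(perm) >= abs(obs))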

Takeaway

  • Sea otters are about 21.1 kg heavier on average
    (95% CI [20.1, 22.2])

  • Welch t-test: p = 1.34e-87 (effectively 0);
    permutation p-hat = 0 (none of the B = 1000 shuffles was as extreme)

  • Effect size: Cohen’s d = 5.29 (large)
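
The quantities in this summary could be computed roughly as follows. The slides do not say which variant of Cohen's d was used, so the pooled-SD version here is an assumption, and the data are a simulated stand-in, so the printed values will not match the slide exactly.

```r
# Stand-in data: means from the Data slide, SDs and seed are assumptions
set.seed(1)
sea <- rnorm(120, mean = 30, sd = 4.5)
riv <- rnorm(120, mean = 9,  sd = 2)

# Welch t-test gives the difference in means, its 95% CI and the p-value
tt <- t.test(sea, riv)
diff_means <- unname(tt$estimate[1] - tt$estimate[2])
ci <- tt$conf.int

# Cohen's d with a pooled standard deviation (assumed variant)
n1 <- length(sea)
n2 <- length(riv)
s_pooled <- sqrt(((n1 - 1) * var(sea) + (n2 - 1) * var(riv)) / (n1 + n2 - 2))
cohens_d <- diff_means / s_pooled

round(c(diff = diff_means, ci_low = ci[1], ci_high = ci[2], d = cohens_d), 2)
```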