Analysis of Variance
ANOVA tests whether the mean of a continuous dependent variable differs significantly across the levels of one or more categorical independent variables.
One-way ANOVA
Conditions
We want to test whether the mean number of bird collisions significantly differs across four orientation groups (E, N, S, W).
| East (E) | 14 |
| North (N) | 28 |
| South (S) | 35 |
| West (W) | 26 |
Hypotheses
𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 = … = 𝜇𝑘
𝐻1 : At least one mean differs from the others.
Steps
- Calculate the within-group sum of squares (𝑆𝑆𝑊)
- Calculate the between-group sum of squares (𝑆𝑆𝐵)
- Obtain the total sum of squares: SST = SSW + SSB
- Calculate the degrees of freedom (𝑑𝑓)
- Calculate the mean squares: 𝑀𝑆 = SS/df
- Calculate the F-ratio (between-group 𝑀𝑆 divided by within-group 𝑀𝑆)
- Compare the F-ratio to the critical F-value
Suppose we have 𝑘 levels. The i-th level has 𝑛𝑖 observations.
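With 𝑘 levels and 𝑛𝑖 observations in the i-th level, the quantities in the steps above can be written out as follows. The symbols x_{ij} (the j-th observation in level i), \bar{x}_i (the mean of level i), \bar{x} (the grand mean) and N (the total number of observations) are notation assumed here for illustration; they do not appear in the original notes.

```latex
\begin{aligned}
SSW &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_i\right)^2 \\
SSB &= \sum_{i=1}^{k} n_i \left(\bar{x}_i - \bar{x}\right)^2 \\
SST &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}\right)^2 = SSW + SSB \\
df_B &= k - 1, \qquad df_W = N - k, \qquad N = \sum_{i=1}^{k} n_i \\
MSB &= \frac{SSB}{df_B}, \qquad MSW = \frac{SSW}{df_W}, \qquad F = \frac{MSB}{MSW}
\end{aligned}
```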
https://study.com/skill/learn/how-to-calculate-the-total-sum-of-squares-within-and-between-ssw-and-ssb-explanation.html
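The steps above can be sketched by hand in base R. This is a minimal illustration, not the original analysis: it builds hypothetical 0/1 collision data by treating the values in the orientation table (14, 28, 35, 26) as the number of recorded collisions per group and assuming group sizes of 16, 29, 35 and 27. Both the interpretation of the table and the group sizes are assumptions.

```r
# Hypothetical 0/1 data: n_hits are taken from the orientation table,
# n_obs (group sizes) are ASSUMED for illustration.
n_obs  <- c(E = 16, N = 29, S = 35, W = 27)
n_hits <- c(E = 14, N = 28, S = 35, W = 26)

# One row per observation: 1 = collision recorded, 0 = none
collision <- unlist(lapply(names(n_obs), function(g) {
  c(rep(1, n_hits[[g]]), rep(0, n_obs[[g]] - n_hits[[g]]))
}))
orientation <- factor(rep(names(n_obs), times = n_obs))

k           <- nlevels(orientation)   # number of levels
n_total     <- length(collision)      # total number of observations
grand_mean  <- mean(collision)
group_means <- tapply(collision, orientation, mean)
group_sizes <- tapply(collision, orientation, length)

# Sums of squares
ssw <- sum((collision - group_means[as.character(orientation)])^2)  # within
ssb <- sum(group_sizes * (group_means - grand_mean)^2)              # between
sst <- sum((collision - grand_mean)^2)                              # total

# Degrees of freedom, mean squares, F-ratio
df_between <- k - 1
df_within  <- n_total - k
msb        <- ssb / df_between
msw        <- ssw / df_within
f_ratio    <- msb / msw

round(c(SSB = ssb, SSW = ssw, SST = sst, F = f_ratio), 3)
```

Under these assumptions the hand calculation gives SSB of about 0.172, SSW of about 3.678 and an F-ratio of about 1.605 on 3 and 103 degrees of freedom, and `sst` equals `ssw + ssb` up to floating-point error.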
Example with R
library(readxl)
# Load data
data <- read_excel("Bird collision XJTLU_20250601.xlsx", sheet = "Campus")
data_clean <- subset(data, !is.na(`Orientation`) & !is.na(`Bird collision?`))
# Prepare data
data_clean$Orientation <- as.factor(data_clean$Orientation)
data_clean$Collision <- as.numeric(data_clean$`Bird collision?`)
# Check group means
aggregate(Collision ~ Orientation, data = data_clean, mean)
  Orientation Collision
1           E 0.8750000
2           N 0.9655172
3           S 1.0000000
4           W 0.9629630
# Fit one-way ANOVA model
model <- aov(Collision ~ Orientation, data = data_clean)
summary(model)
             Df Sum Sq Mean Sq F value Pr(>F)
Orientation   3  0.172 0.05733   1.605  0.193
Residuals   103  3.678 0.03571
Interpretation: p = 0.193 > 0.05, so we fail to reject 𝐻0.
There is no statistically significant difference in mean bird collisions among the four orientations.
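The same decision can be reached through the last step of the procedure, comparing the F-ratio with a critical F-value. The sketch below uses the F value and degrees of freedom reported in the ANOVA table; the significance level of 0.05 is the conventional choice and is assumed here.

```r
# Values taken from the ANOVA table above
f_value    <- 1.605
df_between <- 3
df_within  <- 103
alpha      <- 0.05   # assumed significance level

# Critical value: upper alpha quantile of the F distribution
f_crit <- qf(1 - alpha, df_between, df_within)

# p-value: probability of an F at least this large under H0
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

# Reject H0 only if the observed F exceeds the critical value
reject_h0 <- f_value > f_crit

c(f_crit = f_crit, p_value = p_value)
reject_h0
```

Here the observed F-ratio falls below the critical value, so `reject_h0` is `FALSE`, in agreement with the p-value of about 0.193.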