Lecture 10 Doing an ANOVA "by hand"

Eamonn Mallon
09/11/2020

Calculate SSY, the total sum of squares: \[ SSY = \sum (y-\bar{y})^2 \]

oneway <- read.csv("~/Dropbox/Teaching/old_teaching/zipped/oneway.csv")
sum((oneway$ozone-mean(oneway$ozone))^2)

[1] 44

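Calculate SSE, the error sum of squares. For each of the k gardens, take the squared deviations of its observations from that garden's own mean, then add the gardens together: \[ SSE = \sum_{j=1}^{k}\sum (y-\bar{y}_j)^2 \]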

sum((oneway$ozone[oneway$garden=="A"]-mean(oneway$ozone[oneway$garden=="A"]))^2)

[1] 12

sum((oneway$ozone[oneway$garden=="B"]-mean(oneway$ozone[oneway$garden=="B"]))^2)

[1] 12

The two within-garden sums add up to SSE = 12 + 12 = 24. Since SSY = SSE + SSA, SSA = 44 - 24 = 20.
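The same three sums of squares can be computed in one pass. This is a minimal sketch rather than the lecture's own code: the contents of oneway.csv are not shown here, so the data frame below holds illustrative values (an assumption) chosen to reproduce the sums of squares quoted above, and `garden_means` is a name introduced for the example.

```r
# Illustrative data (an assumption, not read from oneway.csv): ten ozone
# readings per garden, chosen so the sums of squares match those quoted above.
oneway <- data.frame(
  ozone  = c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2,
             5, 5, 6, 7, 4, 4, 3, 5, 6, 5),
  garden = factor(rep(c("A", "B"), each = 10))
)

# Total sum of squares: squared deviations from the grand mean
SSY <- sum((oneway$ozone - mean(oneway$ozone))^2)

# Error sum of squares: squared deviations from each garden's own mean.
# ave() returns, for every row, the mean of that row's group.
garden_means <- ave(oneway$ozone, oneway$garden)
SSE <- sum((oneway$ozone - garden_means)^2)

# Treatment sum of squares, by subtraction
SSA <- SSY - SSE

c(SSY = SSY, SSE = SSE, SSA = SSA)
```

Using `ave()` avoids repeating the subsetting line for every garden, which matters once there are more than two levels.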

Source	Sum of squares	Degrees of freedom	Mean squares	F
Garden	20	1	20	15
Error	24	18	s² = 1.3333
Total	44	19

Degrees of freedom (n-p: number of values minus number of parameters estimated from them)
- Garden: 2 levels, 1 parameter (the overall mean), therefore 2-1 = 1
- Error: 20 samples, 2 parameters (the two garden means in the SSE equation), therefore 20-2 = 18
- Total: Add up the other two, 1 + 18 = 19
Mean squares (mean squared deviation, see lecture 2) = SS/df
F = Mean squares (treatment) / Mean squares (error) = 20/1.333 = 15 [Think signal over noise]
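The arithmetic behind the table can be written out as a short sketch. The sums of squares and sample sizes are the ones given above. The variable names (`df_garden`, `MSA`, `F_ratio` and so on) are introduced here for illustration.

```r
# Sums of squares and sample sizes taken from the working above
SSA <- 20   # treatment (garden) sum of squares
SSE <- 24   # error sum of squares
n   <- 20   # total number of observations
k   <- 2    # number of gardens

# Degrees of freedom
df_garden <- k - 1                  # 1
df_error  <- n - k                  # 18
df_total  <- df_garden + df_error   # 19

# Mean squares = sum of squares / degrees of freedom
MSA <- SSA / df_garden   # treatment mean square
MSE <- SSE / df_error    # error mean square (s squared)

# F ratio: signal over noise
F_ratio <- MSA / MSE
F_ratio
```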

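The p prefix, as in pf(), gives the cumulative probability of a distribution: pf(q, df1, df2) is the probability that an F variable with df1 and df2 degrees of freedom is less than or equal to q. The p-value is the upper tail, so we subtract from 1.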

1-pf(15,1,18)

[1] 0.001114539

So the probability of obtaining data as extreme as ours (or more extreme) if the two means were really the same is about 0.1% (p = 0.0011).
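As a check on the hand calculation, R's built-in `aov()` should give the same F ratio and p-value. This is a sketch under an assumption: the data frame below uses illustrative values chosen to match the sums of squares above, not the actual contents of oneway.csv. It also shows `lower.tail = FALSE`, an argument of `pf()` that returns the upper tail directly instead of subtracting from 1.

```r
# Illustrative data (an assumption, not read from oneway.csv)
oneway <- data.frame(
  ozone  = c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2,
             5, 5, 6, 7, 4, 4, 3, 5, 6, 5),
  garden = factor(rep(c("A", "B"), each = 10))
)

# Fit the one-way ANOVA and pull out the ANOVA table
fit <- aov(ozone ~ garden, data = oneway)
tab <- summary(fit)[[1]]

# First row is the garden effect, second row is the residual (error) term
F_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

# Upper-tail probability computed directly from the F distribution
p_by_hand <- pf(15, 1, 18, lower.tail = FALSE)

c(F = F_value, p_aov = p_value, p_by_hand = p_by_hand)
```

Asking for the upper tail directly is also numerically safer than `1 - pf(...)` when the p-value is very small.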