In the last mini-lecture, we fitted models with two categorical explanatory variables \((X_1, X_2)\). Now we need to interpret the output.
2024-01-11
In the last mini-lecture, we fitted models with two categorical explanatory variables \((X_1, X_2)\). Now we need to interpret the output.
yield.m2 <- lm(yield ~ block + bean, # including blocks info
data = Beans,
contrasts = list(block=contr.sum, bean=contr.sum)
)
anova(yield.m2)
## Analysis of Variance Table ## ## Response: yield ## Df Sum Sq Mean Sq F value Pr(>F) ## block 3 52.90 17.632 4.6567 0.01713 * ## bean 5 444.44 88.887 23.4757 1.341e-06 *** ## Residuals 15 56.79 3.786 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Recall what cannot be said often enough:
An ANOVA table is great in that it tells you about the importance of the different EVs in your model.
But it
We need to interpret the coefficients!
with(summary(yield.m2), round(coefficients, 3))
## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 16.675 0.397 41.982 0.000 ## block1 0.042 0.688 0.061 0.953 ## block2 2.392 0.688 3.476 0.003 ## block3 -1.475 0.688 -2.144 0.049 ## bean1 5.075 0.888 5.714 0.000 ## bean2 5.700 0.888 6.418 0.000 ## bean3 -0.600 0.888 -0.676 0.510 ## bean4 -0.250 0.888 -0.281 0.782 ## bean5 -3.700 0.888 -4.166 0.001
| Term | Coef | In R output | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|---|---|
| Constant | \(\mu\) | (Intercept) | 16.675 | 0.397 | 41.982 | 0.000 |
| Block | \(\alpha_1\) | block1 | 0.042 | 0.688 | 0.061 | 0.953 |
| \(\alpha_2\) | block2 | 2.392 | 0.688 | 3.476 | 0.003 | |
| \(\alpha_3\) | block3 | -1.475 | 0.688 | -2.144 | 0.049 | |
| Bean | \(\beta_1\) | bean1 | 5.075 | 0.888 | 5.714 | 0.000 |
| \(\beta_2\) | bean2 | 5.700 | 0.888 | 6.418 | 0.000 | |
| \(\beta_3\) | bean3 | -0.600 | 0.888 | -0.676 | 0.510 | |
| \(\beta_4\) | bean4 | -0.250 | 0.888 | -0.281 | 0.782 | |
| \(\beta_5\) | bean5 | -3.700 | 0.888 | -4.166 | 0.001 |
The mean yield, or yield for a given bean variety in a given block is
\[E(y\mid\mathrm{block,\ bean}) = \mu + \alpha_\mathrm{block} + \beta_\mathrm{bean}\]
| Term | Coef | In R output | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|---|---|
| Constant | \(\mu\) | (Intercept) | 16.675 | 0.397 | 41.982 | 0.000 |
| Block | \(\alpha_1\) | block1 | 0.042 | 0.688 | 0.061 | 0.953 |
| \(\alpha_2\) | block2 | 2.392 | 0.688 | 3.476 | 0.003 | |
| \(\alpha_3\) | block3 | -1.475 | 0.688 | -2.144 | 0.049 | |
| Bean | \(\beta_1\) | bean1 | 5.075 | 0.888 | 5.714 | 0.000 |
| \(\beta_2\) | bean2 | 5.700 | 0.888 | 6.418 | 0.000 | |
| \(\beta_3\) | bean3 | -0.600 | 0.888 | -0.676 | 0.510 | |
| \(\beta_4\) | bean4 | -0.250 | 0.888 | -0.281 | 0.782 | |
| \(\beta_5\) | bean5 | -3.700 | 0.888 | -4.166 | 0.001 |
\[E(y\mid\mathrm{block,\ bean}) = \mu + \alpha_\mathrm{block} + \beta_\mathrm{bean}\]
Example: Mean yield from Bean Variety 3 (bean = 3) grown in Block 2 (block = 2):
\(E(y\mid\mathrm{block}=2\land \mathrm{bean}=3) = \mu + \alpha_2 + \beta_3\)
\(E(y\mid\mathrm{block}=2\land \mathrm{bean}=3) = 16.675 + 2.392 -0.6 = 18.467\)
And Bean Variety 6! Good point:
block \((\alpha_1, \alpha_2, \alpha_3)\).bean \((\beta_1, \ldots, \beta_5)\)block ‘consumed’ only 3 DoF, and bean only 5 DoF:If you have \(n\) groups, you only get \(n-1\) group means. — Why?
We also know that
\[\alpha_1 + \alpha_2 + \ldots +\alpha_n= \sum_{i=1}^n \alpha_i = 0\]
So, we
\[\alpha_n = -\alpha_1 - \alpha_2 - \ldots - \alpha_{n-1} = -\sum_{i=1}^{n-1}\alpha_i\]
\[y = \mu + \begin{bmatrix} \mathtt{block} & \mathrm{Deviation} \\ 1 & \alpha_1 \\ 2 & \alpha_2 \\ 3 & \alpha_3 \\ 4 & -(\alpha_1+\alpha_2+\alpha_3) \\ \end{bmatrix} + \begin{bmatrix} \mathtt{bean} & \mathrm{Deviation} \\ 1 & \beta_1 \\ 2 & \beta_2 \\ 3 & \beta_3 \\ 4 & \beta_4 \\ 5 & \beta_5 \\ 6 & -(\beta_1+\beta_2+\beta_3+\beta_4+\beta_5) \\ \end{bmatrix} + \epsilon \]
Two EVs \(X_1,\ X_2\) are said to be
In our Beans example,
lm(yield ~ block + bean)
orthogonality means that
block tells you nothing about what bean variety was planted on the plot; andbean variety tells you nothing about what block the plot is in. T R E A T M E N T
A B C D
+-------+-------+-------+-------+
¦ ¦ ¦ ¦ ¦
1 ¦ 3 ¦ 3 ¦ 3 ¦ 3 ¦
¦ ¦ ¦ ¦ ¦
+-------+-------+-------+-------+
¦ ¦ ¦ ¦ ¦
B 2 ¦ 3 ¦ 3 ¦ 3 ¦ 3 ¦
L ¦ ¦ ¦ ¦ ¦
O +-------+-------+-------+-------+
C ¦ ¦ ¦ ¦ ¦
K 3 ¦ 3 ¦ 3 ¦ 3 ¦ 3 ¦
¦ ¦ ¦ ¦ ¦
+-------+-------+-------+-------+
¦ ¦ ¦ ¦ ¦
4 ¦ 3 ¦ 3 ¦ 3 ¦ 3 ¦
¦ ¦ ¦ ¦ ¦
+-------+-------+-------+-------+
(Numbers = number of replicates for each block - treatment combination)
Do the proportions of treatment replicates differ between blocks? Here, no:
block.treat is the same in all blocks (3/12 = 25%).Knowing block tells you nothing about what treat a plot likely received, and vice versa: block and bean are therefore
How could \(X_1\) (block) possibly be informative regarding\(X_2\) (bean)?
T R E A T M E N T
A B C D
+-------+-------+-------+-------+
¦ ¦ ¦ ¦ ¦
1 ¦ 1 ¦ 3 ¦ 3 ¦ 3 ¦
¦ ¦ ¦ ¦ ¦
+-------+-------+-------+-------+
¦ ¦ ¦ ¦ ¦
B 2 ¦ 3 ¦ 3 ¦ 1 ¦ 3 ¦
L ¦ ¦ ¦ ¦ ¦
O +-------+-------+-------+-------+
C ¦ ¦ ¦ ¦ ¦
K 3 ¦ 3 ¦ 3 ¦ 3 ¦ 3 ¦
¦ ¦ ¦ ¦ ¦
+-------+-------+-------+-------+
¦ ¦ ¦ ¦ ¦
4 ¦ 3 ¦ 1 ¦ 3 ¦ 3 ¦
¦ ¦ ¦ ¦ ¦
+-------+-------+-------+-------+
(Numbers = number of replicates for each block - treatment combination)
Due to mishap, we have lost some replicates… so now:
The probability that treat = A is
block 1block 2 and 4block 3Knowing block therefore tells you something about what the treatment most likely was — and vice versa: they are mutually informative and therefore
They are not the same — you can have a design that is unbalanced, but still orthogonal:
T R E A T M E N T
A B C D
+-------+-------+-------+-------+
¦ ¦ ¦ ¦ ¦
1 ¦ 2 ¦ 4 ¦ 4 ¦ 6 ¦
¦ ¦ ¦ ¦ ¦
+-------+-------+-------+-------+
¦ ¦ ¦ ¦ ¦
B 2 ¦ 1 ¦ 2 ¦ 2 ¦ 3 ¦
L ¦ ¦ ¦ ¦ ¦
O +-------+-------+-------+-------+
C ¦ ¦ ¦ ¦ ¦
K 3 ¦ 2 ¦ 4 ¦ 4 ¦ 6 ¦
¦ ¦ ¦ ¦ ¦
+-------+-------+-------+-------+
¦ ¦ ¦ ¦ ¦
4 ¦ 3 ¦ 6 ¦ 6 ¦ 9 ¦
¦ ¦ ¦ ¦ ¦
+-------+-------+-------+-------+In all four blocks, the number of replicates in A:B:C:D = 1:2:2:3
block therefore entails no information about treatblock and treat are But
In an orthogonal design,
This gives a handy orthogonality check…