BS2004 | Week 3

2024-01-11

Interpretation

Model with two categorical variables

In the last mini-lecture, we fitted models with two categorical explanatory variables \((X_1, X_2)\). Now we need to interpret the output.

Returning to our Beans model

yield.m2 <- lm(yield ~ block + bean,  # including blocks info
               data = Beans,
               contrasts = list(block=contr.sum, bean=contr.sum)
               )
anova(yield.m2)

## Analysis of Variance Table
## 
## Response: yield
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## block      3  52.90  17.632  4.6567   0.01713 *  
## bean       5 444.44  88.887 23.4757 1.341e-06 ***
## Residuals 15  56.79   3.786                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ANOVA is not enough!

Recall what cannot be said often enough:

An ANOVA table is great in that it tells you about the importance of the different EVs in your model.

But it does not

tell you how the EVs affect the outcome;
let you predict the effect of the treatment.

We need to interpret the coefficients!

Coefficients from bean count

raw R output

with(summary(yield.m2), round(coefficients, 3))

##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   16.675      0.397  41.982    0.000
## block1         0.042      0.688   0.061    0.953
## block2         2.392      0.688   3.476    0.003
## block3        -1.475      0.688  -2.144    0.049
## bean1          5.075      0.888   5.714    0.000
## bean2          5.700      0.888   6.418    0.000
## bean3         -0.600      0.888  -0.676    0.510
## bean4         -0.250      0.888  -0.281    0.782
## bean5         -3.700      0.888  -4.166    0.001

Making sense of the coefficients

Term	Coef	In R output	Estimate	Std. Error	t value	Pr(>\|t\|)
Constant	\(\mu\)	(Intercept)	16.675	0.397	41.982	0.000
Block	\(\alpha_1\)	block1	0.042	0.688	0.061	0.953
	\(\alpha_2\)	block2	2.392	0.688	3.476	0.003
	\(\alpha_3\)	block3	-1.475	0.688	-2.144	0.049
Bean	\(\beta_1\)	bean1	5.075	0.888	5.714	0.000
	\(\beta_2\)	bean2	5.700	0.888	6.418	0.000
	\(\beta_3\)	bean3	-0.600	0.888	-0.676	0.510
	\(\beta_4\)	bean4	-0.250	0.888	-0.281	0.782
	\(\beta_5\)	bean5	-3.700	0.888	-4.166	0.001

The mean yield, or expected value \(E(y)\) of the yield for a given bean variety in a given block is

\[E(y\mid\mathrm{block,\ bean}) = \mu + \alpha_\mathrm{block} + \beta_\mathrm{bean}\]

Making sense of the coefficients

Term	Coef	In R output	Estimate	Std. Error	t value	Pr(>\|t\|)
Constant	\(\mu\)	(Intercept)	16.675	0.397	41.982	0.000
Block	\(\alpha_1\)	block1	0.042	0.688	0.061	0.953
	\(\alpha_2\)	block2	2.392	0.688	3.476	0.003
	\(\alpha_3\)	block3	-1.475	0.688	-2.144	0.049
Bean	\(\beta_1\)	bean1	5.075	0.888	5.714	0.000
	\(\beta_2\)	bean2	5.700	0.888	6.418	0.000
	\(\beta_3\)	bean3	-0.600	0.888	-0.676	0.510
	\(\beta_4\)	bean4	-0.250	0.888	-0.281	0.782
	\(\beta_5\)	bean5	-3.700	0.888	-4.166	0.001

\[E(y\mid\mathrm{block,\ bean}) = \mu + \alpha_\mathrm{block} + \beta_\mathrm{bean}\]

Example: Mean yield from Bean Variety 3 (bean = 3) grown in Block 2 (block = 2):

\(E(y\mid\mathrm{block}=2\land \mathrm{bean}=3) = \mu + \alpha_2 + \beta_3\)

\(E(y\mid\mathrm{block}=2\land \mathrm{bean}=3) = 16.675 + 2.392 -0.6 = 18.467\)

But what about Block 4!

And Bean Variety 6! Good point:

We have 4 blocks, but only 3 coefficients for block \((\alpha_1, \alpha_2, \alpha_3)\).
We have 6 kinds of bean, but only 5 coefficients for bean \((\beta_1, \ldots, \beta_5)\)
Remember that block ‘consumed’ only 3 DoF, and bean only 5 DoF:
Each estimate ‘costs’ one degree of freedom.

If you have \(n\) groups, you only get \(n-1\) group means. — Why?
We also know that the \(n\) group means must sum to zero!

\[\alpha_1 + \alpha_2 + \ldots +\alpha_n= \sum_{i=1}^n \alpha_i = 0\]

So, we do know the remaining \(n^\mathrm{th}\) goup mean!

\[\alpha_n = -\alpha_1 - \alpha_2 - \ldots - \alpha_{n-1} = -\sum_{i=1}^{n-1}\alpha_i\]

Model equation…

in pseudo-matrix notation

\[y = \mu + \begin{bmatrix} \mathtt{block} & \mathrm{Deviation} \\ 1 & \alpha_1 \\ 2 & \alpha_2 \\ 3 & \alpha_3 \\ 4 & -(\alpha_1+\alpha_2+\alpha_3) \\ \end{bmatrix} + \begin{bmatrix} \mathtt{bean} & \mathrm{Deviation} \\ 1 & \beta_1 \\ 2 & \beta_2 \\ 3 & \beta_3 \\ 4 & \beta_4 \\ 5 & \beta_5 \\ 6 & -(\beta_1+\beta_2+\beta_3+\beta_4+\beta_5) \\ \end{bmatrix} + \epsilon \]

The Concept of Orthogonality

A Definition First

Two EVs \(X_1,\ X_2\) are said to be orthogonal when knowledge of one variable provides no information about the other variable (and vice versa).

In our Beans example,

lm(yield ~ block + bean)

orthogonality means that

knowing block tells you nothing about what bean variety was planted on the plot; and
knowing the bean variety tells you nothing about what block the plot is in.

A simple orthogonal design

              T R E A T M E N T
              
          A       B       C       D       
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   1  ¦   3   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
B  2  ¦   3   ¦   3   ¦   3   ¦   3   ¦
L     ¦       ¦       ¦       ¦       ¦
O     +-------+-------+-------+-------+
C     ¦       ¦       ¦       ¦       ¦
K  3  ¦   3   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   4  ¦   3   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+

(Numbers = number of replicates for each block - treatment combination)

Do the proportions of treatment replicates differ between blocks? Here, no:

A:B:C:D = 3:3:3:3 in every block.
The probability that a replicate (plot) has received a particular treat is the same in all blocks (3/12 = 25%).

Knowing block tells you nothing about what treat a plot likely received, and vice versa: block and bean are therefore orthogonal.

How could \(X_1\) (block) possibly be informative regarding\(X_2\) (bean)?

Now consider this

              T R E A T M E N T
              
          A       B       C       D       
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   1  ¦   1   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
B  2  ¦   3   ¦   3   ¦   1   ¦   3   ¦
L     ¦       ¦       ¦       ¦       ¦
O     +-------+-------+-------+-------+
C     ¦       ¦       ¦       ¦       ¦
K  3  ¦   3   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   4  ¦   3   ¦   1   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+

(Numbers = number of replicates for each block - treatment combination)

Due to mishap, we have lost some replicates… so now:

The probability that treat = A is

1/10 = 10% in block 1
3/10 = 30% in block 2 and 4
3/12 = 25% in block 3

Knowing block therefore tells you something about what the treatment most likely was — and vice versa: they are mutually informative and therefore not orthogonal.

Orthonality vs. Balance

They are not the same — you can have a design that is unbalanced, but still orthogonal:

              T R E A T M E N T
              
          A       B       C       D       
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   1  ¦   2   ¦   4   ¦   4   ¦   6   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
B  2  ¦   1   ¦   2   ¦   2   ¦   3   ¦
L     ¦       ¦       ¦       ¦       ¦
O     +-------+-------+-------+-------+
C     ¦       ¦       ¦       ¦       ¦
K  3  ¦   2   ¦   4   ¦   4   ¦   6   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   4  ¦   3   ¦   6   ¦   6   ¦   9   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+

In all four blocks, the number of replicates in A:B:C:D = 1:2:2:3

block therefore entails no information about treat
block and treat are orthogonal.

But unbalanced — we get much more information about Treatment D (24 replicates) than A (8 replicates).

Why worry?

In an orthogonal design,

We can unambiguously attribute (partition) variation to EVs.
We can unambiguously separate the effects of EVs.
The sequential and the adjusted SSQ are equal!
The order of the EVs does not affect the sequential SSQ.

This gives a handy orthogonality check…

Assessing orthogonality

From tables of replicates for each combination of \(X_1,\ X_2\), as we have just done.
From comparing sequential and adjusted SSQ — are they equal?
Are the sequential SSQ for \(X_1, X_2\) order-independent?