2024-01-11

Interpretation

Model with two categorical variables

In the last mini-lecture, we fitted models with two categorical explanatory variables \((X_1, X_2)\). Now we need to interpret the output.

Returning to our Beans model

yield.m2 <- lm(yield ~ block + bean,  # including blocks info
               data = Beans,
               contrasts = list(block=contr.sum, bean=contr.sum)
               )
anova(yield.m2)
## Analysis of Variance Table
## 
## Response: yield
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## block      3  52.90  17.632  4.6567   0.01713 *  
## bean       5 444.44  88.887 23.4757 1.341e-06 ***
## Residuals 15  56.79   3.786                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ANOVA is not enough!

Recall what cannot be said often enough:

An ANOVA table is great in that it tells you about the importance of the different EVs in your model.

But it does not

  • tell you how the EVs affect the outcome;
  • let you predict the effect of the treatment.

We need to interpret the coefficients!

Coefficients from bean count

raw R output

with(summary(yield.m2), round(coefficients, 3))
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   16.675      0.397  41.982    0.000
## block1         0.042      0.688   0.061    0.953
## block2         2.392      0.688   3.476    0.003
## block3        -1.475      0.688  -2.144    0.049
## bean1          5.075      0.888   5.714    0.000
## bean2          5.700      0.888   6.418    0.000
## bean3         -0.600      0.888  -0.676    0.510
## bean4         -0.250      0.888  -0.281    0.782
## bean5         -3.700      0.888  -4.166    0.001

Making sense of the coefficients

Term Coef In R output Estimate Std. Error t value Pr(>|t|)
Constant \(\mu\) (Intercept) 16.675 0.397 41.982 0.000
Block \(\alpha_1\) block1 0.042 0.688 0.061 0.953
\(\alpha_2\) block2 2.392 0.688 3.476 0.003
\(\alpha_3\) block3 -1.475 0.688 -2.144 0.049
Bean \(\beta_1\) bean1 5.075 0.888 5.714 0.000
\(\beta_2\) bean2 5.700 0.888 6.418 0.000
\(\beta_3\) bean3 -0.600 0.888 -0.676 0.510
\(\beta_4\) bean4 -0.250 0.888 -0.281 0.782
\(\beta_5\) bean5 -3.700 0.888 -4.166 0.001

The mean yield, or expected value \(E(y)\) of the yield for a given bean variety in a given block is

\[E(y\mid\mathrm{block,\ bean}) = \mu + \alpha_\mathrm{block} + \beta_\mathrm{bean}\]

Making sense of the coefficients

Term Coef In R output Estimate Std. Error t value Pr(>|t|)
Constant \(\mu\) (Intercept) 16.675 0.397 41.982 0.000
Block \(\alpha_1\) block1 0.042 0.688 0.061 0.953
\(\alpha_2\) block2 2.392 0.688 3.476 0.003
\(\alpha_3\) block3 -1.475 0.688 -2.144 0.049
Bean \(\beta_1\) bean1 5.075 0.888 5.714 0.000
\(\beta_2\) bean2 5.700 0.888 6.418 0.000
\(\beta_3\) bean3 -0.600 0.888 -0.676 0.510
\(\beta_4\) bean4 -0.250 0.888 -0.281 0.782
\(\beta_5\) bean5 -3.700 0.888 -4.166 0.001

\[E(y\mid\mathrm{block,\ bean}) = \mu + \alpha_\mathrm{block} + \beta_\mathrm{bean}\]

Example: Mean yield from Bean Variety 3 (bean = 3) grown in Block 2 (block = 2):

\(E(y\mid\mathrm{block}=2\land \mathrm{bean}=3) = \mu + \alpha_2 + \beta_3\)

\(E(y\mid\mathrm{block}=2\land \mathrm{bean}=3) = 16.675 + 2.392 -0.6 = 18.467\)

But what about Block 4!

And Bean Variety 6! Good point:

  • We have 4 blocks, but only 3 coefficients for block \((\alpha_1, \alpha_2, \alpha_3)\).
  • We have 6 kinds of bean, but only 5 coefficients for bean \((\beta_1, \ldots, \beta_5)\)
  • Remember that block ‘consumed’ only 3 DoF, and bean only 5 DoF:
    Each estimate ‘costs’ one degree of freedom.

If you have \(n\) groups, you only get \(n-1\) group means. — Why?
We also know that the \(n\) group means must sum to zero!

\[\alpha_1 + \alpha_2 + \ldots +\alpha_n= \sum_{i=1}^n \alpha_i = 0\]

So, we do know the remaining \(n^\mathrm{th}\) goup mean!

\[\alpha_n = -\alpha_1 - \alpha_2 - \ldots - \alpha_{n-1} = -\sum_{i=1}^{n-1}\alpha_i\]

Model equation…

in pseudo-matrix notation

\[y = \mu + \begin{bmatrix} \mathtt{block} & \mathrm{Deviation} \\ 1 & \alpha_1 \\ 2 & \alpha_2 \\ 3 & \alpha_3 \\ 4 & -(\alpha_1+\alpha_2+\alpha_3) \\ \end{bmatrix} + \begin{bmatrix} \mathtt{bean} & \mathrm{Deviation} \\ 1 & \beta_1 \\ 2 & \beta_2 \\ 3 & \beta_3 \\ 4 & \beta_4 \\ 5 & \beta_5 \\ 6 & -(\beta_1+\beta_2+\beta_3+\beta_4+\beta_5) \\ \end{bmatrix} + \epsilon \]

The Concept of Orthogonality

A Definition First

Two EVs \(X_1,\ X_2\) are said to be orthogonal when knowledge of one variable provides no information about the other variable (and vice versa).

In our Beans example,

lm(yield ~ block + bean)

orthogonality means that

  • knowing block tells you nothing about what bean variety was planted on the plot; and
  • knowing the bean variety tells you nothing about what block the plot is in.

A simple orthogonal design

              T R E A T M E N T
              
          A       B       C       D       
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   1  ¦   3   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
B  2  ¦   3   ¦   3   ¦   3   ¦   3   ¦
L     ¦       ¦       ¦       ¦       ¦
O     +-------+-------+-------+-------+
C     ¦       ¦       ¦       ¦       ¦
K  3  ¦   3   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   4  ¦   3   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+

(Numbers = number of replicates for each block - treatment combination)

Do the proportions of treatment replicates differ between blocks? Here, no:

  • A:B:C:D = 3:3:3:3 in every block.
  • The probability that a replicate (plot) has received a particular treat is the same in all blocks (3/12 = 25%).

Knowing block tells you nothing about what treat a plot likely received, and vice versa: block and bean are therefore orthogonal.

How could \(X_1\) (block) possibly be informative regarding\(X_2\) (bean)?

Now consider this

              T R E A T M E N T
              
          A       B       C       D       
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   1  ¦   1   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
B  2  ¦   3   ¦   3   ¦   1   ¦   3   ¦
L     ¦       ¦       ¦       ¦       ¦
O     +-------+-------+-------+-------+
C     ¦       ¦       ¦       ¦       ¦
K  3  ¦   3   ¦   3   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   4  ¦   3   ¦   1   ¦   3   ¦   3   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+

(Numbers = number of replicates for each block - treatment combination)

Due to mishap, we have lost some replicates… so now:

The probability that treat = A is

  • 1/10 = 10% in block 1
  • 3/10 = 30% in block 2 and 4
  • 3/12 = 25% in block 3

Knowing block therefore tells you something about what the treatment most likely was — and vice versa: they are mutually informative and therefore not orthogonal.

Orthonality vs. Balance

They are not the same — you can have a design that is unbalanced, but still orthogonal:

              T R E A T M E N T
              
          A       B       C       D       
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   1  ¦   2   ¦   4   ¦   4   ¦   6   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
B  2  ¦   1   ¦   2   ¦   2   ¦   3   ¦
L     ¦       ¦       ¦       ¦       ¦
O     +-------+-------+-------+-------+
C     ¦       ¦       ¦       ¦       ¦
K  3  ¦   2   ¦   4   ¦   4   ¦   6   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+
      ¦       ¦       ¦       ¦       ¦
   4  ¦   3   ¦   6   ¦   6   ¦   9   ¦
      ¦       ¦       ¦       ¦       ¦
      +-------+-------+-------+-------+

In all four blocks, the number of replicates in A:B:C:D = 1:2:2:3

  • block therefore entails no information about treat
  • block and treat are orthogonal.

But unbalanced — we get much more information about Treatment D (24 replicates) than A (8 replicates).

Why worry?

In an orthogonal design,

  • We can unambiguously attribute (partition) variation to EVs.
  • We can unambiguously separate the effects of EVs.
  • The sequential and the adjusted SSQ are equal!
  • The order of the EVs does not affect the sequential SSQ.

This gives a handy orthogonality check…

Assessing orthogonality

  • From tables of replicates for each combination of \(X_1,\ X_2\), as we have just done.
  • From comparing sequential and adjusted SSQ — are they equal?
  • Are the sequential SSQ for \(X_1, X_2\) order-independent?