2024-02-01
We have already met examples of models with more than one EV (ANCOVA is one of them). But let’s go back to designed experiments, and just categorical variables. We have met models such as
- yield ~ treatment (fully randomised)
- yield ~ block + treatment (randomised block)
- yield ~ row + col + treatment (Latin squares)

In all these, there is only really one EV of interest, treatment; we are not interested in the effect of block.
Farmers can choose between two varieties of wheat, V1 (SupaGrow®) and V2 (YieldsRUs®), and between four sowing densities, S1–S4.
As a farmer, you want to maximise your yield and find out:
- Which variety: V1 or V2?
- Which sowing density: S1–S4?

How to design the experiment?
Compare sowing densities in one variety only, V1: use 16 plots, with the 4 densities randomly assigned to 4 plots each.
Compare the two varieties at density S1 only: use 8 plots, with the 2 varieties randomly assigned to 4 plots each.
Assume this result:
- S4 is best.
- V2 is better than V1.

V2 at S4 will give the best yields?

Essentially, these are two separate experiments:
        Experiment 1                 Experiment 2
     S1  S2  S3  S4               S1  S2  S3  S4
   +---+---+---+---+            +---+---+---+---+
V1 | 4 | 4 | 4 | 4 |         V1 | 4 |   |   |   |
   +---+---+---+---+            +---+---+---+---+
V2 |   |   |   |   |         V2 | 4 |   |   |   |
   +---+---+---+---+            +---+---+---+---+
But neither experiment ever tested V2 at S4!

One single experiment instead:
     S1  S2  S3  S4
   +---+---+---+---+
V1 | 3 | 3 | 3 | 3 |
   +---+---+---+---+
V2 | 3 | 3 | 3 | 3 |
   +---+---+---+---+
We can now find the best combination of sowing density and variety.
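As a minimal sketch, the layout in the table can be generated in R with `expand.grid()`. The object names `design` and `cells` are illustrative only and do not come from the lecture's data.

```r
# Hypothetical layout of the single experiment:
# every variety x sowing-density combination, replicated 3 times.
design <- expand.grid(
  sow.dens = c("S1", "S2", "S3", "S4"),
  variety  = c("V1", "V2"),
  rep      = 1:3
)

# Number of plots in each cell of the design
cells <- table(design$variety, design$sow.dens)
print(cells)

# Total number of plots
nrow(design)
```

Every cell of `cells` holds the same count, which is what makes the design balanced.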
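In fact, in a factorial design like this we can ask three questions: does variety affect yield, does sowing density affect yield, and does the effect of sowing density depend on variety?
Question 3 in statistical terminology: “Is there an interaction between sow.dens and variety?”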
We get all three answers from a single design + analysis.
If there is an interaction (if the effect of sow density on yield depends on variety), only a factorial design can find it.
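To make "interaction" concrete, here is a small sketch with made-up cell means. The numbers are invented for illustration and are not the Wheat data. Without an interaction, the difference between varieties is the same at every density. With an interaction, it changes from density to density.

```r
# Made-up cell means, purely for illustration (NOT the Wheat data)
dens <- c("S1", "S2", "S3", "S4")

# No interaction: V2 is 0.5 higher than V1 at every density
no.int <- rbind(V1 = c(4.0, 4.5, 5.0, 5.5),
                V2 = c(4.5, 5.0, 5.5, 6.0))
colnames(no.int) <- dens

# Interaction: V2 does not respond to density, V1 does
with.int <- rbind(V1 = c(4.0, 4.5, 5.0, 5.5),
                  V2 = c(5.0, 5.0, 5.0, 5.0))
colnames(with.int) <- dens

# Difference V2 - V1 at each density
diff.no.int   <- no.int["V2", ] - no.int["V1", ]
diff.with.int <- with.int["V2", ] - with.int["V1", ]

diff.no.int    # constant across densities
diff.with.int  # changes with density
```

In the `with.int` scenario, testing densities in V1 only would pick S4, and testing varieties at S1 only would pick V2, yet V2 at S4 yields less than V1 at S4. Only a design that includes all combinations can reveal this.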
If there is no interaction (if yield responds to sow.dens in the same way in both varieties), it is still the better design:
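- Two separate experiments (24 plots): 4 replicates of each sow.dens + 4 replicates of each variety.
- One factorial experiment (24 plots): 6 replicates of each sow.dens + 12 replicates of each variety.

Factorial designs afford more replicates per factor level from the same number of plots.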
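The replicate counts can be checked by building the three layouts and tabulating them. This is a sketch using the plot numbers given in the text. The data frame names `exp1`, `exp2` and `fact` are illustrative.

```r
dens <- c("S1", "S2", "S3", "S4")
vars <- c("V1", "V2")

# Experiment 1: 4 densities x 4 plots, variety V1 only (16 plots)
exp1 <- data.frame(sow.dens = rep(dens, each = 4), variety = "V1")

# Experiment 2: 2 varieties x 4 plots, density S1 only (8 plots)
exp2 <- data.frame(sow.dens = "S1", variety = rep(vars, each = 4))

# Factorial: all 8 combinations x 3 replicates (24 plots)
fact <- expand.grid(sow.dens = dens, variety = vars, rep = 1:3)

# Replicates per level of each factor
table(exp1$sow.dens)  # separate experiments: per density
table(exp2$variety)   # separate experiments: per variety
table(fact$sow.dens)  # factorial: per density
table(fact$variety)   # factorial: per variety
```

Both approaches use the same total number of plots, but in the factorial layout every plot contributes to the comparison of densities and to the comparison of varieties.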
How do we best analyse our factorial wheat yield experiment?
There are 8 combinations of sow.dens and variety, each with 3 replicates (plots). Assume we have blocked the experiment, i.e., 3 blocks of 8 plots, each block with one replicate of each combination.
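One way such a blocked layout could be randomised is sketched below. The seed, the plot numbering and the object names `combos` and `layout` are assumptions for illustration; the lecture does not say how the real experiment was randomised.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# The 8 treatment combinations
combos <- expand.grid(sow.dens = c("S1", "S2", "S3", "S4"),
                      variety  = c("V1", "V2"))

# Within each of the 3 blocks, assign the 8 combinations to plots in random order
layout <- do.call(rbind, lapply(1:3, function(b) {
  shuffled <- combos[sample(nrow(combos)), ]
  data.frame(block    = b,
             plot     = 1:8,
             sow.dens = shuffled$sow.dens,
             variety  = shuffled$variety)
}))
layout$block <- factor(layout$block)

# Check: each block contains each combination exactly once
with(layout, table(block, sow.dens, variety))
```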
In principle, we could treat each of the 8 combinations of sow.dens and variety as a different treatment (trtmt), thus
- trtmt = 1: variety V1 at sow density S1
- trtmt = 2: variety V2 at sow density S1
- trtmt = 3: variety V1 at sow density S2
- etc.

and then fit the LM
m.1 <- lm(yield ~ block + trtmt, Wheat)
anova(m.1)
## Analysis of Variance Table
##
## Response: yield
##           Df Sum Sq Mean Sq F value  Pr(>F)
## block      2 0.3937 0.19683  0.5193 0.60599
## trtmt      7 8.0776 1.15394  3.0442 0.03623
## Residuals 14 5.3069 0.37906
What does this tell us? The 8 treatment combinations do not all give the same yield \((P=0.036)\).
OK, fair enough; that is a valid answer, but this one-way ANOVA does not answer the three separate questions we asked in our design!
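For reference, a combined factor like trtmt can be derived from the two design factors with `interaction()`. This is only a sketch of one possible construction; how trtmt was actually coded in the Wheat data is not shown in the lecture, and the data frame `d` is hypothetical.

```r
# Hypothetical data frame holding the 8 combinations
d <- expand.grid(variety  = c("V1", "V2"),
                 sow.dens = c("S1", "S2", "S3", "S4"))

# One 8-level factor; with the default lex.order = FALSE
# the first factor (variety) varies fastest
d$trtmt <- interaction(d$variety, d$sow.dens)

levels(d$trtmt)
nlevels(d$trtmt)
```

Because trtmt lumps both factors into a single 8-level term, its one F-test cannot separate the effect of variety from the effect of sowing density.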
So we need a model with three model terms (plus block):
- sow.dens and variety as two separate terms in the model (not rolled into one trtmt), and
- their interaction, sow.dens:variety.

m.2 <- lm(yield ~ block + sow.dens + variety + sow.dens:variety, Wheat)
anova(m.2)
## Analysis of Variance Table
##
## Response: yield
##                  Df Sum Sq Mean Sq F value  Pr(>F)
## block             2 0.3937 0.19683  0.5193 0.60599
## sow.dens          3 5.8736 1.95786  5.1650 0.01303
## variety           1 2.1474 2.14742  5.6651 0.03206
## sow.dens:variety  3 0.0566 0.01887  0.0498 0.98470
## Residuals        14 5.3069 0.37906
| | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| block | 2 | 0.3940 | 0.1970 | 0.5190 | 0.6060 |
| sow.dens | 3 | 5.8700 | 1.9600 | 5.1600 | 0.0130 |
| variety | 1 | 2.1500 | 2.1500 | 5.6700 | 0.0321 |
| sow.dens:variety | 3 | 0.0566 | 0.0189 | 0.0498 | 0.9850 |
| Residuals | 14 | 5.3100 | 0.3790 | | |
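A quick arithmetic check, using only the numbers printed in the two ANOVA tables above, shows how m.2 relates to m.1: the sum of squares and degrees of freedom of trtmt are split into the two main effects and the interaction. The variable names are illustrative.

```r
# Values copied from the anova(m.1) output
ss.trtmt <- 8.0776
df.trtmt <- 7

# Values copied from the anova(m.2) output
ss.parts <- c(sow.dens = 5.8736, variety = 2.1474, interaction = 0.0566)
df.parts <- c(sow.dens = 3, variety = 1, interaction = 3)

sum(ss.parts)  # compare with ss.trtmt
sum(df.parts)  # compare with df.trtmt
```

The block and Residuals rows are identical in both tables, so m.2 explains exactly as much variation as m.1; it only partitions the treatment variation into three interpretable pieces.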
Does the yield of the two varieties respond differently to sow.dens? No \((P=0.985)\).

Note that we start our interpretation with the interaction; we’ll come back to this. In a nutshell, the main reason is this:
If the interaction is significant, we already know that both EVs affect yield!! — so the remaining two Qs become obsolete.
Does variety affect yield, when differences in sow.dens have been taken into account? Yes \((P=0.0321)\).
Does sow.dens affect yield, when differences in variety have been taken into account? Yes \((P=0.013)\).
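The reading order used above (interaction first, then the main effects) can be sketched on simulated data. Everything here is an assumption for illustration: `Wheat.sim`, the seed, and the effect sizes are invented and have nothing to do with the real Wheat data. The formula uses `sow.dens * variety`, which is R shorthand for `sow.dens + variety + sow.dens:variety`.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Simulated blocked factorial: 3 blocks x 4 densities x 2 varieties
Wheat.sim <- expand.grid(block    = factor(1:3),
                         sow.dens = c("S1", "S2", "S3", "S4"),
                         variety  = c("V1", "V2"))

# Invented yields: additive density and variety effects, no interaction
Wheat.sim$yield <- 5 +
  0.4 * as.integer(Wheat.sim$sow.dens) +
  0.6 * (Wheat.sim$variety == "V2") +
  rnorm(nrow(Wheat.sim), sd = 0.6)

m.sim <- lm(yield ~ block + sow.dens * variety, data = Wheat.sim)
tab <- anova(m.sim)

# Read the table from the interaction upwards
tab[c("sow.dens:variety", "variety", "sow.dens"), c("Df", "F value", "Pr(>F)")]
```

The degrees of freedom match those of m.2 because the simulated design has the same structure; the F and P values will of course differ from the real analysis.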
In this example,
In the next lecture, we will