M. Drew LaMar
December 5, 2018
“These procedures are designed to remove the perception that unconscious bias might taint the results of a study.”
- Ruxton & Colegrave
Eliminate bias
Reduce sampling error
To reduce bias (increase accuracy):
To reduce sampling error (increase precision):
Positive control: Oh yeah, well is it better than a handgun?
Key takeaway:
“Careful statement of the hypothesis under test makes it easy to determine what type of control your experiment requires.”
- Ruxton & Colegrave
\[ \mathrm{SE}_{\bar{Y}_{1}-\bar{Y}_{2}} = \sqrt{s_{p}^{2}\left(\frac{1}{n_{1}} + \frac{1}{n_{2}}\right)} \]
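One way to read this formula: for a fixed total number of experimental units, the factor \( \frac{1}{n_{1}} + \frac{1}{n_{2}} \) is smallest when the two groups are the same size, which is why a balanced design reduces sampling error. The R sketch below checks this numerically. The total of 20 units and the helper name `se_multiplier` are illustrative assumptions, not values from any study discussed here.

```r
# Factor that multiplies the pooled variance inside the standard error
# of a difference between two means.
se_multiplier <- function(n1, n2) 1 / n1 + 1 / n2

total_n <- 20                      # hypothetical total number of units
n1 <- 1:(total_n - 1)              # every possible size for group 1
mult <- se_multiplier(n1, total_n - n1)

best_n1 <- n1[which.min(mult)]     # allocation with the smallest multiplier
print(best_n1)
print(se_multiplier(10, 10))       # balanced split
print(se_multiplier(15, 5))        # unbalanced split, larger multiplier
```

With the same total sample size, the unbalanced split gives a larger multiplier and therefore a larger standard error.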
“If you know and can measure some factor of experimental units that is likely to explain a substantial fraction of between-subject variation then it can be effective to block on that factor.”
- Ruxton & Colegrave
“Don't block on a factor unless you have a clear expectation that that factor substantially increases between-individual variation.”
- Ruxton & Colegrave
Examples of the types of statistics used:
Two-way, fixed-effects ANOVA
\( Y = \mu + A + B + A*B \)
The explanatory variables are called factors, as they represent treatments of direct interest.
\( \mathrm{A} \) and \( \mathrm{B} \) are called main effects; they represent effects of each factor alone, when averaged over the categories of the other factor.
\( \mathrm{A*B} \) is the interaction term.
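It may help to write the same model in terms of cell means. The subscripts \( i \) and \( j \) (levels of the two factors) and the symbol \( \mu_{ij} \) are notation assumed here for illustration:

\[ \mu_{ij} = \mu + A_{i} + B_{j} + (A*B)_{ij} \]

If every interaction term is zero, the difference between two levels of one factor is the same at every level of the other:

\[ \mu_{ij} - \mu_{i'j} = A_{i} - A_{i'} \quad \text{for all } j \]

A nonzero interaction means this difference changes with \( j \), i.e. the effect of one factor depends on the level of the other.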
Example 18.3: Interaction zone
Harley (2003) investigated how herbivores affect the abundance of plants living in the intertidal habitat of coastal Washington using field transplants of a red alga, Mazzaella parksii. The experiment also examined whether the effect of herbivores on the algae depended on where in the intertidal zone the plants were growing. Thirty-two study plots were established just above the low-tide mark, and another 32 plots were set at mid-height between the low- and high-tide marks. Using copper fencing, herbivores were excluded from a randomly chosen half of the plots at each height.
\[ \begin{align} \mathrm{ALGAE} = & \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} +\\ & \mathrm{HERBIVORY*HEIGHT} \end{align} \]
We need to examine the improvement in fit of the model to the data with and without each term (i.e. main effects and interaction effect).
Question: Does herbivory have an effect on mean algal cover?
Null Model (Type I): \[ \mathrm{ALGAE} = \mathrm{CONSTANT} \]
Alt Model: \[ \mathrm{ALGAE} = \mathrm{CONSTANT} + \mathrm{HERBIVORY} \]
Question: Does height have an effect on mean algal cover?
Null Model (Type I): \[ \mathrm{ALGAE} = \mathrm{CONSTANT} \]
Alt Model: \[ \mathrm{ALGAE} = \mathrm{CONSTANT} + \mathrm{HEIGHT} \]
Question: Does the effect of herbivory on mean algal cover depend on height?
Null Model: \[ \mathrm{ALGAE} = \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} \]
Alt Model: \[ \begin{align} \mathrm{ALGAE} = & \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} +\\ & \mathrm{HERBIVORY*HEIGHT} \end{align} \]
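The null-versus-alternative comparison for the interaction can be sketched directly in R by fitting both models and comparing them. Because the algae data are not loaded at this point, the sketch uses `warpbreaks`, a built-in R dataset with two factors (`wool`, `tension`), purely as a stand-in. The object names are illustrative.

```r
# Stand-in data: warpbreaks ships with R (response: breaks; factors: wool, tension).
reduced <- lm(breaks ~ wool + tension, data = warpbreaks)                 # null model: main effects only
full    <- lm(breaks ~ wool + tension + wool:tension, data = warpbreaks)  # alt model: adds the interaction

# F-test for the improvement in fit from adding the interaction term.
comparison <- anova(reduced, full)
print(comparison)

# The same F statistic appears in the interaction row of the full model's ANOVA table.
f_nested <- comparison$F[2]
f_table  <- anova(full)["wool:tension", "F value"]
```

The two F values agree because the interaction is the last term entered, so its row in the ANOVA table is exactly this nested-model comparison.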
All of these terms get tested at one time using the following notation in R.
algaeFullModel <- lm(sqrtArea ~ height * herbivores,
data = algae)
Caution: `height * herbivores` refers to all terms (both main effects plus the interaction), while `height : herbivores` refers to the interaction term only; i.e. the following performs the same analysis:
algaeFullModel <- lm(sqrtArea ~ height +
herbivores +
height : herbivores,
data = algae)
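A quick way to confirm that the two formulas are equivalent, without needing any data, is to ask R how it expands each one. This is a small sketch using `terms()` from base R; the object names `star_terms` and `colon_terms` are illustrative.

```r
# Expand each formula into its individual model terms (no data required).
star_terms  <- attr(terms(sqrtArea ~ height * herbivores), "term.labels")
colon_terms <- attr(terms(sqrtArea ~ height + herbivores + height:herbivores),
                    "term.labels")

print(star_terms)                    # "height" "herbivores" "height:herbivores"
identical(star_terms, colon_terms)   # TRUE
```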
Let's load the data.
'data.frame': 64 obs. of 3 variables:
$ height : Factor w/ 2 levels "low","mid": 1 1 1 1 1 1 1 1 1 1 ...
$ herbivores: Factor w/ 2 levels "minus","plus": 1 1 1 1 1 1 1 1 1 1 ...
$ sqrtArea : num 9.41 34.47 46.67 16.64 24.38 ...
anova(algaeFullModel)
Analysis of Variance Table

Response: sqrtArea
                  Df  Sum Sq Mean Sq F value   Pr(>F)
height             1    89.0   88.97  0.3741 0.543096
herbivores         1  1512.2 1512.18  6.3579 0.014360 *
height:herbivores  1  2617.0 2616.96 11.0029 0.001549 **
Residuals         60 14270.5  237.84
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
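To see where the F values and P-values in this table come from, each term's mean square can be divided by the residual mean square and compared to an F distribution with 1 and 60 degrees of freedom. The sketch below re-enters the (rounded) mean squares from the table by hand, so the results match the table only up to rounding.

```r
# Mean squares copied from the ANOVA table above (rounded values).
mean_sq  <- c(height = 88.97, herbivores = 1512.18, "height:herbivores" = 2616.96)
resid_ms <- 237.84   # residual mean square
resid_df <- 60       # residual degrees of freedom

# F ratio for each term, and its upper-tail probability under F(1, 60).
f_values <- mean_sq / resid_ms
p_values <- pf(f_values, df1 = 1, df2 = resid_df, lower.tail = FALSE)

print(round(cbind(F = f_values, p = p_values), 4))
```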
algaeFullModel$coefficients
(Intercept) heightmid herbivoresplus
32.91450 -10.43090 -22.51075
heightmid:herbivoresplus
25.57809
Term | Estimate |
---|---|
(Intercept) | 32.9145029 |
heightmid | -10.430905 |
herbivoresplus | -22.5107478 |
heightmid:herbivoresplus | 25.5780939 |
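The coefficients are easier to interpret once they are combined into the predicted mean for each of the four treatment combinations. The sketch below assumes R's default treatment contrasts, with `low` and `minus` as the reference levels (they are the first factor levels listed in the data summary). The estimates are re-entered by hand from the table, and the object names are illustrative.

```r
# Estimates copied from the coefficient table.
b <- c("(Intercept)"              =  32.9145029,
       "heightmid"                = -10.430905,
       "herbivoresplus"           = -22.5107478,
       "heightmid:herbivoresplus" =  25.5780939)

# One row per treatment combination; first level of each factor is the reference.
cells <- expand.grid(height = c("low", "mid"), herbivores = c("minus", "plus"))

# Design matrix with the same columns (and order) as the coefficients.
X <- model.matrix(~ height * herbivores, data = cells)
cells$predicted <- as.vector(X %*% b)

print(cells)
```

On the square-root scale, the predicted means are about 32.91 (low, minus), 22.48 (mid, minus), 10.40 (low, plus), and 25.55 (mid, plus). Herbivores lower the predicted value by about 22.5 at low height but change it by only about +3.1 at mid height, which is the interaction detected in the ANOVA table.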