Different experimental designs

M. Drew LaMar
December 5, 2018

“These procedures are designed to remove the perception that unconscious bias might taint the results of a study.”

- Ruxton & Colegrave

Class announcements

  • Reminder: Homework #8 is due Friday, December 7, at 11:59 pm
  • Exam #3 will be returned in class on Friday
  • Please fill out course evaluations!

Goal of experiments

  • Eliminate bias
  • Reduce sampling error

Experimental techniques

To reduce bias (increase accuracy):

  • Controls
  • Randomization
  • Blinding

To reduce sampling error (increase precision):

  • Replication
  • Balance
  • Blocking

Controls

Positive control: Oh yeah, well is it better than a handgun?

Controls

  • Concurrent control
    • Negative control (probably what you're familiar with)
    • Positive control (compare to best method available)
  • Historical control
    • When a researcher says “Ain't nobody got time/money for that!”

Key takeaway:

“Careful statement of the hypothesis under test makes it easy to determine what type of control your experiment requires.”
- Ruxton & Colegrave

Randomization

Blinding

“These procedures are designed to remove the perception that unconscious bias might taint the results of a study.”

- Ruxton & Colegrave

Replication

Balance

\[ \mathrm{SE}_{\bar{Y}_{1}-\bar{Y}_{2}} = \sqrt{s_{p}^{2}\left(\frac{1}{n_{1}} + \frac{1}{n_{2}}\right)} \]
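The standard error formula above can be explored numerically to see why balance helps. The sketch below is illustrative only: the helper `se_diff`, the total sample size of 20, and the pooled variance of 1 are assumptions chosen for the example, not values from the lecture.

```r
# Standard error of the difference between two sample means,
# given group sizes n1 and n2 and pooled variance sp2 (assumed to be 1 here).
se_diff <- function(n1, n2, sp2 = 1) {
  sqrt(sp2 * (1 / n1 + 1 / n2))
}

n_total <- 20                      # hypothetical fixed total number of units
n1 <- 1:(n_total - 1)              # every possible size for group 1
se <- se_diff(n1, n_total - n1)    # group 2 gets the remaining units

# The allocation with the smallest standard error is the balanced one.
best_n1 <- n1[which.min(se)]
best_n1                            # 10, i.e. n1 = n2

# Compare a balanced and a badly unbalanced allocation.
round(c(balanced = se_diff(10, 10), unbalanced = se_diff(2, 18)), 3)
```

For a fixed total number of experimental units, the sum 1/n1 + 1/n2 is smallest when the two groups are the same size, which is why a balanced design gives the most precise estimate of the difference.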

Blocking

“If you know and can measure some factor of experimental units that is likely to explain a substantial fraction of between-subject variation then it can be effective to block on that factor.”
- Ruxton & Colegrave

“Don't block on a factor unless you have a clear expectation that that factor substantially increases between-individual variation.”
- Ruxton & Colegrave

Blocking and covariates

Examples of the types of statistical analyses used:

  • Paired \( t \)-test (one categorical fixed effect with two levels)
  • Two-way fixed-effects ANOVA (two categorical fixed effects, which may have more than two levels; one is the factor of interest, and the other is a blocking variable used to partition variation)
  • Two-way mixed-effects ANOVA (two categorical variables: one fixed effect of interest and one random-effect blocking variable used to partition variation)
  • ANCOVA (one categorical fixed effect of interest and one numerical covariate): This essentially fits a regression on the covariate within each level of the categorical fixed effect, all at once.
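The analyses listed above can be sketched in base R as follows. Everything here is an assumption made for illustration: the data are simulated, and the names `d`, `block`, `treatment`, `x`, and `y` are hypothetical. The point is only to show what each model formula might look like.

```r
set.seed(42)                                   # arbitrary seed for reproducibility
n_blocks <- 10
block_effect <- rnorm(n_blocks, sd = 5)        # large between-block variation

# Simulated randomized block data: each block contains one unit per treatment.
d <- data.frame(
  block     = factor(rep(seq_len(n_blocks), each = 2)),
  treatment = factor(rep(c("control", "treated"), times = n_blocks))
)
d$x <- rnorm(nrow(d))                          # hypothetical numerical covariate
d$y <- 20 + block_effect[as.integer(d$block)] +
  2 * (d$treatment == "treated") + 1.5 * d$x + rnorm(nrow(d), sd = 1)

# Paired t-test: two treatment levels, units paired within blocks.
y_control <- d$y[d$treatment == "control"]
y_treated <- d$y[d$treatment == "treated"]
paired_test <- t.test(y_treated, y_control, paired = TRUE)

# Two-way fixed-effects ANOVA: block is fitted first to partition variation.
fixed_fit <- lm(y ~ block + treatment, data = d)
anova(fixed_fit)

# Two-way mixed-effects ANOVA: block treated as a random effect via Error().
mixed_fit <- aov(y ~ treatment + Error(block), data = d)
summary(mixed_fit)

# ANCOVA: categorical treatment plus a numerical covariate.
ancova_fit <- lm(y ~ treatment + x, data = d)
anova(ancova_fit)
```

With only two treatment levels, the paired t-test and the block model test the treatment effect in the same way: the squared paired t statistic equals the F statistic for `treatment` in `anova(fixed_fit)`.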

Factorial designs

Two-way, fixed-effects ANOVA

\( Y = \mu + A + B + A*B \)

  • \( A \) and \( B \) are categorical explanatory variables
  • \( Y \) is a numerical response variable
  • \( \mu \) is a constant

The explanatory variables are called factors, as they represent treatments of direct interest.

\( \mathrm{A} \) and \( \mathrm{B} \) are called main effects; they represent effects of each factor alone, when averaged over the categories of the other factor.

\( \mathrm{A*B} \) is the interaction term.

Factorial designs

Example 18.3: Interaction zone

Harley (2003) investigated how herbivores affect the abundance of plants living in the intertidal habitat of coastal Washington using field transplants of a red alga, Mazzaella parksii. The experiment also examined whether the effect of herbivores on the algae depended on where in the intertidal zone the plants were growing. Thirty-two study plots were established just above the low-tide mark, and another 32 plots were set at mid-height between the low- and high-tide marks. Using copper fencing, herbivores were excluded from a randomly chosen half of the plots at each height.
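The random assignment described above (herbivores excluded from a randomly chosen half of the plots at each height) could be carried out along these lines. This is a sketch, not Harley's actual procedure: the data frame `plots`, the plot numbering, and the seed are hypothetical, and the labels `"minus"` and `"plus"` simply mirror the factor levels used in the course data.

```r
set.seed(1)  # arbitrary seed so the assignment can be reproduced

# 64 hypothetical plots: 32 just above the low-tide mark, 32 at mid-height.
plots <- data.frame(
  plot   = 1:64,
  height = rep(c("low", "mid"), each = 32)
)

# Within each height, shuffle 16 "minus" (herbivores excluded) and
# 16 "plus" (herbivores present) labels across the 32 plots.
plots$herbivores <- NA_character_
for (h in unique(plots$height)) {
  idx <- which(plots$height == h)
  plots$herbivores[idx] <- sample(rep(c("minus", "plus"), each = 16))
}

# Check the design: 16 plots in every height-by-herbivores combination.
design <- table(plots$height, plots$herbivores)
design
```

Randomizing separately within each height keeps the design balanced, with the same number of plots in every treatment combination.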

Factorial designs

\[ \begin{align} \mathrm{ALGAE} = & \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} +\\ & \mathrm{HERBIVORY*HEIGHT} \end{align} \]

We need to examine the improvement in fit of the model to the data with and without each term (i.e. main effects and interaction effect).

Question: Does herbivory have an effect on mean algal cover?

Null Model (Type I): \[ \mathrm{ALGAE} = \mathrm{CONSTANT} \]

Alt Model: \[ \mathrm{ALGAE} = \mathrm{CONSTANT} + \mathrm{HERBIVORY} \]

Question: Does height have an effect on mean algal cover?

Null Model (Type I): \[ \mathrm{ALGAE} = \mathrm{CONSTANT} \]

Alt Model: \[ \mathrm{ALGAE} = \mathrm{CONSTANT} + \mathrm{HEIGHT} \]

Question: Does the effect of herbivory on mean algal cover depend on height?

Null Model: \[ \mathrm{ALGAE} = \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} \]

Alt Model: \[ \begin{align} \mathrm{ALGAE} = & \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} +\\ & \mathrm{HERBIVORY*HEIGHT} \end{align} \]
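The null-versus-alternative comparison for the interaction can be sketched by fitting both models and comparing them directly with `anova()`. The data below are simulated stand-ins, not Harley's measurements: the data frame `sim`, the effect sizes, and the noise level are all assumptions chosen only to give the code something to run on.

```r
set.seed(18)  # arbitrary seed

# Simulated balanced 2 x 2 design with 16 plots per cell (hypothetical values).
sim <- expand.grid(rep        = 1:16,
                   height     = c("low", "mid"),
                   herbivores = c("minus", "plus"))
is_mid  <- sim$height == "mid"
is_plus <- sim$herbivores == "plus"
sim$sqrtArea <- 30 - 10 * is_mid - 20 * is_plus +
  25 * (is_mid & is_plus) + rnorm(nrow(sim), sd = 15)

# Null model: main effects only. Alt model: main effects plus interaction.
null_model <- lm(sqrtArea ~ height + herbivores, data = sim)
alt_model  <- lm(sqrtArea ~ height * herbivores, data = sim)

# F-test for the improvement in fit from adding the interaction term.
comparison <- anova(null_model, alt_model)
comparison

# The same F statistic appears in the interaction row of the full ANOVA table.
full_table <- anova(alt_model)
full_table
```

The F statistic from `anova(null_model, alt_model)` matches the `height:herbivores` row of `anova(alt_model)`, because both measure the drop in residual sum of squares from adding the interaction, relative to the residual mean square of the full model.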

Factorial designs

All of these terms can be tested at once by fitting the full model with the following formula notation in R.

algaeFullModel <- lm(sqrtArea ~ height * herbivores, 
                     data = algae)

Caution: height * herbivores expands to both main effects plus their interaction, while height : herbivores denotes only the interaction term. That is, the following fits the same model:

algaeFullModel <- lm(sqrtArea ~ height + 
                       herbivores + 
                       height : herbivores, 
                     data = algae)
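One way to confirm that the two formulas describe the same model, without needing any data, is to ask R which terms each formula expands to. The object names below are made up for this check.

```r
# The two formula notations from the slides.
formula_star  <- sqrtArea ~ height * herbivores
formula_colon <- sqrtArea ~ height + herbivores + height:herbivores

# terms() expands a formula; "term.labels" lists the resulting model terms.
labels_star  <- attr(terms(formula_star), "term.labels")
labels_colon <- attr(terms(formula_colon), "term.labels")

labels_star
identical(labels_star, labels_colon)   # TRUE: same three terms, same order
```

Both formulas expand to `height`, `herbivores`, and `height:herbivores`, so `lm()` fits the same model either way.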

Factorial designs

Let's load the data and inspect its structure with str():

'data.frame':   64 obs. of  3 variables:
 $ height    : Factor w/ 2 levels "low","mid": 1 1 1 1 1 1 1 1 1 1 ...
 $ herbivores: Factor w/ 2 levels "minus","plus": 1 1 1 1 1 1 1 1 1 1 ...
 $ sqrtArea  : num  9.41 34.47 46.67 16.64 24.38 ...

Factorial designs

anova(algaeFullModel)
Analysis of Variance Table

Response: sqrtArea
                  Df  Sum Sq Mean Sq F value   Pr(>F)   
height             1    89.0   88.97  0.3741 0.543096   
herbivores         1  1512.2 1512.18  6.3579 0.014360 * 
height:herbivores  1  2617.0 2616.96 11.0029 0.001549 **
Residuals         60 14270.5  237.84                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
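The F value and P-value in each row of the table come from dividing that term's mean square by the residual mean square. As a sketch, the interaction row can be reproduced by hand from the rounded values printed above, so the results agree with the table only up to rounding. The variable names are made up for this example.

```r
# Values copied from the ANOVA table (rounded as printed).
ms_interaction <- 2616.96          # Mean Sq for height:herbivores
ms_residual    <- 237.84           # Mean Sq for Residuals (14270.5 / 60)

# F ratio: variation explained by the term relative to residual variation.
f_interaction <- ms_interaction / ms_residual
f_interaction                      # about 11.00

# P-value: upper tail of the F distribution with 1 and 60 degrees of freedom.
p_interaction <- pf(f_interaction, df1 = 1, df2 = 60, lower.tail = FALSE)
p_interaction                      # about 0.0015
```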

Factorial designs

algaeFullModel$coefficients
             (Intercept)                heightmid           herbivoresplus 
                32.91450                -10.43090                -22.51075 
heightmid:herbivoresplus 
                25.57809 
Term                        Estimate
(Intercept)                 32.9145029
heightmid                   -10.430905
herbivoresplus              -22.5107478
heightmid:herbivoresplus    25.5780939
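To read these coefficients, it can help to turn them back into fitted means for the four treatment combinations. The sketch below assumes R's default treatment contrasts, with `low` and `minus` as the reference levels (consistent with the factor levels shown earlier). The object names `b`, `cell_means`, and `herbivore_effect` are made up for this example.

```r
# Coefficient estimates copied from the output above.
b <- c("(Intercept)"              =  32.9145029,
       "heightmid"                = -10.430905,
       "herbivoresplus"           = -22.5107478,
       "heightmid:herbivoresplus" =  25.5780939)

# Fitted mean of sqrtArea in each cell of the 2 x 2 design.
low_minus <- b[["(Intercept)"]]
mid_minus <- b[["(Intercept)"]] + b[["heightmid"]]
low_plus  <- b[["(Intercept)"]] + b[["herbivoresplus"]]
mid_plus  <- sum(b)                # all four terms apply

cell_means <- matrix(c(low_minus, mid_minus, low_plus, mid_plus),
                     nrow = 2,
                     dimnames = list(height     = c("low", "mid"),
                                     herbivores = c("minus", "plus")))
round(cell_means, 2)

# Effect of herbivores (plus minus minus) at each height.
herbivore_effect <- cell_means[, "plus"] - cell_means[, "minus"]
round(herbivore_effect, 2)         # about -22.51 at low, about 3.07 at mid
```

The herbivore effect is strongly negative at the low height and close to zero at mid-height. The difference between those two effects is exactly the interaction coefficient, which is what the significant `height:herbivores` term reflects.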

Recommendation

  • Read Whitlock & Schluter, Chapter 14: Designing experiments.
  • Read Whitlock & Schluter, Chapter 18: Multiple explanatory variables.