M. Drew LaMar
April 27, 2016
Linear model | Other name | Example study design |
---|---|---|
\( Y = \mu + A + B + A*B \) | Two-way, fixed- effects ANOVA |
Factorial experiment |
\( Y = \mu + A + b + A*b \) | Two-way, mixed- effects ANOVA |
Factorial experiment |
\( Y = \mu + X + A \) | Analysis of covariance (ANCOVA) |
Observational study |
The explanatory variables are called factors, as they represent treatments of direct interest.
\( \mathrm{A} \) and \( \mathrm{B} \) are called main effects; they represent effects of each factor alone, when averaged over the categories of the other factor.
Linear model | Other name | Example study design |
---|---|---|
\( Y = \mu + A + B + A*B \) | Two-way, fixed- effects ANOVA |
Factorial experiment |
\( Y = \mu + A + b + A*b \) | Two-way, mixed- effects ANOVA |
Factorial experiment |
\( Y = \mu + X + A \) | Analysis of covariance (ANCOVA) |
Observational study |
The explanatory variables are called factors, as they represent treatments of direct interest.
\( \mathrm{A*B} \) is the interaction term.
Example 18.3: Interaction zone
Harley (2003) investigated how herbivores affect the abundance of plants living in the intertidal habitat of coastal Washington using field transplants of a red alga, Mazzaella parksii. The experiment also examined whether the effect of herbivores on the algae depended on where in the intertidal zone the plants were growing. Thirty-two study plots were established just above the low-tide mark, and another 32 plots were set at mid-height between the low- and high-tide marks. Using copper fencing, herbivores were excluded from a randomly chosen half of the plots at each height.
\[ \begin{align} \mathrm{ALGAE} = & \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} +\\ & \mathrm{HERBIVORY*HEIGHT} \end{align} \]
\[ \begin{align} \mathrm{ALGAE} = & \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} +\\ & \mathrm{HERBIVORY*HEIGHT} \end{align} \]
We need to examine the improvement in fit of the model to the data with and without each term (i.e. main effects and interaction effect).
Question: Does herbivory have an effect on mean algal cover?
Null Model (Type I): \[ \mathrm{ALGAE} = \mathrm{CONSTANT} \]
Alt Model: \[ \mathrm{ALGAE} = \mathrm{CONSTANT} + \mathrm{HERBIVORY} \]
\[ \begin{align} \mathrm{ALGAE} = & \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} +\\ & \mathrm{HERBIVORY*HEIGHT} \end{align} \]
We need to examine the improvement in fit of the model to the data with and without each term (i.e. main effects and interaction effect).
Question: Does height have an effect on mean algal cover?
Null Model (Type I): \[ \mathrm{ALGAE} = \mathrm{CONSTANT} \]
Alt Model: \[ \mathrm{ALGAE} = \mathrm{CONSTANT} + \mathrm{HEIGHT} \]
\[ \begin{align} \mathrm{ALGAE} = & \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} +\\ & \mathrm{HERBIVORY*HEIGHT} \end{align} \]
We need to examine the improvement in fit of the model to the data with and without each term (i.e. main effects and interaction effect).
Question: Does effect of herbivory on mean algal cover depend on height?
Null Model: \[ \mathrm{ALGAE} = \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} \]
Alt Model: \[ \begin{align} \mathrm{ALGAE} = & \mathrm{CONSTANT} + \mathrm{HERBIVORY} + \mathrm{HEIGHT} +\\ & \mathrm{HERBIVORY*HEIGHT} \end{align} \]
All of these terms get tested at one time using the following notation in R.
algaeFullModel <- lm(sqrtArea ~ height * herbivores, data = algae)
Caution: height * herbivores
refers to all terms, while height : herbivores
refers to the interaction term, i.e. the following performs the same analysis:
algaeFullModel <- lm(sqrtArea ~ height + herbivores + height : herbivores, data = algae)
Let's load the data.
'data.frame': 64 obs. of 3 variables:
$ height : Factor w/ 2 levels "low","mid": 1 1 1 1 1 1 1 1 1 1 ...
$ herbivores: Factor w/ 2 levels "minus","plus": 1 1 1 1 1 1 1 1 1 1 ...
$ sqrtArea : num 9.41 34.47 46.67 16.64 24.38 ...
anova(algaeFullModel)
Analysis of Variance Table
Response: sqrtArea
Df Sum Sq Mean Sq F value Pr(>F)
height 1 89.0 88.97 0.3741 0.543096
herbivores 1 1512.2 1512.18 6.3579 0.014360 *
height:herbivores 1 2617.0 2616.96 11.0029 0.001549 **
Residuals 60 14270.5 237.84
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
algaeFullModel$coefficients
(Intercept) heightmid herbivoresplus
32.91450 -10.43090 -22.51075
heightmid:herbivoresplus
25.57809
Term | Estimate |
---|---|
(Intercept) | 32.9145029 |
heightmid | -10.430905 |
herbivoresplus | -22.5107478 |
heightmid:herbivoresplus | 25.5780939 |
Term | Estimate |
---|---|
(Intercept) | 32.9145029 |
heightmid | -10.430905 |
herbivoresplus | -22.5107478 |
heightmid:herbivoresplus | 25.5780939 |
Linear model | Other name | Example study design |
---|---|---|
\( Y = \mu + A + B + A*B \) | Two-way, fixed- effects ANOVA |
Factorial experiment |
\( Y = \mu + A + b + A*b \) | Two-way, mixed- effects ANOVA |
Factorial experiment |
\( Y = \mu + X + A \) | Analysis of covariance (ANCOVA) |
Observational study |
\[ \begin{align} \mathrm{RESPONSE} = & \mathrm{CONSTANT} + \mathrm{COVARIATE} + \\ & \mathrm{TREATMENT} \end{align} \]
Used to investigate if linear regressions fit to data with two or more groups have the same slope.
For observational studies, the numerical explanatory is called a confounding variable.
Example 18.4: Mole-rat layabouts
“Mole rats are the only known mammals with distinct social castes. A single queen and a small number of males are the only reproducing individuals in a colony. Remaining individuals, called workers, gather food, defend the colony, care for the young, and maintain the burrows. Recently, it was discovered that there might be two worker castes in the Damaraland mole rat. "Frequent workers” do almost all of the work in the colony, whereas “infrequent workers” do little work except on rare occasions… Scantlebury et al. (2006) compared daily energy expenditures of wild mole rats during a dry season. Energy expenditure appears to vary with body mass in both groups, but infrequent workers are heavier than frequent workers.
Open circles: “frequent workers”
Solid circles: “infrequent workers”
Question: How different is mean energy expenditure between the two groups when adjusted for differences in body mass?
Step 0: Load data
moleRat <- read.csv(url("http://www.zoology.ubc.ca/~schluter/WhitlockSchluter/wp-content/data/chapter18/chap18e4MoleRatLayabouts.csv"))
moleRatSorted <- moleRat %>% arrange(lnMass)
str(moleRatSorted)
'data.frame': 35 obs. of 3 variables:
$ caste : Factor w/ 2 levels "lazy","worker": 2 2 2 2 2 2 2 2 2 2 ...
$ lnMass : num 3.85 3.95 3.99 3.99 4.11 ...
$ lnEnergy: num 3.69 4.11 3.69 4.19 3.69 ...
Step 1: Test for interaction
moleRatNoInteractModel <- lm(lnEnergy ~ lnMass + caste, data = moleRatSorted)
moleRatFullModel <- lm(lnEnergy ~ lnMass * caste, data = moleRatSorted)
anova(moleRatNoInteractModel, moleRatFullModel)
Analysis of Variance Table
Model 1: lnEnergy ~ lnMass + caste
Model 2: lnEnergy ~ lnMass * caste
Res.Df RSS Df Sum of Sq F Pr(>F)
1 32 2.8145
2 31 2.7249 1 0.089557 1.0188 0.3206
Discuss: Conclusion?
Step 2: Test for differences in castes, assuming no interaction
anova(moleRatNoInteractModel)
Analysis of Variance Table
Response: lnEnergy
Df Sum Sq Mean Sq F value Pr(>F)
lnMass 1 1.31061 1.31061 14.9013 0.0005178 ***
caste 1 0.63747 0.63747 7.2478 0.0111984 *
Residuals 32 2.81450 0.08795
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Discuss: Conclusion?
Effect size: \( \ \hat{Y}_{\mathrm{frequent}}-\hat{Y}_{\mathrm{infrequent}} = 0.39334 \)
Estimation of effect size
moleRatNoInteractModel
Call:
lm(formula = lnEnergy ~ lnMass + caste, data = moleRatSorted)
Coefficients:
(Intercept) lnMass casteworker
-0.09687 0.89282 0.39334
confint(moleRatNoInteractModel)
2.5 % 97.5 %
(Intercept) -2.01627751 1.8225407
lnMass 0.49961872 1.2860115
casteworker 0.09573442 0.6909503