Alban Guillaumet, Troy University
Definition: The analysis of variance (ANOVA) compares the means of multiple groups simultaneously in a single analysis.
ANOVA generalizes the two-sample \( t \)-test to more than two groups.
Data: Suppose I have one categorical explanatory variable X with \( k > 2 \) levels, and a numerical response variable Y.
Hypothesis test:
\[ \begin{eqnarray*} H_{0} & : & \mu_{1} = \mu_{2} = \cdots = \mu_{k}\\ H_{A} & : & \mathrm{At \ least \ one} \ \mu_{i} \ \mathrm{is \ different \ from \ the \ others} \end{eqnarray*} \]
Test statistic:
\[ F = \frac{\mathrm{group \ mean \ square}}{\mathrm{error \ mean \ square}} = \frac{\mathrm{MS}_{\mathrm{groups}}}{\mathrm{MS}_{\mathrm{error}}} \]
Definition: The group mean square (\( \mathrm{MS}_{\mathrm{groups}} \)) is proportional to the observed amount of variation among the group sample means [between-group variability].
Definition: The error mean square (\( \mathrm{MS}_{\mathrm{error}} \)) estimates the variability among subjects that belong to the same group [within-group variability].
If \( H_{0} \) is true, then \( \mathrm{MS}_{\mathrm{groups}} \) and \( \mathrm{MS}_{\mathrm{error}} \) estimate the same variance, so \( F \) should be close to 1.
If \( H_{0} \) is false, then \( \mathrm{MS}_{\mathrm{groups}} \) tends to exceed \( \mathrm{MS}_{\mathrm{error}} \), so \( F \) tends to be larger than 1.
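The behavior of \( F \) under a true null hypothesis can be checked by simulation. The sketch below is illustrative only: the number of groups, the group size, the seed, and the number of replicates are arbitrary choices, and the data are simulated, not taken from any study in these notes.

```r
# Simulate the F-ratio when the null hypothesis is true.
# k, n, the seed, and the number of replicates are illustrative assumptions.
set.seed(1)
k <- 3                                   # number of groups
n <- 8                                   # individuals per group
group <- factor(rep(seq_len(k), each = n))

simulate_f <- function() {
  y <- rnorm(k * n)                      # same mean in every group, so H0 holds
  anova(lm(y ~ group))[["F value"]][1]   # F-ratio for the group term
}

f_null <- replicate(1000, simulate_f())
mean(f_null)       # average F across null data sets, near 1
mean(f_null > 1)   # fraction of null data sets with F above 1
```

Individual null data sets still produce F-ratios above 1 fairly often, which is why the observed value is compared with the F distribution rather than with 1 itself.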
The knees who say night:
Traveling to a different time zone can cause jet lag, but people adjust as the schedule of light to their eyes in the new time zone gradually resets their internal circadian clock. But can the clock also be reset by exposing the back of the knee to light, as claimed in a controversial paper by Campbell and Murphy (1998)?
Wright and Czeisler (2002) reexamined the phenomenon in a follow-up study measuring the circadian rhythm, two days after treatment, by the daily cycle of melatonin production in 22 people randomly assigned to one of three light treatments: eyes only, knees only, or neither (control). A negative measurement indicates a delay in melatonin production, which is the predicted effect of light treatment.
Separating the sources of variation in the data:
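Data: With \( Y_{ij} \) the response of individual \( j \) in group \( i \), \( \bar{Y}_{i} \) the mean of group \( i \), \( n_{i} \) its sample size, and \( \bar{Y} \) the grand mean, we have: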
\[ \scriptsize{ \begin{eqnarray*} \mathrm{SS}_{\mathrm{total}} = \sum_{i}\sum_{j}(Y_{ij}-\bar{Y})^2 & = & \sum_{i}n_{i}(\bar{Y}_{i}-\bar{Y})^2 + \sum_{i}\sum_{j}(Y_{ij}-\bar{Y}_{i})^2 \\ & = & \mathrm{SS}_{\mathrm{groups}} + \mathrm{SS}_{\mathrm{error}} \end{eqnarray*} } \]
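The decomposition above can be verified numerically. The sketch below uses a small made-up data set (the values and group labels are hypothetical) and computes each sum of squares directly from its definition.

```r
# Hypothetical toy data: 3 groups with unequal sample sizes.
y     <- c(4.1, 5.0, 4.6,  6.2, 5.8, 6.9, 6.4,  5.1, 4.7)
group <- factor(c("a", "a", "a", "b", "b", "b", "b", "c", "c"))

grand_mean  <- mean(y)
group_means <- tapply(y, group, mean)     # one mean per group
n_i         <- tapply(y, group, length)   # group sample sizes

ss_total  <- sum((y - grand_mean)^2)
ss_groups <- sum(n_i * (group_means - grand_mean)^2)
ss_error  <- sum((y - ave(y, group))^2)   # ave() gives each observation its group mean

# The two components should add up to the total (up to rounding error).
all.equal(ss_total, ss_groups + ss_error)
```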
Definition: The group mean square is given by
\[ \mathrm{MS}_{\mathrm{groups}} = \frac{\mathrm{SS}_{\mathrm{groups}}}{df_{\mathrm{groups}}}, \] with \( df_{\mathrm{groups}} = k-1 \).
Definition: The error mean square is given by
\[ \mathrm{MS}_{\mathrm{error}} = \frac{\mathrm{SS}_{\mathrm{error}}}{df_{\mathrm{error}}}, \] with \( df_{\mathrm{error}} = \sum (n_{i}-1) = N-k \), where \( N \) is the total number of observations.
Practice Problem #1
Many humans like the effect of caffeine, but it occurs in plants as a deterrent to herbivory by animals. Caffeine is also found in flower nectar, and nectar is meant as a reward for pollinators, not a deterrent. How does caffeine in nectar affect visitation by pollinators?
Singaravelan et al. (2005) set up feeding stations where bees were offered a choice between a control solution with 20% sucrose or a caffeinated solution with 20% sucrose plus some quantity of caffeine. Over the course of the experiment, four different concentrations of caffeine were provided: 50, 100, 150, and 200 ppm. The response variable was the difference between the amount of nectar consumed from the caffeine feeders and that removed from the control feeders at the same station (grams).
str(strungOutBees)
'data.frame': 20 obs. of 2 variables:
$ ppmCaffeine : Factor w/ 4 levels "ppm50","ppm100",..: 1 2 3 4 1 2 3 4 1 2 ...
$ consumptionDifferenceFromControl: num -0.4 0.01 0.65 0.24 0.34 -0.39 0.53 0.44 0.19 -0.08 ...
Discuss: State the null and alternative hypotheses appropriate for this question.
\[ \begin{eqnarray*} H_{0} & : & \mu_{50} = \mu_{100} = \mu_{150} = \mu_{200} \\ H_{A} & : & \mathrm{At \ least \ one \ of \ the \ means \ is \ different} \end{eqnarray*} \]
Short way using R
caffResults <- lm(consumptionDifferenceFromControl ~ ppmCaffeine, data=strungOutBees)
anova(caffResults)
Analysis of Variance Table
Response: consumptionDifferenceFromControl
            Df Sum Sq Mean Sq F value  Pr(>F)
ppmCaffeine  3 1.1344 0.37814  4.1779 0.02308 *
Residuals   16 1.4482 0.09051
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
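The entries of this table follow from the sums of squares and degrees of freedom. As a rough check, the sketch below recomputes the mean squares, the F-ratio, and the P-value from the printed (rounded) values, so the last digits may differ slightly from R's output.

```r
# Values copied from the ANOVA table above (rounded as printed).
ss_groups <- 1.1344; df_groups <- 3
ss_error  <- 1.4482; df_error  <- 16

ms_groups <- ss_groups / df_groups
ms_error  <- ss_error / df_error
f_ratio   <- ms_groups / ms_error

# P-value: upper-tail probability of the F distribution
p_value <- pf(f_ratio, df_groups, df_error, lower.tail = FALSE)

round(c(MS_groups = ms_groups, MS_error = ms_error, F = f_ratio, P = p_value), 4)
```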
Definition: The \( R^{2} \) value is the “fraction of the variation explained by groups” and is given by
\[ R^{2} = \frac{\mathrm{SS}_{\mathrm{groups}}}{\mathrm{SS}_{\mathrm{total}}}. \] Note: \( 0 \leq R^2 \leq 1 \).
beeAnovaSummary <- summary(caffResults)
beeAnovaSummary$r.squared
[1] 0.4392573
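The same value can be recovered from the sums of squares in the ANOVA table. A minimal sketch, using the rounded values printed there:

```r
# Sums of squares copied from the ANOVA table (rounded as printed).
ss_groups <- 1.1344
ss_error  <- 1.4482
ss_total  <- ss_groups + ss_error

r_squared <- ss_groups / ss_total
round(r_squared, 4)   # agrees with summary(caffResults)$r.squared up to rounding
```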
Assumptions (same as 2-sample \( t \)-test)
Robustness (same as 2-sample \( t \)-test)
Definition: The Kruskal-Wallis test is a nonparametric method for multiple groups based on ranks.
The Kruskal-Wallis test is similar to the Mann-Whitney \( U \)-test and has the same assumptions.
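In R, the Kruskal-Wallis test is available as `kruskal.test()` in the base `stats` package. The sketch below runs it on a small made-up data set; the data frame `toy` and its values are hypothetical and serve only to show the call.

```r
# Hypothetical data, for illustration only.
toy <- data.frame(
  response = c(2.1, 3.4, 1.9, 2.8,  4.0, 5.2, 4.4, 3.9,  3.0, 6.1, 2.5, 4.8),
  group    = factor(rep(c("a", "b", "c"), each = 4))
)

kw <- kruskal.test(response ~ group, data = toy)
kw$statistic   # test statistic computed from the ranks
kw$parameter   # degrees of freedom, k - 1
kw$p.value
```

The formula interface mirrors the `lm()` call used for ANOVA, so the same response and grouping variables can be passed to either function.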
So at least one group mean differs from the others. But which means are different from one another?
Definition: A planned comparison is a comparison between means planned during the design of the study, identified before the data are examined.
A planned comparison must have a strong a priori justification, such as an expectation from theory or a prior study.
Only one or a small number of planned comparisons is allowed, to minimize inflating the Type I error rate.
Definition: An unplanned comparison is one of multiple comparisons, such as between all pairs of means, carried out to help determine where differences between means lie.
Unplanned comparisons are a form of data dredging, so we need to control the inflated Type I error rate that results from performing many tests.
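To see how quickly the Type I error rate grows, consider all pairwise comparisons among the four caffeine groups. The sketch below assumes the tests are independent, which is not strictly true for pairwise comparisons sharing groups, so the result is only a rough approximation.

```r
k       <- 4                  # number of groups, as in the caffeine experiment
alpha   <- 0.05               # significance level of each individual test
n_pairs <- choose(k, 2)       # number of pairwise comparisons

# Probability of at least one Type I error across all tests,
# under the simplifying assumption that the tests are independent.
fwer <- 1 - (1 - alpha)^n_pairs

c(n_pairs = n_pairs, fwer = round(fwer, 4))
```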
A planned comparison is very similar to a 2-sample \( t \)-test. The difference lies in the standard error: it uses the pooled sample variance based on all \( k \) groups (i.e., the error mean square \( \mathrm{MS}_{\mathrm{error}} \)), with the corresponding error \( df=N-k \), rather than the pooled sample variance based only on the two groups being compared:
\[ \mathrm{SE} = \sqrt{\mathrm{MS}_{\mathrm{error}}\left(\frac{1}{n_{1}} + \frac{1}{n_{2}}\right)} \]
This step increases precision and power.
For instance, the planned 95% confidence interval for the difference between the means of the “knee” and “control” groups is estimated as:
\[ -0.788< \mu_{2} - \mu_{1} < 0.734 \]
versus
\[ -0.813< \mu_{2} - \mu_{1} < 0.759 \]
for the two-sample confidence interval.
In R, we can use the multcomp package to do this.
Caffeine example:
library(multcomp)
caffPlanned <- glht(caffResults, linfct = mcp(ppmCaffeine = c("ppm100 - ppm50 = 0")))
confint(caffPlanned)
ppm100 - ppm50 = 0 is called a contrast. In this case, it simply means “Test whether the means of the ppm50 and ppm100 groups are equal.”
Simultaneous Confidence Intervals
Multiple Comparisons of Means: User-defined Contrasts
Fit: lm(formula = consumptionDifferenceFromControl ~ ppmCaffeine,
data = strungOutBees)
Quantile = 2.1199
95% family-wise confidence level
Linear Hypotheses:
                    Estimate lwr     upr
ppm100 - ppm50 == 0 -0.1800  -0.5834  0.2234
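This interval can be reproduced by hand from the standard error formula for a planned comparison. The sketch below takes \( \mathrm{MS}_{\mathrm{error}} \) from the ANOVA table and the estimate from the output above; the group sizes of 5 are an assumption, based on 20 observations spread evenly over 4 caffeine levels.

```r
ms_error <- 0.09051    # Mean Sq of Residuals from the ANOVA table
df_error <- 16         # error degrees of freedom, N - k
n1 <- 5; n2 <- 5       # assumed group sizes (20 observations / 4 groups)
estimate <- -0.18      # ppm100 - ppm50, from the glht output

se     <- sqrt(ms_error * (1 / n1 + 1 / n2))   # SE based on all k groups
t_crit <- qt(0.975, df_error)                  # matches "Quantile" in the output
ci     <- estimate + c(-1, 1) * t_crit * se

round(c(t_crit = t_crit, lwr = ci[1], upr = ci[2]), 4)
```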
Definition: With the Tukey-Kramer method, the probability of making at least one Type I error throughout the course of testing all pairs of means is no greater than the significance level \( \alpha \).
The Tukey-Kramer method works like a series of two-sample \( t \)-tests, but it uses a larger critical value to limit the Type I error rate.
library(multcomp)
# All pairwise (Tukey) contrasts among the caffeine levels
tukeyResults <- glht(caffResults, linfct = mcp(ppmCaffeine = "Tukey"))
# summary(tukeyResults) reports the adjusted pairwise tests
# Base-R alternative:
# x <- aov(consumptionDifferenceFromControl ~ ppmCaffeine, data = strungOutBees)
# TukeyHSD(x)
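The "larger critical value" can be made concrete by comparing the two cutoffs on the same scale. The sketch below uses the caffeine design (4 groups, 16 error degrees of freedom); dividing the studentized range quantile by \( \sqrt{2} \) puts it on the scale of a \( t \)-statistic.

```r
k        <- 4      # number of groups
df_error <- 16     # error degrees of freedom, N - k
alpha    <- 0.05

# Critical value for a single two-sided comparison
t_crit <- qt(1 - alpha / 2, df_error)

# Tukey-Kramer critical value, rescaled from the studentized range
tukey_crit <- qtukey(1 - alpha, nmeans = k, df = df_error) / sqrt(2)

round(c(t_crit = t_crit, tukey_crit = tukey_crit), 3)
```

A pairwise difference therefore has to be larger, relative to its standard error, to be declared significant under Tukey-Kramer than in a single planned comparison.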
Groups in the figure are assigned the same symbol if their means are not significantly different.