The Analysis of Variance (ANOVA)

M. Drew LaMar
November 18, 2019

Course Announcements

Exam this Friday!!!

Chapters 9-13
Review Slides
Bring your calculator!
Lab this week will be review

Type I and II errors

If assumptions of parametric tests are violated, then Type I error rates can be inflated (leads to false confidence in results).
Nonparametric tests have less power than parametric tests (throwing away magnitudes and only using ranks).
When the assumptions of the two-sample \( t \)-test are met, the Mann-Whitney \( U \)-test has 95% as much power as the two-sample \( t \)-test when sample sizes are large (worse when small).
Power of Mann-Whitney \( U \)-test is zero when \( n=2 \) (i.e. useless).
When the assumptions of the one-sample \( t \)-test are met, the sign test has 64% as much power as the one-sample \( t \)-test when sample sizes are large (worse when small).
Power of sign test is zero when \( n \leq 5 \) (two-sided test at \( \alpha = 0.05 \)).
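
The zero-power claims above follow from counting: with very small samples, even the most extreme possible outcome has a P-value above the significance level, so the test can never reject. The sketch below is an illustration under stated assumptions (two-sided exact tests, no ties, \( \alpha = 0.05 \), and \( n = 2 \) read as two observations per group for the Mann-Whitney \( U \)-test). The function names are hypothetical and not from the course materials.

```python
from math import comb

ALPHA = 0.05  # assumed significance level


def min_two_sided_p_sign_test(n):
    """Smallest two-sided P-value a sign test can give with n non-tied observations."""
    # Most extreme outcome: all n observations fall on the same side of the
    # hypothesized median, which has probability 0.5**n under H0.
    return min(1.0, 2 * 0.5 ** n)


def min_two_sided_p_mann_whitney(n1, n2):
    """Smallest two-sided P-value of an exact Mann-Whitney U-test (no ties)."""
    # Most extreme outcome: every value in one group outranks every value in
    # the other. That is 1 of comb(n1 + n2, n1) equally likely rank arrangements.
    return min(1.0, 2 / comb(n1 + n2, n1))


for n in (2, 5, 6):
    p = min_two_sided_p_sign_test(n)
    print("sign test, n =", n, "smallest P =", p, "can reject:", p < ALPHA)

for n in (2, 4):
    p = min_two_sided_p_mann_whitney(n, n)
    print("Mann-Whitney, n per group =", n, "smallest P =", p, "can reject:", p < ALPHA)
```

With five observations the smallest sign-test P-value is \( 2 \times 0.5^{5} = 0.0625 \), which is still above 0.05, so rejection first becomes possible at \( n = 6 \).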

Analysis of variance (intro)

Definition: The analysis of variance (ANOVA) compares the means of multiple groups simultaneously in a single analysis.

ANOVA generalizes the two-sample \( t \)-test to more than two groups.

In the two-sample \( t \)-test, the test statistic is the ratio of the difference between the sample means to the standard error of that difference:

\[ t = \frac{\bar{Y}_{1}-\bar{Y}_{2}}{\mathrm{SE}_{\bar{Y}_{1}-\bar{Y}_{2}}} \]

Analysis of variance (intro)

By squaring the numerator and denominator of the \( t \)-statistic, we get a ratio of variance components.

\[ \frac{\left(\bar{Y}_{1}-\bar{Y}_{2}\right)^2}{\mathrm{SE}^{2}_{\bar{Y}_{1}-\bar{Y}_{2}}} \]

\[ \frac{"\mathrm{Variance \ between \ groups}"}{"\mathrm{Variance \ within \ groups}"} \]
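
The link between the squared \( t \)-statistic and a variance ratio can be checked numerically. The sketch below assumes the pooled-variance form of the two-sample \( t \)-test and uses made-up measurements (hypothetical, for illustration only). It computes \( t^{2} \) and the one-way ANOVA \( F \) ratio for the same two groups.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements for two groups (made up for illustration only).
group1 = [4.1, 5.0, 5.6, 6.2, 4.8]
group2 = [6.9, 7.4, 6.1, 8.0, 7.2]

n1, n2 = len(group1), len(group2)
m1, m2 = mean(group1), mean(group2)

# Pooled sample variance: the within-group variance estimate.
pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)

# Two-sample t statistic (pooled-variance form).
se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
t = (m1 - m2) / se_diff

# One-way ANOVA F ratio for the same two groups (k = 2, so k - 1 = 1).
grand_mean = mean(group1 + group2)
ms_groups = (n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2) / (2 - 1)
ms_error = pooled_var
F = ms_groups / ms_error

print(t ** 2, F)  # the two values agree up to floating-point rounding
```

For two groups the agreement is an algebraic identity, not a coincidence of these numbers: the between-group term reduces to \( \frac{n_{1} n_{2}}{n_{1}+n_{2}} \left(\bar{Y}_{1}-\bar{Y}_{2}\right)^2 \), which is the squared numerator of \( t \) rescaled.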

Analysis of variance (for real)

Data: Suppose I have one categorical explanatory variable X with \( k > 2 \) levels, and a response variable Y.

Hypothesis test:

\[ \begin{eqnarray*} H_{0} & : & \mu_{1} = \mu_{2} = \cdots = \mu_{k}\\ H_{A} & : & \mathrm{At \ least \ one} \ \mu_{i} \ \mathrm{is \ different \ from \ the \ others} \end{eqnarray*} \]

Test statistic:

\[ F = \frac{\mathrm{group \ mean \ square}}{\mathrm{error \ mean \ square}} = \frac{\mathrm{MS}_{\mathrm{groups}}}{\mathrm{MS}_{\mathrm{error}}} \]

Analysis of variance (for real)

Definition: The group mean square (\( \mathrm{MS}_{\mathrm{groups}} \)) is proportional to the observed amount of variation among the group sample means [between-group variability].

Definition: The error mean square (\( \mathrm{MS}_{\mathrm{error}} \)) estimates the variance among subjects that belong to the same group [within-group variability].
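
The two mean squares can be computed directly from their definitions. The sketch below assumes the standard one-way ANOVA formulas: each sum of squares is divided by its degrees of freedom, \( k - 1 \) for groups and \( N - k \) for error, where \( N \) is the total sample size. The function name and the data are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (MS_groups, MS_error, F) for a list of groups of observations."""
    k = len(groups)                        # number of groups
    n_total = sum(len(g) for g in groups)  # total number of observations, N
    group_means = [mean(g) for g in groups]
    grand_mean = mean([y for g in groups for y in g])

    # Between-group sum of squares: squared deviations of the group means
    # from the grand mean, weighted by group size.
    ss_groups = sum(len(g) * (m - grand_mean) ** 2
                    for g, m in zip(groups, group_means))

    # Within-group sum of squares: squared deviations of each observation
    # from its own group mean.
    ss_error = sum((y - m) ** 2
                   for g, m in zip(groups, group_means) for y in g)

    ms_groups = ss_groups / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ms_groups, ms_error, ms_groups / ms_error


# Hypothetical data: three groups with means 2, 3, and 7 (grand mean 4).
ms_groups, ms_error, F = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(ms_groups, ms_error, F)  # 21.0 1.0 21.0
```

Here the between-group sum of squares is \( 3(4 + 1 + 9) = 42 \) on 2 degrees of freedom, and the within-group sum of squares is \( 2 + 2 + 2 = 6 \) on 6 degrees of freedom, giving \( F = 21 / 1 = 21 \).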

Analysis of variance (for real)

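If \( H_{0} \) is true, then \( \mathrm{MS}_{\mathrm{groups}} \) and \( \mathrm{MS}_{\mathrm{error}} \) both estimate the same within-group variance, so \( \mathrm{MS}_{\mathrm{groups}} \approx \mathrm{MS}_{\mathrm{error}} \) and \( F \approx 1 \) (differing only by sampling error).

If \( H_{0} \) is false, then \( \mathrm{MS}_{\mathrm{groups}} \) is expected to exceed \( \mathrm{MS}_{\mathrm{error}} \), so \( F \) tends to be greater than 1.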
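
How \( F \) behaves under the two hypotheses can be explored with a small simulation. The sketch below is an illustration, not part of the original slides. It assumes normally distributed data with equal standard deviations, three groups of 10 observations each, and arbitrary (hypothetical) group means, then averages \( F \) over many simulated data sets.

```python
import random
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F ratio: MS_groups / MS_error."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = mean([y for g in groups for y in g])
    ms_groups = sum(len(g) * (m - grand_mean) ** 2
                    for g, m in zip(groups, group_means)) / (k - 1)
    ms_error = sum((y - m) ** 2
                   for g, m in zip(groups, group_means) for y in g) / (n_total - k)
    return ms_groups / ms_error


def simulate_mean_f(true_means, n_per_group=10, sd=1.0, n_sims=2000, seed=1):
    """Average F over n_sims simulated data sets with the given true group means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    f_values = []
    for _ in range(n_sims):
        groups = [[rng.gauss(mu, sd) for _ in range(n_per_group)]
                  for mu in true_means]
        f_values.append(f_statistic(groups))
    return mean(f_values)


# H0 true: all three population means are equal.
mean_f_null = simulate_mean_f([0.0, 0.0, 0.0])
# H0 false: one population mean differs from the other two.
mean_f_alt = simulate_mean_f([0.0, 0.0, 1.5])

print("average F when H0 is true: ", mean_f_null)
print("average F when H0 is false:", mean_f_alt)
```

Under \( H_{0} \) the long-run average of \( F \) is close to 1 but not exactly 1: it equals \( \mathrm{df}_{\mathrm{error}} / (\mathrm{df}_{\mathrm{error}} - 2) \), which is \( 27/25 = 1.08 \) for this design. Individual values of \( F \) scatter widely around that average, which is why the observed \( F \) is compared against the \( F \)-distribution instead of against 1.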

Analysis of variance (example)

Practice Problem #1

Many humans like the effect of caffeine, but it occurs in plants as a deterrent to herbivory by animals. Caffeine is also found in flower nectar, and nectar is meant as a reward for pollinators, not a deterrent. How does caffeine in nectar affect visitation by pollinators?

Analysis of variance (example)

Practice Problem #1

Singaravelan et al. (2005) set up feeding stations where bees were offered a choice between a control solution with 20% sucrose or a caffeinated solution with 20% sucrose plus some quantity of caffeine. Over the course of the experiment, four different concentrations of caffeine were provided: 50, 100, 150, and 200 ppm. The response variable was the difference between the amount of nectar consumed from the caffeine feeders and that removed from the control feeders at the same station (grams).

Analysis of variance (example)

Discuss: Describe the experimental design.