BIO2POS Lecture Topic 3A

class: middle
background-image: url(data:image/png;base64,#LTU_logo_clear.jpg)
background-position: top left
background-size: 25%

# BIO2POS 
# One-way ANOVA
## Data Analysis Topic 3A
### La Trobe University

---

# Welcome!

### In this lecture we will extend our repertoire of statistical tests and introduce the one-way ANOVA and its non-parametric equivalent.

Over the following slides, we will cover:

* .orangered_style[One-way ANOVA]

* Definition and Hypotheses
--

* Post-hoc testing via multiple comparisons
--

* Interpreting Output
--

* Assumptions
--

* .orangered_style[Kruskal-Wallis Test]

---

# Intended Learning Objectives

### By the end of this lecture you will:

* understand when and how to use a .orangered_style[one-way ANOVA]

* be able to assess whether one-way ANOVA test assumptions have been met

* understand when and how to use a .orangered_style[Kruskal-Wallis test]
  
--

* be able to correctly .seagreen_style[interpret] and .seagreen_style[summarise] the results of the above tests
  
--

<br>

The content you learn in Topics 3A and 3B extends the skills you have developed in Topics [2A](https://rpubs.com/LTU_BIO2POS/DA2A) and [2B](https://rpubs.com/LTU_BIO2POS/DA2B).

We will practise content from this topic in this week's DA computer lab, which also includes some optional extension material.

---

# `$t$`-test Limitations

In the DA Topics [2A](https://rpubs.com/LTU_BIO2POS/DA2A) and [2B](https://rpubs.com/LTU_BIO2POS/DA2B) lectures, we introduced `$t$`-tests.

* .bold_style[One sample *t*-test]: Compare a sample mean to a fixed reference value
 
--

* .bold_style[Paired *t*-test]: Compare mean difference between two dependent groups

* .bold_style[Two sample *t*-test]: Compare sample means of two independent groups

These tests are a good starting point and can be used to analyse data from a wide variety of scientific experiments.

<br>
However, `$t$`-tests share a clear limitation:

* `$t$`-tests only analyse a .orangered_style[maximum of two groups] at any one time

<br>
In the .seagreen_style[Topic 3 lectures], we will introduce some more advanced tests which generalise `$t$`-tests for use with 2+ groups.

---

# ANOVA

.orangered_style[ANOVA] stands for .bold_style[An]alysis .bold_style[o]f .bold_style[Va]riance.

Several models and classes of ANOVA exist.

* We will start by focusing on .orangered_style[One-way ANOVA] (aka single factor ANOVA), which uses .bold_style[one] .seagreen_style[categorical independent variable] to explain a .seagreen_style[numeric response variable].
  
--

A one-way ANOVA can be used to compare the means of **two or more** independent groups.

* We can think of a one-way ANOVA as a generalisation of the two sample `$t$`-test.
  
  * Groups are determined via a single categorical variable (e.g. *car brand*)

---

# One-way ANOVA Hypotheses

Suppose we have a categorical independent variable with `$c$` categories `$(c \geq 2)$`.

<br>

Let `$\mu_i$` denote the population mean for category `$i$`, where `$i = 1, 2, \ldots, c$`.

<br>

Our .orangered_style[one-way ANOVA hypotheses] will be:

`$$H_0: \mu_1 = \mu_2 = \cdots = \mu_c \text{ vs } H_1: \text{ Not all means are equal}$$`
--

<br>

*It is important to note here that our alternative hypothesis does not say that all of the means differ from one another, but rather that* **at least two of the means are not equal**.

---

# One-way ANOVA Hypotheses - Petrel Example

Recall the .seagreen_style[South Georgian Diving Petrels] example from the [Topic 2B lecture](https://rpubs.com/LTU_BIO2POS/DA2B).

* We simplified the data for our two sample `$t$`-test example

Fischer et al. (2018) categorised the petrels into several groups based on their location, including .seagreen_style[South Atlantic Ocean (SAO)], .seagreen_style[South Indian Ocean (SIO)] and .seagreen_style[New Zealand (SPO)].

* So here we have `$c = 3$` categories

Let the following notation denote the population mean wing length (mm) for each group:

* `$\mu_{SAO}$` for SAO group petrels 
  
--

*  `$\mu_{SIO}$` for SIO group petrels
  
--

* `$\mu_{SPO}$`  for SPO group petrels

Our .orangered_style[one-way ANOVA hypotheses] will be:

`$$H_0: \mu_{SAO} = \mu_{SIO} = \mu_{SPO} \text{  vs  } H_1: \text{ Not all means are equal}$$`

---

# One-way ANOVA Sources of Variation

The one-way ANOVA compares two sources of variation in the data.

If the variation between the group means is large relative to the variation within the groups, this suggests that the group means are not all equal.

The two types of variation assessed are:

* .orangered_style[Between-Group Variation]: A measure of the sample mean variation between groups
  
    * e.g. the variation of the wing length sample means between the SAO, SIO and SPO petrel groups
  
--

* .orangered_style[Within-Group Variation]: A measure of the variation of individual sample values from their sample mean, within a group
  
    * e.g. the variation of the wing length sample measurements of SPO petrels from the wing length sample mean of SPO petrels
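
---

# Sources of Variation - Code Sketch

The two sources of variation can be computed by hand. The following is a minimal Python sketch, assuming small made-up wing lengths (these values are hypothetical and are not the Fischer et al. (2018) data) and using sums of squared deviations as the measure of variation.

```python
from statistics import mean

# Hypothetical wing lengths (mm) for illustration only; NOT the real petrel data.
groups = {
    "SAO": [114, 116, 118],
    "SIO": [116, 118, 120],
    "SPO": [119, 120, 121, 120],
}

all_values = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_values)

# Between-group variation: how far each group mean sits from the overall mean,
# weighted by the number of observations in that group.
ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())

# Within-group variation: how far each observation sits from its own group mean.
ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)

# Total variation: how far each observation sits from the overall mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

print(f"Between-group SS: {ss_between:.2f}")
print(f"Within-group SS:  {ss_within:.2f}")
print(f"Total SS:         {ss_total:.2f}")
```

For these made-up values the between-group sum of squares is 27.6 and the within-group sum of squares is 18.0, which together account for the total of 45.6.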

---

# Between-Group Variation - Petrel Example

---

# Within-Group Variation - Petrel Example

---

# F Test

Recall that for `$t$`-tests, we computed a test statistic `$T$` which followed a `$t$`-distribution with `$df$` degrees of freedom `$(T \sim t_{df})$`.

For one-way ANOVAs, we compute a .orangered_style[test statistic *F*], which follows an `$F$`-distribution, with two separate degrees of freedom values, `$df_1$` and `$df_2$` `$(F \sim F_{df_{1}, df_{2}})$`, which together define the shape of the `$F$`-distribution.

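* `$df_1 = c - 1$`, where `$c$` is the number of groups

* `$df_2 = n - c$`, where `$n$` is the total number of observations across all groups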

The `$F$` test statistic is obtained by calculating the ratio of the .orangered_style[Between-Group Variation] to the .orangered_style[Within-Group Variation].

* If we obtain a large `$F$` value, this suggests that at least one of the group means differs from the others

* A small `$F$` value suggests the means of the different groups are all similar
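
---

# F Test - Code Sketch

As a rough illustration of the ratio described on the previous slide, the sketch below computes Fisher's `$F$` statistic and its two degrees of freedom. It assumes the variation measures are mean squares (sums of squares divided by their degrees of freedom); the function name `fisher_f` and the data are hypothetical, not jamovi output.

```python
from statistics import mean


def fisher_f(groups):
    """Return (F, df1, df2) for a list of groups of numeric observations."""
    c = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean(y for g in groups for y in g)

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

    df1, df2 = c - 1, n - c
    ms_between = ss_between / df1        # between-group mean square
    ms_within = ss_within / df2          # within-group mean square
    return ms_between / ms_within, df1, df2


# Hypothetical wing lengths (mm); NOT the real petrel data.
f_stat, df1, df2 = fisher_f([[114, 116, 118], [116, 118, 120], [119, 120, 121, 120]])
print(f"F({df1}, {df2}) = {f_stat:.3f}")
```

The `$p$`-value would then be read from the `$F_{df_1, df_2}$` distribution, which statistical software such as jamovi does for us.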

---

# Petrel Example - Descriptives Plot

.left-column[
<br>

The Wing Length sample means (black squares) look different, and the group variances look very different.

This suggests a one-way ANOVA test is worthwhile.
]

.right-column[
<img src="data:image/png;base64,#winglength_boxplots.jpg" width="400px" style="display: block; margin: auto;" />
]

---

# One-way ANOVA jamovi output

---

# One-way ANOVA `$F$`-test versions

.bold_style[Important Note:] There are two versions of the one-way ANOVA `$F$`-test we can use.

* .orangered_style[Fisher's *F*-test]

* .orangered_style[Welch's *F*-test] (default jamovi option)

.bold_style[Déjà vu:] The version we select depends on whether we treat the group variances as equal or unequal. In our discussions so far, we have assumed the variances are equal.
  
--

* If the variances are equal, we can use the .orangered_style[Fisher's *F*-test].

* However, if the variances are unequal, the actual Type I error rate of Fisher's `$F$`-test can differ from the level of significance we specified (e.g. it may rise above `$\alpha = 0.05$`)

* If the variances are unequal, we should use the more robust .orangered_style[Welch's *F*-test]

---

# Levene's Test

Just as for the two sample `$t$`-test, we can use the .orangered_style[Levene's Test] to check our assumption of equal variance for the `$F$`-test.

<br>

* Levene's Test uses the hypotheses:
  
  `$$H_0: \text{ Group Variances are equal vs } H_1: \text{ Group Variances are unequal}$$`
--

<br>

If the Levene's Test `$p$`-value is less than `$0.05$`, we reject `$H_0$` and use the .orangered_style[Welch's *F*-test] version of the one-way ANOVA `$F$`-test; otherwise we use the Fisher's `$F$`-test.

--
<br>

.bold_style[Suggestion]: If in doubt, always use Welch's `$F$`-test.

---

# One-way ANOVA jamovi output - both options

* *Note that the main outcome remains the same (reject `$H_0$`), although some values change*
  
---

# Conclusion?

If we obtain a statistically significant `$p$`-value for our one-way ANOVA `$F$`-test, you may be tempted to think our job is done.

* However, while we have evidence to reject `$H_0$`, at this stage we do not know which population means are statistically significantly different.

<table class='table table-striped'>
 <thead>
  <tr>
   <th style="text-align:center;"> Option </th>
   <th style="text-align:center;"> `$\mu_{SAO} = \mu_{SIO}$` </th>
   <th style="text-align:center;"> `$\mu_{SAO} = \mu_{SPO}$` </th>
   <th style="text-align:center;"> `$\mu_{SIO} = \mu_{SPO}$` </th>
   <th style="text-align:center;"> Selected Hypothesis </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;"> `$H_0$` </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 2 </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;"> `$H_1$` </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 3 </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;"> `$H_1$` </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 4 </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;"> `$H_1$` </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 5 </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;"> `$H_1$` </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 6 </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;"> `$H_1$` </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 7 </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;"> ✓ </td>
   <td style="text-align:center;"> `$H_1$` </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 8 </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;"> `$H_1$` </td>
  </tr>
</tbody>
</table>

---

# Post-hoc Tests

If we reject `$H_0$`, we can then perform .orangered_style[post-hoc tests] to compute contrasts and/or multiple comparisons, to examine differences between specific groups.

* .orangered_style[Contrasts]: Examine differences between specified groups (e.g. weighted averages, control vs all treatment types) 
  
    * e.g. compare SPO petrels with petrels from all other groups
  
--

* .orangered_style[Multiple Comparisons]: Compare differences between each pair of groups
  
    * e.g. Compare SAO and SIO petrel groups, SAO and SPO petrel groups, SIO and SPO petrel groups
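
---

# Multiple Comparisons - Code Sketch

To see how many pairwise comparisons a set of groups produces, the short Python sketch below lists every pair of the three petrel groups. The helper usage of `math.comb` is only an illustration of the count `$c(c-1)/2$`; it is not part of any post-hoc test itself.

```python
from itertools import combinations
from math import comb

groups = ["SAO", "SIO", "SPO"]

# Every unordered pair of groups corresponds to one pairwise comparison.
pairs = list(combinations(groups, 2))
for first, second in pairs:
    print(f"{first} vs {second}")

# The number of comparisons grows quickly with the number of groups c.
for c in (3, 4, 5, 6):
    print(f"{c} groups -> {comb(c, 2)} pairwise comparisons")
```

With 3 groups there are 3 comparisons, but with 6 groups there are already 15, which is one reason post-hoc tests adjust for making many comparisons at once.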
    
---

# Post-hoc Tests

Our choice of one-way ANOVA post-hoc test depends on our choice of `$F$`-test. Many post-hoc tests exist.

We will focus on .orangered_style[pairwise comparison] post-hoc tests:

* If we used Fisher's `$F$`-test, we can use the .orangered_style[Tukey HSD] post-hoc test. The Tukey HSD test is robust and generally considered an excellent post-hoc test choice.

* If we used Welch's `$F$`-test, we can instead use the .orangered_style[Games-Howell] post-hoc test

---

# Post-hoc Tests - Petrel Example

---

# One-way ANOVA Summary - Petrel Example

A .orangered_style[one-way ANOVA] was conducted to determine if there was a difference in the mean wing length (mm) of .seagreen_style[South Georgian Diving Petrel] populations in:

* .seagreen_style[New Zealand] `$(M_{SPO} = 119.750 \text{ mm}, SD_{SPO} = 2.599 \text{ mm}, n_{SPO} = 128)$`,

* .seagreen_style[The South Atlantic Ocean] `$(M_{SAO} = 116.897 \text{ mm}, SD_{SAO} = 4.411 \text{ mm}, n_{SAO} = 29)$`, and

* .seagreen_style[The South Indian Ocean] `$(M_{SIO} = 117.816 \text{ mm}, SD_{SIO} = 4.019 \text{ mm}, n_{SIO} = 38)$`

There was a .orangered_style[statistically significant] effect of population on mean wing length, at the `$\alpha = 0.05$` level of significance, with Welch's `$F(2, 50.169) = 8.654$`, `$p < .001$`.

The Games-Howell post-hoc test indicated that the .orangered_style[mean wing length was larger for SPO petrels] than for SAO petrels `$(p = 0.006)$` or SIO petrels `$(p = 0.020)$`.

There was no significant difference between the mean wing lengths of SAO and SIO petrels `$(p = 0.656)$`.
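
---

# Welch's F-test - Code Sketch

The Welch's `$F$` result reported in the summary can be approximately reproduced from the group summary statistics alone. The sketch below uses the standard textbook formula for Welch's one-way test; the function name `welch_f` is hypothetical, and it assumes `$n_{SIO} = 38$` so that the group sizes sum to the 195 observations used elsewhere in this lecture. Small differences from the jamovi output are expected because the means and standard deviations are rounded.

```python
def welch_f(ns, means, sds):
    """Welch's one-way F statistic from group sizes, means and standard deviations.

    Returns (F, df1, df2).
    """
    c = len(ns)
    # Each group is weighted by n / variance, so noisier groups count for less.
    weights = [n / sd ** 2 for n, sd in zip(ns, sds)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight

    numerator = sum(w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)) / (c - 1)
    lam = sum((1 - w / total_weight) ** 2 / (n - 1) for w, n in zip(weights, ns))
    denominator = 1 + 2 * (c - 2) / (c ** 2 - 1) * lam

    df1 = c - 1
    df2 = (c ** 2 - 1) / (3 * lam)
    return numerator / denominator, df1, df2


# Summary statistics in the order SPO, SAO, SIO (n_SIO = 38 is an assumption, see above).
f_stat, df1, df2 = welch_f(
    ns=[128, 29, 38],
    means=[119.750, 116.897, 117.816],
    sds=[2.599, 4.411, 4.019],
)
print(f"Welch's F({df1}, {df2:.3f}) = {f_stat:.3f}")
```

This gives roughly `$F \approx 8.65$` with `$df_2 \approx 50.2$`, in line with the reported Welch's `$F(2, 50.169) = 8.654$`.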

---

# One-way ANOVA assumptions

When we conduct a one-way ANOVA, we make several assumptions.

To ensure our analysis procedures and conclusion are valid, we should always check these assumptions!

### Key Assumptions

1. The response variable is .orangered_style[numeric]
  
--

2.  For each group, the observed values for individuals from that population are independent and identically normally distributed (all sampled from the same normal distribution)
  
--

3. The population variances of the groups are equal

4. The response variable .orangered_style[residuals] follow a normal distribution

---

# Checking one-way ANOVA assumptions

We have previously covered how to conduct Levene's Test for equal variances (aka the Homogeneity of Variances Test), which checks assumption 3.

We will focus here on checking the normality of residuals assumption (4). The checks should look familiar:

### How to check

* .orangered_style[Histogram] with normal/density curve overlaid
  
--

* .orangered_style[Normal Q-Q Plot]
  
--

* Formal statistical test: .orangered_style[Shapiro-Wilk test] and/or .orangered_style[Kolmogorov-Smirnov test]

---

# One-way ANOVA Residuals

Unlike `$t$`-test normality checks, for the one-way ANOVA we are not testing the normality of the original recorded values, but rather the .orangered_style[normality of the residuals] generated as part of fitting the ANOVA.

### What are the residuals?

When we carry out our one-way ANOVA, we assume our data follows the model:

`$$Y_{ij} = \mu + \alpha_i + \epsilon_{ij}$$`

Here the subscript `$i$` denotes the group membership, and the subscript `$j$` denotes the individual in that group

* `$Y_{ij}$` denotes the observed value for the `$j^{th}$` subject in group `$i$`
  
--

* `$\mu$` is the .orangered_style[overall mean], based on the full data set
  
--

* `$\alpha_i$` denotes the .orangered_style[group effect] of group `$i$`
  
--

* `$\epsilon_{ij}$` denotes the .orangered_style[error] for the `$j^{th}$` subject in group `$i$`

---

# One-way ANOVA Residuals

Once we have computed our sample means `$\mu_i = \mu + \alpha_i$` using our sample data, we can also calculate the residual (i.e. the estimated error) for each individual: the observed value minus the sample mean of its group, `$r_{ij} = Y_{ij} - \mu_i$`.

We treat these residuals `$r_{ij}$` as sample equivalents of the true population errors `$\epsilon_{ij}$`, which are assumed to follow a normal distribution.

This might all seem a little abstract - let us take a look at an example to clarify the process.
---

# One-way ANOVA Residuals - Petrel Example

For our petrel data, we have:

* Overall mean, based on all 195 observations: `$\mu = 118.949$` mm

* The three sample means: `$\mu_{SPO} = 119.750$` mm, `$\mu_{SAO} = 116.897$` mm, `$\mu_{SIO} = 117.816$` mm

With these results, we can compute the group effect `$\alpha_i$` for each group.

* E.g. `$\alpha_{SPO} = 119.750 - 118.949 = 0.801$` mm, which suggests that being in the SPO group has a positive effect on wing length on average 
  
--

We can also check how each individual petrel's measurement compares with their group's sample mean.

The first petrel in the SPO group has a recorded wing length of `$Y_{SPO,1} = 122$` mm. Now

`$$Y_{SPO,1} = \mu + \alpha_{SPO} + r_{SPO,1}$$`

so

`$$r_{SPO,1} = Y_{SPO,1} - (\mu + \alpha_{SPO}) = 122 - 119.750 = 2.250 \text{ mm}$$`
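
---

# One-way ANOVA Residuals - Code Sketch

The arithmetic for the group effects and the residual can be checked with a few lines of Python. This sketch uses only the overall mean and group sample means reported in these slides; the variable names are illustrative.

```python
overall_mean = 118.949  # mm, based on all observations

# Group sample means (mm) as reported in this lecture.
group_means = {"SPO": 119.750, "SAO": 116.897, "SIO": 117.816}

# Group effect: group mean minus overall mean.
group_effects = {g: round(m - overall_mean, 3) for g, m in group_means.items()}
for group, effect in group_effects.items():
    print(f"alpha_{group} = {effect:+.3f} mm")

# Residual: observed value minus the sample mean of its own group.
first_spo_wing_length = 122  # mm
residual = first_spo_wing_length - group_means["SPO"]
print(f"Residual for the first SPO petrel: {residual:.3f} mm")
```

The SPO effect is positive (+0.801 mm), while the SAO (-2.052 mm) and SIO (-1.133 mm) effects are negative, and the first SPO petrel sits 2.250 mm above its group mean.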
---

# Residuals Normality Check - Petrel Example

.left-column[
  * The histogram of residuals suggests some non-normality (note the density curve is not quite symmetric)
]

.right-column[
<img src="data:image/png;base64,#residuals_histogram_petrel.jpg" width="475px" style="display: block; margin: auto;" />

]

---

# Residuals Normality Check - Petrel Example

.left-column[
  * The Q-Q plot suggests some non-normality (note the non-linear characteristics and strange horizontal patterns)
]

.right-column[
<img src="data:image/png;base64,#anova_qqplot_petrel.jpg" width="500px" style="display: block; margin: auto;" />

]
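
---

# Normal Q-Q Plot - Code Sketch

A normal Q-Q plot pairs each sorted residual with the value we would expect at that position if the residuals were normally distributed. The sketch below computes those pairs for a handful of made-up residuals. The plotting positions `(i - 0.5) / n` are one common convention and are an assumption here; jamovi may use a slightly different one.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical residuals (mm) for illustration only; NOT the real petrel residuals.
residuals = [-2.0, -0.5, 0.25, 0.75, 1.5]
n = len(residuals)

# Standardise the residuals, then sort them from smallest to largest.
centre, spread = mean(residuals), stdev(residuals)
standardised = sorted((r - centre) / spread for r in residuals)

# Theoretical standard normal quantiles at the chosen plotting positions.
probabilities = [(i - 0.5) / n for i in range(1, n + 1)]
theoretical = [NormalDist().inv_cdf(p) for p in probabilities]

# Each (theoretical, observed) pair is one point on the Q-Q plot.
qq_points = list(zip(theoretical, standardised))
for expected, observed in qq_points:
    print(f"theoretical = {expected:+.3f}, observed = {observed:+.3f}")
```

If the residuals are roughly normal, these points fall close to a straight line; systematic curvature suggests non-normality.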

---

# Residuals Normality Check - Petrel Example

.pull-left[
<img src="data:image/png;base64,#anova_normality_petrel.jpg" width="500px" style="display: block; margin: auto;" />

]

.pull-right[
  * The Shapiro-Wilk test assesses whether the residuals are consistent with a normal distribution (`$H_0$`: the residuals are normally distributed).

  * A `$p$`-value `$< 0.05$` suggests the residuals are non-normal.

  * Here `$p = 0.071 > 0.05$`, so we cannot reject `$H_0$`: we have no significant evidence against normality of the residuals.
]

<br>
.center[
It would appear our one-way ANOVA assumptions have been met!
]

--
.center[
*(although we may have some reservations)*
]
---

# Next Steps when Assumptions Fail

Fortunately, if one or more of the one-way ANOVA assumptions have failed, we have several options available:

<br>

* If the .orangered_style[Levene's Test] shows our equal variances assumption has failed, we can still use the .orangered_style[Welch's *F*-test] version of the one-way ANOVA
  
--

<br>

* If there are more serious violations, we can use the .orangered_style[Kruskal-Wallis test]. This test:
  
--

    * Does not assume the residuals follow a normal distribution
    
--

    * Is robust to skewed distributions, but is not always as powerful as the `$F$`-tests

---

# Kruskal-Wallis Test

The .orangered_style[Kruskal-Wallis Test] is a non-parametric test for 2+ independent groups, which we can use when our one-way ANOVA assumptions are violated.

It ranks all observations across the combined sample, then tests whether the ranks in some groups are systematically higher or lower than in others.

* We can think of it as being an extension of the .orangered_style[Mann-Whitney U Test] (which is for 2 independent groups)
  
--

<br>

We have:

`$$H_0: \text{ Pop. distributions are equal vs } H_1: \text{ Pop. distributions are not all equal}$$`

<br>

For post-hoc pairwise comparisons, we can use tests like the .orangered_style[DSCF test] or .orangered_style[Dunn's test].

---

# Kruskal-Wallis Test - Petrel Example

---

# Kruskal-Wallis Test Summary - Petrel Example

A .orangered_style[Kruskal-Wallis Test] was conducted to compare the distribution of wing lengths (mm) across three petrel populations.

<br>

A .orangered_style[statistically significant difference] in wing length distributions across groups was observed at the `$\alpha = 0.05$` level of significance, with `$\chi^2 = 15.505$`, `$df = 2$`, `$N = 195$`, `$p < .001$` (two-tailed).

<br>

Post-hoc pairwise comparisons using the .orangered_style[DSCF test] indicate a statistically significant difference in wing length distributions between SAO and SPO petrels `$(p = 0.002 < 0.05)$` and between SIO and SPO petrels `$(p = 0.024 < 0.05)$`, but not between SAO and SIO petrels `$(p = 0.571)$`.

<br>

*Note we reach the same conclusion as with the one-way ANOVA. This may not always be the case.*
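
---

# Kruskal-Wallis Test - Code Sketch

To make the rank-based idea concrete, the sketch below computes the Kruskal-Wallis `$H$` statistic (with the usual tie correction) for some made-up wing lengths. The function name `kruskal_wallis_h` and the data are hypothetical. For three groups (`$df = 2$`), the chi-square `$p$`-value has the simple closed form `$e^{-H/2}$`, which the sketch also applies to the `$\chi^2 = 15.505$` reported in the summary. The chi-square approximation is rough for groups as small as these.

```python
from math import exp


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction for a list of groups."""
    pooled = sorted(y for g in groups for y in g)
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1, ..., j
        i = j

    rank_sum_term = sum(sum(rank_of[y] for y in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    # Tie correction: equals 1 when there are no tied values.
    tie_counts = [pooled.count(value) for value in rank_of]
    correction = 1 - sum(t ** 3 - t for t in tie_counts) / (n ** 3 - n)
    return h / correction


# Hypothetical wing lengths (mm); NOT the real petrel data.
h_stat = kruskal_wallis_h([[113, 115, 116], [114, 117, 118], [119, 120, 121]])
p_toy = exp(-h_stat / 2)          # chi-square upper tail, valid for df = 2 only
print(f"H = {h_stat:.3f}, p = {p_toy:.3f}")

# The same df = 2 shortcut applied to the chi-square value reported in the summary.
p_reported = exp(-15.505 / 2)
print(f"p for chi-square = 15.505 with df = 2: {p_reported:.5f}")
```

For the made-up data this gives `$H \approx 5.96$` and `$p \approx 0.051$`; for the reported `$\chi^2 = 15.505$` it gives `$p \approx 0.0004$`, consistent with `$p < .001$`.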
---

# Summary

A .orangered_style[one-way ANOVA] can be used to compare means of **two or more independent groups**.

* Generally we should use the .orangered_style[Welch's *F*-test] version

* If the variances of the groups are equal, we can use the .orangered_style[Fisher's *F*-test] version

* We need to check the assumptions of our one-way ANOVA before we conduct our tests

* If the test assumptions fail, we have alternatives (Welch's `$F$`-test, .orangered_style[Kruskal-Wallis Test])
  
--

* If our initial results are statistically significant, we need to perform .orangered_style[post-hoc tests] to determine which specific group(s) is/are different

---

# End

That concludes our lecture on one-way ANOVA.

### What to do next:

* Make sure to attend the Topic 3B lecture

* If you have any questions, check the LMS, email us or ask in the computer labs

### Optional Further Reading

* Parts from Kokoska (2020) Chapters 11, 14
  
---

# References

* Fischer, J.H., Debski, I., Miskelly, C.M., Bost, C.A., Fromant, A., Tennyson, A.J.D., et al. (2018). Analyses of phenotypic differentiations among South Georgian Diving Petrel (*Pelecanoides georgicus*) populations reveal an undescribed and highly endangered species from New Zealand. *PLoS ONE*, 13(6), e0197766. [https://doi.org/10.1371/journal.pone.0197766](https://doi.org/10.1371/journal.pone.0197766)

* Kokoska, S. (2020). *Introductory statistics: A problem-solving approach* (3rd ed.). W. H. Freeman.

* Midway, S., Robertson, M., Flinn, S., & Kaller, M. (2020). Comparing multiple comparisons: practical guidance for choosing the best multiple comparisons test. *PeerJ*, 8, e10387. [https://doi.org/10.7717/peerj.10387](https://doi.org/10.7717/peerj.10387)

* The jamovi project. (2022). *jamovi* [Computer software]. [https://www.jamovi.org](https://www.jamovi.org)

---
class: middle

<font color = "grey">
These notes have been prepared by Rupert Kuveke, Amanda Shaker, and other members of the Department of Mathematical and Physical Sciences. The copyright for the material in these notes resides with the authors named above, with the Department of Mathematical and Physical Sciences and with the Department of Environment and Genetics and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License 
<a href = "https://creativecommons.org/licenses/by-nc-nd/4.0/" target="_blank"> BY-NC-ND. </a>
</font>