Analysis of Variance (ANOVA)

Sameer Mathur

Theory and Definitions

---

Background

Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures (such as the “variation” among and between groups) used to analyze the differences among group means in a sample.

ANOVA was developed by statistician and evolutionary biologist Ronald Fisher.

In the ANOVA setting, the observed variance in a particular variable is partitioned into components attributable to different sources of variation.
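For a single factor with \( k \) groups, this partition can be sketched as below. The notation is an assumption introduced here for illustration: \( y_{ij} \) is observation \( j \) in group \( i \), \( n_i \) is the size of group \( i \), \( \bar{y}_i \) is the mean of group \( i \), and \( \bar{y} \) is the grand mean.

\[
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2}_{\text{total}}
=
\underbrace{\sum_{i=1}^{k} n_i\,(\bar{y}_i - \bar{y})^2}_{\text{between groups}}
+
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2}_{\text{within groups (error)}}
\]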

ANOVA provides a statistical test of whether the population means of several groups are equal, and therefore generalizes the t-test to more than two groups.

ANOVA is useful for comparing (testing) two or more group means for statistical significance.

A test result (calculated from the null hypothesis and the sample) is called statistically significant if it is deemed unlikely to have occurred by chance, assuming the truth of the null hypothesis.

A result is statistically significant when its probability (p-value) is less than a pre-specified threshold (the significance level); such a result justifies rejecting the null hypothesis.

Assumptions

The analysis of variance can be presented in terms of a linear model, which makes the following assumptions about the probability distribution of the responses:

Independence of observations - the observations of the dependent variable should be independent of one another. In practice this means the sample should be selected randomly, with no pattern in how cases are chosen.

Normality - the residuals are normally distributed.

Equality (or “homogeneity”) of variances (homoscedasticity) - the variance of the data should be the same in every group.
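As an informal first look at the equal-variance assumption, one can compare the sample variances of the groups. The sketch below is only an illustration: the group names and data are made up, and an eyeball comparison like this is not a formal test of homogeneity.

```python
from statistics import variance

# Hypothetical data: three groups of observations (made up for illustration).
groups = {
    "A": [4, 5, 6],
    "B": [2, 4, 6],
    "C": [1, 5, 9],
}

# Sample variance (n - 1 denominator) of each group.
group_variances = {name: variance(values) for name, values in groups.items()}

# Ratio of the largest to the smallest variance; a ratio near 1 suggests
# similar spread across groups, a large ratio suggests the assumption is doubtful.
variance_ratio = max(group_variances.values()) / min(group_variances.values())

for name, var in group_variances.items():
    print(f"group {name}: variance = {var}")
print(f"largest / smallest variance = {variance_ratio}")
```

Here the variances differ widely across groups, so the homogeneity assumption would deserve a closer check before relying on the ANOVA result.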

Types of ANOVA

Analysis of variance (ANOVA) has three types:

One-way ANOVA
Two-way ANOVA
k-way ANOVA

One-Way Analysis of Variance

Used when we are comparing two or more groups based on one factor variable.

Two-Way Analysis of Variance

Used when we are comparing two or more groups based on two factor variables.

k-Way Analysis of Variance

Used when we are comparing two or more groups based on k factor variables.

How does the one-way ANOVA test work?

Compute the MSB, i.e. the Mean Sum of Squares Between Groups
Compute the MSE, i.e. the Mean Sum of Squares of Errors (within groups)
Calculate the F-statistic as the ratio \( MSB / MSE \)
Compare the F-statistic with the critical value from the F-distribution table
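The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `one_way_anova_f`, the sample data, and the critical value (read from a standard F table for 2 and 6 degrees of freedom at a 0.05 significance level) are all introduced here and do not come from the original slides.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (MSB, MSE, F) for a list of groups of numeric observations."""
    k = len(groups)                                  # number of groups
    n = sum(len(g) for g in groups)                  # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    # Between-group sum of squares: group means around the grand mean.
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error (within-group) sum of squares: observations around their group mean.
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)

    msb = ssb / (k - 1)                              # step 1: MSB, df = k - 1
    mse = sse / (n - k)                              # step 2: MSE, df = n - k
    return msb, mse, msb / mse                       # step 3: F = MSB / MSE

# Hypothetical data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
msb, mse, f_stat = one_way_anova_f(groups)

# Step 4: compare with the tabulated critical value.
# Assumed value from an F table for df = (2, 6) at alpha = 0.05.
F_CRITICAL = 5.14
reject_null = f_stat > F_CRITICAL

print(f"MSB = {msb:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print("reject the null hypothesis" if reject_null else "fail to reject the null hypothesis")
```

For this made-up data the F-statistic exceeds the critical value, so the null hypothesis of equal group means would be rejected at the 0.05 level.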

See example