Sameer Mathur
Theory and Definitions
---
Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures (such as the “variation” among and between groups) used to analyze the differences among group means in a sample.
ANOVA was developed by statistician and evolutionary biologist Ronald Fisher.
In the ANOVA setting, the observed variance in a particular variable is partitioned into components attributable to different sources of variation.
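The partition described above can be made concrete with a small numeric sketch. It assumes a one-factor layout and made-up data; the function name `partition_variation` and the group labels are illustrative, not part of the original material. It splits the total sum of squares (SST) into a between-group part (SSB) and a within-group part (SSW).

```python
# Hypothetical data: three groups with three observations each.
groups = {
    "A": [4, 5, 6],
    "B": [7, 8, 9],
    "C": [10, 11, 12],
}


def partition_variation(groups):
    """Return (SST, SSB, SSW) for a dict mapping group name -> observations."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)

    # Total variation: every observation measured against the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)

    ssb = 0.0  # variation of group means around the grand mean
    ssw = 0.0  # variation of observations around their own group mean
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ssb += len(values) * (group_mean - grand_mean) ** 2
        ssw += sum((x - group_mean) ** 2 for x in values)

    return sst, ssb, ssw


sst, ssb, ssw = partition_variation(groups)
print(sst, ssb, ssw)
```

For this data the two components add up to the total (SST = SSB + SSW), which is the identity the rest of the analysis relies on.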
ANOVA provides a statistical test of whether the population means of several groups are equal, and therefore generalizes the t-test to more than two groups.
ANOVA is useful for comparing (testing) two or more group means for statistical significance.
A test result (calculated from the null hypothesis and the sample) is called statistically significant if it is deemed unlikely to have occurred by chance, assuming the truth of the null hypothesis.
A statistically significant result, i.e. one whose probability (p-value) is less than a pre-specified threshold (the significance level), justifies the rejection of the null hypothesis.
The analysis of variance can be presented in terms of a linear model, which makes the following assumptions about the probability distribution of the responses:
Independence of observations - the observations (cases) of the dependent variable should be independent of one another, which is typically ensured by selecting the sample randomly. There should not be any pattern in the selection of the sample.
Normality - the residuals are normally distributed.
Equality (or “homogeneity”) of variances (homoscedasticity) - the variance of the data should be the same in every group.
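As a rough illustration of the last two assumptions, the sketch below computes the residuals (each observation minus its group mean) and the ratio of the largest to the smallest group variance. This is only an informal check on made-up data, not a formal test; the helper names `group_residuals` and `variance_ratio` are hypothetical. In practice a formal procedure such as Levene's test (variances) or the Shapiro-Wilk test (normality of residuals) would normally be used.

```python
from statistics import mean, variance


def group_residuals(groups):
    """Residuals per group: each observation minus its own group mean."""
    return {
        name: [x - mean(values) for x in values]
        for name, values in groups.items()
    }


def variance_ratio(groups):
    """Largest sample variance divided by the smallest.

    Values near 1 are consistent with equal variances. This is an
    informal check only; it assumes no group has zero variance.
    """
    variances = [variance(values) for values in groups.values()]
    return max(variances) / min(variances)


# Hypothetical data: group B is more spread out than A and C.
groups = {
    "A": [4, 5, 6],
    "B": [7, 9, 11],
    "C": [10, 11, 12],
}

print(group_residuals(groups))
print(variance_ratio(groups))
```

A large ratio suggests the equal-variance assumption deserves a closer look before the F-test is trusted.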
Analysis of variance (ANOVA) has three types:
One-way ANOVA - when we are comparing two or more groups based on one factor variable.
Two-way ANOVA - when we are comparing two or more groups based on two factor variables.
K-way ANOVA - when we are comparing two or more groups based on k factor variables.
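Compute the MSB, i.e. the mean sum of squares between groups.
Compute the MSE, i.e. the mean sum of squares of errors (within groups).
Calculate the F-statistic as the ratio \( MSB / MSE \).
Compare the F-statistic with the critical value from the F-distribution table for the chosen significance level and the corresponding degrees of freedom.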
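The four steps above can be sketched for the one-way case as follows. This is a minimal illustration on made-up data, assuming k groups and n observations in total, so that MSB uses k - 1 degrees of freedom and MSE uses n - k. The function name `one_way_f_statistic` is hypothetical, and the critical value is passed in by hand, as it would be when read from a table.

```python
def one_way_f_statistic(samples):
    """Return (MSB, MSE, F) for a list of groups of observations."""
    k = len(samples)                      # number of groups
    n = sum(len(g) for g in samples)      # total number of observations
    grand_mean = sum(sum(g) for g in samples) / n

    # Step 1: mean sum of squares between groups (k - 1 degrees of freedom).
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in samples)
    msb = ssb / (k - 1)

    # Step 2: mean sum of squares of errors (n - k degrees of freedom).
    sse = sum((x - sum(g) / len(g)) ** 2 for g in samples for x in g)
    mse = sse / (n - k)

    # Step 3: the F-statistic is the ratio MSB / MSE.
    return msb, mse, msb / mse


# Hypothetical data: three groups of three observations.
samples = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
msb, mse, f_stat = one_way_f_statistic(samples)

# Step 4: compare with the critical value. 5.14 is the tabulated value
# for (2, 6) degrees of freedom at the 0.05 significance level.
f_critical = 5.14
reject_null = f_stat > f_critical
print(msb, mse, f_stat, reject_null)
```

Here the F-statistic far exceeds the critical value, so the null hypothesis of equal group means would be rejected at the 0.05 level.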