Assessing The Assumptions of One Way ANOVA in R

Brief Introduction

One-way ANOVA (analysis of variance) is a statistical tool for testing whether the mean of a test variable is equal across the levels of a grouping variable, usually one with three or more levels. The test variable must be continuous (interval or ratio scale) and the grouping variable categorical (nominal). The number of observations in each group may or may not be equal.

For example, consider an agricultural study in which the test variable is the yield of a particular crop and the grouping variable is the type of fertilizer, say A, B and C. We want to see whether the average yield of the crop differs significantly between these fertilizers. This is a situation where one-way ANOVA applies. The name ANOVA can be confusing, because the tool tests the equality of group means, not of variances; it does so by comparing the variation between groups with the variation within groups. The method was developed by Prof. R. A. Fisher, an English statistician, geneticist and eugenicist.

In the following sections we discuss the assumptions of one-way ANOVA and how to assess them in R.

Like every statistical test, ANOVA is based on some assumptions about the data. Among others, the following are the key assumptions. We need to be aware of them, because performing ANOVA and generalizing the results without checking these assumptions can be misleading.

Normality
Homogeneity of variance (Homoscedasticity)
Independence

Setting the Problem for the ANOVA Test

data("iris")

For the demonstration we use the iris data set. Using the str() function, we can see that this data set consists of 150 observations on 5 variables, namely Sepal.Length, Sepal.Width, Petal.Length, Petal.Width and Species. All the variables are numeric except Species, which is a factor (categorical variable) with three levels: setosa, versicolor and virginica.
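As a minimal sketch of this inspection step, the calls below print the structure of the data set and the size of each group. The calls to levels() and table() are base R helpers added here for illustration; they are not part of the original walkthrough.

```r
data("iris")

str(iris)             # structure: number of observations, variables and their classes
levels(iris$Species)  # the levels of the grouping factor
table(iris$Species)   # number of observations in each group
```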

Suppose we want to see whether the average sepal length of the iris flower is equal across the three species setosa, versicolor and virginica. The hypotheses of the ANOVA test are then as follows.

The Null Hypothesis (H₀): The average sepal length of the iris flower is equal across the three species.
Against,
The Alternative Hypothesis (H₁): The average sepal length of at least one species differs from that of the others.

Checking Homogeneity of variance

ANOVA assumes that the variance of the test variable is the same in every group. The homogeneity of variance across groups can be assessed with Levene’s test¹, which can be performed with the leveneTest() function of the car package in R. The package car needs to be installed and loaded in the R environment before the function can be used.

library(car)

Loading required package: carData

iris.anova = aov(Sepal.Length~Species, data = iris)
leveneTest(iris.anova)  # Remember `iris.anova` is the `aov` object.

       Df  F value     Pr(>F)
group   2  6.35272  0.0022585
      147       NA         NA

Since the p-value (0.002259) is well below 0.05, we reject H₀ and conclude that the variances differ between the species. The assumption of homogeneity is violated. Even so, we can resolve this issue by using the Welch one-way test, which does not assume equal variances among the groups. The same idea appears in the independent two-sample Student’s t-test, where unequal variances are handled by the Welch t-test.
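To make the test less of a black box, the sketch below first prints the sample variance of each species and then reproduces the idea behind Levene’s test by hand. With the default center = median in leveneTest(), the test amounts to a one-way ANOVA on the absolute deviations of each observation from its group median. The object names group.var, abs.dev and levene.by.hand are illustrative and not part of the original analysis.

```r
data("iris")

# Sample variance of sepal length within each species
group.var <- tapply(iris$Sepal.Length, iris$Species, var)
round(group.var, 2)

# Absolute deviation of each observation from the median of its own group
group.median <- ave(iris$Sepal.Length, iris$Species, FUN = median)
abs.dev <- abs(iris$Sepal.Length - group.median)

# One-way ANOVA on the absolute deviations: the median-centered Levene test
levene.by.hand <- anova(lm(abs.dev ~ Species, data = iris))
levene.by.hand
```

The F value and p-value in the resulting table should agree with those reported by leveneTest() for the same data.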

The Welch one-way test can be performed in R with the base function oneway.test(), which runs the Welch test by default (var.equal = FALSE). To run the regular ANOVA with this function we have to set the argument var.equal = TRUE. The function aov(), in contrast, always fits the regular ANOVA and has no such argument. In this case Levene’s test suggests that we go with the Welch one-way test.
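A minimal sketch of both variants follows, using oneway.test() from the stats package that ships with base R. Its var.equal argument defaults to FALSE, which gives the Welch test. The object names welch and classic are illustrative.

```r
data("iris")

# Welch one-way test: does not assume equal group variances (var.equal = FALSE by default)
welch <- oneway.test(Sepal.Length ~ Species, data = iris)
welch

# Regular one-way ANOVA: assumes equal group variances
classic <- oneway.test(Sepal.Length ~ Species, data = iris, var.equal = TRUE)
classic
```

The Welch version adjusts the denominator degrees of freedom downward, so its reported denom df is typically not a whole number.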

Checking Normality

Before we assess the normality assumption, let us be clear about what it means. ANOVA assumes normality of the errors (or residuals), not necessarily of the observations themselves. This is a common source of confusion: in ANOVA, as in regression analysis, the normality assumption refers to the residuals, not to the raw observations.

To assess this assumption in R, we first perform the one-way ANOVA with the base R function aov() and store the result in an object; here it is stored in iris.anova. We then extract the residuals with the function residuals(). Finally, we test whether these residuals are normally distributed with the Shapiro-Wilk test² for normality, using the function shapiro.test() in R.

iris.residual = residuals(iris.anova) #  the object `iris.anova` is used to extract residuals.
shapiro.test(iris.residual)


    Shapiro-Wilk normality test

data:  iris.residual
W = 0.9879, p-value = 0.2189

Since the p-value (0.2189) is well above 0.05, we cannot reject the null hypothesis of normality. The Shapiro-Wilk test therefore gives no evidence against the normality assumption for these sample data. We can also assess normality graphically with a histogram and a normal QQ plot, using hist() for the histogram and qqnorm() with qqline() for the QQ plot. The plots are not shown here, but readers are strongly encouraged to try these functions too.
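For readers who want to try the graphical checks, a minimal sketch is given below. It refits the model so that the block runs on its own; the plot titles and axis labels are illustrative choices.

```r
data("iris")

# Refit the one-way ANOVA and extract its residuals
iris.anova <- aov(Sepal.Length ~ Species, data = iris)
iris.residual <- residuals(iris.anova)

# Histogram: look for a roughly symmetric, bell-shaped distribution
hist(iris.residual,
     main = "Histogram of ANOVA residuals",
     xlab = "Residual")

# Normal QQ plot: points close to the reference line support normality
qqnorm(iris.residual)
qqline(iris.residual)
```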

Checking Independence

The assumption of independence means that the residuals are assumed to have been generated without any relationship to each other. This can be ensured if the experimental data are collected in a random manner as far as possible. The independence of the errors (residuals) can be assessed with the Durbin-Watson statistic³, which checks for correlation between successive residuals in the order the observations are recorded. In R we can apply the function durbinWatsonTest() to our anova object, iris.anova. Note that durbinWatsonTest() also comes with the package car.

iris.dw = durbinWatsonTest(iris.anova)
iris.dw

 lag Autocorrelation D-W Statistic p-value
   1     -0.02768987      2.043002   0.936
 Alternative hypothesis: rho != 0

The null hypothesis cannot be rejected because the p-value (0.936) is far greater than 0.05, and the D-W statistic (2.043) is close to 2, the value expected when successive residuals are uncorrelated. Therefore, we find no evidence that the data violate this assumption.
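As a rough illustration of what the D-W statistic measures, it can be computed directly from the residuals as the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below assumes the rows of iris are in their original order; the names e and dw are illustrative.

```r
data("iris")

# Residuals of the one-way ANOVA, in the row order of the data
iris.anova <- aov(Sepal.Length ~ Species, data = iris)
e <- residuals(iris.anova)

# Durbin-Watson statistic: values near 2 suggest no lag-1 autocorrelation
dw <- sum(diff(e)^2) / sum(e^2)
dw
```

Because the statistic depends on the order of the rows, it is informative only to the extent that the row order reflects the order in which the observations were collected.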

Final remarks:

The sample data satisfy the assumptions of normality and independence. There was an issue with homogeneity of variance, but this issue is handled by the Welch one-way test. Therefore, in this case it is better to use the Welch one-way test rather than the regular ANOVA.

The base R function oneway.test() defaults to the Welch one-way test, whereas aov() fits the regular ANOVA. If there are two grouping variables, we use two-way analysis of variance. We shall discuss it in later sections.

Notes:

The sample data of sepal length by species can be explored with the base R function boxplot() and with the summary function ds_group_summary(iris, Species, Sepal.Length) of the descriptr package.

The result should be as follows:

boxplot(iris$Sepal.Length~ iris$Species,
        main = "Distribution of Sepal Length by Species",
        xlab = "Species",
        ylab = "Sepal Length")

Figure 1: The box plot shows that the three species differ in their typical (median) sepal length.

library(descriptr)
ds_group_summary(iris, Species, Sepal.Length)

                                 Sepal.Length by Species                                  
-----------------------------------------------------------------------------------------
|     Statistic/Levels|               setosa|           versicolor|            virginica|
-----------------------------------------------------------------------------------------
|                  Obs|                   50|                   50|                   50|
|              Minimum|                  4.3|                  4.9|                  4.9|
|              Maximum|                  5.8|                    7|                  7.9|
|                 Mean|                 5.01|                 5.94|                 6.59|
|               Median|                    5|                  5.9|                  6.5|
|                 Mode|                    5|                  5.5|                  6.3|
|       Std. Deviation|                 0.35|                 0.52|                 0.64|
|             Variance|                 0.12|                 0.27|                  0.4|
|             Skewness|                 0.12|                 0.11|                 0.12|
|             Kurtosis|                -0.25|                -0.53|                 0.03|
|       Uncorrected SS|              1259.09|              1774.86|               2189.9|
|         Corrected SS|                 6.09|                13.06|                19.81|
|      Coeff Variation|                 7.04|                  8.7|                 9.65|
|      Std. Error Mean|                 0.05|                 0.07|                 0.09|
|                Range|                  1.5|                  2.1|                    3|
|  Interquartile Range|                  0.4|                  0.7|                 0.67|
-----------------------------------------------------------------------------------------

¹ The hypotheses for Levene’s test are as follows.
The Null Hypothesis (H₀): The variances of all groups are equal.
Against,
The Alternative Hypothesis (H₁): The variance of at least one group differs from the others.
² Under the Shapiro-Wilk test, the Null Hypothesis (H₀): The data are normally distributed. Against the Alternative Hypothesis (H₁): The data are not normally distributed.
³ Under the Durbin-Watson test, H₀: The residuals are not autocorrelated (rho = 0). Against H₁: The residuals are autocorrelated (rho != 0).