Bartlett’s Test

Bartlett’s test is used to examine whether \(k\) samples (groups) drawn from a population have equal variances, a property known as “homogeneity of variances”. The hypotheses and test statistic are defined as:

\(H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2\)

\(H_A: \sigma_i^2 \neq \sigma_j^2\) for at least one pair \((i, j)\)

\(T = \frac{(N-k)\ln(s_p^2) - \sum_{i=1}^k (N_i - 1)\ln(s_i^2)}{1 + \frac{1}{3(k-1)}\left(\sum_{i=1}^k \frac{1}{N_i - 1} - \frac{1}{N-k}\right)}\)

where…

\(s_i^2\) is the variance of the \(i^{th}\) group
\(N\) is the total sample size
\(N_i\) is the sample size of the \(i^{th}\) group
\(k\) is the number of groups
and \(s_p^2\) is the pooled variance, which is defined as…

\(s_p^2 = \frac{1}{N-k}\sum_{i=1}^k (N_i - 1)\, s_i^2\)
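To make the formulas concrete, the statistic can be computed by hand and compared with R's built-in function. The following is a minimal sketch, not a reference implementation: `bartlett_manual` is a hypothetical helper name, and the p-value step assumes that \(T\) is compared against a chi-squared distribution with \(k - 1\) degrees of freedom, which the definitions above do not state. It uses R's built-in Orange data, the same data as the example later in this document.

```r
# Hypothetical helper: Bartlett's statistic computed directly from the formulas.
bartlett_manual <- function(x, g) {
  g    <- factor(g)
  n_i  <- tapply(x, g, length)   # N_i: sample size of each group
  s2_i <- tapply(x, g, var)      # s_i^2: sample variance of each group
  k    <- nlevels(g)             # number of groups
  N    <- sum(n_i)               # total sample size

  # Pooled variance
  s2_p <- sum((n_i - 1) * s2_i) / (N - k)

  # Numerator and denominator of T
  num <- (N - k) * log(s2_p) - sum((n_i - 1) * log(s2_i))
  den <- 1 + (sum(1 / (n_i - 1)) - 1 / (N - k)) / (3 * (k - 1))
  stat <- num / den

  # Assumption: T is referred to a chi-squared distribution with k - 1 df
  p <- pchisq(stat, df = k - 1, lower.tail = FALSE)
  list(statistic = stat, df = k - 1, p.value = p)
}

res <- bartlett_manual(Orange$circumference, Orange$Tree)
res$statistic
res$p.value
```

If the sketch is correct, `res$statistic` and `res$p.value` should agree with the values reported by `bartlett.test(circumference ~ Tree, data = Orange)`.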

Why perform this analysis? To verify the assumption that the variances are equal across samples. This need arises when one is considering a statistical test such as ANOVA, which assumes that all groups in the population share the same variance.
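One way this check might feed into an analysis is sketched below. This is an illustration under stated assumptions rather than a recommended workflow: `compare_group_means` is a hypothetical helper, and the fallback to Welch's one-way test (`oneway.test()` with `var.equal = FALSE`) when the variances appear unequal is an assumption added here, not something this document prescribes. The usage line relies on R's built-in Orange data, which is introduced in the example below.

```r
# Hypothetical helper: pick a one-way comparison of means based on Bartlett's test.
compare_group_means <- function(fml, data, alpha = 0.05) {
  bt <- bartlett.test(fml, data = data)
  if (bt$p.value > alpha) {
    # No evidence against equal variances: classical one-way ANOVA
    summary(aov(fml, data = data))
  } else {
    # Assumption: fall back to Welch's test, which does not pool variances
    oneway.test(fml, data = data, var.equal = FALSE)
  }
}

fit <- compare_group_means(circumference ~ Tree, data = Orange)
fit
```

The helper returns either an ANOVA summary table or an `htest` object, so downstream code would need to handle both cases.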

As shown above, Bartlett’s test uses a null hypothesis and an alternative hypothesis, which state respectively that the variances among the groups are equal and that at least two of them differ. To illustrate, an example in R is performed using the Orange dataset. Here, we wish to test whether the variance of circumference is the same for each tree. The null hypothesis states that the trees have the same variance; the alternative states that they do not. This is achieved via stats::bartlett.test(). We use a significance level of 0.05 (a 95% level of confidence): thus we fail to reject the null hypothesis if the p-value is greater than 0.05.

An Example

We will fail to reject the null hypothesis if the p-value is greater than \(\alpha = 0.05\).\(^6\)

data <- Orange
head(data, 3)

##   Tree age circumference
## 1    1 118            30
## 2    1 484            58
## 3    1 664            87

bartlett.test(circumference ~ Tree, data = data)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  circumference by Tree
## Bartlett's K-squared = 2.4607, df = 4, p-value = 0.6517

margin <- 0.6517 - 0.05  # how far the p-value lies above alpha
margin

## [1] 0.6017

Since the obtained p-value exceeds \(\alpha\) by a substantial margin, 0.6017, we cannot reject the null hypothesis: there is no evidence that the variances of circumference differ among the trees. Therefore, if one were to perform either a linear or quadratic analysis on this dataset, the linear method, which assumes equal variances, would be the more appropriate choice.
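The decision rule can also be applied programmatically instead of copying the p-value by hand. A minimal sketch, assuming the same Orange data and \(\alpha = 0.05\); the variable names `alpha`, `result`, `p_value`, and `decision` are illustrative choices:

```r
alpha  <- 0.05
result <- bartlett.test(circumference ~ Tree, data = Orange)

# bartlett.test() returns an "htest" object; the p-value is stored in $p.value
p_value <- result$p.value

# Fail to reject H0 when the p-value exceeds alpha
decision <- if (p_value > alpha) "fail to reject H0" else "reject H0"
decision
```

Reading the p-value from the test object avoids the rounding that comes with retyping the printed output.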


Joe Connolly

2022-05-24
