Bayesian ANOVA Explanation

Acknowledgement: Thanks to Riko Kelter for explaining the BayesFactor and bayesanova packages so clearly, along with the R code. I have tried to add more explanation to help beginners. Here’s the citation:

Kelter, R. (2022). bayesanova: An R package for Bayesian Inference in the Analysis of Variance via Markov Chain Monte Carlo in Gaussian Mixture Models. R Journal, 14(1).

Here’s the link to the paper: https://journal.r-project.org/articles/RJ-2022-009/RJ-2022-009.pdf

Why do we use Bayesian ANOVA?

As per Riko Kelter, “Traditional ANOVA based on null hypothesis significance testing (NHST) is prone to overestimating effects and stating effects if none are present. Bayesian ANOVAs developed so far are based on Bayes factors (BF), which also enforce a hypothesis testing stance. Instead, the Bayesian ANOVA implemented in ‘bayesanova’ focusses on effect size estimation and is based on a Gaussian mixture with known allocations, for which full posterior inference for the component parameters is implemented via Markov-Chain-Monte-Carlo (MCMC).”

What is a Gaussian Mixture with known allocations?

Gaussian Mixture Models: These are probabilistic models that assume all the data points are generated from a mixture of several Gaussian distributions with unknown parameters. GMMs are used for clustering in unsupervised machine learning, where data points are grouped based on the likelihood of belonging to a certain distribution. Known allocations, in the context of Gaussian Mixture Models, means that the group or cluster to which each data point belongs is known a priori (typically, in GMMs, the allocations are unknown a priori).
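As a concrete illustration, here is a minimal base-R sketch (the group sizes and parameter values are invented purely for illustration) that simulates data from a three-component Gaussian mixture in which the component of every observation is known:

#Simulating a three-component Gaussian mixture with KNOWN allocations:
#each observation's group label is recorded, so no clustering is needed
set.seed(1)
means <- c(10, 20, 26)        #illustrative component means
sds <- c(4, 4, 3)             #illustrative component standard deviations
group <- rep(1:3, each = 20)  #known allocation of each data point
y <- rnorm(60, mean = means[group], sd = sds[group])
#With allocations known, each component can be summarised directly
tapply(y, group, mean)
tapply(y, group, sd)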

Inference for the differences in means, standard deviations, and effect sizes between each pair of groups is obtained automatically. Estimation of the parameters instead of hypothesis testing is embraced via the region of practical equivalence (ROPE), and helper functions provide checks of the model assumptions and visualization of the results.

What is ROPE (Region of Practical Equivalence)?

ROPE is a specified range of values for a parameter, within which differences are considered to be practically negligible. In simple terms, it’s a way of defining what constitutes a “small enough” difference to be considered irrelevant or unimportant from a practical standpoint.

Application in Bayesian ANOVA:

  1. Parameter Estimation: In Bayesian ANOVA, parameters (like differences between group means) are estimated with a posterior distribution rather than a single point estimate. The ROPE is applied to this distribution.

  2. Decision Making: Instead of just looking at whether a parameter is statistically significantly different from zero (or another value), the ROPE allows us to consider whether the difference is practically significant. If the bulk of the posterior distribution of the parameter lies within the ROPE, the difference can be considered practically negligible (see the sketch after this list).

  3. Credible Intervals and ROPE: A credible interval (the Bayesian equivalent of a confidence interval) provides a range of plausible values for a parameter. If this interval falls entirely or mostly within the ROPE, it suggests that the parameter’s true value is practically equivalent to the null value (or another specified value).
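To make the decision rule in point 2 concrete, here is a small base-R sketch; the posterior draws and the ROPE limits below are simulated and chosen purely for illustration:

#Hypothetical posterior draws for a difference in means
set.seed(1)
posterior_diff <- rnorm(10000, mean = 0.05, sd = 0.1)
#ROPE: differences within (-0.1, 0.1) are treated as practically negligible
rope <- c(-0.1, 0.1)
mean(posterior_diff > rope[1] & posterior_diff < rope[2])
#A proportion close to 1 means the difference is practically negligible;
#a proportion close to 0 suggests a practically relevant difference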

Significance of ROPE in Bayesian Context:

  1. Shifts Focus to Practical Significance: ROPE shifts the focus from “statistical significance” to “practical significance,” emphasizing the real-world relevance of the findings.

  2. Incorporates Subject Matter Expertise: The determination of the ROPE often involves domain expertise, as it requires a judgment about what size of effect is meaningful in a practical context.

  3. Enhances Interpretability: ROPE can make Bayesian analysis more interpretable to non-statisticians, as it relates directly to the practical implications of the results.

Bayesian ANOVA involves a scenario where:

  1. The data is assumed to be generated from a mixture of Gaussian distributions.
  2. The cluster or group membership of each data point is known.
  3. The analysis is conducted in a Bayesian framework, where prior knowledge about the parameters is updated with data to obtain posterior probabilities, as in the small sketch below.
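To illustrate point 3, here is a minimal sketch of a conjugate normal-normal update for a single group mean with known standard deviation; all numbers are toy values, not from the ToothGrowth analysis below:

#Prior: mu ~ N(mu0, tau0^2); data: y ~ N(mu, sigma^2) with sigma known
mu0 <- 0; tau0 <- 10           #weak prior on the group mean
sigma <- 4                     #assumed known data standard deviation
y <- c(9.8, 11.2, 10.4, 12.1)  #toy observations
n <- length(y)
post_var <- 1 / (1/tau0^2 + n/sigma^2)
post_mean <- post_var * (mu0/tau0^2 + sum(y)/sigma^2)
c(post_mean, sqrt(post_var))   #posterior mean and sd of mu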

Tooth Growth Dataset

The ToothGrowth data set contains the results of an experiment studying the effect of vitamin C on tooth growth in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded OJ) or ascorbic acid, a form of vitamin C (coded VC).

#For reproducibility, we set a seed
set.seed(42)
#Loading inbuilt datasets
library(datasets)
#Loading the Tooth Growth dataset
data("ToothGrowth")
head(ToothGrowth, n=3)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
ToothGrowth$dose = factor(ToothGrowth$dose)
levels(ToothGrowth$dose) = c('Low', 'Medium', 'High')

#Using the BayesFactor package
library(BayesFactor)
## Warning: package 'BayesFactor' was built under R version 4.3.1
## Loading required package: coda
## Loading required package: Matrix
## Warning: package 'Matrix' was built under R version 4.3.1
## ************
## Welcome to BayesFactor 0.9.12-4.6. If you have questions, please contact Richard Morey (richarddmorey@gmail.com).
## 
## Type BFManual() to open the manual.
## ************
bf = anovaBF(len ~ supp * dose, data = ToothGrowth)
bf
## Bayes factor analysis
## --------------
## [1] supp                    : 1.198757     ±0.01%
## [2] dose                    : 4.983636e+12 ±0%
## [3] supp + dose             : 2.963312e+14 ±1.59%
## [4] supp + dose + supp:dose : 8.067205e+14 ±1.94%
## 
## Against denominator:
##   Intercept only 
## ---
## Bayes factor type: BFlinearModel, JZS
#The results show strong evidence that the models allowing the six dose-by-delivery
#groups to differ are favoured over the intercept-only null model
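Because bf stores one Bayes factor per model, two models can also be compared directly by dividing the corresponding entries; here the full model with interaction is compared against the main-effects model (a ratio near 1 would mean the interaction adds little evidence):

#Comparing the full model against the main-effects model
bf[4] / bf[3]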

#Using the bayesanova package
library(bayesanova)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
## 
## Attaching package: 'tidyr'
## The following objects are masked from 'package:Matrix':
## 
##     expand, pack, unpack
library(magrittr)
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:tidyr':
## 
##     extract
data("ToothGrowth")

grp1 <- (ToothGrowth %>% filter(dose==0.5) %>% select(len))$len
grp2 <- (ToothGrowth %>% filter(dose==1.0) %>% select(len))$len
grp3 <- (ToothGrowth %>% filter(dose==2.0) %>% select(len))$len
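The same group extraction can be done with base R's split(), which splits len by dose level (dose is numeric again here, because the data set was re-loaded above):

#Equivalent base-R extraction of the three dose groups
grps <- split(ToothGrowth$len, ToothGrowth$dose)
grp1 <- grps[["0.5"]]; grp2 <- grps[["1"]]; grp3 <- grps[["2"]]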

library(rstatix)
## 
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
## 
##     filter
#Next, we run the assumption checks on the data
assumption.check(grp1, grp2, grp3, conf.level=0.95)
## Model assumptions checked. No significant deviations from normality detected. Bayesian ANOVA can be run safely.
set.seed(42)
res = bayes.anova(first = grp1, second = grp2, third = grp3)
## Bayesian ANOVA output:
## Details: Gaussian-mixture model with three components
## 
## 
## |Parameter     |LQ    |Mean  |UQ    |Std.Err |
## |:-------------|:-----|:-----|:-----|:-------|
## |mu1           |8.69  |10.61 |12.5  |0.91    |
## |mu2           |18.05 |19.75 |21.46 |0.84    |
## |mu3           |24.94 |26.1  |27.25 |0.57    |
## |sigma1        |3.02  |4.07  |5.67  |0.67    |
## |sigma2        |2.95  |3.96  |5.43  |0.64    |
## |sigma3        |2.43  |3.25  |4.42  |0.52    |
## |mu2-mu1       |6.7   |9.15  |11.7  |1.25    |
## |mu3-mu1       |13.42 |15.49 |17.67 |1.06    |
## |mu3-mu2       |4.36  |6.34  |8.38  |1.01    |
## |sigma2-sigma1 |-2.02 |-0.11 |1.68  |0.93    |
## |sigma3-sigma1 |-2.62 |-0.81 |0.74  |0.85    |
## |sigma3-sigma2 |-2.46 |-0.71 |0.85  |0.82    |
## |delta12       |-5.77 |-4.59 |-3.21 |0.65    |
## |delta13       |-9.37 |-8.14 |-6.63 |0.71    |
## |delta23       |-4.36 |-3.36 |-2.19 |0.56    |
#The results table shows the lower and upper quantiles, corresponding to the (100-c)/2 and
#(100+c)/2 percent quantiles, where c is the credible level in percent (here c = 95, giving
#the 2.5% and 97.5% quantiles). The posterior mean and standard error are also given for each
#parameter, difference of parameters, and effect size. The results clearly show large
#differences between the groups: the more vitamin C given, the more tooth growth is observed
#(measured via tooth length). While the first group (low dose) has a posterior mean of 10.61
#with credible interval [8.69, 12.50], the second group achieves a mean of 19.75 with credible
#interval [18.05, 21.46]. The third group has a posterior mean of 26.1 with credible interval
#[24.94, 27.25]. The posterior estimates for the differences µ2 − µ1, µ3 − µ1 and µ3 − µ2 show
#that all groups differ from each other with very high probability, given the data.

#If you need a threshold for delta, it is subjective. In some fields a delta of 0.2 might be
#considered small, 0.5 medium, and 0.8 large, following Cohen's conventional effect size guidelines.
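As a hedged follow-up: if res exposes the raw posterior draws as a data frame with one column per parameter (inspect it with str(res) first; this is an assumption about the return value, not something the output above guarantees), the share of draws inside a ROPE of (-0.2, 0.2) for the first effect size could be computed directly:

#Manual ROPE check on the delta12 draws, guarded because the structure
#of res is assumed rather than verified here
if (is.data.frame(res) && "delta12" %in% colnames(res)) {
  mean(res$delta12 > -0.2 & res$delta12 < 0.2)
}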

Plots for checking assumptions:

This function runs a preliminary assumption check on the data, which is recommended before running a Bayesian ANOVA. The model assumption is normality in each mixture component, so the assumption.check function runs Shapiro-Wilk tests for normality in each group (Shapiro and Wilk, 1965).
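The same check can be reproduced by hand with base R's shapiro.test(), one test per group:

#Manual equivalent of the normality check: one Shapiro-Wilk test per group
shapiro.test(grp1)
shapiro.test(grp2)
shapiro.test(grp3)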

(Figure: diagnostic plots produced by assumption.check(), included here from a locally saved screenshot.)