Bonferroni Correction

Discussion Week 4 DATA621

Bonnie Cooper

short discussion post

library( dplyr )
library( ggplot2 )
library( tidyr )
library( faraway )

Bonferroni Correction

As more statistical analyses are applied to the same data sample, the familywise type I error rate (rate of erroneously rejecting the null hypothesis) increases. Bonferroni correction is a statistical method to account for this increase.

The relationship between the familywise type I error rate and the number of statistical analyses performed on a data set is given by:
\[\alpha_{FW} = 1 - (1- \alpha_{PC})^c\] where \(c\) = the number of statistical tests/comparisons and \(\alpha_{PC}\) = the accepted error rate (e.g. \(\alpha_{PC} = 0.05\)).

Let’s visualize this relationship for a few \(\alpha\)s:

c <- seq( 1,10 )
p_05 <- 1 - ( 1 - 0.05 )^c
p_01 <- 1 - ( 1 - 0.01 )^c
fwa <- data.frame( 'c' = c, 'a0.05' = p_05, 'a0.01' = p_01 )
fwa <- fwa %>%
  pivot_longer( cols = c( 'a0.05', 'a0.01'), names_to = 'alpha', values_to = 'val' )

ggplot( fwa, aes( x = c, y = val, color = alpha ) ) +
  geom_line( ) +
  theme_classic() +
  ggtitle( 'Familywise Error Rate' )

As we can see, the error rate increases with each additional statistical test performed on a sample of data. If 5 statistical tests are performed on the same sample of data and an \(\alpha = 0.05\) is preferred for evaluating a null hypothesis, then, due to familywise error, this would be equivalent to \(\sim 0.226\). Meaning that the chances of erroneously rejecting the null hypothesis at least once during the multiple statistical analyses performed on the same data set is \(\sim 22.6\mbox{%}\).
Therefore, adjustments must be made to either the \(\alpha\) level that p-values are null hypotheses are evaluated with or to the p-values themselves. Bonferroni Correction is the process of accounting for familywise error.

When should Bonferroni Correction be considered?: for multiple comparisons problems (comparing multiple means as in ANOVA) or multiple analysis problems (correlation matrices, multiple regressions etc)

a quick example from the LMR text: Using the jsp dataset, look at the schools where the mathematics test scores deviate from the mean

data( jsp, package = 'faraway' )
jsp$mathcentered <- jsp$math - mean( jsp$math ) #variation-centered math scores

#visualize the math scores
ggplot( jsp, aes( x = school, y = mathcentered ) ) +
  geom_boxplot() +
  ylab( 'Centered Math Scores' ) +
  theme( axis.text.x = element_text( angle = 90 ) ) +
  ggtitle( 'Mathematics Test Scores for London Junior Schools' )

lmod <- lm( mathcentered ~ school - 1, jsp )
#test for differences between the schools:
anova( lm( mathcentered ~ school, jsp ) )
## Analysis of Variance Table
## 
## Response: mathcentered
##             Df Sum Sq Mean Sq F value    Pr(>F)    
## school      48  15484  322.58  5.9353 < 2.2e-16 ***
## Residuals 3187 173212   54.35                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

How many schools have p-values < 0.05

pvals <- summary(lmod)$coef[,4]
sigp <- coef( lmod )[ pvals < 0.05 ]
sigp
##   school1   school3   school4   school5   school9  school14  school16  school21 
## -3.368450  2.296405 -2.661928  1.832048  2.245764 -3.289835 -3.737400 -3.618450 
##  school24  school27  school28  school31  school34  school36  school40  school45 
##  2.623786 -2.305764 -5.828595  3.824053  1.935898  2.360544 -4.985458 -4.639201 
##  school46  school47  school49  school50 
##  3.063562  2.396895  2.796405 -2.652027

From the result above, it seems that there are 20 schools with p-values that suggest we can reject the null hypothesis and feel 95% confident that the mean mathematics score for this school is significantly different from the overall mean. However, we have to account for how many comparisons are being made using the same sample of data.

How many schools have Bonferroni corrected p-values that are statistically significant with an \(\alpha= 0.05\):

padj <- p.adjust( pvals, method = 'bonferroni' )
coef( lmod )[ padj < 0.05 ]
##   school1  school16  school21  school28  school31  school40  school45  school50 
## -3.368450 -3.737400 -3.618450 -5.828595  3.824053 -4.985458 -4.639201 -2.652027

After correcting for familywise error, we see that only 8 schools have mean mathematical scores that have a statistically significant difference in means from the sample mean.

But how is the Bonferroni Correction calculated?
Method 1: multiply the observed p-values by the number of comparisons being made and evaluate which p-values are below the desired \(\alpha\) after this correction:

alpha <- 0.05
number_comparisons <- 50 #there are 50 means (1 for each school) that are compared for this ANOVA
pvals_adjusted <- pvals * number_comparisons #multiply by the number of comparisons
coef( lmod )[ pvals_adjusted < alpha ]
##   school1  school16  school21  school28  school31  school40  school45  school50 
## -3.368450 -3.737400 -3.618450 -5.828595  3.824053 -4.985458 -4.639201 -2.652027

Method 2: divide the desired \(\alpha\) by the number of comparisons

alpha_adj <- alpha / number_comparisons
coef( lmod )[pvals < alpha_adj ]
##   school1  school16  school21  school28  school31  school40  school45  school50 
## -3.368450 -3.737400 -3.618450 -5.828595  3.824053 -4.985458 -4.639201 -2.652027

Note that both methods resulted in the same outcome at the p.adjust method from the stats library.



References:
What is Bonferroni Correction?
Is Bonferroni Correction Really Necessary