short discussion post
library( dplyr )
library( ggplot2 )
library( tidyr )
library( faraway )Bonferroni Correction
As more statistical analyses are applied to the same data sample, the familywise type I error rate (rate of erroneously rejecting the null hypothesis) increases. Bonferroni correction is a statistical method to account for this increase.
The relationship between the familywise type I error rate and the number of statistical analyses performed on a data set is given by:
\[\alpha_{FW} = 1 - (1- \alpha_{PC})^c\] where \(c\) = the number of statistical tests/comparisons and \(\alpha_{PC}\) = the accepted error rate (e.g. \(\alpha_{PC} = 0.05\)).
Let’s visualize this relationship for a few \(\alpha\)s:
c <- seq( 1,10 )
p_05 <- 1 - ( 1 - 0.05 )^c
p_01 <- 1 - ( 1 - 0.01 )^c
fwa <- data.frame( 'c' = c, 'a0.05' = p_05, 'a0.01' = p_01 )
fwa <- fwa %>%
pivot_longer( cols = c( 'a0.05', 'a0.01'), names_to = 'alpha', values_to = 'val' )
ggplot( fwa, aes( x = c, y = val, color = alpha ) ) +
geom_line( ) +
theme_classic() +
ggtitle( 'Familywise Error Rate' ) As we can see, the error rate increases with each additional statistical test performed on a sample of data. If 5 statistical tests are performed on the same sample of data and an \(\alpha = 0.05\) is preferred for evaluating a null hypothesis, then, due to familywise error, this would be equivalent to \(\sim 0.226\). Meaning that the chances of erroneously rejecting the null hypothesis at least once during the multiple statistical analyses performed on the same data set is \(\sim 22.6\mbox{%}\).
Therefore, adjustments must be made to either the \(\alpha\) level that p-values are null hypotheses are evaluated with or to the p-values themselves. Bonferroni Correction is the process of accounting for familywise error.
When should Bonferroni Correction be considered?: for multiple comparisons problems (comparing multiple means as in ANOVA) or multiple analysis problems (correlation matrices, multiple regressions etc)
a quick example from the LMR text: Using the jsp dataset, look at the schools where the mathematics test scores deviate from the mean
data( jsp, package = 'faraway' )
jsp$mathcentered <- jsp$math - mean( jsp$math ) #variation-centered math scores
#visualize the math scores
ggplot( jsp, aes( x = school, y = mathcentered ) ) +
geom_boxplot() +
ylab( 'Centered Math Scores' ) +
theme( axis.text.x = element_text( angle = 90 ) ) +
ggtitle( 'Mathematics Test Scores for London Junior Schools' )lmod <- lm( mathcentered ~ school - 1, jsp )
#test for differences between the schools:
anova( lm( mathcentered ~ school, jsp ) )## Analysis of Variance Table
##
## Response: mathcentered
## Df Sum Sq Mean Sq F value Pr(>F)
## school 48 15484 322.58 5.9353 < 2.2e-16 ***
## Residuals 3187 173212 54.35
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How many schools have p-values < 0.05
pvals <- summary(lmod)$coef[,4]
sigp <- coef( lmod )[ pvals < 0.05 ]
sigp## school1 school3 school4 school5 school9 school14 school16 school21
## -3.368450 2.296405 -2.661928 1.832048 2.245764 -3.289835 -3.737400 -3.618450
## school24 school27 school28 school31 school34 school36 school40 school45
## 2.623786 -2.305764 -5.828595 3.824053 1.935898 2.360544 -4.985458 -4.639201
## school46 school47 school49 school50
## 3.063562 2.396895 2.796405 -2.652027
From the result above, it seems that there are 20 schools with p-values that suggest we can reject the null hypothesis and feel 95% confident that the mean mathematics score for this school is significantly different from the overall mean. However, we have to account for how many comparisons are being made using the same sample of data.
How many schools have Bonferroni corrected p-values that are statistically significant with an \(\alpha= 0.05\):
padj <- p.adjust( pvals, method = 'bonferroni' )
coef( lmod )[ padj < 0.05 ]## school1 school16 school21 school28 school31 school40 school45 school50
## -3.368450 -3.737400 -3.618450 -5.828595 3.824053 -4.985458 -4.639201 -2.652027
After correcting for familywise error, we see that only 8 schools have mean mathematical scores that have a statistically significant difference in means from the sample mean.
But how is the Bonferroni Correction calculated?
Method 1: multiply the observed p-values by the number of comparisons being made and evaluate which p-values are below the desired \(\alpha\) after this correction:
alpha <- 0.05
number_comparisons <- 50 #there are 50 means (1 for each school) that are compared for this ANOVA
pvals_adjusted <- pvals * number_comparisons #multiply by the number of comparisons
coef( lmod )[ pvals_adjusted < alpha ]## school1 school16 school21 school28 school31 school40 school45 school50
## -3.368450 -3.737400 -3.618450 -5.828595 3.824053 -4.985458 -4.639201 -2.652027
Method 2: divide the desired \(\alpha\) by the number of comparisons
alpha_adj <- alpha / number_comparisons
coef( lmod )[pvals < alpha_adj ]## school1 school16 school21 school28 school31 school40 school45 school50
## -3.368450 -3.737400 -3.618450 -5.828595 3.824053 -4.985458 -4.639201 -2.652027
Note that both methods resulted in the same outcome at the p.adjust method from the stats library.
References:
What is Bonferroni Correction?
Is Bonferroni Correction Really Necessary