Chapter 1. Introduction to Post-hoc tests

  • Thought_1. If p-value is less than 0.05, ANOVA test tells you that whether you have an overall difference between your groups.

  • Thought_2. Then, the variance between groups are all equal?, so we perform Leven_s test. If p-value is higher that 0.05, then Leven_s test says that all the variances are equal. Then, we go to next step. The Post hoc test.

  • What is Post Hoc test? Unfortunately, ANOVA test doesn’t tell you that which specific groups differed. To figure out this, post hoc tests do. This test allows multiple pairwise-comparison in order to determine if the mean difference between specific pairs of group are statistically significant. (I want to call collection of t-test, see table below)

Chapter 2. ANOVA Test Process

Step 1. Data Setup and Display table

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
plant_growth <- PlantGrowth

group_by(plant_growth, group) %>%
  summarise(
    count = n(),
    mean = mean(weight, na.rm = TRUE),
    sd = sd(weight, na.rm = TRUE)
  )
## # A tibble: 3 × 4
##    group count  mean        sd
##   <fctr> <int> <dbl>     <dbl>
## 1   ctrl    10 5.032 0.5830914
## 2   trt1    10 4.661 0.7936757
## 3   trt2    10 5.526 0.4425733

Step 2. Population Data Visualization

## using ggplot
library(ggplot2)
# Basic box plot
pg_p <- ggplot(plant_growth, aes(x = group, y = weight, fill = group)) + 
  geom_boxplot() + 
  labs(title="Population Plot of Weight grop by",x = "Group", y = "Plant Weight")

pg_p + 
  geom_jitter(shape = 16, position=position_jitter(0.2)) + 
  theme(plot.title = element_text(hjust = 0.5, size = 20))

Step 3. Compute one-way ANOVA TEST

# Compute the analysis of variance
res.aov <- aov(weight ~ group, data = plant_growth)
# Summary of the analysis
summary(res.aov)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## group        2  3.766  1.8832   4.846 0.0159 *
## Residuals   27 10.492  0.3886                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Step 4. Interpretation

  • p-value is less than 0.05, Null hypothesis, all the means are equal, is rejected, so, we conclude that there are significant difference between the groups.

Chapter 3. ANOVA Assumptions Test, Validity.

Assumption 1. The variance across groups are equal: Leven_s Test

  • To prove that all the variance between groups are equal, we try to conduct Leven_s test.
library(car)
## Warning: package 'car' was built under R version 3.4.1
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
# Levene's test
leveneTest(plant_growth$weight, plant_growth$group)
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  1.1192 0.3412
##       27
# Levene's test with center = mean
leveneTest(plant_growth$weight, plant_growth$group, center = mean)
## Levene's Test for Homogeneity of Variance (center = mean)
##       Df F value Pr(>F)
## group  2   1.237 0.3062
##       27
  • Both tests tell you that p.value is higher than 0.05, so, Null Hypothesis, all the variances are equal, is accepted. Accurately, there is no evidence to suggest that the variance between groups is statistically different, particularly the group(p.value = 0.012) between trt2 and trt1.

  • Let’s see graph below (Homogeneity of variances)

plot(res.aov, 1) # 1. Homogeneity of variances

Assumption 2. Standardized Residuals shows normally distributed: Shapiro-Wilk test

  • To prove that the data is normally distributed, we are going to perform Shapiro-Wilk test.
# Extract the residuals
aov_residuals <- residuals(object = res.aov)
# Run Shapiro-Wilk test
shapiro.test(x = aov_residuals )
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.96607, p-value = 0.4379
  • The conclusion above, is supported by the Shapiro-Wilk test on the ANOVA residuals (W = 0.96, p = 0.6) which finds no indication that normality is violated.
plot(res.aov, 2) # 2. Normality

  • As all the points fall approximately along this reference line, we can assume normality.

  • Interestigly, both test shows three outliers, 4, 15, 17

Chapter 4. Performing Post Hoc test

  • We found that all the groups are significantly different BUT we don’t know which pairs of groups are different.

  • It’s possible to perform multiple pairwise-comparison called Post Hoc test, to determine if the mean difference between specific pairs of group are statistically significant.

  • More formally, post-hoc tests allow for multiple pairwise comparisons without inflating the type I error.

(1) The necessity of post-hoc tests

  • One question comes up. What does it mean to increae the type I error?

  • Suppose that you invovles conducting three pairwise comparisons, each the probability of a type I error(abbr. t_error) set at 5%, then total probability of making type I error is equal to (1 - (No_1. t_error x No_2. t_error x No_3. t_error))

  • Ref. NHST table NHST Review

(2) Post Hoc test Table

TukeyHSD(res.aov)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = weight ~ group, data = plant_growth)
## 
## $group
##             diff        lwr       upr     p adj
## trt1-ctrl -0.371 -1.0622161 0.3202161 0.3908711
## trt2-ctrl  0.494 -0.1972161 1.1852161 0.1979960
## trt2-trt1  0.865  0.1737839 1.5562161 0.0120064
  • Terminology Explanation is followed below:
  • diff: difference between means of the two groups
  • lwr, upr: the lower and the upper end point of the confidence interval at 95% (default)
  • p adj: p-value after adjustment for the multiple comparisons.

(3) Post Hoc graph

tukey <- TukeyHSD(res.aov)
plot(tukey)

(4) Bonferroni adjusted p-values

  • the Bonferroni correction is a method that is used to counteract the problem of inflated type I errors while engaging in multiple pairwise comparisons between subgroups.

  • Bonferroni is generally known as the most conservative method to control the familywise error rate.

  • Bonferroni is based on the idea that if you test N dependent or independent hypotheses, one way of maintaining the familywise error rate is to test each individual hypothesis at a statistical significance level that is deflated by a factor of 1/n,

  • So, for a significance level for the whole family of tests of αα, the Bonferroni correction would be to test each of the individual tests at a significance level of a/n.

# Use p.adjust
bonferroni_ex <- p.adjust(.005, method = "bonferroni", n = 8) 

# Print bonferroni_ex
bonferroni_ex
## [1] 0.04
# Pairwise t-test
pairwise.t.test(plant_growth$weight, plant_growth$group, p.adjust = "bonferroni")
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  plant_growth$weight and plant_growth$group 
## 
##      ctrl  trt1 
## trt1 0.583 -    
## trt2 0.263 0.013
## 
## P value adjustment method: bonferroni