The Kruskal-Wallis test is a non-parametric alternative to one-way ANOVA. It is used when the assumptions of ANOVA (equality of variances and normality of residuals) are not met. For demonstration purposes, the relevant code and its interpretation are given here.
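As a rough illustration of how the two ANOVA assumptions might be checked before falling back on Kruskal-Wallis, here is a minimal sketch using base R only. The data frame `demo` and its columns `group` and `value` are simulated stand-ins invented for this example, not the fish data set, and the choice of Shapiro-Wilk and Bartlett tests is an assumption rather than something prescribed by the original analysis.

```r
# Simulated stand-in data (hypothetical; not the fish data set used below)
set.seed(1)
demo <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 20)),
  value = c(rexp(20, rate = 1), rexp(20, rate = 0.5), rexp(20, rate = 0.25))
)

# Fit the one-way ANOVA model whose assumptions we want to check
fit <- aov(value ~ group, data = demo)

# Normality of residuals: Shapiro-Wilk test on the model residuals
norm_check <- shapiro.test(residuals(fit))

# Equality of variances across groups: Bartlett test
var_check <- bartlett.test(value ~ group, data = demo)

print(norm_check)
print(var_check)
```

Small p-values in either test suggest the corresponding assumption is doubtful. Bartlett's test is itself sensitive to non-normality, so `fligner.test()` from base R may be a safer choice when the residuals look clearly non-normal.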
Steps
- Set the working directory
- Keep the data set and scripts in the same folder
- Load required libraries
- Read the data set
- Explore the data set
- Execute the analysis
library(readxl)
library(ggpubr)
## Loading required package: ggplot2
library(agricolae)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ lubridate 1.9.4 ✔ tibble 3.2.1
## ✔ purrr 1.0.2 ✔ tidyr 1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
fish = read_excel('datasets.xlsx', sheet = 'anova', range = 'G10:H110')
str(fish)
## tibble [100 × 2] (S3: tbl_df/tbl/data.frame)
## $ aquaria: chr [1:100] "Aquarium 15" "Aquarium 15" "Aquarium 15" "Aquarium 15" ...
## $ growth : num [1:100] 0.52 1.57 2.57 3.44 3.72 ...
fish$aquaria = as.factor(fish$aquaria)
str(fish)
## tibble [100 × 2] (S3: tbl_df/tbl/data.frame)
## $ aquaria: Factor w/ 4 levels "Aquarium 15",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ growth : num [1:100] 0.52 1.57 2.57 3.44 3.72 ...
head(fish)
## # A tibble: 6 × 2
## aquaria growth
## <fct> <dbl>
## 1 Aquarium 15 0.520
## 2 Aquarium 15 1.57
## 3 Aquarium 15 2.57
## 4 Aquarium 15 3.44
## 5 Aquarium 15 3.72
## 6 Aquarium 15 3.04
levels(fish$aquaria)
## [1] "Aquarium 15" "Aquarium 20" "Aquarium 25" "Aquarium 30"
group_by(fish, aquaria) %>%
  summarise(
    count = n(),
    mean = mean(growth),
    sd = sd(growth),
    cv = sd/mean*100  # coefficient of variation (%), not the standard error
  )
## # A tibble: 4 × 5
## aquaria count mean sd cv
## <fct> <int> <dbl> <dbl> <dbl>
## 1 Aquarium 15 25 2.67 1.08 40.5
## 2 Aquarium 20 25 3.09 1.11 36.1
## 3 Aquarium 25 25 2.57 0.999 38.9
## 4 Aquarium 30 25 1.40 0.923 65.9
ggboxplot(fish, x = 'aquaria', y = 'growth', col = 'aquaria',
add = c('jitter'), legend = 'none')
ggline(fish, x = 'aquaria', y = 'growth',
add = c('mean_se', 'jitter'))
Run the Kruskal-Wallis test
kruskal.test(growth ~ aquaria, fish)
##
## Kruskal-Wallis rank sum test
##
## data: growth by aquaria
## Kruskal-Wallis chi-squared = 26.103, df = 3, p-value = 9.076e-06
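To make the reported chi-squared statistic less of a black box, the sketch below recomputes the Kruskal-Wallis H statistic from ranks and compares it with `kruskal.test()`. It uses simulated data (the data frame `demo`, its columns, and the group means are all hypothetical), so the numbers will not match the fish output above. The formula used is the standard one, H = 12 / (N(N+1)) · Σ R_i² / n_i − 3(N+1), divided by a tie correction.

```r
# Simulated stand-in data (hypothetical; four groups of 25, like the fish layout)
set.seed(42)
demo <- data.frame(
  group = factor(rep(c("A", "B", "C", "D"), each = 25)),
  value = c(rnorm(25, 2.7), rnorm(25, 3.1), rnorm(25, 2.6), rnorm(25, 1.4))
)

N <- nrow(demo)
r <- rank(demo$value)                      # ranks over all observations (midranks for ties)
rank_sums <- tapply(r, demo$group, sum)    # R_i: sum of ranks in each group
n_i <- tapply(r, demo$group, length)       # n_i: group sizes

# Uncorrected H statistic
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Tie correction (equals 1 when there are no tied values)
ties <- table(demo$value)
H <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# H is compared against a chi-squared distribution with k - 1 degrees of freedom
df <- nlevels(demo$group) - 1
p_value <- pchisq(H, df = df, lower.tail = FALSE)

# Reference result from the built-in function
builtin <- kruskal.test(value ~ group, data = demo)
print(c(H = H, df = df, p_value = p_value))
print(builtin)
```

The hand-computed `H` and `p_value` should agree with the `statistic` and `p.value` components of `builtin`, which shows that the test works on ranks only and never uses the raw group means.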
The Kruskal-Wallis chi-squared value is 26.103 with df = 3 and a
p-value < 0.05. The null hypothesis, H_0: the growth distributions
(mean ranks) are the same in all aquaria, is rejected. So at least one
aquarium differs from the others. Which ones? That requires further
analysis. Pairwise Wilcoxon rank sum tests with Bonferroni-adjusted
p-values can be used. The results show that aquarium 30 differs from
aquarium 15, aquarium 20 and aquarium 25, while growth in aquarium 15,
aquarium 20 and aquarium 25 does not differ significantly.
pairwise.wilcox.test(fish$growth, fish$aquaria, p.adjust.method = 'bonferroni')
##
## Pairwise comparisons using Wilcoxon rank sum exact test
##
## data: fish$growth and fish$aquaria
##
## Aquarium 15 Aquarium 20 Aquarium 25
## Aquarium 20 1.00000 - -
## Aquarium 25 1.00000 1.00000 -
## Aquarium 30 0.00054 4.1e-06 0.00093
##
## P value adjustment method: bonferroni
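The Bonferroni adjustment in the table above can be reproduced by hand, which may help in reading the adjusted p-values. The sketch below runs each pairwise Wilcoxon rank sum test separately and then adjusts the raw p-values with `p.adjust()`. The data are simulated (the data frame `demo`, its three groups, and their means are hypothetical), so the values differ from the fish results.

```r
# Simulated stand-in data (hypothetical; three groups of 15)
set.seed(7)
demo <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 15)),
  value = c(rnorm(15, 0), rnorm(15, 0.2), rnorm(15, 2))
)

# All unordered pairs of group labels: A-B, A-C, B-C
pairs <- combn(levels(demo$group), 2, simplify = FALSE)

# Unadjusted two-sided Wilcoxon rank sum p-value for each pair
raw_p <- sapply(pairs, function(pr) {
  x <- demo$value[demo$group == pr[1]]
  y <- demo$value[demo$group == pr[2]]
  wilcox.test(x, y)$p.value
})
names(raw_p) <- sapply(pairs, paste, collapse = " vs ")

# Bonferroni: multiply each p-value by the number of comparisons, capped at 1
adj_p <- p.adjust(raw_p, method = "bonferroni")

# Reference result from the built-in pairwise function
builtin <- pairwise.wilcox.test(demo$value, demo$group,
                                p.adjust.method = "bonferroni")
print(adj_p)
print(builtin$p.value)
```

With three comparisons, each adjusted value is `min(1, 3 * raw_p)`, which is why clearly non-significant pairs show up as exactly 1 in the adjusted table.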
We can also run this analysis directly with kruskal() from the agricolae package, which gives more detailed output, including group letters. The results are the same.
kruskal(y = fish$growth, trt = fish$aquaria, p.adj = 'bonferroni', console = TRUE)
##
## Study: fish$growth ~ fish$aquaria
## Kruskal-Wallis test's
## Ties or no Ties
##
## Critical Value: 26.10292
## Degrees of freedom: 3
## Pvalue Chisq : 9.075687e-06
##
## fish$aquaria, means of the ranks
##
## fish.growth r
## Aquarium 15 57.16 25
## Aquarium 20 64.64 25
## Aquarium 25 54.56 25
## Aquarium 30 25.64 25
##
## Post Hoc Analysis
##
## P value adjustment method: bonferroni
## t-Student: 2.694028
## Alpha : 0.05
## Minimum Significant Difference: 19.26356
##
## Treatments with the same letter are not significantly different.
##
## fish$growth groups
## Aquarium 20 64.64 a
## Aquarium 15 57.16 a
## Aquarium 25 54.56 a
## Aquarium 30 25.64 b