The Kruskal-Wallis test is a non-parametric alternative to one-way ANOVA. It is used when the ANOVA assumptions of equal variances and normally distributed residuals are not met. The relevant code and its interpretation are given here for demonstration purposes.
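
Before falling back on the Kruskal-Wallis test, it can help to check the two ANOVA assumptions directly. The sketch below is only an illustration on simulated data: the data frame `demo` and its columns `group` and `value` are hypothetical and are not part of the fish data set used later. It fits a one-way ANOVA, applies the Shapiro-Wilk test to the residuals, and applies Bartlett's test for equal variances, using base R functions only.

```r
# Simulated, deliberately skewed data (hypothetical; not the fish data set)
set.seed(42)
demo <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 20)),
  value = c(rexp(20, rate = 1), rexp(20, rate = 0.5), rexp(20, rate = 0.25))
)

# Fit the one-way ANOVA whose assumptions we want to check
fit <- aov(value ~ group, data = demo)

# Normality of residuals: a small p-value suggests non-normal residuals
norm_check <- shapiro.test(residuals(fit))

# Equality of variances: a small p-value suggests unequal variances
var_check <- bartlett.test(value ~ group, data = demo)

print(norm_check)
print(var_check)
```

If either test gives a small p-value, the Kruskal-Wallis test is a reasonable choice. Bartlett's test is itself sensitive to non-normality, so the two results are best read together.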

Steps

- Set the working directory

- Keep the data set and scripts in the same folder

- Load required libraries

- Read the data set

- Explore the data set

- Execute the analysis

library(readxl)
library(ggpubr)
## Loading required package: ggplot2
library(agricolae)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ lubridate 1.9.4     ✔ tibble    3.2.1
## ✔ purrr     1.0.2     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
fish = read_excel('datasets.xlsx', sheet = 'anova', range = 'G10:H110')
str(fish)
## tibble [100 × 2] (S3: tbl_df/tbl/data.frame)
##  $ aquaria: chr [1:100] "Aquarium 15" "Aquarium 15" "Aquarium 15" "Aquarium 15" ...
##  $ growth : num [1:100] 0.52 1.57 2.57 3.44 3.72 ...
fish$aquaria = as.factor(fish$aquaria)
str(fish)
## tibble [100 × 2] (S3: tbl_df/tbl/data.frame)
##  $ aquaria: Factor w/ 4 levels "Aquarium 15",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ growth : num [1:100] 0.52 1.57 2.57 3.44 3.72 ...
head(fish)
## # A tibble: 6 × 2
##   aquaria     growth
##   <fct>        <dbl>
## 1 Aquarium 15  0.520
## 2 Aquarium 15  1.57 
## 3 Aquarium 15  2.57 
## 4 Aquarium 15  3.44 
## 5 Aquarium 15  3.72 
## 6 Aquarium 15  3.04
levels(fish$aquaria)
## [1] "Aquarium 15" "Aquarium 20" "Aquarium 25" "Aquarium 30"
group_by(fish, aquaria) %>%
    summarise(
        count = n(),
        mean = mean(growth),
        sd = sd(growth),
        cv = sd/mean*100  # coefficient of variation (%), not the standard error
    )
## # A tibble: 4 × 5
##   aquaria     count  mean    sd    cv
##   <fct>       <int> <dbl> <dbl> <dbl>
## 1 Aquarium 15    25  2.67 1.08   40.5
## 2 Aquarium 20    25  3.09 1.11   36.1
## 3 Aquarium 25    25  2.57 0.999  38.9
## 4 Aquarium 30    25  1.40 0.923  65.9
ggboxplot(fish, x = 'aquaria', y = 'growth', color = 'aquaria',
      add = c('jitter'), legend = 'none')

ggline(fish, x = 'aquaria', y = 'growth',
      add = c('mean_se', 'jitter'))
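
The `mean_se` option in ggline() draws each group mean with standard-error bars, where the standard error of the mean is sd / sqrt(n). This differs from sd / mean * 100, which is the coefficient of variation in percent. A minimal sketch of both quantities, computed per group with base R on simulated data (the data frame `demo` and its columns are hypothetical):

```r
# Simulated data (hypothetical; not the fish data set)
set.seed(1)
demo <- data.frame(
  group = rep(c("A", "B", "C"), each = 10),
  value = rnorm(30, mean = rep(c(2, 3, 1.5), each = 10), sd = 1)
)

# Per-group sample size, mean and standard deviation
n <- tapply(demo$value, demo$group, length)
m <- tapply(demo$value, demo$group, mean)
s <- tapply(demo$value, demo$group, sd)

se <- s / sqrt(n)   # standard error of the mean
cv <- s / m * 100   # coefficient of variation, in percent

group_stats <- data.frame(n = n, mean = m, sd = s, se = se, cv = cv)
print(group_stats)
```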

Run the Kruskal-Wallis test

kruskal.test(growth ~ aquaria, fish)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  growth by aquaria
## Kruskal-Wallis chi-squared = 26.103, df = 3, p-value = 9.076e-06

The Kruskal-Wallis chi-squared statistic is 26.103 with df = 3 and a p-value of 9.076e-06, well below 0.05. We therefore reject H_0, which states that growth has the same distribution (roughly, the same median) in all aquaria. The test works on ranks, so it does not compare group means directly. At least one aquarium differs from the others, but the test does not say which one. To find out, we run pairwise Wilcoxon rank sum tests with Bonferroni-adjusted p-values. The results show that Aquarium 30 differs from Aquarium 15, Aquarium 20 and Aquarium 25 (all adjusted p-values < 0.001), while growth in Aquarium 15, Aquarium 20 and Aquarium 25 does not differ significantly (adjusted p-values = 1).

pairwise.wilcox.test(fish$growth, fish$aquaria, p.adjust.method = 'bonferroni')
## 
##  Pairwise comparisons using Wilcoxon rank sum exact test 
## 
## data:  fish$growth and fish$aquaria 
## 
##             Aquarium 15 Aquarium 20 Aquarium 25
## Aquarium 20 1.00000     -           -          
## Aquarium 25 1.00000     1.00000     -          
## Aquarium 30 0.00054     4.1e-06     0.00093    
## 
## P value adjustment method: bonferroni
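
The Bonferroni adjustment multiplies each raw p-value by the number of comparisons and caps the result at 1. With four aquaria there are choose(4, 2) = 6 pairwise comparisons. The sketch below shows this with p.adjust() from base R; the raw p-values in `raw_p` are hypothetical and are not taken from the fish data.

```r
# Hypothetical raw p-values for the six pairwise comparisons among four groups
raw_p <- c(0.40, 0.75, 0.90, 9.0e-05, 7.0e-07, 1.5e-04)

# Built-in Bonferroni adjustment
adj_builtin <- p.adjust(raw_p, method = "bonferroni")

# The same adjustment by hand: multiply by the number of tests, cap at 1
adj_manual <- pmin(1, raw_p * length(raw_p))

print(data.frame(raw_p, adjusted = adj_builtin))
print(all.equal(adj_builtin, adj_manual))
```

This explains why the non-significant comparisons are reported as exactly 1: their raw p-values, multiplied by 6, exceed 1 and are truncated.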

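The same analysis can be run in one step with kruskal() from the agricolae package, which gives more detailed output, including group letters. The results agree with those above: the line labelled "Critical Value" is the chi-squared statistic (26.10292), and Aquarium 30 is the only group assigned a different letter.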

kruskal(y = fish$growth, trt = fish$aquaria, p.adj = 'bonferroni', console = TRUE)
## 
## Study: fish$growth ~ fish$aquaria
## Kruskal-Wallis test's
## Ties or no Ties
## 
## Critical Value: 26.10292
## Degrees of freedom: 3
## Pvalue Chisq  : 9.075687e-06 
## 
## fish$aquaria,  means of the ranks
## 
##             fish.growth  r
## Aquarium 15       57.16 25
## Aquarium 20       64.64 25
## Aquarium 25       54.56 25
## Aquarium 30       25.64 25
## 
## Post Hoc Analysis
## 
## P value adjustment method: bonferroni
## t-Student: 2.694028
## Alpha    : 0.05
## Minimum Significant Difference: 19.26356 
## 
## Treatments with the same letter are not significantly different.
## 
##             fish$growth groups
## Aquarium 20       64.64      a
## Aquarium 15       57.16      a
## Aquarium 25       54.56      a
## Aquarium 30       25.64      b
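
As a cross-check, the test statistic can be reproduced by hand from the mean ranks printed above. The sketch below uses the standard Kruskal-Wallis formula H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), where R_i is the rank sum of group i, and ignores any correction for ties. The variable names are illustrative; the mean ranks and group sizes are copied from the kruskal() output.

```r
# Mean ranks and group sizes copied from the agricolae output above
mean_rank <- c("Aquarium 15" = 57.16, "Aquarium 20" = 64.64,
               "Aquarium 25" = 54.56, "Aquarium 30" = 25.64)
n <- rep(25, length(mean_rank))   # observations per aquarium
N <- sum(n)                       # total number of observations

# Rank sum of each group
R <- mean_rank * n

# Kruskal-Wallis statistic (no tie correction)
H <- 12 / (N * (N + 1)) * sum(R^2 / n) - 3 * (N + 1)

# Upper-tail chi-squared p-value with k - 1 degrees of freedom
p_value <- pchisq(H, df = length(n) - 1, lower.tail = FALSE)

print(H)        # about 26.103, matching the reported statistic
print(p_value)
```

The hand calculation agrees with the chi-squared value reported by both kruskal.test() and kruskal().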