Kruskal-Wallis test

The Kruskal-Wallis test is a non-parametric alternative to the one-way ANOVA test. It extends the two-sample Wilcoxon test to situations where there are more than two groups to compare, and it is the option to use when the assumptions of the one-way ANOVA test are not met.

This article describes how to compute and interpret the Kruskal-Wallis test using the R statistical software. It also shows how to calculate the effect size based on the Kruskal-Wallis H statistic.

Contents: Prerequisites, Data preparation, Summary statistics, Visualization, Computation, Effect size, Multiple pairwise comparisons, Report

Prerequisites

To carry out the Kruskal-Wallis test in R, prepare the working environment by installing and loading the following packages:

tidyverse: for data manipulation and visualization.
ggpubr: for easily creating publication-ready plots.
rstatix: provides pipe-friendly R functions for easy statistical analyses.

Load the required packages:

library(tidyverse)
library(ggpubr)
library(rstatix)
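
Before loading, it can help to check that the three packages are actually installed. The following is a minimal sketch using base R's requireNamespace(); the variable names `pkgs` and `missing_pkgs` are illustrative and not part of the original article.

```r
# Packages used in this article
pkgs <- c("tidyverse", "ggpubr", "rstatix")

# requireNamespace() returns FALSE (instead of stopping) when a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

# Only print an installation hint; nothing is installed automatically
if (length(missing_pkgs) > 0) {
  message("Install with: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
}
```

The sketch only reports what is missing, so it is safe to run in any session.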

Data preparation

This article uses the built-in R data set PlantGrowth, which contains the weights of plants obtained under a control and two different treatment conditions.

data("PlantGrowth")
View(PlantGrowth)
# Show one random row per group, using the pipe operator
PlantGrowth %>% sample_n_by(group, size = 1)
## # A tibble: 3 x 2
##   weight group
##    <dbl> <fct>
## 1   4.5  ctrl 
## 2   4.32 trt1 
## 3   5.12 trt2

Re-order the group levels:

# The pipe operator is used throughout the article
PlantGrowth <- PlantGrowth %>%
  reorder_levels(group, order = c("ctrl", "trt1", "trt2"))
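
For readers who prefer base R, the same re-ordering can be sketched with factor(). This is an assumed equivalent of the rstatix call above, not the article's own method, and the two checks at the end are added here purely for illustration.

```r
data("PlantGrowth")

# Assumption: base R factor() used as a stand-in for rstatix::reorder_levels()
PlantGrowth$group <- factor(PlantGrowth$group,
                            levels = c("ctrl", "trt1", "trt2"))

levels(PlantGrowth$group)  # confirm the order of the levels
table(PlantGrowth$group)   # number of observations per group
```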

Summary statistics

Compute summary statistics by group:
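
The code that produced the table that follows is not shown in the article. Its columns (n, min, max, median, iqr, mean, sd, se, ci) are consistent with the rstatix function get_summary_stats() called with type = "common", so a plausible sketch, under that assumption, is the following. The name `summary_stats` is illustrative.

```r
library(tidyverse)
library(rstatix)

data("PlantGrowth")

# Group by treatment, then summarise weight with the "common" set of statistics
summary_stats <- PlantGrowth %>%
  group_by(group) %>%
  get_summary_stats(weight, type = "common")
summary_stats
```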

## # A tibble: 3 x 11
##   group variable     n   min   max median   iqr  mean    sd    se    ci
##   <fct> <chr>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ctrl  weight      10  4.17  6.11   5.16 0.743  5.03 0.583 0.184 0.417
## 2 trt1  weight      10  3.59  6.03   4.55 0.662  4.66 0.794 0.251 0.568
## 3 trt2  weight      10  4.92  6.31   5.44 0.467  5.53 0.443 0.14  0.317

Visualization

A boxplot is well suited to this type of data. Create a boxplot of plant weight by group.
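
The plotting code is not shown in the article. Since ggpubr is loaded in the prerequisites, one way to draw the boxplot is sketched below with ggboxplot(); the object name `bxp` is an assumption made for this example.

```r
library(ggpubr)

data("PlantGrowth")

# Boxplot of plant weight for each treatment group
bxp <- ggboxplot(PlantGrowth, x = "group", y = "weight")
bxp
```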

Computation

Before carrying out the Kruskal-Wallis test, clearly state the objective of the test. The objective in this article is to determine whether there is any significant difference between the weights of plants in the three experimental conditions.

This article uses the pipe-friendly kruskal_test() function from the rstatix package, which is a wrapper around the base R function kruskal.test().

kruskal <- PlantGrowth %>% kruskal_test(weight ~ group)
kruskal
## # A tibble: 1 x 6
##   .y.        n statistic    df      p method        
## * <chr>  <int>     <dbl> <int>  <dbl> <chr>         
## 1 weight    30      7.99     2 0.0184 Kruskal-Wallis
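
Because kruskal_test() wraps kruskal.test(), the same result can be obtained with base R alone. The sketch below shows that call; the object name `res` is illustrative.

```r
data("PlantGrowth")

# Base R Kruskal-Wallis rank sum test using the same formula interface
res <- kruskal.test(weight ~ group, data = PlantGrowth)

res$statistic  # Kruskal-Wallis H statistic
res$parameter  # degrees of freedom
res$p.value    # p-value
```

The base function returns an "htest" object rather than a tibble, which is the main practical difference from the rstatix wrapper.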

Effect size

## # A tibble: 1 x 5
##   .y.        n effsize method  magnitude
## * <chr>  <int>   <dbl> <chr>   <ord>    
## 1 weight    30   0.222 eta2[H] large
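
The call that produced the table above is not shown. Its layout matches the output of the rstatix function kruskal_effsize(), so a likely form, stated here as an assumption, is the following. The name `effsize` is illustrative.

```r
library(tidyverse)
library(rstatix)

data("PlantGrowth")

# Eta squared based on the Kruskal-Wallis H statistic
effsize <- PlantGrowth %>% kruskal_effsize(weight ~ group)
effsize
```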

The eta squared, based on the H statistic, can be used as a measure of the Kruskal-Wallis test effect size. It is calculated as follows: \(\eta^2[H] = (H - k + 1) / (n - k)\), where \(H\) is the value obtained in the Kruskal-Wallis test, \(k\) is the number of groups, and \(n\) is the total number of observations (M. T. Tomczak and Tomczak 2014).
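
As a check on the formula, the effect size can be computed by hand from the test statistic. With H = 7.99, k = 3 and n = 30, it gives (7.99 - 3 + 1) / (30 - 3), which is approximately 0.222. The sketch below does the same in base R; the variable names are illustrative.

```r
data("PlantGrowth")

# H statistic from the base R Kruskal-Wallis test
H <- unname(kruskal.test(weight ~ group, data = PlantGrowth)$statistic)
k <- nlevels(PlantGrowth$group)  # number of groups
n <- nrow(PlantGrowth)           # total number of observations

# Eta squared based on H
eta2_H <- (H - k + 1) / (n - k)
round(eta2_H, 3)
```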

The eta squared estimate takes values from 0 to 1; multiplied by 100, it indicates the percentage of variance in the dependent variable explained by the independent variable.

Commonly used thresholds for interpreting eta squared are: 0.01 to < 0.06 (small effect), 0.06 to < 0.14 (moderate effect), and >= 0.14 (large effect).
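
These thresholds can be applied programmatically. The helper below, `eta2_magnitude()`, is a hypothetical function written for this example and is not part of rstatix. It assumes that values below 0.01, which the ranges above do not cover, should return NA.

```r
# Hypothetical helper: map an eta squared value to the magnitude labels above.
# Assumption: values below 0.01 fall outside the stated ranges and return NA.
eta2_magnitude <- function(eta2) {
  cut(eta2,
      breaks = c(0.01, 0.06, 0.14, Inf),
      labels = c("small", "moderate", "large"),
      right = FALSE)  # intervals are closed on the left, e.g. [0.06, 0.14)
}

eta2_magnitude(c(0.03, 0.10, 0.222))
```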

Multiple Pairwise-Comparisons

The Kruskal-Wallis test gives limited information: it does not tell which groups differ significantly from one another.

To locate the differences, the article uses Dunn's test to identify which pairs of groups are different.

# pairwise-comparisons
pwr_c <- PlantGrowth %>% dunn_test(weight ~ group, p.adjust.method = "bonferroni")
pwr_c
## # A tibble: 3 x 9
##   .y.    group1 group2    n1    n2 statistic       p  p.adj p.adj.signif
## * <chr>  <chr>  <chr>  <int> <int>     <dbl>   <dbl>  <dbl> <chr>       
## 1 weight ctrl   trt1      10    10     -1.12 0.264   0.791  ns          
## 2 weight ctrl   trt2      10    10      1.69 0.0912  0.273  ns          
## 3 weight trt1   trt2      10    10      2.81 0.00500 0.0150 *

Alternatively, the Wilcoxon test can be used to calculate pairwise comparisons between group levels, with a correction for multiple testing. Using the Wilcoxon test:

pwr_cc <- PlantGrowth %>% wilcox_test(weight ~ group, p.adjust.method="bonferroni")
pwr_cc
## # A tibble: 3 x 9
##   .y.    group1 group2    n1    n2 statistic     p p.adj p.adj.signif
## * <chr>  <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <chr>       
## 1 weight ctrl   trt1      10    10      67.5 0.199 0.597 ns          
## 2 weight ctrl   trt2      10    10      25   0.063 0.189 ns          
## 3 weight trt1   trt2      10    10      16   0.009 0.027 *
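
The p.adj column reflects the Bonferroni correction, which multiplies each raw p-value by the number of comparisons (three here) and caps the result at 1. The sketch below reproduces the adjusted values from the rounded p-values in the Wilcoxon table above, using the base R function p.adjust(); the variable names are illustrative.

```r
# Raw p-values copied from the Wilcoxon table above (rounded as displayed)
p_raw <- c(0.199, 0.063, 0.009)

# Bonferroni adjustment with base R
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# The same adjustment written out by hand: multiply by the number of tests, cap at 1
p_manual <- pmin(1, p_raw * length(p_raw))

p_bonf
```

Applying the same calculation to the displayed Dunn p-values gives slightly different last digits than the table, because the table shows rounded values.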

Both tests show that only the difference between trt1 and trt2 is significant (Dunn's test: adjusted p = 0.015; Wilcoxon test: adjusted p = 0.027).

Report

There was a statistically significant difference between treatment groups as assessed using the Kruskal-Wallis test (p = 0.018). Pairwise Wilcoxon tests and Dunn's test between groups showed that only the difference between the trt1 and trt2 groups was significant.

The overall result can be visualized as a boxplot annotated with the significant differences between groups.
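
The plotting code is not included in the article. One possible sketch, assuming the rstatix helpers add_xy_position(), get_test_label() and get_pwc_label() together with stat_pvalue_manual() from ggpubr, is shown below. The object name `final_plot` is illustrative, and the tests are recomputed so that the block runs on its own.

```r
library(tidyverse)
library(ggpubr)
library(rstatix)

data("PlantGrowth")

# Recompute the overall test and the Dunn pairwise comparisons
kruskal <- PlantGrowth %>% kruskal_test(weight ~ group)
pwr_c <- PlantGrowth %>%
  dunn_test(weight ~ group, p.adjust.method = "bonferroni") %>%
  add_xy_position(x = "group")  # bracket positions for the plot

# Boxplot with significant pairwise comparisons and test labels
final_plot <- ggboxplot(PlantGrowth, x = "group", y = "weight") +
  stat_pvalue_manual(pwr_c, hide.ns = TRUE) +
  labs(subtitle = get_test_label(kruskal, detailed = TRUE),
       caption = get_pwc_label(pwr_c))
final_plot
```

Setting hide.ns = TRUE keeps only the significant comparison on the plot, which here is trt1 versus trt2.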