1. A psychologist conducts a study to determine whether or not noise can inhibit learning. Each of 15 subjects is randomly assigned to one of three groups. Each subject is given 20 minutes to memorize a list of 10 nonsense syllables which she is told she will be tested on the following day. The five subjects assigned to Group 1, the no noise condition, study the list of nonsense syllables while they are in a quiet room. The five subjects assigned to Group 2, the moderate noise condition, study the list of nonsense syllables while listening to classical music. The five subjects assigned to Group 3, the extreme noise condition, study the list of nonsense syllables while listening to rock music.
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ purrr   0.3.4 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.4.1 
✔ readr   2.1.2      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
library(ggpubr)
library(rstatix)
library(readxl)
Kruskal_Wallis <- read_excel("D:/MARV BS MATH/4th year, 2nd sem/Nonparametric Statistics/Final Exam/Kruskal-Wallis.xlsx")
Kruskal_Wallis
# A tibble: 5 × 4
  Subject Condition1 Condition2 Condition3
    <dbl>      <dbl>      <dbl>      <dbl>
1       1          8          7          4
2       2         10          8          8
3       3          9          5          7
4       4         10          8          5
5       5          9          5          7
F <- Kruskal_Wallis %>%
  gather(key = "Condition", value = "score", Condition1, Condition2, Condition3) %>%
  convert_as_factor(Subject, Condition)
head(F, 5)
# A tibble: 5 × 3
  Subject Condition  score
  <fct>   <fct>      <dbl>
1 1       Condition1     8
2 2       Condition1    10
3 3       Condition1     9
4 4       Condition1    10
5 5       Condition1     9
head(F)
# A tibble: 6 × 3
  Subject Condition  score
  <fct>   <fct>      <dbl>
1 1       Condition1     8
2 2       Condition1    10
3 3       Condition1     9
4 4       Condition1    10
5 5       Condition1     9
6 1       Condition2     7
summary(Kruskal_Wallis)
    Subject    Condition1     Condition2    Condition3 
 Min.   :1   Min.   : 8.0   Min.   :5.0   Min.   :4.0  
 1st Qu.:2   1st Qu.: 9.0   1st Qu.:5.0   1st Qu.:5.0  
 Median :3   Median : 9.0   Median :7.0   Median :7.0  
 Mean   :3   Mean   : 9.2   Mean   :6.6   Mean   :6.2  
 3rd Qu.:4   3rd Qu.:10.0   3rd Qu.:8.0   3rd Qu.:7.0  
 Max.   :5   Max.   :10.0   Max.   :8.0   Max.   :8.0  
set.seed(12345)
F %>% sample_n_by(Condition, size = 5)
# A tibble: 15 × 3
   Subject Condition  score
   <fct>   <fct>      <dbl>
 1 3       Condition1     9
 2 4       Condition1    10
 3 2       Condition1    10
 4 5       Condition1     9
 5 1       Condition1     8
 6 2       Condition2     8
 7 1       Condition2     7
 8 3       Condition2     5
 9 5       Condition2     5
10 4       Condition2     8
11 2       Condition3     8
12 5       Condition3     7
13 3       Condition3     7
14 4       Condition3     5
15 1       Condition3     4
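# Put the conditions in the intended order; the level names must match those in the data
F1 <- F %>%
  reorder_levels(Condition, order = c("Condition1", "Condition2", "Condition3"))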

Summary statistics

Compute summary statistics by groups:

F %>% 
  group_by(Condition) %>%
  get_summary_stats(score, type = "common")
# A tibble: 3 × 11
  Condition  variable     n   min   max median   iqr  mean    sd    se    ci
  <fct>      <fct>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Condition1 score        5     8    10      9     1   9.2 0.837 0.374  1.04
2 Condition2 score        5     5     8      7     3   6.6 1.52  0.678  1.88
3 Condition3 score        5     4     8      7     2   6.2 1.64  0.735  2.04

Visualization

ggboxplot(F, x = "Condition", y = "score")

Computation

res.kruskal <- F %>% kruskal_test(score ~ Condition)
res.kruskal
# A tibble: 1 × 6
  .y.       n statistic    df      p method        
* <chr> <int>     <dbl> <int>  <dbl> <chr>         
1 score    15      8.75     2 0.0126 Kruskal-Wallis
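
As a cross-check on the statistic above, the tie-corrected Kruskal-Wallis H can be worked by hand from the 15 scores. This is a sketch that assumes midranks for tied scores and the usual tie correction. The symbols R_j (rank sum of group j), n_j (group size), N (total sample size), and t_i (number of scores sharing the i-th tied value) are notation introduced here and do not appear in the output.

```latex
% Midranks of the pooled scores: 4 -> 1, 5 -> 3, 7 -> 6, 8 -> 9.5, 9 -> 12.5, 10 -> 14.5
\begin{align*}
R_1 &= 63.5, \quad R_2 = 31, \quad R_3 = 25.5, \qquad N = 15,\ n_j = 5 \\
H &= \frac{12}{N(N+1)} \sum_{j=1}^{3} \frac{R_j^2}{n_j} - 3(N+1)
   = \frac{12}{240} \cdot \frac{4032.25 + 961 + 650.25}{5} - 48 = 8.435 \\
C &= 1 - \frac{\sum_i (t_i^3 - t_i)}{N^3 - N}
   = 1 - \frac{24 + 24 + 60 + 6 + 6}{3360} \approx 0.9643 \\
H_c &= \frac{H}{C} \approx \frac{8.435}{0.9643} \approx 8.75
\end{align*}
```

With k - 1 = 2 degrees of freedom, this agrees with the statistic and df reported by kruskal_test().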

Effect size

F %>% kruskal_effsize(score ~ Condition)
# A tibble: 1 × 5
  .y.       n effsize method  magnitude
* <chr> <int>   <dbl> <chr>   <ord>    
1 score    15   0.562 eta2[H] large    

A large effect size is detected: eta2[H] = 0.562.
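
The effect size can be reproduced from the Kruskal-Wallis statistic. The sketch below uses the eta-squared formula based on H, with k the number of groups and n the total number of observations. The unrounded value H ≈ 8.7474 is an assumption obtained by recomputing the statistic by hand, since the output only shows 8.75.

```latex
\begin{align*}
\eta^2_H &= \frac{H - k + 1}{n - k}
          = \frac{8.7474 - 3 + 1}{15 - 3}
          = \frac{6.7474}{12} \approx 0.562
\end{align*}
```

The result matches the effsize column above.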

# Pairwise comparisons
pwc <- F %>% 
  dunn_test(score ~ Condition, p.adjust.method = "bonferroni") 
pwc
# A tibble: 3 × 9
  .y.   group1     group2        n1    n2 statistic       p  p.adj p.adj.signif
* <chr> <chr>      <chr>      <int> <int>     <dbl>   <dbl>  <dbl> <chr>       
1 score Condition1 Condition2     5     5    -2.34  0.0193  0.0578 ns          
2 score Condition1 Condition3     5     5    -2.74  0.00621 0.0186 *           
3 score Condition2 Condition3     5     5    -0.396 0.692   1      ns          

Only the difference between Condition1 and Condition3 is statistically significant (Dunn's test, Bonferroni-adjusted p = 0.019).
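
To show where the Dunn statistics and adjusted p-values come from, here is a hand calculation for the Condition1 vs Condition3 comparison. It is a sketch under stated assumptions: mean ranks are computed from the pooled midranks (12.7, 6.2, and 5.1 for Conditions 1 to 3), the standard error includes the tie correction, and the sign convention (group2 minus group1) is inferred from the signs in the table rather than taken from documentation. The symbols are introduced here for illustration.

```latex
\begin{align*}
SE &= \sqrt{\left[\frac{N(N+1)}{12} - \frac{\sum_i (t_i^3 - t_i)}{12(N-1)}\right]
      \left(\frac{1}{n_1} + \frac{1}{n_3}\right)}
    = \sqrt{\left(20 - \frac{120}{168}\right)\cdot\frac{2}{5}} \approx 2.777 \\
z_{13} &= \frac{\bar{R}_3 - \bar{R}_1}{SE} = \frac{5.1 - 12.7}{2.777} \approx -2.74 \\
p_{\text{adj}} &= \min(1,\ m \cdot p) = \min(1,\ 3 \times 0.00621) \approx 0.0186
\end{align*}
```

Here m = 3 is the number of pairwise comparisons. The same standard error applies to all three pairs because the groups are equal in size, which gives z ≈ -2.34 for Condition1 vs Condition2 and z ≈ -0.396 for Condition2 vs Condition3.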

Pairwise comparisons using Wilcoxon’s test:

pwc2 <- F %>% 
  wilcox_test(score ~ Condition, p.adjust.method = "bonferroni")
pwc2
# A tibble: 3 × 9
  .y.   group1     group2        n1    n2 statistic     p p.adj p.adj.signif
* <chr> <chr>      <chr>      <int> <int>     <dbl> <dbl> <dbl> <chr>       
1 score Condition1 Condition2     5     5      24   0.019 0.057 ns          
2 score Condition1 Condition3     5     5      24.5 0.015 0.045 *           
3 score Condition2 Condition3     5     5      15   0.664 1     ns          

The pairwise comparisons show that only Condition1 and Condition3 differ significantly (Wilcoxon's test, Bonferroni-adjusted p = 0.045).

Interpretation

Scores differed significantly among the three noise conditions, as assessed by the Kruskal-Wallis test (H = 8.75, df = 2, p = 0.013). Pairwise Wilcoxon tests with Bonferroni correction showed that only the difference between Condition1 (no noise) and Condition3 (extreme noise) was significant (adjusted p = 0.045).

# Visualization: box plots with p-values
pwc <- pwc %>% add_xy_position(x = "Condition")
ggboxplot(F, x = "Condition", y = "score") +
  stat_pvalue_manual(pwc, hide.ns = TRUE) +
  labs(
    subtitle = get_test_label(res.kruskal, detailed = TRUE),
    caption = get_pwc_label(pwc)
    )