We’ll use the self-esteem score dataset, measured at three time points. The data are available in the datarium package.
# Assumes tidyverse, ggpubr, rstatix and rmarkdown (for paged_table) are already loaded
data("selfesteem", package = "datarium")
paged_table(head(selfesteem, 3))
Gather columns t1, t2 and t3 into long format. Convert id and time variables into factor (or grouping) variables:
selfesteem <- selfesteem %>%
  gather(key = "time", value = "score", t1, t2, t3) %>%
  convert_as_factor(id, time)
paged_table(head(selfesteem, 3))
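The tidyr documentation marks gather() as superseded by pivot_longer(). As a minimal sketch of the same reshaping step, assuming a small made-up data frame (the values and the names `wide` and `long` are hypothetical, not taken from the selfesteem data):

```r
library(tidyr)
library(dplyr)

# Hypothetical toy data in wide format: one row per subject, one column per time point
wide <- data.frame(
  id = 1:3,
  t1 = c(4.0, 2.6, 3.2),
  t2 = c(5.1, 4.9, 5.6),
  t3 = c(7.1, 6.8, 7.9)
)

long <- wide %>%
  # Stack t1, t2 and t3 into a "time" key column and a "score" value column
  pivot_longer(cols = c(t1, t2, t3), names_to = "time", values_to = "score") %>%
  # Convert the subject and time identifiers to factors
  mutate(id = factor(id), time = factor(time)) %>%
  # pivot_longer() orders rows by subject; sort by time, then id, as gather() would
  arrange(time, id)
```

The final arrange() keeps subjects in the same order within each time point, which matters later because the paired tests match observations by position.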
Compute some summary statistics of the self-esteem score by groups (time):
a <- selfesteem %>%
  group_by(time) %>%
  get_summary_stats(score, type = "common")
paged_table(a)
Create a box plot and add points corresponding to individual values:
ggboxplot(selfesteem, x = "time", y = "score", add = "jitter")
We’ll use the pipe-friendly friedman_test() function [rstatix package], a wrapper around the R base function friedman.test().
res.fried <- selfesteem %>% friedman_test(score ~ time | id)
paged_table(res.fried)
The self-esteem score was statistically significantly different at the different time points during the diet, χ²(2) = 18.2, p = 0.0001.
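To see what the wrapper delegates to, here is a minimal sketch that calls base R's friedman.test() directly, through both its matrix and its formula interface. The toy matrix `scores` is hypothetical; it is built so that every subject ranks t1 < t2 < t3.

```r
# Hypothetical toy data: 5 subjects (rows) measured at 3 time points (columns)
scores <- matrix(
  c(3.1, 4.0, 5.2,
    2.8, 3.9, 4.7,
    3.5, 4.4, 6.0,
    2.9, 3.6, 5.1,
    3.3, 4.8, 5.5),
  ncol = 3, byrow = TRUE,
  dimnames = list(id = 1:5, time = c("t1", "t2", "t3"))
)

# Matrix interface: rows are blocks (subjects), columns are groups (time points)
ft_matrix <- friedman.test(scores)

# Formula interface on the same data reshaped to long format (columns id, time, score)
long <- as.data.frame(as.table(scores), responseName = "score")
ft_formula <- friedman.test(score ~ time | id, data = long)

# Both calls rank the scores within each subject and return the same statistic
ft_matrix$statistic
ft_formula$statistic
```

Because all five subjects agree perfectly on the ordering, the statistic here reaches its maximum of n(k - 1) = 10 with 2 degrees of freedom.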
Kendall’s W can be used as the measure of the Friedman test effect size. It is calculated as follows: W = χ² / (N(k - 1)), where W is Kendall’s W value, χ² is the Friedman test statistic, N is the sample size, and k is the number of measurements per subject (M. T. Tomczak and Tomczak 2014).
Kendall’s W ranges from 0 (indicating no relationship) to 1 (indicating a perfect relationship).
Kendall’s W is interpreted using Cohen’s guidelines: 0.1 to < 0.3 (small effect), 0.3 to < 0.5 (moderate effect), and >= 0.5 (large effect). Confidence intervals are calculated by bootstrap.
b <- selfesteem %>% friedman_effsize(score ~ time | id)
paged_table(b)
A large effect size is detected, W = 0.91.
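As a quick check of the reported value, the Friedman statistic can be plugged into the formula by hand. This assumes N = 10 subjects (the size of the datarium selfesteem data, which is not stated above) and k = 3 time points:

```latex
W = \frac{\chi^2}{N(k - 1)} = \frac{18.2}{10 \times (3 - 1)} = \frac{18.2}{20} = 0.91
```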
From the output of the Friedman test, we know that there is a significant difference between groups, but we don’t know which pairs of groups are different.
A significant Friedman test can be followed up by pairwise Wilcoxon signed-rank tests for identifying which groups are different.
Here we run pairwise comparisons using the paired Wilcoxon signed-rank test. P-values are adjusted using the Bonferroni multiple testing correction method:
pwc <- selfesteem %>%
  wilcox_test(score ~ time, paired = TRUE, p.adjust.method = "bonferroni")
paged_table(pwc)
All the pairwise differences are statistically significant.
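The Bonferroni correction itself is simple: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. A minimal sketch using base R's p.adjust(), with hypothetical unadjusted p-values chosen for illustration only (they are not copied from the output above):

```r
# Hypothetical unadjusted p-values for the three pairwise comparisons
p_raw <- c("t1-t2" = 0.002, "t1-t3" = 0.002, "t2-t3" = 0.004)

# Built-in Bonferroni adjustment
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# The same adjustment by hand: multiply by the number of tests, cap at 1
p_manual <- pmin(1, p_raw * length(p_raw))

p_bonf
```

With three comparisons, an unadjusted p-value must fall below 0.05 / 3 (about 0.0167) to remain significant at the 5% level after adjustment.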
# Same pairwise comparisons using the sign test
pwc2 <- selfesteem %>%
  sign_test(score ~ time, p.adjust.method = "bonferroni")
paged_table(pwc2)
The self-esteem score was statistically significantly different at the different time points according to the Friedman test, χ²(2) = 18.2, p = 0.00011.
Pairwise Wilcoxon signed-rank tests between groups revealed statistically significant differences in self-esteem score between t1 and t2 (p = 0.006), t1 and t3 (p = 0.006), and t2 and t3 (p = 0.012).
pwc <- pwc %>% add_xy_position(x = "time")
ggboxplot(selfesteem, x = "time", y = "score", add = "point") +
  stat_pvalue_manual(pwc, hide.ns = TRUE) +
  labs(
    subtitle = get_test_label(res.fried, detailed = TRUE),
    caption = get_pwc_label(pwc)
  )