RM Anova

Introduction

The repeated measures ANOVA makes the following assumptions about the data:

  • No significant outliers in any cell of the design. This can be checked by visualizing the data using box plot methods and by using the function identify_outliers() [rstatix package].

  • Normality: the outcome (or dependent) variable should be approximately normally distributed in each cell of the design. This can be checked using the Shapiro-Wilk normality test (shapiro_test() [rstatix]) or by visual inspection using QQ plot (ggqqplot() [ggpubr package]).

  • Assumption of sphericity: the variance of the differences between groups should be equal. This can be checked using the Mauchly’s test of sphericity, which is automatically reported when using the R function anova_test() [rstatix package].

Note that, if the above assumptions are not met there are a non-parametric alternative (Friedman test) to the one-way repeated measures ANOVA!

Unfortunately, there are no non-parametric alternatives to the two-way and the three-way repeated measures ANOVA. Thus, in the situation where the assumptions are not met, you could consider running the two-way/three-way repeated measures ANOVA on the transformed and non-transformed data to see if there are any meaningful differences.

If both tests lead you to the same conclusions, you might not choose to transform the outcome variable and carry on with the two-way/three-way repeated measures ANOVA on the original data.

It’s also possible to perform robust ANOVA test using the WRS2 R package.

No matter your choice, you should report what you did in your results.

RM Anova in R

Key R functions:

  • anova_test() [rstatix package], a wrapper around car::Anova() for making easy the computation of repeated measures ANOVA. Key arguments for performing repeated measures ANOVA:

    • data: data frame

    • dv: (numeric) the dependent (or outcome) variable name.

    • wid: variable name specifying the case/sample identifier.

    • within: within-subjects factor or grouping variable

  • get_anova_table() [rstatix package]. Extracts the ANOVA table from the output of anova_test(). It returns ANOVA table that is automatically corrected for eventual deviation from the sphericity assumption. The default is to apply automatically the Greenhouse-Geisser sphericity correction to only within-subject factors violating the sphericity assumption (i.e., Mauchly’s test p-value is significant, p <= 0.05). Read more in Chapter @ref(mauchly-s-test-of-sphericity-in-r).

1-way RM Anova

The dataset “selfesteem” contains 10 individuals’ self-esteem stress on three music points during a specific diet to determine whether their self-esteem improved.

data("selfesteem", package = "datarium")
head(selfesteem, 3)
## # A tibble: 3 x 4
##      id    t1    t2    t3
##   <int> <dbl> <dbl> <dbl>
## 1     1  4.01  5.18  7.11
## 2     2  2.56  6.91  6.31
## 3     3  3.24  4.44  9.78

The one-way repeated measures ANOVA can be used to determine whether the means self-esteem stresss are significantly different between the three music points. So let’s convert this data frame into long format:

selfesteem <- selfesteem %>%
  gather(key = "time", value = "score", t1, t2, t3) %>%
  convert_as_factor(id, time)
head(selfesteem, 3)
## # A tibble: 3 x 3
##   id    time  score
##   <fct> <fct> <dbl>
## 1 1     t1     4.01
## 2 2     t1     2.56
## 3 3     t1     3.24

Descriptive statistics

selfesteem %>%
  group_by(time) %>%
  get_summary_stats(score, type = "mean_sd")
## # A tibble: 3 x 5
##   time  variable     n  mean    sd
##   <fct> <chr>    <dbl> <dbl> <dbl>
## 1 t1    score       10  3.14 0.552
## 2 t2    score       10  4.93 0.863
## 3 t3    score       10  7.64 1.14
bxp <- ggboxplot(selfesteem, x = "time", y = "score", add = "point")
bxp

Assumptions

outliers

selfesteem %>%
  group_by(time) %>%
  identify_outliers(score)
## # A tibble: 2 x 5
##   time  id    score is.outlier is.extreme
##   <fct> <fct> <dbl> <lgl>      <lgl>     
## 1 t1    6      2.05 TRUE       FALSE     
## 2 t2    2      6.91 TRUE       FALSE

normality

selfesteem %>%
  group_by(time) %>%
  shapiro_test(score)
## # A tibble: 3 x 4
##   time  variable statistic     p
##   <fct> <chr>        <dbl> <dbl>
## 1 t1    score        0.967 0.859
## 2 t2    score        0.876 0.117
## 3 t3    score        0.923 0.380
ggqqplot(selfesteem, "score", facet.by = "time")

Anova

results <- anova_test(data = selfesteem, dv = score, wid = id, within = time)
get_anova_table(results)
## ANOVA Table (type III tests)
## 
##   Effect DFn DFd      F        p p<.05   ges
## 1   time   2  18 55.469 2.01e-08     * 0.829

Post-hoc tests

pwc <- selfesteem %>%
  pairwise_t_test(
    score ~ time, paired = TRUE,
    p.adjust.method = "bonferroni"
    )
pwc
## # A tibble: 3 x 10
##   .y.   group1 group2    n1    n2 statistic    df         p   p.adj p.adj.signif
## * <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>     <dbl>   <dbl> <chr>       
## 1 score t1     t2        10    10     -4.97     9   7.72e-4 2.00e-3 **          
## 2 score t1     t3        10    10    -13.2      9   3.34e-7 1.00e-6 ****        
## 3 score t2     t3        10    10     -4.87     9   8.86e-4 3.00e-3 **

Conclusions

pwc <- pwc %>% add_xy_position(x = "time")
bxp + 
  stat_pvalue_manual(pwc) +
  labs(
    subtitle = get_test_label(results, detailed = TRUE),
    caption = get_pwc_label(pwc)
  )

2-way RM Anova

For Two-Way Repeated Measures ANOVA, “Two-way” means that there are two factors in the experiment, for example, different images and different conditions. “Repeated-measures” means that the same subject received more than one image and/or more than one condition. Similar to two-way ANOVA, two-way repeated measures ANOVA can be employed to test for significant differences between the factor level means within a factor and for interactions between factors.

Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures, and the data violates the ANOVA assumption of independence. Two-Way Repeated Measures ANOVA designs can be two repeated measures factors, or one repeated measures factor and one non-repeated factor. If any repeated factor is present, then the repeated measures ANOVA should be used.

Please apply Two-way RM-ANOVA to analyze if any significant interactions (between music and music, music and image, music and image, or music and music and image)! Use the following data set:

set.seed(5250)
myData <- data.frame(PID = rep(seq(from = 1,
                               to = 60, by = 1), 20),
                     stress = sample(x = 1:100,
                                     size = 1200,
                                     replace = TRUE),
                     image = sample(c("Happy", "Angry"),
                                    size = 1200,
                                    replace = TRUE),
                     music = sample(c("Disney", "Horror"),
                                    size = 1200,
                                    replace = TRUE)
)
myData <- within(myData, {
  PID   <- factor(PID)
  image <- factor(image)
  music <- factor(music)
})
myData <- myData[order(myData$PID), ]
head(myData)
##     PID stress image  music
## 1     1     90 Happy Horror
## 61    1      7 Angry Disney
## 121   1     31 Happy Disney
## 181   1     68 Angry Disney
## 241   1      6 Happy Disney
## 301   1     80 Angry Horror

First Experiment (Wrong)

After following test, we need delete people who PID = 30 which has different conditions with others.

a<-myData
tapply(a$PID,INDEX = a$PID,FUN = length)
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
## 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 
## 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 
## 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 
## 53 54 55 56 57 58 59 60 
## 20 20 20 20 20 20 20 20
myData1 <- filter(myData,PID!=30)
bxp <- ggboxplot(
  myData1, x = "music", y = "stress", color = "image")
bxp

Assumptions

outliers
myData1 %>%
  group_by(image, music) %>%
  identify_outliers(stress)
## [1] image      music      PID        stress     is.outlier is.extreme
## <0 行> (或0-长度的row.names)
normality
myData1 %>%
  group_by(image, music) %>%
  shapiro_test(stress)
## # A tibble: 4 x 5
##   image music  variable statistic             p
##   <fct> <fct>  <chr>        <dbl>         <dbl>
## 1 Angry Disney stress       0.944 0.00000000210
## 2 Angry Horror stress       0.958 0.000000135  
## 3 Happy Disney stress       0.954 0.0000000880 
## 4 Happy Horror stress       0.957 0.000000124

p < 0.05 which means not normal, we can try log and sqrt.But it is still unnormality.So there is something wrong in this data. But let’s ignore it firstly to do RM ANOVA Test.

# myData1$stress = sqrt(myData1$stress)
# myData1 %>%
#   group_by(image, music) %>%
#   shapiro_test(stress)
myData1$stress = log(myData1$stress)
myData1 %>%
  group_by(image, music) %>%
  shapiro_test(stress)
## # A tibble: 4 x 5
##   image music  variable statistic        p
##   <fct> <fct>  <chr>        <dbl>    <dbl>
## 1 Angry Disney stress       0.871 2.44e-15
## 2 Angry Horror stress       0.823 9.83e-18
## 3 Happy Disney stress       0.874 1.92e-14
## 4 Happy Horror stress       0.860 1.05e-15
ggqqplot(myData1, "stress", ggtheme = theme_bw()) +
  facet_grid(music ~ image, labeller = "label_both")

also showed it is not normal.

Anova

However, we find it has nothing to do with wid, this is not RM ANOVA in within subject test, so we need do the second experiment.

results <- anova_test(
  data = myData1, stress ~ image * music, wid = PID
  )
## Coefficient covariances computed by hccm()
get_anova_table(results)
## ANOVA Table (type II tests)
## 
##        Effect DFn  DFd     F     p p<.05      ges
## 1       image   1 1176 1.439 0.231       1.00e-03
## 2       music   1 1176 0.110 0.740       9.37e-05
## 3 image:music   1 1176 3.870 0.049     * 3.00e-03
results2 <- anova_test(
  data = myData1, stress ~ image * music
  )
## Coefficient covariances computed by hccm()
get_anova_table(results2)
## ANOVA Table (type II tests)
## 
##        Effect DFn  DFd     F     p p<.05      ges
## 1       image   1 1176 1.439 0.231       1.00e-03
## 2       music   1 1176 0.110 0.740       9.37e-05
## 3 image:music   1 1176 3.870 0.049     * 3.00e-03

Second Experiment

As the definition of within subject design, every group should do experiment of all condition and only have one value for each condition. So we still need delete the PID = 30 (we have done in the first experiment). And then I choose mean of each condition.

myData2<-aggregate(myData$stress, by=list(PID = myData$PID,image = myData$image,music = myData$music),median)
colnames(myData2)[4] = "stress"
myData2 <- filter(myData2,PID!=30)

Descriptive statistics

myData2 %>%
  group_by(image, music) %>%
  get_summary_stats(stress, type = "mean_sd")
## # A tibble: 4 x 6
##   image music  variable     n  mean    sd
##   <fct> <fct>  <chr>    <dbl> <dbl> <dbl>
## 1 Angry Disney stress      59  48.9  18.3
## 2 Angry Horror stress      59  53.9  18.5
## 3 Happy Disney stress      59  47.1  19.4
## 4 Happy Horror stress      59  47.5  20.5
bxp <- ggboxplot(
  myData2, x = "music", y = "stress", color = "image")
bxp

Assumptions

outliers
myData2 %>%
  group_by(image, music) %>%
  identify_outliers(stress)
## [1] image      music      PID        stress     is.outlier is.extreme
## <0 行> (或0-长度的row.names)

There isn’t extrem ourliers. ##### normality

myData2 %>%
  group_by(image, music) %>%
  shapiro_test(stress)
## # A tibble: 4 x 5
##   image music  variable statistic     p
##   <fct> <fct>  <chr>        <dbl> <dbl>
## 1 Angry Disney stress       0.988 0.827
## 2 Angry Horror stress       0.987 0.766
## 3 Happy Disney stress       0.977 0.315
## 4 Happy Horror stress       0.983 0.579

p > 0.05 which means normal, so except for “Happy” “Disney” are normal.

We can look at the plot.

ggqqplot(myData2, "stress", ggtheme = theme_bw()) +
  facet_grid(music ~ image, labeller = "label_both")

From the plot above, as all the points fall approximately along the reference line, we can assume normality.

Anova

results <- anova_test(
  data = myData2, stress ~ image * music, wid = PID
  )
## Coefficient covariances computed by hccm()
get_anova_table(results)
## ANOVA Table (type II tests)
## 
##        Effect DFn DFd     F     p p<.05   ges
## 1       image   1 232 2.731 0.100       0.012
## 2       music   1 232 1.166 0.281       0.005
## 3 image:music   1 232 0.841 0.360       0.004

There isn’t a statistically significant two-way interactions between treatment and time.

Post-hoc tests

The interaction is not significant. However, there is also no statistically significant main effects, so we dont need to do Post-hoc tests s below which can help to decide which is main effect.

# # comparisons for image variable
# myData2 %>%
#   pairwise_t_test(
#     stress ~ image, paired = TRUE,
#     p.adjust.method = "bonferroni"
#     )
# # comparisons for music variable
# myData2 %>%
#   pairwise_t_test(
#     stress ~ music, paired = TRUE,
#     p.adjust.method = "bonferroni"
#     )

Conclusion:

There is some wrong with the data, but after do exercise as above I can get trainning in the RM ANOVA,
and the result is:
There was not a statistically significant interaction between image and music on stress. And, there is also no statistically significant main effects.