If you find this document helpful for your research please do not forget to cite the references listed in this document. For this document itself you can cite Aydın et al. (2018):
Aydın, B., Algina, J., Leite, W. L., & Atilgan, H. (2018). An R Companion: A Compact Introduction for Social Scientists. Ankara: Anı
This procedure can be used when there are score on the same participants under several treatments or at several time points and is then called repeated measures ANOVA. It can also be used in blocking designs and is then called randomized block ANOVA. Compared to between-subjects designs, this procedure is expected to eliminate influence of individual differences, in other words, to reduced variability, and thus, to reduce error. This development results in more power than independent-samples ANOVA with the same sample size. Of course there are issues other than power that must be considered in selecting a design. For example, if the goal is to compare reading achievement under three instructional methods, using a design in which each participant is exposed to the three methods will be problematic because the effect of exposure to one treatment will continue to influence reading ability during the other treatments.
The structural model following Myers et al. (2013) notation for a non-additive model;
\[Y_{ij}=\mu + \eta_i + \alpha_j + (\eta \alpha)_{ij} + \epsilon_{ij}\]
where i represents the individual, i=1,…,n; j represents the levels of the within-subjects factor (i.e, the repeated measurement or the treatment factor), j=1,…,P. Y is the score; \(\mu\) is the grand mean; \(\eta_i\) represents the difference between individual’s average score over the levels and the grand mean; \(\alpha_j\) represents the difference between the average score under level j of the within-subjects factor and the grand mean; \((\eta \alpha)_ij\) represents the interaction, and \(\epsilon_{ij}\) represent the error component. Because \((\eta \alpha)_ij\) and \(\epsilon_{ij}\) have the same subscripts, they are confounded.
Generally, the interest is on \(\alpha_j\). This interest leads to hypothesis testing:
\[H_0: \mu_1 = \mu_2 = \cdots = \mu_P\]
The alternative hypothesis states that at least one population mean is different. The ANOVA table for a one-way with-subjects ANOVA is;
| SV | df | F |
|---|---|---|
| Subjects (S) | \(n-1\) | |
| Waves (A) | \(P-1\) | \(\frac{MS_A}{MS_{SA}}\) |
| SA | \((n-1)(P-1)\) | |
| Total | \(nP-1\) |
Note on additivity The simplest explanation of additivity is the situation that would justify \((\eta \alpha)_ij = 0\) in Equation above. This unrealistic restriction implies that the effect of levels of the within-subject factor waves is the same for all individuals.
Shown in the below table are data for an experiment in which each person participates in four treatments defined by the amount of alcohol consumed. The dependent variable is a reaction time measure.
owadata=data.frame(id=1:8,
noALC=c(1,2,2,2,3,3,3,6),
twoOZ=c(2,3,3,3,4,4,4,5),
fouroz=c(5,5,6,6,6,7,7,8),
sixoz=c(7,8,8,9,9,10,10,11))
knitr::kable(
owadata, booktabs = TRUE,
caption = 'Original Alcohol Data'
)
| id | noALC | twoOZ | fouroz | sixoz |
|---|---|---|---|---|
| 1 | 1 | 2 | 5 | 7 |
| 2 | 2 | 3 | 5 | 8 |
| 3 | 2 | 3 | 6 | 8 |
| 4 | 2 | 3 | 6 | 9 |
| 5 | 3 | 4 | 6 | 9 |
| 6 | 3 | 4 | 7 | 10 |
| 7 | 3 | 4 | 7 | 10 |
| 8 | 6 | 5 | 8 | 11 |
Treatment means, treatment standard deviations, and subject means are shown below. A subject mean is the average of the four scores for that subject.
# set the number of decimals (cosmetic)
options(digits = 2)
#participants mean
apply(owadata,1, mean)
## [1] 3.2 4.0 4.4 4.8 5.4 6.0 6.2 7.6
#treatment mean
apply(owadata[,-1],2, mean)
## noALC twoOZ fouroz sixoz
## 2.8 3.5 6.2 9.0
#treatment sd
apply(owadata[,-1],2,sd)
## noALC twoOZ fouroz sixoz
## 1.49 0.93 1.04 1.31
Because each participant has a score for each treatment, amount of alcohol is a within-subjects factor and it is possible to calculate a correlation for each pair of treatments. These correlations, which are presented in Table 9.2, indicate that reaction time is highly correlated for each pair of treatments. Recall that corresponding to each correlation there is a covariance;
\[Cov_{pp^`}=S_pS_{p^`}r_{pp^`}\]
where \(p\) and \(p'\) are two levels of the alcohol consumption factor. For example the correlation between the scores in the first two levels of the alcohol consumption factor is \(r_{02}=0.93\). And the corresponding covariance is \(Cov_{02}=1.5*0.9*0.93=1.26\)
| noALC | twoOZ | fouroz | sixoz | |
|---|---|---|---|---|
| noALC | 1.00 | 0.93 | 0.88 | 0.88 |
| twoOZ | 0.93 | 1.00 | 0.89 | 0.94 |
| fouroz | 0.88 | 0.89 | 1.00 | 0.95 |
| sixoz | 0.88 | 0.94 | 0.95 | 1.00 |
The alcohol consumption factor is a within-subjects factor. Consequently the F statistic for comparing the four treatment means should be appropriate for a design with a within-subjects factor. Let
P = the number of levels of the within-subjects factor, in the example P=4 ;
\(\bar C\) = the average covariance; in the example \(\bar C\) = 1.26.
The appropriate F statistic is
\[F_W=\frac{MS_A}{MS_{SA}}=\frac{MS_A}{MS_{S/A}- \bar C}\]
where \(MS_A\) and \(MS_{S/A}\) are calculated as they are for a between-subjects factor and the W emphasizes the F statistic is for a within-subjects factor. The critical value is \(F_{\alpha,P-1,(P-1)(n-1)}\) . The denominator mean square, \(MS_{SA}\) , is read mean square Subjects x A where A is the generic label for the treatment factor. Recall that the F statistic for a between-subjects factor is \(F_B=MS_A/MS_{S/A}\). Comparison of \(F_W\) and \(F_B\) shows that \(F_W\) incorporates the correlations between the pairs of treatments and \(F_B\) does not. (Recall that, in like fashion, the dependent samples t statistic incorporates the correlation whereas the independent samples t statistic does not.) As a result, when applied to a design with a within-subjects factor \(F_W \geq F_B\).Therefore, incorrectly using will usually result in a loss of power.
Sphericity is an assumption about the pattern of variances and covariances.If the data are spherical, the difference between each pair of repeated measures has the same variance for all pairs.
Example covariance matrix;
| \(Y_1\) | \(Y_2\) | \(Y_3\) | |
|---|---|---|---|
| \(Y_1\) | 10 | 7.5 | 10 |
| \(Y_2\) | 7.5 | 15 | 12.5 |
| \(Y_3\) | 10 | 12.5 | 20 |
Sphericity holds;
| \(Y_p-Y_{p'}\) | \(\sigma^2_p + \sigma^2_{p'} - 2\sigma_{pp'}\) |
|---|---|
| \(Y_1-Y_2\) | 10+15-2(7.5)=10 |
| \(Y_1-Y_3\) | 10+20-2(10)=10 |
| \(Y_2-Y_3\) | 15+20-2(12.5)=10 |
Box’s epsilon —measures how severely sphericity is violated.
\[\frac{1}{P-1}\leq \epsilon \leq 1\]
Estimates of \(\epsilon\) are Greenhouse-Geisser (\(\hat \epsilon\)) and Huynh-Feldt (\(\tilde{\epsilon}\))
\(\tilde{\epsilon}\) can be larger than 1.0; if it is \(\tilde{\epsilon}\) is set equal to 1.0.
Critical value assuming sphericity is \(F_{alpha,(P-1),(n-1)(P-1)}\).
Approximately correct critical value when sphericity is violated \(F_{alpha,\epsilon (P-1),\epsilon (n-1)(P-1)}\).
normality of errors in above equation, \(\epsilon_{ij}\) is assumed to be normally and independently distributed with a mean value of zero.
normality of \(\eta_i\), \(\eta_{i}\) is assumed to be normally and independently distributed with a mean value of zero.
Together the assumptions listed immediately above imply that the repeated measures are drawn from a multivariate normal distribution.
Although assumptions can be stated in terms of \(\eta_{i}\) and \(\epsilon_{ij}\), a simpler approach is that the repeated measures are drawn from a multivariate normal distribution with covariance matrix that meets the sphericity assumption. If the data meet the sphericity assumption, the difference between each pair of repeated measures has the same variance for all pairs.
Assuming that the data are drawn from a multivariate normal distribution and within each level of the within-subjects factor the scores are independent, having equal variance and covariances (compound symmetry) is a sufficient condition for the F test on the within-subjects factor to be valid.
If additivity holds and the equal variance assumption holds then compound symmetry holds. But compund symmetry is a stricter assumption than the sphericity. Considering that sphericity is a necessary and sufficient requirement for the F test on the within-subjects factor to be valid (assuming that data are drawn from a multivariate normal distribution and scores are independent within each level of the within-subjects factor) checking for spericity is more important than checking for additivity. In addition because there are adjusted degrees of freedom procedures that adjust the F test on the within-subjects factor for violation of sphericity, it is not necessary to test for additivity.
For illustrative purposes, a subsample from an original cluster randomized trial study (for details see Daunic et al. (2012)) was taken. The subsample included 1 control-classroom and 17 students. The variable of interest is the problem solving knowledge. Each wave was approximately one year apart. Higher scores mean higher knowledge.
Step 1: Set up data
#enter data
PSdata=data.frame(id=factor(1:17),
wave1=c(20,19,13,10,16,12,16,11,11,14,13,17,16,12,12,16,16),
wave2=c(28,27,18,17,29,18,26,21,15,26,28,23,29,18,26,21,22),
wave3=c(21,24,14,8,23,15,21,15,12,21,23,17,26,18,14,18,19))
Report descriptive
# set the number of decimals (cosmetic)
options(digits = 3)
##the long format will be needed
#head(PSdata)
library(tidyr)
data_long = gather(PSdata, wave, PrbSol, wave1:wave3, factor_key=TRUE)
#get descriptives
library(doBy)
library(moments)
desc1W=as.matrix(summaryBy(PrbSol~wave, data = data_long,
FUN = function(x) { c(n = sum(!is.na(x)),
mean = mean(x,na.rm=T), sdv = sd(x,na.rm=T),
skw=moments::skewness(x,na.rm=T),
krt=moments::kurtosis(x,na.rm=T)) } ))
# Table 6
desc1W
## wave PrbSol.n PrbSol.mean PrbSol.sdv PrbSol.skw PrbSol.krt
## 1 "wave1" "17" "14.4" "2.91" " 0.311" "2.10"
## 2 "wave2" "17" "23.1" "4.67" "-0.224" "1.64"
## 3 "wave3" "17" "18.2" "4.77" "-0.315" "2.45"
#write.csv(desc1W,file="onewayW_ANOVA_desc.csv")
The covariance matrix might be helpful.
# check covariance Table 7
cov(PSdata[,-1])
## wave1 wave2 wave3
## wave1 8.49 8.85 9.87
## wave2 8.85 21.81 18.49
## wave3 9.87 18.49 22.78
Step 2: Check assumptions
library(ggplot2)
ggplot(data_long, aes(x=wave, y=PrbSol))+
geom_boxplot()+
labs(x = "Wave",y="Problem Solving Knowledge scores")
Problem Solving Knowledge score by wave
We will test for the sphericity assumption using Mauchly’s test integrated in the ezANOVA function, but this graph implies that it might be violated.
ggplot(data_long, aes(x=wave, y=PrbSol, group=id))+
geom_line() + labs(x = "Wave",y="Problem Solving Knowledge scores")
Problem Solving Knowledge score by wave, line graph
This graph, which plots the problem solving scores by wave, suggests that the \(\eta \beta_{ij}\) interaction terms are not likely to all be zero; therefore assuming sphericity while testing the hypothesis of equal wave means is not likely to be justified. The tukey.add.test function in asbio (Aho (2016)) investigates \(H_0\) : main effect and blocks are additive.
library(asbio)
with(data_long,tukey.add.test(PrbSol,wave,id))
##
## Tukey's one df test for additivity
## F = 5.943 Denom df = 31 p-value = 0.021
# if additivity exists a randomized block approach might be appropriate
#additive=with(data_long,lm(PrbSol~id+wave))
#anova(additive)
The Tukey additive test rejects the null hypothesis of additivity agrees with the line graph. In other words, a non-additive model is more appropriate for the problem solving knowledge data.
Step 3: Run ANOVA (including checks for the sphericity and normality of residuals assumptions)
The ezANOVA function (Lawrence (2016)) reports the F test, the Levene Test and an effect size. Type of the effect size depends on the model and indirectly depends on the type of sum of squares used, the type argument (1,2 or 3) transmits the choice.
library(ez)
#alternative 1 the ezANOVA function
alternative1 = ezANOVA(
data = data_long,
wid=id, dv = PrbSol, within = wave,
detailed = T,return_aov=T)
alternative1
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05 ges
## 1 (Intercept) 1 16 17510 680 412.0 7.62e-13 * 0.954
## 2 wave 2 32 647 169 61.2 1.16e-11 * 0.433
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 2 wave 0.918 0.526
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 2 wave 0.924 6.17e-11 * 1.04 1.16e-11 *
##
## $aov
##
## Call:
## aov(formula = formula(aov_formula), data = data)
##
## Grand Mean: 18.5
##
## Stratum 1: id
##
## Terms:
## Residuals
## Sum of Squares 680
## Deg. of Freedom 16
##
## Residual standard error: 6.52
##
## Stratum 2: id:wave
##
## Terms:
## wave Residuals
## Sum of Squares 647 169
## Deg. of Freedom 2 32
##
## Residual standard error: 2.3
## Estimated effects may be unbalanced
PrbSolres=sort(alternative1$aov$id$residuals)
qqnorm(PrbSolres);qqline(PrbSolres)
PSK model residuals
The distribution of the residuals is not severely non-normal.
The aov function is the second alternative.
# alternative 2 the aov function
summary(aov(PrbSol ~ wave + Error(id/wave), data=data_long))
##
## Error: id
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 16 680 42.5
##
## Error: id:wave
## Df Sum Sq Mean Sq F value Pr(>F)
## wave 2 647 324 61.2 1.2e-11 ***
## Residuals 32 169 5
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
One of the convenient robust procedures, a heteroscedastic one-way repeated measures ANOVA for trimmed means, has been compressed into the rmanova function, available via WRS-2 (Mair & Wilcox (2016)).
library(WRS2)
#rmanova
# 20% trimmed
with(data_long,rmanova(PrbSol,wave,id,tr=.20))
## Call:
## rmanova(y = PrbSol, groups = wave, blocks = id, tr = 0.2)
##
## Test statistic: F = 34.9
## Degrees of freedom 1: 1.9
## Degrees of freedom 2: 19
## p-value: 0
Descriptive statistics for the problem solving knowledge scores at each wave are presented in Table 6. The covariance matrix is presented in Table 7. A one-way within ANOVA was reported. F test was conducted at \(\alpha=.05\). The assumptions of one-way within subjects ANOVA are satisfied. There was a significant difference between waves \(F(2,32)=61.2, p<.001\). The ezANOVA function reported a generalized eta hat squared (\(\hat\eta^2_G\)) of 0.43.
To be added.
To be added.
To be added.
See next RMD.