Packages used below: rstatix, car (with carData), faraway, and lme4. Note that faraway masks logit() and vif() from the car package.

1 One-way ANOVA

In this exercise we consider the data set PlantGrowth from the base R environment. It contains the weight of plants obtained under a control and two different treatment conditions. Please work through the following steps:

1.1 Data Preparation (One-way ANOVA)

Here we use the built-in R dataset named PlantGrowth. It contains the weight of plants obtained under a control and two different treatment conditions.
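A minimal sketch of the code that loads the data and prints the first rows:

data("PlantGrowth")
head(PlantGrowth)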

##   weight group
## 1   4.17  ctrl
## 2   5.58  ctrl
## 3   5.18  ctrl
## 4   6.11  ctrl
## 5   4.50  ctrl
## 6   4.61  ctrl

Show the levels of the grouping variable:
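For example, in base R:

levels(PlantGrowth$group)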

## [1] "ctrl" "trt1" "trt2"

where ctrl = control, trt1 = treatment 1, and trt2 = treatment 2.

1.1.1 Summary statistics

Compute some summary statistics (count, mean and sd) of the variable weight organized by groups:
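One way to produce the table below, assuming the dplyr and rstatix packages are attached:

library(dplyr)
library(rstatix)

PlantGrowth %>%
  group_by(group) %>%
  get_summary_stats(weight, type = "mean_sd")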

## # A tibble: 3 x 5
##   group variable     n  mean    sd
##   <fct> <chr>    <dbl> <dbl> <dbl>
## 1 ctrl  weight      10  5.03 0.583
## 2 trt1  weight      10  4.66 0.794
## 3 trt2  weight      10  5.53 0.443

1.1.2 Outliers

Outliers can be easily identified using box plot methods, implemented in the R function identify_outliers() [rstatix package].
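A sketch of the call that produces the output below:

PlantGrowth %>%
  group_by(group) %>%
  identify_outliers(weight)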

## # A tibble: 2 x 4
##   group weight is.outlier is.extreme
##   <fct>  <dbl> <lgl>      <lgl>     
## 1 trt1    5.87 TRUE       FALSE     
## 2 trt1    6.03 TRUE       FALSE

There were no extreme outliers.

1.2 One-way ANOVA Test

The aov() function is used to obtain the relevant sums of squares. Applying summary() to the output of aov() produces the desired ANOVA table (without the unneeded row for the total).
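A sketch of both approaches (the model object name res.aov is illustrative); anova_test() from rstatix produces the type II table with the ges effect size, while summary() on the aov() fit gives the classical table:

# Type II ANOVA table with generalized eta squared (ges)
PlantGrowth %>% anova_test(weight ~ group)

# Classical ANOVA table; the fitted model is reused for diagnostics below
res.aov <- aov(weight ~ group, data = PlantGrowth)
summary(res.aov)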

## Coefficient covariances computed by hccm()
## ANOVA Table (type II tests)
## 
##   Effect DFn DFd     F     p p<.05   ges
## 1  group   2  27 4.846 0.016     * 0.264
##             Df Sum Sq Mean Sq F value Pr(>F)  
## group        2  3.766  1.8832   4.846 0.0159 *
## Residuals   27 10.492  0.3886                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In the table above, the column ges corresponds to the generalized eta squared (effect size). It measures the proportion of the variability in the outcome variable (here, plant weight) that can be explained by the predictor (here, treatment group). An effect size of 0.264 means that 26.4% of the variation in weight can be accounted for by the treatment conditions. From the ANOVA table above, it can be seen that there are significant differences between groups, highlighted with "*": \(F(2, 27) = 4.85\), \(p = 0.016\), \(\eta^2_g = 0.26\).

where,

* F indicates that we are comparing to an F-distribution (F-test);
* (2, 27) indicates the degrees of freedom in the numerator (DFn) and the denominator (DFd), respectively;
* 4.85 is the obtained value of the F-statistic;
* p specifies the p-value;
* ges is the generalized effect size (the amount of variability due to the factor).

1.3 Check Assumptions (One-way ANOVA)

The ANOVA test assumes that the data are normally distributed and that the variance across groups is homogeneous. We can check these assumptions with some diagnostic plots.

The normality assumption can be checked by using one of the following two approaches:

1. Analyzing the ANOVA model residuals to check normality for all groups together. This approach is easier and is very handy when you have many groups or only a few data points per group.
2. Checking normality for each group separately. This approach might be used when you have only a few groups and many data points per group.

This section shows how to proceed with both options.

Check the normality assumption by analyzing the model residuals, using a QQ plot and the Shapiro-Wilk test of normality. A QQ plot plots the quantiles of the data against the quantiles of the normal distribution.
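A sketch of these diagnostics, reusing the res.aov object fitted above:

# Residuals vs. fitted values (used below to judge homogeneity of variance)
plot(res.aov, which = 1)

# Normal QQ plot of the residuals
plot(res.aov, which = 2)

# Shapiro-Wilk test on the ANOVA residuals
shapiro.test(residuals(res.aov))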

In the residuals versus fitted values plot above, there is no evident relationship between the residuals and the fitted values (the mean of each group), which is good. So we can assume homogeneity of variances.

Check the normality assumption by groups, computing the Shapiro-Wilk test for each group level. If the data are normally distributed, the p-value should be greater than 0.05.
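For example, with rstatix:

PlantGrowth %>%
  group_by(group) %>%
  shapiro_test(weight)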

## # A tibble: 3 x 4
##   group variable statistic     p
##   <fct> <chr>        <dbl> <dbl>
## 1 ctrl  weight       0.957 0.747
## 2 trt1  weight       0.930 0.452
## 3 trt2  weight       0.941 0.564

The scores were normally distributed (\(p > 0.05\)) for each group, as assessed by the Shapiro-Wilk test of normality. Note: if the sample size is greater than 50, the normal QQ plot is preferred, because at larger sample sizes the Shapiro-Wilk test becomes very sensitive even to minor deviations from normality.

A QQ plot can also be drawn for each group separately, plotting each group's data against the normal distribution. Create QQ plots for each group level:
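One way to draw them, assuming the ggpubr package (not attached above) is available:

library(ggpubr)
ggqqplot(PlantGrowth, "weight", facet.by = "group")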

Points 4, 15 and 17 are detected as possible outliers, since they fall far from the reference line; such points can severely affect normality and homogeneity of variance, and it can be useful to remove them to meet the test assumptions. However, the Shapiro-Wilk test on the ANOVA residuals (\(W = 0.97\), \(p = 0.8\)) finds no indication that normality is violated. To check the homogeneity of variances, Levene's test is recommended, since it is less sensitive to departures from the normal distribution.
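A sketch of both checks; levene_test() is from rstatix, while oneway.test() is the base R Welch test:

# Levene's test for homogeneity of variance
PlantGrowth %>% levene_test(weight ~ group)

# Welch one-way test, which does not assume equal variances
oneway.test(weight ~ group, data = PlantGrowth)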

## # A tibble: 1 x 4
##     df1   df2 statistic     p
##   <int> <int>     <dbl> <dbl>
## 1     2    27      1.12 0.341
## 
##  One-way analysis of means (not assuming equal variances)
## 
## data:  weight and group
## F = 5.181, num df = 2.000, denom df = 17.128, p-value = 0.01739

From the output above, we can see that the p-value of Levene's test is \(> 0.05\), which is not significant. This means there is no significant difference between the variances across groups; therefore, we can assume homogeneity of variances in the different treatment groups. The Welch one-way test, which does not assume equal variances, still finds a significant difference between the group means (\(p = 0.017\)).

1.4 Pairwise-comparison (One-way ANOVA)

A significant one-way ANOVA is generally followed up by Tukey post-hoc tests to perform multiple pairwise comparisons between groups. Key R function: tukey_hsd() [rstatix].
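For example:

PlantGrowth %>% tukey_hsd(weight ~ group)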

## # A tibble: 3 x 9
##   term  group1 group2 null.value estimate conf.low conf.high p.adj p.adj.signif
## * <chr> <chr>  <chr>       <dbl>    <dbl>    <dbl>     <dbl> <dbl> <chr>       
## 1 group ctrl   trt1            0   -0.371   -1.06      0.320 0.391 ns          
## 2 group ctrl   trt2            0    0.494   -0.197     1.19  0.198 ns          
## 3 group trt1   trt2            0    0.865    0.174     1.56  0.012 *

The output contains the following columns:

* estimate: the estimated difference between the means of the two groups;
* conf.low, conf.high: the lower and upper end points of the confidence interval at 95% (default);
* p.adj: the p-value after adjustment for the multiple comparisons.

It can be seen from the output that only the difference between trt2 and trt1 is significant (adjusted p-value = 0.012).

1.5 Tukey HSD (One-way ANOVA)

Tukey's Honest Significant Difference can be applied directly to an object created with aov(). It adjusts the p-values of the pairwise comparisons of the means to control the FWER, in this case at 0.05. Notice that it also gives confidence intervals for the differences of the means.
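A sketch, reusing the res.aov object fitted earlier:

TukeyHSD(res.aov, conf.level = 0.95)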

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = weight ~ group, data = PlantGrowth)
## 
## $group
##             diff        lwr       upr     p adj
## trt1-ctrl -0.371 -1.0622161 0.3202161 0.3908711
## trt2-ctrl  0.494 -0.1972161 1.1852161 0.1979960
## trt2-trt1  0.865  0.1737839 1.5562161 0.0120064

Based on these results, the only significant pairwise difference is between the two treatments (trt2 vs. trt1); neither treatment differs significantly from the control.

1.6 Report

We could report the results of the one-way ANOVA as follows:

* A one-way ANOVA was performed to evaluate whether plant growth differed between the 3 treatment groups: ctrl, trt1 and trt2, each with \(n = 10\).

* Data are presented as mean +/- standard deviation. Plant growth was statistically significantly different between the treatment groups, \(F(2, 27) = 4.85\), \(p = 0.016\), generalized eta squared = 0.26.

* Plant growth decreased in the trt1 group (4.66 +/- 0.79) compared to the ctrl group (5.03 +/- 0.58). It increased in the trt2 group (5.53 +/- 0.44) compared to the trt1 and ctrl groups.

* Tukey post-hoc analyses revealed that the increase from trt1 to trt2 (0.87, 95% CI 0.17 to 1.56) was statistically significant (\(p = 0.012\)), but no other group differences were statistically significant.


2 Two-way ANOVA

Please apply two-way ANOVA procedures to the rats data from the faraway package. There are two factors here: poison and treat. You can use the levels() function to extract the levels of a factor variable. Give your opinions accordingly!
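A sketch of the code that produces the output below (the faraway package is already attached):

data(rats, package = "faraway")
summary(rats)
levels(rats$poison)
levels(rats$treat)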

##       time        poison   treat 
##  Min.   :0.1800   I  :16   A:12  
##  1st Qu.:0.3000   II :16   B:12  
##  Median :0.4000   III:16   C:12  
##  Mean   :0.4794            D:12  
##  3rd Qu.:0.6225                  
##  Max.   :1.2400
## [1] "I"   "II"  "III"
## [1] "A" "B" "C" "D"

Here, 48 rats were randomly assigned one of three poisons and one of four possible treatments. The experimenters then measured their survival time in tens of hours. This gives a total of 12 groups, each with 4 replicates.

Before running any tests, we should first look at the data. We will create interaction plots, which help us visualize the effect of one factor as we move through the levels of another factor.
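A sketch using base R's interaction.plot(), drawing one panel tracing poison and one tracing treat:

par(mfrow = c(1, 2))
with(rats, interaction.plot(treat, poison, time, lwd = 2, col = 1:3))
with(rats, interaction.plot(poison, treat, time, lwd = 2, col = 1:4))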

If there is no interaction, that is, if an additive model holds, we would expect to see parallel lines. Changing the level of one factor may shift the response, but the difference between the levels of the other factor should stay the same.

The obvious indication of interaction would be lines that cross while heading in different directions. We can't really see that here, but the lines aren't strictly parallel, and there is some overlap on the right panel. However, is this interaction effect significant?

Let’s fit each of the possible models, then investigate their estimates for each of the group means.
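A sketch of the five candidate models (object names are illustrative):

rats_int   <- aov(time ~ poison * treat, data = rats)  # interaction model
rats_add   <- aov(time ~ poison + treat, data = rats)  # additive model
rats_pois  <- aov(time ~ poison,         data = rats)  # poison only
rats_treat <- aov(time ~ treat,          data = rats)  # treatment only
rats_null  <- aov(time ~ 1,              data = rats)  # null model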

To get the estimates, we'll create a table of all factor combinations on which to predict.
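One way to build it; expand.grid() varies its first argument fastest, matching the row order of the output below:

rats_table <- expand.grid(poison = levels(rats$poison),
                          treat  = levels(rats$treat))
rats_table

matrix(paste(rats_table$poison, rats_table$treat, sep = "-"),
       nrow = 4, ncol = 3, byrow = TRUE)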

##    poison treat
## 1       I     A
## 2      II     A
## 3     III     A
## 4       I     B
## 5      II     B
## 6     III     B
## 7       I     C
## 8      II     C
## 9     III     C
## 10      I     D
## 11     II     D
## 12    III     D
##      [,1]  [,2]   [,3]   
## [1,] "I-A" "II-A" "III-A"
## [2,] "I-B" "II-B" "III-B"
## [3,] "I-C" "II-C" "III-C"
## [4,] "I-D" "II-D" "III-D"

Since we would be repeating ourselves a number of times, we write a function to perform the prediction. Some housekeeping is done to keep the estimates in order and to provide row and column names. Above, it is shown where each of the estimates will be placed in the resulting matrix.
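One possible version of such a helper (the name get_est_means is illustrative). For example, get_est_means(rats_int, rats_table) fills the matrix for the interaction model shown next:

get_est_means <- function(model, table) {
  # Predict on the 12 factor combinations, then arrange treatments in rows
  # and poisons in columns, matching the layout shown above
  mat <- matrix(predict(model, table), nrow = 4, ncol = 3, byrow = TRUE)
  colnames(mat) <- c("I", "II", "III")
  rownames(mat) <- c("A", "B", "C", "D")
  mat
}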

First, we obtain the estimates from the interaction model. Note that each cell has a different value.

      I       II      III
A     0.4125  0.3200  0.210
B     0.8800  0.8150  0.335
C     0.5675  0.3750  0.235
D     0.6100  0.6675  0.325

Next, we obtain the estimates from the additive model. Again, each cell has a different value. We also see that these estimates are somewhat close to those from the interaction model.

      I          II         III
A     0.4522917  0.3791667  0.1110417
B     0.8147917  0.7416667  0.4735417
C     0.5306250  0.4575000  0.1893750
D     0.6722917  0.5991667  0.3310417

To understand the difference, let's consider the estimated difference between treatments A and B under each model (additive model first):
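A sketch of this comparison, subtracting the B row from the A row of each estimate matrix:

add_means <- get_est_means(rats_add, rats_table)
add_means["A", ] - add_means["B", ]

int_means <- get_est_means(rats_int, rats_table)
int_means["A", ] - int_means["B", ]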

##       I      II     III 
## -0.3625 -0.3625 -0.3625
##       I      II     III 
## -0.4675 -0.4950 -0.1250

This is the key difference between the interaction and additive models. The difference between the effect of treatments A and B is the same for each poison in the additive model. They are different in the interaction model.

The remaining three models are much simpler, having either only column effects (poison only), only row effects (treatment only), or no effects at all in the case of the null model. Their estimated cell means are shown below, in that order.

Poison-only model (column effects):

      I       II        III
A     0.6175  0.544375  0.27625
B     0.6175  0.544375  0.27625
C     0.6175  0.544375  0.27625
D     0.6175  0.544375  0.27625

Treatment-only model (row effects):

      I          II         III
A     0.3141667  0.3141667  0.3141667
B     0.6766667  0.6766667  0.6766667
C     0.3925000  0.3925000  0.3925000
D     0.5341667  0.5341667  0.5341667

Null model (no effects):

      I         II        III
A     0.479375  0.479375  0.479375
B     0.479375  0.479375  0.479375
C     0.479375  0.479375  0.479375
D     0.479375  0.479375  0.479375
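To test the interaction formally, we examine the ANOVA table of the interaction model, for example:

summary(rats_int)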
##              Df Sum Sq Mean Sq F value   Pr(>F)    
## poison        2 1.0330  0.5165  23.222 3.33e-07 ***
## treat         3 0.9212  0.3071  13.806 3.78e-06 ***
## poison:treat  6 0.2501  0.0417   1.874    0.112    
## Residuals    36 0.8007  0.0222                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using a significance level of \(\alpha = 0.05\), we see that the interaction is not significant. Within the additive model, both factors are significant, so we select the additive model.

Within the additive model, we could do further testing about the main effects.
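For example, Tukey's HSD applied to the additive model:

TukeyHSD(rats_add, conf.level = 0.95)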

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = time ~ poison + treat, data = rats)
## 
## $poison
##             diff        lwr        upr     p adj
## II-I   -0.073125 -0.2089936  0.0627436 0.3989657
## III-I  -0.341250 -0.4771186 -0.2053814 0.0000008
## III-II -0.268125 -0.4039936 -0.1322564 0.0000606
## 
## $treat
##            diff         lwr         upr     p adj
## B-A  0.36250000  0.18976135  0.53523865 0.0000083
## C-A  0.07833333 -0.09440532  0.25107198 0.6221729
## D-A  0.22000000  0.04726135  0.39273865 0.0076661
## C-B -0.28416667 -0.45690532 -0.11142802 0.0004090
## D-B -0.14250000 -0.31523865  0.03023865 0.1380432
## D-C  0.14166667 -0.03107198  0.31440532 0.1416151

For an example with interaction, we investigate the warpbreaks dataset, a default dataset in R.
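A sketch of an interaction plot for these data:

with(warpbreaks, interaction.plot(tension, wool, breaks, lwd = 2, col = 2:3))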

Either plot makes it rather clear that the wool and tension factors interact.
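The corresponding two-way ANOVA table can be obtained with, for example:

summary(aov(breaks ~ wool * tension, data = warpbreaks))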

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## wool          1    451   450.7   3.765 0.058213 .  
## tension       2   2034  1017.1   8.498 0.000693 ***
## wool:tension  2   1003   501.4   4.189 0.021044 *  
## Residuals    48   5745   119.7                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using an \(\alpha\) of \(0.05\), the ANOVA test finds that the interaction is significant, so we use the interaction model here.