MANOVA in R – How To Implement and Interpret One-Way MANOVA

MANOVA in R: Implementation

As with most of the things in R, performing a MANOVA statistical test boils down to a single function call. But we’ll need a dataset first. The Iris dataset is well-known among the data science crowd, and it is built into R:

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Dependent variables

library(ggplot2)
library(gridExtra)

box_sl <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_boxplot() +
  theme(legend.position = "top")
box_sw <- ggplot(iris, aes(x = Species, y = Sepal.Width, fill = Species)) +
  geom_boxplot() +
  theme(legend.position = "top")
box_pl <- ggplot(iris, aes(x = Species, y = Petal.Length, fill = Species)) +
  geom_boxplot() +
  theme(legend.position = "top")
box_pw <- ggplot(iris, aes(x = Species, y = Petal.Width, fill = Species)) +
  geom_boxplot() +
  theme(legend.position = "top")

grid.arrange(box_sl, box_sw, box_pl, box_pw, ncol = 2, nrow = 2)

One-way MANOVA in R

We can now perform a one-way MANOVA in R. The best practice is to separate the dependent from the independent variable before calling the manova() function. Once the test is done, you can print its summary:

dependent_vars <- cbind(iris$Sepal.Length, iris$Sepal.Width, iris$Petal.Length, iris$Petal.Width)
independent_var <- iris$Species

manova_model <- manova(dependent_vars ~ independent_var, data = iris)
summary(manova_model)
##                  Df Pillai approx F num Df den Df    Pr(>F)    
## independent_var   2 1.1919   53.466      8    290 < 2.2e-16 ***
## Residuals       147                                            
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

While we’re here, we could also measure the effect size. One metric often used with MANOVA is Partial Eta Squared. It measures the effect the independent variable has on the dependent variables. If the value is 0.14 or greater, we can say the effect size is large. Here’s how to calculate it in R:

library(effectsize)
## Registered S3 method overwritten by 'parameters':
##   method                         from      
##   format.parameters_distribution datawizard
eta_squared(manova_model)
## # Effect Size for ANOVA (Type I)
## 
## Parameter       | Eta2 (partial) |       95% CI
## -----------------------------------------------
## independent_var |           0.60 | [0.54, 1.00]
## 
## - One-sided CIs: upper bound fixed at (1).

Interpret MANOVA in R With a Post-Hoc Test

Linear Discriminant Analysis (LDA), which finds a linear combination of features that best separates two or more groups.

library(MASS)

iris_lda <- lda(independent_var ~ dependent_vars, CV = F)
iris_lda
## Call:
## lda(independent_var ~ dependent_vars, CV = F)
## 
## Prior probabilities of groups:
##     setosa versicolor  virginica 
##  0.3333333  0.3333333  0.3333333 
## 
## Group means:
##            dependent_vars1 dependent_vars2 dependent_vars3 dependent_vars4
## setosa               5.006           3.428           1.462           0.246
## versicolor           5.936           2.770           4.260           1.326
## virginica            6.588           2.974           5.552           2.026
## 
## Coefficients of linear discriminants:
##                        LD1         LD2
## dependent_vars1  0.8293776  0.02410215
## dependent_vars2  1.5344731  2.16452123
## dependent_vars3 -2.2012117 -0.93192121
## dependent_vars4 -2.8104603  2.83918785
## 
## Proportion of trace:
##    LD1    LD2 
## 0.9912 0.0088

The snippet below uses the predict() function to get the linear discriminants and combines them with our independent variable:

lda_df <- data.frame(
  species = iris[, "Species"],
  lda = predict(iris_lda)$x
)
lda_df
##        species    lda.LD1      lda.LD2
## 1       setosa  8.0617998  0.300420621
## 2       setosa  7.1286877 -0.786660426
## 3       setosa  7.4898280 -0.265384488
## 4       setosa  6.8132006 -0.670631068
## 5       setosa  8.1323093  0.514462530
## 6       setosa  7.7019467  1.461720967
## 7       setosa  7.2126176  0.355836209
## 8       setosa  7.6052935 -0.011633838
## 9       setosa  6.5605516 -1.015163624
## 10      setosa  7.3430599 -0.947319209
## 11      setosa  8.3973865  0.647363392
## 12      setosa  7.2192969 -0.109646389
## 13      setosa  7.3267960 -1.072989426
## 14      setosa  7.5724707 -0.805464137
## 15      setosa  9.8498430  1.585936985
## 16      setosa  9.1582389  2.737596471
## 17      setosa  8.5824314  1.834489452
## 18      setosa  7.7807538  0.584339407
## 19      setosa  8.0783588  0.968580703
## 20      setosa  8.0209745  1.140503656
## 21      setosa  7.4968023 -0.188377220
## 22      setosa  7.5864812  1.207970318
## 23      setosa  8.6810429  0.877590154
## 24      setosa  6.2514036  0.439696367
## 25      setosa  6.5589334 -0.389222752
## 26      setosa  6.7713832 -0.970634453
## 27      setosa  6.8230803  0.463011612
## 28      setosa  7.9246164  0.209638715
## 29      setosa  7.9912902  0.086378713
## 30      setosa  6.8294645 -0.544960851
## 31      setosa  6.7589549 -0.759002759
## 32      setosa  7.3749525  0.565844592
## 33      setosa  9.1263463  1.224432671
## 34      setosa  9.4676820  1.825226345
## 35      setosa  7.0620139 -0.663400423
## 36      setosa  7.9587624 -0.164961722
## 37      setosa  8.6136720  0.403253602
## 38      setosa  8.3304176  0.228133530
## 39      setosa  6.9341201 -0.705519379
## 40      setosa  7.6882313 -0.009223623
## 41      setosa  7.9179372  0.675121313
## 42      setosa  5.6618807 -1.934355243
## 43      setosa  7.2410147 -0.272615132
## 44      setosa  6.4144356  1.247301306
## 45      setosa  6.8594438  1.051653957
## 46      setosa  6.7647039 -0.505151855
## 47      setosa  8.0818994  0.763392750
## 48      setosa  7.1867690 -0.360986823
## 49      setosa  8.3144488  0.644953177
## 50      setosa  7.6719674 -0.134893840
## 51  versicolor -1.4592755  0.028543764
## 52  versicolor -1.7977057  0.484385502
## 53  versicolor -2.4169489 -0.092784031
## 54  versicolor -2.2624735 -1.587252508
## 55  versicolor -2.5486784 -0.472204898
## 56  versicolor -2.4299673 -0.966132066
## 57  versicolor -2.4484846  0.795961954
## 58  versicolor -0.2226665 -1.584673183
## 59  versicolor -1.7502012 -0.821180130
## 60  versicolor -1.9584224 -0.351563753
## 61  versicolor -1.1937603 -2.634455704
## 62  versicolor -1.8589257  0.319006544
## 63  versicolor -1.1580939 -2.643409913
## 64  versicolor -2.6660572 -0.642504540
## 65  versicolor -0.3783672  0.086638931
## 66  versicolor -1.2011726  0.084437359
## 67  versicolor -2.7681025  0.032199536
## 68  versicolor -0.7768540 -1.659161847
## 69  versicolor -3.4980543 -1.684956162
## 70  versicolor -1.0904279 -1.626583496
## 71  versicolor -3.7158961  1.044514421
## 72  versicolor -0.9976104 -0.490530602
## 73  versicolor -3.8352593 -1.405958061
## 74  versicolor -2.2574125 -1.426794234
## 75  versicolor -1.2557133 -0.546424197
## 76  versicolor -1.4375576 -0.134424979
## 77  versicolor -2.4590614 -0.935277280
## 78  versicolor -3.5184849  0.160588866
## 79  versicolor -2.5897987 -0.174611728
## 80  versicolor  0.3074879 -1.318871459
## 81  versicolor -1.1066918 -1.752253714
## 82  versicolor -0.6055246 -1.942980378
## 83  versicolor -0.8987038 -0.904940034
## 84  versicolor -4.4984664 -0.882749915
## 85  versicolor -2.9339780  0.027379106
## 86  versicolor -2.1036082  1.191567675
## 87  versicolor -2.1425821  0.088779781
## 88  versicolor -2.4794560 -1.940739273
## 89  versicolor -1.3255257 -0.162869550
## 90  versicolor -1.9555789 -1.154348262
## 91  versicolor -2.4015702 -1.594583407
## 92  versicolor -2.2924888 -0.332860296
## 93  versicolor -1.2722722 -1.214584279
## 94  versicolor -0.2931761 -1.798715092
## 95  versicolor -2.0059888 -0.905418042
## 96  versicolor -1.1816631 -0.537570242
## 97  versicolor -1.6161564 -0.470103580
## 98  versicolor -1.4215888 -0.551244626
## 99  versicolor  0.4759738 -0.799905482
## 100 versicolor -1.5494826 -0.593363582
## 101  virginica -7.8394740  2.139733449
## 102  virginica -5.5074800 -0.035813989
## 103  virginica -6.2920085  0.467175777
## 104  virginica -5.6054563 -0.340738058
## 105  virginica -6.8505600  0.829825394
## 106  virginica -7.4181678 -0.173117995
## 107  virginica -4.6779954 -0.499095015
## 108  virginica -6.3169269 -0.968980756
## 109  virginica -6.3277368 -1.383289934
## 110  virginica -6.8528134  2.717589632
## 111  virginica -4.4407251  1.347236918
## 112  virginica -5.4500957 -0.207736942
## 113  virginica -5.6603371  0.832713617
## 114  virginica -5.9582372 -0.094017545
## 115  virginica -6.7592628  1.600232061
## 116  virginica -5.8070433  2.010198817
## 117  virginica -5.0660123 -0.026273384
## 118  virginica -6.6088188  1.751635872
## 119  virginica -9.1714749 -0.748255067
## 120  virginica -4.7645357 -2.155737197
## 121  virginica -6.2728391  1.649481407
## 122  virginica -5.3607119  0.646120732
## 123  virginica -7.5811998 -0.980722934
## 124  virginica -4.3715028 -0.121297458
## 125  virginica -5.7231753  1.293275530
## 126  virginica -5.2791592 -0.042458238
## 127  virginica -4.0808721  0.185936572
## 128  virginica -4.0770364  0.523238483
## 129  virginica -6.5191040  0.296976389
## 130  virginica -4.5837194 -0.856815813
## 131  virginica -6.2282401 -0.712719638
## 132  virginica -5.2204877  1.468195094
## 133  virginica -6.8001500  0.580895175
## 134  virginica -3.8151597 -0.942985932
## 135  virginica -5.1074897 -2.130589999
## 136  virginica -6.7967163  0.863090395
## 137  virginica -6.5244960  2.445035271
## 138  virginica -4.9955028  0.187768525
## 139  virginica -3.9398530  0.614020389
## 140  virginica -5.2038309  1.144768076
## 141  virginica -6.6530868  1.805319760
## 142  virginica -5.1055595  1.992182010
## 143  virginica -5.5074800 -0.035813989
## 144  virginica -6.7960192  1.460686950
## 145  virginica -6.8473594  2.428950671
## 146  virginica -5.6450035  1.677717335
## 147  virginica -5.1795646 -0.363475041
## 148  virginica -4.9677409  0.821140550
## 149  virginica -5.8861454  2.345090513
## 150  virginica -4.6831543  0.332033811

The final step in this post-hoc test is to visualize the above lda_df as a scatter plot. Ideally, we should see one or multiple groups stand out:

ggplot(lda_df) +
  geom_point(aes(x = lda.LD1, y = lda.LD2, color = species), size = 4) +
  theme_classic()