Analysis of variance (ANOVA) dapat digunakan untuk uji rata-rata lebih dari dua variabel . ANOVA dapat digunakan jika data memenuhi asumsi-asumsi uji parametrik.

Memuat Data

Data yang digunakan adalah data dari R (R data sets), "chickwts".

data("chickwts")
head(chickwts)
##   weight      feed
## 1    179 horsebean
## 2    160 horsebean
## 3    136 horsebean
## 4    227 horsebean
## 5    217 horsebean
## 6    168 horsebean

Uji Normalitas

Jika jumlah sampel besar, maka tidak membutuhkan uji normalitas.

Setelah data dimuat, langkah selanjutnya adalah uji normalitas. Akan tetapi, sebelumnya kita perlu menghitung ANOVA terlebih dahulu.

# Menghitung ANOVA
anova <- aov(weight ~ feed, data = chickwts)
summary.aov(anova)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## feed         5 231129   46226   15.37 5.94e-10 ***
## Residuals   65 195556    3009                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Uji normalitas
  # histogram
  hist(anova$residuals)

  # QQ-plot
  install.packages("car")
  library("car")
  qqPlot(anova$residuals,  id = FALSE # id = FALSE to remove point identification
         )

  # Shapiro-Wilk test
  shapiro.test(anova$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  anova$residuals
## W = 0.98616, p-value = 0.6272

Uji Homogenitas

# Boxplot
boxplot(weight ~ feed, data = chickwts)

# Dotplot
install.packages("lattice")
library("lattice")
dotplot(weight ~ feed,
        data = chickwts
        )

# Levene's test
leveneTest(weight ~ feed,
           data = chickwts)
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  5  0.7493 0.5896
##       65

Berdasarkan uji yang telah dilakukan, data memenuhi asumsi-asumsi uji parametrik. Sehingga ANOVA dapat digunakan.

Uji Lanjut

Ketika hasil ANOVA menunjukan signifikansi p-value < 0.05, maka langkah selanjutnya dalah uji lanjut.

# Hasil ANOVA
summary.aov(anova)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## feed         5 231129   46226   15.37 5.94e-10 ***
## Residuals   65 195556    3009                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Uji Tukey
install.packages("multcomp")
library("multcomp")
post.test <- glht(anova,
                  linfct = mcp(feed = "Tukey")
                  )
summary(post.test)
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = weight ~ feed, data = chickwts)
## 
## Linear Hypotheses:
##                            Estimate Std. Error t value Pr(>|t|)    
## horsebean - casein == 0    -163.383     23.485  -6.957  < 0.001 ***
## linseed - casein == 0      -104.833     22.393  -4.682  < 0.001 ***
## meatmeal - casein == 0      -46.674     22.896  -2.039  0.33196    
## soybean - casein == 0       -77.155     21.578  -3.576  0.00829 ** 
## sunflower - casein == 0       5.333     22.393   0.238  0.99989    
## linseed - horsebean == 0     58.550     23.485   2.493  0.14098    
## meatmeal - horsebean == 0   116.709     23.966   4.870  < 0.001 ***
## soybean - horsebean == 0     86.229     22.710   3.797  0.00418 ** 
## sunflower - horsebean == 0  168.717     23.485   7.184  < 0.001 ***
## meatmeal - linseed == 0      58.159     22.896   2.540  0.12731    
## soybean - linseed == 0       27.679     21.578   1.283  0.79297    
## sunflower - linseed == 0    110.167     22.393   4.920  < 0.001 ***
## soybean - meatmeal == 0     -30.481     22.100  -1.379  0.73872    
## sunflower - meatmeal == 0    52.008     22.896   2.271  0.22035    
## sunflower - soybean == 0     82.488     21.578   3.823  0.00379 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
# atau
TukeyHSD(anova)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = weight ~ feed, data = chickwts)
## 
## $feed
##                            diff         lwr       upr     p adj
## horsebean-casein    -163.383333 -232.346876 -94.41979 0.0000000
## linseed-casein      -104.833333 -170.587491 -39.07918 0.0002100
## meatmeal-casein      -46.674242 -113.906207  20.55772 0.3324584
## soybean-casein       -77.154762 -140.517054 -13.79247 0.0083653
## sunflower-casein       5.333333  -60.420825  71.08749 0.9998902
## linseed-horsebean     58.550000  -10.413543 127.51354 0.1413329
## meatmeal-horsebean   116.709091   46.335105 187.08308 0.0001062
## soybean-horsebean     86.228571   19.541684 152.91546 0.0042167
## sunflower-horsebean  168.716667   99.753124 237.68021 0.0000000
## meatmeal-linseed      58.159091   -9.072873 125.39106 0.1276965
## soybean-linseed       27.678571  -35.683721  91.04086 0.7932853
## sunflower-linseed    110.166667   44.412509 175.92082 0.0000884
## soybean-meatmeal     -30.480519  -95.375109  34.41407 0.7391356
## sunflower-meatmeal    52.007576  -15.224388 119.23954 0.2206962
## sunflower-soybean     82.488095   19.125803 145.85039 0.0038845

Uji statistik deskriptif

Selain itu, kita bisa melakukan uji statistik deskriptif sederhana untuk mencari nilai rata-rata (mean) dan standar deviasi (sd). Bisa juga dilakukan uji statistik deskriptif secara lebih detail.

# Sederhana
aggregate(weight ~ feed,
          data = chickwts,
          function(x) round(c(mean = mean(x), sd = sd(x)), 2)
          )
##        feed weight.mean weight.sd
## 1    casein      323.58     64.43
## 2 horsebean      160.20     38.63
## 3   linseed      218.75     52.24
## 4  meatmeal      276.91     64.90
## 5   soybean      246.43     54.13
## 6 sunflower      328.92     48.84
# Detail
install.packages("rstatix")
library("rstatix")
chickwts %>% 
  group_by(feed) %>%
  get_summary_stats(weight, type = "common")
## # A tibble: 6 × 11
##   feed      variable     n   min   max median   iqr  mean    sd    se    ci
##   <fct>     <fct>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 casein    weight      12   216   404   342   93.5  324.  64.4  18.6  40.9
## 2 horsebean weight      10   108   227   152.  39.2  160.  38.6  12.2  27.6
## 3 linseed   weight      12   141   309   221   79.8  219.  52.2  15.1  33.2
## 4 meatmeal  weight      11   153   380   263   70.5  277.  64.9  19.6  43.6
## 5 soybean   weight      14   158   329   248   63.2  246.  54.1  14.5  31.3
## 6 sunflower weight      12   226   423   328   27.5  329.  48.8  14.1  31.0

Non-parametrik

Jika data tidak memenuhi asumsi-asumsi uji parametrik, maka dapat menggunakan uji non-parametrik.

Uji non-parametrik yang dapat digunakan sebagai alternatif uji ANOVA adalah uji Kruskal-Wallis.

# Uji Kruskal-Wallis
uji.kruskal <- kruskal.test(weight ~ feed, data = chickwts)
uji.kruskal
## 
##  Kruskal-Wallis rank sum test
## 
## data:  weight by feed
## Kruskal-Wallis chi-squared = 37.343, df = 5, p-value = 5.113e-07
# Uji lanjut (uji Dunn)
uji.dunn <- chickwts %>% 
  dunn_test(weight ~ feed, p.adjust.method = "bonferroni") 
uji.dunn
## # A tibble: 15 × 9
##    .y.    group1    group2       n1    n2 statistic       p   p.adj p.adj.signif
##  * <chr>  <chr>     <chr>     <int> <int>     <dbl>   <dbl>   <dbl> <chr>       
##  1 weight casein    horsebean    12    10    -4.81  1.49e-6 2.23e-5 ****        
##  2 weight casein    linseed      12    12    -3.31  9.39e-4 1.41e-2 *           
##  3 weight casein    meatmeal     12    11    -1.42  1.57e-1 1   e+0 ns          
##  4 weight casein    soybean      12    14    -2.50  1.24e-2 1.86e-1 ns          
##  5 weight casein    sunflower    12    12     0.183 8.55e-1 1   e+0 ns          
##  6 weight horsebean linseed      10    12     1.66  9.72e-2 1   e+0 ns          
##  7 weight horsebean meatmeal     10    11     3.36  7.68e-4 1.15e-2 *           
##  8 weight horsebean soybean      10    14     2.60  9.27e-3 1.39e-1 ns          
##  9 weight horsebean sunflower    10    12     4.99  6.12e-7 9.17e-6 ****        
## 10 weight linseed   meatmeal     12    11     1.82  6.88e-2 1   e+0 ns          
## 11 weight linseed   soybean      12    14     0.933 3.51e-1 1   e+0 ns          
## 12 weight linseed   sunflower    12    12     3.49  4.81e-4 7.21e-3 **          
## 13 weight meatmeal  soybean      11    14    -0.974 3.30e-1 1   e+0 ns          
## 14 weight meatmeal  sunflower    11    12     1.59  1.11e-1 1   e+0 ns          
## 15 weight soybean   sunflower    14    12     2.69  7.15e-3 1.07e-1 ns

Session Info:

## R version 4.3.2 (2023-10-31 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## time zone: Asia/Jakarta
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] rstatix_0.7.2   multcomp_1.4-25 TH.data_1.1-2   MASS_7.3-60    
## [5] survival_3.5-7  mvtnorm_1.2-4   lattice_0.21-9  car_3.1-2      
## [9] carData_3.0-5  
## 
## loaded via a namespace (and not attached):
##  [1] Matrix_1.6-5      jsonlite_1.8.8    dplyr_1.1.4       compiler_4.3.2   
##  [5] highr_0.10        tidyselect_1.2.0  tidyr_1.3.1       jquerylib_0.1.4  
##  [9] splines_4.3.2     yaml_2.3.8        fastmap_1.1.1     R6_2.5.1         
## [13] generics_0.1.3    knitr_1.45        backports_1.4.1   tibble_3.2.1     
## [17] bslib_0.6.1       pillar_1.9.0      rlang_1.1.3       utf8_1.2.4       
## [21] cachem_1.0.8      broom_1.0.5       xfun_0.42         sass_0.4.8       
## [25] cli_3.6.2         withr_3.0.0       magrittr_2.0.3    digest_0.6.34    
## [29] grid_4.3.2        rstudioapi_0.15.0 sandwich_3.1-0    lifecycle_1.0.4  
## [33] vctrs_0.6.5       evaluate_0.23     glue_1.7.0        codetools_0.2-19 
## [37] zoo_1.8-12        abind_1.4-5       fansi_1.0.6       purrr_1.0.2      
## [41] rmarkdown_2.25    pkgconfig_2.0.3   tools_4.3.2       htmltools_0.5.7