Tukey Test Vs Kruskal-Wallis Rank Sum Test

1 Cost differences between housing options in Stockholm

1.1 Log Transformation of housing cost

##    Person Housing Log_MonthlyCost_SEK
## 1       1  Rental            8.070906
## 2       2  Rental            8.517193
## 3       3  Rental            8.294050
## 4       4  Rental            8.160518
## 5       5  Rental            7.972466
## 6       6  Rental            8.411833
## 7       7  Rental            8.366370
## 8       8  Rental            7.600902
## 9       9  Rental            8.131531
## 10     10  Rental            7.972466
## 11     11   Condo            8.853665
## 12     12   Condo            8.987197
## 13     13   Condo            9.392662
## 14     14   Condo            9.024011
## 15     15   Condo            8.974618
## 16     16   Condo            9.350102
## 17     17   Condo            9.104980
## 18     18   Condo            9.392662
## 19     19   Condo            9.126959
## 20     20   Condo            9.210340
## 21     21   Co_op            8.006368
## 22     22   Co_op            7.313220
## 23     23   Co_op            7.600902
## 24     24   Co_op            6.907755
## 25     25   Co_op            6.802395
## 26     26   Co_op            7.824046
## 27     27   Co_op            7.740664
## 28     28   Co_op            8.006368
## 29     29   Co_op            7.937375
## 30     30   Co_op            6.907755
##            mean        sd data:n
## Co_op  7.504685 0.4830513     10
## Condo  9.141720 0.1898171     10
## Rental 8.149824 0.2671258     10
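
The group summary above can be cross-checked from the listed values. The following is a minimal sketch in plain Python (the report's output comes from R; the variable names are illustrative, not the author's). It recomputes each group's mean and sample standard deviation, and back-transforms one value under the assumption that the transformation is the natural logarithm of the monthly cost in SEK.

```python
import math
import statistics

# Log monthly costs copied from the table above, grouped by housing type.
log_cost = {
    "Rental": [8.070906, 8.517193, 8.294050, 8.160518, 7.972466,
               8.411833, 8.366370, 7.600902, 8.131531, 7.972466],
    "Condo": [8.853665, 8.987197, 9.392662, 9.024011, 8.974618,
              9.350102, 9.104980, 9.392662, 9.126959, 9.210340],
    "Co_op": [8.006368, 7.313220, 7.600902, 6.907755, 6.802395,
              7.824046, 7.740664, 8.006368, 7.937375, 6.907755],
}

# Mean and sample standard deviation (n - 1 denominator) per group.
summary = {
    group: (statistics.mean(values), statistics.stdev(values), len(values))
    for group, values in log_cost.items()
}

for group in sorted(summary):
    mean, sd, n = summary[group]
    print(f"{group:<6} mean={mean:.6f} sd={sd:.6f} n={n}")

# Assumption: natural-log transform, so exp() returns the cost in SEK.
first_rental_sek = math.exp(log_cost["Rental"][0])
print(f"Person 1 monthly cost: about {first_rental_sek:.0f} SEK")
```

Because the analysis is on the log scale, differences between group means correspond to ratios of typical costs rather than differences in SEK.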

1.2 Hypothesis Testing

  • Question: Are there differences in mean (log) housing cost between the housing options in Stockholm?
  • \(H_0: \mu(Rental) = \mu(Condo) = \mu(Co\_op)\)
  • \(H_a:\) Means are not all equal

1.3 ANOVA Model Fit

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Housing      2 13.600   6.800   59.87 1.19e-10 ***
## Residuals   27  3.067   0.114                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
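
The F value in this table can be rebuilt from the group summaries in section 1.1 alone. The sketch below is a hedged cross-check in plain Python, not the code that produced the report. It uses the standard one-way ANOVA decomposition into between-group and within-group sums of squares, and all names are illustrative.

```python
# Group summaries (mean, sd, n) as reported in section 1.1.
groups = {
    "Co_op": (7.504685, 0.4830513, 10),
    "Condo": (9.141720, 0.1898171, 10),
    "Rental": (8.149824, 0.2671258, 10),
}

n_total = sum(n for _, _, n in groups.values())
grand_mean = sum(mean * n for mean, _, n in groups.values()) / n_total

# Between-group and within-group (residual) sums of squares.
ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups.values())
ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within

print(f"Housing   Df={df_between} SumSq={ss_between:.3f} "
      f"MeanSq={ms_between:.3f} F={f_value:.2f}")
print(f"Residuals Df={df_within} SumSq={ss_within:.3f} MeanSq={ms_within:.3f}")
```

The p-value is left out because it requires the F distribution, which the Python standard library does not provide.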

1.4 Tukey Contrast Test

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = Log_MonthlyCost_SEK ~ Housing, data = Dataset)
## 
## Linear Hypotheses:
##                     Estimate Std. Error t value Pr(>|t|)    
## Condo - Co_op == 0    1.6370     0.1507  10.862  < 1e-04 ***
## Rental - Co_op == 0   0.6451     0.1507   4.281 0.000555 ***
## Rental - Condo == 0  -0.9919     0.1507  -6.581  < 1e-04 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
## 
## 
##   Simultaneous Confidence Intervals
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = Log_MonthlyCost_SEK ~ Housing, data = Dataset)
## 
## Quantile = 2.4804
## 95% family-wise confidence level
##  
## 
## Linear Hypotheses:
##                     Estimate lwr     upr    
## Condo - Co_op == 0   1.6370   1.2632  2.0109
## Rental - Co_op == 0  0.6451   0.2713  1.0190
## Rental - Condo == 0 -0.9919  -1.3657 -0.6181
## 
##  Co_op  Condo Rental 
##    "a"    "c"    "b"
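
The estimates, standard errors, t values and simultaneous confidence limits above follow from the group means, the residual mean square and the printed quantile. A minimal sketch in plain Python, assuming equal group sizes of 10 and taking the quantile 2.4804 from the output as given (the adjusted p-values need the multivariate t distribution and are not reproduced here):

```python
import math

# Group means from section 1.1; residual SS and df from the ANOVA table in 1.5.
means = {"Co_op": 7.504685, "Condo": 9.141720, "Rental": 8.149824}
n_per_group = 10
ms_residual = 3.0665 / 27   # residual sum of squares / residual df
quantile = 2.4804           # critical value printed in the output above

# Standard error of a difference between two group means (equal group sizes).
std_error = math.sqrt(ms_residual * (1 / n_per_group + 1 / n_per_group))

contrasts = {}
for a, b in [("Condo", "Co_op"), ("Rental", "Co_op"), ("Rental", "Condo")]:
    estimate = means[a] - means[b]
    contrasts[f"{a} - {b}"] = {
        "estimate": estimate,
        "t": estimate / std_error,
        "lwr": estimate - quantile * std_error,
        "upr": estimate + quantile * std_error,
    }

for name, c in contrasts.items():
    print(f"{name:<15} est={c['estimate']:.4f} t={c['t']:.3f} "
          f"CI=({c['lwr']:.4f}, {c['upr']:.4f})")
```

None of the three intervals contains zero, which is why each group receives its own letter in the display above.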

1.5 Diagnostic tools to check for normality and fit of model

## Analysis of Variance Table
## 
## Response: Log_MonthlyCost_SEK
##           Df  Sum Sq Mean Sq F value    Pr(>F)    
## Housing    2 13.5998  6.7999  59.871 1.188e-10 ***
## Residuals 27  3.0665  0.1136                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

1.6 Conclusion

The Tukey contrasts indicate that \(\mu(Condo) - \mu(Co\_op)\), \(\mu(Rental) - \mu(Condo)\) and \(\mu(Rental) - \mu(Co\_op)\) are all highly significantly different from zero (adjusted p < 0.001 for each). Hence we reject the null hypothesis of equal means: mean log housing cost differs between all three housing options in Stockholm.

2 Effects of different stress factors on heart rate

2.1 Import data

  • Definition:
    • Cold Water = CW
    • Mental Stress = MS
    • Physical Exercise = PE
    • Heart Rate = HR
##    Treatment   Log_HR
## 1         MS 4.276666
## 2         MS 4.204693
## 3         MS 4.406719
## 4         MS 4.828314
## 5         MS 4.700480
## 6         MS 4.356709
## 7         MS 4.553877
## 8         MS 4.532599
## 9         MS 4.356709
## 10        MS 4.234107
## 11        CW 4.007333
## 12        CW 3.828641
## 13        CW 3.988984
## 14        CW 3.988984
## 15        CW 4.077537
## 16        CW 4.219508
## 17        CW 4.110874
## 18        CW 4.143135
## 19        CW 4.043051
## 20        CW 4.219508
## 21        PE 4.094345
## 22        PE 4.189655
## 23        PE 4.430817
## 24        PE 4.330733
## 25        PE 4.219508
## 26        PE 4.143135
## 27        PE 4.158883
## 28        PE 4.317488
## 29        PE 4.356709
## 30        PE 4.406719

2.2 Hypothesis Testing

  • \(H_0: \mu(CW) = \mu(MS) = \mu(PE)\)
  • \(H_a:\) Means are not all equal
##        mean        sd data:n
## CW 4.062756 0.1189262     10
## MS 4.445087 0.2053029     10
## PE 4.264799 0.1183438     10

2.3 ANOVA Model Fit

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Treatment    2 0.7317  0.3658   15.61 3.12e-05 ***
## Residuals   27 0.6327  0.0234                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

2.4 Tukey Contrast Test

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = Log_HR ~ Treatment, data = Dataset)
## 
## Linear Hypotheses:
##              Estimate Std. Error t value Pr(>|t|)    
## MS - CW == 0  0.38233    0.06846   5.585   <0.001 ***
## PE - CW == 0  0.20204    0.06846   2.951   0.0173 *  
## PE - MS == 0 -0.18029    0.06846  -2.634   0.0357 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
## 
## 
##   Simultaneous Confidence Intervals
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = Log_HR ~ Treatment, data = Dataset)
## 
## Quantile = 2.4788
## 95% family-wise confidence level
##  
## 
## Linear Hypotheses:
##              Estimate lwr      upr     
## MS - CW == 0  0.38233  0.21264  0.55202
## PE - CW == 0  0.20204  0.03235  0.37173
## PE - MS == 0 -0.18029 -0.34998 -0.01060
## 
##  CW  MS  PE 
## "a" "c" "b"

2.5 Diagnostic tools to check for normality and fit of model

## Analysis of Variance Table
## 
## Response: Log_HR
##           Df  Sum Sq Mean Sq F value    Pr(>F)    
## Treatment  2 0.73168 0.36584  15.612 3.122e-05 ***
## Residuals 27 0.63268 0.02343                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
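
Two simple numerical checks can accompany this table. The sketch below, in plain Python with illustrative names, computes the residuals (observation minus group mean), confirms that their sum of squares matches the Residuals row, and reports the ratio of the largest to the smallest group standard deviation. Treating a ratio below about 2 as acceptable for the equal-variance assumption is a common rule of thumb and an assumption here, not something the report states.

```python
import statistics

# Log heart rates copied from the table in section 2.1.
log_hr = {
    "MS": [4.276666, 4.204693, 4.406719, 4.828314, 4.700480,
           4.356709, 4.553877, 4.532599, 4.356709, 4.234107],
    "CW": [4.007333, 3.828641, 3.988984, 3.988984, 4.077537,
           4.219508, 4.110874, 4.143135, 4.043051, 4.219508],
    "PE": [4.094345, 4.189655, 4.430817, 4.330733, 4.219508,
           4.143135, 4.158883, 4.317488, 4.356709, 4.406719],
}

# Residual = observation minus its own group mean (the one-way ANOVA fit).
residuals = {
    group: [value - statistics.mean(values) for value in values]
    for group, values in log_hr.items()
}

# Residual sum of squares; should agree with the "Residuals" row above.
ss_residual = sum(r ** 2 for rs in residuals.values() for r in rs)

# Rough equal-variance check: largest group sd divided by smallest group sd.
sds = {group: statistics.stdev(values) for group, values in log_hr.items()}
sd_ratio = max(sds.values()) / min(sds.values())

print(f"Residual SS = {ss_residual:.5f}")
print(f"Largest/smallest sd = {sd_ratio:.2f}")
```

The `residuals` dictionary is also what a normal Q-Q plot or a residuals-versus-fitted plot would be drawn from.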

2.6 Conclusion

The Tukey contrasts indicate that \(\mu(PE) - \mu(MS)\), \(\mu(PE) - \mu(CW)\) and \(\mu(MS) - \mu(CW)\) are all significantly different from zero at the 5% family-wise level. Hence we reject the null hypothesis of equal means: the three stressors differ in their effect on (log) heart rate.

3 Chlorophyll-α concentrations (μg/l) in the different basins of the Baltic Sea

3.1 Import data

  • LARGE_basin Definition:
    • Arkona = south-western parts of the Baltic Sea including Öresund
    • Baltic Proper = from Bornholm to Ålands sea
    • Bothnian = northern part from Ålands sea to Bothnian bay
    • Western = Kattegatt and Skagerrak
##        Parameter Log_Chlorophyll_a  Basin LARGE_basin
## 1  Chlorophyll_a        -1.6094379 Arkona      Arkona
## 2  Chlorophyll_a        -1.3862944 Arkona      Arkona
## 3  Chlorophyll_a        -1.2039728 Arkona      Arkona
## 4  Chlorophyll_a        -1.2039728 Arkona      Arkona
## 5  Chlorophyll_a        -1.2039728 Arkona      Arkona
## 6  Chlorophyll_a        -1.2039728 Arkona      Arkona
## 7  Chlorophyll_a        -1.2039728 Arkona      Arkona
## 8  Chlorophyll_a        -1.2039728 Arkona      Arkona
## 9  Chlorophyll_a        -1.2039728 Arkona      Arkona
## 10 Chlorophyll_a        -1.2039728 Arkona      Arkona
## 11 Chlorophyll_a        -1.2039728 Arkona      Arkona
## 12 Chlorophyll_a        -1.2039728 Arkona      Arkona
## 13 Chlorophyll_a        -1.2039728 Arkona      Arkona
## 14 Chlorophyll_a        -0.9162907 Arkona      Arkona
## 15 Chlorophyll_a        -0.9162907 Arkona      Arkona
## 16 Chlorophyll_a        -0.9162907 Arkona      Arkona
## 17 Chlorophyll_a        -0.9162907 Arkona      Arkona
## 18 Chlorophyll_a        -0.9162907 Arkona      Arkona
## 19 Chlorophyll_a        -0.9162907 Arkona      Arkona
## 20 Chlorophyll_a        -0.6931472 Arkona      Arkona
## 21 Chlorophyll_a        -0.6931472 Arkona      Arkona
## 22 Chlorophyll_a        -0.6931472 Arkona      Arkona
## 23 Chlorophyll_a        -0.6931472 Arkona      Arkona
##  [ reached 'max' / getOption("max.print") -- omitted 7 rows ]

3.2 Hypothesis Testing

  • \(H_0:\) Mean Chlorophyll-α concentrations in the different basins of the Baltic Sea are all equal
  • \(H_a:\) Means are not all equal
##                   mean        sd data:n
## Arkona       0.4786409 0.6995542    347
## BalticProper 1.3570184 0.9928270   8056
## Bothnian     0.6412756 0.9554050   3323
## Western      0.5141939 0.9358696   1630
##                          mean        sd data:n
## Alands hav          1.1074110 0.8341088     51
## Arkona              0.4791835 0.7013780    345
## Bornholm_Hano       0.6488659 0.6605049    222
## Bottenhavet         0.7728728 0.9082209   1231
## Bottenviken         0.3978281 0.9881547   1257
## Kattegatt           0.5816730 0.9349497    758
## Norra Gotlandshavet 1.5992880 0.9183594   6154
## Norra Kvarken       0.7946486 0.8978374    784
## Oresund             0.3850541 0.2867071      2
## Skagerrak           0.4555366 0.9332449    872
## West Gotlandshavet  0.5631388 0.8177435   1680

3.3 ANOVA Model Fit

##                Df Sum Sq Mean Sq F value Pr(>F)    
## LARGE_basin     3   1896   631.9   671.3 <2e-16 ***
## Residuals   13352  12568     0.9                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.4 Tukey Contrast Test: LARGE_basin

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = Log_Chlorophyll_a ~ LARGE_basin, data = Dataset)
## 
## Linear Hypotheses:
##                              Estimate Std. Error t value Pr(>|t|)    
## BalticProper - Arkona == 0    0.87838    0.05319  16.513   <0.001 ***
## Bothnian - Arkona == 0        0.16263    0.05474   2.971   0.0136 *  
## Western - Arkona == 0         0.03555    0.05736   0.620   0.9189    
## Bothnian - BalticProper == 0 -0.71574    0.02000 -35.782   <0.001 ***
## Western - BalticProper == 0  -0.84282    0.02635 -31.986   <0.001 ***
## Western - Bothnian == 0      -0.12708    0.02934  -4.332   <0.001 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
## 
## 
##   Simultaneous Confidence Intervals
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = Log_Chlorophyll_a ~ LARGE_basin, data = Dataset)
## 
## Quantile = 2.5207
## 95% family-wise confidence level
##  
## 
## Linear Hypotheses:
##                              Estimate lwr      upr     
## BalticProper - Arkona == 0    0.87838  0.74429  1.01246
## Bothnian - Arkona == 0        0.16263  0.02466  0.30061
## Western - Arkona == 0         0.03555 -0.10903  0.18014
## Bothnian - BalticProper == 0 -0.71574 -0.76616 -0.66532
## Western - BalticProper == 0  -0.84282 -0.90925 -0.77640
## Western - Bothnian == 0      -0.12708 -0.20104 -0.05313
## 
##       Arkona BalticProper     Bothnian      Western 
##          "a"          "c"          "b"          "a"
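
The letters summarise the pairwise results: basins that share a letter are not significantly different from each other. The sketch below shows a simplified way to derive them in plain Python. It assumes that "not significantly different" behaves transitively, so the groups can simply be merged. A full compact letter display also handles overlapping groups that need more than one letter, which this sketch does not attempt.

```python
# Group means (section 3.2) and the Tukey pairs that were NOT significant
# at the 5% family-wise level (section 3.4: only Western - Arkona).
means = {
    "Arkona": 0.4786409,
    "BalticProper": 1.3570184,
    "Bothnian": 0.6412756,
    "Western": 0.5141939,
}
not_significant = [("Western", "Arkona")]

# Union-find: merge groups linked by a non-significant comparison.
parent = {group: group for group in means}

def find(group):
    """Return the representative of the merged set containing `group`."""
    while parent[group] != group:
        group = parent[group]
    return group

for a, b in not_significant:
    parent[find(a)] = find(b)

# Hand out letters in order of increasing mean, one letter per merged set.
alphabet = "abcdefghijklmnopqrstuvwxyz"
letter_of_root = {}
letters = {}
for group in sorted(means, key=means.get):
    root = find(group)
    if root not in letter_of_root:
        letter_of_root[root] = alphabet[len(letter_of_root)]
    letters[group] = letter_of_root[root]

print(letters)
```

With these inputs the sketch reproduces the assignment shown above: Arkona and Western share a letter, while Bothnian and BalticProper each get their own.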

3.5 Diagnostic tools to check for normality and fit of model: LARGE_basin

## Analysis of Variance Table
## 
## Response: Log_Chlorophyll_a
##                Df  Sum Sq Mean Sq F value    Pr(>F)    
## LARGE_basin     3  1895.6  631.86  671.26 < 2.2e-16 ***
## Residuals   13352 12568.3    0.94                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  [1] "1"     "2"     "3"     "4"     "5"     "6"     "7"     "8"     "9"    
## [10] "10"    "348"   "349"   "350"   "351"   "352"   "353"   "354"   "355"  
## [19] "356"   "357"   "8401"  "8402"  "8403"  "8404"  "8405"  "8406"  "8407" 
## [28] "8408"  "8409"  "8410"  "8411"  "8412"  "8413"  "11726" "11725" "11724"
## [37] "11723" "11722" "11721" "11720" "11719" "11716" "11717" "11727" "11728"
## [46] "11729" "11730" "11731" "11732" "11733" "11734" "11735" "11736" "13356"
## [55] "13355" "13354" "13353" "13352" "13350" "13351" "13349" "13348" "13347"

3.6 Kruskal-Wallis rank sum test

## 
##  Kruskal-Wallis rank sum test
## 
## data:  Log_Chlorophyll_a by LARGE_basin
## Kruskal-Wallis chi-squared = 1818.4, df = 3, p-value < 2.2e-16
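
The chlorophyll observations are not listed in full, so the statistic above cannot be recomputed from this report. To illustrate how the Kruskal-Wallis statistic is built, the sketch below applies the standard formula (mid-ranks plus tie correction) to the heart-rate data from section 2.1, chosen only because all 30 values are shown. Its result is an illustration, not part of the original analysis. The closed-form p-value holds only because three groups give 2 degrees of freedom.

```python
import math

# Log heart rates from section 2.1 (illustration only).
samples = {
    "MS": [4.276666, 4.204693, 4.406719, 4.828314, 4.700480,
           4.356709, 4.553877, 4.532599, 4.356709, 4.234107],
    "CW": [4.007333, 3.828641, 3.988984, 3.988984, 4.077537,
           4.219508, 4.110874, 4.143135, 4.043051, 4.219508],
    "PE": [4.094345, 4.189655, 4.430817, 4.330733, 4.219508,
           4.143135, 4.158883, 4.317488, 4.356709, 4.406719],
}

# Pool the observations and sort them, remembering each one's group.
pooled = sorted((v, g) for g, values in samples.items() for v in values)
n_total = len(pooled)

# Mid-ranks: tied values share the average of the ranks they occupy.
rank_sum = {group: 0.0 for group in samples}
tie_term = 0
i = 0
while i < n_total:
    j = i
    while j < n_total and pooled[j][0] == pooled[i][0]:
        j += 1
    mid_rank = (i + 1 + j) / 2   # average of the 1-based ranks i+1 .. j
    for _, group in pooled[i:j]:
        rank_sum[group] += mid_rank
    t = j - i
    tie_term += t ** 3 - t
    i = j

# H statistic, then the usual correction for ties.
h_raw = 12 / (n_total * (n_total + 1)) * sum(
    rank_sum[g] ** 2 / len(samples[g]) for g in samples
) - 3 * (n_total + 1)
h = h_raw / (1 - tie_term / (n_total ** 3 - n_total))

# For df = 2 the chi-squared upper tail is exactly exp(-h / 2).
df = len(samples) - 1
p_value = math.exp(-h / 2)

print(f"Kruskal-Wallis chi-squared = {h:.3f}, df = {df}, p-value = {p_value:.2e}")
```

Because the test works on ranks, it gives the same result on the log scale as on the original scale, which is one reason to run it alongside the ANOVA when normality is in doubt.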

3.7 Conclusion

  • The Tukey contrasts indicate that \(\mu(Western) - \mu(Arkona)\) is the only pairwise difference that is not statistically significantly different from zero.
  • The Kruskal-Wallis rank sum test indicates that at least one basin's concentration distribution stochastically dominates that of at least one other basin.
  • Hence the large basins do not all have the same mean (log) Chlorophyll-α concentration.

DK WC

2020-01-20