#data
method1 <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12)
method2 <- c(0.91, 2.94, 2.14, 2.36, 2.86, 4.55)
method3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
method4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
#datset
discharge <- c(method1, method2, method3, method4)
method <- rep(1:4, each = 6)
Model: Y = μ + τ + ε
Where: - Y = discharge measurement - μ = overall mean - τ = treatment effect (method effect) - ε = random error
Hypotheses: - H0: τ1 = τ2 = τ3 = τ4 = 0 (all methods give same average) - Ha: At least one τi ≠0 (at least one method differs)
#boxplots
boxplot(discharge ~ method,
xlab = "method",
ylab = "discharge",
main = "discharge by Estimation Method")
#variance for each method
var(method1)
## [1] 0.43704
var(method2)
## [1] 1.421347
var(method3)
## [1] 2.71284
var(method4)
## [1] 7.814937
Answer: The data does not appear to be normally distributed, and the variance is not constant. From the boxplot, we can see that the variability increases as the method number increase, method 1 has very little variability, method 4 has much greater variability, and there even are outliers visible in method 2
#fit the ANOVA model
model <- aov(discharge ~ factor(method))
# results
summary(model)
## Df Sum Sq Mean Sq F value Pr(>F)
## factor(method) 3 708.7 236.2 76.29 4e-11 ***
## Residuals 20 61.9 3.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Residual plots
plot(model, which = 1)
plot(model, which = 2)
Answer: The results indicate that the ANOVA assumptions are violated. The data shows non-constant variance and non-normality, which suggests a transformation or nonparametric test is needed
library(MASS)
#transformation
boxcox(discharge ~ factor(method))
#log transformation
log_discharge <- log(discharge)
#model
model_log <- aov(log_discharge ~ factor(method))
summary(model_log)
## Df Sum Sq Mean Sq F value Pr(>F)
## factor(method) 3 42.50 14.168 33.44 5.49e-08 ***
## Residuals 20 8.47 0.424
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#plotting
plot(model_log, which = 1)
plot(model_log, which = 2)
# 10e
# nonparametric test
kruskal.test(discharge ~ factor(method))
##
## Kruskal-Wallis rank sum test
##
## data: discharge by factor(method)
## Kruskal-Wallis chi-squared = 21.156, df = 3, p-value = 9.771e-05
{r eval=FALSE}