Part A: Write the linear effects equations and the hypothesis you are testing.

Linear effect equation is \[ y_{i j} = \mu + \tau_i + \varepsilon_{ij} \]

The hypothesis are \[ H_0 : \mu_1 = \mu_2 = \mu_3 = \mu_4 = 0 \] \[ H_a : \text{not all } \mu_i \text{ are equal} \]

Part B: Does it appear the data is normally distributed? Does it appear that the variance is constant?

Obs <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12, 0.91, 2.94, 2.14, 2.36, 2.86, 4.55, 6.31, 8.37, 9.75, 6.09, 9.82, 7.24,17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
Method <- c(rep(1,6),rep(2,6),rep(3,6),rep(4,6))

df <- data.frame(Method, Obs)

boxplot(Obs~Method)

Answer: It is not normally distributed. Method 1 has smallest variance, method 4 has largest variance, The boxplot shows that the variance is not constant

Part C: Fit the model using aov() in R on the raw data. Examine the residuals.

model <- aov(Obs~Method, data = df)
summary(model)

##             Df Sum Sq Mean Sq F value  Pr(>F)    
## Method       1  672.0   672.0   149.9 2.7e-11 ***
## Residuals   22   98.6     4.5                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(model)

Answer: The p-value is very small, and also from the plot the residues was fitted, the variance is violated assumption, we can use transformation to fix it.

Part D: Select an appropriate transformation using Box Cox, transform the data and test hypothesis in R (alpha=0.05).

Step 1: finding lambda for the transformation

library(MASS)
boxcox(Obs~Method, data = df)

Step 2: choose lambda as 0.5, because the highest point on x_axis is 0.5.

trans <- (sqrt(Obs)-1)/0.5
df2 <- data.frame(trans,Method)
model2 <- aov(trans~Method, data = df2)
summary(model2)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Method       1  130.1  130.12   251.2 1.62e-13 ***
## Residuals   22   11.4    0.52                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(model2)

Answer: P-value is much smaller, we reject the null hypothesis.

Part E: Perform a Kruskal-Wallace test in R (alpha=0.05)

kruskal.test(Obs~Method, data = df)

## 
##  Kruskal-Wallis rank sum test
## 
## data:  Obs by Method
## Kruskal-Wallis chi-squared = 21.156, df = 3, p-value = 9.771e-05

Answer: The Four estimation methods produce significantly different peak discharge estimate.

The completed Code

Obs <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12, 0.91, 2.94, 2.14, 2.36, 2.86, 4.55, 6.31, 8.37, 9.75, 6.09, 9.82, 7.24,17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
Method <- c(rep(1,6),rep(2,6),rep(3,6),rep(4,6))

df <- data.frame(Method, Obs)

boxplot(Obs~Method)

model <- aov(Obs~Method, data = df)
summary(model)
plot(model)

library(MASS)
boxcox(Obs~Method, data = df)
trans <- (sqrt(Obs)-1)/0.5
df2 <- data.frame(trans,Method)
model2 <- aov(trans~Method, data = df2)
summary(model2)
plot(model2)

kruskal.test(Obs~Method, data = df)

Assignement 10

Renkun Wang, Sams Zubayer Mridha, Shubham Singh

2025-10-07