method1 <- c(.34, .12, 1.23, .70, 1.75, .12)
method2 <- c(.91, 2.94, 2.14, 2.36, 2.86, 4.55)
method3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
method4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
methods <- c(method1, method2, method3, method4)
method <- data.frame(method1, method2, method3, method4)
method
## method1 method2 method3 method4
## 1 0.34 0.91 6.31 17.15
## 2 0.12 2.94 8.37 11.82
## 3 1.23 2.14 9.75 10.97
## 4 0.70 2.36 6.09 17.20
## 5 1.75 2.86 9.82 14.35
## 6 0.12 4.55 7.24 16.82
x <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Linear effects equation:
$x_{ij} = \mu + \tau_i + \epsilon_{ij}$
$x_{ij}$ = observation j in group i
$\mu$ = the grand mean
$\tau_i$ = treatment effect of group i
$\epsilon_{ij}$ = random error associated with observation j in group i
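As a rough illustration of the model terms, the grand mean and the treatment effects can be estimated directly from the four samples. The names `flow`, `group`, `mu_hat`, `tau_hat`, and `resid_ij` are introduced here for illustration only and are not part of the original analysis.

```r
# Flow-rate samples used throughout this analysis
method1 <- c(.34, .12, 1.23, .70, 1.75, .12)
method2 <- c(.91, 2.94, 2.14, 2.36, 2.86, 4.55)
method3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
method4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)

flow  <- c(method1, method2, method3, method4)
group <- factor(rep(paste0("method", 1:4), each = 6))

# Estimate of the grand mean (mu)
mu_hat <- mean(flow)

# Estimated treatment effects (tau_i): group mean minus grand mean
tau_hat <- tapply(flow, group, mean) - mu_hat

# Estimated errors (epsilon_ij): what is left after removing mu and tau_i
resid_ij <- flow - mu_hat - tau_hat[as.character(group)]

mu_hat
tau_hat
```

Because every group has six observations, the estimated treatment effects sum to zero, which is the usual constraint placed on the effects model.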
Hypothesis Testing:
Null Hypothesis H0: The means are all equal; $\mu_1 = \mu_2 = \mu_3 = \mu_4$
Alternative Hypothesis Ha: At least one of the means differs from the others.
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.1.3
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyr)
## Warning: package 'tidyr' was built under R version 4.1.3
method_new <- pivot_longer(method, c(method1, method2, method3, method4))
qqnorm(method_new$value)
qqline(method_new$value)
boxplot(method1, method2, method3, method4, main= "boxplot of methods", xlab = "methods", ylab = "flowrates (cubic.ft/sec)")
From the normal Q-Q plot, we see that the points do not follow the straight reference line closely. Hence, we cannot assume from this graphic that the data are normally distributed.
From the boxplot comparison, we see that the spread differs noticeably between the methods, so the variances do not appear to be equal across groups. The assumption of equal variance therefore does not hold in this experiment.
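The visual checks above could be supplemented with formal tests. The sketch below is an assumption about how one might do that and was not part of the original analysis: it applies `shapiro.test()` to the residuals of a one-way fit, where group differences are removed, and `bartlett.test()` to the group variances. Bartlett's test is itself sensitive to non-normality, so its result should be read with caution here.

```r
# Flow-rate samples used throughout this analysis
method1 <- c(.34, .12, 1.23, .70, 1.75, .12)
method2 <- c(.91, 2.94, 2.14, 2.36, 2.86, 4.55)
method3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
method4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)

flow  <- c(method1, method2, method3, method4)
group <- factor(rep(paste0("method", 1:4), each = 6))

# One-way fit, used here only to obtain residuals
fit <- aov(flow ~ group)

# Normality check on the residuals (Shapiro-Wilk)
sw <- shapiro.test(residuals(fit))

# Equal-variance check across the four methods (Bartlett)
bt <- bartlett.test(flow ~ group)

# Sample standard deviation of each method, for a direct comparison
group_sd <- tapply(flow, group, sd)

sw$p.value
bt$p.value
group_sd
```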
kruskal.test(value~name, method_new)
##
## Kruskal-Wallis rank sum test
##
## data: value by name
## Kruskal-Wallis chi-squared = 21.156, df = 3, p-value = 9.771e-05
From the Kruskal-Wallis test, we see that the p-value (9.771e-05) is lower than our significance level of 0.05. Hence, we reject the null hypothesis that the means are all equal.
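To make the reported statistic less of a black box, it can be reproduced by hand from the pooled ranks. This is a minimal sketch using the standard Kruskal-Wallis formula with the correction for ties (the value 0.12 appears twice in method1). The variable names are illustrative.

```r
# Flow-rate samples used throughout this analysis
method1 <- c(.34, .12, 1.23, .70, 1.75, .12)
method2 <- c(.91, 2.94, 2.14, 2.36, 2.86, 4.55)
method3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
method4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)

flow  <- c(method1, method2, method3, method4)
group <- factor(rep(paste0("method", 1:4), each = 6))

N <- length(flow)
r <- rank(flow)                  # pooled ranks; tied values get the average rank
R <- tapply(r, group, sum)       # rank sum of each method
n <- tapply(r, group, length)    # sample size of each method

# Uncorrected statistic: 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
H_raw <- 12 / (N * (N + 1)) * sum(R^2 / n) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group
ties <- table(flow)
H <- H_raw / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with (number of groups - 1) degrees of freedom
p_value <- pchisq(H, df = nlevels(group) - 1, lower.tail = FALSE)

H        # about 21.156, matching the kruskal.test() output above
p_value
```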
library(MASS)
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
?boxcox
## starting httpd help server ...
## done
boxcox(methods~x)
lambda = 0.5
method_modified <- method_new
method_modified$value <- (method_modified$value)^lambda
kruskal.test(value~name, data = method_modified)
##
## Kruskal-Wallis rank sum test
##
## data: value by name
## Kruskal-Wallis chi-squared = 21.156, df = 3, p-value = 9.771e-05
From the Box-Cox plot, we see that the 95% confidence interval for lambda, centered on the maximum of the log-likelihood, spans roughly 0.4 to 0.6. So we choose our lambda to be 0.5.
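Rather than reading lambda off the plot, the same information can be pulled from the object that `boxcox()` returns. The sketch below assumes the same `methods ~ x` model as above. The finer `lambda` grid and the names `lambda_hat`, `cutoff`, and `ci` are choices made here for illustration. The cutoff is the usual 95% profile-likelihood threshold, the maximum log-likelihood minus half the 0.95 quantile of a chi-squared distribution with one degree of freedom.

```r
library(MASS)

# Flow-rate samples and numeric group codes, as defined in this analysis
method1 <- c(.34, .12, 1.23, .70, 1.75, .12)
method2 <- c(.91, 2.94, 2.14, 2.36, 2.86, 4.55)
method3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
method4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
methods <- c(method1, method2, method3, method4)
x <- c(rep(1, 6), rep(2, 6), rep(3, 6), rep(4, 6))

# Profile log-likelihood on a fine lambda grid, without drawing the plot
bc <- boxcox(methods ~ x, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Lambda with the largest log-likelihood on the grid
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: all lambdas whose log-likelihood is above the cutoff
cutoff <- max(bc$y) - qchisq(0.95, df = 1) / 2
ci <- range(bc$x[bc$y > cutoff])

lambda_hat
ci
```

A convenient value inside the interval, such as 0.5 (a square-root transformation), is usually preferred over the exact maximizer because it is easier to interpret.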
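We re-run the non-parametric Kruskal-Wallis test on the transformed data. Our p-value (9.771e-05) is still much smaller than our significance level of 0.05. Hence, we still reject the null hypothesis that the mean discharges are the same for all the methods. The test statistic is identical to the one obtained before the transformation because the Kruskal-Wallis test depends only on the ranks of the observations, and raising positive values to a positive power does not change their order.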
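The identical output of the two Kruskal-Wallis runs can be checked directly by comparing the ranks before and after the power transformation. This is a small sketch. The names `flow`, `group`, and `flow_pow` are introduced here for illustration.

```r
# Flow-rate samples used throughout this analysis
method1 <- c(.34, .12, 1.23, .70, 1.75, .12)
method2 <- c(.91, 2.94, 2.14, 2.36, 2.86, 4.55)
method3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
method4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)

flow  <- c(method1, method2, method3, method4)
group <- factor(rep(paste0("method", 1:4), each = 6))

lambda   <- 0.5
flow_pow <- flow^lambda          # power-transformed flow rates

# The transformation preserves the ordering, so the pooled ranks are the same
same_ranks <- identical(rank(flow), rank(flow_pow))

# Kruskal-Wallis on the raw and on the transformed values
kw_raw <- kruskal.test(flow ~ group)
kw_pow <- kruskal.test(flow_pow ~ group)

same_ranks
c(raw = unname(kw_raw$statistic), transformed = unname(kw_pow$statistic))
```

A transformation of this kind would only change the conclusions of a method that uses the actual values, such as a one-way ANOVA. A rank-based test is unaffected by it.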
method1 <- c(.34, .12, 1.23, .70, 1.75, .12)
method2 <- c(.91, 2.94, 2.14, 2.36, 2.86, 4.55)
method3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
method4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
methods <- c(method1, method2, method3, method4)
method <- data.frame(method1, method2, method3, method4)
method
x <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
library(dplyr)
library(tidyr)
method_new <- pivot_longer(method, c(method1, method2, method3, method4))
View(method_new)
qqnorm(method_new$value)
qqline(method_new$value)
boxplot(method1, method2, method3, method4, main= "boxplot of methods", xlab = "methods", ylab = "flowrates (cubic.ft/sec)")
kruskal.test(value~name, method_new)
library(MASS)
?boxcox
boxcox(methods~x)
lambda = 0.5
method_modified <- method_new
method_modified$value <- (method_modified$value)^lambda
kruskal.test(value~name, data = method_modified)