Assignment 10

library(tidyr)
library(dplyr)
library(MASS)

1a: Write the linear effects equation and the hypothesis you are testing.

y_ij = μ + τ_i + e_ij, where i = 1, 2, 3, 4 and j = 1, 2, 3, 4, 5, 6

H_0 : µ_1 = µ_2 = µ_3 = µ_4, or equivalently H_0 : τ_i = 0 for all i

H_a : at least one µ_i differs from the others, or equivalently H_a : τ_i != 0 for at least one i
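
To make the terms of the effects model concrete, the sketch below estimates μ and each τ_i from the assignment's data. It assumes the usual sum-to-zero constraint on the τ_i (not stated in the write-up) and enters the data directly in long form with base R, so the names `value`, `name`, `mu_hat`, `mu_i`, and `tau_hat` are illustrative only.

```r
# Data from the assignment, entered directly in long form (base R only).
value <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
           0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
           6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
           17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
name <- factor(rep(paste0("method", 1:4), each = 6))

mu_hat  <- mean(value)                # estimate of the overall mean mu
mu_i    <- tapply(value, name, mean)  # estimates of the treatment means mu_i
tau_hat <- mu_i - mu_hat              # estimates of the effects tau_i (assumes sum(tau_i) = 0)

print(mu_hat)
print(round(tau_hat, 3))
print(sum(tau_hat))  # zero up to rounding error in a balanced design
```

Under H_0 every entry of `tau_hat` would be close to zero; the ANOVA F test in 1b asks whether the observed effects are larger than the within-group noise can explain.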


1b: Does it appear the data is normally distributed? Does it appear that the variance is constant?

method1 <- c(.34,.12,1.23,.70,1.75,.12)
method2 <- c(.91,2.94,2.14,2.36,2.86,4.55)
method3 <- c(6.31,8.37,9.75,6.09,9.82,7.24)
method4 <- c(17.15,11.82,10.97,17.20,14.35,16.82)

methods <- data.frame(method1,method2,method3,method4)
method_long <- pivot_longer(methods,c(method1,method2,method3,method4))

aov.model<-aov(value~name,data=method_long)
summary(aov.model)
##             Df Sum Sq Mean Sq F value Pr(>F)    
## name         3  708.7   236.2   76.29  4e-11 ***
## Residuals   20   61.9     3.1                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(aov.model)

qqnorm(method_long$value)


Both the Normal Q-Q plot of the observations and the Q-Q plot of the residuals depart from a straight line, so the data do not appear to be normally distributed.
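
The normality assumption in the model applies to the errors e_ij, so the residuals are the more direct thing to check; a Q-Q plot of the pooled observations also mixes in the differences between group means. As a hedged supplement to the visual check, the sketch below draws a residual Q-Q plot with a reference line and runs a Shapiro-Wilk test. The Shapiro-Wilk test is an addition not used in the original analysis, and the data are re-entered with base R so the block runs on its own.

```r
# Data from the assignment, entered directly in long form (base R only).
value <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
           0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
           6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
           17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
name <- factor(rep(paste0("method", 1:4), each = 6))

fit <- aov(value ~ name)   # same one-way model as aov.model
res <- residuals(fit)      # estimated errors e_ij

# Q-Q plot of the residuals with a reference line through the quartiles.
qqnorm(res, main = "Normal Q-Q Plot of Residuals")
qqline(res)

# Formal test of normality of the residuals (supplementary, not in the original).
sw <- shapiro.test(res)
print(sw)
```

With only 24 observations the test has limited power, so it is best read alongside the plot rather than in place of it.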


boxplot(method_long$value~method_long$name,xlab="population",ylab="observation",main="Boxplot of Observations")

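From the boxplots, the spread of the observations grows with the level of the response (method1 has the narrowest range and method4 the widest), so the variance does not appear to be constant.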
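
The visual impression from the boxplots can be backed up with the sample variance of each group. This is a minimal sketch using base R; the names `group_var` and `var_ratio` are illustrative, and the data are re-entered so the block runs on its own.

```r
# Data from the assignment, entered directly in long form (base R only).
value <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
           0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
           6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
           17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
name <- factor(rep(paste0("method", 1:4), each = 6))

group_mean <- tapply(value, name, mean)  # sample mean of each method
group_var  <- tapply(value, name, var)   # sample variance of each method

print(round(cbind(mean = group_mean, variance = group_var), 3))

# Ratio of the largest to the smallest group variance; values far above 1
# point away from constant variance.
var_ratio <- max(group_var) / min(group_var)
print(var_ratio)
```

A variance that increases with the group mean is the pattern that a power transformation such as Box-Cox is meant to address.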



1c: (nonparametric) Perform a Kruskal-Wallis test in R (alpha=0.05)

kruskal.test(value~name,data=method_long)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  value by name
## Kruskal-Wallis chi-squared = 21.156, df = 3, p-value = 9.771e-05

The p-value = 9.771e-05 is less than alpha = 0.05. Therefore, we reject the null hypothesis and conclude that at least one method differs from the others.
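
To show where the Kruskal-Wallis statistic comes from, the sketch below recomputes it by hand from the ranks, including the standard correction for ties (the value 0.12 occurs twice in method1). The variable names are illustrative, and the data are re-entered with base R so the block runs on its own.

```r
# Data from the assignment, entered directly in long form (base R only).
value <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
           0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
           6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
           17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
name <- factor(rep(paste0("method", 1:4), each = 6))

N <- length(value)
r <- rank(value)               # ranks over all observations; ties get midranks
R <- tapply(r, name, sum)      # rank sum of each method
n <- tapply(r, name, length)   # sample size of each method

# Uncorrected statistic: H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
H <- 12 / (N * (N + 1)) * sum(R^2 / n) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group
ties <- table(value)
H <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with (number of groups - 1) degrees of freedom
p_value <- pchisq(H, df = nlevels(name) - 1, lower.tail = FALSE)

print(c(H = H, p_value = p_value))
print(kruskal.test(value ~ name))  # built-in test for comparison
```

Because the methods barely overlap, the rank sums are far apart, which is what drives the large statistic.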


1d: (parametric) Select an appropriate transformation using Box Cox, transform the data and test hypothesis in R (alpha=0.05)

boxcox(method_long$value~method_long$name)

lambda=.5  # lambda = 1 (no transformation) is outside the CI for lambda, so we transform; lambda = 0.5 is the square-root transformation.
method_long$value <- method_long$value^(lambda)
boxcox(method_long$value~method_long$name)

boxplot(method_long$value~method_long$name,xlab="population",ylab="observation",main="Boxplot of Observations")

aov.model2<-aov(value~name,data=method_long)
summary(aov.model2)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## name         3  32.69  10.898   81.17 2.27e-11 ***
## Residuals   20   2.69   0.134                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(aov.model2)

The p-value = 2.27e-11 is less than alpha = 0.05. Therefore, we reject the null hypothesis and conclude that, on the transformed scale, at least one method mean differs from the others.
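
The choice of lambda in 1d can also be read off numerically rather than from the plot alone. The sketch below is one way to do this, assuming the untransformed data: it evaluates the Box-Cox profile log-likelihood on a fine grid, takes the maximizing lambda, and approximates the 95% interval as the grid points whose log-likelihood lies within qchisq(0.95, 1)/2 of the maximum, the same cutoff drawn on the `boxcox` plot. The grid step of 0.01 and the names `bc`, `lambda_hat`, and `ci` are illustrative choices.

```r
library(MASS)

# Untransformed data from the assignment, entered in long form (base R only).
value <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
           0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
           6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
           17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
name <- factor(rep(paste0("method", 1:4), each = 6))

# Profile log-likelihood over a grid of lambda values, without drawing the plot.
bc <- boxcox(value ~ name, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Grid value of lambda that maximizes the log-likelihood.
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: grid points within qchisq(0.95, 1) / 2 of the maximum.
cutoff <- max(bc$y) - qchisq(0.95, df = 1) / 2
ci <- range(bc$x[bc$y >= cutoff])

print(lambda_hat)
print(ci)
```

If 1 falls outside `ci`, a transformation is indicated, and a convenient value inside the interval, such as 0.5 for the square root, is usually preferred over the exact maximizer because it is easier to interpret.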


Complete R Code

library(tidyr)
library(dplyr)
library(MASS)

## 1a: Write the linear effects equation and the hypothesis you are testing

### y_ij= μ + τ_i + e_ij ; where i=1,2,3,4, j=1,2,3,4,5,6
### H_0 : µ_1 = µ_2 = µ_3 = µ_4             or   H_0 : τ_i = 0 for all i
### H_a : at least one µ_i is different     or   H_a : τ_i != 0 for at least one i



## 1b: Does it appear the data is normally distributed? Does it appear that the variance is constant?

method1 <- c(.34,.12,1.23,.70,1.75,.12)
method2 <- c(.91,2.94,2.14,2.36,2.86,4.55)
method3 <- c(6.31,8.37,9.75,6.09,9.82,7.24)
method4 <- c(17.15,11.82,10.97,17.20,14.35,16.82)

methods <- data.frame(method1,method2,method3,method4)
method_long <- pivot_longer(methods,c(method1,method2,method3,method4))

aov.model<-aov(value~name,data=method_long)
summary(aov.model)

plot(aov.model)
qqnorm(method_long$value)

boxplot(method_long$value~method_long$name,xlab="population",ylab="observation",main="Boxplot of Observations")

## 1c: (nonparametric) Perform a Kruskal-Wallis test in R (alpha=0.05)
kruskal.test(value~name,data=method_long)


## 1d: (parametric) Select an appropriate transformation using Box Cox, transform the data and test hypothesis in R (alpha=0.05)
boxcox(method_long$value~method_long$name)
lambda=.5  # lambda = 1 (no transformation) is outside the CI for lambda, so we transform; lambda = 0.5 is the square-root transformation.
method_long$value <- method_long$value^(lambda)
boxcox(method_long$value~method_long$name)
boxplot(method_long$value~method_long$name,xlab="population",ylab="observation",main="Boxplot of Observations")

aov.model2<-aov(value~name,data=method_long)
summary(aov.model2)
plot(aov.model2)