Item (a)

The hypotheses being tested are:

\[ H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu \]

\[ H_1: \text{at least one mean is different} \]

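The linear effects model is:

\[ y_{ij} = \mu + \tau_i + \epsilon_{ij}, \quad i = 1, \dots, 4, \; j = 1, \dots, 6 \]

where \(\mu\) is the overall mean, \(\tau_i\) is the effect of group \(i\), and \(\epsilon_{ij}\) is the random error of observation \(j\) in group \(i\).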
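The same hypotheses can be restated in terms of the group effects. The version below assumes the usual sum-to-zero constraint on the \(\tau_i\), which the report does not state explicitly, so it should be read as one common parameterization rather than the only one:

\[ \mu_i = \mu + \tau_i, \qquad \sum_{i=1}^{4} \tau_i = 0 \]

\[ H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0 \qquad H_1: \tau_i \neq 0 \text{ for at least one } i \]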

Item (b)

library(tidyr)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

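# Observations for the four groups (six per group)
em1 <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12)
em2 <- c(0.91, 2.94, 2.14, 2.36, 2.86, 4.55)
em3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
em4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
dat <- data.frame(em1, em2, em3, em4)
pop <- c(em1, em2, em3, em4)
# Group mean repeated for each observation in that group
meanx <- rep(c(mean(em1), mean(em2), mean(em3), mean(em4)), each = 6)
# Residuals: observation minus its group mean
res <- pop - meanx
# Normal Q-Q plot of the residuals
qqnorm(res)
qqline(res)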

plot(meanx,res,xlab="population average", ylab="residual",main="constant variance")

Conclusion: Judging from the Normal Q-Q plot, the residuals do not appear to be normally distributed. In addition, in the plot of residuals against group means, the vertical spread of the points is not approximately the same across groups. Therefore, we conclude that the data do not have constant variance.
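As a rough numerical companion to the plots, the spread can be summarised by the standard deviation within each group. The sketch below is not part of the original analysis: it re-enters the data from Item (b) so that it runs on its own, and the names `groups`, `group_mean`, `group_sd`, and `sd_ratio` are introduced here purely for illustration.

```r
# Data from Item (b), re-entered so the snippet runs on its own
groups <- list(
  em1 = c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12),
  em2 = c(0.91, 2.94, 2.14, 2.36, 2.86, 4.55),
  em3 = c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24),
  em4 = c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
)

# Mean and standard deviation within each group
group_mean <- sapply(groups, mean)
group_sd   <- sapply(groups, sd)
print(round(rbind(mean = group_mean, sd = group_sd), 3))

# Ratio of the largest to the smallest standard deviation;
# values well above 1 point to unequal variances
sd_ratio <- max(group_sd) / min(group_sd)
print(sd_ratio)
```

If the group standard deviations grow with the group means, a variance-stabilising transformation such as the one explored in Item (d) is a natural next step.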

Item (c)

# Reshape to long format: one row per observation, with columns name and value
dat <- pivot_longer(dat, c(em1, em2, em3, em4))
kruskal.test(value ~ name, data = dat)

## 
##  Kruskal-Wallis rank sum test
## 
## data:  value by name
## Kruskal-Wallis chi-squared = 21.156, df = 3, p-value = 9.771e-05

Conclusion: Since the p-value of the Kruskal-Wallis test (9.771e-05) is far smaller than \(\alpha\) = 0.05, we reject \(H_0\), which means that at least one mean is different.
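To make the reported statistic less of a black box, the following sketch recomputes the Kruskal-Wallis statistic from the pooled ranks, including the correction for ties (the value 0.12 occurs twice in the data). It re-enters the data from Item (b) in long form; the variable names are introduced here for illustration only.

```r
# Data from Item (b) in long form
value <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
           0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
           6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
           17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
name <- factor(rep(c("em1", "em2", "em3", "em4"), each = 6))

n <- length(value)
r <- rank(value)                    # pooled ranks; ties receive average ranks
rank_sum <- tapply(r, name, sum)    # rank sum per group
n_i <- tapply(r, name, length)      # group sizes

# Uncorrected statistic: 12 / (n (n + 1)) * sum(R_i^2 / n_i) - 3 (n + 1)
h_raw <- 12 / (n * (n + 1)) * sum(rank_sum^2 / n_i) - 3 * (n + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n),
# where t is the number of observations sharing each distinct value
ties <- table(value)
h <- h_raw / (1 - sum(ties^3 - ties) / (n^3 - n))

# Chi-squared approximation with k - 1 degrees of freedom
p_value <- pchisq(h, df = nlevels(name) - 1, lower.tail = FALSE)
print(c(H = h, p = p_value))
```

The result should agree with the `kruskal.test` output above (a chi-squared statistic of about 21.156 on 3 degrees of freedom).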

Item (d)

library(MASS)

## 
## Attaching package: 'MASS'

## The following object is masked from 'package:dplyr':
## 
##     select

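# Profile log-likelihood plot for the Box-Cox parameter lambda
boxcox(value ~ name, data = dat)

# lambda = 0.5 (read from the plot) corresponds to a square-root transformation
lambda <- 0.5
dat_transformed <- dat$value^lambda

dat2 <- data.frame(name = dat$name, value = dat_transformed)

# One-way ANOVA on the transformed response
model <- aov(value ~ name, data = dat2)
summary(model)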

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## name         3  32.69  10.898   81.17 2.27e-11 ***
## Residuals   20   2.69   0.134                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion: From the Box-Cox plot, we estimate \(\lambda \approx 0.5\), which corresponds to a square-root transformation.
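Reading \(\lambda\) off the plot can be cross-checked numerically. The sketch below assumes the MASS package is installed and re-enters the data from Item (b). It picks out the grid value of \(\lambda\) that maximises the profile log-likelihood returned by `boxcox`, along with an approximate 95% confidence interval. The names `long`, `bc`, `lambda_hat`, and `ci`, and the finer grid step of 0.01, are choices made here for illustration.

```r
library(MASS)

# Data from Item (b) in long form
value <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
           0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
           6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
           17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
name <- factor(rep(c("em1", "em2", "em3", "em4"), each = 6))
long <- data.frame(name, value)

# boxcox() returns the lambda grid (x) and the profile log-likelihood (y);
# plotit = FALSE suppresses the plot
bc <- boxcox(value ~ name, data = long,
             lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Grid value with the highest log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum
ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1) / 2])
print(lambda_hat)
print(ci)
```

If 0.5 falls inside the interval, rounding to the square-root transformation is a common and convenient choice.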

After transforming the data, we ran a one-way ANOVA, which gave a p-value of 2.27e-11 (< 0.001). Therefore, consistent with the nonparametric test, we reject the null hypothesis.
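A natural follow-up, not shown in the original analysis, is to check whether the transformation actually brought the data closer to the ANOVA assumptions. The sketch below uses base R only and re-enters the data from Item (b). It compares the ratio of the largest to the smallest within-group standard deviation before and after the square-root transformation, then draws a Q-Q plot of the residuals on the transformed scale. The helper `sd_ratio` and the other names are hypothetical and introduced here for illustration.

```r
# Data from Item (b) in long form
value <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
           0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
           6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
           17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
name <- factor(rep(c("em1", "em2", "em3", "em4"), each = 6))

# Square-root transformation (lambda = 0.5), as in Item (d)
value_sqrt <- value^0.5

# Largest-to-smallest ratio of within-group standard deviations
sd_ratio <- function(y, g) {
  s <- tapply(y, g, sd)
  max(s) / min(s)
}
ratio_raw  <- sd_ratio(value, name)
ratio_sqrt <- sd_ratio(value_sqrt, name)
print(c(raw = ratio_raw, sqrt = ratio_sqrt))

# Refit the ANOVA and inspect the residuals on the transformed scale
fit <- aov(value_sqrt ~ name)
res_sqrt <- residuals(fit)
qqnorm(res_sqrt)
qqline(res_sqrt)
```

A ratio closer to 1 on the transformed scale, together with a Q-Q plot that follows the reference line more closely, would support the assumptions behind the ANOVA.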

Flip Assignment 10

Felipe Zambrini Santos, Leonardo Tchen Hao Hang Wei, Gustavo Marin Paulon

2022-10-04
