1 Question 3.23

1.1 Item a

We want to test the hypotheses:

\[ H_0:\mu_1=\mu_2=\mu_3=\mu_4 \\ H_a:\mu_i \neq \mu_j \text{ for at least one pair } (i,j)\] From the boxplot, the variances seem fairly equal across the four fluid types. Therefore, we can proceed with the ANOVA test.

F1 <- c(17.6,18.9,16.3,17.4,20.1,21.6)
F2 <- c(16.9,15.3,18.6,17.1,19.5,20.3)
F3 <- c(21.4,23.6,19.4,18.5,20.5,22.3)
F4 <- c(19.3,21.1,16.9,17.5,18.3,19.8)
boxplot(F1,F2,F3,F4)
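
As a complementary check of the equal-variance impression from the boxplot (a hedged sketch, not part of the original solution), Bartlett's test can be applied directly to the four samples; a large p-value would be consistent with homogeneous variances.

# Bartlett's test of homogeneity of variances across the four fluid types
bartlett.test(list(F1, F2, F3, F4))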

We can see that, for an \(\alpha\) of 0.05, the p-value (0.0525) is slightly greater than the significance level, so the null hypothesis is not rejected.

library(tidyr)
dat <- data.frame(F1,F2,F3,F4)

dat_org <- dat
dat_org <- pivot_longer(dat_org, c(F1,F2,F3,F4))

aov.model <- aov(value~name,data=dat_org)
summary(aov.model)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## name         3  30.16   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

1.2 Item b

Based on the test statistics alone, there are no significant differences among the fluid types, so we cannot choose the correct fluid on statistical grounds. However, visual inspection of the boxplots suggests that fluid type 3 tends to give higher life values than the others.
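
To complement this (a hedged sketch, not part of the original analysis), Tukey's HSD pairwise comparisons on the fitted model quantify which pairs of fluid types, if any, differ.

# Pairwise comparisons of fluid-type means with Tukey's HSD
TukeyHSD(aov.model)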

1.3 Item c

From the residual analysis, the variance appears to be constant and the basic analysis of variance assumptions are satisfied.

plot(aov.model,1)
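
A normal probability plot of the residuals (a hedged addition, using the same diagnostic applied later in this document) can also be examined to check the normality assumption.

# Normal Q-Q plot of the residuals to check normality
plot(aov.model, 2)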

2 Question 3.28

2.1 Item a

We want to test the hypotheses: \[ H_0:\mu_1=\mu_2=\mu_3=\mu_4=\mu_5 \\ H_a:\mu_i \neq \mu_j \text{ for at least one pair } (i,j)\] Performing the ANOVA test, we conclude that there are significant differences in mean failure time among the five materials.

mat1 <- c(110,157,194,178)
mat2 <- c(1,2,4,18)
mat3 <- c(880,1256,5276,4355)
mat4 <- c(495,7040,5307,10050)
mat5 <- c(7,5,29,2)
dat <- data.frame(mat1,mat2,mat3,mat4,mat5)

dat_org <- dat
dat_org <- pivot_longer(dat_org, c(mat1,mat2,mat3,mat4,mat5))

aov.model <- aov(value~name,data=dat_org)
summary(aov.model)
##             Df    Sum Sq  Mean Sq F value  Pr(>F)   
## name         4 103191489 25797872   6.191 0.00379 **
## Residuals   15  62505657  4167044                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

2.2 Item b

The plot of residuals versus predicted values shows that the variance is not constant. Moreover, the normal probability plot indicates that the data are not normally distributed. This suggests that a transformation is necessary.

plot(aov.model,1)

plot(aov.model,2)
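
As a hedged illustration of the transformation suggested above (not part of the original solution, which turns to a non-parametric test instead), the ANOVA could be refit on log-transformed failure times, a common variance-stabilizing choice for lifetime data.

# Refit the ANOVA on log failure times and recheck the residual spread
dat_log <- dat_org
dat_log$value <- log(dat_log$value)
aov.log <- aov(value ~ name, data = dat_log)
plot(aov.log, 1)
summary(aov.log)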

2.3 Item c

Applying the non-parametric Kruskal-Wallis test, the resulting p-value suggests that we can reject the null hypothesis: the mean failure times differ significantly from one another.

kruskal.test(value~name,data=dat_org)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  value by name
## Kruskal-Wallis chi-squared = 16.873, df = 4, p-value = 0.002046
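
As a hedged follow-up (not in the original solution), pairwise Wilcoxon rank-sum tests with a multiplicity adjustment can indicate which materials drive the overall difference.

# Pairwise rank-based comparisons; the Holm adjustment is the default
pairwise.wilcox.test(dat_org$value, dat_org$name)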

3 Question 3.29

3.1 Item a

We want to test the hypotheses: \[ H_0:\mu_1=\mu_2=\mu_3 \\ H_a:\mu_i \neq \mu_j \text{ for at least one pair } (i,j)\] From the ANOVA test, there appears to be a difference between the means and the null hypothesis is rejected.

met1 <- c(31,10,21,4,1)
met2 <- c(62,40,24,30,35)
met3 <- c(53,27,120,97,68)
dat <- data.frame(met1,met2,met3)

dat_org <- dat
dat_org <- pivot_longer(dat_org, c(met1,met2,met3))

aov.model <- aov(value~name,data=dat_org)
summary(aov.model)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## name         2   8964    4482   7.914 0.00643 **
## Residuals   12   6796     566                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.2 Item b

The plot of residuals versus predicted responses shows that the constant variance assumption is not satisfied. The normal probability plot, however, shows a fairly normal distribution.

plot(aov.model,1)

plot(aov.model,2)

3.3 Item c

After applying the square-root transformation suggested by the Box-Cox plot (\(\lambda = 0.5\)), the variance appears somewhat more constant. Applying the ANOVA test to the transformed data, the null hypothesis is rejected even more strongly than before.

library(MASS)
boxcox(value~name,data=dat_org)
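
As a hedged sketch (not part of the original solution), the object returned by boxcox() can be stored to read off the numerically optimal \(\lambda\); a value near 0.5 motivates the square-root transformation used below.

# Store the profile and extract the lambda that maximizes the log-likelihood
bc <- boxcox(value ~ name, data = dat_org, plotit = FALSE)
bc$x[which.max(bc$y)]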

lambda <- 0.5
dat_org$value <- dat_org$value^(lambda)
aov.model <- aov(value~name,data=dat_org)
plot(aov.model,1)

summary(aov.model)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## name         2  63.90   31.95    9.84 0.00295 **
## Residuals   12  38.96    3.25                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

4 Questions 3.51 and 3.52

The Kruskal-Wallis test gives a p-value of 0.1015, which is noticeably larger than the 0.0525 from the ANOVA, so the null hypothesis is not close to rejection under the rank-based test. However, the boxplot suggests that fluid type 3 may differ from the others. The ANOVA p-value of 0.0525 places the null hypothesis near the edge of rejection, but the constant-variance and normality assumptions not being fully satisfied may have influenced that value. Therefore, the Kruskal-Wallis p-value of about 0.10 suggests that the null hypothesis may actually be true, even though the ANOVA puts it close to the rejection region.

F1 <- c(17.6,18.9,16.3,17.4,20.1,21.6)
F2 <- c(16.9,15.3,18.6,17.1,19.5,20.3)
F3 <- c(21.4,23.6,19.4,18.5,20.5,22.3)
F4 <- c(19.3,21.1,16.9,17.5,18.3,19.8)
dat <- data.frame(F1,F2,F3,F4)

dat_org <- dat
dat_org <- pivot_longer(dat_org, c(F1,F2,F3,F4))
kruskal.test(value~name,data=dat_org)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  value by name
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015