3.23

a.

\(H_0: \mu_1=\mu_2 =\mu_3=\mu_4=\mu\)

\(H_a\): At least one \(\mu\) is not the same.

library(tidyr)
f1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
f2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
f3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
f4<-c(19.3,21.1,16.9,17.5,18.3,19.8)
dat <- data.frame(f1,f2,f3,f4)
dat2 <- pivot_longer(dat,c(f1,f2,f3,f4))
aov.model <- aov(value~name,data=dat2)
summary(aov.model)

##             Df Sum Sq Mean Sq F value Pr(>F)  
## name         3  30.16   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

p value is 0.0525 larger than 0.05, we fail to reject the null hypothesis. There is no indication that the fluids differ.

b.

I would choose fluid type 3 for longer life.

c.

pop <- c(f1,f2,f3,f4)
meanx <- c(rep(mean(f1),6),rep(mean(f2),6),rep(mean(f3),6),rep(mean(f4),6))
res <- pop-meanx
qqnorm(res)
qqline(res)

plot(meanx,res,xlab="population average", ylab="residual",main="variance")

The variance appear to be consistent.

3.28

a.

\(H_0: \mu_1=\mu_2 =\mu_3=\mu_4=\mu_5=\mu\)

\(H_a\): At least one \(\mu\) is not the same.

m1<-c(110, 157, 194, 178)
m2<-c(1, 2, 4, 18)
m3<-c(880, 1256, 5276, 4355)
m4<-c(495, 7040, 5307, 10050)
m5<-c(7, 5, 29, 2)
dat <- data.frame(m1,m2,m3,m4,m5)
library(tidyr)
dat2 <- pivot_longer(dat,c(m1,m2,m3,m4,m5))
aov.model <- aov(value~name, data=dat2)
summary(aov.model)

##             Df    Sum Sq  Mean Sq F value  Pr(>F)   
## name         4 103191489 25797872   6.191 0.00379 **
## Residuals   15  62505657  4167044                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P value = 0.00379 which is smaller than 0.01, so we can reject the H0. The five materials do not have the same effect on mean failure time.

b.

pop <- c(m1,m2,m3,m4,m5)
meanx <- c(rep(mean(m1),4),rep(mean(m2),4),rep(mean(m3),4),rep(mean(m4),4),rep(mean(m5),4))
res <- pop-meanx
qqnorm(res)
qqline(res)

plot(meanx,res,xlab="population average", ylab="residual",main="variance")

The variances do not appear to be constant. The data is not normally distributed.

c.

library(MASS)
boxcox(pop~meanx)

dat3<-log(dat)
dat4<-pivot_longer(dat3,c(m1,m2,m3,m4,m5))
aov.model<-aov(value~name,data=dat4)
summary(aov.model)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## name         4 165.06   41.26   37.66 1.18e-07 ***
## Residuals   15  16.44    1.10                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P value is much smaller, and we can reject the H0.

3.29

a.

\(H_0: \mu_1=\mu_2 =\mu_3=\mu\)

\(H_a\): At least one \(\mu\) is not the same.

M1<-c(31, 10, 21, 4, 1)
M2<-c(62, 40, 24, 30, 35)
M3<-c(53, 27, 120, 97, 68)
dat<-data.frame(M1,M2,M3)
library(tidyr)
dat2<-pivot_longer(dat,c(M1,M2,M3))
aov.model<-aov(value~name,data=dat2)
summary(aov.model)

##             Df Sum Sq Mean Sq F value  Pr(>F)   
## name         2   8964    4482   7.914 0.00643 **
## Residuals   12   6796     566                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P value = 0.00643 which is smaller than 0.01, so we can reject H0. All methods don’t have the same effect.

b.

pop <- c(M1,M2,M3)
meanx <- c(rep(mean(M1),5),rep(mean(M2),5),rep(mean(M3),5))
res <- pop-meanx
qqnorm(res)
qqline(res)

plot(meanx,res,xlab="population average", ylab="residual",main="variance")

The data seems to be normally distributed, but the variance is obviously not constant,

c.

library(MASS)
boxcox(pop~meanx)

dat5<-sqrt(dat)
dat6<-pivot_longer(dat5,c(M1,M2,M3))
aov.model<-aov(value~name,data=dat6)
summary(aov.model)

##             Df Sum Sq Mean Sq F value  Pr(>F)   
## name         2  63.90   31.95    9.84 0.00295 **
## Residuals   12  38.96    3.25                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P value is 0.00295, which is even smaller than before the data transformation. We can safely reject the null hypothesis.

3.51&52

f1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
f2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
f3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
f4<-c(19.3,21.1,16.9,17.5,18.3,19.8)
dat<-data.frame(f1,f2,f3,f4)
library(tidyr)
dat2<-pivot_longer(dat,c(f1,f2,f3,f4))
kruskal.test(value~name,data=dat2)

## 
##  Kruskal-Wallis rank sum test
## 
## data:  value by name
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015

P value is higher than 0.05, so we fail to reject H0.

The results is different from the usual analysis of variance, which assume the normal distribution and the constant variance. In this case, Kruskal-Wallis test gives us more reliable and conservative results.

4.3

library(GAD)
chemical<-c(rep(1,5),rep(2,5),rep(3,5),rep(4,5))
bolt <- c(seq(1,5),seq(1,5),seq(1,5),seq(1,5))
obs<- c(73,68,74,71,67,
        73,67,75,72,70,
        75,68,78,73,68,
        73,71,75,75,69)
chemical<-as.fixed(chemical)
bolt <- as.fixed(bolt)
model<-lm(obs~chemical+bolt)
gad(model)

## $anova
## Analysis of Variance Table
## 
## Response: obs
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## chemical   3  12.95   4.317  2.3761    0.1211    
## bolt       4 157.00  39.250 21.6055 2.059e-05 ***
## Residuals 12  21.80   1.817                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The results suggest that there is significant variability in tensile strength between different bolts (p value is smaller than 0.05), but the different chemical agents did not have a statistically significant effect on tensile strength at the 0.05 level.

4.16

mean_all <- mean(obs)
c1 <- c(73,68,74,71,67)
c2 <- c(73,67,75,72,70)
c3 <- c(75,68,78,73,68)
c4 <- c(73,71,75,75,69)
mean(c1)

## [1] 70.6

mean(c2)

## [1] 71.4

mean(c3)

## [1] 72.4

mean(c4)

## [1] 72.6

t1=mean(c1)-mean_all
t2=mean(c2)-mean_all
t3=mean(c3)-mean_all
t4=mean(c4)-mean_all
t1

## [1] -1.15

t2

## [1] -0.35

t3

## [1] 0.65

t4

## [1] 0.85

b1 <- c(73,73,75,73)
b2 <- c(68,67,68,71)
b3 <- c(74,75,78,75)
b4 <- c(71,72,73,75)
b5 <- c(67,70,68,69)
mean(b1)

## [1] 73.5

mean(b2)

## [1] 68.5

mean(b3)

## [1] 75.5

mean(b4)

## [1] 72.75

mean(b5)

## [1] 68.5

B1 <- mean(b1)-mean_all
B2 <- mean(b2)-mean_all
B3 <- mean(b3)-mean_all
B4 <- mean(b4)-mean_all
B5 <- mean(b5)-mean_all
B1

## [1] 1.75

B2

## [1] -3.25

B3

## [1] 3.75

B4

## [1] 1

B5

## [1] -3.25

\(\tau_1=\)-1.15

\(\tau_2=\)-0.35

\(\tau_3=\) 0.65

\(\tau_4=\) 0.85

\(\beta_1=\) 1.75

\(\beta_2=\) -3.25
\(\beta_3=\) 3.75

\(\beta_4=\) 1

\(\beta_5=\) -3.25

4.22

batch<-c(rep(1,5),rep(2,5),rep(3,5),rep(4,5),rep(5,5))
day<-c(seq(1,5),seq(1,5),seq(1,5),seq(1,5),seq(1,5))
value <- c(8,7,1,7,3,11,2,7,3,8,4,9,10,1,5,6,8,6,6,10,4,2,3,8,8)
ingre <- c('A','B','D','C','E','C','E','A','D','B','B','A','C','E','D','D','C','E','B','A','E','D','B','A','C')
dat <- data.frame(batch,day,value,ingre)
dat$batch <- as.factor(dat$batch)
dat$day <- as.factor(dat$day)
dat$ingre <- as.factor(dat$ingre)
aov.model <- aov(value~batch+day+ingre, data = dat)
summary(aov.model)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## batch        4  15.44    3.86   1.235 0.347618    
## day          4  12.24    3.06   0.979 0.455014    
## ingre        4 141.44   35.36  11.309 0.000488 ***
## Residuals   12  37.52    3.13                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The ingredients appears to significantly affect the reaction time of the chemical process, since the p value is smaller than 0.001.

HW_M4

Peihang Li

2024-10-10

3.23

a.

b.

c.

3.28

a.

b.

c.

3.29

a.

b.

c.

3.51&52

4.3

4.16

4.22