knitr::opts_chunk$set(echo = TRUE)

IE 5342 Homework Module 4

Meghan Cephus

10/11/2024

3.23

  1. Comparing the F value with the critical F value, F < F critical (3.047 < 3.098), so we cannot reject the null hypothesis. There is no indication that the fluids differ at the α = 0.05 significance level.
F1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
F2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
F3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
F4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

dat <- data.frame(F1,F2,F3,F4)

library(tidyr)

dat <- pivot_longer(dat,c(F1,F2,F3,F4))

aov <- aov(value~name, data=dat)
summary(aov)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## name         3  30.17   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
qf(0.95, 3, 20)
## [1] 3.098391
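The same decision can be reached from the p-value instead of the critical value. The following is a minimal sketch, assuming base R only: `stack()` stands in for `pivot_longer()`, and the names `fluids`, `fit`, `tab`, `f_obs`, `f_crit`, and `p_val` are illustrative rather than part of the original solution.

```r
# Fluid data from Problem 3.23, reshaped to long format with base R
# (stack() produces the columns `values` and `ind`)
fluids <- stack(data.frame(
  F1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  F2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  F3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  F4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
))

fit <- aov(values ~ ind, data = fluids)
tab <- summary(fit)[[1]]            # ANOVA table as a data frame

f_obs  <- tab[1, "F value"]         # observed F for the fluid factor
f_crit <- qf(0.95, tab[1, "Df"], tab[2, "Df"])
p_val  <- pf(f_obs, tab[1, "Df"], tab[2, "Df"], lower.tail = FALSE)

# F < F critical exactly when p > alpha, so both rules give the same decision
c(F = f_obs, F_crit = f_crit, p = p_val)
```

Reading the degrees of freedom from the table rather than typing them in avoids a mismatch between the critical value and the fitted model.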
  2. Fluid 3 has the highest mean (20.95) and the largest gap from the other means (17.95, 18.65, and 18.82).
mean(F1)
## [1] 18.65
mean(F2)
## [1] 17.95
mean(F3)
## [1] 20.95
mean(F4)
## [1] 18.81667
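To see which pairs of means account for the gap, a pairwise comparison could be run on the fitted model. This is an addition, not part of the original solution: a sketch assuming Tukey's HSD (`TukeyHSD()` from base R) is an acceptable follow-up, with `fluids`, `fit`, `group_means`, and `tukey` as illustrative names.

```r
# Fluid data from Problem 3.23 in long format (base R stack())
fluids <- stack(data.frame(
  F1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  F2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  F3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  F4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
))

fit <- aov(values ~ ind, data = fluids)

# Group means, computed in one call instead of four separate mean() calls
group_means <- tapply(fluids$values, fluids$ind, mean)
group_means

# All six pairwise differences with simultaneous 95% confidence intervals;
# columns are diff, lwr, upr, and the adjusted p-value
tukey <- TukeyHSD(fit, conf.level = 0.95)$ind
tukey
```

The `diff` column gives each pairwise gap directly, and the adjusted p-values indicate which gaps, if any, stand out after correcting for multiple comparisons.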
  3. Looking at the residual plots, the variance appears to satisfy the constant-variance assumption.
plot(aov)
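A numerical check can complement the visual read of the residual plots. The sketch below assumes Bartlett's test (`bartlett.test()` from base R) is a reasonable choice here, which in turn assumes roughly normal data; `fluids`, `group_var`, and `bt` are illustrative names.

```r
# Fluid data from Problem 3.23 in long format (base R stack())
fluids <- stack(data.frame(
  F1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  F2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  F3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  F4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
))

# Sample variance within each fluid
group_var <- tapply(fluids$values, fluids$ind, var)
round(group_var, 2)

# Bartlett's test: the null hypothesis is that all group variances are equal
bt <- bartlett.test(values ~ ind, data = fluids)
bt
```

A large p-value here is consistent with the constant-variance reading of the plots; it does not prove the variances are equal.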

3.28

  1. Comparing the F value with the critical F value, F > F critical (6.191 > 3.056), so we can reject the null hypothesis. We cannot say each material has the same effect on the mean failure time.
M1 <- c(110, 157, 194, 178)
M2 <- c(1, 2, 4, 18)
M3 <- c(880, 1256, 5276, 4355)
M4 <- c(495, 7040, 5307, 10050)
M5 <- c(7, 5, 29, 2)

dat <- data.frame(M1,M2,M3,M4,M5)

dat <- pivot_longer(dat,c(M1,M2,M3,M4,M5))

aov <- aov(value~name, data=dat)
summary(aov)
##             Df    Sum Sq  Mean Sq F value  Pr(>F)   
## name         4 103191489 25797872   6.191 0.00379 **
## Residuals   15  62505657  4167044                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
qf(0.95, 4, 15)
## [1] 3.055568
  2. Looking at the plots, the variance is not satisfactorily constant. The Q-Q plot shows most of the residuals falling on the fit line, but there are still outliers, so normality is questionable.
plot(aov)
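The unequal spread can also be quantified directly. The following is a small sketch in base R, with `materials`, `group_mean`, and `group_sd` as illustrative names, that tabulates the mean and standard deviation of failure time for each material.

```r
# Material data from Problem 3.28 in long format (base R stack())
materials <- stack(data.frame(
  M1 = c(110, 157, 194, 178),
  M2 = c(1, 2, 4, 18),
  M3 = c(880, 1256, 5276, 4355),
  M4 = c(495, 7040, 5307, 10050),
  M5 = c(7, 5, 29, 2)
))

group_mean <- tapply(materials$values, materials$ind, mean)
group_sd   <- tapply(materials$values, materials$ind, sd)

# Side-by-side summary: the spread differs by orders of magnitude across materials
round(cbind(mean = group_mean, sd = group_sd), 1)

# Ratio of the largest to the smallest standard deviation
max(group_sd) / min(group_sd)
```

A ratio this far from 1 supports the conclusion drawn from the residual plots that the constant-variance assumption does not hold for the raw data.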

  3. Even with the Kruskal-Wallis test, the p-value (0.002) is less than 0.05, so there is still a significant difference among the materials.
kruskal.test(value~name, data=dat)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  value by name
## Kruskal-Wallis chi-squared = 16.873, df = 4, p-value = 0.002046

3.29

  1. Comparing the F value with the critical F value, F > F critical (7.914 > 3.885), so we can reject the null hypothesis. We cannot say each method has the same effect on the mean particle count.
Me1 <- c(31, 10, 21, 4, 1)
Me2 <- c(62, 40, 24, 30, 35)
Me3 <- c(53, 27, 120, 97, 68)

dat <- data.frame(Me1,Me2,Me3)

library(tidyr)

dat <- pivot_longer(dat,c(Me1,Me2,Me3))

aov <- aov(value~name, data=dat)
summary(aov)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## name         2   8964    4482   7.914 0.00643 **
## Residuals   12   6796     566                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
qf(0.95, 2, 12)
## [1] 3.885294
  2. We cannot say the variance is constant across the methods, particularly because of method 3. The Q-Q plot also shows many deviations from the straight line, including outliers, which makes the normality assumption unreliable.
plot(aov)

  3. After applying a Box-Cox transformation (λ = 0.33), the variability and normality improve, but there is still a significant difference between the methods: the p-value from rerunning the ANOVA (0.003) is less than 0.05.
library(MASS)

boxcox(dat$value~dat$name)

dat$value = dat$value^0.33
boxcox(dat$value~dat$name)

aov <- aov(value~name, data=dat)
summary(aov)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## name         2  9.238   4.619    9.77 0.00303 **
## Residuals   12  5.674   0.473                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(aov)
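The exponent can also be read off the Box-Cox profile numerically instead of from the plot. The sketch below assumes `boxcox()` from MASS with a finer λ grid than the default; `counts`, `bc`, `lambda_hat`, and `ci` are illustrative names, and the interval uses the usual likelihood-ratio cutoff of `qchisq(0.95, 1) / 2` below the maximum log-likelihood.

```r
library(MASS)  # provides boxcox()

# Particle-count data from Problem 3.29 in long format (base R stack())
counts <- stack(data.frame(
  Me1 = c(31, 10, 21, 4, 1),
  Me2 = c(62, 40, 24, 30, 35),
  Me3 = c(53, 27, 120, 97, 68)
))

# Profile log-likelihood over a fine grid of lambda values, without plotting
bc <- boxcox(values ~ ind, data = counts,
             lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Lambda that maximizes the profile log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum
ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1) / 2])

lambda_hat
ci
```

Any convenient value inside the interval is a defensible exponent, so a rounded choice is common in practice.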

3.51

Rerunning Problem 3.23 with the Kruskal-Wallis test gives a p-value of 0.1015, which is greater than 0.05, so there is still no significant difference among the fluids.

F1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
F2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
F3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
F4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

dat <- data.frame(F1,F2,F3,F4)

library(tidyr)

dat <- pivot_longer(dat,c(F1,F2,F3,F4))

aov <- aov(value~name, data=dat)

kruskal.test(value~name, data=dat)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  value by name
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015
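To show where the chi-squared value comes from, the statistic can be rebuilt by hand from the ranks. This is a sketch in base R with illustrative names (`fluids`, `r`, `rank_sums`, `H_raw`, `H`, `p_val`); it includes the standard tie correction because 16.9 appears twice in the data.

```r
# Fluid data from Problem 3.23 in long format (base R stack())
fluids <- stack(data.frame(
  F1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  F2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  F3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  F4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
))

r <- rank(fluids$values)                 # midranks are assigned to ties
N <- length(r)
rank_sums <- tapply(r, fluids$ind, sum)  # rank sum per fluid
n_i <- tapply(r, fluids$ind, length)     # observations per fluid

# Uncorrected statistic: 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
H_raw <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group
ties <- table(fluids$values)
H <- H_raw / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with (number of groups - 1) degrees of freedom
p_val <- pchisq(H, df = nlevels(fluids$ind) - 1, lower.tail = FALSE)
c(H = H, p = p_val)
```

The result should agree with the `kruskal.test()` output above, since that function applies the same tie-corrected formula.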

3.52

Our findings with the non-parametric Kruskal-Wallis test are consistent with the original findings from ANOVA.

4.3

Using the GAD package for the randomized complete block design (RCBD), we find that the chemical type has an F value less than the critical F value (2.3761 < 3.49), so the chemical type does NOT seem to have a significant effect. The bolt, however, has an F value greater than the critical F value (21.605 > 3.26), so the bolt DOES seem to have a significant effect.

library(GAD)

tensile_strength <- c(73, 68, 74, 71, 67,
                      73, 67, 75, 72, 70,
                      75, 68, 78, 73, 68,
                      73, 71, 75, 75, 69)
chemical <- factor(rep(1:4, each=5))  # 4 chemicals
chemical<- as.fixed(chemical)
bolt <- factor(rep(1:5, times=4))     # 5 bolts
bolt<- as.fixed(bolt)

# Create a data frame
data <- data.frame(bolt, chemical, tensile_strength)

model <- lm(tensile_strength~chemical+bolt)
gad(model)
## $anova
## Analysis of Variance Table
## 
## Response: tensile_strength
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## chemical   3  12.95   4.317  2.3761    0.1211    
## bolt       4 157.00  39.250 21.6055 2.059e-05 ***
## Residuals 12  21.80   1.817                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
chem_crit_f <- qf(0.95, 3, 12)
chem_crit_f
## [1] 3.490295
bolt_crit_f <- qf(0.95, 4, 12)
bolt_crit_f
## [1] 3.259167

4.16

coefficients(model)
## (Intercept)   chemical2   chemical3   chemical4       bolt2       bolt3 
##       72.35        0.80        1.80        2.00       -5.00        2.00 
##       bolt4       bolt5 
##       -0.75       -5.00
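The coefficients above use R's default treatment contrasts, so each one is a difference from the baseline level (chemical 1, bolt 1). If the estimates are wanted under the sum-to-zero constraints instead, with the chemical effects summing to zero and the bolt effects summing to zero, they can be computed from the marginal means. The following is a sketch under that assumption, with `mu_hat`, `tau_hat`, and `beta_hat` as illustrative names.

```r
# Data from Problem 4.3: 4 chemicals (rows) by 5 bolts (columns)
tensile_strength <- c(73, 68, 74, 71, 67,
                      73, 67, 75, 72, 70,
                      75, 68, 78, 73, 68,
                      73, 71, 75, 75, 69)
chemical <- factor(rep(1:4, each = 5))
bolt     <- factor(rep(1:5, times = 4))

# Overall mean
mu_hat <- mean(tensile_strength)                              # 71.75

# Chemical effects: chemical mean minus overall mean
tau_hat <- tapply(tensile_strength, chemical, mean) - mu_hat  # -1.15 -0.35 0.65 0.85

# Bolt effects: bolt mean minus overall mean
beta_hat <- tapply(tensile_strength, bolt, mean) - mu_hat     # 1.75 -3.25 3.75 1.00 -3.25

mu_hat
tau_hat
beta_hat
```

The two parameterizations describe the same fit: for example, the `chemical2` coefficient of 0.80 equals the difference between the second and first chemical effects, and the intercept of 72.35 equals the overall mean plus the effects of chemical 1 and bolt 1.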

4.22

After running the ANOVA, the critical F value for all three factors is 3.26, since each has 4 and 12 degrees of freedom. The F values for day and batch (0.979 and 1.235) are less than the critical value, so day and batch do NOT have a significant effect on the reaction time. However, the F value for ingredient is 11.309, which exceeds the critical value, so the ingredients DO have a significant effect on the reaction time.

reaction_time <- c(8, 7, 1, 7, 3,   # Batch 1
                   11, 2, 7, 3, 8,  # Batch 2
                   4, 9, 10, 1, 5,  # Batch 3
                   6, 8, 6, 6, 10,  # Batch 4
                   4, 2, 3, 8, 8)   # Batch 5

ingredient <- factor(c("A", "B", "D", "C", "E",
                       "C", "E", "A", "D", "B",
                       "B", "A", "C", "E", "D",
                       "D", "C", "E", "B", "A",
                       "E", "D", "B", "A", "C"))

batch <- factor(rep(1:5, each=5))
day <- factor(rep(1:5, times=5))

data <- data.frame(day, batch, ingredient, reaction_time)

aov <- aov(reaction_time~day+batch+ingredient)
summary(aov)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## day          4  12.24    3.06   0.979 0.455014    
## batch        4  15.44    3.86   1.235 0.347618    
## ingredient   4 141.44   35.36  11.309 0.000488 ***
## Residuals   12  37.52    3.13                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
batch_crit_f <- qf(0.95, 4, 12)
batch_crit_f
## [1] 3.259167
day_crit_f <- qf(0.95, 4, 12)
day_crit_f
## [1] 3.259167
ingredient_crit_f <- qf(0.95, 4, 12)
ingredient_crit_f
## [1] 3.259167
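Because the analysis depends on the ingredient letters being typed in the correct Latin square arrangement, a quick layout check may be worthwhile. This sketch, with `by_batch` and `by_day` as illustrative names, confirms that every ingredient appears exactly once in each batch and once on each day.

```r
# Design factors from Problem 4.22 (rows are batches, columns are days)
ingredient <- factor(c("A", "B", "D", "C", "E",
                       "C", "E", "A", "D", "B",
                       "B", "A", "C", "E", "D",
                       "D", "C", "E", "B", "A",
                       "E", "D", "B", "A", "C"))
batch <- factor(rep(1:5, each = 5))
day   <- factor(rep(1:5, times = 5))

# Cross-tabulate ingredient against each blocking factor
by_batch <- table(batch, ingredient)
by_day   <- table(day, ingredient)

# In a valid Latin square every cell of both tables equals 1
all(by_batch == 1) && all(by_day == 1)
```

If either table contained a 0 or a 2, the design would not be a Latin square and the three-factor ANOVA would no longer separate the effects cleanly.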