Homework Week 5

0.1 Question 3.7

The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. A completely randomized experiment was conducted and the following data were collected:

a. Test the hypothesis that mixing techniques affect the strength of the cement. Use α = 0.05.

Mixing Technique	Obs. 1	Obs. 2	Obs. 3	Obs. 4
1	3129	3000	2865	2890
2	3200	3300	2975	3150
3	2800	2900	2985	3050
4	2600	2700	2600	2765

c. Use the Fisher LSD method with α = 0.05 to make comparisons between pairs of means.
d. Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?
e. Plot the residuals versus the predicted tensile strength. Comment on the plot.
f. Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.

Solution

Part C

#Data reading
Mixtech1<-c(3129,3000,2865,2890)
Mixtech2<-c(3200,3300,2975,3150)
Mixtech3<-c(2800,2900,2985,3050)
Mixtech4<-c(2600,2700,2600,2765)
dat<-data.frame(Mixtech1,Mixtech2,Mixtech3,Mixtech4)

#tidy data
library(tidyr)
dat<-pivot_longer(dat,c(Mixtech1,Mixtech2,Mixtech3,Mixtech4))
print(dat)

## # A tibble: 16 × 2
##    name     value
##    <chr>    <dbl>
##  1 Mixtech1  3129
##  2 Mixtech2  3200
##  3 Mixtech3  2800
##  4 Mixtech4  2600
##  5 Mixtech1  3000
##  6 Mixtech2  3300
##  7 Mixtech3  2900
##  8 Mixtech4  2700
##  9 Mixtech1  2865
## 10 Mixtech2  2975
## 11 Mixtech3  2985
## 12 Mixtech4  2600
## 13 Mixtech1  2890
## 14 Mixtech2  3150
## 15 Mixtech3  3050
## 16 Mixtech4  2765

#one way ANOVA
aov.model<-aov(value~name,data=dat)
summary(aov.model)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## name         3 489740  163247   12.73 0.000489 ***
## Residuals   12 153908   12826                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
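The ANOVA table answers part (a): the p-value of 0.000489 is below α = 0.05, so the hypothesis of equal mean strengths is rejected. As a minimal sketch, the same decision can be checked against the critical F value. The names `strength`, `technique`, `fit`, `f_obs`, and `f_crit` are illustrative and not part of the original solution.

```r
# Tensile strength data from the problem statement, in long format
# (variable names are illustrative assumptions)
strength  <- c(3129, 3000, 2865, 2890,
               3200, 3300, 2975, 3150,
               2800, 2900, 2985, 3050,
               2600, 2700, 2600, 2765)
technique <- factor(rep(paste0("Mixtech", 1:4), each = 4))

fit <- aov(strength ~ technique)
tab <- summary(fit)[[1]]            # ANOVA table as a data frame

f_obs  <- tab[["F value"]][1]       # observed F for the treatment term
f_crit <- qf(1 - 0.05,              # upper 5% point of F(3, 12)
             df1 = tab[["Df"]][1],
             df2 = tab[["Df"]][2])

# Reject H0 (equal means) when the observed F exceeds the critical value
reject_h0 <- f_obs > f_crit
print(c(F = f_obs, F_crit = f_crit))
print(reject_h0)
```

Comparing F with the critical value and comparing the p-value with α are two views of the same test, so they always lead to the same decision.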

#graph plot
plot(aov.model)

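#Next we define the hypotheses and conduct the LSD test:
#Ho:μi−μj=0
#Ha:μi−μj≠0
#t is the t-table value for α/2 = 0.025 with 12 error degrees of freedom
t<-2.179
#MSE is the residual mean square from the ANOVA table above
MSE<-12826
#n is the number of observations per mixing technique
n=4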

LSD = t*sqrt(2*MSE/n)
print(LSD)

## [1] 174.497

If the absolute difference between two treatment means exceeds the LSD of 174.497, the two treatments differ significantly.
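The LSD threshold does not have to be built from table look-ups. The sketch below, which assumes the same data in long format, pulls the error degrees of freedom and the mean square error from the fitted model and uses `qt()` for the t quantile. The variable names are illustrative, and the result differs slightly from the hand calculation because nothing is rounded.

```r
# Tensile strength data in long format (names are illustrative assumptions)
strength  <- c(3129, 3000, 2865, 2890,
               3200, 3300, 2975, 3150,
               2800, 2900, 2985, 3050,
               2600, 2700, 2600, 2765)
technique <- factor(rep(paste0("Mixtech", 1:4), each = 4))

fit <- aov(strength ~ technique)

df_error <- df.residual(fit)            # error degrees of freedom (N - a)
mse      <- deviance(fit) / df_error    # residual sum of squares / error df
n        <- 4                           # observations per technique (balanced)
alpha    <- 0.05

# Fisher LSD for a balanced design: t(alpha/2, df_error) * sqrt(2 * MSE / n)
t_crit <- qt(1 - alpha / 2, df_error)
lsd    <- t_crit * sqrt(2 * mse / n)
print(lsd)
```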

abs(mean(Mixtech1)-mean(Mixtech2))

## [1] 185.25

abs(mean(Mixtech1)-mean(Mixtech3))

## [1] 37.25

abs(mean(Mixtech1)-mean(Mixtech4))

## [1] 304.75

abs(mean(Mixtech2)-mean(Mixtech3))

## [1] 222.5

abs(mean(Mixtech2)-mean(Mixtech4))

## [1] 490

abs(mean(Mixtech3)-mean(Mixtech4))

## [1] 267.5

Results

The only pair for which we fail to reject the null hypothesis is μ1 & μ3, because the absolute difference of their sample means (37.25) is less than the LSD (Least Significant Difference) value of 174.497. For every other pair the difference exceeds the LSD, so we reject the null hypothesis and conclude that those population means differ significantly.
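As a cross-check on the hand comparisons, unadjusted pairwise t-tests with a pooled standard deviation give the same decisions as Fisher's LSD in a balanced design: a pair has p < α exactly when its mean difference exceeds the LSD. The sketch below uses `pairwise.t.test()` from base R's `stats` package. This is a swapped-in way of reaching the same comparisons, not the method used in the solution, and the variable names are illustrative.

```r
# Tensile strength data in long format (names are illustrative assumptions)
strength  <- c(3129, 3000, 2865, 2890,
               3200, 3300, 2975, 3150,
               2800, 2900, 2985, 3050,
               2600, 2700, 2600, 2765)
technique <- factor(rep(paste0("Mixtech", 1:4), each = 4))

# p.adjust.method = "none" keeps the comparisons unadjusted (as in Fisher LSD);
# pool.sd = TRUE uses the pooled within-group SD, i.e. sqrt(MSE)
lsd_tests <- pairwise.t.test(strength, technique,
                             p.adjust.method = "none",
                             pool.sd = TRUE)

p_mat <- lsd_tests$p.value      # lower-triangular matrix of p-values
print(round(p_mat, 4))

# Pairs that differ significantly at alpha = 0.05
print(p_mat < 0.05)
```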

[D] In the normal probability plot of the residuals, produced as part of the ANOVA validation plots, the residuals closely follow a normal distribution. This suggests that the assumption of normality is valid.

[E] In the plot of residuals versus fitted values, the spread of the residuals is similar across the fitted values and shows no obvious pattern, which supports the constant-variance assumption.
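The two plots discussed in [D] and [E] can also be drawn one at a time from the residuals and fitted values, which makes it clearer what each one shows. This is a minimal base-graphics sketch with illustrative variable names. `plot(aov.model)` in the solution produces equivalent diagnostics.

```r
# Tensile strength data in long format (names are illustrative assumptions)
strength  <- c(3129, 3000, 2865, 2890,
               3200, 3300, 2975, 3150,
               2800, 2900, 2985, 3050,
               2600, 2700, 2600, 2765)
technique <- factor(rep(paste0("Mixtech", 1:4), each = 4))

fit   <- aov(strength ~ technique)
res   <- residuals(fit)   # observed value minus its treatment mean
fitv  <- fitted(fit)      # predicted strength = treatment mean

# [D] Normal probability plot: points near the line support normality
qqnorm(res, main = "Normal Q-Q plot of residuals")
qqline(res)

# [E] Residuals vs predicted strength: look for equal spread and no pattern
plot(fitv, res,
     xlab = "Predicted tensile strength", ylab = "Residual",
     main = "Residuals vs fitted values")
abline(h = 0, lty = 2)
```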

[F]

# Assuming you have your data in a data frame called "dat"
dat <- data.frame(
  Strength = c(3129, 3000, 2865, 2890, 3200, 3300, 2975, 3150, 2800, 2900, 2985, 3050, 2600, 2700, 2600, 2765),
  Type = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4)
)

# Load necessary libraries
library(car)

## Loading required package: carData

library(carData)

# Create the scatterplot
scatterplot(Strength ~ Type, data = dat,
            xlab = "Mixtech", ylab = "Tensile Strength",
            main = "Scatter Plot")
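For readers without the `car` package, a similar picture can be drawn with base graphics. The sketch below is an alternative, not the plot used in the solution: it plots each observation against its mixing technique and overlays the four treatment means as filled points. Variable names are illustrative.

```r
# Same data as the scatter plot above (names are illustrative assumptions)
strength <- c(3129, 3000, 2865, 2890, 3200, 3300, 2975, 3150,
              2800, 2900, 2985, 3050, 2600, 2700, 2600, 2765)
type     <- rep(1:4, each = 4)

# Mean tensile strength for each mixing technique
type_means <- tapply(strength, type, mean)

# Raw observations, one column of points per technique
plot(type, strength, xaxt = "n",
     xlab = "Mixtech", ylab = "Tensile Strength",
     main = "Scatter Plot")
axis(1, at = 1:4)

# Overlay the treatment means as filled points joined by a line
points(1:4, type_means, pch = 19)
lines(1:4, type_means, lty = 2)
print(type_means)
```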

0.2 Question 3.10

A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men’s shirts. Strength is usually affected by the percentage of cotton used in the blend of materials for the fiber. The engineer conducts a completely randomized experiment with five levels of cotton content and replicates the experiment five times. The data are shown in the following table.

Cotton Weight %	Obs. 1	Obs. 2	Obs. 3	Obs. 4	Obs. 5
15	7	7	15	11	9
20	12	17	12	18	18
25	14	19	19	18	18
30	19	25	22	19	23
35	7	10	11	15	11

b. Use the Fisher LSD method to make comparisons between the pairs of means. What conclusions can you draw?

c. Analyze the residuals from this experiment and comment on model adequacy.

SOLUTION

Null hypothesis : Ho: μi−μj=0

Alternative Hypothesis : Ha: μi−μj≠0

#data reading
cottonweight15 <- c(7,7,15,11,9)
cottonweight20 <- c(12,17,12,18,18)
cottonweight25 <- c(14,19,19,18,18)
cottonweight30 <- c(19,25,22,19,23)
cottonweight35 <- c(7,10,11,15,11)
dat<-data.frame(cottonweight15,cottonweight20,cottonweight25,cottonweight30,cottonweight35)

library(tidyr)
dat<-pivot_longer(dat,c(cottonweight15,cottonweight20,cottonweight25,cottonweight30,cottonweight35))
print(dat)

## # A tibble: 25 × 2
##    name           value
##    <chr>          <dbl>
##  1 cottonweight15     7
##  2 cottonweight20    12
##  3 cottonweight25    14
##  4 cottonweight30    19
##  5 cottonweight35     7
##  6 cottonweight15     7
##  7 cottonweight20    17
##  8 cottonweight25    19
##  9 cottonweight30    25
## 10 cottonweight35    10
## # ℹ 15 more rows

aov.model<-aov(value~name,data=dat)
summary(aov.model)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## name         4  475.8  118.94   14.76 9.13e-06 ***
## Residuals   20  161.2    8.06                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(aov.model)

# Hypotheses for the LSD test:
#Ho:μi−μj=0
#Ha:μi−μj≠0
t<-2.086
MSE<-8.06
n=5

LSD = t*sqrt(2*MSE/n)
print(LSD)

## [1] 3.745517

abs(mean(cottonweight15)-mean(cottonweight20))

## [1] 5.6

abs(mean(cottonweight15)-mean(cottonweight25))

## [1] 7.8

abs(mean(cottonweight15)-mean(cottonweight30))

## [1] 11.8

abs(mean(cottonweight15)-mean(cottonweight35))

## [1] 1

abs(mean(cottonweight20)-mean(cottonweight25))

## [1] 2.2

abs(mean(cottonweight20)-mean(cottonweight30))

## [1] 6.2

abs(mean(cottonweight20)-mean(cottonweight35))

## [1] 4.6

abs(mean(cottonweight25)-mean(cottonweight30))

## [1] 4

abs(mean(cottonweight25)-mean(cottonweight35))

## [1] 6.8

abs(mean(cottonweight30)-mean(cottonweight35))

## [1] 10.8

The only pairs of means for which we fail to reject the null hypothesis are μ1 & μ5 (15% and 35% cotton) and μ2 & μ3 (20% and 25% cotton): |ȳ1 − ȳ5| = 1 and |ȳ2 − ȳ3| = 2.2, both below the LSD value of 3.745. All other pairs differ by more than the LSD, so their means are significantly different.
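With ten pairs to check, the comparisons are easier to audit when they are generated in one pass. The sketch below is one way to do that, using `combn()` to enumerate the pairs and flagging each difference against the LSD. The MSE and error degrees of freedom are taken from the ANOVA table above. The list `cotton` and the column names are illustrative assumptions.

```r
# Observations for each cotton weight percentage (names are illustrative)
cotton <- list("15" = c(7, 7, 15, 11, 9),
               "20" = c(12, 17, 12, 18, 18),
               "25" = c(14, 19, 19, 18, 18),
               "30" = c(19, 25, 22, 19, 23),
               "35" = c(7, 10, 11, 15, 11))
means <- sapply(cotton, mean)

mse      <- 8.06   # residual mean square from the ANOVA table
df_error <- 20     # residual degrees of freedom from the ANOVA table
n        <- 5      # replicates per cotton level

lsd <- qt(1 - 0.05 / 2, df_error) * sqrt(2 * mse / n)

# Enumerate all pairs of levels and compare |difference| with the LSD
pairs <- t(combn(names(means), 2))
diffs <- unname(abs(means[pairs[, 1]] - means[pairs[, 2]]))

result <- data.frame(level_a     = pairs[, 1],
                     level_b     = pairs[, 2],
                     abs_diff    = diffs,
                     significant = diffs > lsd)
print(result)
```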

[C]

Following the one-way ANOVA conducted in Part B, we proceeded with ANOVA validation using the residual plots.

Results:

The normal probability plot of residuals closely adheres to normality.
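The visual checks can be backed up with formal tests. The sketch below is an optional supplement and not part of the original analysis: it applies the Shapiro-Wilk test to the residuals (normality) and Bartlett's test to the groups (equal variances), both from base R's `stats` package. Large p-values would be consistent with the model assumptions. The variable names are illustrative.

```r
# Fiber strength data in long format (names are illustrative assumptions)
strength <- c(7, 7, 15, 11, 9,
              12, 17, 12, 18, 18,
              14, 19, 19, 18, 18,
              19, 25, 22, 19, 23,
              7, 10, 11, 15, 11)
cotton   <- factor(rep(c(15, 20, 25, 30, 35), each = 5))

fit <- aov(strength ~ cotton)
res <- residuals(fit)

# Normality of the residuals (H0: residuals are normally distributed)
sw <- shapiro.test(res)

# Homogeneity of variance across cotton levels (H0: equal variances)
bt <- bartlett.test(strength, cotton)

print(c(shapiro_p = sw$p.value, bartlett_p = bt$p.value))
```

With only five observations per level these tests have limited power, so they complement the residual plots rather than replace them.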

0.3 Question 3.44

Suppose that four normal populations have means of μ1 = 50, μ2 = 60, μ3 = 50, and μ4 = 60. How many observations should be taken from each population so that the probability of rejecting the null hypothesis of equal population means is at least 0.90? Assume that α = 0.05 and that a reasonable estimate of the error variance is σ² = 25 (σ = 5).

0.4 Solution

power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 25, sig.level = 0.05, power = 0.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 4.658128
##     between.var = 33.33333
##      within.var = 25
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

sigma=5
d=10/sigma
print(d)

## [1] 2

f=d/2
print(f)

## [1] 1

library(pwr)
pwr.anova.test(k=4,n=NULL,f=1, sig.level = 0.05, power = 0.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 4.658119
##               f = 1
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

Result

Both methods confirm that the appropriate number of observations to be selected from each population is 5.
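The shortcut f = d/2 works here because two means sit at 50 and two at 60, so the standard deviation of the four means is half the 10-unit gap. A more general sketch computes the effect size f directly from the means and shows how it relates to the `between.var` argument of `power.anova.test()`. It uses only base R, and the variable names are illustrative.

```r
mu    <- c(50, 60, 50, 60)   # hypothesized population means
sigma <- 5                   # within-group standard deviation
k     <- length(mu)

# Effect size f: SD of the means (divisor k) divided by sigma
f <- sqrt(sum((mu - mean(mu))^2) / k) / sigma

# power.anova.test() wants the variance of the means with divisor k - 1,
# which is what var(mu) returns; f^2 = var(mu) * (k - 1) / k / sigma^2
pw <- power.anova.test(groups = k,
                       between.var = var(mu),
                       within.var  = sigma^2,
                       sig.level   = 0.05,
                       power       = 0.90)

# Sample sizes must be whole numbers, so round the solution up
n_required <- ceiling(pw$n)
print(f)
print(n_required)
```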

0.5 Question 3.45

Refer to Problem 3.44.

How would your answer change if a reasonable estimate of the experimental error variance were 36?
How would your answer change if a reasonable estimate of the experimental error variance were 49?
Can you draw any conclusions about the sensitivity of your answer in this particular situation about how your estimate of σ² affects the decision about sample size?
Can you make any recommendations about how we should use this general approach to choosing n in practice?

Variance=36
sigma=6
pwr.anova.test(k=4,n=NULL,f=0.8333333, sig.level = 0.05, power = 0.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 6.180858
##               f = 0.8333333
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

Variance=49
sigma=7
pwr.anova.test(k=4,n=NULL,f=0.7142857, sig.level = 0.05, power = 0.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 7.998751
##               f = 0.7142857
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

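With an error variance of 36, the computed n = 6.18 rounds up to 7 observations per population; with an error variance of 49, n = 7.999 rounds up to 8. Compared with the 5 observations needed when the variance is 25, the required sample size grows steadily as the estimate of σ² increases, so the choice of n is fairly sensitive to that estimate.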
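To see how the required sample size responds to the error-variance estimate, the three cases can be tabulated in one pass. The sketch below assumes the same means as Problem 3.44 and uses base R's `power.anova.test()` instead of `pwr.anova.test()`, so no extra package is needed. The variable names are illustrative.

```r
mu        <- c(50, 60, 50, 60)   # population means from Problem 3.44
error_var <- c(25, 36, 49)       # candidate estimates of the error variance

# Solve for n at each variance estimate (power 0.90, alpha 0.05)
n_exact <- sapply(error_var, function(v) {
  power.anova.test(groups      = length(mu),
                   between.var = var(mu),
                   within.var  = v,
                   sig.level   = 0.05,
                   power       = 0.90)$n
})

# Round up: a fractional observation cannot be collected
sensitivity <- data.frame(error_var  = error_var,
                          n_exact    = n_exact,
                          n_required = ceiling(n_exact))
print(sensitivity)
```

Because the variance is itself only an estimate, running the calculation over a plausible range like this gives a more defensible choice of n than relying on a single value.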

0.6 Complete R code

#Reading the Data:
Mixtech1<-c(3129,3000,2865,2890)
Mixtech2<-c(3200,3300,2975,3150)
Mixtech3<-c(2800,2900,2985,3050)
Mixtech4<-c(2600,2700,2600,2765)
dat<-data.frame(Mixtech1,Mixtech2,Mixtech3,Mixtech4)

library(tidyr)
dat<-pivot_longer(dat,c(Mixtech1,Mixtech2,Mixtech3,Mixtech4))
print(dat)
aov.model<-aov(value~name,data=dat)
summary(aov.model)
plot(aov.model)

t<-2.179
MSE<-12826
n=4
LSD = t*sqrt(2*MSE/n)
print(LSD)

abs(mean(Mixtech1)-mean(Mixtech2))
abs(mean(Mixtech1)-mean(Mixtech3))
abs(mean(Mixtech1)-mean(Mixtech4))
abs(mean(Mixtech2)-mean(Mixtech3))
abs(mean(Mixtech2)-mean(Mixtech4))
abs(mean(Mixtech3)-mean(Mixtech4))

#Scatter plot: keep both columns in one data frame so scatterplot() finds them in `data`
scatter.dat <- data.frame(
  Strength = c(3129, 3000, 2865, 2890, 3200, 3300, 2975, 3150, 2800, 2900, 2985, 3050, 2600, 2700, 2600, 2765),
  Type = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
)
library(car)
library(carData)
scatterplot(Strength ~ Type, data=scatter.dat,
            xlab="Type", ylab="Strength",
            main="Scatter Plot")

cottonweight15<-c(7,7,15,11,9)
cottonweight20<-c(12,17,12,18,18)
cottonweight25<-c(14,19,19,18,18)
cottonweight30<-c(19,25,22,19,23)
cottonweight35<-c(7,10,11,15,11)
dat<-data.frame(cottonweight15,cottonweight20,cottonweight25,cottonweight30,cottonweight35)
library(tidyr)
dat<-pivot_longer(dat,c(cottonweight15,cottonweight20,cottonweight25,cottonweight30,cottonweight35))
print(dat)
aov.model<-aov(value~name,data=dat)
summary(aov.model)
plot(aov.model)

t<-2.086
MSE<-8.06
n=5
LSD = t*sqrt(2*MSE/n)
print(LSD)

abs(mean(cottonweight15)-mean(cottonweight20))
abs(mean(cottonweight15)-mean(cottonweight25))
abs(mean(cottonweight15)-mean(cottonweight30))
abs(mean(cottonweight15)-mean(cottonweight35))
abs(mean(cottonweight20)-mean(cottonweight25))
abs(mean(cottonweight20)-mean(cottonweight30))
abs(mean(cottonweight20)-mean(cottonweight35))
abs(mean(cottonweight25)-mean(cottonweight30))
abs(mean(cottonweight25)-mean(cottonweight35))
abs(mean(cottonweight30)-mean(cottonweight35))

library(pwr)
power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 25, sig.level = 0.05, power = 0.90)
#OR
sigma=5
d=10/sigma
print(d)
f=d/2
print(f)
pwr.anova.test(k=4,n=NULL,f=1, sig.level = 0.05, power = 0.90)

Variance=36
sigma=6
pwr.anova.test(k=4,n=NULL,f=0.8333333, sig.level = 0.05, power = 0.90)

Variance=49
sigma=7
pwr.anova.test(k=4,n=NULL,f=0.7142857, sig.level = 0.05, power = 0.90)