Home Work Week 5

0.1 1. Question 3.7

The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. A completely randomized experiment was conducted and the following data were collected:

Observations of Tensile Strength


Mixing Technique	1	2	3	4
1	3129	3000	2865	2890
2	3200	3300	2975	3150
3	2800	2900	2985	3050
4	2600	2700	2600	2765

c. Use the Fisher LSD method with \(\alpha = 0.05\) to make comparisons between pairs of means.

d. Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?

e. Plot the residuals versus the predicted tensile strength. Comment on the plot.

f. Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.

0.2 Solution 3C

Reading the Data:

Mixingtech_1<-c(3129,3000,2865,2890)
Mixingtech_2<-c(3200,3300,2975,3150)
Mixingtech_3<-c(2800,2900,2985,3050)
Mixingtech_4<-c(2600,2700,2600,2765)
dat<-data.frame(Mixingtech_1,Mixingtech_2,Mixingtech_3,Mixingtech_4)

Next, we reshape the data into tidy (long) format:

library(tidyr)
dat<-pivot_longer(dat,c(Mixingtech_1,Mixingtech_2,Mixingtech_3,Mixingtech_4))
print(dat)

## # A tibble: 16 × 2
##    name         value
##    <chr>        <dbl>
##  1 Mixingtech_1  3129
##  2 Mixingtech_2  3200
##  3 Mixingtech_3  2800
##  4 Mixingtech_4  2600
##  5 Mixingtech_1  3000
##  6 Mixingtech_2  3300
##  7 Mixingtech_3  2900
##  8 Mixingtech_4  2700
##  9 Mixingtech_1  2865
## 10 Mixingtech_2  2975
## 11 Mixingtech_3  2985
## 12 Mixingtech_4  2600
## 13 Mixingtech_1  2890
## 14 Mixingtech_2  3150
## 15 Mixingtech_3  3050
## 16 Mixingtech_4  2765

Applying One Way ANOVA:

aov.model<-aov(value~name,data=dat)
summary(aov.model)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## name         3 489740  163247   12.73 0.000489 ***
## Residuals   12 153908   12826                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now getting the various plots:

plot(aov.model)

Next, we define the hypotheses and conduct the LSD test:

\[ H_{0}:\mu_{i}-\mu_{j}= 0 \]

\[ H_{1}:\mu_{i}-\mu_{j}\neq 0 \]

where i and j index the mixing techniques (\(i \neq j\)).

We now calculate the least significant difference with this equation:

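\[ LSD= t_{\alpha /2,\,N-a}\sqrt{\frac{2MSE}{n}} \]

Here N = 16 is the total number of observations, a = 4 is the number of mixing techniques, and n = 4 is the number of observations per technique, so the critical value is \(t_{0.025,12} = 2.179\).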

From our ANOVA result:

t<-2.179
MSE<-12826
n=4

LSD = t*sqrt(2*MSE/n)
print(LSD)

## [1] 174.497

Our LSD value=174.5
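The same LSD can be obtained without hard-coding the critical value or the MSE. The block below is a minimal sketch, assuming the data from the table above; the names `strength`, `technique`, `fit`, `t_crit`, and `lsd` are illustrative and not part of the original solution. It pulls the error degrees of freedom and the error mean square from the fitted model and takes the critical value from `qt()`.

```r
# Tensile-strength data from the table above, four observations per technique
strength <- c(3129, 3000, 2865, 2890,
              3200, 3300, 2975, 3150,
              2800, 2900, 2985, 3050,
              2600, 2700, 2600, 2765)
technique <- factor(rep(1:4, each = 4))

fit <- aov(strength ~ technique)

df_error <- df.residual(fit)           # N - a = 16 - 4
mse      <- deviance(fit) / df_error   # residual sum of squares / error df
n_rep    <- 4                          # observations per technique
alpha    <- 0.05

t_crit <- qt(1 - alpha / 2, df_error)  # upper alpha/2 point of t with N - a df
lsd    <- t_crit * sqrt(2 * mse / n_rep)
print(c(t_crit = t_crit, MSE = mse, LSD = lsd))
```

Because the MSE and the critical value are not rounded here, the result can differ from 174.497 in the first decimal place.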

Any pair of treatment means whose absolute difference exceeds 174.5 differs significantly.

Testing each pair below:

abs(mean(Mixingtech_1)-mean(Mixingtech_2))

## [1] 185.25

abs(mean(Mixingtech_1)-mean(Mixingtech_3))

## [1] 37.25

abs(mean(Mixingtech_4)-mean(Mixingtech_1))

## [1] 304.75

abs(mean(Mixingtech_2)-mean(Mixingtech_3))

## [1] 222.5

abs(mean(Mixingtech_2)-mean(Mixingtech_4))

## [1] 490

abs(mean(Mixingtech_4)-mean(Mixingtech_3))

## [1] 267.5

Conclusion

We reject \(H_{0}\) for every pair of means except \(\mu_{1}\) and \(\mu_{3}\).

For that pair we fail to reject \(H_{0}\), because the absolute difference in sample means is less than the LSD, i.e.,

\[ |\bar{y}_{1}-\bar{y}_{3}| = 37.25 < LSD \]
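As a cross-check on the six manual comparisons, the pairwise tests can also be run in one call. This is a sketch rather than the method used above: it assumes that `pairwise.t.test()` with a pooled standard deviation and no p-value adjustment is an acceptable stand-in for the Fisher LSD comparison, since both use the pooled error variance with N - a degrees of freedom. The object names are illustrative.

```r
# Tensile-strength data from the table above, four observations per technique
strength <- c(3129, 3000, 2865, 2890,
              3200, 3300, 2975, 3150,
              2800, 2900, 2985, 3050,
              2600, 2700, 2600, 2765)
technique <- factor(rep(1:4, each = 4))

# Pooled-SD t tests for every pair, with no multiplicity adjustment
lsd_tests <- pairwise.t.test(strength, technique,
                             p.adjust.method = "none", pool.sd = TRUE)
p_values <- lsd_tests$p.value   # lower-triangular matrix of p-values
print(round(p_values, 4))

# TRUE marks pairs that differ at the 5% level
print(p_values < 0.05)
```

A pair is declared different when its p-value is below 0.05, which corresponds to its mean difference exceeding the LSD.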

0.2.1 Solution 3D

plot(aov.model)

Conclusion: Since most of the points lie close to a straight line, the normality assumption appears reasonable.
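To look at the normal probability plot on its own rather than paging through all the diagnostic plots, the residuals can be plotted directly. The following is a minimal sketch with illustrative names; the Shapiro-Wilk test is an addition that the original solution does not use, and with only 16 residuals it should be read as a rough supplement to the plot.

```r
# Tensile-strength data from the table above, four observations per technique
strength <- c(3129, 3000, 2865, 2890,
              3200, 3300, 2975, 3150,
              2800, 2900, 2985, 3050,
              2600, 2700, 2600, 2765)
technique <- factor(rep(1:4, each = 4))

fit <- aov(strength ~ technique)
res <- residuals(fit)

# Normal probability (Q-Q) plot of the residuals with a reference line
qqnorm(res, main = "Normal Q-Q plot of residuals")
qqline(res)

# Supplementary formal check (an assumption of this sketch, not the original method)
sw <- shapiro.test(res)
print(sw)
```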

0.2.2 Solution 3E

From the "Residuals vs Fitted" graph, the residuals show no obvious pattern (such as a funnel shape) across the fitted values. As a result, we uphold the assumption of constant variance.
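The residuals-versus-predicted plot can also be drawn directly, together with the spread of the residuals within each technique. This is a sketch with illustrative names, assuming the data from the table above; comparing the per-technique standard deviations is one informal way to judge whether the spread is roughly constant.

```r
# Tensile-strength data from the table above, four observations per technique
strength <- c(3129, 3000, 2865, 2890,
              3200, 3300, 2975, 3150,
              2800, 2900, 2985, 3050,
              2600, 2700, 2600, 2765)
technique <- factor(rep(1:4, each = 4))

fit <- aov(strength ~ technique)

# Residuals against predicted (fitted) tensile strength
plot(fitted(fit), residuals(fit),
     xlab = "Predicted tensile strength", ylab = "Residual",
     main = "Residuals vs predicted")
abline(h = 0, lty = 2)

# Standard deviation of the residuals within each mixing technique
res_sd <- tapply(residuals(fit), technique, sd)
print(round(res_sd, 1))
```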

0.2.3 Solution 3F

Strength <- c(3129, 3000, 2865, 2890, 3200, 3300, 2975, 3150, 2800, 2900, 2985, 3050, 2600, 2700, 2600, 2765)
Type <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
library(car)
library(carData)
# Strength and Type are not columns of dat, so collect them in their own data frame
scatter_dat <- data.frame(Strength, Type)
scatterplot(Strength ~ Type, data=scatter_dat,
            xlab="Mixing Technique", ylab="Tensile Strength",
            main="Scatter Plot")

Deductions

a. Mixing techniques 1 and 3 appear to follow the same pattern and differ noticeably from techniques 2 and 4.

b. The scatter plot is somewhat similar to the "Residuals vs. Factor Levels" graph.

0.3 2. Question 3.10

A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men's shirts. Strength is usually affected by the percentage of cotton used in the blend of materials for the fiber. The engineer conducts a completely randomized experiment with five levels of cotton content and replicates the experiment five times. The data are shown in the following table.

Observations of Tensile Strength of cloth fibre


Cotton Weight %	1	2	3	4	5
15	7	7	15	11	9
20	12	17	12	18	18
25	14	19	19	18	18
30	19	25	22	19	23
35	7	10	11	15	11

b. Use the Fisher LSD method to make comparisons between the pairs of means. What conclusions can you draw?

c. Analyze the residuals from this experiment and comment on model adequacy.

0.3.1 Solution 3.10b

Loading the data

CW_15<-c(7,7,15,11,9)
CW_20<-c(12,17,12,18,18)
CW_25<-c(14,19,19,18,18)
CW_30<-c(19,25,22,19,23)
CW_35<-c(7,10,11,15,11)
dat<-data.frame(CW_15,CW_20,CW_25,CW_30,CW_35)

Creating tidy data with the pivot_longer command

library(tidyr)
dat<-pivot_longer(dat,c(CW_15,CW_20,CW_25,CW_30,CW_35))
print(dat)

## # A tibble: 25 × 2
##    name  value
##    <chr> <dbl>
##  1 CW_15     7
##  2 CW_20    12
##  3 CW_25    14
##  4 CW_30    19
##  5 CW_35     7
##  6 CW_15     7
##  7 CW_20    17
##  8 CW_25    19
##  9 CW_30    25
## 10 CW_35    10
## # ℹ 15 more rows

One Way ANOVA:

aov.model<-aov(value~name,data=dat)
summary(aov.model)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## name         4  475.8  118.94   14.76 9.13e-06 ***
## Residuals   20  161.2    8.06                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Plotting the Model

plot(aov.model)

Finally, we define the hypotheses for the LSD test:

\[ H_{0}:\mu_{i}-\mu_{j}= 0 \]

\[ H_{1}:\mu_{i}-\mu_{j}\neq 0 \]

where i and j index the cotton weight percentages (\(i \neq j\)).

We now calculate the least significant difference with this equation:

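\[ LSD= t_{\alpha /2,\,N-a}\sqrt{\frac{2MSE}{n}} \]

Since our data are balanced, \(n_{1}=n_{2}=n_{3}=n_{4}=n_{5}=n=5\). With N = 25 observations and a = 5 treatments, the critical value is \(t_{0.025,20} = 2.086\).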

t<-2.086
MSE<-8.06
n=5

Applying the LSD Equation

LSD = t*sqrt(2*MSE/n)
print(LSD)

## [1] 3.745517

Next, we look for any treatment difference that is greater than 3.745, which means that the pair of means differs significantly:

abs(mean(CW_15)-mean(CW_20))

## [1] 5.6

abs(mean(CW_15)-mean(CW_25))

## [1] 7.8

abs(mean(CW_15)-mean(CW_30))

## [1] 11.8

abs(mean(CW_15)-mean(CW_35))

## [1] 1

abs(mean(CW_20)-mean(CW_25))

## [1] 2.2

abs(mean(CW_20)-mean(CW_30))

## [1] 6.2

abs(mean(CW_20)-mean(CW_35))

## [1] 4.6

abs(mean(CW_25)-mean(CW_30))

## [1] 4

abs(mean(CW_25)-mean(CW_35))

## [1] 6.8

abs(mean(CW_30)-mean(CW_35))

## [1] 10.8

Conclusion

The pairs whose absolute difference is less than the LSD of 3.75 are CW15 and CW35, and CW20 and CW25:

\[ |\bar{y}_{1}-\bar{y}_{5}| = 1 < LSD \]

\[ |\bar{y}_{2}-\bar{y}_{3}| = 2.2 < LSD \]

As a result, we fail to reject \(H_{0}\) for these two pairs and reject it for all other pairs.
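The ten comparisons can be tabulated at once. The block below is a small sketch, assuming the data from the table above and the LSD value already computed (3.745517); `cotton`, `diffs`, and `exceeds_lsd` are illustrative names. It builds the matrix of absolute mean differences with `outer()` and flags the entries that exceed the LSD.

```r
# Observations for each cotton weight percentage, from the table above
cotton <- list(CW_15 = c(7, 7, 15, 11, 9),
               CW_20 = c(12, 17, 12, 18, 18),
               CW_25 = c(14, 19, 19, 18, 18),
               CW_30 = c(19, 25, 22, 19, 23),
               CW_35 = c(7, 10, 11, 15, 11))

means <- sapply(cotton, mean)   # treatment means
lsd   <- 3.745517               # LSD value computed earlier

# Absolute difference between every pair of treatment means
diffs <- abs(outer(means, means, "-"))
print(round(diffs, 1))

# TRUE where a pair differs by more than the LSD
exceeds_lsd <- diffs > lsd
print(exceeds_lsd)
```

The matrix is symmetric, so only the entries above (or below) the diagonal need to be read.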

0.3.2 Solution 3.10 C

The residual plots were already produced in the previous part by plot(aov.model).

Conclusion

The normal probability plot of the residuals is consistent with normality, since the majority of the points follow a straight line.
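Model adequacy also depends on the constant-variance assumption, which the conclusion above does not address. The following sketch shows one way it could be examined for the cotton data; Bartlett's test is an assumption of this example and not part of the original solution, and the object names are illustrative.

```r
# Tensile strength by cotton weight percentage, five observations per level
strength <- c(7, 7, 15, 11, 9,
              12, 17, 12, 18, 18,
              14, 19, 19, 18, 18,
              19, 25, 22, 19, 23,
              7, 10, 11, 15, 11)
cotton <- factor(rep(c(15, 20, 25, 30, 35), each = 5))

fit <- aov(strength ~ cotton)

# Residuals against fitted values: look for a funnel or other pattern
plot(fitted(fit), residuals(fit),
     xlab = "Fitted tensile strength", ylab = "Residual",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)

# Bartlett's test of equal variances across the five cotton levels
bt <- bartlett.test(strength ~ cotton)
print(bt)
```

A large p-value from Bartlett's test would be consistent with equal variances, though the test is itself sensitive to departures from normality.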

0.4 3. Question 3.44

Suppose that four normal populations have means of \(\mu_{1}= 50\), \(\mu_{2}= 60\), \(\mu_{3}= 50\), and \(\mu_{4}= 60\). How many observations should be taken from each population so that the probability of rejecting the null hypothesis of equal population means is at least 0.90? Assume that \(\alpha = 0.05\) and that a reasonable estimate of the error variance is \(\sigma^{2} = 25\).

Solution:

Number of groups (populations) = 4

\(\alpha = 0.05\)

\(\sigma^{2} = 25\) (\(\sigma = 5\))

Power = 0.90

Applying power Anova test:

power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 25, sig.level = 0.05, power = 0.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 4.658128
##     between.var = 33.33333
##      within.var = 25
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

Conclusion:

Rounding n = 4.66 up, 5 observations should be taken from each population.
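The rounding can be checked by running the calculation in the other direction: fix n and let `power.anova.test()` return the power. This is a short sketch using the same inputs as above; `power_n4` and `power_n5` are illustrative names.

```r
# Variance of the four hypothesized population means
between <- var(c(50, 50, 60, 60))

# Power achieved with 4 and with 5 observations per group
power_n4 <- power.anova.test(groups = 4, n = 4, between.var = between,
                             within.var = 25, sig.level = 0.05)$power
power_n5 <- power.anova.test(groups = 4, n = 5, between.var = between,
                             within.var = 25, sig.level = 0.05)$power
print(c(n4 = power_n4, n5 = power_n5))
```

Four observations per group fall short of the 0.90 target and five meet it, which is why the fractional n is rounded up rather than to the nearest integer.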

0.5 4. Question 3.45

From Problem 3.44.

a. How would your answer change if a reasonable estimate of the experimental error variance were 36?

b. How would your answer change if a reasonable estimate of the experimental error variance were 49?

c. Can you draw any conclusions about the sensitivity of your answer in this particular situation about how your estimate of \(\sigma^{2}\) affects the decision about sample size?

d. Can you make any recommendations about how we should use this general approach to choosing n in practice?

Solution 3a

How would your answer change if a reasonable estimate of the experimental error variance were 36?

power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 36, sig.level = 0.05, power = 0.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 6.180885
##     between.var = 33.33333
##      within.var = 36
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

Conclusion: Rounding n = 6.18 up, we now need 7 observations per group.

Solution 3b

How would your answer change if a reasonable estimate of the experimental error variance were 49?

power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 49, sig.level = 0.05, power = 0.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 7.998751
##     between.var = 33.33333
##      within.var = 49
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

Deduction: Rounding n = 7.999 up, we now need 8 observations per group.

Solution 3c

Can you draw any conclusions about the sensitivity of your answer in this particular situation about how your estimate of \(\sigma^{2}\) affects the decision about sample size?

The required sample size is sensitive to the variance estimate and grows roughly linearly with it: as the error variance increases from 25 to 36 to 49, the required number of observations per group rises from 5 to 7 to 8.
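The three calculations can be collected into one table to make the sensitivity easier to see. This is a minimal sketch that reuses the same `power.anova.test()` call for each variance estimate; `error_vars`, `n_exact`, and `n_needed` are illustrative names.

```r
# Candidate estimates of the error variance from Problems 3.44 and 3.45
error_vars <- c(25, 36, 49)

# Exact (fractional) n per group needed for power 0.90 at each variance
n_exact <- sapply(error_vars, function(v) {
  power.anova.test(groups = 4, between.var = var(c(50, 50, 60, 60)),
                   within.var = v, sig.level = 0.05, power = 0.90)$n
})

# Round up, since n must be a whole number of observations
n_needed <- ceiling(n_exact)
print(data.frame(error_var = error_vars,
                 n_exact = round(n_exact, 2),
                 n_needed = n_needed))
```

Extending `error_vars` to a wider grid of values is a simple way to see how the required n behaves beyond these three points.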

Solution 3d

Can you make any recommendations about how we should use this general approach to choosing n in practice?

Recommendation: When designing an experiment, consider a range of plausible values for the error variance and compute n for each. This shows how sensitive the sample size is to the estimate and guards against collecting too few observations.

0.6 Complete R Code

#Question 3c
Mixingtech_1<-c(3129,3000,2865,2890)
Mixingtech_2<-c(3200,3300,2975,3150)
Mixingtech_3<-c(2800,2900,2985,3050)
Mixingtech_4<-c(2600,2700,2600,2765)
dat<-data.frame(Mixingtech_1,Mixingtech_2,Mixingtech_3,Mixingtech_4)

#tidying up the data 

library(tidyr)
dat<-pivot_longer(dat,c(Mixingtech_1,Mixingtech_2,Mixingtech_3,Mixingtech_4))
print(dat)

#applying one way anova and plotting the model

aov.model<-aov(value~name,data=dat)
summary(aov.model)
plot(aov.model)


#Getting the LSD value

t<-2.179
MSE<-12826
n=4

LSD = t*sqrt(2*MSE/n)
print(LSD)

#Testing the pairs
abs(mean(Mixingtech_1)-mean(Mixingtech_2))
abs(mean(Mixingtech_1)-mean(Mixingtech_3))
abs(mean(Mixingtech_4)-mean(Mixingtech_1))
abs(mean(Mixingtech_2)-mean(Mixingtech_3))
abs(mean(Mixingtech_2)-mean(Mixingtech_4))
abs(mean(Mixingtech_4)-mean(Mixingtech_3))


#Question 3D
plot(aov.model)


#Question 3F

Strength <- c(3129, 3000, 2865, 2890, 3200, 3300, 2975, 3150, 2800, 2900, 2985, 3050, 2600, 2700, 2600, 2765)
Type <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
library(car)
library(carData)
# Strength and Type are not columns of dat, so collect them in their own data frame
scatter_dat <- data.frame(Strength, Type)
scatterplot(Strength ~ Type, data=scatter_dat,
            xlab="Mixing Technique", ylab="Tensile Strength",
            main="Scatter Plot")

#Solution 3.10b

#Load the data 

CW_15<-c(7,7,15,11,9)
CW_20<-c(12,17,12,18,18)
CW_25<-c(14,19,19,18,18)
CW_30<-c(19,25,22,19,23)
CW_35<-c(7,10,11,15,11)
dat<-data.frame(CW_15,CW_20,CW_25,CW_30,CW_35)

#tidy the data

library(tidyr)
dat<-pivot_longer(dat,c(CW_15,CW_20,CW_25,CW_30,CW_35))
print(dat)

#performing one way ANOVA and plot
aov.model<-aov(value~name,data=dat)
summary(aov.model)
plot(aov.model)

#Applying the LSD Function
t<-2.086
MSE<-8.06
n=5

LSD = t*sqrt(2*MSE/n)
print(LSD)


#Treatment difference
abs(mean(CW_15)-mean(CW_20))
abs(mean(CW_15)-mean(CW_25))
abs(mean(CW_15)-mean(CW_30))
abs(mean(CW_15)-mean(CW_35))
abs(mean(CW_20)-mean(CW_25))
abs(mean(CW_20)-mean(CW_30))
abs(mean(CW_20)-mean(CW_35))
abs(mean(CW_25)-mean(CW_30))
abs(mean(CW_25)-mean(CW_35))
abs(mean(CW_30)-mean(CW_35))


## Question 3.44
#Applying Power Anova Test

power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 25, sig.level = 0.05, power = 0.90)

## Question 3.45

#Solution 3a

#The power anova test
power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 36, sig.level = 0.05, power = 0.90)


#Solution 3b
#The power anova test
power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 49, sig.level = 0.05, power = 0.90)

Ponmile Ajala

Last compiled on October 01, 2023 at 11:56 AM - CDT
