The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. A completely randomized experiment was conducted and the following data were collected:
Observations of Tensile Strength
| Mixing Technique | Obs. 1 | Obs. 2 | Obs. 3 | Obs. 4 |
|---|---|---|---|---|
| 1 | 3129 | 3000 | 2865 | 2890 |
| 2 | 3200 | 3300 | 2975 | 3150 |
| 3 | 2800 | 2900 | 2985 | 3050 |
| 4 | 2600 | 2700 | 2600 | 2765 |
c. Use the Fisher LSD method with \(\alpha = 0.05\) to make comparisons between pairs of means.
d. Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?
f. Plot the residuals versus the predicted tensile strength. Comment on the plot.
g. Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.
Reading the Data:
Mixingtech_1<-c(3129,3000,2865,2890)
Mixingtech_2<-c(3200,3300,2975,3150)
Mixingtech_3<-c(2800,2900,2985,3050)
Mixingtech_4<-c(2600,2700,2600,2765)
dat<-data.frame(Mixingtech_1,Mixingtech_2,Mixingtech_3,Mixingtech_4)
Next, we reshape the data into tidy (long) format:
library(tidyr)
dat<-pivot_longer(dat,c(Mixingtech_1,Mixingtech_2,Mixingtech_3,Mixingtech_4))
print(dat)
## # A tibble: 16 × 2
## name value
## <chr> <dbl>
## 1 Mixingtech_1 3129
## 2 Mixingtech_2 3200
## 3 Mixingtech_3 2800
## 4 Mixingtech_4 2600
## 5 Mixingtech_1 3000
## 6 Mixingtech_2 3300
## 7 Mixingtech_3 2900
## 8 Mixingtech_4 2700
## 9 Mixingtech_1 2865
## 10 Mixingtech_2 2975
## 11 Mixingtech_3 2985
## 12 Mixingtech_4 2600
## 13 Mixingtech_1 2890
## 14 Mixingtech_2 3150
## 15 Mixingtech_3 3050
## 16 Mixingtech_4 2765
Applying One Way ANOVA:
aov.model<-aov(value~name,data=dat)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 3 489740 163247 12.73 0.000489 ***
## Residuals 12 153908 12826
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
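As a quick check on the table above, the F statistic is the ratio of the treatment mean square to the error mean square. The critical value \(F_{0.05,3,12} = 3.49\) is taken from a standard F table and is not part of the original output:

\[ F_{0}=\frac{MS_{Treatments}}{MS_{E}}=\frac{163247}{12826}\approx 12.73 > F_{0.05,3,12}=3.49 \]

so the hypothesis of equal treatment means is rejected at the 5% level, which agrees with the reported p-value of 0.000489.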
Next, we generate the diagnostic plots for the fitted model:
plot(aov.model)
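Next, we define the hypotheses and conduct the LSD test:
\[ H_{0}:\mu _{i}-\mu _{j}= 0 \]
\[ H_{1}:\mu _{i}-\mu _{j}\neq 0 \]
where i and j (with i ≠ j) index the mixing techniques.
We now calculate the least significant difference with this equation:
\[ LSD= t_{\alpha /2,\,N-a}\sqrt{\frac{2MSE}{n}} \]
Here N = 16 is the total number of observations, a = 4 is the number of treatments, and n = 4 is the number of replicates per treatment, so the error degrees of freedom are N - a = 12.
From our ANOVA result (MSE = 12826) and the tabulated value \(t_{0.025,12} = 2.179\):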
t<-2.179
MSE<-12826
n=4
LSD = t*sqrt(2*MSE/n)
print(LSD)
## [1] 174.497
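The tabulated t value and the rounded MSE used above can also be pulled directly from a fitted model. The following is a minimal sketch, assuming the same data as in the problem statement; the names strength, technique, fit, and lsd are introduced here for illustration only.

```r
# Data from the problem statement (four techniques, four replicates each)
strength  <- c(3129, 3000, 2865, 2890,
               3200, 3300, 2975, 3150,
               2800, 2900, 2985, 3050,
               2600, 2700, 2600, 2765)
technique <- factor(rep(1:4, each = 4))

fit <- aov(strength ~ technique)

alpha  <- 0.05
df_err <- df.residual(fit)            # error degrees of freedom, N - a = 12
mse    <- deviance(fit) / df_err      # residual sum of squares / error df
n      <- 4                           # replicates per treatment (balanced design)

t_crit <- qt(1 - alpha / 2, df_err)   # two-sided critical value t_{alpha/2, N-a}
lsd    <- t_crit * sqrt(2 * mse / n)
print(c(t_crit = t_crit, MSE = mse, LSD = lsd))
```

Because nothing is rounded by hand, the result can differ from the hand calculation in the second decimal place.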
Our LSD value is 174.5.
Any pair of treatment means whose absolute difference exceeds 174.5 is significantly different.
Testing each pair below:
abs(mean(Mixingtech_1)-mean(Mixingtech_2))
## [1] 185.25
abs(mean(Mixingtech_1)-mean(Mixingtech_3))
## [1] 37.25
abs(mean(Mixingtech_4)-mean(Mixingtech_1))
## [1] 304.75
abs(mean(Mixingtech_2)-mean(Mixingtech_3))
## [1] 222.5
abs(mean(Mixingtech_2)-mean(Mixingtech_4))
## [1] 490
abs(mean(Mixingtech_4)-mean(Mixingtech_3))
## [1] 267.5
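The six comparisons above can be collected into one table and flagged against the LSD automatically. This is a sketch only: the labels T1 to T4 and the objects pairs, diffs, and result are hypothetical names, and the LSD is the value printed earlier.

```r
# Treatment means from the data given in the problem
means <- c(T1 = mean(c(3129, 3000, 2865, 2890)),
           T2 = mean(c(3200, 3300, 2975, 3150)),
           T3 = mean(c(2800, 2900, 2985, 3050)),
           T4 = mean(c(2600, 2700, 2600, 2765)))
LSD <- 174.497   # least significant difference computed earlier

pairs <- combn(names(means), 2)                       # 2 x 6 matrix of pair labels
diffs <- abs(means[pairs[1, ]] - means[pairs[2, ]])   # absolute mean differences

result <- data.frame(pair        = paste(pairs[1, ], pairs[2, ], sep = " vs "),
                     abs_diff    = unname(diffs),
                     significant = unname(diffs) > LSD)
print(result)
```

Rows with significant equal to TRUE are the pairs whose means differ at the 5% level.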
Conclusion
All pairs of means differ significantly except one.
The only pair for which we fail to reject \(H_{0}\) is \(\mu_{1}\) and \(\mu_{3}\), because the absolute difference in their sample means is less than the LSD, i.e.,
\[ |\bar{y}_{1}-\bar{y}_{3}|=37.25< LSD \]
plot(aov.model)
Conclusion: Since most of the points in the normal Q-Q plot lie close to a straight line, the normality assumption appears reasonable.
In the "Residuals vs Fitted" plot, the spread of the residuals shows no obvious pattern across the fitted values. As a result, we uphold the assumption of constant variance.
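The visual checks can be supplemented with formal tests. The sketch below is an addition, not part of the original analysis: it applies the Shapiro-Wilk test to the residuals and Bartlett's test for equal variances, both from base R's stats package. The names strength, technique, fit, sw, and bt are introduced here for illustration. With only 16 observations these tests have low power, so they should be read alongside the plots rather than in place of them.

```r
strength  <- c(3129, 3000, 2865, 2890,
               3200, 3300, 2975, 3150,
               2800, 2900, 2985, 3050,
               2600, 2700, 2600, 2765)
technique <- factor(rep(1:4, each = 4))
fit <- aov(strength ~ technique)

# Shapiro-Wilk: H0 is that the residuals come from a normal distribution
sw <- shapiro.test(residuals(fit))

# Bartlett: H0 is that all four techniques share the same variance
bt <- bartlett.test(strength ~ technique)

print(sw)
print(bt)
```

A large p-value in either test means the data give no evidence against the corresponding assumption. To draw a single diagnostic panel, plot(aov.model, which = 1) gives residuals versus fitted values and plot(aov.model, which = 2) gives the normal Q-Q plot.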
Strength <- c(3129, 3000, 2865, 2890, 3200, 3300, 2975, 3150, 2800, 2900, 2985, 3050, 2600, 2700, 2600, 2765)
Type <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
library(car)
library(carData)
# dat does not contain Strength or Type, so build a data frame that does
plot_dat <- data.frame(Strength, Type)
scatterplot(Strength ~ Type, data=plot_dat,
xlab="Mixing Technique", ylab="Tensile Strength",
main="Scatter Plot")
Deductions
a. Mixing techniques 1 and 3 show similar tensile strengths, and both differ noticeably from techniques 2 and 4.
b. The scatter plot is somewhat similar to the graph of “Residual vs. Factors Levels”.
A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men's shirts. Strength is usually affected by the percentage of cotton used in the blend of materials for the fiber. The engineer conducts a completely randomized experiment with five levels of cotton content and replicates the experiment five times. The data are shown in the following table.
Observations of Tensile Strength of cloth fibre
| Cotton Weight % | Obs. 1 | Obs. 2 | Obs. 3 | Obs. 4 | Obs. 5 |
|---|---|---|---|---|---|
| 15 | 7 | 7 | 15 | 11 | 9 |
| 20 | 12 | 17 | 12 | 18 | 18 |
| 25 | 14 | 19 | 19 | 18 | 18 |
| 30 | 19 | 25 | 22 | 19 | 23 |
| 35 | 7 | 10 | 11 | 15 | 11 |
b. Use the Fisher LSD method to make comparisons between the pairs of means. What conclusions can you draw?
c. Analyze the residuals from this experiment and comment on model adequacy.
Loading the data
CW_15<-c(7,7,15,11,9)
CW_20<-c(12,17,12,18,18)
CW_25<-c(14,19,19,18,18)
CW_30<-c(19,25,22,19,23)
CW_35<-c(7,10,11,15,11)
dat<-data.frame(CW_15,CW_20,CW_25,CW_30,CW_35)
Reshaping the data into tidy (long) format with pivot_longer():
library(tidyr)
dat<-pivot_longer(dat,c(CW_15,CW_20,CW_25,CW_30,CW_35))
print(dat)
## # A tibble: 25 × 2
## name value
## <chr> <dbl>
## 1 CW_15 7
## 2 CW_20 12
## 3 CW_25 14
## 4 CW_30 19
## 5 CW_35 7
## 6 CW_15 7
## 7 CW_20 17
## 8 CW_25 19
## 9 CW_30 25
## 10 CW_35 10
## # ℹ 15 more rows
One Way ANOVA:
aov.model<-aov(value~name,data=dat)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 4 475.8 118.94 14.76 9.13e-06 ***
## Residuals 20 161.2 8.06
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Plotting the Model
plot(aov.model)
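Finally, we define the hypotheses for the LSD test:
\[ H_{0}:\mu _{i}-\mu _{j}= 0 \]
\[ H_{1}:\mu _{i}-\mu _{j}\neq 0 \]
where i and j (with i ≠ j) index the cotton weight percentages.
We now calculate the least significant difference with this equation:
\[ LSD= t_{\alpha /2,\,N-a}\sqrt{\frac{2MSE}{n}} \]
Since our data are balanced, \(n_{1}=n_{2}=n_{3}=n_{4}=n_{5}=n=5\). With N = 25 observations and a = 5 treatments, the error degrees of freedom are N - a = 20, and the tabulated value is \(t_{0.025,20} = 2.086\).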
t<-2.086
MSE<-8.06
n=5
Applying the LSD Equation
LSD = t*sqrt(2*MSE/n)
print(LSD)
## [1] 3.745517
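Written out by hand, with the values substituted from the ANOVA table (N = 25 and a = 5, so 20 error degrees of freedom), the calculation is:

\[ LSD = t_{0.025,20}\sqrt{\frac{2MSE}{n}} = 2.086\sqrt{\frac{2(8.06)}{5}} = 2.086\sqrt{3.224} \approx 2.086 \times 1.7956 \approx 3.75 \]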
Next, we compare each absolute treatment difference with 3.745. A difference greater than 3.745 means that the pair of means differs significantly; a smaller difference means we fail to reject \(H_{0}\) for that pair.
abs(mean(CW_15)-mean(CW_20))
## [1] 5.6
abs(mean(CW_15)-mean(CW_25))
## [1] 7.8
abs(mean(CW_15)-mean(CW_30))
## [1] 11.8
abs(mean(CW_15)-mean(CW_35))
## [1] 1
abs(mean(CW_20)-mean(CW_25))
## [1] 2.2
abs(mean(CW_20)-mean(CW_30))
## [1] 6.2
abs(mean(CW_20)-mean(CW_35))
## [1] 4.6
abs(mean(CW_25)-mean(CW_30))
## [1] 4
abs(mean(CW_25)-mean(CW_35))
## [1] 6.8
abs(mean(CW_30)-mean(CW_35))
## [1] 10.8
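The same comparisons can be cross-checked with pairwise.t.test() from base R. With a pooled standard deviation and no p-value adjustment, each entry is the p-value of the t test that underlies Fisher's LSD, so a pair is declared different when its p-value falls below 0.05. This is a sketch under that equivalence; the names strength, cotton, lsd_test, and signif_pairs are introduced here for illustration.

```r
# Tensile strength by cotton weight percentage (five replicates each)
strength <- c( 7,  7, 15, 11,  9,
              12, 17, 12, 18, 18,
              14, 19, 19, 18, 18,
              19, 25, 22, 19, 23,
               7, 10, 11, 15, 11)
cotton <- factor(rep(c(15, 20, 25, 30, 35), each = 5))

# Pooled SD across all groups and unadjusted p-values: Fisher LSD comparisons
lsd_test <- pairwise.t.test(strength, cotton,
                            p.adjust.method = "none", pool.sd = TRUE)
print(lsd_test)

# TRUE marks pairs whose means differ at the 5% level
signif_pairs <- lsd_test$p.value < 0.05
print(signif_pairs)
```

The entries marked FALSE should match the pairs whose absolute difference is below the LSD.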
Conclusion
The pairs whose absolute difference is less than the LSD of 3.75 are CW15 and CW35, and CW20 and CW25:
\[ |\bar{y}_{1}-\bar{y}_{5}|=1< LSD \]
\[ |\bar{y}_{2}-\bar{y}_{3}|=2.2< LSD \]
As a result, we fail to reject \(H_{0}\) for these two pairs. For every other pair, we reject \(H_{0}\).
The residual plots were already produced above by plot(aov.model) when the ANOVA model was fitted.
Conclusion
The normal probability plot of the residuals supports the normality assumption, since the majority of the points fall close to a straight line.
Suppose that four normal populations have means of \(\mu _{1}= 50\), \(\mu _{2}= 60\), \(\mu _{3}= 50\), and \(\mu _{4}= 60\). How many observations should be taken from each population so that the probability of rejecting the null hypothesis of equal population means is at least 0.90? Assume that \(\alpha = 0.05\) and that a reasonable estimate of the error variance is \(\sigma^{2} = 25\).
Solution:
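Number of groups: a = 4
\(\alpha = 0.05\)
\(\sigma^{2} = 25\) (so \(\sigma = 5\))
Power = 0.90
Applying the power ANOVA test, with between.var set to the variance of the four hypothesized means and within.var set to \(\sigma^{2}\):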
power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 25, sig.level = 0.05, power = 0.90)
##
## Balanced one-way analysis of variance power calculation
##
## groups = 4
## n = 4.658128
## between.var = 33.33333
## within.var = 25
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
Conclusion:
Rounding n = 4.66 up to the next integer, 5 observations should be taken from each population.
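One way to confirm the rounding is to evaluate the power at the integer sample sizes on either side of the solved n. The sketch below reuses power.anova.test() with n supplied and power left unspecified; the names between, p4, and p5 are introduced here for illustration.

```r
between <- var(c(50, 50, 60, 60))   # variance of the four hypothesized means

# Leaving `power` unspecified makes power.anova.test() solve for it
p4 <- power.anova.test(groups = 4, n = 4, between.var = between,
                       within.var = 25, sig.level = 0.05)$power
p5 <- power.anova.test(groups = 4, n = 5, between.var = between,
                       within.var = 25, sig.level = 0.05)$power

print(round(c(n4 = p4, n5 = p5), 3))
```

Because the solved n lies between 4 and 5, the power at n = 4 falls short of 0.90 while the power at n = 5 meets the target.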
Refer to Problem 3.44.
a. How would your answer change if a reasonable estimate of the experimental error variance were 36?
b. How would your answer change if a reasonable estimate of the experimental error variance were 49?
c. Can you draw any conclusions about the sensitivity of your answer in this particular situation about how your estimate of \(\sigma^{2}\) affects the decision about sample size?
d. Can you make any recommendations about how we should use this general approach to choosing n in practice?
Solution 3a
How would your answer change if a reasonable estimate of the experimental error variance were 36?
power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 36, sig.level = 0.05, power = 0.90)
##
## Balanced one-way analysis of variance power calculation
##
## groups = 4
## n = 6.180885
## between.var = 33.33333
## within.var = 36
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
Conclusion: Rounding n = 6.18 up to the next integer, we now need 7 observations per group.
Solution 3b
How would your answer change if a reasonable estimate of the experimental error variance were 49?
power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 49, sig.level = 0.05, power = 0.90)
##
## Balanced one-way analysis of variance power calculation
##
## groups = 4
## n = 7.998751
## between.var = 33.33333
## within.var = 49
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
Conclusion: Rounding n = 7.999 up to the next integer, we now need 8 observations per group.
Solution 3c
Can you draw any conclusions about the sensitivity of your answer in this particular situation about how your estimate of \(\sigma^{2}\) affects the decision about sample size?
The required sample size is sensitive to the estimate of \(\sigma^{2}\). Over this range it grows roughly linearly with the error variance, rising from 5 to 7 to 8 observations per group as \(\sigma^{2}\) goes from 25 to 36 to 49.
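The three power calculations can be run side by side to make the trend easier to see. This is a minimal sketch that loops over the three variance estimates used in this problem; the names between, variances, and n_exact are introduced here for illustration.

```r
between   <- var(c(50, 50, 60, 60))   # variance of the four hypothesized means
variances <- c(25, 36, 49)            # candidate estimates of the error variance

# Solve for the per-group sample size at each assumed error variance
n_exact <- sapply(variances, function(v) {
  power.anova.test(groups = 4, between.var = between, within.var = v,
                   sig.level = 0.05, power = 0.90)$n
})

print(data.frame(within.var = variances,
                 n_exact    = round(n_exact, 2),
                 n_needed   = ceiling(n_exact)))
```

The n_needed column rounds each solution up, since a fractional observation cannot be collected.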
Solution 3d
Can you make any recommendations about how we should use this general approach to choosing n in practice?
Recommendation: Since the true error variance is rarely known in advance, compute n over a plausible range of variance estimates and choose a sample size that still achieves the desired power under the larger, more pessimistic values.
#Question 3c
Mixingtech_1<-c(3129,3000,2865,2890)
Mixingtech_2<-c(3200,3300,2975,3150)
Mixingtech_3<-c(2800,2900,2985,3050)
Mixingtech_4<-c(2600,2700,2600,2765)
dat<-data.frame(Mixingtech_1,Mixingtech_2,Mixingtech_3,Mixingtech_4)
#tidying up the data
library(tidyr)
dat<-pivot_longer(dat,c(Mixingtech_1,Mixingtech_2,Mixingtech_3,Mixingtech_4))
print(dat)
#applying one way anova and plotting the model
aov.model<-aov(value~name,data=dat)
summary(aov.model)
plot(aov.model)
#Getting the LSD value
t<-2.179
MSE<-12826
n=4
LSD = t*sqrt(2*MSE/n)
print(LSD)
#Testing the pairs
abs(mean(Mixingtech_1)-mean(Mixingtech_2))
abs(mean(Mixingtech_1)-mean(Mixingtech_3))
abs(mean(Mixingtech_4)-mean(Mixingtech_1))
abs(mean(Mixingtech_2)-mean(Mixingtech_3))
abs(mean(Mixingtech_2)-mean(Mixingtech_4))
abs(mean(Mixingtech_4)-mean(Mixingtech_3))
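#Questions 3D and 3F (normal Q-Q and residuals vs fitted plots)
plot(aov.model)
#Question 3G
Strength <- c(3129, 3000, 2865, 2890, 3200, 3300, 2975, 3150, 2800, 2900, 2985, 3050, 2600, 2700, 2600, 2765)
Type <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
library(car)
library(carData)
#dat does not contain Strength or Type, so build a data frame that does
plot_dat <- data.frame(Strength, Type)
scatterplot(Strength ~ Type, data=plot_dat,
xlab="Mixing Technique", ylab="Tensile Strength",
main="Scatter Plot")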
#Solution 3.10b
#Load the data
CW_15<-c(7,7,15,11,9)
CW_20<-c(12,17,12,18,18)
CW_25<-c(14,19,19,18,18)
CW_30<-c(19,25,22,19,23)
CW_35<-c(7,10,11,15,11)
dat<-data.frame(CW_15,CW_20,CW_25,CW_30,CW_35)
#tidy the data
library(tidyr)
dat<-pivot_longer(dat,c(CW_15,CW_20,CW_25,CW_30,CW_35))
print(dat)
#performing one way ANOVA and plot
aov.model<-aov(value~name,data=dat)
summary(aov.model)
plot(aov.model)
#Applying the LSD Function
t<-2.086
MSE<-8.06
n=5
LSD = t*sqrt(2*MSE/n)
print(LSD)
#Treatment difference
abs(mean(CW_15)-mean(CW_20))
abs(mean(CW_15)-mean(CW_25))
abs(mean(CW_15)-mean(CW_30))
abs(mean(CW_15)-mean(CW_35))
abs(mean(CW_20)-mean(CW_25))
abs(mean(CW_20)-mean(CW_30))
abs(mean(CW_20)-mean(CW_35))
abs(mean(CW_25)-mean(CW_30))
abs(mean(CW_25)-mean(CW_35))
abs(mean(CW_30)-mean(CW_35))
## Question 3.44
#Applying Power Anova Test
power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 25, sig.level = 0.05, power = 0.90)
## Question 3.45
#Solution 3a
#The power anova test
power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 36, sig.level = 0.05, power = 0.90)
#Solution 3b
#The power anova test
power.anova.test(groups = 4, n=NULL, between.var = var(c(50,50,60,60)), within.var = 49, sig.level = 0.05, power = 0.90)