The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. A completely randomized experiment was conducted and the following data were collected:
c. Use the Fisher LSD method with α = 0.05 to make comparisons between pairs of means.
d. Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?
e. Plot the residuals versus the predicted tensile strength. Comment on the plot.
f. Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.
PART C:
Mixing1<-c(3129,3000,2865,2890)
Mixing2<-c(3200,3300,2975,3150)
Mixing3<-c(2800,2900,2985,3050)
Mixing4<-c(2600,2700,2600,2765)
dat<-data.frame(Mixing1,Mixing2,Mixing3,Mixing4)
library(tidyr)
dat<-pivot_longer(dat,c(Mixing1,Mixing2,Mixing3,Mixing4))
aov.model<-aov(value~name,data=dat)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 3 489740 163247 12.73 0.000489 ***
## Residuals 12 153908 12826
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Before conducting the LSD test, we state the hypotheses for each pair of means:
\[ H_0: \mu_i-\mu_j=0 \] \[ H_a: \mu_i-\mu_j\neq 0 \]
where i and j (i ≠ j) index the mixing techniques.
The least significant difference is calculated using the equation below: \[ LSD=t_{\alpha/2,N-a}\sqrt{\frac{2MS_E}{n}} \] Here N - a = 16 - 4 = 12 is the error degrees of freedom. From the t-table we take t(0.025, 12) = 2.179, and MSE = 12826 comes from the ANOVA results.
t<-2.179
MSE<-12826
n=4
LSD = t*sqrt(2*MSE/n)
print(LSD)
## [1] 174.497
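The tabled t value can be cross-checked in R instead of being read from a table. This is a minimal sketch, assuming the error degrees of freedom and mean square are taken from the Residuals row of the ANOVA table above; the names `df_error`, `t_crit` and `lsd` are illustrative and not part of the original solution.

```r
# Critical t value and LSD for two-sided comparisons at alpha = 0.05
alpha    <- 0.05
df_error <- 12       # Residuals Df from the ANOVA table (N - a = 16 - 4)
mse      <- 12826    # Residuals Mean Sq from the ANOVA table
n_rep    <- 4        # observations per mixing technique

t_crit <- qt(1 - alpha / 2, df_error)      # upper alpha/2 point of t with 12 df
lsd    <- t_crit * sqrt(2 * mse / n_rep)   # least significant difference

print(round(t_crit, 3))
print(round(lsd, 1))
```

Using `qt()` avoids rounding the critical value by hand and makes it easy to change α or the degrees of freedom.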
If the absolute difference between two treatment means exceeds 174.5, that pair of means differs significantly.
abs(mean(Mixing1)-mean(Mixing2)) #reject
## [1] 185.25
abs(mean(Mixing1)-mean(Mixing3)) #fail to reject
## [1] 37.25
abs(mean(Mixing1)-mean(Mixing4)) #reject
## [1] 304.75
abs(mean(Mixing2)-mean(Mixing3)) #reject
## [1] 222.5
abs(mean(Mixing2)-mean(Mixing4)) #reject
## [1] 490
abs(mean(Mixing3)-mean(Mixing4)) #reject
## [1] 267.5
Conclusion
The only pair of means for which we fail to reject the null hypothesis is μ1 & μ3, because the difference in means (37.25) is less than the LSD value of 174.5. For all other pairs of means, we reject the null hypothesis and conclude there is a significant difference between the population means.
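The same comparisons can be cross-checked with base R. The sketch below assumes that unadjusted pairwise t-tests with a pooled standard deviation are an acceptable stand-in for the hand-computed LSD comparison (they use the same pooled error estimate and the same 12 error degrees of freedom); the variable names `strength`, `technique` and `lsd_tests` are illustrative.

```r
# Tensile strength data, entered technique by technique
strength <- c(3129, 3000, 2865, 2890,   # technique 1
              3200, 3300, 2975, 3150,   # technique 2
              2800, 2900, 2985, 3050,   # technique 3
              2600, 2700, 2600, 2765)   # technique 4
technique <- factor(rep(1:4, each = 4))

# Unadjusted pairwise t-tests using the pooled standard deviation
lsd_tests <- pairwise.t.test(strength, technique,
                             p.adjust.method = "none", pool.sd = TRUE)
print(round(lsd_tests$p.value, 4))

# TRUE marks pairs that differ at alpha = 0.05 (NA cells are unused)
print(lsd_tests$p.value < 0.05)
```

A p-value below 0.05 corresponds to a difference in means larger than the LSD, so the table should flag the same pairs as the manual comparison.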
PART D:
In the normal probability plot below (the Normal Q-Q panel), the residuals fall close to a straight line, so the normality assumption appears reasonable.
plot(aov.model)
PART E:
The “Residuals vs Fitted” plot shows a spread that is not far off from one fitted value to the next, so the assumption of constant variance appears reasonable.
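The two diagnostic panels discussed in Parts D and E can also be drawn one at a time from the residuals and fitted values. This is a minimal sketch in base R; the object names `fit`, `res` and `pred` are illustrative, and the model is refit here only so the block is self-contained.

```r
# Refit the one-way ANOVA on the cement data
strength <- c(3129, 3000, 2865, 2890,
              3200, 3300, 2975, 3150,
              2800, 2900, 2985, 3050,
              2600, 2700, 2600, 2765)
technique <- factor(rep(1:4, each = 4))
fit <- aov(strength ~ technique)

res  <- residuals(fit)   # observed minus treatment mean
pred <- fitted(fit)      # each fitted value is its treatment mean

# Part D: normal probability plot of the residuals
qqnorm(res, main = "Normal Q-Q plot of residuals")
qqline(res)

# Part E: residuals versus predicted tensile strength
plot(pred, res,
     xlab = "Predicted tensile strength", ylab = "Residual",
     main = "Residuals vs predicted")
abline(h = 0, lty = 2)   # reference line at zero
```

Because the fitted values are the four treatment means, the second plot has four vertical strips; comparing their vertical spread is the informal check on constant variance.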
PART F:
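# Responses in the order Mixing1, Mixing2, Mixing3, Mixing4
Strength <- c(3129, 3000, 2865, 2890, 3200, 3300, 2975, 3150, 2800, 2900, 2985, 3050, 2600, 2700, 2600, 2765)
# Mixing technique label for each response
Type <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
library(car)
library(carData)
# Pass a data frame that actually contains Strength and Type (dat holds name/value only)
scatterplot(Strength ~ Type, data=data.frame(Strength, Type),
xlab="Mixing Technique", ylab="Tensile Strength",
main="Scatter Plot")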
The plot shown above displays the individual observations for each mixing technique. Technique 2 gives the highest tensile strengths and technique 4 the lowest, which agrees with the LSD comparisons in Part C.
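The sample average and a 95% confidence interval for each treatment mean can be computed directly as an aid to reading the plot. The sketch below assumes the usual interval for a treatment mean in a balanced one-way layout, mean ± t(α/2, N - a)·sqrt(MSE/n); the names `ci` and `half_width` are illustrative.

```r
# Cement data and one-way ANOVA fit
strength <- c(3129, 3000, 2865, 2890,
              3200, 3300, 2975, 3150,
              2800, 2900, 2985, 3050,
              2600, 2700, 2600, 2765)
technique <- factor(rep(1:4, each = 4))
fit <- aov(strength ~ technique)

n_rep    <- 4                                  # replicates per technique
df_error <- df.residual(fit)                   # N - a
mse      <- sum(residuals(fit)^2) / df_error   # pooled error mean square

means      <- tapply(strength, technique, mean)
half_width <- qt(0.975, df_error) * sqrt(mse / n_rep)

# One row per technique: mean with lower and upper 95% limits
ci <- data.frame(technique = names(means),
                 mean  = as.numeric(means),
                 lower = as.numeric(means) - half_width,
                 upper = as.numeric(means) + half_width)
print(ci)
```

All four intervals have the same width because the design is balanced and the error variance is pooled across techniques.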
A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men’s shirts. Strength is usually affected by the percentage of cotton used in the blend of materials for the fiber. The engineer conducts a completely randomized experiment with five levels of cotton content and replicates the experiment five times. The data are shown in the following table.
b. Use the Fisher LSD method to make comparisons between the pairs of means. What conclusions can you draw?
c. Analyze the residuals from this experiment and comment on model adequacy.
PART B:
CW15<-c(7,7,15,11,9)
CW20<-c(12,17,12,18,18)
CW25<-c(14,19,19,18,18)
CW30<-c(19,25,22,19,23)
CW35<-c(7,10,11,15,11)
dat<-data.frame(CW15,CW20,CW25,CW30,CW35)
dat<-pivot_longer(dat,c(CW15,CW20,CW25,CW30,CW35))
aov.model<-aov(value~name,data=dat)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 4 475.8 118.94 14.76 9.13e-06 ***
## Residuals 20 161.2 8.06
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Before conducting the LSD test, we state the hypotheses for each pair of means:
\[ H_0: \mu_i-\mu_j=0 \]
\[ H_a: \mu_i-\mu_j\neq 0 \]
where i and j (i ≠ j) index the cotton weight percentages.
The least significant difference is calculated using the equation below:
\[ LSD=t_{\alpha/2,N-a}\sqrt{\frac{2MS_E}{n}} \]
Here N - a = 25 - 5 = 20 is the error degrees of freedom. From the t-table we take t(0.025, 20) = 2.086, and MSE = 8.06 comes from the ANOVA results. Substituting into the LSD equation:
t<-2.086
MSE<-8.06
n=5
LSD = t*sqrt(2*MSE/n)
print(LSD)
## [1] 3.745517
If the absolute difference between two treatment means exceeds 3.745517, that pair of means differs significantly.
abs(mean(CW15)-mean(CW20))
## [1] 5.6
abs(mean(CW15)-mean(CW25))
## [1] 7.8
abs(mean(CW15)-mean(CW30))
## [1] 11.8
abs(mean(CW15)-mean(CW35))
## [1] 1
abs(mean(CW20)-mean(CW25))
## [1] 2.2
abs(mean(CW20)-mean(CW30))
## [1] 6.2
abs(mean(CW20)-mean(CW35))
## [1] 4.6
abs(mean(CW25)-mean(CW30))
## [1] 4
abs(mean(CW25)-mean(CW35))
## [1] 6.8
abs(mean(CW30)-mean(CW35))
## [1] 10.8
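We fail to reject the null hypothesis for the pairs (μ1 & μ5) and (μ2 & μ3), that is, 15% vs 35% and 20% vs 25% cotton, because their differences (1 and 2.2) are below the LSD of 3.745. For all other pairs, we reject the null hypothesis.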
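The ten comparisons above can be generated in one pass rather than typed individually. This is a sketch using `combn()` from base R, with MSE = 8.06 and 20 error degrees of freedom taken from the ANOVA table; the names `cotton`, `diffs` and `not_sig` are illustrative.

```r
# Observations for each cotton weight percentage
cotton <- list(CW15 = c(7, 7, 15, 11, 9),
               CW20 = c(12, 17, 12, 18, 18),
               CW25 = c(14, 19, 19, 18, 18),
               CW30 = c(19, 25, 22, 19, 23),
               CW35 = c(7, 10, 11, 15, 11))
means <- sapply(cotton, mean)

# LSD from the ANOVA table values
mse <- 8.06; df_error <- 20; n_rep <- 5
lsd <- qt(0.975, df_error) * sqrt(2 * mse / n_rep)

# All pairs of treatments and their absolute mean differences
pairs <- combn(names(means), 2)
diffs <- abs(means[pairs[1, ]] - means[pairs[2, ]])
names(diffs) <- paste(pairs[1, ], pairs[2, ], sep = " - ")

print(data.frame(difference = diffs, significant = diffs > lsd))

# Pairs for which we fail to reject the null hypothesis
not_sig <- names(diffs)[diffs <= lsd]
print(not_sig)
```

Generating the pairs programmatically reduces the chance of skipping or mistyping a comparison when the number of treatments grows.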
PART C:
In the normal probability plot below (the Normal Q-Q panel), the residuals fall close to a straight line, so the normality assumption appears reasonable.
The “Residuals vs Fitted” plot shows a spread that is not far off from one fitted value to the next, so the assumption of constant variance appears reasonable and the model seems adequate.
plot(aov.model)
Suppose that four normal populations have means of μ1=50, μ2=60, μ3=50, and μ4=60. How many observations should be taken from each population so that the probability of rejecting the null hypothesis of equal population means is at least 0.90? Assume that α = 0.05 and that a reasonable estimate of the error variance is σ² = 25 (σ = 5).
library(pwr)
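# d = (μMax - μMin)/σ is the standardized range of the means
# f = d/2 applies here because half the means sit at each extreme (50, 60, 50, 60)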
d=(60-50)/5
f=d/2
print(f)
## [1] 1
pwr.anova.test(k=4,n=NULL,f=1, sig.level = 0.05, power = 0.90)
##
## Balanced one-way analysis of variance power calculation
##
## k = 4
## n = 4.658119
## f = 1
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
Rounding n = 4.66 up, we would require at least 5 observations from each population.
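The effect size and sample size can be cross-checked without the shortcut f = d/2. The sketch below assumes Cohen's definition of f as the standard deviation of the population means (dividing by the number of groups) over σ, and uses `power.anova.test()` from base R, which takes the between-group variance of the means instead of f; the names `mu`, `sigma` and `check` are illustrative.

```r
mu    <- c(50, 60, 50, 60)   # hypothesized population means
sigma <- 5                   # error standard deviation (variance 25)

# Cohen's f: SD of the means (divisor k) relative to sigma
f <- sqrt(mean((mu - mean(mu))^2)) / sigma
print(f)

# Base-R power calculation for a balanced one-way ANOVA;
# between.var is the sample variance of the means (divisor k - 1)
check <- power.anova.test(groups = length(mu),
                          between.var = var(mu),
                          within.var = sigma^2,
                          sig.level = 0.05, power = 0.90)
print(check$n)            # fractional n per group
print(ceiling(check$n))   # round up to a whole number of observations
```

Both functions solve the same noncentral F power equation, so the fractional n here should agree with the `pwr.anova.test()` output above.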
Refer to Problem 3.44. a. How would your answer change if a reasonable estimate of the experimental error variance were 36? b. How would your answer change if a reasonable estimate of the experimental error variance were 49? c. Can you draw any conclusions about the sensitivity of your answer in this particular situation about how your estimate of σ² affects the decision about sample size? d. Can you make any recommendations about how we should use this general approach to choosing n in practice?
PART A:
Variance=36
sigma=6
d=(60-50)/sigma
print(d)
## [1] 1.666667
f=d/2
print(f)
## [1] 0.8333333
pwr.anova.test(k=4,n=NULL,f=0.8333333, sig.level = 0.05, power = 0.90)
##
## Balanced one-way analysis of variance power calculation
##
## k = 4
## n = 6.180858
## f = 0.8333333
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
Rounding n = 6.18 up, we would require at least 7 observations from each population.
PART B:
Variance=49
sigma=7
d=(60-50)/sigma
print(d)
## [1] 1.428571
f=d/2
print(f)
## [1] 0.7142857
pwr.anova.test(k=4,n=NULL,f=0.7142857, sig.level = 0.05, power = 0.90)
##
## Balanced one-way analysis of variance power calculation
##
## k = 4
## n = 7.998751
## f = 0.7142857
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
Rounding n = 7.9988 up, we would require at least 8 observations from each population.
PART C:
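As the error variance increases, the number of observations per population must also increase to achieve the same power at the given significance level. Here the required n rises from 5 to 7 to 8 as the variance estimate goes from 25 to 36 to 49, so the sample size is noticeably sensitive to the variance estimate.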
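The sensitivity described in Part C can be tabulated in a single pass over the candidate variance estimates. This is a sketch using base R's `power.anova.test()` rather than the pwr package used above; the names `error_var` and `sensitivity` are illustrative.

```r
mu        <- c(50, 60, 50, 60)   # hypothesized population means
error_var <- c(25, 36, 49)       # candidate estimates of the error variance

# Fractional sample size per group for each variance estimate
n_exact <- sapply(error_var, function(v) {
  power.anova.test(groups = length(mu),
                   between.var = var(mu),
                   within.var = v,
                   sig.level = 0.05, power = 0.90)$n
})

sensitivity <- data.frame(error_var,
                          n_exact    = round(n_exact, 2),
                          n_required = ceiling(n_exact))
print(sensitivity)
```

Extending `error_var` to a wider grid gives a quick picture of how the required sample size grows if the true variance is larger than the planning value.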
PART D:
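Because the required n depends on the variance estimate, it would be practical to compute n for both a lower and an upper plausible value of the variance. The resulting range of required observations can then be used to plan the data collection, with the larger value taken when resources allow.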
#3.7
Mixing1<-c(3129,3000,2865,2890)
Mixing2<-c(3200,3300,2975,3150)
Mixing3<-c(2800,2900,2985,3050)
Mixing4<-c(2600,2700,2600,2765)
dat<-data.frame(Mixing1,Mixing2,Mixing3,Mixing4)
library(tidyr)
dat<-pivot_longer(dat,c(Mixing1,Mixing2,Mixing3,Mixing4))
aov.model<-aov(value~name,data=dat)
summary(aov.model)
t<-2.179
MSE<-12826
n=4
LSD = t*sqrt(2*MSE/n)
print(LSD)
#If the absolute difference between two treatment averages exceeds 174.5, that pair of means differs significantly
abs(mean(Mixing1)-mean(Mixing2)) #reject
abs(mean(Mixing1)-mean(Mixing3)) #fail to reject
abs(mean(Mixing1)-mean(Mixing4)) #reject
abs(mean(Mixing2)-mean(Mixing3)) #reject
abs(mean(Mixing2)-mean(Mixing4)) #reject
abs(mean(Mixing3)-mean(Mixing4)) #reject
plot(aov.model)
Strength <- c(3129, 3000, 2865, 2890, 3200, 3300, 2975, 3150, 2800, 2900, 2985, 3050, 2600, 2700, 2600, 2765)
Type <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
library(car)
library(carData)
scatterplot(Strength ~ Type, data=data.frame(Strength, Type),
xlab="Mixing Technique", ylab="Tensile Strength",
main="Scatter Plot")
#3.10
CW15<-c(7,7,15,11,9)
CW20<-c(12,17,12,18,18)
CW25<-c(14,19,19,18,18)
CW30<-c(19,25,22,19,23)
CW35<-c(7,10,11,15,11)
dat<-data.frame(CW15,CW20,CW25,CW30,CW35)
dat<-pivot_longer(dat,c(CW15,CW20,CW25,CW30,CW35))
aov.model<-aov(value~name,data=dat)
summary(aov.model)
t<-2.086
MSE<-8.06
n=5
LSD = t*sqrt(2*MSE/n)
print(LSD)
#If the absolute difference between two treatment averages exceeds 3.745, that pair of means differs significantly
abs(mean(CW15)-mean(CW20))
abs(mean(CW15)-mean(CW25))
abs(mean(CW15)-mean(CW30))
abs(mean(CW15)-mean(CW35))
abs(mean(CW20)-mean(CW25))
abs(mean(CW20)-mean(CW30))
abs(mean(CW20)-mean(CW35))
abs(mean(CW25)-mean(CW30))
abs(mean(CW25)-mean(CW35))
abs(mean(CW30)-mean(CW35))
plot(aov.model)
#3.44
library(pwr)
#d= (μMax-μMin)/σ,
d=(60-50)/5
f=d/2
print(f)
pwr.anova.test(k=4,n=NULL,f=1, sig.level = 0.05, power = 0.90)
#3.45
Variance=36
sigma=6
d=(60-50)/sigma
print(d)
f=d/2
print(f)
pwr.anova.test(k=4,n=NULL,f=0.8333333, sig.level = 0.05, power = 0.90)
Variance=49
sigma=7
d=(60-50)/sigma
print(d)
f=d/2
print(f)
pwr.anova.test(k=4,n=NULL,f=0.7142857, sig.level = 0.05, power = 0.90)