This report captures the individual homework for Week 4. R code and results are provided. The required homework problems were taken from “Design and Analysis of Experiments, 8th Edition”:
1) 3.7c,d,e,f
2) 3.10b,c
3) 3.44 & 3.45
Answers to the questions are in blue.
The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. A completely randomized experiment was conducted and the following data were collected: [See table of values in book.]
(c) Use the Fisher LSD method with alpha = 0.05 to make comparisons between pairs of means.
(d) Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?
(e) Plot the residuals versus the predicted tensile strength. Comment on the plot.
(f) Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.
The Fisher LSD comparisons are below:
##
## Study: model ~ "MixingTechniques"
##
## LSD t Test for TensileStrengths
##
## Mean Square Error: 12825.69
##
## MixingTechniques, means and individual ( 95 %) CI
##
## TensileStrengths std r LCL UCL Min Max
## Mix1 2971.00 120.55704 4 2847.624 3094.376 2865 3129
## Mix2 3156.25 135.97641 4 3032.874 3279.626 2975 3300
## Mix3 2933.75 108.27242 4 2810.374 3057.126 2800 3050
## Mix4 2666.25 80.97067 4 2542.874 2789.626 2600 2765
##
## Alpha: 0.05 ; DF Error: 12
## Critical Value of t: 2.178813
##
## Comparison between treatments means
##
## difference pvalue signif. LCL UCL
## Mix1 - Mix2 -185.25 0.0392 * -359.72984 -10.77016
## Mix1 - Mix3 37.25 0.6501 -137.22984 211.72984
## Mix1 - Mix4 304.75 0.0025 ** 130.27016 479.22984
## Mix2 - Mix3 222.50 0.0167 * 48.02016 396.97984
## Mix2 - Mix4 490.00 0.0001 *** 315.52016 664.47984
## Mix3 - Mix4 267.50 0.0059 ** 93.02016 441.97984
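The interval half-widths and p-values in this table can be checked by hand. The following is a minimal sketch in base R that uses only the mean square error, error degrees of freedom, and group means reported above; the variable names are illustrative and are not taken from the original analysis.

```r
# Values copied from the LSD output above
mse      <- 12825.69   # Mean Square Error
df_error <- 12         # DF Error
n        <- 4          # replicates per mixing technique
means    <- c(Mix1 = 2971.00, Mix2 = 3156.25, Mix3 = 2933.75, Mix4 = 2666.25)

# Standard error of a difference between two means, and the LSD
se_diff <- sqrt(2 * mse / n)
t_crit  <- qt(1 - 0.05 / 2, df_error)
lsd     <- t_crit * se_diff

# All pairwise differences with two-sided p-values
pairs <- combn(names(means), 2)
diffs <- means[pairs[1, ]] - means[pairs[2, ]]
pvals <- 2 * pt(-abs(diffs / se_diff), df_error)

result <- data.frame(
  comparison  = paste(pairs[1, ], pairs[2, ], sep = " - "),
  difference  = unname(diffs),
  pvalue      = round(unname(pvals), 4),
  significant = unname(abs(diffs) > lsd)
)
print(lsd)
print(result)
```

The LSD works out to about 174.5, so every pair except Mix1 and Mix3 differs by more than the LSD.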
The normal probability plot of the residuals is below. The points fall close to a straight line, so the residuals are approximately normally distributed and the normality assumption appears reasonable.
The residuals versus the predicted tensile strengths are plotted below. There is some concern about a fan (funnel) pattern: the spread of the residuals appears to grow as the fitted values increase.
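The fan pattern can also be seen without the plot. The sketch below, in base R, orders the groups by mean and compares their standard deviations, using only the values printed in the LSD table; the variable names are illustrative.

```r
# Group means and standard deviations copied from the LSD table above
means <- c(Mix1 = 2971.00, Mix2 = 3156.25, Mix3 = 2933.75, Mix4 = 2666.25)
sds   <- c(Mix1 = 120.55704, Mix2 = 135.97641, Mix3 = 108.27242, Mix4 = 80.97067)

# Order the groups by mean and look at how the spread changes
ord <- order(means)
spread_by_mean <- data.frame(mean = means[ord], sd = sds[ord])
print(spread_by_mean)

# Ratio of the largest to the smallest standard deviation
sd_ratio <- max(sds) / min(sds)
print(sd_ratio)
```

The standard deviation rises steadily with the group mean, from about 81 for Mix4 to about 136 for Mix2. With only four replicates per group this is suggestive, not conclusive.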
Below is a scatter plot of the results to aid interpretation of this experiment. One can see that the model predicts with less accuracy as the tensile strengths increase.
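A sketch of how the three plots for parts (d), (e), and (f) might be produced in base R follows. The observations are transcribed from the textbook table as an assumption; they reproduce the group means and standard deviations in the LSD output. The object names `cement` and `model` are illustrative.

```r
# ASSUMPTION: observations transcribed from the textbook table; they reproduce
# the group means and standard deviations shown in the LSD output.
cement <- data.frame(
  MixingTechniques = factor(rep(paste0("Mix", 1:4), each = 4)),
  TensileStrengths = c(3129, 3000, 2865, 2890,
                       3200, 3300, 2975, 3150,
                       2800, 2900, 2985, 3050,
                       2600, 2700, 2600, 2765)
)

# One-way ANOVA with mixing technique as a factor
model <- aov(TensileStrengths ~ MixingTechniques, data = cement)

# (d) normal probability plot of the residuals
qqnorm(residuals(model))
qqline(residuals(model))

# (e) residuals versus predicted tensile strength
plot(fitted(model), residuals(model),
     xlab = "Predicted tensile strength", ylab = "Residual")
abline(h = 0, lty = 2)

# (f) scatter plot of the observations by mixing technique
stripchart(TensileStrengths ~ MixingTechniques, data = cement,
           vertical = TRUE, pch = 16, ylab = "Tensile strength")
```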
A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men’s shirts. Strength is usually affected by the percentage of cotton used in the blend of materials for the fiber. The engineer conducts a completely randomized experiment with five levels of cotton content and replicates the experiment five times. The data are shown in the following table. [data table provided in book]
(b) Use the Fisher LSD method to make comparisons between the pairs of means. What conclusions can you draw?
(c) Analyze the residuals from this experiment and comment on model adequacy.
The Fisher LSD comparisons are below, with the cotton weights 15, 20, 25, 30, and 35 labeled 1 through 5 in the output. The comparisons between the cotton weights with the lowest and highest tensile strengths are significant. This is shown in the statistically significant comparisons CW15-CW25, CW15-CW30, CW25-CW35, and CW30-CW35 (1 - 3, 1 - 4, 3 - 5, and 4 - 5 in the output).
##
## Study: cottonmodel ~ "CottonWeights"
##
## LSD t Test for CottonFiberTensiles
##
## Mean Square Error: 26.23217
##
## CottonWeights, means and individual ( 95 %) CI
##
## CottonFiberTensiles std r LCL UCL Min Max
## 1 9.8 3.346640 5 5.06172 14.53828 7 15
## 2 15.4 3.130495 5 10.66172 20.13828 12 18
## 3 17.6 2.073644 5 12.86172 22.33828 14 19
## 4 21.6 2.607681 5 16.86172 26.33828 19 25
## 5 10.8 2.863564 5 6.06172 15.53828 7 15
##
## Alpha: 0.05 ; DF Error: 23
## Critical Value of t: 2.068658
##
## Comparison between treatments means
##
## difference pvalue signif. LCL UCL
## 1 - 2 -5.6 0.0973 . -12.30094036 1.1009404
## 1 - 3 -7.8 0.0245 * -14.50094036 -1.0990596
## 1 - 4 -11.8 0.0014 ** -18.50094036 -5.0990596
## 1 - 5 -1.0 0.7603 -7.70094036 5.7009404
## 2 - 3 -2.2 0.5038 -8.90094036 4.5009404
## 2 - 4 -6.2 0.0681 . -12.90094036 0.5009404
## 2 - 5 4.6 0.1690 -2.10094036 11.3009404
## 3 - 4 -4.0 0.2294 -10.70094036 2.7009404
## 3 - 5 6.8 0.0470 * 0.09905964 13.5009404
## 4 - 5 10.8 0.0029 ** 4.09905964 17.5009404
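One detail in this output is worth checking. With five groups of five observations, a one-way ANOVA on a categorical factor would leave 25 - 5 = 20 error degrees of freedom, not 23. The sketch below, in base R, pools the within-group variances from the standard deviations printed in the table and recomputes the LSD. The variable names are illustrative. Reading the 23 degrees of freedom as a sign that CottonWeights entered the model as a numeric variable and not as a factor is an assumption; the output does not state it.

```r
# Values copied from the "means and individual CI" table above
means <- c(9.8, 15.4, 17.6, 21.6, 10.8)
sds   <- c(3.346640, 3.130495, 2.073644, 2.607681, 2.863564)
n     <- 5                # replicates per cotton weight
k     <- length(means)    # number of cotton weights

# Pooled within-group variance (equal n, so a plain average of the variances)
mse_pooled <- mean(sds^2)
df_pooled  <- k * (n - 1)

# LSD based on the pooled variance
lsd_pooled <- qt(0.975, df_pooled) * sqrt(2 * mse_pooled / n)

# Pairwise differences compared against this LSD
pairs <- combn(k, 2)
diffs <- means[pairs[1, ]] - means[pairs[2, ]]
exceeds <- data.frame(
  comparison  = paste(pairs[1, ], pairs[2, ], sep = " - "),
  difference  = diffs,
  exceeds_lsd = abs(diffs) > lsd_pooled
)
print(c(mse_pooled = mse_pooled, df = df_pooled, lsd = lsd_pooled))
print(exceeds)
```

Under the pooled estimate (about 8.06 on 20 degrees of freedom) the LSD is roughly 3.75, and eight of the ten differences exceed it, compared with four flagged at the 0.05 level in the output above.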
The residuals were analyzed to check model adequacy; in particular, they should be approximately normal with constant variance. The normal probability plot (the first plot) supports normality, and the Shapiro-Wilk test below does not reject it (W = 0.947, p = 0.214). Constant variance is supported by the similar vertical spread of the residuals at each fitted value: although the individual residuals differ, the range from highest to lowest is roughly the same for every fitted value.
##
## Shapiro-Wilk normality test
##
## data: residuals(cottonmodel)
## W = 0.94698, p-value = 0.2141
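A sketch of how these residual checks might be run in base R, assuming cotton weight is treated as a categorical factor. The observations are transcribed from the textbook table as an assumption; they reproduce the group means and standard deviations in the LSD output. The names `cotton` and `fit_factor` are illustrative, and the W statistic from this fit may differ from the one printed above if the original `cottonmodel` was specified differently.

```r
# ASSUMPTION: observations transcribed from the textbook table; they reproduce
# the group means and standard deviations shown in the LSD output.
cotton <- data.frame(
  CottonWeights = rep(c(15, 20, 25, 30, 35), each = 5),
  CottonFiberTensiles = c( 7,  7, 15, 11,  9,
                          12, 17, 12, 18, 18,
                          14, 18, 18, 19, 19,
                          19, 25, 22, 19, 23,
                           7, 10, 11, 15, 11)
)

# Treat cotton weight as a categorical factor for the one-way ANOVA
fit_factor <- aov(CottonFiberTensiles ~ factor(CottonWeights), data = cotton)
res <- residuals(fit_factor)

# Normality: Shapiro-Wilk test and normal probability plot
print(shapiro.test(res))
qqnorm(res)
qqline(res)

# Constant variance: residuals against fitted values
plot(fitted(fit_factor), res, xlab = "Fitted value", ylab = "Residual")
abline(h = 0, lty = 2)
```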
Based on the information given in problem 3.44 in the book, how many observations should be taken from each population so that the probability of rejecting the null hypothesis of equal population means is at least 0.90?
Based on the output below, n = 4.66, which rounds up to 5 observations per population.
##
## Balanced one-way analysis of variance power calculation
##
## k = 4
## n = 4.658119
## f = 1
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
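The calculation behind this output can be sketched in base R without extra packages. The effect size f is the standard deviation of the population means divided by sigma. The means 50, 60, 50, 60 and sigma = 5 used below are taken from the textbook problem as an assumption, and they reproduce the f = 1 shown above. The helper name `anova_power` is hypothetical.

```r
# ASSUMPTION: population means and sigma as given in the textbook problem
mu    <- c(50, 60, 50, 60)
sigma <- 5
k     <- length(mu)

# Effect size f: SD of the means (population form) divided by sigma
f <- sqrt(mean((mu - mean(mu))^2)) / sigma

# Power of the one-way ANOVA F test with n observations per group,
# computed from the noncentral F distribution
anova_power <- function(n, k, f, alpha = 0.05) {
  df1 <- k - 1
  df2 <- k * (n - 1)
  pf(qf(1 - alpha, df1, df2), df1, df2, ncp = k * n * f^2, lower.tail = FALSE)
}

# Smallest (fractional) n per group that reaches the target power of 0.90
n_needed <- uniroot(function(n) anova_power(n, k, f) - 0.90,
                    interval = c(2, 100), tol = 1e-8)$root

print(c(f = f, n = n_needed, n_rounded = ceiling(n_needed)))
```

The fractional n is always rounded up, since rounding down would leave the power below the target.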
(a) How would your answer change if a reasonable estimate of the experimental error variance were sigma^2 = 36?
Based on the output below, the required sample size grows to 6.18, which rounds up to 7 observations per population.
##
## Balanced one-way analysis of variance power calculation
##
## k = 4
## n = 6.180857
## f = 0.8333333
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
(b) How would your answer change if a reasonable estimate of the experimental error variance were sigma^2 = 49?
Based on the output below, the required sample size grows to just under 8 (7.999), which rounds up to 8 observations per population.
##
## Balanced one-way analysis of variance power calculation
##
## k = 4
## n = 7.998751
## f = 0.7142857
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
(c) Can you draw any conclusions about the sensitivity of your answer in this particular situation about how your estimate of sigma affects the decision about sample size?
With all other variables held constant, a larger error variance means more observations must be taken from each population to reach the same power. Over this small window, each increase of 1 in sigma (from 5 to 6 to 7) raises the required n by about 1.5 to 2 before rounding (4.66, 6.18, 8.00), and from 5 to 7 to 8 after rounding up. This researcher would not speculate on whether this holds in other situations.
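The sensitivity to sigma can be tabulated in one pass. The sketch below, in base R, solves for n at each sigma. The value 5 for the standard deviation of the group means is inferred from the outputs above (f times sigma equals 5 in each case), sigma = 5 for the first case is likewise inferred from f = 1, and the helper names are hypothetical.

```r
# Power of the one-way ANOVA F test with n observations per group
anova_power <- function(n, k, f, alpha = 0.05) {
  df1 <- k - 1
  df2 <- k * (n - 1)
  pf(qf(1 - alpha, df1, df2), df1, df2, ncp = k * n * f^2, lower.tail = FALSE)
}

# Fractional n per group needed to reach the target power
n_for_power <- function(k, f, power = 0.90) {
  uniroot(function(n) anova_power(n, k, f) - power,
          interval = c(2, 1000), tol = 1e-8)$root
}

k       <- 4
sigma_m <- 5            # SD of the group means, inferred from f * sigma above
sigmas  <- c(5, 6, 7)   # sigma^2 = 25, 36, 49

sensitivity <- data.frame(sigma = sigmas, f = sigma_m / sigmas)
sensitivity$n_exact   <- sapply(sensitivity$f, function(f) n_for_power(k, f))
sensitivity$n_rounded <- ceiling(sensitivity$n_exact)
print(sensitivity)
```

Because the noncentrality parameter is k n f^2 with f = sigma_m / sigma, holding power fixed makes n grow roughly in proportion to sigma^2 and not linearly in sigma. Over a range as narrow as 5 to 7 the two are hard to tell apart.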
(d) Can you make any recommendations about how we should use this general approach to choosing n in practice?
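From this small sample, the required sample size per population tracked the standard deviation (not the variance) closely: n = 5, 7, and 8 for sigma = 5, 6, and 7. That pattern is specific to this effect size, so in practice it is safer to compute n over a range of plausible values of sigma and plan for the larger result.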