A bacteriologist is interested in the effects of two different culture media and two different times on the growth of a particular virus. He or she performs six replicates of a 2^2 design, making the runs in random order. Analyze the bacterial growth data that follow and draw appropriate conclusions. Analyze the residuals and comment on the model’s adequacy.
c1 <- c(21,23,20,37,38,35)
c2 <- c(22,28,26,39,38,36)
c3 <- c(25,24,29,31,29,30)
c4 <- c(26,25,27,34,33,35)
observation <- c(c1,c2,c3,c4)
culture <- c(rep(-1,12),rep(1,12))
time <- rep(c(rep(-1,3),rep(1,3)),4)
dat <- as.data.frame(cbind(observation,culture,time))
model <- lm(observation~culture*time,data=dat)
library(DoE.base) # provides halfnormal()
halfnormal(model)
##
## Significant effects (alpha=0.05, Lenth method):
## [1] time culture:time
interaction.plot(culture,time,observation)
model <- aov(observation~culture*time,data=dat)
summary(model)
## Df Sum Sq Mean Sq F value Pr(>F)
## culture 1 9.4 9.4 1.835 0.190617
## time 1 590.0 590.0 115.506 9.29e-10 ***
## culture:time 1 92.0 92.0 18.018 0.000397 ***
## Residuals 20 102.2 5.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(model)
## hat values (leverages) are all = 0.1666667
## and there are no factor predictors; no plot no. 5
Time and the culture medium × time interaction have significant effects on growth; the main effect of culture medium is not significant (p = 0.19). The residual plots show only a slight inequality of variance.
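The interaction can also be read directly from the four cell means. The following is a minimal sketch in base R; it re-enters the data from above, and the names `cell_means` and `time_effect` are introduced here for illustration only.

```r
# Growth data re-entered from the problem
observation <- c(21, 23, 20, 37, 38, 35,
                 22, 28, 26, 39, 38, 36,
                 25, 24, 29, 31, 29, 30,
                 26, 25, 27, 34, 33, 35)
culture <- rep(c(-1, 1), each = 12)
time <- rep(rep(c(-1, 1), each = 3), times = 4)

# Mean growth in each cell (rows: culture, columns: time)
cell_means <- tapply(observation, list(culture = culture, time = time), mean)
print(round(cell_means, 2))

# Simple effect of time (high minus low) within each culture medium
time_effect <- cell_means[, "1"] - cell_means[, "-1"]
print(round(time_effect, 2))
```

Going from the low to the high time level raises mean growth by about 13.8 in the first medium but only by 6 in the second, which is the interaction the ANOVA flags.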
An article in the AT&T Technical Journal (March/April 1986, Vol. 65, pp. 39–50) describes the application of two-level factorial designs to integrated circuit manufacturing. A basic processing step is to grow an epitaxial layer on polished silicon wafers. The wafers mounted on a susceptor are positioned inside a bell jar, and chemical vapors are introduced. The susceptor is rotated, and heat is applied until the epitaxial layer is thick enough. An experiment was run using two factors: arsenic flow rate (A) and deposition time (B). Four replicates were run, and the epitaxial layer thickness was measured (μm). The data are shown in Table P6.1.
Arsenic <- c(rep(-1,4),rep(1,4),rep(-1,4),rep(1,4))
Deposition <- c(rep(-1,4),rep(-1,4),rep(1,4),rep(1,4))
Thickness <- c(14.037,16.165,13.972,13.907,13.880,13.860,14.032,13.914,14.821,14.757,14.843,14.878,14.888,14.921,14.415,14.932)
datB <- data.frame(Arsenic,Deposition,Thickness)
### a)
modelB <- lm(Thickness~Arsenic*Deposition, data = datB)
coef(modelB)
## (Intercept) Arsenic Deposition Arsenic:Deposition
## 14.513875 -0.158625 0.293000 0.140750
### b)
anova(modelB)
## Analysis of Variance Table
##
## Response: Thickness
## Df Sum Sq Mean Sq F value Pr(>F)
## Arsenic 1 0.4026 0.40259 1.2619 0.28327
## Deposition 1 1.3736 1.37358 4.3054 0.06016 .
## Arsenic:Deposition 1 0.3170 0.31697 0.9935 0.33856
## Residuals 12 3.8285 0.31904
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
### d)
plot(modelB)
## hat values (leverages) are all = 0.25
## and there are no factor predictors; no plot no. 5
a) Estimated factor effects (twice the regression coefficients):
Arsenic -0.31725
Deposition 0.586
Interaction 0.2815
b) From the ANOVA table, neither factor nor their interaction is significant at α = 0.05; Deposition comes closest (p = 0.060).
c) Thickness = 14.514 − 0.159(A) + 0.293(B) + 0.141(AB) + error
d) Observation 2 (16.165) falls well away from the rest in both the normal probability plot and the plot of residuals versus predicted values.
e) I would replace that observation with the average of the observations from its experimental cell.
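For part a), each effect in a two-level design with ±1 coding is twice the corresponding regression coefficient. For part e), one way to carry out the replacement is sketched below. It assumes the outlier is replaced by the mean of the other three observations in its cell (the answer does not say whether the outlier itself enters the average), and the names `factor_effects`, `datB2`, and `modelB2` are hypothetical.

```r
# Data re-entered from the problem
Arsenic <- rep(rep(c(-1, 1), each = 4), times = 2)
Deposition <- rep(c(-1, 1), each = 8)
Thickness <- c(14.037, 16.165, 13.972, 13.907, 13.880, 13.860, 14.032, 13.914,
               14.821, 14.757, 14.843, 14.878, 14.888, 14.921, 14.415, 14.932)
datB <- data.frame(Arsenic, Deposition, Thickness)

# Part a): effect = 2 * coefficient (intercept dropped)
modelB <- lm(Thickness ~ Arsenic * Deposition, data = datB)
factor_effects <- 2 * coef(modelB)[-1]
print(factor_effects)

# Part e): ASSUMPTION - replace observation 2 by the mean of the
# other three observations in the same cell (rows 1, 3, 4), then refit
datB2 <- datB
datB2$Thickness[2] <- mean(datB$Thickness[c(1, 3, 4)])
modelB2 <- lm(Thickness ~ Arsenic * Deposition, data = datB2)
print(anova(modelB2))
```

Because the replaced value is estimated from the data, the residual degrees of freedom reported by `anova()` are arguably one too many, so the refit is best read as a sensitivity check.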
I am always interested in improving my golf scores. Since a typical golfer uses the putter for about 35–45 percent of his or her strokes, it seems reasonable that improving one’s putting is a logical and perhaps simple way to improve a golf score (“The man who can putt is a match for any man.”— Willie Parks, 1864–1925, two time winner of the British Open). An experiment was conducted to study the effects of four factors on putting accuracy. The design factors are length of putt, type of putter, breaking putt versus straight putt, and level versus downhill putt. The response variable is distance from the ball to the center of the cup after the ball comes to rest. One golfer performs the experiment, a 2^4 factorial design with seven replicates was used, and all putts are made in random order. The results are shown in Table P6.4.
# Coded factor levels in standard order, seven replicates per run
Length <- rep(rep(c(-1,1),each=7),times=8)
Type <- rep(rep(c(-1,1),each=14),times=4)
Break <- rep(rep(c(-1,1),each=28),times=2)
Slope <- rep(c(-1,1),each=56)
Distance <- c(10,18,14,12.5,19,16,18.5,
0,16.5,4.5,17.5,20.5,17.5,33,
4,6,1,14.5,12,14,5,
0,10,34,11,25.5,21.5,0,
0,0,18.5,19.5,16,15,11,
5,20.5,18,20,29.5,19,10,
6.5,18.5,7.5,6,0,10,0,
16.5,4.5,0,23.5,8,8,8,
4.5,18,14.5,10,0,17.5,6,
19.5,18,16,5.5,10,7,36,
15,16,8.5,0,0.5,9,3,
41.5,39,6.5,3.5,7,8.5,36,
8,4.5,6.5,10,13,41,14,
21.5,10.5,6.5,0,15.5,24,16,
0,0,0,4.5,1,4,6.5,
18,5,7,10,32.5,18.5,8)
datC <- data.frame(Length,Type,Break,Slope,Distance)
modelC <- lm(Distance~Length*Type*Break*Slope, data = datC)
halfnormal(modelC)
##
## Significant effects (alpha=0.05, Lenth method):
## [1] Length e95 e28 e44 e49 Type e84 e32 e78
modelC <- aov(Distance~Length*Type,data=datC)
plot(modelC)
## hat values (leverages) are all = 0.03571429
## and there are no factor predictors; no plot no. 5
summary(modelC)
## Df Sum Sq Mean Sq F value Pr(>F)
## Length 1 917 917.1 10.969 0.00126 **
## Type 1 388 388.1 4.642 0.03342 *
## Length:Type 1 219 218.7 2.615 0.10875
## Residuals 108 9030 83.6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
H0: (τβ)ij = 0 for all i, j
H1: At least one (τβ)ij ≠ 0
a) Only the main effects of Length (p = 0.001) and Type (p = 0.033) are significant; the Length × Type interaction is not (p = 0.109).
b) The normal probability plot shows outliers, which suggests that the normality assumption is violated, and the residuals-versus-fitted plot shows non-constant variance. Since both are assumptions of the ANOVA, the model is not adequate as fitted; a square root transformation of the response could correct the violations.
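The square root transformation suggested in b) could be tried along the following lines. This is only a sketch: it re-enters the data, refits the reduced Length × Type model on sqrt(Distance), and the name `modelC_sqrt` is an assumption introduced here. Whether the transformed residuals look acceptable still has to be judged from the plots.

```r
# Factor columns and response re-entered from the problem
Length <- rep(rep(c(-1, 1), each = 7), times = 8)
Type <- rep(rep(c(-1, 1), each = 14), times = 4)
Distance <- c(10, 18, 14, 12.5, 19, 16, 18.5,
              0, 16.5, 4.5, 17.5, 20.5, 17.5, 33,
              4, 6, 1, 14.5, 12, 14, 5,
              0, 10, 34, 11, 25.5, 21.5, 0,
              0, 0, 18.5, 19.5, 16, 15, 11,
              5, 20.5, 18, 20, 29.5, 19, 10,
              6.5, 18.5, 7.5, 6, 0, 10, 0,
              16.5, 4.5, 0, 23.5, 8, 8, 8,
              4.5, 18, 14.5, 10, 0, 17.5, 6,
              19.5, 18, 16, 5.5, 10, 7, 36,
              15, 16, 8.5, 0, 0.5, 9, 3,
              41.5, 39, 6.5, 3.5, 7, 8.5, 36,
              8, 4.5, 6.5, 10, 13, 41, 14,
              21.5, 10.5, 6.5, 0, 15.5, 24, 16,
              0, 0, 0, 4.5, 1, 4, 6.5,
              18, 5, 7, 10, 32.5, 18.5, 8)
datC <- data.frame(Length, Type, Distance)

# Refit the reduced model on the square-root scale
modelC_sqrt <- aov(sqrt(Distance) ~ Length * Type, data = datC)
print(summary(modelC_sqrt))

# Residuals vs fitted and normal Q-Q plot for the transformed model
plot(modelC_sqrt, which = 1:2)
```

The square root is defined at the zero distances in the data, which a log transformation would not be.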
Resistivity on a silicon wafer is influenced by several factors. The results of a 2^4 factorial experiment performed during a critical processing step are shown in Table P6.10.
a <- c(-1,1)
b <- c(-1,-1,1,1)
c <- c(-1,-1,-1,-1,1,1,1,1)
d <- c(-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1)
A <- c(rep(a,8))
B <- c(rep(b,4))
C <- c(rep(c,2))
D <- c(rep(d,1))
observation <- c(1.92,11.28,1.09,5.75,2.13,9.53,1.03,5.35,1.60,11.73,1.16,4.68,2.16,9.11,1.07,5.30)
datD <- as.data.frame(cbind(A,B,C,D,observation))
modelD <- lm(observation~A*B*C*D,data=datD)
halfnormal(modelD)
##
## Significant effects (alpha=0.05, Lenth method):
## [1] A B A:B A:B:C
modelD <- aov(observation~A*B*C,data=datD)
summary(modelD)
## Df Sum Sq Mean Sq F value Pr(>F)
## A 1 159.83 159.83 1563.061 1.84e-10 ***
## B 1 36.09 36.09 352.937 6.66e-08 ***
## C 1 0.78 0.78 7.616 0.02468 *
## A:B 1 18.30 18.30 178.933 9.33e-07 ***
## A:C 1 1.42 1.42 13.907 0.00579 **
## B:C 1 0.84 0.84 8.232 0.02085 *
## A:B:C 1 1.90 1.90 18.556 0.00259 **
## Residuals 8 0.82 0.10
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(modelD)
## hat values (leverages) are all = 0.5
## and there are no factor predictors; no plot no. 5
coef(modelD)
## (Intercept) A B C A:B A:C
## 4.680625 3.160625 -1.501875 -0.220625 -1.069375 -0.298125
## B:C A:B:C
## 0.229375 0.344375
mean(observation)
## [1] 4.680625
qqnorm(observation)
# Full model on the original scale
modelD <- lm(observation~A*B*C*D,data=datD)
coef(modelD)
## (Intercept) A B C D A:B
## 4.680625 3.160625 -1.501875 -0.220625 -0.079375 -1.069375
## A:C B:C A:D B:D C:D A:B:C
## -0.298125 0.229375 -0.056875 -0.046875 0.029375 0.344375
## A:B:D A:C:D B:C:D A:B:C:D
## -0.096875 -0.010625 0.094375 0.141875
# Refit on the log scale; the transformed response has to be stored in datD,
# the data frame that is passed to lm()
observation <- log(observation)
datD <- as.data.frame(cbind(A,B,C,D,observation))
modelD <- lm(observation~A*B*C*D,data=datD)
a) The factor effects are twice the coefficients listed above; the largest are A (6.32), B (-3.00), and AB (-2.14). The half-normal plot identifies A, B, AB, and ABC as significant.
b) The normal probability plot of the residuals is not satisfactory, and the plots of residuals versus predicted values, versus factor A, and versus factor B all show non-constant variance.
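A log transformation is a common remedy for the non-constant variance noted in b). The sketch below refits on the log scale in base R. Keeping only A, B, and their interaction is an assumption made here for illustration, and the names `resistivity`, `datD_log`, and `modelD_log` are hypothetical.

```r
# Design columns in standard order and the response from the problem
A <- rep(c(-1, 1), times = 8)
B <- rep(rep(c(-1, 1), each = 2), times = 4)
C <- rep(rep(c(-1, 1), each = 4), times = 2)
D <- rep(c(-1, 1), each = 8)
resistivity <- c(1.92, 11.28, 1.09, 5.75, 2.13, 9.53, 1.03, 5.35,
                 1.60, 11.73, 1.16, 4.68, 2.16, 9.11, 1.07, 5.30)
datD_log <- data.frame(A, B, C, D, log_resistivity = log(resistivity))

# ASSUMPTION: reduced model in A and B only, fitted to log(resistivity)
modelD_log <- lm(log_resistivity ~ A * B, data = datD_log)

# Effects on the log scale are twice the coefficients
print(2 * coef(modelD_log)[-1])

# Residuals vs fitted values, to check whether the spread is now more even
plot(fitted(modelD_log), resid(modelD_log),
     xlab = "Fitted log(resistivity)", ylab = "Residual")
abline(h = 0, lty = 2)
```

If the funnel shape in the residuals disappears on the log scale, that would support using the transformed response.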
An article in Quality and Reliability Engineering International (2010, Vol. 26, pp. 223–233) presents a 2^5 factorial design. The experiment is shown in Table P6.12.
a <- c(-1,1)
b <- c(-1,-1,1,1)
c <- c(-1,-1,-1,-1,1,1,1,1)
d <- c(-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1)
A <- c(rep(a,16))
B <- c(rep(b,8))
C <- c(rep(c,4))
D <- c(rep(d,2))
E <- c(rep(-1,16),rep(1,16))
observation <- c(8.11,5.56,5.77,5.82,9.17,7.8,3.23,5.69,8.82,14.23,9.2,8.94,8.68,11.49,6.25,9.12,7.93,5,7.47,12,9.86,3.65,6.4,11.61,12.43,17.55,8.87,25.38,13.06,18.85,11.78,26.05)
dat <- as.data.frame(cbind(A,B,C,D,E,observation))
model <- lm(observation~A*B*C*D*E,data=dat)
halfnormal(model)
##
## Significant effects (alpha=0.05, Lenth method):
## [1] D E A:D A D:E B:E A:B A:B:E A:E A:D:E
model <- aov(observation~A*B*D*E,data=dat)
plot(model)
## hat values (leverages) are all = 0.5
## and there are no factor predictors; no plot no. 5
interaction.plot(A,D,observation)
interaction.plot(D,E,observation)
interaction.plot(B,E,observation)
interaction.plot(A,B,observation)
interaction.plot(A,E,observation)
dat <- as.data.frame(cbind(A,B,D,E,observation))
model <- lm(observation~A*B*D*E,data=dat)
interaction.plot(A,D,observation)
interaction.plot(D,E,observation)
interaction.plot(B,E,observation)
interaction.plot(A,B,observation)
interaction.plot(A,E,observation)
halfnormal(model)
##
## Significant effects (alpha=0.05, Lenth method):
## [1] D E A:D A D:E B:E A:B A:B:E A:E A:D:E e10
coef(model)
## (Intercept) A B D E A:B
## 10.1803125 1.6159375 0.0434375 2.9884375 2.1878125 1.2365625
## A:D B:D A:E B:E D:E A:B:D
## 1.6665625 -0.0134375 1.0271875 1.2834375 1.3896875 -0.3453125
## A:B:E A:D:E B:D:E A:B:D:E
## 1.1853125 0.9015625 -0.0396875 0.4071875
mean(observation) # grand mean of observation
## [1] 10.18031
a) The significant main effects are A, D, and E, and the significant interactions are AB, AD, AE, BE, DE, ABE, and ADE.
b) The residual plots show no inadequacy or violation of the model assumptions.
c) After dropping factor C, the analysis improves somewhat, but the same factors and interactions remain significant.
d) Set A, B, D, and E all at +1. Every active coefficient is positive, so this gives the largest predicted response. C is inactive and can be held at 0.
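The predicted response at the settings in d) can be checked with `predict()`. This is a minimal sketch in base R that re-enters the data and refits the model with C dropped; the names `best` and `pred` are hypothetical.

```r
# Design columns in standard order (C is omitted because it was dropped)
A <- rep(c(-1, 1), times = 16)
B <- rep(rep(c(-1, 1), each = 2), times = 8)
D <- rep(rep(c(-1, 1), each = 8), times = 2)
E <- rep(c(-1, 1), each = 16)
observation <- c(8.11, 5.56, 5.77, 5.82, 9.17, 7.8, 3.23, 5.69,
                 8.82, 14.23, 9.2, 8.94, 8.68, 11.49, 6.25, 9.12,
                 7.93, 5, 7.47, 12, 9.86, 3.65, 6.4, 11.61,
                 12.43, 17.55, 8.87, 25.38, 13.06, 18.85, 11.78, 26.05)
dat <- data.frame(A, B, D, E, observation)
model <- lm(observation ~ A * B * D * E, data = dat)

# Predicted response with all four retained factors at their high level
best <- data.frame(A = 1, B = 1, D = 1, E = 1)
pred <- predict(model, newdata = best)
print(pred)  # 25.715
```

Because the model is saturated in A, B, D, and E, this prediction is simply the mean of the two runs made at that corner, 25.38 and 26.05.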