Homework 11

Question 6.8

A bacteriologist is interested in the effects of two different culture media and two different times on the growth of a particular virus. He or she performs six replicates of a 2^2 design, making the runs in random order. Analyze the bacterial growth data that follow and draw appropriate conclusions. Analyze the residuals and comment on the model’s adequacy.

c1 <- c(21,23,20,37,38,35)
c2 <- c(22,28,26,39,38,36)
c3 <- c(25,24,29,31,29,30)
c4 <- c(26,25,27,34,33,35)

observation <- c(c1,c2,c3,c4)
culture <- c(rep(-1,12),rep(1,12))
time <- rep(c(rep(-1,3),rep(1,3)),4)
dat <- as.data.frame(cbind(observation,culture,time))
model <- lm(observation~culture*time,data=dat)
library(DoE.base) # provides halfnormal()
halfnormal(model)

## 
## Significant effects (alpha=0.05, Lenth method):

## [1] time         culture:time

interaction.plot(culture,time,observation)

model <- aov(observation~culture*time,data=dat)
summary(model)

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## culture       1    9.4     9.4   1.835 0.190617    
## time          1  590.0   590.0 115.506 9.29e-10 ***
## culture:time  1   92.0    92.0  18.018 0.000397 ***
## Residuals    20  102.2     5.1                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(model)

## hat values (leverages) are all = 0.1666667
##  and there are no factor predictors; no plot no. 5

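Time and the culture medium × time interaction have significant effects on growth (p < 0.001 for both); the main effect of culture medium is not significant (p = 0.19). The residual plots show only a very small inequality of variance.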
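The effect estimates behind this conclusion can be computed directly from the data. The following is a minimal sketch in base R; the names `cell_means`, `time_effect`, `culture_effect`, and `interaction_effect` are introduced here for illustration and are not part of the original analysis.

```r
# Rebuild the data in the same order as in the analysis above.
observation <- c(21,23,20,37,38,35,
                 22,28,26,39,38,36,
                 25,24,29,31,29,30,
                 26,25,27,34,33,35)
culture <- c(rep(-1,12), rep(1,12))
time    <- rep(c(rep(-1,3), rep(1,3)), 4)

# Mean growth in each of the four culture x time cells.
cell_means <- tapply(observation, list(culture = culture, time = time), mean)
print(cell_means)

# Effect = mean response at the high level minus mean at the low level.
time_effect    <- mean(observation[time == 1])    - mean(observation[time == -1])
culture_effect <- mean(observation[culture == 1]) - mean(observation[culture == -1])

# Interaction: the same contrast applied to the product column.
interaction_effect <- mean(observation[culture * time == 1]) -
                      mean(observation[culture * time == -1])

print(c(culture = culture_effect, time = time_effect,
        interaction = interaction_effect))
```

The cell means show growth rising with time under both media, but by a smaller amount under the medium coded +1, which is what the negative interaction estimate reflects.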

Question 6.12

An article in the AT&T Technical Journal (March/April 1986, Vol. 65, pp. 39–50) describes the application of two-level factorial designs to integrated circuit manufacturing. A basic processing step is to grow an epitaxial layer on polished silicon wafers. The wafers mounted on a susceptor are positioned inside a bell jar, and chemical vapors are introduced. The susceptor is rotated, and heat is applied until the epitaxial layer is thick enough. An experiment was run using two factors: arsenic flow rate (A) and deposition time (B). Four replicates were run, and the epitaxial layer thickness was measured (μm). The data are shown in Table P6.1.

Arsenic <- c(rep(-1,4),rep(1,4),rep(-1,4),rep(1,4))
Deposition <- c(rep(-1,4),rep(-1,4),rep(1,4),rep(1,4))
Thickness <- c(14.037,16.165,13.972,13.907,13.880,13.860,14.032,13.914,14.821,14.757,14.843,14.878,14.888,14.921,14.415,14.932)
datB <- data.frame(Arsenic,Deposition,Thickness)
### a)
modelB <- lm(Thickness~Arsenic*Deposition, data = datB)
coef(modelB)

##        (Intercept)            Arsenic         Deposition Arsenic:Deposition 
##          14.513875          -0.158625           0.293000           0.140750

### b)
anova(modelB)

## Analysis of Variance Table
## 
## Response: Thickness
##                    Df Sum Sq Mean Sq F value  Pr(>F)  
## Arsenic             1 0.4026 0.40259  1.2619 0.28327  
## Deposition          1 1.3736 1.37358  4.3054 0.06016 .
## Arsenic:Deposition  1 0.3170 0.31697  0.9935 0.33856  
## Residuals          12 3.8285 0.31904                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

### d)
plot(modelB)

## hat values (leverages) are all = 0.25
##  and there are no factor predictors; no plot no. 5

a) Estimated factor effects (twice the regression coefficients):
    Arsenic (A) −0.31725
    Deposition (B) 0.586
    Interaction (AB) 0.2815

b) The ANOVA table shows that neither factor nor their interaction is significant at the 0.05 level; deposition time comes closest (p = 0.060).

c) y(Thickness) = 14.514 − 0.159(A) + 0.293(B) + 0.141(AB) + Error

d) Observation #2 (16.165) falls well outside the rest of the points in both the normal probability plot and the plot of residuals versus predicted values, so it appears to be an outlier.

e) I would replace the observation with the average of the observations from that experimental cell.
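Parts (a) and (e) can be sketched in base R as follows. The sketch assumes that "the average of the observations from that experimental cell" means the mean of the other three replicates in the cell, with the suspect value excluded. The names `factor_effects`, `Thickness_adj`, and `modelB_adj` are illustrative only.

```r
# Same run order as in the analysis above.
Arsenic    <- rep(rep(c(-1, 1), each = 4), times = 2)
Deposition <- rep(c(-1, 1), each = 8)
Thickness  <- c(14.037,16.165,13.972,13.907,
                13.880,13.860,14.032,13.914,
                14.821,14.757,14.843,14.878,
                14.888,14.921,14.415,14.932)

# (a) With -1/+1 coding, each factor effect is twice its coefficient.
modelB <- lm(Thickness ~ Arsenic * Deposition)
factor_effects <- 2 * coef(modelB)[-1]
print(factor_effects)

# (e) Replace observation 2 by the mean of the other three replicates
# in its cell (assumption: the suspect value itself is left out).
Thickness_adj    <- Thickness
Thickness_adj[2] <- mean(Thickness[c(1, 3, 4)])

# Refit and compare the ANOVA table with the original one.
modelB_adj <- lm(Thickness_adj ~ Arsenic * Deposition)
print(anova(modelB_adj))
```

Comparing the two ANOVA tables indicates how much the conclusions in part (b) depend on that single observation.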

Question 6.21

I am always interested in improving my golf scores. Since a typical golfer uses the putter for about 35–45 percent of his or her strokes, it seems reasonable that improving one’s putting is a logical and perhaps simple way to improve a golf score (“The man who can putt is a match for any man.” Willie Parks, 1864–1925, two-time winner of the British Open). An experiment was conducted to study the effects of four factors on putting accuracy. The design factors are length of putt, type of putter, breaking putt versus straight putt, and level versus downhill putt. The response variable is distance from the ball to the center of the cup after the ball comes to rest. One golfer performs the experiment; a 2^4 factorial design with seven replicates was used, and all putts are made in random order. The results are shown in Table P6.4.

# Coded factor columns in standard order, 7 replicates per treatment combination
Length <- rep(rep(c(-1,1), each=7), times=8)

Type <- rep(rep(c(-1,1), each=14), times=4)

Break <- rep(rep(c(-1,1), each=28), times=2)

Slope <- rep(c(-1,1), each=56)

Distance <- c(10,18,14,12.5,19,16,18.5,
              0,16.5,4.5,17.5,20.5,17.5,33,
              4,6,1,14.5,12,14,5,
              0,10,34,11,25.5,21.5,0,
              0,0,18.5,19.5,16,15,11,
              5,20.5,18,20,29.5,19,10,
              6.5,18.5,7.5,6,0,10,0,
              16.5,4.5,0,23.5,8,8,8,
              4.5,18,14.5,10,0,17.5,6,
              19.5,18,16,5.5,10,7,36,
              15,16,8.5,0,0.5,9,3,
              41.5,39,6.5,3.5,7,8.5,36,
              8,4.5,6.5,10,13,41,14,
              21.5,10.5,6.5,0,15.5,24,16,
              0,0,0,4.5,1,4,6.5,
              18,5,7,10,32.5,18.5,8)
datC <- data.frame(Length,Type,Break,Slope,Distance)
modelC <- lm(Distance~Length*Type*Break*Slope, data = datC)
halfnormal(modelC)

## 
## Significant effects (alpha=0.05, Lenth method):

## [1] Length e95    e28    e44    e49    Type   e84    e32    e78

modelC <- aov(Distance~Length*Type,data=datC)
plot(modelC)

## hat values (leverages) are all = 0.03571429
##  and there are no factor predictors; no plot no. 5

summary(modelC)

##              Df Sum Sq Mean Sq F value  Pr(>F)   
## Length        1    917   917.1  10.969 0.00126 **
## Type          1    388   388.1   4.642 0.03342 * 
## Length:Type   1    219   218.7   2.615 0.10875   
## Residuals   108   9030    83.6                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

H0: (τβ)ij = 0 for all i, j
H1: At least one (τβ)ij ≠ 0

a) The only significant effects are the main effects of length of putt (p = 0.001) and type of putter (p = 0.033); their interaction is not significant (p = 0.109).

b) The normal probability plot shows outliers, which suggests a violation of the normality assumption, and the residuals versus fitted plot shows non-constant variance. Because both violate the ANOVA assumptions, the model is not adequate as fitted; a square root transformation of the response could correct the violations.
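The square root transformation suggested in part (b) could be tried along the following lines. This is only a sketch in base R under the assumption that the reduced Length × Type model is kept; `datC_sqrt` and `modelC_sqrt` are names introduced here, and whether the transformation actually repairs the residual plots has to be judged from the output.

```r
# Factor columns and response in the same order as above.
Length <- rep(rep(c(-1, 1), each = 7), times = 8)
Type   <- rep(rep(c(-1, 1), each = 14), times = 4)
Distance <- c(10,18,14,12.5,19,16,18.5,
              0,16.5,4.5,17.5,20.5,17.5,33,
              4,6,1,14.5,12,14,5,
              0,10,34,11,25.5,21.5,0,
              0,0,18.5,19.5,16,15,11,
              5,20.5,18,20,29.5,19,10,
              6.5,18.5,7.5,6,0,10,0,
              16.5,4.5,0,23.5,8,8,8,
              4.5,18,14.5,10,0,17.5,6,
              19.5,18,16,5.5,10,7,36,
              15,16,8.5,0,0.5,9,3,
              41.5,39,6.5,3.5,7,8.5,36,
              8,4.5,6.5,10,13,41,14,
              21.5,10.5,6.5,0,15.5,24,16,
              0,0,0,4.5,1,4,6.5,
              18,5,7,10,32.5,18.5,8)

# Square root of the response; zeros are allowed, unlike with log().
datC_sqrt <- data.frame(Length, Type, SqrtDistance = sqrt(Distance))
modelC_sqrt <- aov(SqrtDistance ~ Length * Type, data = datC_sqrt)
print(summary(modelC_sqrt))

# Re-check the two assumptions questioned in part (b).
res <- residuals(modelC_sqrt)
qqnorm(res); qqline(res)                      # normality
plot(fitted(modelC_sqrt), res,
     xlab = "Fitted values", ylab = "Residuals")  # constant variance
abline(h = 0, lty = 2)
```

A square root is used rather than a logarithm because several recorded distances are exactly zero.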

Question 6.36

Resistivity on a silicon wafer is influenced by several factors. The results of a 2^4 factorial experiment performed during a critical processing step are shown in Table P6.10.

a <- c(-1,1)
b <- c(-1,-1,1,1)
c <- c(-1,-1,-1,-1,1,1,1,1)
d <- c(-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1)
A <- c(rep(a,8))
B <- c(rep(b,4))
C <- c(rep(c,2))
D <- c(rep(d,1))

observation <- c(1.92,11.28,1.09,5.75,2.13,9.53,1.03,5.35,1.60,11.73,1.16,4.68,2.16,9.11,1.07,5.30)
datD <- as.data.frame(cbind(A,B,C,D,observation))
modelD <- lm(observation~A*B*C*D,data=datD)
halfnormal(modelD)

## 
## Significant effects (alpha=0.05, Lenth method):

## [1] A     B     A:B   A:B:C

modelD <- aov(observation~A*B*C,data=datD)
summary(modelD)

##             Df Sum Sq Mean Sq  F value   Pr(>F)    
## A            1 159.83  159.83 1563.061 1.84e-10 ***
## B            1  36.09   36.09  352.937 6.66e-08 ***
## C            1   0.78    0.78    7.616  0.02468 *  
## A:B          1  18.30   18.30  178.933 9.33e-07 ***
## A:C          1   1.42    1.42   13.907  0.00579 ** 
## B:C          1   0.84    0.84    8.232  0.02085 *  
## A:B:C        1   1.90    1.90   18.556  0.00259 ** 
## Residuals    8   0.82    0.10                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(modelD)

## hat values (leverages) are all = 0.5
##  and there are no factor predictors; no plot no. 5

coef(modelD)

## (Intercept)           A           B           C         A:B         A:C 
##    4.680625    3.160625   -1.501875   -0.220625   -1.069375   -0.298125 
##         B:C       A:B:C 
##    0.229375    0.344375

mean(observation)

## [1] 4.680625

qqnorm(observation)

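# Log-transform the response to stabilize the variance.
observation <- log(observation)
dat <- as.data.frame(cbind(A,B,C,D,observation))
# Caution: datD still holds the untransformed response, so the coefficients
# printed below are on the original scale; use data=dat for the log-scale fit.
modelD <- lm(observation~A*B*C*D,data=datD)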
coef(modelD)

## (Intercept)           A           B           C           D         A:B 
##    4.680625    3.160625   -1.501875   -0.220625   -0.079375   -1.069375 
##         A:C         B:C         A:D         B:D         C:D       A:B:C 
##   -0.298125    0.229375   -0.056875   -0.046875    0.029375    0.344375 
##       A:B:D       A:C:D       B:C:D     A:B:C:D 
##   -0.096875   -0.010625    0.094375    0.141875

a) The factor effects are twice the regression coefficients in the table above. The largest are A (6.32), B (−3.00), and the AB interaction (−2.14).

b) The normal probability plot of the residuals is not satisfactory. Non-constant variance can be observed in the plots of residuals versus predicted values, residuals versus factor A, and residuals versus factor B.
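Because the residual plots point to non-constant variance, a log-scale fit is a natural follow-up. The sketch below, in base R, fits the full model to the log of the response held in its own data frame column so that the transformed values are the ones actually used. The names `design`, `log_resistivity`, `model_log`, and `log_effects` are illustrative and not part of the original analysis.

```r
# Standard-order 2^4 design: A changes fastest, D slowest.
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1), D = c(-1, 1))
design$resistivity <- c(1.92,11.28,1.09,5.75,2.13,9.53,1.03,5.35,
                        1.60,11.73,1.16,4.68,2.16,9.11,1.07,5.30)

# Keep the transformed response in the data frame that lm() will read.
design$log_resistivity <- log(design$resistivity)

# Full factorial model on the log scale (saturated: 16 runs, 16 terms).
model_log <- lm(log_resistivity ~ A * B * C * D, data = design)
print(round(coef(model_log), 4))

# Effects are twice the coefficients; rank them by absolute size.
log_effects <- 2 * coef(model_log)[-1]
print(sort(abs(log_effects), decreasing = TRUE))
```

With an orthogonal ±1 design the intercept equals the mean of the log response, which gives a quick check that the transformed column was used.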

Question 6.39

An article in Quality and Reliability Engineering International (2010, Vol. 26, pp. 223–233) presents a 2^5 factorial design. The experiment is shown in Table P6.12.

a <- c(-1,1)
b <- c(-1,-1,1,1)
c <- c(-1,-1,-1,-1,1,1,1,1)
d <- c(-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1)
A <- c(rep(a,16))
B <- c(rep(b,8))
C <- c(rep(c,4))
D <- c(rep(d,2))
E <- c(rep(-1,16),rep(1,16))
observation <- c(8.11,5.56,5.77,5.82,9.17,7.8,3.23,5.69,8.82,14.23,9.2,8.94,8.68,11.49,6.25,9.12,7.93,5,7.47,12,9.86,3.65,6.4,11.61,12.43,17.55,8.87,25.38,13.06,18.85,11.78,26.05)
dat <- as.data.frame(cbind(A,B,C,D,E,observation))
model <- lm(observation~A*B*C*D*E,data=dat)
halfnormal(model)

## 
## Significant effects (alpha=0.05, Lenth method):

##  [1] D     E     A:D   A     D:E   B:E   A:B   A:B:E A:E   A:D:E

model <- aov(observation~A*B*D*E,data=dat)
plot(model)

## hat values (leverages) are all = 0.5
##  and there are no factor predictors; no plot no. 5

interaction.plot(A,D,observation)

interaction.plot(D,E,observation)

interaction.plot(B,E,observation)

interaction.plot(A,B,observation)

interaction.plot(A,E,observation)

dat <- as.data.frame(cbind(A,B,D,E,observation))
model <- lm(observation~A*B*D*E,data=dat)

interaction.plot(A,D,observation)

interaction.plot(D,E,observation)

interaction.plot(B,E,observation)

interaction.plot(A,B,observation)

interaction.plot(A,E,observation)

halfnormal(model)

## 
## Significant effects (alpha=0.05, Lenth method):

##  [1] D     E     A:D   A     D:E   B:E   A:B   A:B:E A:E   A:D:E e10

coef(model)

## (Intercept)           A           B           D           E         A:B 
##  10.1803125   1.6159375   0.0434375   2.9884375   2.1878125   1.2365625 
##         A:D         B:D         A:E         B:E         D:E       A:B:D 
##   1.6665625  -0.0134375   1.0271875   1.2834375   1.3896875  -0.3453125 
##       A:B:E       A:D:E       B:D:E     A:B:D:E 
##   1.1853125   0.9015625  -0.0396875   0.4071875

mean(observation) # grand mean of observation

## [1] 10.18031

a) The significant main effects are A, D, and E, and the significant interactions are AB, AD, AE, BE, DE, ABE, and ADE.

b) The residual plots reveal no model inadequacy or violations of the assumptions.

c) After dropping factor C, the analysis shows some improvement in the results, but the significant factors and interactions remain the same.

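d) Set A, B, D, and E at +1. C is not significant, so it is set at 0. All of the significant coefficients are positive, so the high levels give the largest predicted response.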
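The recommended settings can be checked against the reduced model with `predict()`. This sketch assumes the goal is to maximize the response, which the problem statement as reproduced here does not say explicitly; `design`, `model_red`, `best`, and `grid` are names introduced for illustration.

```r
# Standard-order 2^5 design: A changes fastest, E slowest.
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1),
                      D = c(-1, 1), E = c(-1, 1))
design$observation <- c(8.11,5.56,5.77,5.82,9.17,7.8,3.23,5.69,
                        8.82,14.23,9.2,8.94,8.68,11.49,6.25,9.12,
                        7.93,5,7.47,12,9.86,3.65,6.4,11.61,
                        12.43,17.55,8.87,25.38,13.06,18.85,11.78,26.05)

# Reduced model with factor C dropped.
model_red <- lm(observation ~ A * B * D * E, data = design)

# Predicted response at the recommended settings.
best <- data.frame(A = 1, B = 1, D = 1, E = 1)
pred_best <- predict(model_red, newdata = best)
print(pred_best)

# Compare against every other combination of A, B, D, and E.
grid <- expand.grid(A = c(-1, 1), B = c(-1, 1), D = c(-1, 1), E = c(-1, 1))
grid$pred <- predict(model_red, newdata = grid)
print(grid[order(-grid$pred), ][1:3, ])
```

Because the reduced model has one term per A-B-D-E combination, each prediction is simply the average of the two runs in that cell.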