This report captures the work done for the Week 11 individual homework. R code and the results are provided. The required homework problems were taken from “Design and Analysis of Experiments, 8th Edition”:
   1) 6.8
   2) 6.12
   3) 6.21
   4) 6.36
   5) 6.39


Problem 1 (6.8)

The full explanation of this problem can be found in the textbook. The analysis below fits a two-factor factorial model for Growth with Time and Culture Medium as the factors.


# setup Libraries
library(knitr)
library(dplyr)
library(tidyr)
library(GAD)
library(tinytex)
library(ggplot2)
library(ggfortify)
library(car)
library(DoE.base)
# READ IN FILE
setwd("D:/R Files/")
dat <- read.csv("D:/R Files/6-8.csv",header=TRUE)

# SET UP DATA TYPES
dat$Time <- as.fixed(dat$Time)
dat$CultureMedium <- as.fixed(dat$CultureMedium)
Response <- dat$Growth


The linear effects model is:

\(\quad y_{ijk}\) = \(\mu\) + \(\alpha_i\) + \(\beta_j\) + \(\alpha \beta_{ij}\) + \(\epsilon_{ijk}\)

Where:


\(\quad \alpha_i\) is the main effect of the \(\ i^{th}\) treatment of Time


\(\quad \beta_j\) is the main effect of the \(\ j^{th}\) treatment of Culture Medium


\(\quad \alpha \beta_{ij}\) is the interaction effect of the \(\ ij^{th}\) treatment of Time * Culture Medium, and


\(\quad \epsilon_{ijk}\) is the random error term.

The Hypotheses we will test are:

Time Main Effect:


\(\quad H_0\) : \(\alpha_i\) = 0 \(\forall\) i


\(\quad H_a\) : \(\alpha_i \neq\) 0 for at least one \(\ i\)

Culture Medium Main Effect:


\(\quad H_0\) : \(\beta_j\) = 0 \(\forall\) j


\(\quad H_a\) : \(\beta_j \neq\) 0 for at least one \(\ j\)

Time * Culture Medium Interaction Effect:


\(\quad H_0\) : \(\alpha \beta_{ij}\) = 0 \(\forall\) ij


\(\quad H_a\) : \(\alpha \beta_{ij} \neq\) 0 for at least one \(\ ij\)


\(\quad\)at a significance level of \(\alpha\) = 0.05


# SET UP model
model<-aov(Response~dat$Time*dat$CultureMedium)
#run model through GAD
GAD::gad(model)
## Analysis of Variance Table
## 
## Response: Response
##                            Df Sum Sq Mean Sq  F value    Pr(>F)    
## dat$Time                    1 590.04  590.04 115.5057 9.291e-10 ***
## dat$CultureMedium           1   9.38    9.38   1.8352 0.1906172    
## dat$Time:dat$CultureMedium  1  92.04   92.04  18.0179 0.0003969 ***
## Residual                   20 102.17    5.11                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We evaluate the hypothesis for the interaction effect first and find that its p-value of 0.0004 is below \(\alpha\) = 0.05, so we reject the null hypothesis. Because of the hierarchy principle, we keep both main effects in the model even though the p-value for Culture Medium is 0.191, for which we would otherwise fail to reject the null hypothesis. We reject the null hypothesis for Time because its p-value (9.29e-10) is well below our level of significance.
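As a quick arithmetic check on the interaction test, the F statistic and p-value can be recomputed from the sums of squares and degrees of freedom printed in the ANOVA table above. This is only a sketch using the rounded table values, so the result agrees with the table to about two decimal places; the variable names are illustrative.

```r
# Values copied (rounded) from the ANOVA table above
ss_interaction <- 92.04    # Sum Sq for Time:CultureMedium
ss_residual    <- 102.17   # Sum Sq for Residual
df_interaction <- 1
df_residual    <- 20

# F = MS(effect) / MS(residual)
ms_interaction <- ss_interaction / df_interaction
ms_residual    <- ss_residual / df_residual
f_interaction  <- ms_interaction / ms_residual

# Upper-tail probability of the F distribution gives the p-value
p_interaction <- pf(f_interaction, df_interaction, df_residual,
                    lower.tail = FALSE)

round(c(F = f_interaction, p = p_interaction), 4)
```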

Analyzing the residuals, we get the following:

autoplot(model)

From the residual plots, we can see that the residuals are approximately normal and show roughly constant variance, so we find the model to be adequate.


Problem 2 (6.12)

The full explanation of this problem can be found in the textbook. a) Estimate the factor effects.

dat2 <- read.csv("D:/R Files/6-12.csv",header=TRUE)
FactorA <- dat2$A
FactorB <- dat2$B
Result <- dat2$Result
mod <- lm(Result~FactorA*FactorB, data = dat2)
coef(mod)
##     (Intercept)         FactorA         FactorB FactorA:FactorB 
##       14.513875       -0.158625        0.293000        0.140750
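
The values printed by `coef(mod)` are regression coefficients. Assuming the factors are coded as -1 and +1 and the experiment has 16 runs (which is what the 15 total degrees of freedom in the ANOVA of part (b) suggest), each factor effect is twice its coefficient and each sum of squares is 16 times the squared coefficient. The sketch below does that arithmetic with the coefficients copied from the output above; `n_runs` is an assumption, not something read from the data file.

```r
# Coefficients copied from the coef(mod) output above
coefs <- c("A" = -0.158625, "B" = 0.293000, "A:B" = 0.140750)

n_runs <- 16  # assumed: 2^2 design with 4 replicates

# With -1/+1 coding, effect = 2 * coefficient
effects <- 2 * coefs

# For orthogonal -1/+1 columns, SS = N * coefficient^2
sum_sq <- n_runs * coefs^2

round(cbind(effect = effects, sum_sq = sum_sq), 4)
```

The sums of squares computed this way can be compared with the Sum Sq column of the ANOVA table in part (b).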


b) Conduct an analysis of variance. Which factors are important?

anova(mod)
## Analysis of Variance Table
## 
## Response: Result
##                 Df Sum Sq Mean Sq F value  Pr(>F)  
## FactorA          1 0.4026 0.40259  1.2619 0.28327  
## FactorB          1 1.3736 1.37358  4.3054 0.06016 .
## FactorA:FactorB  1 0.3170 0.31697  0.9935 0.33856  
## Residuals       12 3.8285 0.31904                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This first round shows that none of the terms are significant at \(\alpha\) = 0.05. I’ll remove the interaction term and run the ANOVA again.

mod <- lm(Result~FactorA+FactorB, data = dat2)
anova(mod)
## Analysis of Variance Table
## 
## Response: Result
##           Df Sum Sq Mean Sq F value  Pr(>F)  
## FactorA    1 0.4026 0.40259  1.2625 0.28150  
## FactorB    1 1.3736 1.37358  4.3075 0.05835 .
## Residuals 13 4.1454 0.31888                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We now see that both main effects are still not significant. If I remove Factor A because it is least significant and run again, I get the following for FactorB:

mod <- lm(Result~FactorB, data = dat2)
anova(mod)
## Analysis of Variance Table
## 
## Response: Result
##           Df Sum Sq Mean Sq F value Pr(>F)  
## FactorB    1 1.3736 1.37358  4.2282 0.0589 .
## Residuals 14 4.5480 0.32486                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Factor B is still not significant (p = 0.0589). Therefore, none of the factors or interactions were found to be significant at \(\alpha\) = 0.05.

c) Write down a regression equation that could be used to predict epitaxial layer thickness over the region of arsenic flow rate and deposition time used in this experiment. The equation, using the coefficients from part (a), is:
Result = 14.514 - 0.159A + 0.293B + 0.141AB
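
To show how the equation in part (c) would be used, here is a minimal sketch that evaluates it at the four corner points of the design. It assumes A and B are entered in coded units (-1 to +1); the function name `predict_result` is made up for this illustration.

```r
# Regression equation from part (c), with A and B in coded units (assumed)
predict_result <- function(A, B) {
  14.514 - 0.159 * A + 0.293 * B + 0.141 * A * B
}

# Evaluate the equation at the four design corners
corners <- expand.grid(A = c(-1, 1), B = c(-1, 1))
corners$predicted <- predict_result(corners$A, corners$B)
corners
```

At the design center (A = 0, B = 0) the prediction is just the intercept, 14.514.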

d) Analyze the residuals. Are there any residuals that should cause concern? Analyzing the residuals, we get the following:

mod <- lm(Result~FactorA*FactorB, data = dat2)
autoplot(mod)

The residual for the first observation of Replicate II is an outlier; the original value of that observation is 16.165.

e) Discuss how you might deal with the potential outlier found in part (d).
There are several ways to deal with the outlier. We could first check whether it was due to a transcription error and correct it accordingly. We could replace it with an estimate while maintaining orthogonality or, under the right circumstances, we could disregard the value and make note of it.
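
One way to screen for a suspect observation numerically, rather than by eye, is to look at standardized residuals. The sketch below is not the homework data: it builds a hypothetical 2^2 design with four replicates, shifts one observation upward on purpose, and flags any standardized residual larger than 2 in absolute value. The cutoff of 2 and all data values are assumptions for illustration.

```r
# Hypothetical 2^2 design with 4 replicates (NOT the textbook data)
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), rep = 1:4)

# Small alternating deviations around 14.5, then one deliberately shifted value
design$y <- 14.5 + ifelse(design$rep %% 2 == 1, -0.1, 0.1)
design$y[5] <- design$y[5] + 2   # first observation of replicate 2

fit <- lm(y ~ A * B, data = design)

# Standardized residuals; flag those beyond an assumed cutoff of 2
std_res <- rstandard(fit)
flagged <- which(abs(std_res) > 2)
flagged
```

With the real data, the same `rstandard()` call could be applied to the fitted model from part (d).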

Problem 3 (6.21)

The full explanation of this problem can be found in the textbook. a) Analyze the data from this experiment. Which factors significantly affect putting performance?

dat3 <- read.csv("D:/R Files/6-21.csv",header=TRUE)
PuttLength <- dat3$PuttLength
PutterType <- dat3$PutterType
PuttBreak <- dat3$PuttBreak
PuttSlope <- dat3$PuttSlope
DistanceFromCup <- dat3$DistanceFromCup
PuttLength <- as.fixed(PuttLength)
PutterType <- as.fixed(PutterType)
PuttBreak <-  as.fixed(PuttBreak)
PuttSlope <-  as.fixed(PuttSlope)
# Full factorial model: '*' expands to all main effects and all interactions
mod3 <- lm(DistanceFromCup~PuttLength*PutterType*PuttBreak*PuttSlope)
GAD::gad(mod3)
## Analysis of Variance Table
## 
## Response: DistanceFromCup
##                                           Df Sum Sq Mean Sq F value   Pr(>F)   
## PuttLength                                 1  917.1  917.15 10.5878 0.001572 **
## PutterType                                 1  388.1  388.15  4.4809 0.036862 * 
## PuttBreak                                  1  145.1  145.15  1.6756 0.198615   
## PuttSlope                                  1    1.4    1.40  0.0161 0.899280   
## PuttLength:PutterType                      1  218.7  218.68  2.5245 0.115377   
## PuttLength:PuttBreak                       1   11.9   11.90  0.1373 0.711776   
## PutterType:PuttBreak                       1  115.0  115.02  1.3278 0.252054   
## PuttLength:PuttSlope                       1   93.8   93.81  1.0829 0.300658   
## PutterType:PuttSlope                       1   56.4   56.43  0.6515 0.421588   
## PuttBreak:PuttSlope                        1    1.6    1.63  0.0188 0.891271   
## PuttLength:PutterType:PuttBreak            1    7.3    7.25  0.0837 0.772939   
## PuttLength:PutterType:PuttSlope            1  113.0  113.00  1.3045 0.256228   
## PuttLength:PuttBreak:PuttSlope             1   39.5   39.48  0.4558 0.501207   
## PutterType:PuttBreak:PuttSlope             1   33.8   33.77  0.3899 0.533858   
## PuttLength:PutterType:PuttBreak:PuttSlope  1   95.6   95.65  1.1042 0.295994   
## Residual                                  96 8315.8   86.62                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


I’ll remove the four-factor interaction and run again because it isn’t significant at an alpha of 0.05.

# All main effects plus all two- and three-factor interactions
mod31 <- lm(DistanceFromCup~(PuttLength+PutterType+PuttBreak+PuttSlope)^3)
GAD::gad(mod31)
## Analysis of Variance Table
## 
## Response: DistanceFromCup
##                                 Df Sum Sq Mean Sq F value   Pr(>F)   
## PuttLength                       1  917.1  917.15 10.5764 0.001576 **
## PutterType                       1  388.1  388.15  4.4761 0.036934 * 
## PuttBreak                        1  145.1  145.15  1.6738 0.198822   
## PuttSlope                        1    1.4    1.40  0.0161 0.899331   
## PuttLength:PutterType            1  218.7  218.68  2.5218 0.115536   
## PuttLength:PuttBreak             1   11.9   11.90  0.1372 0.711915   
## PutterType:PuttBreak             1  115.0  115.02  1.3264 0.252277   
## PuttLength:PuttSlope             1   93.8   93.81  1.0818 0.300889   
## PutterType:PuttSlope             1   56.4   56.43  0.6508 0.421816   
## PuttBreak:PuttSlope              1    1.6    1.63  0.0188 0.891326   
## PuttLength:PutterType:PuttBreak  1    7.3    7.25  0.0836 0.773051   
## PuttLength:PutterType:PuttSlope  1  113.0  113.00  1.3031 0.256452   
## PuttLength:PuttBreak:PuttSlope   1   39.5   39.48  0.4553 0.501419   
## PutterType:PuttBreak:PuttSlope   1   33.8   33.77  0.3894 0.534062   
## Residual                        97 8411.4   86.72                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can now see that none of the three-factor interactions are close to being significant, so we will remove them and re-run the model.

# All main effects plus all two-factor interactions
mod32 <- lm(DistanceFromCup~(PuttLength+PutterType+PuttBreak+PuttSlope)^2)
GAD::gad(mod32)
## Analysis of Variance Table
## 
## Response: DistanceFromCup
##                        Df Sum Sq Mean Sq F value   Pr(>F)   
## PuttLength              1  917.1  917.15 10.7649 0.001421 **
## PutterType              1  388.1  388.15  4.5558 0.035227 * 
## PuttBreak               1  145.1  145.15  1.7036 0.194779   
## PuttSlope               1    1.4    1.40  0.0164 0.898432   
## PuttLength:PutterType   1  218.7  218.68  2.5668 0.112255   
## PuttLength:PuttBreak    1   11.9   11.90  0.1396 0.709444   
## PutterType:PuttBreak    1  115.0  115.02  1.3500 0.248009   
## PuttLength:PuttSlope    1   93.8   93.81  1.1010 0.296542   
## PutterType:PuttSlope    1   56.4   56.43  0.6624 0.417645   
## PuttBreak:PuttSlope     1    1.6    1.63  0.0191 0.890357   
## Residual              101 8604.9   85.20                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We then remove the two-factor interactions one at a time, starting with the highest p-value, and find that none of them become significant as the other terms are removed. The step-wise process isn’t shown here for brevity. Running the model with main effects only, we get the following:
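
The step-wise removal described above can be sketched in base R with `drop1()` and `update()`. This is a hedged illustration on simulated data, not the putting data: the data frame `design`, the short factor names, and the simulated response are all made up, and the 0.05 cutoff mirrors the one used in this report.

```r
# Simulated stand-in for the putting data (hypothetical values)
set.seed(1)
design <- expand.grid(Len = c(-1, 1), Type = c(-1, 1),
                      Brk = c(-1, 1), Slp = c(-1, 1), rep = 1:2)
design$y <- 10 + 3 * design$Len + 2 * design$Type + rnorm(nrow(design))

# Start from main effects plus all two-factor interactions
model <- lm(y ~ (Len + Type + Brk + Slp)^2, data = design)

repeat {
  tab <- drop1(model, test = "F")            # F-tests for droppable terms
  tab <- tab[grepl(":", rownames(tab)), ]    # look at interactions only
  if (nrow(tab) == 0 || max(tab[["Pr(>F)"]]) <= 0.05) break
  worst <- rownames(tab)[which.max(tab[["Pr(>F)"]])]
  # Remove the least significant interaction and refit
  model <- update(model, as.formula(paste(". ~ . -", worst)))
}

formula(model)
```

Because only interaction terms are candidates for removal, the four main effects always remain in the model when the loop stops.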

mod33 <- lm(DistanceFromCup~PuttLength+PutterType+PuttBreak+PuttSlope)
GAD::gad(mod33)
## Analysis of Variance Table
## 
## Response: DistanceFromCup
##             Df Sum Sq Mean Sq F value   Pr(>F)   
## PuttLength   1  917.1  917.15 10.7812 0.001386 **
## PutterType   1  388.1  388.15  4.5627 0.034956 * 
## PuttBreak    1  145.1  145.15  1.7062 0.194280   
## PuttSlope    1    1.4    1.40  0.0164 0.898342   
## Residual   107 9102.4   85.07                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


We now remove PuttSlope because it isn’t significant and re-run:

mod34 <- lm(DistanceFromCup~PuttLength+PutterType+PuttBreak)
GAD::gad(mod34)
## Analysis of Variance Table
## 
## Response: DistanceFromCup
##             Df Sum Sq Mean Sq F value   Pr(>F)   
## PuttLength   1  917.1  917.15 10.8803 0.001317 **
## PutterType   1  388.1  388.15  4.6046 0.034125 * 
## PuttBreak    1  145.1  145.15  1.7219 0.192233   
## Residual   108 9103.8   84.29                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


We now remove PuttBreak because it isn’t significant and re-run:

mod35 <- lm(DistanceFromCup~PuttLength+PutterType)
GAD::gad(mod35)
## Analysis of Variance Table
## 
## Response: DistanceFromCup
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## PuttLength   1  917.1  917.15 10.8087 0.00136 **
## PutterType   1  388.1  388.15  4.5743 0.03469 * 
## Residual   109 9248.9   84.85                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We now have the significant factors: PuttLength with a p-value of 0.00136 and PutterType with a p-value of 0.03469.
b) Analyze the residuals from this experiment. Are there any indications of model inadequacy?
The residual plots are below:

autoplot(mod35)

The residuals depart noticeably from normality at the upper end of the theoretical quantiles, so the model is inadequate.

Problem 4 (6.36)

Resistivity on a silicon wafer is influenced by several factors. The results of a 2^4 factorial experiment performed during a critical processing step are shown in the table in the book.

a) Estimate the factor effects. Plot the effect estimates on a normal probability plot and select a tentative model.
dat4 <- read.csv("D:/R Files/6-36.csv",header=TRUE)
A <- dat4$A
B <- dat4$B
C <- dat4$C
D <- dat4$D
Resistivity <- dat4$Resistivity
A <- as.numeric(A)
B <- as.numeric(B)
C <-  as.numeric(C)
D <-  as.numeric(D)
dat636 <- data.frame(A,B,C,D,Resistivity)
mod5 <- lm(Resistivity~A*B*C*D, data = dat636)
anova(mod5)
## Warning in anova.lm(mod5): ANOVA F-tests on an essentially perfect fit are
## unreliable
## Analysis of Variance Table
## 
## Response: Resistivity
##           Df  Sum Sq Mean Sq F value Pr(>F)
## A          1 159.833 159.833     NaN    NaN
## B          1  36.090  36.090     NaN    NaN
## C          1   0.779   0.779     NaN    NaN
## D          1   0.101   0.101     NaN    NaN
## A:B        1  18.297  18.297     NaN    NaN
## A:C        1   1.422   1.422     NaN    NaN
## B:C        1   0.842   0.842     NaN    NaN
## A:D        1   0.052   0.052     NaN    NaN
## B:D        1   0.035   0.035     NaN    NaN
## C:D        1   0.014   0.014     NaN    NaN
## A:B:C      1   1.898   1.898     NaN    NaN
## A:B:D      1   0.150   0.150     NaN    NaN
## A:C:D      1   0.002   0.002     NaN    NaN
## B:C:D      1   0.143   0.143     NaN    NaN
## A:B:C:D    1   0.322   0.322     NaN    NaN
## Residuals  0   0.000     NaN
halfnormal(mod5)

The half-normal plot shows that A, B and the AB interaction are significant.
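
To make the ranking behind the half-normal plot explicit, the absolute effects can be recovered from the sums of squares printed above. Assuming -1/+1 coding and 16 runs, SS = 16 × coefficient² and effect = 2 × coefficient, so |effect| = 2·sqrt(SS/16). The base R sketch below uses the rounded Sum Sq values from the table and one common choice of plotting positions; it is an illustration, not the method used by `halfnormal()`.

```r
# Sum Sq values copied (rounded) from the anova(mod5) table above
ss <- c("A" = 159.833, "B" = 36.090, "C" = 0.779, "D" = 0.101,
        "A:B" = 18.297, "A:C" = 1.422, "B:C" = 0.842, "A:D" = 0.052,
        "B:D" = 0.035, "C:D" = 0.014, "A:B:C" = 1.898, "A:B:D" = 0.150,
        "A:C:D" = 0.002, "B:C:D" = 0.143, "A:B:C:D" = 0.322)

n_runs <- 16                          # unreplicated 2^4 design
abs_effect <- 2 * sqrt(ss / n_runs)   # assumes -1/+1 coding

# Sort and pair with half-normal quantiles (one common plotting position)
ordered <- sort(abs_effect)
m <- length(ordered)
half_q <- qnorm(0.5 + 0.5 * (seq_len(m) - 0.5) / m)

plot(half_q, ordered, xlab = "Half-normal quantile", ylab = "|Effect|")
text(half_q, ordered, labels = names(ordered), pos = 2, cex = 0.7)

# The three largest absolute effects
top3 <- names(tail(ordered, 3))
top3
```

The three largest absolute effects are the ones that fall away from the line formed by the remaining small effects.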
b) Fit the model identified in part (a) and analyze the residuals. Is there any indication of model inadequacy?

dat6362 <- data.frame(A,B,C,D,Resistivity)
mod52 <- lm(Resistivity~A+B+A*B, data= dat6362)

anova(mod52)
## Analysis of Variance Table
## 
## Response: Resistivity
##           Df  Sum Sq Mean Sq F value    Pr(>F)    
## A          1 159.833 159.833 333.088 4.049e-10 ***
## B          1  36.090  36.090  75.211 1.630e-06 ***
## A:B        1  18.297  18.297  38.130 4.763e-05 ***
## Residuals 12   5.758   0.480                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
autoplot(mod52)

After plotting the residuals, the model appears to be inadequate because the residuals aren’t normally distributed.
c) Repeat the analysis from parts (a) and (b) using ln(y) as the response variable. Is there an indication that the transformation has been useful?

ResistivityLog <- log(Resistivity)
dat6363 <- data.frame(A,B,C,D,ResistivityLog)
mod6 <- lm(ResistivityLog~A*B*C*D, data = dat6363)
anova(mod6)
## Warning in anova.lm(mod6): ANOVA F-tests on an essentially perfect fit are
## unreliable
## Analysis of Variance Table
## 
## Response: ResistivityLog
##           Df  Sum Sq Mean Sq F value Pr(>F)
## A          1 10.5721 10.5721     NaN    NaN
## B          1  1.5803  1.5803     NaN    NaN
## C          1  0.0007  0.0007     NaN    NaN
## D          1  0.0052  0.0052     NaN    NaN
## A:B        1  0.0097  0.0097     NaN    NaN
## A:C        1  0.0252  0.0252     NaN    NaN
## B:C        1  0.0003  0.0003     NaN    NaN
## A:D        1  0.0015  0.0015     NaN    NaN
## B:D        1  0.0002  0.0002     NaN    NaN
## C:D        1  0.0051  0.0051     NaN    NaN
## A:B:C      1  0.0644  0.0644     NaN    NaN
## A:B:D      1  0.0143  0.0143     NaN    NaN
## A:C:D      1  0.0002  0.0002     NaN    NaN
## B:C:D      1  0.0002  0.0002     NaN    NaN
## A:B:C:D    1  0.0157  0.0157     NaN    NaN
## Residuals  0  0.0000     NaN
halfnormal(mod6)

mod53 <- lm(ResistivityLog~A+B+C+(A*B*C), data= dat6363)

anova(mod53)
## Analysis of Variance Table
## 
## Response: ResistivityLog
##           Df  Sum Sq Mean Sq   F value    Pr(>F)    
## A          1 10.5721 10.5721 1994.5559 6.975e-11 ***
## B          1  1.5803  1.5803  298.1470 1.289e-07 ***
## C          1  0.0007  0.0007    0.1240  0.733861    
## A:B        1  0.0097  0.0097    1.8393  0.212066    
## A:C        1  0.0252  0.0252    4.7632  0.060627 .  
## B:C        1  0.0003  0.0003    0.0539  0.822233    
## A:B:C      1  0.0644  0.0644   12.1466  0.008256 ** 
## Residuals  8  0.0424  0.0053                        
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
autoplot(mod53)


The transformation on Resistivity corrected the non-normal residuals. The model is now adequate.

d) Fit a model in terms of the coded variables that can be used to predict the resistivity.
coef(mod53)
##  (Intercept)            A            B            C          A:B          A:C 
##  1.185417116  0.812870345 -0.314277554 -0.006408558 -0.024684570 -0.039723700 
##          B:C        A:B:C 
## -0.004225796  0.063434408

The model, with ln(Resistivity) as the response, is: ln(Resistivity) = 1.1854 + 0.8129A - 0.3143B - 0.0064C - 0.0247AB - 0.0397AC - 0.0042BC + 0.0634ABC
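
Because the model above was fit to the log of resistivity, a prediction has to be exponentiated to return to the original scale. The sketch below wraps the coefficients from `coef(mod53)` in a small function; the function name `predict_resistivity` is made up for this example, and A, B and C are assumed to be in coded units.

```r
# Coefficients copied from the coef(mod53) output above
coefs <- c(intercept = 1.185417116, A = 0.812870345, B = -0.314277554,
           C = -0.006408558, AB = -0.024684570, AC = -0.039723700,
           BC = -0.004225796, ABC = 0.063434408)

# Predict ln(Resistivity) in coded units, then back-transform with exp()
predict_resistivity <- function(A, B, C) {
  log_pred <- coefs[["intercept"]] +
    coefs[["A"]] * A + coefs[["B"]] * B + coefs[["C"]] * C +
    coefs[["AB"]] * A * B + coefs[["AC"]] * A * C + coefs[["BC"]] * B * C +
    coefs[["ABC"]] * A * B * C
  exp(log_pred)
}

# Example: A high, B low, C at its midpoint
predict_resistivity(A = 1, B = -1, C = 0)
```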

Problem 5 (6.39)

An article in Quality and Reliability Engineering…
a) Analyze the data from this experiment. Identify the significant factors and interactions.

Running the ANOVA and the half-normal plot, we see that only the ADE interaction is significant.

dat5 <- read.csv("D:/R Files/6-39.csv",header=TRUE)
A <- dat5$A
B <- dat5$B
C <- dat5$C
D <- dat5$D
E <- dat5$E
Result <- dat5$y
A <- as.numeric(A)
B <- as.numeric(B)
C <-  as.numeric(C)
D <-  as.numeric(D)
E <-  as.numeric(E)
dat639 <- data.frame(A,B,C,D,E,Result)
mod639 <- lm(Result~A*B*C*D*E, data = dat639)
anova(mod639)
## Warning in anova.lm(mod639): ANOVA F-tests on an essentially perfect fit are
## unreliable
## Analysis of Variance Table
## 
## Response: Result
##           Df  Sum Sq Mean Sq F value Pr(>F)
## A          1   2.674   2.674     NaN    NaN
## B          1   0.991   0.991     NaN    NaN
## C          1   0.257   0.257     NaN    NaN
## D          1  81.505  81.505     NaN    NaN
## E          1  26.082  26.082     NaN    NaN
## A:B        1  13.533  13.533     NaN    NaN
## A:C        1  15.249  15.249     NaN    NaN
## B:C        1   4.766   4.766     NaN    NaN
## A:D        1   5.176   5.176     NaN    NaN
## B:D        1  33.395  33.395     NaN    NaN
## C:D        1   0.498   0.498     NaN    NaN
## A:E        1  76.107  76.107     NaN    NaN
## B:E        1 104.221 104.221     NaN    NaN
## C:E        1  27.288  27.288     NaN    NaN
## D:E        1  32.140  32.140     NaN    NaN
## A:B:C      1  51.537  51.537     NaN    NaN
## A:B:D      1   1.062   1.062     NaN    NaN
## A:C:D      1  26.481  26.481     NaN    NaN
## B:C:D      1   0.179   0.179     NaN    NaN
## A:B:E      1  11.725  11.725     NaN    NaN
## A:C:E      1  20.082  20.082     NaN    NaN
## B:C:E      1  30.167  30.167     NaN    NaN
## A:D:E      1 179.409 179.409     NaN    NaN
## B:D:E      1  45.816  45.816     NaN    NaN
## C:D:E      1   2.983   2.983     NaN    NaN
## A:B:C:D    1  44.157  44.157     NaN    NaN
## A:B:C:E    1   0.518   0.518     NaN    NaN
## A:B:D:E    1   5.687   5.687     NaN    NaN
## A:C:D:E    1  15.750  15.750     NaN    NaN
## B:C:D:E    1  50.125  50.125     NaN    NaN
## A:B:C:D:E  1   2.605   2.605     NaN    NaN
## Residuals  0   0.000     NaN
halfnormal(mod639)


b) Analyze the residuals from this experiment. Are there any indications of model inadequacy or violations of the assumptions?

mod6391 <- lm(Result~A+D+E+(A*D)+(A*E)+(D*E)+(A*D*E), data = dat639)
anova(mod6391)
## Analysis of Variance Table
## 
## Response: Result
##           Df Sum Sq Mean Sq F value   Pr(>F)   
## A          1   2.67   2.674  0.1261 0.725657   
## D          1  81.50  81.505  3.8425 0.061677 . 
## E          1  26.08  26.082  1.2296 0.278465   
## A:D        1   5.18   5.176  0.2440 0.625802   
## A:E        1  76.11  76.107  3.5881 0.070310 . 
## D:E        1  32.14  32.140  1.5152 0.230268   
## A:D:E      1 179.41 179.409  8.4582 0.007708 **
## Residuals 24 509.07  21.211                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
autoplot(mod6391)


Analyzing the residuals shows that they are normal and of constant variance, so the model is adequate.
c) One of the factors from this experiment does not seem to be important. If you drop this factor, what type of design remains? Analyze the data using the full factorial model for only the four active factors. Compare your results with those obtained in part (a).
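Factor C is not significant in this experiment. Dropping factor C projects the unreplicated 2^5 design into a 2^4 full factorial in A, B, D and E with two replicates, which is why the ANOVA below has 16 residual degrees of freedom. In this model, ADE (p = 0.006) and BE (p = 0.030) are significant at \(\alpha\) = 0.05, while D (p = 0.051) and AE (p = 0.058) are marginal; the half-normal plot below points to the same four terms. This differs from part (a), where only ADE stood out. Pooling the terms involving C into the error gave an error estimate against which BE, D and AE could be tested.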

mod6392 <- lm(Result~A*B*D*E, data = dat639)
anova(mod6392)
## Analysis of Variance Table
## 
## Response: Result
##           Df  Sum Sq Mean Sq F value   Pr(>F)   
## A          1   2.674   2.674  0.1462 0.707234   
## B          1   0.991   0.991  0.0542 0.818933   
## D          1  81.505  81.505  4.4562 0.050862 . 
## E          1  26.082  26.082  1.4260 0.249816   
## A:B        1  13.533  13.533  0.7399 0.402395   
## A:D        1   5.176   5.176  0.2830 0.602048   
## B:D        1  33.395  33.395  1.8259 0.195416   
## A:E        1  76.107  76.107  4.1611 0.058227 . 
## B:E        1 104.221 104.221  5.6982 0.029671 * 
## D:E        1  32.140  32.140  1.7573 0.203584   
## A:B:D      1   1.062   1.062  0.0581 0.812629   
## A:B:E      1  11.725  11.725  0.6411 0.435057   
## A:D:E      1 179.409 179.409  9.8091 0.006434 **
## B:D:E      1  45.816  45.816  2.5050 0.133050   
## A:B:D:E    1   5.687   5.687  0.3109 0.584829   
## Residuals 16 292.640  18.290                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
halfnormal(mod6392)
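
The design that remains after dropping C can be checked by enumeration. The sketch below builds a generic 2^5 design in coded units (not read from the data file), drops the C column, and counts how often each combination of A, B, D and E occurs.

```r
# Generic unreplicated 2^5 design in coded units
full_design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1),
                           D = c(-1, 1), E = c(-1, 1))

# Drop factor C and count runs per remaining factor combination
projected <- full_design[, c("A", "B", "D", "E")]
counts <- table(do.call(paste, projected))

length(counts)              # number of distinct A, B, D, E combinations
unique(as.vector(counts))   # runs per combination
```

Sixteen distinct combinations with two runs each is a 2^4 full factorial with two replicates, which leaves 32 - 16 = 16 degrees of freedom for error.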

d) Find settings of the active factors that maximize the predicted response.
Using another software package’s response optimizer, I found that the following settings maximize the predicted response: A = -1; B = 1; C = -1; D = 1; E = -1. This gives a predicted response of y = 22.19.