Setup

Load Libraries Into Session

library(dplyr)
library(tidyr)
library(GAD)
library(DoE.base)

Question 6.8

A bacteriologist is interested in the effects of two different culture media and two different times on the growth of a particular virus. He or she performs six replicates of a \(2^2\) design, making the runs in random order. Analyze the bacterial growth data that follow and draw appropriate conclusions. Analyze the residuals and comment on the model’s adequacy.

Reading in Data

BacteriaData <- read.csv("~/Grad School/IE 5342/Homework/6.8Data.csv")

Model Equation

\(y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \epsilon_{ijk}\)

Hypotheses

Medium Effect

\(H_0: \alpha_i=0\) for all i

\(H_a: \alpha_i\neq0\) for some i

Hour Effect

\(H_0: \beta_j=0\) for all j

\(H_a: \beta_j\neq0\) for some j

Interaction Effect

\(H_0: \alpha\beta_{ij}=0\) for all ij

\(H_a: \alpha\beta_{ij}\neq0\) for some ij

Manipulating Data and Running AOV

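# The time column is named Hours (see the ANOVA table below); Hour is a copy used by the interaction plot
BacteriaData$Hours <- as.factor(BacteriaData$Hours)
BacteriaData$Hour <- BacteriaData$Hours
BacteriaData$Medium <- as.factor(BacteriaData$Medium)
BacteriaDataModel <- aov(Response~Medium*Hours,data=BacteriaData)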
summary(BacteriaDataModel)

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## Medium        1    9.4     9.4   1.835 0.190617    
## Hours         1  590.0   590.0 115.506 9.29e-10 ***
## Medium:Hours  1   92.0    92.0  18.018 0.000397 ***
## Residuals    20  102.2     5.1                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusions

AOV resulted in a p-value for the Medium effect of \(0.1906 > 0.05\), meaning the Medium effect is not significant and we fail to reject \(H_0\).

AOV resulted in a p-value for the Hour effect of \(9.29\times10^{-10} < 0.05\), meaning the Hour effect is significant and we reject \(H_0\).

AOV resulted in a p-value for Medium-Hour Interaction effect of \(0.000397 < 0.05\), meaning the Interaction effect is significant and we reject \(H_0\).
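
The three decisions above can be cross-checked directly from the F statistics in the ANOVA table. The following is a minimal sketch that assumes only the printed F values and the degrees of freedom (1 for each effect, 20 for the residuals); the data frame name `anova_check` is made up for illustration.

```r
# F statistics copied from the ANOVA table; each effect has 1 df, residuals have 20
anova_check <- data.frame(
  term    = c("Medium", "Hours", "Medium:Hours"),
  F_value = c(1.835, 115.506, 18.018)
)

# The upper-tail probability of the F distribution is the p-value
anova_check$p_value <- pf(anova_check$F_value, df1 = 1, df2 = 20, lower.tail = FALSE)

# Reject H0 when the p-value falls below alpha = 0.05
anova_check$reject_H0 <- anova_check$p_value < 0.05
print(anova_check)
```

Because the F values are rounded, the recomputed p-values agree with the table only to the printed precision.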

Normal Probability Plot

plot(BacteriaDataModel,2)

Residuals vs. Predicted Plot

plot(BacteriaDataModel,1)

Medium - Hour Interaction Plot

# with() avoids attach(), which leaves the data frame on the search path
with(BacteriaData, interaction.plot(Medium,Hour,Response))

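The residual plot and normal probability plot indicate that the residuals are approximately normal and show only a little variation in spread. The interaction plot indicates that an interaction is present, since the two lines are not parallel, which agrees with the significant Medium:Hours term in the ANOVA.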

Question 6.12

An article in the AT&T Technical Journal…..

Reading in Data

WaferData <- read.csv("~/Grad School/IE 5342/Homework/6.12Data.csv",header=TRUE)

Model Equation

\(y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \epsilon_{ijk}\)

Hypotheses

Flow Rate Effect

\(H_0: \alpha_i=0\) for all i

\(H_a: \alpha_i\neq0\) for some i

Deposition Time Effect

\(H_0: \beta_j=0\) for all j

\(H_a: \beta_j\neq0\) for some j

Interaction Effect

\(H_0: \alpha\beta_{ij}=0\) for all ij

\(H_a: \alpha\beta_{ij}\neq0\) for some ij

Part A

Estimate the Factor Effects

n <- 4
Trt1Sum <- 58.081
Trt2Sum <- 55.686
Trt3Sum <- 59.299
Trt4Sum <- 59.156
EffectA <- 1/(2*n)*(Trt4Sum+Trt2Sum-Trt3Sum-Trt1Sum)
EffectB <- 1/(2*n)*(Trt4Sum+Trt3Sum-Trt2Sum-Trt1Sum)
EffectAB <- 1/(2*n)*(Trt1Sum+Trt4Sum-Trt3Sum-Trt2Sum)
mean(WaferData$Response)

## [1] 14.51388

EffectA

## [1] -0.31725

EffectB

## [1] 0.586

EffectAB

## [1] 0.2815

Part B

Conduct an Analysis of Variance. Which factors are important?

WaferDataModel <-  lm(Response~FlowRate*Time,data=WaferData)
anova(WaferDataModel)

## Analysis of Variance Table
## 
## Response: Response
##               Df Sum Sq Mean Sq F value  Pr(>F)  
## FlowRate       1 0.4026 0.40259  1.2619 0.28327  
## Time           1 1.3736 1.37358  4.3054 0.06016 .
## FlowRate:Time  1 0.3170 0.31697  0.9935 0.33856  
## Residuals     12 3.8285 0.31904                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

AOV resulted in a p-value for the FlowRate effect of \(0.283 > 0.05\), meaning the Flow Rate effect is not significant and we fail to reject \(H_0\).

AOV resulted in a p-value for the Time effect of \(0.060 > 0.05\), meaning the Time effect is not significant and we fail to reject \(H_0\).

AOV resulted in a p-value for FlowRate-Time Interaction effect of \(0.339 > 0.05\), meaning the Interaction effect is not significant and we fail to reject \(H_0\).
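
In a \(2^2\) design with n replicates, each sum of squares equals the squared contrast divided by 4n, so the ANOVA table can be cross-checked against the treatment totals from Part A. This is a minimal sketch, assuming the four totals are in standard order, (1), a, b, ab, which is what the effect formulas in Part A imply; the object names are illustrative.

```r
n <- 4
# Treatment totals from Part A, assumed to be in standard order: (1), a, b, ab
totals <- c(58.081, 55.686, 59.299, 59.156)

# Contrast coefficients for A, B, and AB in standard order
signs <- rbind(
  A  = c(-1,  1, -1,  1),
  B  = c(-1, -1,  1,  1),
  AB = c( 1, -1, -1,  1)
)

contrast_totals <- drop(signs %*% totals)      # one contrast per effect
effect_est      <- contrast_totals / (2 * n)   # same values as EffectA, EffectB, EffectAB
sum_sq          <- contrast_totals^2 / (4 * n) # sums of squares, 1 df each

round(effect_est, 5)
round(sum_sq, 5)
```

The sums of squares come out as 0.40259, 1.37358, and 0.31697, matching the FlowRate, Time, and FlowRate:Time rows of the ANOVA table.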

Part C

Regression Equation

\(\hat{y}=14.514-\frac{0.317}{2}A+\frac{0.586}{2}B+\frac{0.282}{2}AB\)
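
The regression equation can be evaluated at the four corners of the design as a sanity check, since each fitted value should reproduce the corresponding treatment mean (treatment total divided by n = 4). This is a sketch under the assumption that A and B are coded as -1 and +1; the function name `predict_response` is hypothetical.

```r
# Regression equation from Part C, with A and B in coded units (-1 / +1)
predict_response <- function(A, B) {
  14.514 + (-0.317 / 2) * A + (0.586 / 2) * B + (0.282 / 2) * A * B
}

# Evaluate at the four design corners
corners <- expand.grid(A = c(-1, 1), B = c(-1, 1))
corners$fitted <- predict_response(corners$A, corners$B)
print(corners)
```

For example, the fitted value at A = +1, B = +1 is 14.7895, which agrees with the treatment mean 59.156 / 4 = 14.789.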

Part D

Analyze the residuals. Are there any residuals that should cause concern?

Normal Probability Plot

plot(WaferDataModel,2)

Residuals vs. Fitted Plot

plot(WaferDataModel,1)

The normal probability plot (NPP) and residuals vs. fitted (RVF) plot show that data point 2 is an outlier. Apart from data point 2, the residuals appear normal with constant variance.
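
An outlier seen in the plots can also be flagged numerically with standardized residuals. The sketch below uses a hypothetical \(2^2\) data set with a planted outlier, not the wafer data, and the cutoff of 2 is a common rule of thumb rather than anything stated in the problem.

```r
# Hypothetical 2^2 design with 4 replicates; these are NOT the wafer data
set.seed(42)
demo <- expand.grid(FlowRate = c(-1, 1), Time = c(-1, 1), Rep = 1:4)
demo$Response <- 14.5 + rnorm(nrow(demo), sd = 0.1)
demo$Response[2] <- demo$Response[2] + 3   # plant an outlier in row 2

demo_model <- lm(Response ~ FlowRate * Time, data = demo)

# Standardized residuals; values beyond about +/-2 deserve a closer look
std_res <- rstandard(demo_model)
flagged <- which(abs(std_res) > 2)
flagged
```

Applied to the fitted model in this section, the same two lines would be `rstandard(WaferDataModel)` followed by the `which()` call.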

Part E

We could replace the outlier’s value with an average of other observations.

Question 6.21

Reading in Data

PuttData <- read.csv("~/Grad School/IE 5342/Homework/6.21Data3.csv",header=TRUE)

Part A

Analyze the data from this experiment. Which factors significantly affect putting performance?

PuttModel1 <- lm(Response~PuttLength*PutterType*PuttBreak*PuttSlope,data=PuttData)
halfnormal(PuttModel1)

## Warning in halfnormal.lm(PuttModel1): halfnormal not recommended for models with
## more residual df than model df

## 
## Significant effects (alpha=0.05, Lenth method):

## [1] PuttLength e95        e28        e44        e49        PutterType e84        
## 
## [8] e32        e78

The half normal plot tells us that Putt Length and Putter Type are significant factors, while the other two (Putt Break and Putt Slope) are not. With this, we can narrow the ANOVA down to those two significant factors.

PuttModel2 <- lm(Response~PuttLength*PutterType,data=PuttData)
anova(PuttModel2)

## Analysis of Variance Table
## 
## Response: Response
##                        Df Sum Sq Mean Sq F value   Pr(>F)   
## PuttLength              1  917.1  917.15 10.9689 0.001261 **
## PutterType              1  388.1  388.15  4.6421 0.033418 * 
## PuttLength:PutterType   1  218.7  218.68  2.6154 0.108750   
## Residuals             108 9030.3   83.61                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Running ANOVA on the reduced model confirms what the half normal plot showed: Putt Length is significant with a p-value of \(0.0013 \ll 0.05\), and Putter Type is significant with a p-value of \(0.0334 < 0.05\). The Putt Length-Putter Type interaction is not significant, with a p-value of \(0.1088 > 0.05\).

Part B

Analyze the residuals from this experiment. Are there any indications of model inadequacy?

Normal Probability Plot

plot(PuttModel1,2)

Residuals vs. Fitted Plot

plot(PuttModel1,1)

The residuals follow the normal line until about theoretical quantile 1 and then depart from it. The variance looks constant within different parts of the fitted range, but overall the residuals do not have constant variance. No, the model is not adequate.

Question 6.36

Resistivity on a silicon wafer is influenced by several factors. The results of a \(2^4\) factorial experiment performed during a critical processing step are shown in Table P6.10.

Reading in Data

ResistivityDataa <- read.csv("~/Grad School/IE 5342/Homework/6.36Dataa.csv",header=TRUE)

Part A

Estimate the factor effects. Plot the effect estimates on a normal probability plot and select a tentative model.

ResistivityModel1 <- lm(Response~A*B*C*D, data=ResistivityDataa)
halfnormal(ResistivityModel1)

## 
## Significant effects (alpha=0.05, Lenth method):

## [1] A     B     A:B   A:B:C

The half normal plot flags Factor A, Factor B, the A:B interaction, and the A:B:C interaction as significant. For the tentative model we keep Factor A, Factor B, and the interaction between them, and narrow the ANOVA down to those terms.
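
The output above says the significant effects were picked with the Lenth method. As a rough illustration of that idea, here is a base R sketch of Lenth's pseudo standard error (PSE) and margin of error. It uses hypothetical effect estimates, not the resistivity data, and it is not claimed to reproduce the exact internals of `halfnormal()`; the function name `lenth_pse` is made up.

```r
# Lenth's pseudo standard error for a vector of effect estimates
lenth_pse <- function(effects) {
  abs_eff <- abs(effects)
  s0 <- 1.5 * median(abs_eff)                # initial robust scale estimate
  1.5 * median(abs_eff[abs_eff < 2.5 * s0])  # re-estimate after trimming large effects
}

# Hypothetical effect estimates, NOT taken from the resistivity experiment
effect_est <- c(A = 10, B = -8, C = 0.5, AB = -0.3, AC = 0.2, BC = 0.4, ABC = -0.1)

pse <- lenth_pse(effect_est)

# Margin of error at alpha = 0.05, using Lenth's m/3 degrees of freedom
margin <- qt(0.975, df = length(effect_est) / 3) * pse

# Effects larger in magnitude than the margin are flagged as active
active <- names(effect_est)[abs(effect_est) > margin]
active
```

With these made-up numbers the PSE is 0.45 and only A and B exceed the margin of error.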

Part B

Fit the model identified in part (a) and analyze the residuals. Is there any indication of model inadequacy?

ResistivityModel2 <- lm(Response~A+B+A*B,data=ResistivityDataa)
anova(ResistivityModel2)

## Analysis of Variance Table
## 
## Response: Response
##           Df  Sum Sq Mean Sq F value    Pr(>F)    
## A          1 159.833 159.833 333.088 4.049e-10 ***
## B          1  36.090  36.090  75.211 1.630e-06 ***
## A:B        1  18.297  18.297  38.130 4.763e-05 ***
## Residuals 12   5.758   0.480                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Normal Probability Plot

plot(ResistivityModel2,2)

Residuals vs. Fitted Plot

plot(ResistivityModel2,1)

The NPP shows that the data is not normally distributed and the Residuals vs Fitted plot shows that the data does not have constant variance. This means that the model is inadequate.

Part C

Repeat the analysis from parts (a) and (b) using \(ln(y)\) as the response variable. Is there an indication that the transformation has been useful?

ResistivityDatab <- read.csv("~/Grad School/IE 5342/Homework/6.36Datab.csv",header=TRUE)

ResistivityModel3 <- lm(Response~A*B*C*D, data=ResistivityDatab)
halfnormal(ResistivityModel3)

## 
## Significant effects (alpha=0.05, Lenth method):

## [1] A     B     A:B:C

The half normal plot again shows that Factor A and Factor B are significant, but it no longer flags the interaction between Factor A and Factor B. It does, however, flag the three-way interaction among Factors A, B, and C, so we will use these three factors for our ANOVA.

ResistivityModel4 <- lm(Response~A+B+A*B*C,data=ResistivityDatab)
anova(ResistivityModel4)

## Analysis of Variance Table
## 
## Response: Response
##           Df  Sum Sq Mean Sq   F value    Pr(>F)    
## A          1 10.5721 10.5721 1994.5559 6.975e-11 ***
## B          1  1.5803  1.5803  298.1470 1.289e-07 ***
## C          1  0.0007  0.0007    0.1240  0.733861    
## A:B        1  0.0097  0.0097    1.8393  0.212066    
## A:C        1  0.0252  0.0252    4.7632  0.060627 .  
## B:C        1  0.0003  0.0003    0.0539  0.822233    
## A:B:C      1  0.0644  0.0644   12.1466  0.008256 ** 
## Residuals  8  0.0424  0.0053                        
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Normal Probability Plot

plot(ResistivityModel4,2)

Residuals vs. Fitted Plot

plot(ResistivityModel4,1)

While the normality has improved with the \(ln\) transformation, the data is still not normal. The data still does not have constant variance. No, this transformation has not been useful.

Part D

Fit a model in terms of the coded variables that can be used to predict the resistivity.

coef(ResistivityModel4)

##  (Intercept)            A            B            C          A:B          A:C 
##  1.185417116  0.812870345 -0.314277554 -0.006408558 -0.024684570 -0.039723700 
##          B:C        A:B:C 
## -0.004225796  0.063434408

From the coefficients, we get the following model equation:

\(y = 1.185+0.813A-0.314B-0.006C-0.025AB-0.040AC-0.004BC+0.063ABC\)
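
To use the fitted model for prediction, the equation can be wrapped in a small function. This is a sketch using the coefficients rounded from the `coef()` output above, with A, B, and C in coded units. It assumes that `Response` in `ResistivityDatab` is the natural log of resistivity, as the Part C prompt suggests, so `exp()` is applied to return to the original scale. The function name is hypothetical.

```r
# Coefficients rounded from coef(ResistivityModel4); inputs are coded -1 / +1
predict_log_response <- function(A, B, C) {
  1.185 + 0.813 * A - 0.314 * B - 0.006 * C -
    0.025 * A * B - 0.040 * A * C - 0.004 * B * C + 0.063 * A * B * C
}

# Prediction on the log scale with all three factors at their high level
log_hat <- predict_log_response(A = 1, B = 1, C = 1)
log_hat

# Back-transform, assuming the response was ln(y)
exp(log_hat)
```

At A = B = C = +1 the prediction on the log scale is 1.672.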

Question 6.39

An article……..

Part A

Analyze the data from this experiment. Identify the significant factors and interactions.

Reading in Data

dat639 <- read.csv("~/Grad School/IE 5342/Homework/6.39Data.csv",header=TRUE)

Creating Half Normal Plot

dat639model1 <- lm(Response~A*B*C*D*E, data = dat639)
halfnormal(dat639model1)

## 
## Significant effects (alpha=0.05, Lenth method):

##  [1] D     E     A:D   A     D:E   B:E   A:B   A:B:E A:E   A:D:E

The half normal plot shows that A, D, and E are significant factors and A:B, A:D, A:E, B:E, D:E, A:B:E, and A:D:E are significant interactions.

Part B

Analyze the residuals from this experiment. Are there any indications of model inadequacy or violations of the assumptions?

dat639model2 <- aov(Response~A*B*D*E,data=dat639)

Normal Probability Plot

plot(dat639model2,2)

Residuals vs. Fitted Plot

plot(dat639model2,1)

The plots show fairly normal residuals with constant variance, so yes, the model is adequate.

Part C

One of the factors does not seem to be important. If you drop this factor, what type of design remains? Analyze the data using the full factorial model for only the four active factors. Compare your results with those obtained in part (a).

Factor C does not appear to be significant based on previous results. If we drop Factor C, the design becomes a \(2^4\) design with two replicates.
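
The projection can be checked by enumerating the runs. The sketch below assumes the original experiment is a full, unreplicated \(2^5\) factorial in factors A through E with coded levels; it builds that design, drops C, and counts how often each remaining combination appears.

```r
# Full 2^5 design in coded units (assumed layout, one run per combination)
full <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1),
                    D = c(-1, 1), E = c(-1, 1))

# Drop factor C and count the runs at each A, B, D, E combination
projected <- full[, c("A", "B", "D", "E")]
runs_per_cell <- as.vector(table(projected))

length(runs_per_cell)   # number of distinct combinations
unique(runs_per_cell)   # runs at each combination
```

Each of the 16 combinations of A, B, D, and E appears twice, so the 32 runs form a \(2^4\) design with two replicates.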

Reading in Data

dat639c <- read.csv("~/Grad School/IE 5342/Homework/6.39Datac.csv",header=TRUE)

Creating Half Normal Plot

dat639model3 <- lm(Response~A*B*D*E, data = dat639c)
halfnormal(dat639model3)

## 
## Significant effects (alpha=0.05, Lenth method):

##  [1] D     E     A:D   A     D:E   B:E   A:B   A:B:E A:E   A:D:E e10

The half normal plot again shows that A, D, and E are significant factors and A:B, A:D, A:E, B:E, D:E, A:B:E, and A:D:E are significant interactions. Thus dropping Factor C did not change any of the results.

Homework Week 11

IE 5342 - Dr. Timothy I. Matis

Hunter Swerdloff

14 Nov 2021
