1 Question 6.8:

1.1 Solution:

Writing Model Equation:

\[ y_{ijk}=\mu+\alpha_{i}+\beta_{j}+(\alpha\beta)_{ij}+\epsilon_{ijk} \]

Writing the Hypothesis:

Interaction:

Null:\[H_o:(\alpha\beta)_{ij}=0\quad\forall\; i,j\]

Alternate:\[H_a:(\alpha\beta)_{ij}\neq0\quad\exists\; i,j\]

Main Effects:

Null:\[H_o:\alpha_{i}=0\quad\forall\; i\]

\[ H_o:\beta_{j}=0\quad\forall\; j \]

Alternate:\[H_a:\alpha_{i}\neq0\quad\exists\; i\]

\[ H_a:\beta_{j}\neq0\quad\exists\; j \]

Reading the Data:

CultureMedium <-  c(1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2)
Time <- c(rep(12,12),rep(18,12))
Values <- c(21,22,25,26,23,28,24,25,20,26,29,27,37,39,31,34,38,38,29,33,35,36,30,35)
CM <- as.factor(CultureMedium)
Time <- as.factor(Time)
Data <- data.frame(CM,Time,Values)

Model <- aov(Values~CM*Time,data = Data)
summary(Model)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## CM           1    9.4     9.4   1.835 0.190617    
## Time         1  590.0   590.0 115.506 9.29e-10 ***
## CM:Time      1   92.0    92.0  18.018 0.000397 ***
## Residuals   20  102.2     5.1                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
--> The CM:Time interaction is significant (p = 0.000397 < 0.05), so we reject the null hypothesis of no interaction. Because the interaction is significant, we do not interpret the main effects on their own and instead examine the interaction plot.

Interaction Plot:

interaction.plot(CultureMedium,Time,Values,col = c("blue","red"))
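The interaction plot draws the mean of Values for each culture medium and time combination. As a minimal sketch, the same cell means can be tabulated directly; the names `dat` and `cell_means` are illustrative and not part of the original solution.

```r
# Rebuild the data from the solution (object names here are illustrative)
CultureMedium <- rep(c(1, 1, 2, 2), 6)
Time <- c(rep(12, 12), rep(18, 12))
Values <- c(21, 22, 25, 26, 23, 28, 24, 25, 20, 26, 29, 27,
            37, 39, 31, 34, 38, 38, 29, 33, 35, 36, 30, 35)
dat <- data.frame(CM = factor(CultureMedium), Time = factor(Time), Values)

# Mean response in each CM x Time cell: rows are CM, columns are Time
cell_means <- with(dat, tapply(Values, list(CM = CM, Time = Time), mean))
print(round(cell_means, 2))

# Simple effect of CM (medium 2 minus medium 1) at each time;
# a sign change between the two times corresponds to crossing lines in the plot
cm_diff <- cell_means["2", ] - cell_means["1", ]
print(round(cm_diff, 2))
```

If the difference between media changes sign from one time to the other, the lines in the interaction plot cross, which is the pattern a significant interaction would produce.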

Model Adequacy:

library(ggfortify)
library(ggplot2)
autoplot(Model)

Comment:

--> The Normal Q-Q plot suggests the residuals are approximately normal, but the Residuals vs Fitted plot shows uneven spread across the fitted values, so the constant-variance assumption may not hold and the model's conclusions should be treated with caution.
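The visual checks can be supplemented with formal tests. The sketch below is not part of the original solution: it assumes the Shapiro-Wilk test (`shapiro.test`) for normality of the residuals and Bartlett's test (`bartlett.test`) for equal variance across the four cells, both from base R's stats package. The names `dat`, `fit`, `sw`, and `bt` are illustrative.

```r
# Rebuild the data and refit the two-factor model (names are illustrative)
CultureMedium <- rep(c(1, 1, 2, 2), 6)
Time <- c(rep(12, 12), rep(18, 12))
Values <- c(21, 22, 25, 26, 23, 28, 24, 25, 20, 26, 29, 27,
            37, 39, 31, 34, 38, 38, 29, 33, 35, 36, 30, 35)
dat <- data.frame(CM = factor(CultureMedium), Time = factor(Time), Values)
fit <- aov(Values ~ CM * Time, data = dat)

# Shapiro-Wilk test: a small p-value would suggest non-normal residuals
sw <- shapiro.test(residuals(fit))

# Bartlett test: a small p-value would suggest unequal variances
# across the four CM x Time treatment combinations
dat$cell <- interaction(dat$CM, dat$Time)
bt <- bartlett.test(Values ~ cell, data = dat)

print(sw$p.value)
print(bt$p.value)
```

Bartlett's test is itself sensitive to non-normality, so its result is best read alongside the plots rather than in place of them.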

2 Question 6.12:

  1. Estimate the factor effects.

  2. Conduct an analysis of variance. Which factors are important?

  3. Write down a regression equation that could be used to predict epitaxial layer thickness over the region of arsenic flow rate and deposition time used in this experiment.

  4. Analyze the residuals. Are there any residuals that should cause concern?

  5. Discuss how you might deal with the potential outlier found in part (d).

2.1 Solution:

Writing Model Equation:

\[ y_{ijk}=\mu+\alpha_{i}+\beta_{j}+(\alpha\beta)_{ij}+\epsilon_{ijk} \]

Reading the Data:

A <- c(-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1)
B <- c(-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1)
Obs <- c(14.037,13.880,14.821,14.888,16.165,13.860,14.757,14.921,13.972,14.032,14.843,14.415,13.907,13.914,14.878,14.932)
A <- as.factor(A)
B <- as.factor(B)
Data <- data.frame(A,B,Obs)

PART A:

One <- c(14.037,16.165,13.972,13.907)
A <- c(13.88,13.86,14.032,13.914)
B <- c(14.821,14.757,14.843,14.878)
AB <- c(14.888,14.921,14.415,14.932)

S1 <- sum(One)
SA <- sum(A)
SB <- sum(B)
SAB <- sum(AB)

EffectA <- (2*(SA+SAB-S1-SB)/(4*4))
EffectB <- (2*(SB+SAB-S1-SA)/(4*4))
EffectAB <- (2*(SAB+S1-SA-SB)/(4*4))

print(EffectA)
## [1] -0.31725
print(EffectB)
## [1] 0.586
print(EffectAB)
## [1] 0.2815
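The hand calculation can be cross-checked with a regression on the coded variables. In a 2^2 design with numeric -1/+1 columns, each effect equals twice the corresponding regression coefficient, and the interaction contrast is ab + (1) - a - b. This is a minimal sketch; `x_a`, `x_b`, `fit_coded`, and `effects_est` are illustrative names.

```r
# Coded design columns in the same run order as the solution's data
x_a <- rep(c(-1, 1), 8)
x_b <- rep(c(-1, -1, 1, 1), 4)
Obs <- c(14.037, 13.880, 14.821, 14.888, 16.165, 13.860, 14.757, 14.921,
         13.972, 14.032, 14.843, 14.415, 13.907, 13.914, 14.878, 14.932)

# With numeric -1/+1 predictors, each coefficient is half the effect
fit_coded <- lm(Obs ~ x_a * x_b)
effects_est <- 2 * coef(fit_coded)[-1]   # drop the intercept
print(round(effects_est, 5))
```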

PART B:

Writing the Hypothesis:

Interaction:

Null:\[H_o:(\alpha\beta)_{ij}=0\quad\forall\; i,j\]

Alternate:\[H_a:(\alpha\beta)_{ij}\neq0\quad\exists\; i,j\]

Main Effects:

Null:\[H_o:\alpha_{i}=0\quad\forall\; i\]

\[ H_o:\beta_{j}=0\quad\forall\; j \]

Alternate:\[H_a:\alpha_{i}\neq0\quad\exists\; i\]

\[ H_a:\beta_{j}\neq0\quad\exists\; j \]

Running the Model:

Model <- aov(Obs~A*B,data = Data)
summary(Model)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## A            1  0.403  0.4026   1.262 0.2833  
## B            1  1.374  1.3736   4.305 0.0602 .
## A:B          1  0.317  0.3170   0.994 0.3386  
## Residuals   12  3.828  0.3190                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
--> From the above result, the A:B interaction is not significant (p = 0.3386), so we drop the interaction term and test the main effects.
Model <- aov(Obs~A+B,data = Data)
summary(Model)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## A            1  0.403  0.4026   1.263 0.2815  
## B            1  1.374  1.3736   4.308 0.0584 .
## Residuals   13  4.145  0.3189                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
--> Neither main effect is significant at the 0.05 level (A: p = 0.2815; B: p = 0.0584), although B is significant at the 0.10 level.
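Dropping the interaction amounts to comparing two nested models. As a sketch under that reading, `anova()` on the additive and full fits gives the same F test for the A:B term in a single step; `dat`, `fit_add`, `fit_full`, and `cmp` are illustrative names.

```r
# Rebuild the 2^2 data with A and B as factors (names are illustrative)
A <- factor(rep(c(-1, 1), 8))
B <- factor(rep(c(-1, -1, 1, 1), 4))
Obs <- c(14.037, 13.880, 14.821, 14.888, 16.165, 13.860, 14.757, 14.921,
         13.972, 14.032, 14.843, 14.415, 13.907, 13.914, 14.878, 14.932)
dat <- data.frame(A, B, Obs)

fit_add  <- lm(Obs ~ A + B, data = dat)   # main effects only
fit_full <- lm(Obs ~ A * B, data = dat)   # main effects plus interaction

# Partial F test: does adding A:B reduce the residual sum of squares enough?
cmp <- anova(fit_add, fit_full)
print(cmp)
```

The F statistic in the second row of `cmp` is the test of the interaction given both main effects, so a large p-value there supports the additive model.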

PART C:

Model <- lm(Obs~A*B,data = Data)
coef(Model)
## (Intercept)          A1          B1       A1:B1 
##    14.52025    -0.59875     0.30450     0.56300
summary(Model)
## 
## Call:
## lm(formula = Obs ~ A * B, data = Data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.61325 -0.14431 -0.00563  0.10188  1.64475 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  14.5202     0.2824  51.414 1.93e-15 ***
## A1           -0.5987     0.3994  -1.499    0.160    
## B1            0.3045     0.3994   0.762    0.461    
## A1:B1         0.5630     0.5648   0.997    0.339    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5648 on 12 degrees of freedom
## Multiple R-squared:  0.3535, Adjusted R-squared:  0.1918 
## F-statistic: 2.187 on 3 and 12 DF,  p-value: 0.1425

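Therefore Regression Equation:

\[ \hat{y}= 14.52025 - 0.59875\,x_{A} + 0.30450\,x_{B} + 0.56300\,x_{A}x_{B} \]

Here \(x_{A}\) and \(x_{B}\) are 0/1 indicators for the high level of A and B, because A and B were converted to factors before fitting (the coefficients are labelled A1 and B1 in the output).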
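Because A and B are factors in the fit above, its coefficients refer to 0/1 indicators. For prediction over the experimental region it can be more convenient to fit numeric -1/+1 coded variables, so that intermediate settings can be supplied. The sketch below assumes that alternative coding; `x_a`, `x_b`, `fit_coded`, and `new_pts` are illustrative names.

```r
# Numeric coded variables: -1 = low level, +1 = high level
x_a <- rep(c(-1, 1), 8)
x_b <- rep(c(-1, -1, 1, 1), 4)
Obs <- c(14.037, 13.880, 14.821, 14.888, 16.165, 13.860, 14.757, 14.921,
         13.972, 14.032, 14.843, 14.415, 13.907, 13.914, 14.878, 14.932)

fit_coded <- lm(Obs ~ x_a * x_b)
print(coef(fit_coded))   # intercept is the grand mean in this coding

# Predict at a corner of the design and at its center
new_pts <- data.frame(x_a = c(-1, 0), x_b = c(-1, 0))
new_pts$pred <- predict(fit_coded, newdata = new_pts)
print(new_pts)
```

Both codings give identical fitted values at the four design points; only the interpretation of the coefficients differs.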

PART D:

Analyzing Residuals:

autoplot(Model)

--> The plots show that the residuals are not normally distributed, that there is an outlier (the largest residual is 1.64475), and that the variance is not constant, so the model is not adequate.
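The outlier can also be located numerically instead of read off the plots. This is a minimal sketch using standardized residuals from base R's `rstandard()`; the names `fit`, `rs`, and `worst` are illustrative.

```r
# Refit the factorial model with A and B as factors (names are illustrative)
A <- factor(rep(c(-1, 1), 8))
B <- factor(rep(c(-1, -1, 1, 1), 4))
Obs <- c(14.037, 13.880, 14.821, 14.888, 16.165, 13.860, 14.757, 14.921,
         13.972, 14.032, 14.843, 14.415, 13.907, 13.914, 14.878, 14.932)
fit <- lm(Obs ~ A * B)

# Standardized residuals: values far beyond about 2 to 3 in absolute
# value are commonly treated as candidates for closer inspection
rs <- rstandard(fit)
worst <- unname(which.max(abs(rs)))

print(worst)        # run number of the most extreme residual
print(Obs[worst])   # the corresponding observation
print(rs[worst])
```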

PART E:

--> We can apply a Box-Cox transformation to the response, choose an appropriate value of lambda, and then repeat the ANOVA on the transformed data.
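One way to carry out the Box-Cox step is sketched below. It assumes the MASS package (distributed with standard R installations) and its `boxcox()` function, neither of which appears in the original solution; the lambda grid and the names `fit`, `bc`, and `lambda_hat` are illustrative choices.

```r
library(MASS)

A <- factor(rep(c(-1, 1), 8))
B <- factor(rep(c(-1, -1, 1, 1), 4))
Obs <- c(14.037, 13.880, 14.821, 14.888, 16.165, 13.860, 14.757, 14.921,
         13.972, 14.032, 14.843, 14.415, 13.907, 13.914, 14.878, 14.932)
dat <- data.frame(A, B, Obs)

# Keep the response and QR decomposition so boxcox() can reuse the fit
fit <- lm(Obs ~ A * B, data = dat, y = TRUE, qr = TRUE)

# Profile log-likelihood over a grid of lambda values (grid is an assumption)
bc <- boxcox(fit, lambda = seq(-5, 5, by = 0.1), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]
print(lambda_hat)

# The transformed response would then be (Obs^lambda - 1) / lambda,
# or log(Obs) when lambda is close to 0
```

A transformation addresses non-constant variance across the whole response; if a single observation remains extreme after transforming, it may still need to be investigated on its own.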

3 Question 6.21:

  1. Analyze the data from this experiment. Which factors significantly affect putting performance?

  2. Analyze the residuals from this experiment. Are there any indications of model inadequacy?

3.1 Solution:

Reading the Data:

Typeofputter <- c(rep(-1,7),rep(-1,7),rep(1,7),rep(1,7),rep(-1,7),rep(-1,7),rep(1,7),rep(1,7),rep(-1,7),rep(-1,7),rep(1,7),rep(1,7),rep(-1,7),rep(-1,7),rep(1,7),rep(1,7))
LengthofPutt <- c(rep(-1,7),rep(1,7),rep(-1,7),rep(1,7),rep(-1,7),rep(1,7),rep(-1,7),rep(1,7),rep(-1,7),rep(1,7),rep(-1,7),rep(1,7),rep(-1,7),rep(1,7),rep(-1,7),rep(1,7))
Slopeofputt <- c(rep(-1,7),rep(-1,7),rep(-1,7),rep(-1,7),rep(-1,7),rep(-1,7),rep(-1,7),rep(-1,7),rep(1,7),rep(1,7),rep(1,7),rep(1,7),rep(1,7),rep(1,7),rep(1,7),rep(1,7))
Breakofputt <- c(rep(-1,7),rep(-1,7),rep(-1,7),rep(-1,7),rep(1,7),rep(1,7),rep(1,7),rep(1,7),rep(-1,7),rep(-1,7),rep(-1,7),rep(-1,7),rep(1,7),rep(1,7),rep(1,7),rep(1,7))

DistancefromCup <- c(10,18,14,12.5,19,16,18.5, 0,16.5,4.5,17.5,20.5,17.5,33, 4,6,1,14.5,12,14,5, 0,10,34,11,25.5,21.5,0, 0,0,18.5,19.5,16,15,11, 5,20.5,18,20,29.5,19,10, 6.5,18.5,7.5,6,0,10,0, 16.5,4.5,0,23.5,8,8,8, 4.5,18,14.5,10,0,17.5,6, 19.5,18,16,5.5,10,7,36, 15,16,8.5,0,0.5,9,3, 41.5,39,6.5,3.5,7,8.5,36, 8,4.5,6.5,10,13,41,14, 21.5,10.5,6.5,0,15.5,24,16, 0,0,0,4.5,1,4,6.5, 18,5,7,10,32.5,18.5,8) 

library(GAD)
Typeofputter <- as.fixed(Typeofputter)
LengthofPutt <- as.fixed(LengthofPutt)
Slopeofputt <- as.fixed(Slopeofputt)
Breakofputt <- as.fixed(Breakofputt)

Dat3 <- data.frame(LengthofPutt, Typeofputter, Breakofputt, Slopeofputt, DistancefromCup)
Model <- lm(DistancefromCup~LengthofPutt*Typeofputter*Breakofputt*Slopeofputt, data = Dat3)
coef(Model)
##                                           (Intercept) 
##                                            15.4285714 
##                                         LengthofPutt1 
##                                             0.2142857 
##                                         Typeofputter1 
##                                            -7.3571429 
##                                          Breakofputt1 
##                                            -4.0000000 
##                                          Slopeofputt1 
##                                            -5.3571429 
##                           LengthofPutt1:Typeofputter1 
##                                             6.2857143 
##                            LengthofPutt1:Breakofputt1 
##                                             5.7857143 
##                            Typeofputter1:Breakofputt1 
##                                             2.8571429 
##                            LengthofPutt1:Slopeofputt1 
##                                             5.7142857 
##                            Typeofputter1:Slopeofputt1 
##                                             4.7142857 
##                             Breakofputt1:Slopeofputt1 
##                                             7.7857143 
##              LengthofPutt1:Typeofputter1:Breakofputt1 
##                                            -9.4285714 
##              LengthofPutt1:Typeofputter1:Slopeofputt1 
##                                             0.6428571 
##               LengthofPutt1:Breakofputt1:Slopeofputt1 
##                                           -12.1428571 
##               Typeofputter1:Breakofputt1:Slopeofputt1 
##                                           -11.7857143 
## LengthofPutt1:Typeofputter1:Breakofputt1:Slopeofputt1 
##                                            14.7857143

Running GAD Model:

Model <- aov(Model)
gad(Model)
## Analysis of Variance Table
## 
## Response: DistancefromCup
##                                                   Df Sum Sq Mean Sq F value
## LengthofPutt                                       1  917.1  917.15 10.5878
## Typeofputter                                       1  388.1  388.15  4.4809
## Breakofputt                                        1  145.1  145.15  1.6756
## Slopeofputt                                        1    1.4    1.40  0.0161
## LengthofPutt:Typeofputter                          1  218.7  218.68  2.5245
## LengthofPutt:Breakofputt                           1   11.9   11.90  0.1373
## Typeofputter:Breakofputt                           1  115.0  115.02  1.3278
## LengthofPutt:Slopeofputt                           1   93.8   93.81  1.0829
## Typeofputter:Slopeofputt                           1   56.4   56.43  0.6515
## Breakofputt:Slopeofputt                            1    1.6    1.63  0.0188
## LengthofPutt:Typeofputter:Breakofputt              1    7.3    7.25  0.0837
## LengthofPutt:Typeofputter:Slopeofputt              1  113.0  113.00  1.3045
## LengthofPutt:Breakofputt:Slopeofputt               1   39.5   39.48  0.4558
## Typeofputter:Breakofputt:Slopeofputt               1   33.8   33.77  0.3899
## LengthofPutt:Typeofputter:Breakofputt:Slopeofputt  1   95.6   95.65  1.1042
## Residual                                          96 8315.8   86.62        
##                                                     Pr(>F)   
## LengthofPutt                                      0.001572 **
## Typeofputter                                      0.036862 * 
## Breakofputt                                       0.198615   
## Slopeofputt                                       0.899280   
## LengthofPutt:Typeofputter                         0.115377   
## LengthofPutt:Breakofputt                          0.711776   
## Typeofputter:Breakofputt                          0.252054   
## LengthofPutt:Slopeofputt                          0.300658   
## Typeofputter:Slopeofputt                          0.421588   
## Breakofputt:Slopeofputt                           0.891271   
## LengthofPutt:Typeofputter:Breakofputt             0.772939   
## LengthofPutt:Typeofputter:Slopeofputt             0.256228   
## LengthofPutt:Breakofputt:Slopeofputt              0.501207   
## Typeofputter:Breakofputt:Slopeofputt              0.533858   
## LengthofPutt:Typeofputter:Breakofputt:Slopeofputt 0.295994   
## Residual                                                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
--> Length of putt and type of putter are the only factors with a significant effect on putting performance. Their p-values in the ANOVA table are 0.001572 and 0.036862, respectively, both below 0.05, so we reject the null hypothesis for these two main effects. None of the interactions is significant.
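Since only two main effects are significant, a natural follow-up is to refit the model with just those terms. The sketch below is not part of the original solution: it uses base R's `aov()` rather than GAD and builds the two design columns with nested `rep()` calls that reproduce the run order of the data read in above. The names `reps`, `dat`, `fit_reduced`, and `tab` are illustrative.

```r
reps <- 7  # putts per treatment combination

# Same run order as the original design columns
Typeofputter <- factor(rep(rep(c(-1, -1, 1, 1), each = reps), times = 4))
LengthofPutt <- factor(rep(rep(c(-1, 1), each = reps), times = 8))

DistancefromCup <- c(
  10, 18, 14, 12.5, 19, 16, 18.5,      0, 16.5, 4.5, 17.5, 20.5, 17.5, 33,
  4, 6, 1, 14.5, 12, 14, 5,            0, 10, 34, 11, 25.5, 21.5, 0,
  0, 0, 18.5, 19.5, 16, 15, 11,        5, 20.5, 18, 20, 29.5, 19, 10,
  6.5, 18.5, 7.5, 6, 0, 10, 0,         16.5, 4.5, 0, 23.5, 8, 8, 8,
  4.5, 18, 14.5, 10, 0, 17.5, 6,       19.5, 18, 16, 5.5, 10, 7, 36,
  15, 16, 8.5, 0, 0.5, 9, 3,           41.5, 39, 6.5, 3.5, 7, 8.5, 36,
  8, 4.5, 6.5, 10, 13, 41, 14,         21.5, 10.5, 6.5, 0, 15.5, 24, 16,
  0, 0, 0, 4.5, 1, 4, 6.5,             18, 5, 7, 10, 32.5, 18.5, 8)

dat <- data.frame(LengthofPutt, Typeofputter, DistancefromCup)

# Reduced model: only the two significant main effects
fit_reduced <- aov(DistancefromCup ~ LengthofPutt + Typeofputter, data = dat)
tab <- summary(fit_reduced)[[1]]
print(tab)
```

Because the design is balanced, the sums of squares for the two retained factors are unchanged; the terms that were dropped are pooled into the residual, which changes the error degrees of freedom and the F statistics.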

PART B:

Model Adequacy:

autoplot(Model)

--> Looking at the “Normal Q-Q” and “Residuals vs Fitted” plots, the model appears inadequate: although the residuals look approximately normal, their spread varies widely across the fitted values.

4 Question 6.36:

  1. Estimate the factor effects. Plot the effect estimates on a normal probability plot and select a tentative model.

  2. Fit the model identified in part (a) and analyze the residuals. Is there any indication of model inadequacy?

  3. Repeat the analysis from parts (a) and (b) using ln (y) as the response variable. Is there an indication that the transformation has been useful?

  4. Fit a model in terms of the coded variables that can be used to predict the resistivity.

4.1 Solution:

Reading the Data:

A<-rep(c(-1,1),8)
B<-rep(c(-1,-1,1,1),4)
C<-rep(c(rep(-1,4),rep(1,4)),2)
D<-c(rep(-1,8),rep(1,8))

Resistivity<-c(1.92,11.28,1.09,5.75,2.13,9.53,1.03,5.35,1.60,11.73,1.16,4.68,2.16,9.11,1.07,5.30)

Dat<-data.frame(A,B,C,D,Resistivity)
Dat
##     A  B  C  D Resistivity
## 1  -1 -1 -1 -1        1.92
## 2   1 -1 -1 -1       11.28
## 3  -1  1 -1 -1        1.09
## 4   1  1 -1 -1        5.75
## 5  -1 -1  1 -1        2.13
## 6   1 -1  1 -1        9.53
## 7  -1  1  1 -1        1.03
## 8   1  1  1 -1        5.35
## 9  -1 -1 -1  1        1.60
## 10  1 -1 -1  1       11.73
## 11 -1  1 -1  1        1.16
## 12  1  1 -1  1        4.68
## 13 -1 -1  1  1        2.16
## 14  1 -1  1  1        9.11
## 15 -1  1  1  1        1.07
## 16  1  1  1  1        5.30

PART A:

ModeL <- lm(Resistivity~A*B*C*D, data=Dat)
coef(ModeL)
## (Intercept)           A           B           C           D         A:B 
##    4.680625    3.160625   -1.501875   -0.220625   -0.079375   -1.069375 
##         A:C         B:C         A:D         B:D         C:D       A:B:C 
##   -0.298125    0.229375   -0.056875   -0.046875    0.029375    0.344375 
##       A:B:D       A:C:D       B:C:D     A:B:C:D 
##   -0.096875   -0.010625    0.094375    0.141875
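The values returned by `coef()` are regression coefficients on the -1/+1 coded scale; in a two-level design each factor effect is twice the corresponding coefficient. As a minimal sketch, the effects can be computed and ranked by magnitude before looking at the half-normal plot. The names `fit`, `effects_est`, and `ranked` are illustrative.

```r
A <- rep(c(-1, 1), 8)
B <- rep(c(-1, -1, 1, 1), 4)
C <- rep(c(rep(-1, 4), rep(1, 4)), 2)
D <- c(rep(-1, 8), rep(1, 8))
Resistivity <- c(1.92, 11.28, 1.09, 5.75, 2.13, 9.53, 1.03, 5.35,
                 1.60, 11.73, 1.16, 4.68, 2.16, 9.11, 1.07, 5.30)

# Saturated model: one coefficient per effect, no residual degrees of freedom
fit <- lm(Resistivity ~ A * B * C * D)

# Effect = 2 x coefficient; drop the intercept and sort by absolute size
effects_est <- 2 * coef(fit)[-1]
ranked <- effects_est[order(abs(effects_est), decreasing = TRUE)]
print(round(ranked, 4))
```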

Half Normal Plot:

library(DoE.base)
halfnormal(ModeL)

We see that the significant effects are A, B, A:B, and A:B:C.

--> The tentative model equation is Resistivity = 4.680625 + (3.160625)A + (-1.501875)B + (-1.069375)AB + Error

PART B:

Model Adequacy:

An <- as.fixed(A)
Bn <- as.fixed(B)
Dat1 <- data.frame(An,Bn,Resistivity)
Model <- aov(Resistivity~An*Bn, data=Dat1)
GAD::gad(Model)
## Analysis of Variance Table
## 
## Response: Resistivity
##          Df  Sum Sq Mean Sq F value    Pr(>F)    
## An        1 159.833 159.833 333.088 4.049e-10 ***
## Bn        1  36.090  36.090  75.211 1.630e-06 ***
## An:Bn     1  18.297  18.297  38.130 4.763e-05 ***
## Residual 12   5.758   0.480                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
autoplot(Model)

--> Looking at the “Normal Q-Q” and “Residuals vs Fitted” plots, the model appears inadequate: the residuals are neither normally distributed nor of constant variance. The ANOVA shows that A, B, and the AB interaction are all significant at the 0.05 level.

PART C: (Transformation for Part a)

Lresistivity <- log(Resistivity)
Dat2<-data.frame(A,B,C,D,Lresistivity)

Model <- lm(Lresistivity~A*B*C*D, data=Dat2)
coef(Model)

Half Normal Plot:

halfnormal(Model)

--> Apart from the main effects A and B, the remaining effects fall close to the line on the half-normal plot. In particular, the ABC interaction no longer stands out, so we no longer consider it significant.

Now testing the main effects:

Ac <- as.fixed(A)
Bc <- as.fixed(B)
Dat3 <- data.frame(Ac,Bc,Lresistivity)
Model <- aov(Lresistivity~Ac+Bc, data=Dat3)
GAD::gad(Model)
## Analysis of Variance Table
## 
## Response: Lresistivity
##          Df  Sum Sq Mean Sq F value    Pr(>F)    
## Ac        1 10.5721 10.5721  962.95 1.408e-13 ***
## Bc        1  1.5803  1.5803  143.94 2.095e-08 ***
## Residual 13  0.1427  0.0110                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
autoplot(Model)

--> The residual variance is more stable after the transformation, and the normal probability plot looks fairly linear. The model seems to be adequate.

Transformation for Part b:

Dat4 <- data.frame(An,Bn,Lresistivity)
Model <- aov(Lresistivity~An+Bn, data=Dat4)
GAD::gad(Model)
## Analysis of Variance Table
## 
## Response: Lresistivity
##          Df  Sum Sq Mean Sq F value    Pr(>F)    
## An        1 10.5721 10.5721  962.95 1.408e-13 ***
## Bn        1  1.5803  1.5803  143.94 2.095e-08 ***
## Residual 13  0.1427  0.0110                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
autoplot(Model)

--> The residual variance is more stable after the transformation, and the normal probability plot looks fairly linear. The model seems to be adequate. Factors A and B are significant according to their p-values.
Dat5 <- data.frame(Lresistivity,A,B)
Model <- lm(Lresistivity~A+B, data=Dat5)
coef(Model)
## (Intercept)           A           B 
##   1.1854171   0.8128703  -0.3142776

PART D:

--> log(Resistivity) = 1.1854171 + (0.8128703)A + (-0.3142776)B + Error.
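Because the model is fitted to ln(y), predictions have to be back-transformed to obtain resistivity. The sketch below illustrates this at the four A, B corner points; the names `fit_log` and `grid` are illustrative. Note that exponentiating a predicted log value estimates the median rather than the mean of resistivity on the original scale.

```r
A <- rep(c(-1, 1), 8)
B <- rep(c(-1, -1, 1, 1), 4)
Resistivity <- c(1.92, 11.28, 1.09, 5.75, 2.13, 9.53, 1.03, 5.35,
                 1.60, 11.73, 1.16, 4.68, 2.16, 9.11, 1.07, 5.30)

# Main-effects model on the log scale, in coded units
fit_log <- lm(log(Resistivity) ~ A + B)
print(coef(fit_log))

# Predict at the four corners, then back-transform with exp()
grid <- expand.grid(A = c(-1, 1), B = c(-1, 1))
grid$pred_log <- predict(fit_log, newdata = grid)
grid$pred_resistivity <- exp(grid$pred_log)
print(grid)
```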

5 Question 6.39:

Reading the Data:

library(DoE.base)
A <- c(-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1)
B <- c(-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1)
C <- c(-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,1,1,1,1)
D <- c(-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1)
E <- c(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
Obs <- c(8.11,5.56,5.77,5.82,9.17,7.8,3.23,5.69,8.82,14.23,9.2,8.94,8.68,11.49,6.25,9.12,7.93,5,7.47,12,9.86,3.65,6.4,11.61,12.43,17.55,8.87,25.38,13.06,18.85,11.78,26.05)
Data <- data.frame(A,B,C,D,E,Obs)

PART A:

Model <- lm(Obs~A*B*C*D*E,data = Data)
coef(Model)
## (Intercept)           A           B           C           D           E 
##  10.1803125   1.6159375   0.0434375  -0.0121875   2.9884375   2.1878125 
##         A:B         A:C         B:C         A:D         B:D         C:D 
##   1.2365625  -0.0015625  -0.1953125   1.6665625  -0.0134375   0.0034375 
##         A:E         B:E         C:E         D:E       A:B:C       A:B:D 
##   1.0271875   1.2834375   0.3015625   1.3896875   0.2503125  -0.3453125 
##       A:C:D       B:C:D       A:B:E       A:C:E       B:C:E       A:D:E 
##  -0.0634375   0.3053125   1.1853125  -0.2590625   0.1709375   0.9015625 
##       B:D:E       C:D:E     A:B:C:D     A:B:C:E     A:B:D:E     A:C:D:E 
##  -0.0396875   0.3959375  -0.0740625  -0.1846875   0.4071875   0.1278125 
##     B:C:D:E   A:B:C:D:E 
##  -0.0746875  -0.3553125
halfnormal(Model)

summary(Model)
## 
## Call:
## lm.default(formula = Obs ~ A * B * C * D * E, data = Data)
## 
## Residuals:
## ALL 32 residuals are 0: no residual degrees of freedom!
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.180312        NaN     NaN      NaN
## A            1.615938        NaN     NaN      NaN
## B            0.043438        NaN     NaN      NaN
## C           -0.012187        NaN     NaN      NaN
## D            2.988437        NaN     NaN      NaN
## E            2.187813        NaN     NaN      NaN
## A:B          1.236562        NaN     NaN      NaN
## A:C         -0.001563        NaN     NaN      NaN
## B:C         -0.195313        NaN     NaN      NaN
## A:D          1.666563        NaN     NaN      NaN
## B:D         -0.013438        NaN     NaN      NaN
## C:D          0.003437        NaN     NaN      NaN
## A:E          1.027188        NaN     NaN      NaN
## B:E          1.283437        NaN     NaN      NaN
## C:E          0.301563        NaN     NaN      NaN
## D:E          1.389687        NaN     NaN      NaN
## A:B:C        0.250313        NaN     NaN      NaN
## A:B:D       -0.345312        NaN     NaN      NaN
## A:C:D       -0.063437        NaN     NaN      NaN
## B:C:D        0.305312        NaN     NaN      NaN
## A:B:E        1.185313        NaN     NaN      NaN
## A:C:E       -0.259062        NaN     NaN      NaN
## B:C:E        0.170938        NaN     NaN      NaN
## A:D:E        0.901563        NaN     NaN      NaN
## B:D:E       -0.039687        NaN     NaN      NaN
## C:D:E        0.395938        NaN     NaN      NaN
## A:B:C:D     -0.074063        NaN     NaN      NaN
## A:B:C:E     -0.184688        NaN     NaN      NaN
## A:B:D:E      0.407187        NaN     NaN      NaN
## A:C:D:E      0.127812        NaN     NaN      NaN
## B:C:D:E     -0.074688        NaN     NaN      NaN
## A:B:C:D:E   -0.355312        NaN     NaN      NaN
## 
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:    NaN 
## F-statistic:   NaN on 31 and 0 DF,  p-value: NA
Model2 <- aov(Obs~A+B+D+E+A*B+A*D+A*E+B*E+D*E+A*B*E+A*D*E,data = Data)
summary(Model2)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## A            1  83.56   83.56  51.362 6.10e-07 ***
## B            1   0.06    0.06   0.037 0.849178    
## D            1 285.78  285.78 175.664 2.30e-11 ***
## E            1 153.17  153.17  94.149 5.24e-09 ***
## A:B          1  48.93   48.93  30.076 2.28e-05 ***
## A:D          1  88.88   88.88  54.631 3.87e-07 ***
## A:E          1  33.76   33.76  20.754 0.000192 ***
## B:E          1  52.71   52.71  32.400 1.43e-05 ***
## D:E          1  61.80   61.80  37.986 5.07e-06 ***
## A:B:E        1  44.96   44.96  27.635 3.82e-05 ***
## A:D:E        1  26.01   26.01  15.988 0.000706 ***
## Residuals   20  32.54    1.63                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
--> The half-normal plot flags A, D, E, A:D, D:E, B:E, A:B, A:E, A:B:E, and A:D:E as significant. The ANOVA agrees: A, D, E, AB, AD, AE, BE, DE, ABE, and ADE are all significant.
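The selection from the half-normal plot can be cross-checked by ranking the estimated effects (twice the coded coefficients) by absolute size. This is a minimal sketch; the design columns are generated with `rep()` in the same run order as the data above, and `fit`, `effects_est`, and `ranked` are illustrative names.

```r
# 2^5 design in standard order
A <- rep(c(-1, 1), 16)
B <- rep(c(-1, -1, 1, 1), 8)
C <- rep(rep(c(-1, 1), each = 4), 4)
D <- rep(rep(c(-1, 1), each = 8), 2)
E <- rep(c(-1, 1), each = 16)
Obs <- c(8.11, 5.56, 5.77, 5.82, 9.17, 7.8, 3.23, 5.69,
         8.82, 14.23, 9.2, 8.94, 8.68, 11.49, 6.25, 9.12,
         7.93, 5, 7.47, 12, 9.86, 3.65, 6.4, 11.61,
         12.43, 17.55, 8.87, 25.38, 13.06, 18.85, 11.78, 26.05)

fit <- lm(Obs ~ A * B * C * D * E)

# Effect = 2 x coefficient; rank by magnitude and show the ten largest
effects_est <- 2 * coef(fit)[-1]
ranked <- effects_est[order(abs(effects_est), decreasing = TRUE)]
print(round(head(ranked, 10), 4))
```

The ten largest effects in magnitude should be the same ten terms that the plot and the ANOVA identify, with a clear gap before the eleventh.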

PART B:

autoplot(Model2)

--> Looking at the “Normal Q-Q” and “Residuals vs Fitted” plots, the model appears inadequate: although the residuals look approximately normal, their spread varies widely across the fitted values.

PART C:

A <- c(-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1)
B <- c(-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1)
D <- c(-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1)
E <- c(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
Obs <- c(8.11,5.56,5.77,5.82,9.17,7.8,3.23,5.69,8.82,14.23,9.2,8.94,8.68,11.49,6.25,9.12,7.93,5,7.47,12,9.86,3.65,6.4,11.61,12.43,17.55,8.87,25.38,13.06,18.85,11.78,26.05)
Data <- data.frame(A,B,D,E,Obs)

Model <- lm(Obs~A*B*D*E,data = Data)
coef(Model)
## (Intercept)           A           B           D           E         A:B 
##  10.1803125   1.6159375   0.0434375   2.9884375   2.1878125   1.2365625 
##         A:D         B:D         A:E         B:E         D:E       A:B:D 
##   1.6665625  -0.0134375   1.0271875   1.2834375   1.3896875  -0.3453125 
##       A:B:E       A:D:E       B:D:E     A:B:D:E 
##   1.1853125   0.9015625  -0.0396875   0.4071875
halfnormal(Model)

summary(Model)
## 
## Call:
## lm.default(formula = Obs ~ A * B * D * E, data = Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.4750 -0.5637  0.0000  0.5637  1.4750 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 10.18031    0.21360  47.661  < 2e-16 ***
## A            1.61594    0.21360   7.565 1.14e-06 ***
## B            0.04344    0.21360   0.203 0.841418    
## D            2.98844    0.21360  13.991 2.16e-10 ***
## E            2.18781    0.21360  10.243 1.97e-08 ***
## A:B          1.23656    0.21360   5.789 2.77e-05 ***
## A:D          1.66656    0.21360   7.802 7.66e-07 ***
## B:D         -0.01344    0.21360  -0.063 0.950618    
## A:E          1.02719    0.21360   4.809 0.000193 ***
## B:E          1.28344    0.21360   6.009 1.82e-05 ***
## D:E          1.38969    0.21360   6.506 7.24e-06 ***
## A:B:D       -0.34531    0.21360  -1.617 0.125501    
## A:B:E        1.18531    0.21360   5.549 4.40e-05 ***
## A:D:E        0.90156    0.21360   4.221 0.000650 ***
## B:D:E       -0.03969    0.21360  -0.186 0.854935    
## A:B:D:E      0.40719    0.21360   1.906 0.074735 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.208 on 16 degrees of freedom
## Multiple R-squared:  0.9744, Adjusted R-squared:  0.9504 
## F-statistic: 40.58 on 15 and 16 DF,  p-value: 7.07e-10
Model2 <- aov(Obs~A+B+D+E+A*B+A*D+A*E+B*E+D*E+A*B*E+A*D*E,data = Data)
summary(Model2)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## A            1  83.56   83.56  51.362 6.10e-07 ***
## B            1   0.06    0.06   0.037 0.849178    
## D            1 285.78  285.78 175.664 2.30e-11 ***
## E            1 153.17  153.17  94.149 5.24e-09 ***
## A:B          1  48.93   48.93  30.076 2.28e-05 ***
## A:D          1  88.88   88.88  54.631 3.87e-07 ***
## A:E          1  33.76   33.76  20.754 0.000192 ***
## B:E          1  52.71   52.71  32.400 1.43e-05 ***
## D:E          1  61.80   61.80  37.986 5.07e-06 ***
## A:B:E        1  44.96   44.96  27.635 3.82e-05 ***
## A:D:E        1  26.01   26.01  15.988 0.000706 ***
## Residuals   20  32.54    1.63                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
autoplot(Model2)

--> Factor C was not significant, so it was dropped. With the design projected onto the four remaining factors, the results match Part A: the ANOVA again identifies A, D, E, AB, AD, AE, BE, DE, ABE, and ADE as significant.

PART D:

--> y = 10.1803125 + (1.6159375)A + (0.0434375)B + (2.9884375)D + (2.1878125)E + Error.

All of the main-effect coefficients are positive, so setting A, B, D, and E at their high level (+1) maximizes the predicted response.
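The setting that maximizes the prediction can be confirmed by evaluating a fitted model at all 16 combinations of A, B, D, and E. The sketch below assumes the reduced model from the ANOVA (main effects plus the significant interactions) rather than the main-effects-only equation; `dat`, `fit_red`, `grid`, and `best` are illustrative names.

```r
A <- rep(c(-1, 1), 16)
B <- rep(c(-1, -1, 1, 1), 8)
D <- rep(rep(c(-1, 1), each = 8), 2)
E <- rep(c(-1, 1), each = 16)
Obs <- c(8.11, 5.56, 5.77, 5.82, 9.17, 7.8, 3.23, 5.69,
         8.82, 14.23, 9.2, 8.94, 8.68, 11.49, 6.25, 9.12,
         7.93, 5, 7.47, 12, 9.86, 3.65, 6.4, 11.61,
         12.43, 17.55, 8.87, 25.38, 13.06, 18.85, 11.78, 26.05)
dat <- data.frame(A, B, D, E, Obs)

# Reduced model: same terms as the ANOVA fit, written out explicitly
fit_red <- lm(Obs ~ A + B + D + E + A:B + A:D + A:E + B:E + D:E +
                A:B:E + A:D:E, data = dat)

# Evaluate the prediction at every corner of the four-factor region
grid <- expand.grid(A = c(-1, 1), B = c(-1, 1), D = c(-1, 1), E = c(-1, 1))
grid$pred <- predict(fit_red, newdata = grid)

best <- grid[which.max(grid$pred), ]
print(best)
```

Every coefficient in this reduced model is positive, so each term reaches its largest value when all four factors are at +1, which is why the search lands on that corner.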