HW11DOE

QUESTION 6.36

Getting in data

library(DoE.base)
A<- c(-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1)
B<- c(-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1)
C<- c(-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,1,1,1,1)
D<- c(-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1)
obs<- c(1.92,11.28,1.09,5.75,2.13,9.53,1.03,5.35,1.6,11.73,1.16,4.68,2.16,9.11,1.07,5.3)
dat<- data.frame(A,B,C,D,obs)

6.36) A

model<-lm(obs~A*B*C*D,data=dat)
coef(model)

## (Intercept)           A           B           C           D         A:B 
##    4.680625    3.160625   -1.501875   -0.220625   -0.079375   -1.069375 
##         A:C         B:C         A:D         B:D         C:D       A:B:C 
##   -0.298125    0.229375   -0.056875   -0.046875    0.029375    0.344375 
##       A:B:D       A:C:D       B:C:D     A:B:C:D 
##   -0.096875   -0.010625    0.094375    0.141875

halfnormal(model)

## 
## Significant effects (alpha=0.05, Lenth method):

## [1] A     B     A:B   A:B:C

From the Half-normal plot,we can see that Factors A,B, A:B , and A:B:C are significant at alpha=0.5

summary(model)

## 
## Call:
## lm.default(formula = obs ~ A * B * C * D, data = dat)
## 
## Residuals:
## ALL 16 residuals are 0: no residual degrees of freedom!
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  4.68062        NaN     NaN      NaN
## A            3.16062        NaN     NaN      NaN
## B           -1.50187        NaN     NaN      NaN
## C           -0.22062        NaN     NaN      NaN
## D           -0.07937        NaN     NaN      NaN
## A:B         -1.06938        NaN     NaN      NaN
## A:C         -0.29812        NaN     NaN      NaN
## B:C          0.22937        NaN     NaN      NaN
## A:D         -0.05687        NaN     NaN      NaN
## B:D         -0.04688        NaN     NaN      NaN
## C:D          0.02937        NaN     NaN      NaN
## A:B:C        0.34437        NaN     NaN      NaN
## A:B:D       -0.09688        NaN     NaN      NaN
## A:C:D       -0.01063        NaN     NaN      NaN
## B:C:D        0.09438        NaN     NaN      NaN
## A:B:C:D      0.14188        NaN     NaN      NaN
## 
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:    NaN 
## F-statistic:   NaN on 15 and 0 DF,  p-value: NA

To select our tentative model, we would consider the significant factors from our half-normal plots,which are factors A,B,A:B,A:B:C but we would neglect the interaction effect A:B:C because they do not differ in such a large amount from our model distribution.

So therefore our tentative model can be written as

\(Y_{i,j,k}=4.68062+3.16062\alpha _{i}-1.50187\beta_{j}-1.06938\alpha\beta_{ij}+\epsilon_{ijk}\)

6.36) B

model2<- aov(obs~A+B+C+A*B+A*B*C,data =dat)
summary(model2)

##             Df Sum Sq Mean Sq  F value   Pr(>F)    
## A            1 159.83  159.83 1563.061 1.84e-10 ***
## B            1  36.09   36.09  352.937 6.66e-08 ***
## C            1   0.78    0.78    7.616  0.02468 *  
## A:B          1  18.30   18.30  178.933 9.33e-07 ***
## A:C          1   1.42    1.42   13.907  0.00579 ** 
## B:C          1   0.84    0.84    8.232  0.02085 *  
## A:B:C        1   1.90    1.90   18.556  0.00259 ** 
## Residuals    8   0.82    0.10                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(model2)

## hat values (leverages) are all = 0.5
##  and there are no factor predictors; no plot no. 5

From the Normal Q-Q plot, we can’t assume Normality in the data since the data points in the plot doesn’t appears to fall on a straight line

Also, from the Residuals vs fitted plot, we can’t assume constant variance since all the residuals have varying spread, depleting that the variances are not equal.

6.36) C

Using a Log transformation on our data below

logobs <- log(obs)
dat2 <- data.frame(A,B,C,D,logobs)
model3<- lm(logobs~A*B*C*D,data = dat2)
coef(model3)

##  (Intercept)            A            B            C            D          A:B 
##  1.185417116  0.812870345 -0.314277554 -0.006408558 -0.018077390 -0.024684570 
##          A:C          B:C          A:D          B:D          C:D        A:B:C 
## -0.039723700 -0.004225796 -0.009578245  0.003708723  0.017780432  0.063434408 
##        A:B:D        A:C:D        B:C:D      A:B:C:D 
## -0.029875960 -0.003740235  0.003765760  0.031322043

halfnormal(model3)

## 
## Significant effects (alpha=0.05, Lenth method):

## [1] A     B     A:B:C

summary(model3)

## 
## Call:
## lm.default(formula = logobs ~ A * B * C * D, data = dat2)
## 
## Residuals:
## ALL 16 residuals are 0: no residual degrees of freedom!
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.185417        NaN     NaN      NaN
## A            0.812870        NaN     NaN      NaN
## B           -0.314278        NaN     NaN      NaN
## C           -0.006409        NaN     NaN      NaN
## D           -0.018077        NaN     NaN      NaN
## A:B         -0.024685        NaN     NaN      NaN
## A:C         -0.039724        NaN     NaN      NaN
## B:C         -0.004226        NaN     NaN      NaN
## A:D         -0.009578        NaN     NaN      NaN
## B:D          0.003709        NaN     NaN      NaN
## C:D          0.017780        NaN     NaN      NaN
## A:B:C        0.063434        NaN     NaN      NaN
## A:B:D       -0.029876        NaN     NaN      NaN
## A:C:D       -0.003740        NaN     NaN      NaN
## B:C:D        0.003766        NaN     NaN      NaN
## A:B:C:D      0.031322        NaN     NaN      NaN
## 
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:    NaN 
## F-statistic:   NaN on 15 and 0 DF,  p-value: NA

model4<-aov(logobs~A+B+C+A*B*C,data=dat2)
summary(model4)

##             Df Sum Sq Mean Sq  F value   Pr(>F)    
## A            1 10.572  10.572 1994.556 6.98e-11 ***
## B            1  1.580   1.580  298.147 1.29e-07 ***
## C            1  0.001   0.001    0.124  0.73386    
## A:B          1  0.010   0.010    1.839  0.21207    
## A:C          1  0.025   0.025    4.763  0.06063 .  
## B:C          1  0.000   0.000    0.054  0.82223    
## A:B:C        1  0.064   0.064   12.147  0.00826 ** 
## Residuals    8  0.042   0.005                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(model4)

After the Log transformation, we can observe that from the half-normal plots that only factors A, B, and A:B:C are significant and interactions of A:B seems insignificant as compared to pre-log transformation.

The confirmation of the half normal plot was double checked using ANOVA analysis that showed factors A,B, and A:B:C to be significant as shown above.

Based on the residuals plots, after the log transformation was performed we observed that

From the Normal Q-Q plot, we can assume Normality in the data since the data points in the plot does appears to fall on a straight line

Also, from the Residuals vs fitted plot, we can assume constant variance since all the residuals have same spread, depleting that the variances are equal.

6.36) D

Fitting a model of the coded variables we have

\(Y_{i,j,k,l}=1.185417+0.812870\alpha _{i}-0.314278\beta_{j}+\epsilon_{ijkl}\)

QUESTION 6.39

Getting in the data we have

library(DoE.base)
A<-c(-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1)
B<-c(-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1)
C<-c(-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,1,1,1,1)
D<-c(-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1)
E<- c(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
obs<- c(8.11,5.56,5.77,5.82,9.17,7.8,3.23,5.69,8.82,14.23,9.2,8.94,8.68,11.49,6.25,9.12,7.93,5,7.47,12,9.86,3.65,6.4,11.61,12.43,17.55,8.87,25.38,13.06,18.85,11.78,26.05)
dat<- data.frame(A,B,C,D,E,obs)

6.39) A

model<-lm(obs~A*B*C*D*E,data=dat)
coef(model)

## (Intercept)           A           B           C           D           E 
##  10.1803125   1.6159375   0.0434375  -0.0121875   2.9884375   2.1878125 
##         A:B         A:C         B:C         A:D         B:D         C:D 
##   1.2365625  -0.0015625  -0.1953125   1.6665625  -0.0134375   0.0034375 
##         A:E         B:E         C:E         D:E       A:B:C       A:B:D 
##   1.0271875   1.2834375   0.3015625   1.3896875   0.2503125  -0.3453125 
##       A:C:D       B:C:D       A:B:E       A:C:E       B:C:E       A:D:E 
##  -0.0634375   0.3053125   1.1853125  -0.2590625   0.1709375   0.9015625 
##       B:D:E       C:D:E     A:B:C:D     A:B:C:E     A:B:D:E     A:C:D:E 
##  -0.0396875   0.3959375  -0.0740625  -0.1846875   0.4071875   0.1278125 
##     B:C:D:E   A:B:C:D:E 
##  -0.0746875  -0.3553125

halfnormal(model)

## 
## Significant effects (alpha=0.05, Lenth method):

##  [1] D     E     A:D   A     D:E   B:E   A:B   A:B:E A:E   A:D:E

summary(model)

## 
## Call:
## lm.default(formula = obs ~ A * B * C * D * E, data = dat)
## 
## Residuals:
## ALL 32 residuals are 0: no residual degrees of freedom!
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.180312        NaN     NaN      NaN
## A            1.615938        NaN     NaN      NaN
## B            0.043438        NaN     NaN      NaN
## C           -0.012187        NaN     NaN      NaN
## D            2.988437        NaN     NaN      NaN
## E            2.187813        NaN     NaN      NaN
## A:B          1.236562        NaN     NaN      NaN
## A:C         -0.001563        NaN     NaN      NaN
## B:C         -0.195313        NaN     NaN      NaN
## A:D          1.666563        NaN     NaN      NaN
## B:D         -0.013438        NaN     NaN      NaN
## C:D          0.003437        NaN     NaN      NaN
## A:E          1.027188        NaN     NaN      NaN
## B:E          1.283437        NaN     NaN      NaN
## C:E          0.301563        NaN     NaN      NaN
## D:E          1.389687        NaN     NaN      NaN
## A:B:C        0.250313        NaN     NaN      NaN
## A:B:D       -0.345312        NaN     NaN      NaN
## A:C:D       -0.063437        NaN     NaN      NaN
## B:C:D        0.305312        NaN     NaN      NaN
## A:B:E        1.185313        NaN     NaN      NaN
## A:C:E       -0.259062        NaN     NaN      NaN
## B:C:E        0.170938        NaN     NaN      NaN
## A:D:E        0.901563        NaN     NaN      NaN
## B:D:E       -0.039687        NaN     NaN      NaN
## C:D:E        0.395938        NaN     NaN      NaN
## A:B:C:D     -0.074063        NaN     NaN      NaN
## A:B:C:E     -0.184688        NaN     NaN      NaN
## A:B:D:E      0.407187        NaN     NaN      NaN
## A:C:D:E      0.127812        NaN     NaN      NaN
## B:C:D:E     -0.074688        NaN     NaN      NaN
## A:B:C:D:E   -0.355312        NaN     NaN      NaN
## 
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:    NaN 
## F-statistic:   NaN on 31 and 0 DF,  p-value: NA

summary(model)

## 
## Call:
## lm.default(formula = obs ~ A * B * C * D * E, data = dat)
## 
## Residuals:
## ALL 32 residuals are 0: no residual degrees of freedom!
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.180312        NaN     NaN      NaN
## A            1.615938        NaN     NaN      NaN
## B            0.043438        NaN     NaN      NaN
## C           -0.012187        NaN     NaN      NaN
## D            2.988437        NaN     NaN      NaN
## E            2.187813        NaN     NaN      NaN
## A:B          1.236562        NaN     NaN      NaN
## A:C         -0.001563        NaN     NaN      NaN
## B:C         -0.195313        NaN     NaN      NaN
## A:D          1.666563        NaN     NaN      NaN
## B:D         -0.013438        NaN     NaN      NaN
## C:D          0.003437        NaN     NaN      NaN
## A:E          1.027188        NaN     NaN      NaN
## B:E          1.283437        NaN     NaN      NaN
## C:E          0.301563        NaN     NaN      NaN
## D:E          1.389687        NaN     NaN      NaN
## A:B:C        0.250313        NaN     NaN      NaN
## A:B:D       -0.345312        NaN     NaN      NaN
## A:C:D       -0.063437        NaN     NaN      NaN
## B:C:D        0.305312        NaN     NaN      NaN
## A:B:E        1.185313        NaN     NaN      NaN
## A:C:E       -0.259062        NaN     NaN      NaN
## B:C:E        0.170938        NaN     NaN      NaN
## A:D:E        0.901563        NaN     NaN      NaN
## B:D:E       -0.039687        NaN     NaN      NaN
## C:D:E        0.395938        NaN     NaN      NaN
## A:B:C:D     -0.074063        NaN     NaN      NaN
## A:B:C:E     -0.184688        NaN     NaN      NaN
## A:B:D:E      0.407187        NaN     NaN      NaN
## A:C:D:E      0.127812        NaN     NaN      NaN
## B:C:D:E     -0.074688        NaN     NaN      NaN
## A:B:C:D:E   -0.355312        NaN     NaN      NaN
## 
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:    NaN 
## F-statistic:   NaN on 31 and 0 DF,  p-value: NA

model2<- aov(obs~A+B+D+E+A*B+A*D+A*E+B*E+D*E+A*B*E+A*D*E,data=dat)
summary(model2)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## A            1  83.56   83.56  51.362 6.10e-07 ***
## B            1   0.06    0.06   0.037 0.849178    
## D            1 285.78  285.78 175.664 2.30e-11 ***
## E            1 153.17  153.17  94.149 5.24e-09 ***
## A:B          1  48.93   48.93  30.076 2.28e-05 ***
## A:D          1  88.88   88.88  54.631 3.87e-07 ***
## A:E          1  33.76   33.76  20.754 0.000192 ***
## B:E          1  52.71   52.71  32.400 1.43e-05 ***
## D:E          1  61.80   61.80  37.986 5.07e-06 ***
## A:B:E        1  44.96   44.96  27.635 3.82e-05 ***
## A:D:E        1  26.01   26.01  15.988 0.000706 ***
## Residuals   20  32.54    1.63                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can observe that from the half-normal plots that A,D,E,A:D,D:E,B:E,A:B,A:E,A:B:E,A:D:E are significant

The confirmation of the half normal plot was double checked using ANOVA analysis that showed that the above factors mentioned were significant

6.39) B

plot(model2)

## hat values (leverages) are all = 0.375
##  and there are no factor predictors; no plot no. 5

From the Normal Q-Q plot, we can’t assume Normality in the data since the data points in the plot doesn’t appears to fall on a straight line

Also, from the Residuals vs fitted plot, we can’t assume constant variance since all the residuals have varying spread, depleting that the variances are not equal.

6.39) C

A<- c(-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1)
B<- c(-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1)
D<- c(-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1)
E<- c(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
obs<- c(8.11,5.56,5.77,5.82,9.17,7.8,3.23,5.69,8.82,14.23,9.2,8.94,8.68,11.49,6.25,9.12,7.93,5,7.47,12,9.86,3.65,6.4,11.61,12.43,17.55,8.87,25.38,13.06,18.85,11.78,26.05)
dat<- data.frame(A,B,D,E,obs)

model4<- lm(obs~A*B*D*E,data =dat)
coef(model4)

## (Intercept)           A           B           D           E         A:B 
##  10.1803125   1.6159375   0.0434375   2.9884375   2.1878125   1.2365625 
##         A:D         B:D         A:E         B:E         D:E       A:B:D 
##   1.6665625  -0.0134375   1.0271875   1.2834375   1.3896875  -0.3453125 
##       A:B:E       A:D:E       B:D:E     A:B:D:E 
##   1.1853125   0.9015625  -0.0396875   0.4071875

halfnormal(model4)

## 
## Significant effects (alpha=0.05, Lenth method):

##  [1] D     E     A:D   A     D:E   B:E   A:B   A:B:E A:E   A:D:E e10

summary(model4)

## 
## Call:
## lm.default(formula = obs ~ A * B * D * E, data = dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.4750 -0.5637  0.0000  0.5637  1.4750 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 10.18031    0.21360  47.661  < 2e-16 ***
## A            1.61594    0.21360   7.565 1.14e-06 ***
## B            0.04344    0.21360   0.203 0.841418    
## D            2.98844    0.21360  13.991 2.16e-10 ***
## E            2.18781    0.21360  10.243 1.97e-08 ***
## A:B          1.23656    0.21360   5.789 2.77e-05 ***
## A:D          1.66656    0.21360   7.802 7.66e-07 ***
## B:D         -0.01344    0.21360  -0.063 0.950618    
## A:E          1.02719    0.21360   4.809 0.000193 ***
## B:E          1.28344    0.21360   6.009 1.82e-05 ***
## D:E          1.38969    0.21360   6.506 7.24e-06 ***
## A:B:D       -0.34531    0.21360  -1.617 0.125501    
## A:B:E        1.18531    0.21360   5.549 4.40e-05 ***
## A:D:E        0.90156    0.21360   4.221 0.000650 ***
## B:D:E       -0.03969    0.21360  -0.186 0.854935    
## A:B:D:E      0.40719    0.21360   1.906 0.074735 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.208 on 16 degrees of freedom
## Multiple R-squared:  0.9744, Adjusted R-squared:  0.9504 
## F-statistic: 40.58 on 15 and 16 DF,  p-value: 7.07e-10

model5<- aov(obs~A+B+D+E+A*B+A*D+A*E+B*E+D*E+A*B*E+A*D*E,data=dat)
summary(model5)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## A            1  83.56   83.56  51.362 6.10e-07 ***
## B            1   0.06    0.06   0.037 0.849178    
## D            1 285.78  285.78 175.664 2.30e-11 ***
## E            1 153.17  153.17  94.149 5.24e-09 ***
## A:B          1  48.93   48.93  30.076 2.28e-05 ***
## A:D          1  88.88   88.88  54.631 3.87e-07 ***
## A:E          1  33.76   33.76  20.754 0.000192 ***
## B:E          1  52.71   52.71  32.400 1.43e-05 ***
## D:E          1  61.80   61.80  37.986 5.07e-06 ***
## A:B:E        1  44.96   44.96  27.635 3.82e-05 ***
## A:D:E        1  26.01   26.01  15.988 0.000706 ***
## Residuals   20  32.54    1.63                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(model5)

## hat values (leverages) are all = 0.375
##  and there are no factor predictors; no plot no. 5

Factor C was dropped totally from our model initially because it seemed insignificant. After dropping factor C we realized that from the half-normal plots that A,D,E,A:D,D:E,B:E,A:B,A:E,A:B:E,A:D:E are significant

The confirmation of the half normal plot was double checked using ANOVA analysis that showed that the above factors mentioned were significant

Which was similar to the results gotten when factor C was included.

6.39) D

Fitting our model in a linear equation we can see that

\(Y_{i,j,k,l}=10.18031+1.61594\alpha _{i}+0.04344\beta_{j}+2.98844\gamma_{k}+2.18781\delta_{l}+ \epsilon_{ijkl}\)

To maximize the predicted response, the above linear model should be follows

Note

Because all the factors are of positive coefficient, therefore they should be at a +1 level to produce maximum response.