Looking at results from Ganesh Krishnan in Practice quiz 6 thread.

Ganesh’s Post

source("contourPlot.R")

#Experiment 1-4 with baseline 3.46, 150
P1.real.baseline = 3.46
H1.real.baseline = 150
P1.real = c(3.3, 3.6, 3.3, 3.6)
H1.real = c(140, 140, 160, 160)
P1.coded = c(-1, 1, -1, 1)
H1.coded = c(-1, -1, 1, 1)
y1 = c(694, 586, 685, 546)

P1.real.mean = P1.real.baseline
P1.real.hrange = 0.3/2
H1.real.mean = H1.real.baseline
H1.real.hrange = 20/2

m1 = lm(y1 ~ P1.coded*H1.coded)
summary(m1)
## 
## Call:
## lm(formula = y1 ~ P1.coded * H1.coded)
## 
## Residuals:
## ALL 4 residuals are 0: no residual degrees of freedom!
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)         627.75         NA      NA       NA
## P1.coded            -61.75         NA      NA       NA
## H1.coded            -12.25         NA      NA       NA
## P1.coded:H1.coded    -7.75         NA      NA       NA
## 
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared:     1,   Adjusted R-squared:   NaN 
## F-statistic:  NaN on 3 and 0 DF,  p-value: NA
contourPlot (m1, P1.coded, H1.coded)
## Warning: package 'ggplot2' was built under R version 3.0.3

plot of chunk unnamed-chunk-1

y1.baseline.prediction = predict (m1, data.frame(P1.coded = 0, H1.coded = 0))
y1.baseline.actual = 665

#$30 prediction error, so build quadratic model to get best direction
P1.coded.cc = c(-1.414, 0, 1.414, 0)
H1.coded.cc = c(0, -1.414, 0, 1.414)
P1.real.cc = P1.coded.cc*P1.real.hrange + P1.real.mean
H1.real.cc = H1.coded.cc*H1.real.hrange + H1.real.mean
y1.cc = c(679, 677, 497, 651)

#Combine initial factorial and cc
P1.coded.combined = c(P1.coded, P1.coded.cc, 0)
H1.coded.combined = c(H1.coded, H1.coded.cc, 0)
y1.combined = c(y1, y1.cc, y1.baseline.actual)
m1.combined = lm(y1.combined ~ P1.coded.combined*H1.coded.combined + I(P1.coded.combined^2)
                 + I(H1.coded.combined^2))

contourPlot(m1.combined, P1.coded.combined, H1.coded.combined)

plot of chunk unnamed-chunk-1

RSM Information

Let’s try taking a look at these results with the R rsm package. The three vignettes in that package are very helpful as examples of Response Surface Methodology.

First using a linear model (no interaction term) with the initial four experiments.

library(rsm)
## Warning: package 'rsm' was built under R version 3.0.3
#data.frame(y1, P1.coded, H1.coded)
rsmmod1 <- rsm(y1 ~ FO(P1.coded, H1.coded))
summary(rsmmod1)
## 
## Call:
## rsm(formula = y1 ~ FO(P1.coded, H1.coded))
## 
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   627.75       7.75   81.00   0.0079 **
## P1.coded      -61.75       7.75   -7.97   0.0795 . 
## H1.coded      -12.25       7.75   -1.58   0.3591   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Multiple R-squared:  0.985,  Adjusted R-squared:  0.955 
## F-statistic:   33 on 2 and 1 DF,  p-value: 0.122
## 
## Analysis of Variance Table
## 
## Response: y1
##                        Df Sum Sq Mean Sq F value Pr(>F)
## FO(P1.coded, H1.coded)  2  15853    7926      33   0.12
## Residuals               1    240     240               
## Lack of fit             1    240     240               
## Pure error              0      0                       
## 
## Direction of steepest ascent (at radius 1):
## P1.coded H1.coded 
##  -0.9809  -0.1946 
## 
## Corresponding increment in original units:
## P1.coded H1.coded 
##  -0.9809  -0.1946
contour(rsmmod1, ~ P1.coded * H1.coded)
points(P1.coded, H1.coded, pch=19)
text(P1.coded, H1.coded, y1)

plot of chunk unnamed-chunk-2

steepest(rsmmod1)
## Path of steepest ascent from ridge analysis:
##    dist P1.coded H1.coded |  yhat
## 1   0.0    0.000    0.000 | 627.8
## 2   0.5   -0.490   -0.097 | 659.2
## 3   1.0   -0.981   -0.195 | 690.7
## 4   1.5   -1.471   -0.292 | 722.2
## 5   2.0   -1.962   -0.389 | 753.7
## 6   2.5   -2.452   -0.486 | 785.1
## 7   3.0   -2.943   -0.584 | 816.6
## 8   3.5   -3.433   -0.681 | 848.1
## 9   4.0   -3.924   -0.778 | 879.6
## 10  4.5   -4.414   -0.876 | 911.0
## 11  5.0   -4.904   -0.973 | 942.5

Now add an interaction term.

rsmmod2 <- rsm(y1 ~ FO(P1.coded, H1.coded) + TWI(P1.coded, H1.coded))
summary(rsmmod2)
## Warning: ANOVA F-tests on an essentially perfect fit are unreliable
## 
## Call:
## rsm(formula = y1 ~ FO(P1.coded, H1.coded) + TWI(P1.coded, H1.coded))
## 
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)         627.75         NA      NA       NA
## P1.coded            -61.75         NA      NA       NA
## H1.coded            -12.25         NA      NA       NA
## P1.coded:H1.coded    -7.75         NA      NA       NA
## 
## Multiple R-squared:     1,   Adjusted R-squared:   NaN 
## F-statistic:  NaN on 3 and 0 DF,  p-value: NA
## 
## Analysis of Variance Table
## 
## Response: y1
##                         Df Sum Sq Mean Sq F value Pr(>F)
## FO(P1.coded, H1.coded)   2  15853    7926               
## TWI(P1.coded, H1.coded)  1    240     240               
## Residuals                0      0                       
## Lack of fit              0      0                       
## Pure error               0      0                       
## 
## Stationary point of response surface:
## P1.coded H1.coded 
##   -1.581   -7.968 
## 
## Eigenanalysis:
## $values
## [1]  3.875 -3.875
## 
## $vectors
##             [,1]    [,2]
## P1.coded -0.7071 -0.7071
## H1.coded  0.7071 -0.7071
contour(rsmmod2, ~ P1.coded * H1.coded)
points(P1.coded, H1.coded, pch=19)
text(P1.coded, H1.coded, y1)

plot of chunk unnamed-chunk-3

steepest(rsmmod2)
## Path of steepest ascent from ridge analysis:
##    dist P1.coded H1.coded |  yhat
## 1   0.0    0.000    0.000 | 627.8
## 2   0.5   -0.495   -0.068 | 658.9
## 3   1.0   -0.997   -0.074 | 689.6
## 4   1.5   -1.500   -0.015 | 720.4
## 5   2.0   -1.997    0.103 | 751.4
## 6   2.5   -2.485    0.273 | 783.1
## 7   3.0   -2.961    0.484 | 815.8
## 8   3.5   -3.424    0.726 | 849.6
## 9   4.0   -3.875    0.992 | 884.7
## 10  4.5   -4.315    1.276 | 921.2
## 11  5.0   -4.746    1.574 | 959.4

The simple linear model looks close here.

Add the baseline as a center point.

df3 <- data.frame(y1, P1.coded, H1.coded)
df3 <- rbind(df3, c(y1.baseline.actual, 0, 0))

rsmmod3 <- rsm(y1 ~ FO(P1.coded, H1.coded) + TWI(P1.coded, H1.coded), data=df3)
summary(rsmmod3)
## 
## Call:
## rsm(formula = y1 ~ FO(P1.coded, H1.coded) + TWI(P1.coded, H1.coded), 
##     data = df3)
## 
##                   Estimate Std. Error t value Pr(>|t|)  
## (Intercept)         635.20      14.90   42.63    0.015 *
## P1.coded            -61.75      16.66   -3.71    0.168  
## H1.coded            -12.25      16.66   -0.74    0.596  
## P1.coded:H1.coded    -7.75      16.66   -0.47    0.723  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Multiple R-squared:  0.935,  Adjusted R-squared:  0.742 
## F-statistic: 4.83 on 3 and 1 DF,  p-value: 0.32
## 
## Analysis of Variance Table
## 
## Response: y1
##                         Df Sum Sq Mean Sq F value Pr(>F)
## FO(P1.coded, H1.coded)   2  15852    7926    7.14   0.26
## TWI(P1.coded, H1.coded)  1    240     240    0.22   0.72
## Residuals                1   1110    1110               
## Lack of fit              1   1110    1110               
## Pure error               0      0                       
## 
## Stationary point of response surface:
## P1.coded H1.coded 
##   -1.581   -7.968 
## 
## Eigenanalysis:
## $values
## [1]  3.875 -3.875
## 
## $vectors
##             [,1]    [,2]
## P1.coded -0.7071 -0.7071
## H1.coded  0.7071 -0.7071
residuals(rsmmod3)
##     1     2     3     4     5 
## -7.45 -7.45 -7.45 -7.45 29.80
contour(rsmmod3, ~ P1.coded * H1.coded)
points(df3$P1.coded, df3$H1.coded, pch=19)
text(df3$P1.coded, df3$H1.coded, df3$y1)

plot of chunk unnamed-chunk-4

steepest(rsmmod3)
## Path of steepest ascent from ridge analysis:
##    dist P1.coded H1.coded |  yhat
## 1   0.0    0.000    0.000 | 635.2
## 2   0.5   -0.495   -0.068 | 666.3
## 3   1.0   -0.997   -0.074 | 697.1
## 4   1.5   -1.500   -0.015 | 727.8
## 5   2.0   -1.997    0.103 | 758.8
## 6   2.5   -2.485    0.273 | 790.6
## 7   3.0   -2.961    0.484 | 823.2
## 8   3.5   -3.424    0.726 | 857.0
## 9   4.0   -3.875    0.992 | 892.1
## 10  4.5   -4.315    1.276 | 928.7
## 11  5.0   -4.746    1.574 | 966.9

We see the same $30 residual as Ganesh mentions above. I think it would be appropriate to take a step in the path of steepest ascent using this model.

Let’s look at the full quadratic model now.

y1.combined, P1.coded.combined, H1.coded.combined

rsmmod4 <- rsm(y1.combined ~ SO(P1.coded.combined, H1.coded.combined))
summary(rsmmod4)
## 
## Call:
## rsm(formula = y1.combined ~ SO(P1.coded.combined, H1.coded.combined))
## 
##                                     Estimate Std. Error t value Pr(>|t|)
## (Intercept)                          664.999      3.581  185.71  3.4e-07
## P1.coded.combined                    -63.053      1.266  -49.80  1.8e-05
## H1.coded.combined                    -10.722      1.266   -8.47  0.00346
## P1.coded.combined:H1.coded.combined   -7.750      1.790   -4.33  0.02274
## P1.coded.combined^2                  -38.071      2.100  -18.13  0.00037
## H1.coded.combined^2                   -0.059      2.100   -0.03  0.97934
##                                        
## (Intercept)                         ***
## P1.coded.combined                   ***
## H1.coded.combined                   ** 
## P1.coded.combined:H1.coded.combined *  
## P1.coded.combined^2                 ***
## H1.coded.combined^2                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Multiple R-squared:  0.999,  Adjusted R-squared:  0.997 
## F-statistic:  624 on 5 and 3 DF,  p-value: 0.000101
## 
## Analysis of Variance Table
## 
## Response: y1.combined
##                                           Df Sum Sq Mean Sq F value Pr(>F)
## FO(P1.coded.combined, H1.coded.combined)   2  32720   16360  1275.8  4e-05
## TWI(P1.coded.combined, H1.coded.combined)  1    240     240    18.7 0.0227
## PQ(P1.coded.combined, H1.coded.combined)   2   7068    3534   275.6 0.0004
## Residuals                                  3     38      13               
## Lack of fit                                3     38      13               
## Pure error                                 0      0                       
## 
## Stationary point of response surface:
## P1.coded.combined H1.coded.combined 
##            -1.481             6.417 
## 
## Eigenanalysis:
## $values
## [1]   0.332 -38.462
## 
## $vectors
##                      [,1]    [,2]
## P1.coded.combined  0.1004 -0.9949
## H1.coded.combined -0.9949 -0.1004
residuals(rsmmod4)
##         1         2         3         4         5         6         7 
##  1.105485  3.711527 -1.950277  0.655765  0.962138 -3.041967 -2.723919 
##         8         9 
##  1.280186  0.001064
contour(rsmmod4, ~ P1.coded.combined * H1.coded.combined, image=TRUE)
points(P1.coded.combined, H1.coded.combined, pch=19)
text(P1.coded.combined, H1.coded.combined, y1.combined)

plot of chunk unnamed-chunk-5

steepest(rsmmod4)
## Path of steepest ascent from ridge analysis:
##    dist P1.coded.combined H1.coded.combined |  yhat
## 1   0.0             0.000             0.000 | 665.0
## 2   0.5            -0.482            -0.133 | 687.5
## 3   1.0            -0.688            -0.726 | 694.2
## 4   1.5            -0.656            -1.349 | 697.5
## 5   2.0            -0.610            -1.905 | 700.5
## 6   2.5            -0.562            -2.436 | 703.6
## 7   3.0            -0.513            -2.956 | 706.8
## 8   3.5            -0.463            -3.469 | 710.1
## 9   4.0            -0.414            -3.979 | 713.5
## 10  4.5            -0.364            -4.485 | 717.2
## 11  5.0            -0.314            -4.990 | 720.9

The steepest ascent direction here looks pretty good (see my plot below), especially at a distance of about 3.5 as mentioned in the rsm paper.

My Thoughts

I think it would be better to take an initial step based on the linear model rather than using additional experiments to build a quadratic model.

I chose to use the baseline as a corner of my initial factorial rather than the center. I am not sure if that is a good or bad choice, but it seemed to work well here.

My RSM Model

Here is the plot of a RSM model for all of my runs.

Practice Quiz 6 Question 5 Plot