Looking at results from Ganesh Krishnan in Practice quiz 6 thread.
source("contourPlot.R")
#Experiment 1-4 with baseline 3.46, 150
P1.real.baseline = 3.46
H1.real.baseline = 150
P1.real = c(3.3, 3.6, 3.3, 3.6)
H1.real = c(140, 140, 160, 160)
P1.coded = c(-1, 1, -1, 1)
H1.coded = c(-1, -1, 1, 1)
y1 = c(694, 586, 685, 546)
P1.real.mean = P1.real.baseline
P1.real.hrange = 0.3/2
H1.real.mean = H1.real.baseline
H1.real.hrange = 20/2
m1 = lm(y1 ~ P1.coded*H1.coded)
summary(m1)
##
## Call:
## lm(formula = y1 ~ P1.coded * H1.coded)
##
## Residuals:
## ALL 4 residuals are 0: no residual degrees of freedom!
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 627.75 NA NA NA
## P1.coded -61.75 NA NA NA
## H1.coded -12.25 NA NA NA
## P1.coded:H1.coded -7.75 NA NA NA
##
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: NaN
## F-statistic: NaN on 3 and 0 DF, p-value: NA
contourPlot (m1, P1.coded, H1.coded)
## Warning: package 'ggplot2' was built under R version 3.0.3
y1.baseline.prediction = predict (m1, data.frame(P1.coded = 0, H1.coded = 0))
y1.baseline.actual = 665
#$30 prediction error, so build quadratic model to get best direction
P1.coded.cc = c(-1.414, 0, 1.414, 0)
H1.coded.cc = c(0, -1.414, 0, 1.414)
P1.real.cc = P1.coded.cc*P1.real.hrange + P1.real.mean
H1.real.cc = H1.coded.cc*H1.real.hrange + H1.real.mean
y1.cc = c(679, 677, 497, 651)
#Combine initial factorial and cc
P1.coded.combined = c(P1.coded, P1.coded.cc, 0)
H1.coded.combined = c(H1.coded, H1.coded.cc, 0)
y1.combined = c(y1, y1.cc, y1.baseline.actual)
m1.combined = lm(y1.combined ~ P1.coded.combined*H1.coded.combined + I(P1.coded.combined^2)
+ I(H1.coded.combined^2))
contourPlot(m1.combined, P1.coded.combined, H1.coded.combined)
Let’s try taking a look at these results with the R rsm package. The three vignettes in that package are very helpful as examples of Response Surface Methodology.
First using a linear model (no interaction term) with the initial four experiments.
library(rsm)
## Warning: package 'rsm' was built under R version 3.0.3
#data.frame(y1, P1.coded, H1.coded)
rsmmod1 <- rsm(y1 ~ FO(P1.coded, H1.coded))
summary(rsmmod1)
##
## Call:
## rsm(formula = y1 ~ FO(P1.coded, H1.coded))
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 627.75 7.75 81.00 0.0079 **
## P1.coded -61.75 7.75 -7.97 0.0795 .
## H1.coded -12.25 7.75 -1.58 0.3591
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multiple R-squared: 0.985, Adjusted R-squared: 0.955
## F-statistic: 33 on 2 and 1 DF, p-value: 0.122
##
## Analysis of Variance Table
##
## Response: y1
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(P1.coded, H1.coded) 2 15853 7926 33 0.12
## Residuals 1 240 240
## Lack of fit 1 240 240
## Pure error 0 0
##
## Direction of steepest ascent (at radius 1):
## P1.coded H1.coded
## -0.9809 -0.1946
##
## Corresponding increment in original units:
## P1.coded H1.coded
## -0.9809 -0.1946
contour(rsmmod1, ~ P1.coded * H1.coded)
points(P1.coded, H1.coded, pch=19)
text(P1.coded, H1.coded, y1)
steepest(rsmmod1)
## Path of steepest ascent from ridge analysis:
## dist P1.coded H1.coded | yhat
## 1 0.0 0.000 0.000 | 627.8
## 2 0.5 -0.490 -0.097 | 659.2
## 3 1.0 -0.981 -0.195 | 690.7
## 4 1.5 -1.471 -0.292 | 722.2
## 5 2.0 -1.962 -0.389 | 753.7
## 6 2.5 -2.452 -0.486 | 785.1
## 7 3.0 -2.943 -0.584 | 816.6
## 8 3.5 -3.433 -0.681 | 848.1
## 9 4.0 -3.924 -0.778 | 879.6
## 10 4.5 -4.414 -0.876 | 911.0
## 11 5.0 -4.904 -0.973 | 942.5
Now add an interaction term.
rsmmod2 <- rsm(y1 ~ FO(P1.coded, H1.coded) + TWI(P1.coded, H1.coded))
summary(rsmmod2)
## Warning: ANOVA F-tests on an essentially perfect fit are unreliable
##
## Call:
## rsm(formula = y1 ~ FO(P1.coded, H1.coded) + TWI(P1.coded, H1.coded))
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 627.75 NA NA NA
## P1.coded -61.75 NA NA NA
## H1.coded -12.25 NA NA NA
## P1.coded:H1.coded -7.75 NA NA NA
##
## Multiple R-squared: 1, Adjusted R-squared: NaN
## F-statistic: NaN on 3 and 0 DF, p-value: NA
##
## Analysis of Variance Table
##
## Response: y1
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(P1.coded, H1.coded) 2 15853 7926
## TWI(P1.coded, H1.coded) 1 240 240
## Residuals 0 0
## Lack of fit 0 0
## Pure error 0 0
##
## Stationary point of response surface:
## P1.coded H1.coded
## -1.581 -7.968
##
## Eigenanalysis:
## $values
## [1] 3.875 -3.875
##
## $vectors
## [,1] [,2]
## P1.coded -0.7071 -0.7071
## H1.coded 0.7071 -0.7071
contour(rsmmod2, ~ P1.coded * H1.coded)
points(P1.coded, H1.coded, pch=19)
text(P1.coded, H1.coded, y1)
steepest(rsmmod2)
## Path of steepest ascent from ridge analysis:
## dist P1.coded H1.coded | yhat
## 1 0.0 0.000 0.000 | 627.8
## 2 0.5 -0.495 -0.068 | 658.9
## 3 1.0 -0.997 -0.074 | 689.6
## 4 1.5 -1.500 -0.015 | 720.4
## 5 2.0 -1.997 0.103 | 751.4
## 6 2.5 -2.485 0.273 | 783.1
## 7 3.0 -2.961 0.484 | 815.8
## 8 3.5 -3.424 0.726 | 849.6
## 9 4.0 -3.875 0.992 | 884.7
## 10 4.5 -4.315 1.276 | 921.2
## 11 5.0 -4.746 1.574 | 959.4
The simple linear model looks close here.
Add the baseline as a center point.
df3 <- data.frame(y1, P1.coded, H1.coded)
df3 <- rbind(df3, c(y1.baseline.actual, 0, 0))
rsmmod3 <- rsm(y1 ~ FO(P1.coded, H1.coded) + TWI(P1.coded, H1.coded), data=df3)
summary(rsmmod3)
##
## Call:
## rsm(formula = y1 ~ FO(P1.coded, H1.coded) + TWI(P1.coded, H1.coded),
## data = df3)
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 635.20 14.90 42.63 0.015 *
## P1.coded -61.75 16.66 -3.71 0.168
## H1.coded -12.25 16.66 -0.74 0.596
## P1.coded:H1.coded -7.75 16.66 -0.47 0.723
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multiple R-squared: 0.935, Adjusted R-squared: 0.742
## F-statistic: 4.83 on 3 and 1 DF, p-value: 0.32
##
## Analysis of Variance Table
##
## Response: y1
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(P1.coded, H1.coded) 2 15852 7926 7.14 0.26
## TWI(P1.coded, H1.coded) 1 240 240 0.22 0.72
## Residuals 1 1110 1110
## Lack of fit 1 1110 1110
## Pure error 0 0
##
## Stationary point of response surface:
## P1.coded H1.coded
## -1.581 -7.968
##
## Eigenanalysis:
## $values
## [1] 3.875 -3.875
##
## $vectors
## [,1] [,2]
## P1.coded -0.7071 -0.7071
## H1.coded 0.7071 -0.7071
residuals(rsmmod3)
## 1 2 3 4 5
## -7.45 -7.45 -7.45 -7.45 29.80
contour(rsmmod3, ~ P1.coded * H1.coded)
points(df3$P1.coded, df3$H1.coded, pch=19)
text(df3$P1.coded, df3$H1.coded, df3$y1)
steepest(rsmmod3)
## Path of steepest ascent from ridge analysis:
## dist P1.coded H1.coded | yhat
## 1 0.0 0.000 0.000 | 635.2
## 2 0.5 -0.495 -0.068 | 666.3
## 3 1.0 -0.997 -0.074 | 697.1
## 4 1.5 -1.500 -0.015 | 727.8
## 5 2.0 -1.997 0.103 | 758.8
## 6 2.5 -2.485 0.273 | 790.6
## 7 3.0 -2.961 0.484 | 823.2
## 8 3.5 -3.424 0.726 | 857.0
## 9 4.0 -3.875 0.992 | 892.1
## 10 4.5 -4.315 1.276 | 928.7
## 11 5.0 -4.746 1.574 | 966.9
We see the same $30 residual as Ganesh mentions above. I think it would be appropriate to take a step in the path of steepest ascent using this model.
Let’s look at the full quadratic model now.
y1.combined, P1.coded.combined, H1.coded.combined
rsmmod4 <- rsm(y1.combined ~ SO(P1.coded.combined, H1.coded.combined))
summary(rsmmod4)
##
## Call:
## rsm(formula = y1.combined ~ SO(P1.coded.combined, H1.coded.combined))
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 664.999 3.581 185.71 3.4e-07
## P1.coded.combined -63.053 1.266 -49.80 1.8e-05
## H1.coded.combined -10.722 1.266 -8.47 0.00346
## P1.coded.combined:H1.coded.combined -7.750 1.790 -4.33 0.02274
## P1.coded.combined^2 -38.071 2.100 -18.13 0.00037
## H1.coded.combined^2 -0.059 2.100 -0.03 0.97934
##
## (Intercept) ***
## P1.coded.combined ***
## H1.coded.combined **
## P1.coded.combined:H1.coded.combined *
## P1.coded.combined^2 ***
## H1.coded.combined^2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multiple R-squared: 0.999, Adjusted R-squared: 0.997
## F-statistic: 624 on 5 and 3 DF, p-value: 0.000101
##
## Analysis of Variance Table
##
## Response: y1.combined
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(P1.coded.combined, H1.coded.combined) 2 32720 16360 1275.8 4e-05
## TWI(P1.coded.combined, H1.coded.combined) 1 240 240 18.7 0.0227
## PQ(P1.coded.combined, H1.coded.combined) 2 7068 3534 275.6 0.0004
## Residuals 3 38 13
## Lack of fit 3 38 13
## Pure error 0 0
##
## Stationary point of response surface:
## P1.coded.combined H1.coded.combined
## -1.481 6.417
##
## Eigenanalysis:
## $values
## [1] 0.332 -38.462
##
## $vectors
## [,1] [,2]
## P1.coded.combined 0.1004 -0.9949
## H1.coded.combined -0.9949 -0.1004
residuals(rsmmod4)
## 1 2 3 4 5 6 7
## 1.105485 3.711527 -1.950277 0.655765 0.962138 -3.041967 -2.723919
## 8 9
## 1.280186 0.001064
contour(rsmmod4, ~ P1.coded.combined * H1.coded.combined, image=TRUE)
points(P1.coded.combined, H1.coded.combined, pch=19)
text(P1.coded.combined, H1.coded.combined, y1.combined)
steepest(rsmmod4)
## Path of steepest ascent from ridge analysis:
## dist P1.coded.combined H1.coded.combined | yhat
## 1 0.0 0.000 0.000 | 665.0
## 2 0.5 -0.482 -0.133 | 687.5
## 3 1.0 -0.688 -0.726 | 694.2
## 4 1.5 -0.656 -1.349 | 697.5
## 5 2.0 -0.610 -1.905 | 700.5
## 6 2.5 -0.562 -2.436 | 703.6
## 7 3.0 -0.513 -2.956 | 706.8
## 8 3.5 -0.463 -3.469 | 710.1
## 9 4.0 -0.414 -3.979 | 713.5
## 10 4.5 -0.364 -4.485 | 717.2
## 11 5.0 -0.314 -4.990 | 720.9
The steepest ascent direction here looks pretty good (see my plot below), especially at a distance of about 3.5 as mentioned in the rsm paper.
I think it would be better to take an initial step based on the linear model rather than using additional experiments to build a quadratic model.
I chose to use the baseline as a corner of my initial factorial rather than the center. I am not sure if that is a good or bad choice, but it seemed to work well here.
Here is the plot of a RSM model for all of my runs.