getwd()
## [1] "C:/Users/rgale/OneDrive - X-FAB Global Services GmbH/Documents/Lesson15_XFab"
We’re going to look at a single response variable and only a few factors.
A priori we don’t know the function that describes how to get to the peak.
The altimeter and the compass may not always give the most accurate and precise readings. How can we apply RSM to this star wars problem.
How do we figure out how to move in the most effective direction? There are several approaches. We’re going to use the central composite design.
The corners are the factor levels for our two factor problem. C3PO is standing at the center point and he can move to any of the corners. Take a reading at each corner point and then take another measurement at the center and compare the two. That makes a total of 8 measurements.
The next thing to do is find the gradient - the direction of steepest
ascent.
What is the model equation for this problem?
In the initial analysis we might assume no interactions. And that there’s only one main peak.
After checking for lack of fit, find gradient and head in that direction. What is lack of fit? We know about SSE. Let’s break that up into SSE pure error, and SSE lack of fit. More about that later, but essentially model is incorrect. That’s one of the big reasons for repeated measurements at the center. After finding the gradient and moving in that direction we repeat this process.
As long as we’re seeing increasing values, we keep going in the same direction.
We’d like to be efficient about this. We might be more efficient by looking at higher order derivatives. Must avoid being trapped in a local maximum.
Keep doing this until we can’t find a maximum in the local neighborhood.
Mathematically,
Next add axial points to the design….
We started with the simple model that was linear in the x1, x2 variables. Now we need to add quadratic terms to the model:
Now we can use regression to identify the important factors, i.e., the direction of fastest ascent. Adding the axial points,
We could have done this at the beginning, but the linear model with the original 5 points (9 measurements) did not show significant lack of fit. We will also see if there is an interaction between x1 and x2. We need to test the x12 and x22 terms for significance.
What’s the math like?
What is lack of fit? In anova we looked at breaking up the sum of squared error, SSE, as explainable by the model and the unexplained error. Here we look at SSPE which looks at repeated observations - gauge problems - what is the noise in the measurement, the spread of measurements under the same conditions compared to their average. The other part is then the deviation of the measured values from the model predicted values.
We can only look at lack of fit if we have repeated observations. Here’s an example:
In this example we look at each point and the squared deviation from the model predicted value on the fitted line (blue line to red line). The other contribution we’re looking at is the sum of the squares of the deviations from each set of points average (points to blue lines).
This slide has an error. p should be 5, not 4.
Semiconductor example:
So, first have to install the package if it isn’t already. Identify factors, center points and high and low levels.
#RSM Problem. EtchRate is response variable, Two Factors (GasFlowRate and RFPower)
#install.packages("rsm")
library(rsm)
## Warning: package 'rsm' was built under R version 4.3.3
#### First Pass, Finding the Steepest Ascent and Marching ####
#Start at the center point GasFlowRate=35 and RFPower=155
#High and low levels of GasFlowRate are 30 and 40
#High and low levels of RFPower are 150 and 160
#Build Central Composite Design
datRSM <- data.frame(x1 = c(-1,-1,1,1,0,0,0,0,0),
x2 = c(-1,1,-1,1,0,0,0,0,0),
EtchRate = c(39.3,40,40.9,41.5,40.3,40.5,40.7,40.2,40.6)
)
datRSM
## x1 x2 EtchRate
## 1 -1 -1 39.3
## 2 -1 1 40.0
## 3 1 -1 40.9
## 4 1 1 41.5
## 5 0 0 40.3
## 6 0 0 40.5
## 7 0 0 40.7
## 8 0 0 40.2
## 9 0 0 40.6
datRSM <- as.coded.data(datRSM,
x1 ~ (GasFlowRate-35)/5,
x2 ~ (RFPower-155)/5)
datRSM
## GasFlowRate RFPower EtchRate
## 1 30 150 39.3
## 2 30 160 40.0
## 3 40 150 40.9
## 4 40 160 41.5
## 5 35 155 40.3
## 6 35 155 40.5
## 7 35 155 40.7
## 8 35 155 40.2
## 9 35 155 40.6
##
## Data are stored in coded form using these coding formulas ...
## x1 ~ (GasFlowRate - 35)/5
## x2 ~ (RFPower - 155)/5
Next build response surface model with main effects and two way interactions. Check for significance, remove unimportant factors and plot surface. Find direction of steepest ascent and convert to original units.
#Build model with First order and two factor interactions
model <- rsm(EtchRate ~ FO(x1,x2) + TWI(x1,x2), data = datRSM)
summary(model)
##
## Call:
## rsm(formula = EtchRate ~ FO(x1, x2) + TWI(x1, x2), data = datRSM)
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 40.444444 0.062311 649.0693 1.648e-13 ***
## x1 0.775000 0.093467 8.2917 0.0004166 ***
## x2 0.325000 0.093467 3.4772 0.0177127 *
## x1:x2 -0.025000 0.093467 -0.2675 0.7997870
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multiple R-squared: 0.9418, Adjusted R-squared: 0.9069
## F-statistic: 26.97 on 3 and 5 DF, p-value: 0.00163
##
## Analysis of Variance Table
##
## Response: EtchRate
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(x1, x2) 2 2.82500 1.41250 40.4213 0.0008188
## TWI(x1, x2) 1 0.00250 0.00250 0.0715 0.7997870
## Residuals 5 0.17472 0.03494
## Lack of fit 1 0.00272 0.00272 0.0633 0.8137408
## Pure error 4 0.17200 0.04300
##
## Stationary point of response surface:
## x1 x2
## 13 31
##
## Stationary point in original units:
## GasFlowRate RFPower
## 100 310
##
## Eigenanalysis:
## eigen() decomposition
## $values
## [1] 0.0125 -0.0125
##
## $vectors
## [,1] [,2]
## x1 -0.7071068 -0.7071068
## x2 0.7071068 -0.7071068
#remove insignificant two-factor interactions
model1 <- rsm(EtchRate ~ FO(x1,x2), data = datRSM)
summary(model1)
##
## Call:
## rsm(formula = EtchRate ~ FO(x1, x2), data = datRSM)
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 40.444444 0.057288 705.9869 5.451e-16 ***
## x1 0.775000 0.085932 9.0188 0.000104 ***
## x2 0.325000 0.085932 3.7821 0.009158 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multiple R-squared: 0.941, Adjusted R-squared: 0.9213
## F-statistic: 47.82 on 2 and 6 DF, p-value: 0.0002057
##
## Analysis of Variance Table
##
## Response: EtchRate
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(x1, x2) 2 2.82500 1.41250 47.8213 0.0002057
## Residuals 6 0.17722 0.02954
## Lack of fit 2 0.00522 0.00261 0.0607 0.9419341
## Pure error 4 0.17200 0.04300
##
## Direction of steepest ascent (at radius 1):
## x1 x2
## 0.9221944 0.3867267
##
## Corresponding increment in original units:
## GasFlowRate RFPower
## 4.610972 1.933633
#p-value looks good, lack of fit not significant
contour(model1, ~ x1+x2,
image = TRUE,
xlabs=c("RF Power (watts)", "Gas Flow Rate (scm)"))
Next begin moving in direction of steepest ascent, collect more data at center and corner points, continue in this direction as long as we continue to increase the etch rate.
#start marching in direction of steepest ascent
steepest(model1, dist = seq(0, 13, by = 1), descent = FALSE)
## Path of steepest ascent from ridge analysis:
## dist x1 x2 | GasFlowRate RFPower | yhat
## 1 0 0.000 0.000 | 35.000 155.000 | 40.444
## 2 1 0.922 0.387 | 39.610 156.935 | 41.285
## 3 2 1.844 0.773 | 44.220 158.865 | 42.125
## 4 3 2.766 1.160 | 48.830 160.800 | 42.965
## 5 4 3.689 1.547 | 53.445 162.735 | 43.806
## 6 5 4.611 1.934 | 58.055 164.670 | 44.647
## 7 6 5.534 2.321 | 62.670 166.605 | 45.488
## 8 7 6.456 2.707 | 67.280 168.535 | 46.328
## 9 8 7.377 3.094 | 71.885 170.470 | 47.167
## 10 9 8.302 3.482 | 76.510 172.410 | 48.010
## 11 10 9.222 3.867 | 81.110 174.335 | 48.848
## 12 11 10.139 4.252 | 85.695 176.260 | 49.684
## 13 12 11.069 4.642 | 90.345 178.210 | 50.532
## 14 13 11.987 5.027 | 94.935 180.135 | 51.368
datMarching<-steepest(model1, dist = seq(0, 13, by = 1), descent = FALSE)
## Path of steepest ascent from ridge analysis:
#collect data
datMarching$EtchRate<-c(40.44,41.0, 42.9, 45.9, 47.1, 49.7, 53.8,59.9,65.0,70.4,77.6,80.3,76.2,75.1)
datMarching
## dist x1 x2 | GasFlowRate RFPower | yhat EtchRate
## 1 0 0.000 0.000 | 35.000 155.000 | 40.444 40.44
## 2 1 0.922 0.387 | 39.610 156.935 | 41.285 41.00
## 3 2 1.844 0.773 | 44.220 158.865 | 42.125 42.90
## 4 3 2.766 1.160 | 48.830 160.800 | 42.965 45.90
## 5 4 3.689 1.547 | 53.445 162.735 | 43.806 47.10
## 6 5 4.611 1.934 | 58.055 164.670 | 44.647 49.70
## 7 6 5.534 2.321 | 62.670 166.605 | 45.488 53.80
## 8 7 6.456 2.707 | 67.280 168.535 | 46.328 59.90
## 9 8 7.377 3.094 | 71.885 170.470 | 47.167 65.00
## 10 9 8.302 3.482 | 76.510 172.410 | 48.010 70.40
## 11 10 9.222 3.867 | 81.110 174.335 | 48.848 77.60
## 12 11 10.139 4.252 | 85.695 176.260 | 49.684 80.30
## 13 12 11.069 4.642 | 90.345 178.210 | 50.532 76.20
## 14 13 11.987 5.027 | 94.935 180.135 | 51.368 75.10
plot(datMarching$dist,datMarching$EtchRate,
main="Steepest Ascent Marching",
ylab="EtchRate",
xlab="distance",
type="o")
#Max EtchRate of marching found at 11th step, explore again
Restart investigation at the 11th step from above, take new data, again build first order model.
#### Second Pass - Exploration ####
datRSM2 <- data.frame(x1 = c(-1,-1,1,1,0,0,0,0,0),
x2 = c(-1,1,-1,1,0,0,0,0,0),
EtchRate = c(76.5,77,78,79.5,79.9,80.3,80,79.7,79.8)
)
datRSM2
## x1 x2 EtchRate
## 1 -1 -1 76.5
## 2 -1 1 77.0
## 3 1 -1 78.0
## 4 1 1 79.5
## 5 0 0 79.9
## 6 0 0 80.3
## 7 0 0 80.0
## 8 0 0 79.7
## 9 0 0 79.8
datRSM2 <- as.coded.data(datRSM2,
x1 ~ (GasFlowRate-85)/5,
x2 ~ (RFPower-175)/5)
datRSM2
## GasFlowRate RFPower EtchRate
## 1 80 170 76.5
## 2 80 180 77.0
## 3 90 170 78.0
## 4 90 180 79.5
## 5 85 175 79.9
## 6 85 175 80.3
## 7 85 175 80.0
## 8 85 175 79.7
## 9 85 175 79.8
##
## Data are stored in coded form using these coding formulas ...
## x1 ~ (GasFlowRate - 85)/5
## x2 ~ (RFPower - 175)/5
#build first order model
model3 <- rsm(EtchRate ~ FO(x1,x2) + TWI(x1,x2), data = datRSM2)
summary(model3)
##
## Call:
## rsm(formula = EtchRate ~ FO(x1, x2) + TWI(x1, x2), data = datRSM2)
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 78.96667 0.49148 160.6702 1.772e-10 ***
## x1 1.00000 0.73722 1.3564 0.2330
## x2 0.50000 0.73722 0.6782 0.5277
## x1:x2 0.25000 0.73722 0.3391 0.7483
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multiple R-squared: 0.3257, Adjusted R-squared: -0.07891
## F-statistic: 0.805 on 3 and 5 DF, p-value: 0.5426
##
## Analysis of Variance Table
##
## Response: EtchRate
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(x1, x2) 2 5.000 2.500 1.150 0.3882680
## TWI(x1, x2) 1 0.250 0.250 0.115 0.7483056
## Residuals 5 10.870 2.174
## Lack of fit 1 10.658 10.658 201.094 0.0001436
## Pure error 4 0.212 0.053
##
## Stationary point of response surface:
## x1 x2
## -2 -4
##
## Stationary point in original units:
## GasFlowRate RFPower
## 75 155
##
## Eigenanalysis:
## eigen() decomposition
## $values
## [1] 0.125 -0.125
##
## $vectors
## [,1] [,2]
## x1 0.7071068 -0.7071068
## x2 0.7071068 0.7071068
There doesn’t seem to be any significant interaction, take that out and rerun analysis
#Remove two-way interactions that is not significant
model4 <- rsm(EtchRate ~ FO(x1,x2), data = datRSM2)
summary(model4)
##
## Call:
## rsm(formula = EtchRate ~ FO(x1, x2), data = datRSM2)
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 78.96667 0.45379 174.0156 2.43e-12 ***
## x1 1.00000 0.68069 1.4691 0.1922
## x2 0.50000 0.68069 0.7346 0.4903
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multiple R-squared: 0.3102, Adjusted R-squared: 0.08023
## F-statistic: 1.349 on 2 and 6 DF, p-value: 0.3283
##
## Analysis of Variance Table
##
## Response: EtchRate
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(x1, x2) 2 5.000 2.5000 1.3489 0.3282610
## Residuals 6 11.120 1.8533
## Lack of fit 2 10.908 5.4540 102.9057 0.0003635
## Pure error 4 0.212 0.0530
##
## Direction of steepest ascent (at radius 1):
## x1 x2
## 0.8944272 0.4472136
##
## Corresponding increment in original units:
## GasFlowRate RFPower
## 4.472136 2.236068
#There are some serious problems with R^2, p-value, and Lack of Fit
plot(datRSM2$x1, datRSM2$EtchRate,
main="Response by Gas Flow Rate", xlab = "Gas Flow Rate", ylab = "EtchRate")
plot(datRSM2$x2, datRSM2$EtchRate,
main="Response by RFPower", xlab = "RFPower", ylab = "EtchRate")
#not a bunch of deviation at central point, LOF due to poorly fitting model
Lack of fit looks bad, now may need axial points and look at curvature.
### Find the Curvature ####
#additionally collect data at axial points
datRSM3 <- data.frame(x1 = c(-1, -1, 1, 1, 0, 0, 0, 0, 0, 1.414, -1.414, 0, 0),
x2 = c(-1, 1, -1, 1, 0, 0, 0, 0, 0, 0, 0, 1.414, -1.414),
EtchRate = c(76.5, 77, 78, 79.5, 79.9, 80.3, 80, 79.7, 79.8, 78.4, 75.6, 78.5, 77)
)
datRSM3
## x1 x2 EtchRate
## 1 -1.000 -1.000 76.5
## 2 -1.000 1.000 77.0
## 3 1.000 -1.000 78.0
## 4 1.000 1.000 79.5
## 5 0.000 0.000 79.9
## 6 0.000 0.000 80.3
## 7 0.000 0.000 80.0
## 8 0.000 0.000 79.7
## 9 0.000 0.000 79.8
## 10 1.414 0.000 78.4
## 11 -1.414 0.000 75.6
## 12 0.000 1.414 78.5
## 13 0.000 -1.414 77.0
datRSM3 <- as.coded.data(datRSM3,
x1 ~ (GasFlowRate-85)/5,
x2 ~ (RFPower-175)/5)
datRSM3
## GasFlowRate RFPower EtchRate
## 1 80.00 170.00 76.5
## 2 80.00 180.00 77.0
## 3 90.00 170.00 78.0
## 4 90.00 180.00 79.5
## 5 85.00 175.00 79.9
## 6 85.00 175.00 80.3
## 7 85.00 175.00 80.0
## 8 85.00 175.00 79.7
## 9 85.00 175.00 79.8
## 10 92.07 175.00 78.4
## 11 77.93 175.00 75.6
## 12 85.00 182.07 78.5
## 13 85.00 167.93 77.0
##
## Data are stored in coded form using these coding formulas ...
## x1 ~ (GasFlowRate - 85)/5
## x2 ~ (RFPower - 175)/5
#fit model with second order effects - PQ
model5 <- rsm(EtchRate ~ FO(x1,x2)+TWI(x1,x2)+PQ(x1,x2), data = datRSM3)
summary(model5)
##
## Call:
## rsm(formula = EtchRate ~ FO(x1, x2) + TWI(x1, x2) + PQ(x1, x2),
## data = datRSM3)
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 79.939955 0.119089 671.2644 < 2.2e-16 ***
## x1 0.995050 0.094155 10.5682 1.484e-05 ***
## x2 0.515203 0.094155 5.4719 0.000934 ***
## x1:x2 0.250000 0.133145 1.8777 0.102519
## x1^2 -1.376449 0.100984 -13.6303 2.693e-06 ***
## x2^2 -1.001336 0.100984 -9.9158 2.262e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multiple R-squared: 0.9827, Adjusted R-squared: 0.9704
## F-statistic: 79.67 on 5 and 7 DF, p-value: 5.147e-06
##
## Analysis of Variance Table
##
## Response: EtchRate
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(x1, x2) 2 10.0430 5.0215 70.8143 2.267e-05
## TWI(x1, x2) 1 0.2500 0.2500 3.5256 0.1025
## PQ(x1, x2) 2 17.9537 8.9769 126.5944 3.194e-06
## Residuals 7 0.4964 0.0709
## Lack of fit 3 0.2844 0.0948 1.7885 0.2886
## Pure error 4 0.2120 0.0530
##
## Stationary point of response surface:
## x1 x2
## 0.3892304 0.3058466
##
## Stationary point in original units:
## GasFlowRate RFPower
## 86.94615 176.52923
##
## Eigenanalysis:
## eigen() decomposition
## $values
## [1] -0.9634986 -1.4142867
##
## $vectors
## [,1] [,2]
## x1 -0.2897174 -0.9571122
## x2 -0.9571122 0.2897174
The interaction is not significant, but the quadratic terms definitely are. Take the two way interaction out and re-analyze.
#remove insignificant two-way interaction
model6 <- rsm(EtchRate ~ FO(x1,x2)+PQ(x1,x2), data = datRSM3)
summary(model6)
##
## Call:
## rsm(formula = EtchRate ~ FO(x1, x2) + PQ(x1, x2), data = datRSM3)
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 79.93995 0.13660 585.2155 < 2.2e-16 ***
## x1 0.99505 0.10800 9.2135 1.559e-05 ***
## x2 0.51520 0.10800 4.7704 0.001408 **
## x1^2 -1.37645 0.11583 -11.8831 2.310e-06 ***
## x2^2 -1.00134 0.11583 -8.6447 2.489e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multiple R-squared: 0.974, Adjusted R-squared: 0.961
## F-statistic: 75.02 on 4 and 8 DF, p-value: 2.226e-06
##
## Analysis of Variance Table
##
## Response: EtchRate
## Df Sum Sq Mean Sq F value Pr(>F)
## FO(x1, x2) 2 10.0430 5.0215 53.8227 2.290e-05
## PQ(x1, x2) 2 17.9537 8.9769 96.2186 2.538e-06
## Residuals 8 0.7464 0.0933
## Lack of fit 4 0.5344 0.1336 2.5206 0.1962
## Pure error 4 0.2120 0.0530
##
## Stationary point of response surface:
## x1 x2
## 0.3614555 0.2572577
##
## Stationary point in original units:
## GasFlowRate RFPower
## 86.80728 176.28629
##
## Eigenanalysis:
## eigen() decomposition
## $values
## [1] -1.001336 -1.376449
##
## $vectors
## [,1] [,2]
## x1 0 -1
## x2 -1 0
Hey, that’s excellent. Now we’re getting somewhere. Let’s see what
this looks like in two dimensions and in three. Identify maximum etch
rate found at gas flow of 86.81 and rf power of 176.3.
#visualze response surface
contour(model6, x1~x2, image = TRUE,
xlabs=c("Gas Flow Rate", "RFPower"))
#visualize response surface in three dimenstions
persp(model6, x1~x2, col = terrain.colors(50), contours = "colors",
zlab = "EtchRate",
xlabs=c("Gas Flow Rate", "RFPower"))
#find max EtchRate at optimal values of GasFlowRate and RFPower
max <- data.frame(x1 = 0.361, x2 = 0.257)
predict(model6, max)
## 1
## 80.18606
We’re using DoE in a serial manner tracking the increasing result figuring out the gauge error and comparing to the model error. What happens when the physics is non-continuous? In that case we probably will have to apply some engineering insight to refine the grid. Additionally on a practical level you probably don’t want your process to be centered at or near an unstable configuration.